Support FASTA format #184

Open
sehilyi opened this issue Feb 5, 2021 · 3 comments
Labels
enhancement (New feature or request), P? (Priority needs to be decided)

@sehilyi
Member

sehilyi commented Feb 5, 2021

https://github.com/higlass/higlass-sequence

@sehilyi sehilyi added enhancement New feature or request P? Priority needs to be decided labels Feb 5, 2021
@sehilyi sehilyi added this to Unscheduled in Roadmap Feb 5, 2021
@sehilyi sehilyi changed the title Support fa format Support FASTA format Feb 8, 2021
@sehilyi sehilyi moved this from Unscheduled to v1.1 in Roadmap Sep 21, 2021
@etowahadams
Contributor

Currently, sequence data can be used in Gosling through data with "type": "multivec". An example is the multiscale sequence track, which shows the sequence as a track with "mark": "bar" and overlays the letters below a certain zoom threshold.

We also want to be able to import data as fasta files. The Gosling schema should accept "type": "fasta". The HiGlass sequence viewer also requires that the fasta file be indexed and that the chromosome sizes be specified. See below for an example from HiGlass:

 "data": {
   "type": "fasta",
   "fastaUrl": "https://aveit.s3.amazonaws.com/higlass/data/sequence/hg38.fa",
   "faiUrl": "https://aveit.s3.amazonaws.com/higlass/data/sequence/hg38.fa.fai",
   "chromSizesUrl": "https://aveit.s3.amazonaws.com/higlass/data/sequence/hg38.mod.chrom.sizes"
 },

Decisions to make:

  1. Should we always require the user to index fasta files? Previous discussions on gff and vcf suggest this may be the path we should go down.
  2. Should we have a dedicated sequence viewer track the way that HiGlass does? Or should we continue to have "mark": "bar" with overlaid text be the main way that sequences are viewed?
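
For context on the first question, here is a minimal sketch of what an index buys a viewer. It assumes the index is a samtools-style .fai file (tab-separated columns: name, length, offset, bases per line, bytes per line) and shows how a genomic interval maps to a byte range that could be fetched with an HTTP Range request instead of downloading the whole file. The names FaiEntry, parseFai, byteOf, and byteRange are hypothetical and are not part of Gosling or HiGlass.

```typescript
// One record of a samtools-style .fai index (tab-separated columns).
interface FaiEntry {
  name: string;      // sequence name, e.g. "chr1"
  length: number;    // number of bases in the sequence
  offset: number;    // byte offset of the first base in the FASTA file
  lineBases: number; // bases per line
  lineWidth: number; // bytes per line, including the line terminator
}

// Parse the text of a .fai file into a lookup table keyed by sequence name.
function parseFai(text: string): Map<string, FaiEntry> {
  const entries = new Map<string, FaiEntry>();
  for (const line of text.split('\n')) {
    if (line.trim() === '') continue; // skip blank lines
    const [name, length, offset, lineBases, lineWidth] = line.split('\t');
    entries.set(name, {
      name,
      length: Number(length),
      offset: Number(offset),
      lineBases: Number(lineBases),
      lineWidth: Number(lineWidth),
    });
  }
  return entries;
}

// Byte position in the FASTA file of a 0-based base index.
function byteOf(entry: FaiEntry, pos: number): number {
  const fullLines = Math.floor(pos / entry.lineBases);
  return entry.offset + fullLines * entry.lineWidth + (pos % entry.lineBases);
}

// Inclusive byte range covering the bases [start, end) (0-based, half-open).
function byteRange(entry: FaiEntry, start: number, end: number): [number, number] {
  if (start < 0 || end > entry.length || start >= end) {
    throw new RangeError(`invalid interval ${start}-${end} for ${entry.name}`);
  }
  return [byteOf(entry, start), byteOf(entry, end - 1)];
}
```

The resulting pair can be used directly in a header such as Range: bytes=first-last, since HTTP byte ranges are inclusive on both ends. The fetched bytes still contain line terminators wherever the interval spans more than one line, so those need to be stripped before display.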

@sehilyi
Member Author

sehilyi commented Apr 18, 2023

Should we always require the user to index fasta files? #764 (comment) on gff and vcf suggest this may be the path we should go down.

I think we need to make the index file required; I am unsure whether smooth zooming/panning could be supported without it. The required properties could look like the following:

interface FastaData {
  type: 'fasta';
  url: string;      // URL of the fasta file
  indexUrl: string; // URL of the index file for that fasta file
}

FYI, we do not use HiGlass' chromSizesUrl. Instead, we let users define the assembly.

Should we have a dedicated sequence viewer track the way that HiGlass does? Or should we continue to have "mark": "bar" with overlaid text be the main way that sequences are viewed?

I think we want to let users choose the visual mark that they wish to use, although I believe bar/text will be the typical mark types for such data.
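
As a rough illustration of how the proposed data type and a user-chosen mark might combine, the sketch below defines a single text-mark track. Everything beyond type, url, and indexUrl is an assumption: SequenceTrackSketch is a hypothetical, trimmed-down stand-in for a real track type, the field names 'position' and 'base' are placeholders for whatever the fasta loader would eventually emit, and where assembly belongs in an actual spec is not verified here. The URLs are the ones from the HiGlass example earlier in the thread.

```typescript
// Data shape proposed in this thread; not a released Gosling type.
interface FastaData {
  type: 'fasta';
  url: string;
  indexUrl: string;
}

// Hypothetical, minimal subset of a track definition, just enough for this sketch.
interface SequenceTrackSketch {
  assembly: string;
  data: FastaData;
  mark: 'bar' | 'text';
  x: { field: string; type: 'genomic' };
  text?: { field: string; type: 'nominal' };
  color?: { field: string; type: 'nominal' };
  width: number;
  height: number;
}

const fastaUrl = 'https://aveit.s3.amazonaws.com/higlass/data/sequence/hg38.fa';

const sequenceTrack: SequenceTrackSketch = {
  assembly: 'hg38', // replaces HiGlass' chromSizesUrl
  data: { type: 'fasta', url: fastaUrl, indexUrl: `${fastaUrl}.fai` },
  mark: 'text', // the user picks the mark; 'bar' would work the same way
  x: { field: 'position', type: 'genomic' },   // assumed field name
  text: { field: 'base', type: 'nominal' },    // assumed field name
  color: { field: 'base', type: 'nominal' },
  width: 800,
  height: 40,
};
```

Switching mark to 'bar' and dropping the text channel would give the bar-style view, which is the point of leaving the mark choice to the user rather than fixing it in a dedicated track.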

@sehilyi
Member Author

sehilyi commented Apr 18, 2023

For the second question, we could implement a track template for fasta data, e.g., template: 'fasta', to make the construction of a fasta-based sequence track easier. You can find some examples in our Editor (the very last example in the gallery).
