Why are species and their chromosomes hardcoded ? #23

Open
vidal-adrien opened this issue Oct 4, 2023 · 3 comments

Comments

@vidal-adrien

Rather than having a species argument that refers to genome data in the code, why not let this argument take a two-column file (chromosome name, chromosome length) and parse it, so that users can process data from any genome?

In its current state, the tool is limited to the set of species defined in GenomeData.py, so handling any other species requires editing the source code.

Cordially,
Adrien V.

@kelly-sovacool

Perhaps parsing a JSON file into a Python dictionary in the same format as those in GenomeData.py would be a simple way to provide a custom genome.
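A minimal sketch of the JSON idea, assuming the dictionaries in GenomeData.py map chromosome names to integer lengths (that layout is not shown in this thread, so treat it as an assumption). The function name `chrom_lengths_from_json` is hypothetical and not part of the tool.

```python
import json


def chrom_lengths_from_json(text):
    """Parse JSON text into a {chromosome name: length} dictionary.

    Hypothetical helper: assumes the JSON document is a flat object such as
    {"chr1": 1000, "chr2": 500}. The real GenomeData.py layout may differ.
    """
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object mapping chromosome names to lengths")
    # Normalise values to int so that lengths given as strings also work.
    return {str(name): int(length) for name, length in data.items()}
```

With a file on disk, this could be called as `chrom_lengths_from_json(open(path).read())`.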

@vidal-adrien
Author

vidal-adrien commented Oct 5, 2023

Even JSON seems needlessly complicated. There are only two variables, chromosome name and length: one column fewer than a BED file. From what I can tell, these are the only variables retrieved from the species argument.

A parser that reads the first two columns of a file would even be able to use samtools FASTA indexes (.fai) as input.

Even easier for the user would be to take the FASTA itself as the input file and parse it to get the chromosome names and lengths.
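As a rough sketch of such a parser: the function below reads only the first two whitespace-separated columns of each line, so a plain two-column table and a samtools `.fai` index (whose first two columns are sequence name and length) are handled the same way. The name `parse_chrom_sizes` is hypothetical, and skipping `#` comment lines is an added assumption.

```python
def parse_chrom_sizes(lines):
    """Build {chromosome name: length} from the first two columns of each line.

    `lines` is any iterable of strings, for example an open file handle.
    Extra columns (as in a .fai index) are ignored.
    """
    lengths = {}
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        # Assumption: blank lines and '#' comments are not data.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            raise ValueError(f"line {number}: expected at least two columns")
        lengths[fields[0]] = int(fields[1])
    return lengths
```

Usage with a file would look like `with open(path) as handle: lengths = parse_chrom_sizes(handle)`.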
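Reading lengths straight from a FASTA file could be sketched as follows. This is a plain-Python illustration with a hypothetical function name (`fasta_chrom_lengths`); it assumes the chromosome name is the header text up to the first whitespace, which is the same convention samtools uses for its index.

```python
def fasta_chrom_lengths(lines):
    """Build {chromosome name: length} by counting bases per FASTA record.

    `lines` is any iterable of strings, for example an open file handle.
    """
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            if not parts:
                raise ValueError("FASTA header without a sequence name")
            # Assumption: the name ends at the first whitespace.
            name = parts[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths
```

This streams the file line by line, so memory use stays small even for large genomes, though it is slower than reading an existing index.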

The preset species could also be kept for compatibility: store one file per species in a folder in the library, look first in that folder for speciesName.tab, and if it is not found, treat the argument as a path to a custom species table.
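The lookup order described above might be sketched like this. Both `resolve_species_table` and the `preset_dir` parameter are hypothetical names used for illustration; nothing here reflects how the tool is actually organised.

```python
from pathlib import Path


def resolve_species_table(species, preset_dir):
    """Return the path of the chromosome table to use for `species`.

    First look for <preset_dir>/<species>.tab (a bundled preset); if that
    does not exist, treat `species` itself as a path to a custom table.
    """
    preset = Path(preset_dir) / f"{species}.tab"
    if preset.is_file():
        return preset
    custom = Path(species)
    if custom.is_file():
        return custom
    raise FileNotFoundError(
        f"'{species}' is neither a preset in {preset_dir} nor an existing file"
    )
```

Checking the preset folder first keeps existing command lines working unchanged, while any other value falls through to the custom-file case.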

@vidal-adrien
Author

In case anybody wants that feature, this project does it and more:
https://github.com/biocore-ntnu/epic2

2 participants