downloading multiple datasets in bulk? #4

Closed
flamholz opened this issue Oct 26, 2017 · 1 comment


@flamholz

I can't tell whether this is possible using jgi-query or with the JGI API in general. I would like to download all of their bacterial genomes if at all possible but can't find a way to get a list by kingdom.

Can you provide any guidance here?
A

@glarue
Owner

glarue commented Oct 26, 2017

@flamholz It unfortunately doesn't seem possible to do in one fell swoop, as best I can determine. Because of the way the API works, you need to provide jgi-query with the organism abbreviation found in the URL of the associated Download page. I poked around a little bit and can't find a master page for "bacteria" (unlike, for example, "fungi", which does work).

This might be of some use, however: if you go to the main Genome Portal page, you can do an advanced search for Groups containing "bacteria" which gives 33 results:
[Screenshot: jgi_group_search]

If you click through to the Download page for each of those, you can get URLs that will work with jgi-query. For example, the first result in that list has the download page URL https://genome.jgi.doe.gov/acidobacteria/acidobacteria.download.html, which you can feed directly into jgi-query and it'll get all the files for that group. To help with this, I have just added an option to the file selection dialog to download all found files (2551f8c). This should avoid the additional headache of having to specify numerical ranges for every file found for each group. It's not ideal, but perhaps it'll get you the data you want?
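As a rough sketch of the URL handling described above: the snippet below pulls the group abbreviation out of a Download page URL and rebuilds a Download page URL from an abbreviation. It is not part of jgi-query; the function names are hypothetical, and the URL pattern is an assumption inferred from the single acidobacteria example, so other groups may not follow it.

```python
from urllib.parse import urlparse

DOWNLOAD_SUFFIX = ".download.html"


def abbreviation_from_download_url(url):
    """Return the group abbreviation embedded in a Download page URL.

    Hypothetical helper: assumes the last path component looks like
    '<abbreviation>.download.html', as in the acidobacteria example.
    """
    path_parts = [part for part in urlparse(url).path.split("/") if part]
    if not path_parts or not path_parts[-1].endswith(DOWNLOAD_SUFFIX):
        raise ValueError(f"not a Download page URL: {url}")
    return path_parts[-1][: -len(DOWNLOAD_SUFFIX)]


def download_url_for(abbreviation):
    """Build a Download page URL from an abbreviation (assumed pattern)."""
    return f"https://genome.jgi.doe.gov/{abbreviation}/{abbreviation}{DOWNLOAD_SUFFIX}"


# URLs collected by hand from the group search results; only the first
# one is known from this thread.
group_urls = [
    "https://genome.jgi.doe.gov/acidobacteria/acidobacteria.download.html",
]

for url in group_urls:
    print(abbreviation_from_download_url(url))
```

Each URL in the list would still need to be passed to jgi-query separately, once per group.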

Alternatively, you can of course use Globus, the bulk transfer service that JGI supports, to download to a local computer and then upload to a server (assuming that's your use-case).

Hope that helps!
