.faa.gz files not being downloaded for bacteria #136
Comments
The main thing that comes to mind is that right now there are 127 plant assemblies, 333 fungal assemblies, 1,157 archaeal assemblies, and 200,357 bacterial assemblies in RefSeq. So there are massively more bacterial assemblies.
Hm, I don't think you should be running out of memory on a restricted download set like this. So much for that theory. Could you run one of your download commands with the added
Executed command @kblin Is this error reproducible on your side? If not I can try to dig into the python code myself, it looks pretty clean. I can really post the last lines because it takes quite a while ... i.e. I have no idea how long my request would take.
The full set of complete bacteria is a big download. I've just started a download with 12 parallel server connections, and extrapolating from the speed I'm getting the I just noticed that I didn't release the progress bar changes that tell you about the
Ahh ok, so I will first download all
Hello.
For the past week, I have been attempting to download protein fasta files for all bacteria using the following command:
ncbi-genome-download -F 'protein-fasta' -p 5 -r 3 -v 'bacteria'
This creates the directory structure as ./refseq/bacteria/GCF* containing only the MD5SUMS file in each directory.
Strangely enough, the same command run for other groups (archaea, fungi, plants, etc.) runs just fine and downloads the desired .faa.gz files.
What am I missing here?
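One way to check how many assembly directories are affected is to scan the download tree for directories that contain no `.faa.gz` file. The following is a minimal Python sketch, not part of ncbi-genome-download itself; the function name `find_dirs_missing_protein_fasta` is hypothetical, and it assumes the `./refseq/bacteria/GCF*` layout described above.

```python
from pathlib import Path


def find_dirs_missing_protein_fasta(root):
    """Return names of GCF* directories under root that hold no *.faa.gz file."""
    missing = []
    for assembly_dir in sorted(Path(root).glob("GCF*")):
        if not assembly_dir.is_dir():
            continue
        # any() stops at the first protein FASTA archive it finds
        if not any(assembly_dir.glob("*.faa.gz")):
            missing.append(assembly_dir.name)
    return missing


if __name__ == "__main__":
    # Assumed location, taken from the directory structure in this report
    root = Path("refseq/bacteria")
    missing = find_dirs_missing_protein_fasta(root)
    print(f"{len(missing)} assembly directories have no .faa.gz file")
    for name in missing[:10]:
        print(name)
```

Running this from the directory where the download was started lists the first few assembly directories that only contain `MD5SUMS`, which makes it easier to tell whether nothing was fetched at all or whether the download is simply still in progress.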