BUSCO v2 database download fails, but gets marked as done #161

Open
eburgueno opened this issue Mar 4, 2020 · 0 comments

Downloading any BUSCO database currently fails because curl is invoked without -L and therefore does not follow HTTP redirects:

$ dammit databases --database-dir /scratch/dammit/databases --install --busco-group eukaryota
# dammit
## a tool for easy de novo transcriptome annotation
by Camille Scott
**v1.2**, 2018
(...)

- [ ] download_and_untar:busco2db-eukaryota: 
    * Cmd: `mkdir -p /scratch/dammit/databases/busco2db; curl https://busco.ezlab.org/v2/datasets/eukaryota_odb9.tar.gz | tar -xz -C /scratch/dammit/databases/busco2db`
    * Cmd: `touch /scratch/dammit/databases/busco2db/download_and_untar:busco2db-eukaryota.done`
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    81  100    81    0     0    205      0 --:--:-- --:--:-- --:--:--   205

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

Running the same pipeline with curl -L downloads and extracts the database correctly:

$ curl -L https://busco.ezlab.org/v2/datasets/eukaryota_odb9.tar.gz | tar -xz -C /scratch/dammit/databases/busco2db
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    81  100    81    0     0    197      0 --:--:-- --:--:-- --:--:--   197
100 12.6M  100 12.6M    0     0  1606k      0  0:00:08  0:00:08 --:--:-- 2766k

This might belong in a separate issue, but I'll mention it here because it's related: download_and_untar:busco2db-eukaryota.done is created even though the curl | tar pipeline failed. As a result, dammit treats the database as installed when in fact it is not.
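
For illustration, here is a minimal shell sketch of the behaviour I would expect: follow redirects with `-L`, and create the `.done` marker only when both the download and the extraction succeed. `fetch_and_untar` is a hypothetical helper, not dammit's actual task code. It also swaps the `curl | tar` pipe for a temporary file so that each step's exit status can be checked on its own, and it assumes `mktemp` is available.

```shell
# Hypothetical helper (not part of dammit): download a .tar.gz, extract it,
# and create the marker file only if every step succeeded.
# The body runs in a subshell, so its variables do not leak to the caller.
fetch_and_untar() (
    url=$1; dest_dir=$2; done_file=$3
    tmp=$(mktemp) || exit 1
    status=0
    # -L follows HTTP redirects; -f makes HTTP error responses a non-zero exit;
    # -sS hides the progress meter but still prints errors.
    mkdir -p "$dest_dir" &&
        curl -fsSL -o "$tmp" "$url" &&
        tar -xzf "$tmp" -C "$dest_dir" &&
        touch "$done_file" || status=1
    rm -f "$tmp"
    exit "$status"
)

# Example call, using the paths from the log above:
# fetch_and_untar https://busco.ezlab.org/v2/datasets/eukaryota_odb9.tar.gz \
#     /scratch/dammit/databases/busco2db \
#     "/scratch/dammit/databases/busco2db/download_and_untar:busco2db-eukaryota.done"
```

With this shape, a response that is not a gzip archive (like the 81-byte body above) makes `tar` fail, the helper returns non-zero, and no `.done` file is left behind.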

eburgueno added a commit to eburgueno/dammit that referenced this issue Mar 4, 2020
- Add `-L` flag to `curl` so that remote server redirects are followed
- Only create `.done` file if the download and extraction was successful