gunzip silently fails when running get_data_dependencies.sh #137

Open
dgaston-nsh opened this issue Jun 27, 2023 · 1 comment

Comments

@dgaston-nsh

Sometimes downloading the GRCh38 reference FASTA from NCBI with curl over FTP produces a slightly corrupted file that gunzip cannot decompress. The failure isn't totally silent, but it is easy to miss while the script is running, and it doesn't halt downstream execution. The cat step still works (although it may warn about a non-existent file), so we end up with a composite FASTA that contains only the COVID genome reference. Dehosting then runs without reporting any errors but does not work properly, and the resulting FASTQs are not dehosted.
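To illustrate the kind of guard that would catch this, here is a minimal bash sketch that refuses to continue when a composite FASTA has fewer sequence records than expected. `require_min_records` and the file name in the usage comment are hypothetical and are not taken from get_data_dependencies.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of get_data_dependencies.sh):
# fail if a FASTA file is missing, empty, or has fewer than MIN records.
require_min_records() {
    local fasta="$1" min="$2" n
    if [ ! -s "$fasta" ]; then
        echo "ERROR: $fasta is missing or empty" >&2
        return 1
    fi
    # Count header lines; grep -c prints 0 and exits non-zero when none match.
    n=$(grep -c '^>' "$fasta" || true)
    if [ "$n" -lt "$min" ]; then
        echo "ERROR: $fasta has $n record(s), expected at least $min" >&2
        return 1
    fi
}

# Usage (file name is an assumption): a composite of the human and viral
# references should contain more than one record.
#   require_min_records ./composite_reference.fna 2 || exit 1
```

A composite that contains only the viral genome would have a single header line, so a threshold of 2 is enough to flag the situation described above.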

@jaleezyy
Owner

Unfortunately, I can't reproduce this issue exactly, but I have been hearing that this curl download from the NCBI FTP has been unreliable lately. If you run the individual lines from get_data_dependencies.sh with verbose flags added:

curl -s --verbose "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz" > ./GRC38_no_alt_analysis_set.fna.gz

gunzip --verbose ./GRC38_no_alt_analysis_set.fna.gz

and come across specific errors, please let me know here so I can try to pinpoint a resolution. You can run the lines above as is, without the actual get_data_dependencies.sh script.

The inconsistency of the error may be attributed to curl itself or NCBI and its FTP server, but I need more information.
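In the meantime, one way to make the failure loud rather than easy to miss is to test the archive before decompressing it. This is a minimal bash sketch, not the script's current behaviour; `verify_and_gunzip` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of get_data_dependencies.sh):
# check a .gz archive with `gzip -t` and only decompress it if it is intact.
verify_and_gunzip() {
    local archive="$1"
    if [ ! -s "$archive" ]; then
        echo "ERROR: $archive is missing or empty" >&2
        return 1
    fi
    # gzip -t tests integrity without writing any output file.
    if ! gzip -t "$archive" 2>/dev/null; then
        echo "ERROR: $archive failed the gzip integrity check; re-download it" >&2
        return 1
    fi
    gunzip -f "$archive"
}

# Usage, with $url standing in for the NCBI FTP address used above:
#   curl -s "$url" -o ./GRC38_no_alt_analysis_set.fna.gz
#   verify_and_gunzip ./GRC38_no_alt_analysis_set.fna.gz || exit 1
```

Exiting on a failed check stops the later concatenation step from running with only the viral reference present.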
