All grabseqs SRA downloads failing #55

cdiener · 2022-06-28T18:02:15Z

Looks like some changes on the NCBI side have led to failures in SRA downloads:

grabseqs sra SRR11733975
Traceback (most recent call last):
  File "/users/cdiener/miniconda3/envs/sra/bin/grabseqs", line 11, in <module>
    sys.exit(main())
  File "/users/cdiener/miniconda3/envs/sra/lib/python3.7/site-packages/grabseqslib/__init__.py", line 58, in main
    metadata_agg = process_sra(args, zip_func)
  File "/users/cdiener/miniconda3/envs/sra/lib/python3.7/site-packages/grabseqslib/sra.py", line 31, in process_sra
    metadata_agg)
  File "/users/cdiener/miniconda3/envs/sra/lib/python3.7/site-packages/grabseqslib/sra.py", line 97, in get_sra_acc_metadata
    run_col = lines[0].index("Run")
ValueError: 'Run' is not in list

This seems to be caused by a hardcoded address to download the SRA manifest that is not reachable anymore.
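For illustration, here is a minimal sketch of how the header lookup could fail with a clearer message when the server returns something other than the expected table. The function name `find_run_column` and the comma separator are assumptions made for this example; this is not the actual grabseqs code.

```python
def find_run_column(text, sep=","):
    """Return the index of the 'Run' column in a runinfo-style table.

    Hypothetical helper: raises a descriptive error (instead of a bare
    ValueError) when the response is not the expected table, for example
    when the endpoint redirects to an HTML page.
    """
    # Use the first non-empty line as the header row.
    header = next((ln.split(sep) for ln in text.splitlines() if ln.strip()), None)
    if header is None or "Run" not in header:
        preview = text[:80].replace("\n", " ")
        raise RuntimeError(
            "Expected a runinfo table with a 'Run' column, got: %r" % preview
        )
    return header.index("Run")
```

With a check like this, a broken or redirected endpoint would surface as a message showing the start of the unexpected response rather than `'Run' is not in list`.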

AntonioBaeza · 2022-06-29T00:37:29Z

Having exactly the same issue (tried a few min ago)

Zeroo11 · 2022-06-30T01:16:54Z

same issue

louiejtaylor · 2022-06-30T20:18:23Z

Thanks for reporting the issue! Looks like @cdiener is right: http://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?save=efetch&db=sra&rettype=runinfo&term= now redirects to https://www.ncbi.nlm.nih.gov/sviewer/?db=sra&1%3Fdb=sra&rettype=runinfo&save=efetch&term= and no longer returns metadata. I'll try to figure out the proper API endpoint to hit for the SRA metadata (and see if I can get the tests passing in the meantime).

This is probably due to NCBI retiring Trace.

Looking through the NCBI E-utils API documentation, I should be able to get the same metadata by:

Finding the identifiers associated with an accession using esearch, e.g. https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=sra&term=PRJNA836386&retmax=999
Passing that ID list to efetch, e.g. https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=sra&id=22439955&rettype=fasta&retmode=text

I'll just have to convert the output from XML to tab-separated, since it looks like the E-utils love XML. This approach also has the advantage of using a defined API rather than that Trace URL (which worked great, but I think I originally found it on StackOverflow or something).
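The two steps above could be sketched roughly as follows, using only the Python standard library. The helper names (`esearch_url`, `parse_esearch_ids`, `efetch_url`, `fetch_text`) are hypothetical, the XML layout assumed is the `IdList`/`Id` structure that esearch returns by default, and the `rettype`/`retmode` values are left to the caller because the right combination for SRA metadata still needs to be confirmed.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=999):
    """Build the esearch URL for an accession (step 1)."""
    query = urllib.parse.urlencode({"db": "sra", "term": term, "retmax": retmax})
    return EUTILS + "/esearch.fcgi?" + query


def parse_esearch_ids(xml_text):
    """Extract the numeric IDs from an esearch XML response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("./IdList/Id")]


def efetch_url(ids, rettype, retmode):
    """Build the efetch URL for a list of IDs (step 2)."""
    params = {
        "db": "sra",
        "id": ",".join(ids),
        "rettype": rettype,
        "retmode": retmode,
    }
    # Keep commas literal so the ID list stays readable in the URL.
    return EUTILS + "/efetch.fcgi?" + urllib.parse.urlencode(params, safe=",")


def fetch_text(url):
    """Download a URL and decode the body as UTF-8 (performs a network call)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")
```

A call chain would then look like `ids = parse_esearch_ids(fetch_text(esearch_url("PRJNA836386")))` followed by `fetch_text(efetch_url(ids, "fasta", "text"))`, which reproduces the two example URLs above.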

cdiener · 2022-07-01T17:54:38Z

You can also request JSON from esearch which should be easier to convert with Python, for instance for your example: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=sra&term=PRJNA836386&retmax=999&retmode=json .
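As a rough sketch of what that conversion might look like, the snippet below pulls the ID list out of an esearch JSON response. The function name `ids_from_esearch_json` is made up for this example, and it assumes the IDs sit under `esearchresult` → `idlist`, which is how the JSON output is laid out as far as I can tell.

```python
import json


def ids_from_esearch_json(payload):
    """Return the list of IDs from an esearch JSON response body.

    Hypothetical helper: falls back to an empty list when the expected
    keys are missing rather than raising a KeyError.
    """
    result = json.loads(payload).get("esearchresult", {})
    return result.get("idlist", [])
```

For example, a hand-written body such as `{"esearchresult": {"idlist": ["22439955"]}}` yields `["22439955"]`, which can then be joined with commas and passed to efetch.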

GitUser42 · 2022-07-09T17:26:41Z

Hello :) Is there any workaround until this is fixed?

zhengjxj · 2022-07-10T15:47:18Z

Try replacing /usr/local/lib/python3.6/site-packages/grabseqslib/sra.py line 94 with
metadata = requests.get("https://trace.ncbi.nlm.nih.gov/Traces/sra-db-be/sra-db-be.cgi?rettype=runinfo&term="+pacc)
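The path to `sra.py` differs between installs (a conda environment will not use `/usr/local/lib/python3.6`), so here is a small sketch for locating the file to edit. The helper name `locate_module_file` is hypothetical; the package name `grabseqslib` is taken from the traceback above.

```python
import importlib.util
import os


def locate_module_file(package, filename):
    """Return the path of `filename` inside an installed package, or None.

    Hypothetical helper: looks the package up without importing it, so it
    also works when the package itself is currently broken.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None  # package is not installed in this environment
    candidate = os.path.join(os.path.dirname(spec.origin), filename)
    return candidate if os.path.isfile(candidate) else None
```

Running `print(locate_module_file("grabseqslib", "sra.py"))` with the same Python interpreter that grabseqs uses should print the file to patch, or `None` if the package is not found in that environment.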

AntonioBaeza · 2022-07-17T00:31:45Z

Thanks @zhengjxj (https://github.com/zhengjxj). I replaced the line in the file you indicated and it is working again!

chansigit · 2022-07-18T13:52:14Z

Thank you.
It seems that the NCBI API changed.

xiachenrui · 2023-06-11T07:07:32Z

Thanks, @zhengjxj!

AMMHasan · 2024-05-13T17:26:48Z

Hi, is grabseqs sra facing the same problem again? What would be the solution this time?
