How to look up metadata for an SRR file #17

olgabot opened this issue Dec 28, 2015 · 1 comment


olgabot commented Dec 28, 2015

Thanks to you, I am able to download all the SRA files associated with a project and convert them to FASTQ. My question is: how can I get the metadata, say for all the files under SRP011546? Under GEO there is a metadata file, but all the identifiers are GEO-based, i.e. GSE36552 and the like, and there's no mapping of GEO to SRA IDs (e.g. SRR490990 in this project). What I'd like to get is the equivalent of Title: "Oocyte #1" from GEO, or Library name: GSM922167: 8-cell embryo#2 -Cell#6 from DNA Nexus. Can the bionode-ncbi search function be used to find this information?

olgabot commented Dec 28, 2015

nvm got it!!

bionode-ncbi search sra SRR490990

returns a big ole json that I can parse:

{u'createdate': u'2013/08/12',
 u'expxml': {u'Bioproject': u'SRP011546',
  u'Experiment': {u'acc': u'SRX144369',
   u'name': u'GSM922175: 8-cell embryo#3 -Cell#6; Homo sapiens; RNA-Seq',
   u'status': u'public',
   u'ver': u'1'},
  u'Instrument': {u'ILLUMINA': u'Illumina HiSeq 2000'},
  u'Library_descriptor': {u'LIBRARY_LAYOUT': {u'SINGLE': u''},
   u'LIBRARY_NAME': u'GSM922175: 8-cell embryo#3 -Cell#6'},
  u'Organism': {u'ScientificName': u'Homo sapiens', u'taxid': u'9606'},
  u'Sample': {u'acc': u'SRS310960', u'name': u''},
  u'Study': {u'acc': u'SRP011546',
   u'name': u'Tracing pluripotency of human early embryos and embryonic stem cells by single cell RNA-seq'},
  u'Submitter': {u'acc': u'SRA050912',
   u'center_name': u'GEO',
   u'contact_name': u'Gene Expression Omnibus (GEO), NCBI, NLM, NIH, htt',
   u'lab_name': u''},
  u'Summary': {u'Platform': {u'_': u'ILLUMINA',
    u'instrument_model': u'Illumina HiSeq 2000'},
   u'Statistics': {u'cluster_name': u'public',
    u'load_done': u'true',
    u'total_bases': u'4750085700',
    u'total_runs': u'1',
    u'total_size': u'3178301878',
    u'total_spots': u'47500857'},
   u'Title': u'GSM922175: 8-cell embryo#3 -Cell#6; Homo sapiens; RNA-Seq'}},
 u'extlinks': u'    ',
 u'runs': {u'Run': [{u'acc': u'SRR490990',
    u'cluster_name': u'public',
    u'is_public': u'true',
    u'load_done': u'true',
    u'static_data_available': u'true',
    u'total_bases': u'4750085700',
    u'total_spots': u'47500857'}]},
 u'uid': u'174404',
 u'updatedate': u'2013/09/23'}
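
For anyone landing here with the same question, here is a rough sketch of how the record above could be parsed to get the GEO-to-SRA mapping. The helper names `summarize_record` and `run_to_geo` are made up for illustration, and two things are assumptions rather than documented behaviour: that the GSM ID is always the part of `LIBRARY_NAME` before the colon (true for this record, not guaranteed for every submitter), and that the search output can be saved as one JSON object per line.

```python
import json


def summarize_record(record):
    """Pull the accessions and library name out of one parsed SRA record."""
    exp = record["expxml"]
    library_name = exp["Library_descriptor"]["LIBRARY_NAME"]
    # Assumption: the GEO sample ID is the text before the first colon,
    # as in "GSM922175: 8-cell embryo#3 -Cell#6".
    geo_id, _, label = library_name.partition(":")
    runs = record["runs"]["Run"]
    # Assumption: a lone run might arrive as a dict instead of a list.
    if isinstance(runs, dict):
        runs = [runs]
    return {
        "geo_sample": geo_id.strip(),
        "label": label.strip(),
        "experiment": exp["Experiment"]["acc"],
        "study": exp["Study"]["acc"],
        "runs": [run["acc"] for run in runs],
    }


def run_to_geo(ndjson_text):
    """Map each SRR accession to its GSM ID, given one JSON record per line."""
    mapping = {}
    for line in ndjson_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        summary = summarize_record(json.loads(line))
        for run in summary["runs"]:
            mapping[run] = summary["geo_sample"]
    return mapping


# Trimmed copy of the record shown above, keeping only the fields used here.
example = {
    "expxml": {
        "Experiment": {"acc": "SRX144369"},
        "Library_descriptor": {
            "LIBRARY_NAME": "GSM922175: 8-cell embryo#3 -Cell#6",
        },
        "Study": {"acc": "SRP011546"},
    },
    "runs": {"Run": [{"acc": "SRR490990"}]},
}

if __name__ == "__main__":
    print(summarize_record(example))
    print(run_to_geo(json.dumps(example)))
```

If searching on the project accession (`bionode-ncbi search sra SRP011546`) returns one such record per experiment, the saved output could be passed to `run_to_geo` to build the whole table in one go. I have not checked that against the full project, so treat it as a starting point.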


olgabot closed this as completed Dec 28, 2015
bmpvieira self-assigned this Apr 5, 2017