Michelle's repo to keep track of how to submit to the sequencing read archive aka short read archive

pinskylab/SRA-submission

SRA Submission

This repo explains how to upload files to NCBI's Sequence Read Archive (SRA). Uploading is a key step toward open science and easy access to data. Please ensure this guide is up-to-date before use. All Amphiprion clarkii data described in this repo were collected by the Pinsky lab group.

If you need additional help with the SRA submission process, the National Library of Medicine has uploaded a helpful tutorial to YouTube.

How to Submit to SRA, A Beginner's Guide

All A. clarkii raw sequence data discussed in this guide can be found under BioProject accession PRJNA563695 and on FigShare. The associated metadata can be found in this repo through the following links and guides:

BioProject accession PRJNA563695

Uploaded Sep 2019

Uploaded Nov 2023

Found on FigShare, not uploaded to SRA

These data were not uploaded to SRA for one or more of the following reasons:

  • Failure to associate data
  • Non-Amphiprion clarkii samples
  • Metadata in an inaccessible form
  • Unable to locate metadata

To access the metadata for the files on FigShare, follow these steps:

  1. Access FigShare and choose the sample of interest
    • Identify the Ligation_ID (LXXX) within the sample file name
    • If the sample name contains multiple Ligation_IDs, the data likely failed to associate and the metadata must be associated manually
  2. Access the Sample Data old sheet
  3. Under the Ligations sheet, identify the relevant Ligation_ID of the sample(s) of interest
    • Column I contains information on the associated Pool_ID
    • Column Q contains a sample traceback name
      • Ex. APCL13_014
      • This example combines the taxonomic identifier for the sample (APCL) with the year-specific collection data (13_014)
    • If the Ligation_ID cannot be found, skip to step 5
  4. In the Samples sheet, search for the year-specific collection data
    • This information is specific to each sample
    • If the sample can be located this way, check that its taxonomic identifier is APCL
      • If the sample name does not contain APCL, it is a non-Amphiprion clarkii sample
  5. Refer to Michelle Stuart's lab notebook
    • Access the Ligations page and locate the range that contains the Ligation_ID
  6. Overlay the platemaps for the Digestion_IDs (DXXX) and the Ligation_IDs
    • The separate platemap files act as a transition from one naming scheme to the next (i.e. DXXX to LXXX)
  7. Repeat the previous association step with the Digestion_IDs (DXXX) and Extraction_IDs (EXXX) under the Digestion page
  8. As in the previous two steps, after obtaining the Extraction_ID (EXXX), access the range page for the sample(s) of interest
  9. The extraction.pdf file contains the Sample_ID for every sample within the range and must be overlaid once again
    • This will ultimately provide you with the Sample_ID (APCLXX_XXX)
  10. Follow the instructions in LeyteBuildDB to create the database
    • That repo houses metadata collected during field seasons
    • The database is created with R
  11. Search for the Sample_ID (APCLXX_XXX) in this database to find all associated metadata
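
The ID checks in steps 1 and 4 can be sketched in Python. This is a minimal illustration, not part of the lab's workflow. It assumes that a Ligation_ID appears in a file name as "L" followed by digits and set apart from other letters and digits, and that a Sample_ID is a four-letter taxonomic code followed by the year-specific collection data in the form XX_XXX. The file name used in the example and both function names are hypothetical.

```python
import re

# Assumption: a Ligation_ID is "L" plus digits, not embedded in a longer
# alphanumeric token (so the "L" in "APCL13_014" is not matched).
LIGATION_RE = re.compile(r"(?<![A-Za-z0-9])L\d+(?![A-Za-z0-9])")

# Assumption: a Sample_ID is a 4-letter taxonomic code followed by
# the year-specific collection data, e.g. APCL13_014.
SAMPLE_RE = re.compile(r"^([A-Z]{4})(\d{2}_\d{3})$")


def find_ligation_ids(file_name):
    """Return every Ligation_ID found in a sample file name (step 1)."""
    return LIGATION_RE.findall(file_name)


def parse_sample_id(sample_id):
    """Split a Sample_ID into its taxonomic code and collection data (step 4).

    Returns None if the ID does not follow the assumed pattern.
    """
    match = SAMPLE_RE.match(sample_id)
    if match is None:
        return None
    taxon, collection = match.groups()
    return {"taxon": taxon, "collection": collection, "is_apcl": taxon == "APCL"}


if __name__ == "__main__":
    # Hypothetical file name containing two Ligation_IDs.
    ids = find_ligation_ids("L1234_L1235.fastq.gz")
    if len(ids) > 1:
        print("Multiple Ligation_IDs; metadata must be associated manually:", ids)
    print(parse_sample_id("APCL13_014"))
```

More than one ID returned by `find_ligation_ids` corresponds to the failed-association case in step 1, and `is_apcl` being false corresponds to a non-Amphiprion clarkii sample in step 4.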
