Information

The dataset contains paired-end, FASTQ-formatted Illumina read files for each of the three strains (SHTV, SHHI, and PAER). All RNA-Seq data can be found at iMicrobe.

Environmental Conditions

Sample | Name, Strain | Habitat, Collection site
MMETSP0359 | Scrippsiella hangoei, SHTV-5 | Sediment, Baltic Sea

Each biological replicate (e.g. SRR129...) contains a pair of FASTQ files: SRR129.._1.fastq.gz holds the left (forward) reads and SRR129..._2.fastq.gz holds the right (reverse) reads of the paired-end sequences.
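Because every replicate should come as a `_1`/`_2` pair, a quick completeness check can catch a failed download early. The following is a minimal sketch, not part of the original workflow: the function name `check_pairs` and the directory argument are illustrative assumptions.

```shell
# check_pairs: hypothetical helper (not from the original project).
# Takes a directory that is assumed to hold *_1.fastq.gz / *_2.fastq.gz files
# and reports, for each forward file, whether its reverse mate is present.
check_pairs() {
    local dir=$1 fwd rev status=0
    for fwd in "$dir"/*_1.fastq.gz; do
        [ -e "$fwd" ] || continue                 # glob matched nothing
        rev="${fwd%_1.fastq.gz}_2.fastq.gz"       # swap the _1 suffix for _2
        if [ -e "$rev" ]; then
            echo "paired: $(basename "${fwd%_1.fastq.gz}")"
        else
            echo "missing reverse file for: $(basename "$fwd")"
            status=1
        fi
    done
    return "$status"                              # non-zero if any mate is missing
}

# Usage (the directory name is an assumption):
# check_pairs Data/fastq
```

The function returns a non-zero status when any mate is missing, so it can gate later steps in a script.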

Experimental salinity level for each strain

SRR | Salinity (Sample #) | Marks (0 PSU, 3 PSU, 30 PSU, Done) | Read Length (F&R) | Number of Sequences (F&R)**
SRR1296786 | SHTV-5_0 (59) | - | 100 -> 50 | 27365859
SRR1296972 | SHTV-5_3 (60) | - Y | 50 | 16785889
SRR1294400 | SHTV-5_30 (61) | - Y | 101 | 20841201
SRR1296793 | SHHI-4_0 (67)* | - | 50 | 23109623
SRR1296794 | SHHI-4_3 (68) | - Y | 50 | 22831746
SRR1296796 | SHHI-4_30 (69) | - Y | 50 | 24488163
SRR1294439 | PAER-2_0 (70) | - X X | 50 | 20622130
SRR1294440 | PAER-2_3 (71) | - Y | 50 | 21274591

Key:

** The total number of sequences in a file was counted with:
zgrep -c '@SRR' SHTV-5_3_2.fastq.gz
  • * (sample 67) = Click on 'All runs'
  • PSU = Practical Salinity Unit; 1 PSU = 1 g of salt per 1000 g of water (Source)
  • SH = Scrippsiella hangoei (Habitat: Sediment, previously known as A. malmogiense)
  • PA = Peridinium aciculiferum (Habitat: Freshwater, previously known as A. aciculiferum)
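
The zgrep count in the key depends on every header line containing '@SRR'. As a cross-check, the sketch below counts records as total lines divided by four. It assumes standard four-line FASTQ records; `count_reads` is a hypothetical helper name, not something the original analysis used.

```shell
# count_reads: hypothetical helper (not from the original project).
# Assumes each FASTQ record occupies exactly four lines
# (header, sequence, '+', quality).
count_reads() {
    local lines
    lines=$(gzip -dc "$1" | wc -l)   # total line count of the decompressed file
    echo $(( lines / 4 ))            # four lines per read
}

# Usage (file name taken from the key above):
# count_reads SHTV-5_3_2.fastq.gz
```

If the two counts disagree, the file may be truncated or may contain multi-line records.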

Data for Analyses

1. SRR (SRA Run Accession)** numbers were obtained from NCBI (Example).

  • SRR numbers correspond to the IDs listed at iMicrobe.
  • FASTQ files, for both forward and reverse reads, were downloaded from ENA (Example).

** You can read about accession types here.
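
For the ENA download step, the sketch below builds the FTP URL for one mate of a run. It rests on assumptions that should be checked against ENA's own file report before use: a 10-character run accession (as in the table above) and ENA's public `vol1/fastq` directory layout. `ena_fastq_url` is a hypothetical helper name.

```shell
# ena_fastq_url: hypothetical helper (not from the original project).
# Assumes a 10-character run accession such as SRR1296786 and the ENA
# layout vol1/fastq/<first 6 chars>/00<last digit>/<accession>/.
ena_fastq_url() {
    local acc=$1 mate=$2 prefix last
    prefix=$(printf '%s' "$acc" | cut -c1-6)   # e.g. SRR129
    last=$(printf '%s' "$acc" | cut -c10)      # final digit of the accession
    echo "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/${prefix}/00${last}/${acc}/${acc}_${mate}.fastq.gz"
}

# Usage: fetch both mates of one run (requires wget and network access).
# for mate in 1 2; do wget "$(ena_fastq_url SRR1296786 "$mate")"; done
```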

<<< Please see analyses.md for information on programs and running processes >>>

File Structure

  1. MMETSP03 folder
  • Analysis

    • fastq folder

      • fastq.gz files as symbolic links pointing to /Data/fastq/SRRCode_1(or SRRCode_2).fastq.gz
    • fastqc folder

      • fastqc analyses with HTML files and fastqc.zip files
    • multiQC (folder)

      • multiqc_report.html

      • multiQC_data (folder): log and stats files from the multiQC run

  • Analysis_2

    • FASTQ files for all samples, labelled with strain names (e.g. SHTV, SHHI)
  • Data

    • fastq folder

      • fastq.gz files, downloaded on 20-21 March
  • assembly
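
The symbolic links in the Analysis fastq folder could be recreated with something like the sketch below. The function name `link_fastq` and both directory arguments are assumptions based on the tree above, not the commands actually used to build it.

```shell
# link_fastq: hypothetical helper (not from the original project).
# Creates one symbolic link in DEST for every .fastq.gz file in SRC.
# SRC should be an absolute path so the links resolve from any directory.
link_fastq() {
    local src=$1 dest=$2 f
    mkdir -p "$dest"
    for f in "$src"/*.fastq.gz; do
        [ -e "$f" ] || continue                # glob matched nothing
        ln -sf "$f" "$dest/$(basename "$f")"   # replace any existing link
    done
}

# Usage (paths follow the layout described above):
# link_fastq /Data/fastq Analysis/fastq
```

Linking rather than copying keeps a single copy of the downloaded reads under Data while the analysis folders reference them.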