yvan edited this page Apr 11, 2014 · 3 revisions

Summary of all the wiki/issues for SMaSH:

Smash wiki

  • Link to the original fast files for:

  • Link to Smash Download and QC

  • Link to Smash Downloadable Material

    • Just links to the amazonaws
  • A summary of which files belong to what sample

  • Methodology we will use for the project.

    1. Downloading BAM files (whole genome sequencing data) from SMaSH data
    2. Extracting reads from coding exon regions (+/- 20 bp)
    3. Extracting truth variants from the same regions as step 2
    4. Converting BAM back to FASTQ files
    5. Running the variant calling pipeline
    6. Evaluating results
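
Step 2 needs a BED file of exon intervals widened by 20 bp on each side. A minimal sketch of that padding with awk, assuming a tab-separated BED input (0-based start in column 2, end in column 3). The function name `pad_bed` is hypothetical, and the sketch neither clamps ends to chromosome lengths nor merges intervals that overlap after padding:

```shell
# pad_bed [PAD] < in.bed > out.bed
# Hypothetical helper: widen every BED interval by PAD bp (default 20) on each side.
# Starts are clamped at 0; ends are NOT clamped to the chromosome length.
pad_bed() {
  awk -v pad="${1:-20}" 'BEGIN { FS = OFS = "\t" }
    /^(#|track|browser)/ { print; next }   # pass header lines through unchanged
    NF < 3 { next }                        # skip blank or malformed lines
    {
      $2 = ($2 - pad < 0) ? 0 : $2 - pad   # widen the start, never below 0
      $3 = $3 + pad                        # widen the end
      print
    }'
}
```

It would be called as `pad_bed 20 < exons.bed > exons_padded.bed`, where both file names are placeholders. The padded file can then be passed to `samtools view -L`.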

## Issues

Summary of the different steps taken for the different files

  • NA19240.bam

    1. `java -Xmx6g -jar ValidateSamFile.jar INPUT=NA19240.bam MODE=SUMMARY \`
       `VALIDATION_STRINGENCY=STRICT`
    
    2. `java -Xmx6g -jar CleanSam.jar INPUT=NA19240.bam OUTPUT=NA19240_cleaned.bam`
      
    3. `java -Xmx6g -jar ValidateSamFile.jar INPUT=NA19240_cleaned.bam MODE=SUMMARY \`        
       `VALIDATION_STRINGENCY=STRICT` 
    
    4. `java -Xmx6g -jar FixMateInformation.jar INPUT=NA19240_cleaned.bam \`
       `OUTPUT=NA19240_cleaned_fixed.bam VALIDATION_STRINGENCY=LENIENT \`
       `TMP_DIR=/Volumes/`
    
    5. `java -Xmx6g -jar ValidateSamFile.jar INPUT=NA19240_cleaned_fixed.bam \`
       `MODE=SUMMARY VALIDATION_STRINGENCY=STRICT`
       
    6. `samtools view -b -L uscs_all_exons_sorted+-20.bed NA19240_cleaned_fixed.bam > NA19240_cleaned_fixed+-20.bam`
    
    7. `java -Xmx6g -jar ValidateSamFile.jar INPUT=NA19240_cleaned_fixed+-20.bam \`
       `MODE=SUMMARY VALIDATION_STRINGENCY=STRICT`
    
    8. `java -Xmx8g -jar FixMateInformation.jar INPUT=NA19240_cleaned_fixed+-20.bam \`
       `OUTPUT=NA19240_cleaned_fixed+-20_posiSrt.bam \`
       `CREATE_INDEX=TRUE VALIDATION_STRINGENCY=STRICT SORT_ORDER=queryname`
    
    9. `java -Xmx8g -jar FixMateInformation.jar INPUT=NA19240_cleaned_fixed+-20.bam \`
       `OUTPUT=NA19240_cleaned_fixed+-20_posiSrt.bam CREATE_INDEX=TRUE \`
       `VALIDATION_STRINGENCY=STRICT SORT_ORDER=coordinate TMP_DIR=Seagate\ Exp/`
    
    10. SamToFastq
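
The same ValidateSamFile call is repeated after each step above, so it can be wrapped in a small function. This is only a sketch: the name `validate_bam` and the `PICARD_DIR` and `DRY_RUN` variables are assumptions introduced here, not part of the original commands.

```shell
# validate_bam FILE
# Run Picard ValidateSamFile in summary mode on FILE, with the same options
# as the steps above.
# PICARD_DIR (directory holding the Picard jars, default ".") and DRY_RUN
# (print the command instead of running it) are assumptions of this sketch.
validate_bam() {
  jar="${PICARD_DIR:-.}/ValidateSamFile.jar"
  # Reuse the positional parameters to hold the full command line.
  set -- java -Xmx6g -jar "$jar" "INPUT=$1" MODE=SUMMARY VALIDATION_STRINGENCY=STRICT
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # show what would be run
  else
    "$@"                 # run the validation
  fi
}
```

Calling `validate_bam NA19240_cleaned_fixed.bam` after a step then replaces retyping the full command, and `DRY_RUN=1` shows the command without running it.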

Error counts reported by ValidateSamFile for each file:

| File | ERROR:INVALID_INDEXING_BIN | ERROR:INVALID_MAPPING_QUALITY | ERROR:MISMATCH_FLAG_MATE_NEG_STRAND | ERROR:MISMATCH_FLAG_MATE_UNMAPPED |
| :----: | :----: | :----: | :----: | :----: |
| NA19240.bam | 58471 | 2009 | 13980965 | 2045 |
| NA19240_cleaned.bam | 58471 | 0 | 13980965 | 2045 |
| NA19240_cleaned_fixed.bam | 0 | 0 | 0 | 0 |
| NA19240_cleaned_fixed+-20.bam | 0 | 0 | 0 | 21667367 |
| NA19240_cleaned_fixed+-20_posiSrt.bam | 0 | 0 | 0 | 21667367 |
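
A row of this kind of table could be generated from a saved ValidateSamFile summary rather than copied by hand. The sketch below assumes the summary is a tab-separated histogram with lines such as `ERROR:INVALID_MAPPING_QUALITY<TAB>2009`; that layout is an assumption and should be checked against the actual output. The function name `summary_row` is hypothetical.

```shell
# summary_row LABEL < summary.txt
# Hypothetical helper: print one markdown table row for the four error types
# tracked above. Error types missing from the summary are reported as 0.
# Assumes tab-separated "ERROR:<TYPE><TAB><COUNT>" lines in the input.
summary_row() {
  awk -F'\t' -v label="$1" '
    /^ERROR:/ { sub(/^ERROR:/, "", $1); count[$1] = $2 }   # remember each count
    END {
      n = split("INVALID_INDEXING_BIN INVALID_MAPPING_QUALITY " \
                "MISMATCH_FLAG_MATE_NEG_STRAND MISMATCH_FLAG_MATE_UNMAPPED", cols, " ")
      row = "| " label " |"
      for (i = 1; i <= n; i++)
        row = row " " ((cols[i] in count) ? count[cols[i]] : 0) " |"
      print row
    }'
}
```

For example, `summary_row NA19240.bam < NA19240.summary.txt` would print the row for the first file, where `NA19240.summary.txt` is a placeholder for the saved ValidateSamFile output.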

  • Contaminated_NA12878
