Skip to content

Latest commit

 

History

History
65 lines (55 loc) · 3.68 KB

README.md

File metadata and controls

65 lines (55 loc) · 3.68 KB

ONT human workflow

General info

Here we are implementing a current best practice workflow for processing human Oxford Nanopore Technologies (ONT) sequencing data implemented by Miles Benton and the Singapore - New Zealand Applications Project Support Team, Oxford Nanopore Technologies. A nextflow pipeline wrapping up these scripts is on the horizon, but since we want to run this code now, this repository serves to adapt the original scripts to run on ESR's production network, including deploying to SLURM. Running this analysis on other compute infrastructures is not supported by the code/documentation in this repository, but is certainly possible and achievable with some bioinformatics knowledge and bit of legwork. See the original work this repository is based on in the ./docs/original_work/workflow_outline/ directory.

Workflow overview

graph TD;
    fast5{fast5 / pod5 / slow5}-->Guppy/Remora/alignment([Guppy + Remora + alignment]);
    Guppy/Remora/alignment-->modbam{bam};
    Guppy/Remora/alignment-->seqsum([sequencing_summary.csv]);
    seqsum-->duplex_tools([duplex-tools]);
    Guppy/Remora/alignment-->fastq{fastq};
    fastq-->duplex_tools;
    fast5-->duplex_tools;
    duplex_tools-->read_pairs([duplex read pairs]);
    read_pairs-->guppy_duplex([guppy_basecaller_duplex]);
    guppy_duplex-->duplex_reads{duplex reads}
    modbam-->Clair3_nextflow;
    GRCh38_ref_genome{reference genome}-->Guppy/Remora/alignment;
    GRCh38_ref_genome{reference genome}-->Clair3_nextflow([wf-human-variation - clair3 + sniffles2]);
    GRCh38_ref_genome{reference genome}-->Whatshap
    modbam{bam}-->Whatshap;
    Whatshap-->modbam_phased{bam - phased};
    Clair3_nextflow-->vcf_phased;
    modbam_phased-->view([genome browser]);
    view-->IGV;
    vcf_phased{vcf}-->Whatshap;
    vcf_phased{vcf}-->variant_annotation([variant annotation]);
    variant_annotation-->VEP;
    variant_annotation-->dbNSFP;
    modbam_phased-->CNV_analysis([CNV analysis]);
    CNV_analysis-->qdna-seq([wf-cnv]);
    qdna-seq-->output(report + qdnaseq ouput);
    modbam_phased-->modbam2bed;
    modbam2bed-->mod_bed.gz;
    GRCh38_ref_genome{reference genome}-->modbam2bed;
    mod_bed.gz{bed}-->methylation_analysis([methylation analysis]);
    Clair3_nextflow-->sniffles2([additional SV analysis - cuteSV, SVIM, SVision]);
    GRCh38_ref_genome{reference genome}-->sniffles2;
    Clair3_nextflow-->SV_data;
    sniffles2-->SV_data{SV data};
    SV_data-->SV_vcf.gz{vcf};
    VEP-->annotated_vcf.gz{vcf};
    dbNSFP-->annotated_vcf.gz;
    annotated_vcf.gz-->SNV_indel_analysis([SNV/indel analysis]);

Analysis workflow run this analysis on the ESR production network

Access

  • The github repository for this project is public
  • For other general queries about this project beyond the bioinformatics, contact the current (2022) representatives of the ESR genomics group, Donia Macartney-Coxson (Donia.Macartney-Coxson@esr.cri.nz) and/or Joep de Ligt (joep.deligt@esr.cri.nz)

Github repository

Find the github repository for this project at: https://github.com/leahkemp/ont_human_workflow