Update to CNV workflow #1

sirselim · 2022-12-13T00:40:43Z

While working on another job recently I had some time to better implement the CNV module/workflow. It's still not great and the main branch needs some more love, but the code below should get you 99% of the way to a working implementation.

#!/bin/bash -l

# check and grab latest wf (if wanted)
nextflow pull epi2me-labs/wf-cnv

# define variables
WKDIR='/data/basecalled/AGRF'
SAMPLE='Diabetes_A'
REFERENCE='/public-data/references/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna'
NFCONFIG='/public-data/configs/nextflow_local_overide.config'
THREADS='48'

# EPI2ME Labs CNV workflow
# nextflow -c "${NFCONFIG}" run epi2me-labs/wf-cnv \
#   -resume \
#   --threads "$THREADS" \
#   -profile standard,local \
#   --fastq "${WKDIR}"/"${SAMPLE}"/pass/ \
#   --sample "${SAMPLE}" \
#   --fasta "${REFERENCE}" \
#   --genome hg38 --bin_size 500 \
#   --map_threads 24


# EPI2ME Labs CNV workflow
nextflow -c "${NFCONFIG}" run epi2me-labs/wf-cnv \
  -resume \
  --threads "$THREADS" \
  -profile standard,local \
  --fastq "${WKDIR}"/"${SAMPLE}"/pass/fastq/ \
  --sample_sheet sample_sheet.csv \
  --fasta "${REFERENCE}" \
  --genome hg38 --bin_size 500 \
  --map_threads 24


# Notes:
# EPI2ME Labs workflow for ONT CNV analysis: https://github.com/epi2me-labs/wf-cnv
# The CNV pipeline runs on the basecalled fastq files: it aligns them against
# the provided reference genome, and then performs CNV analysis using QDNAseq.

# There is currently a fun little issue: the CNV pipeline needs a sample sheet.
# It also seems to ignore uncompressed fastq files, so run:
# for file in *.fastq; do
#   echo "... processing $file ..."
#   bgzip "$file"
#   echo "... done ..."
# done
# NOTE: it also seems that the compressed (.gz) fastq files have to be in a folder called fastq...

# sample sheet looks like this:
# barcode,sample_id,type
# barcode01,Diabetes_A,AGRF_Diabetes
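
The preparation steps from the notes above (compress the fastq files, move them into a folder called fastq, write the sample sheet) could be scripted roughly as follows. This is only a sketch under assumptions, not part of the workflow itself: the function name `prepare_cnv_input` is made up for illustration, it falls back to plain `gzip` when `bgzip` is not installed, and the sample sheet columns simply mirror the example above.

```shell
#!/bin/bash

# Hypothetical helper (not part of wf-cnv): prepare a pass/ directory for the
# fastq entry point and write sample_sheet.csv into the current directory.
# Usage: prepare_cnv_input <pass_dir> <barcode> <sample_id> <type>
prepare_cnv_input() {
  local pass_dir="$1" barcode="$2" sample_id="$3" type="$4"
  local compressor='gzip'

  # prefer bgzip (as used in the notes above) when it is installed
  if command -v bgzip >/dev/null 2>&1; then
    compressor='bgzip'
  fi

  # the workflow seemed to expect the compressed files in a folder called fastq
  mkdir -p "${pass_dir}/fastq"

  local file
  for file in "${pass_dir}"/*.fastq; do
    [ -e "$file" ] || continue   # skip if the glob matched no files
    echo "... processing $file ..."
    "$compressor" "$file"        # replaces $file with $file.gz
    mv "${file}.gz" "${pass_dir}/fastq/"
    echo "... done ..."
  done

  # write the sample sheet in the format shown above
  printf 'barcode,sample_id,type\n%s,%s,%s\n' \
    "$barcode" "$sample_id" "$type" > sample_sheet.csv
}
```

With the variables from the script above, this would be called as `prepare_cnv_input "${WKDIR}/${SAMPLE}/pass" barcode01 Diabetes_A AGRF_Diabetes` before the `nextflow` command.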

leahkemp · 2022-12-14T23:40:27Z

Wicked, thanks for sharing your code Miles. I've implemented it and will test/refine it when I get to that section of the code :)

leahkemp · 2022-12-16T03:06:12Z

5daf211

sirselim · 2023-02-06T23:21:22Z

Reopening this due to a big update in the wf-cnv workflow. You are no longer required to use a sample sheet, and input is no longer limited to fastq files. BAMs are now a valid entry point, which massively speeds up the CNV calling process. Some example code:

#!/bin/bash -l

# check and grab latest wf (if wanted)
nextflow pull epi2me-labs/wf-cnv

# define variables
WKDIR='/data/basecalled/AGRF'
SAMPLE='Control'
REFERENCE='/public-data/references/GRCh38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna'
NFCONFIG='/public-data/configs/nextflow_local_overide.config'
THREADS='48'


# EPI2ME Labs CNV workflow
nextflow -c "${NFCONFIG}" run epi2me-labs/wf-cnv \
  -resume \
  --threads "$THREADS" \
  -profile standard,local \
  --bam "${WKDIR}"/"${SAMPLE}"/bam/"${SAMPLE}"_sorted_merged.hp.bam \
  --sample "${SAMPLE}" \
  --fasta "${REFERENCE}" \
  --genome hg38 --bin_size 500 \
  --map_threads 24


# Notes:
# EPI2ME Labs workflow for ONT CNV analysis: https://github.com/epi2me-labs/wf-cnv
# The CNV pipeline runs on either basecalled fastq files or aligned BAMs,
# and then performs CNV analysis using QDNAseq.
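
Since both fastq files and BAMs are now valid entry points, the input arguments could be chosen at run time. The following is a rough sketch only: the function name `cnv_input_args` is hypothetical, the BAM and fastq locations are assumed to follow the directory layout used in the two scripts above, and only the `--bam` and `--fastq` options already shown in this thread are used.

```shell
#!/bin/bash

# Hypothetical helper (not part of wf-cnv): print the input option and its
# path on separate lines, preferring an existing BAM over the fastq directory.
# Usage: cnv_input_args <wkdir> <sample>
cnv_input_args() {
  local wkdir="$1" sample="$2"
  # assumed layout, copied from the example scripts in this thread
  local bam="${wkdir}/${sample}/bam/${sample}_sorted_merged.hp.bam"
  local fastq_dir="${wkdir}/${sample}/pass/fastq"

  if [ -f "$bam" ]; then
    printf '%s\n' '--bam' "$bam"
  elif [ -d "$fastq_dir" ]; then
    printf '%s\n' '--fastq' "${fastq_dir}/"
  else
    echo "no BAM or fastq directory found for ${sample}" >&2
    return 1
  fi
}
```

The output can be read into an array with `mapfile -t INPUT_ARGS < <(cnv_input_args "$WKDIR" "$SAMPLE")` and passed to the `nextflow` call as `"${INPUT_ARGS[@]}"` in place of the `--bam` line.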

sirselim added the documentation and enhancement labels Dec 13, 2022

sirselim assigned leahkemp Dec 13, 2022

leahkemp closed this as completed Dec 14, 2022

sirselim reopened this Feb 6, 2023
