CUT&RUN is a revolutionary genomic mapping strategy developed by the group of Dr. Steven Henikoff. It builds on Chromatin ImmunoCleavage (ChIC) from Dr. Ulrich Laemmli, in which a fusion of Protein A and/or Protein G to micrococcal nuclease (pAG-MNase) is recruited to selectively cleave antibody-bound chromatin in situ [3]. In CUT&RUN, cells or nuclei are immobilized on a solid support, and the DNA fragments cleaved by pAG-MNase are isolated from solution. The workflow is compatible with next-generation sequencing (NGS) and provides high-resolution, genome-wide profiles of histone post-translational modifications (PTMs) and chromatin-associated proteins (e.g. TFs and chromatin remodelers).
This pipeline was developed by Michael J. Bale, MSc at Weill Cornell Medical College, principally for processing CUT&RUN and ChIP-seq data generated by the lab of Steven Z. Josefowicz, PhD. Although some attempts at generalizing it for wider use were made, some extra hacking might be required to get this pipeline to run on your specific machine and/or server. Use at your own risk, but I'm happy to help if I can!
=================================================
C U T & R U N P I P E L I N E v0.5
=================================================
Author: Michael J. Bale (mib4004@med.cornell.edu)
The typical command for running the pipeline is as follows:
nextflow run michaelbale/JLab-IPSeqFlow --input 'project/*_R{1,2}.fastq.gz' --genome mouse -profile singularity
Mandatory arguments:
--input Path to input data (must be surrounded with quotes)
--genome Name of genome reference (currently supported: mouse -- mm10, human -- hg38)
-profile Name of package manager system (available: docker, singularity, conda).
For the WCM default, singularity is recommended; conda also works, but docker
does not. For minimal, use conda.
Options:
--executorConfig Path to config file that contains specifics for execution.
Will default to WCM SCU-specific parameters. N.B. for
single-threaded running use --executorConfig conf/minimal.config
--singleSample Specifies that the input is a single sample; no PCA graph will be generated
--PCATitle Title to be included in PCA graph; must be surrounded with quotes
--catLanes Tells CnRFlow to take input files and concatenate lanes into single fastq files
--name Project Name; cannot have whitespace characters
--addBEDFilesProfile Path to csv file with info on additional BED files for generating
Sunset-style profile plots; csv format: rName,BEDPath
--addBEDFilesRefPoint Path to csv file with info on additional BED files for generating
Tornado-style region profile plots; csv format: pName,BEDPath,PlusMinus,pLabel
--workDir Name of folder to output concatenated fastq files to (not used unless --catLanes)
--outdir Name of folder to output all results to (Default: results)
--genomeAssets Home directory of where genome-specific files are.
Defaults to /athena/josefowiczlab/scratch/szj2001/JLab-Flow_Genomes
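The two csv formats described for --addBEDFilesProfile and --addBEDFilesRefPoint can be sketched as follows. This is a minimal illustration, not a tested input for the pipeline: the region names, BED paths, and file names are hypothetical, the reading of PlusMinus as a flank size in bp is an assumption, and whether the pipeline expects a header row is not stated in the help text (none is written here).

```shell
#!/usr/bin/env bash
# Sketch only: region names, BED paths, and the flank value are hypothetical.
set -euo pipefail

# Two columns per row: rName,BEDPath (for --addBEDFilesProfile).
printf '%s,%s\n' \
  "promoters" "/path/to/promoters.bed" \
  "enhancers" "/path/to/enhancers.bed" \
  > profile_beds.csv

# Four columns per row: pName,BEDPath,PlusMinus,pLabel (for --addBEDFilesRefPoint).
# PlusMinus is ASSUMED here to be a flank size in bp around the reference point.
printf '%s,%s,%s,%s\n' \
  "tss" "/path/to/tss.bed" "2000" "TSS" \
  > refpoint_beds.csv

# Succeeds only if every row has the expected number of comma-separated fields.
check_columns() {  # usage: check_columns FILE EXPECTED_FIELD_COUNT
  awk -F',' -v n="$2" 'NF != n { bad = 1 } END { exit bad }' "$1"
}

check_columns profile_beds.csv 2
check_columns refpoint_beds.csv 4
```

The resulting files would then be passed as --addBEDFilesProfile profile_beds.csv and --addBEDFilesRefPoint refpoint_beds.csv alongside the mandatory arguments.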
Requirements:
nextflow v20 or higher
java 8+
Some combination of singularity, docker, or conda. If using -profile debug, all programs specified in the environment.yml file must be installed and available on your PATH.
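Before a first run, it may help to confirm which of these requirements are visible in the current shell. The sketch below only reports presence; it does not check versions (nextflow v20 or java 8+), and the helper name `have` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: reports which of the listed prerequisites the shell can find.

# Print "found" or "missing" for a command name.
have() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

# Always required.
for tool in nextflow java; do
  echo "$tool: $(have "$tool")"
done

# At least one of these is needed to match the chosen -profile.
backends=0
for tool in singularity docker conda; do
  status=$(have "$tool")
  echo "$tool: $status"
  if [ "$status" = "found" ]; then
    backends=$((backends + 1))
  fi
done
echo "execution backends available: $backends"
```

If conda is set up only as a shell function in interactive sessions, it may be reported as missing when this is run as a script.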