CRISPR_Screen Instructions, UMass Cluster

CRISPRi/a/ko Screen Data Generation Tips

Most sgRNA libraries are housed in the LentiCRISPRv1 or v2 backbones.

i. Guides are inserted into type IIS restriction sites (see below) downstream of the U6 promoter.

ii. Transduce with pool, proceed with positive or negative selection based on desired phenotype, and isolate gDNA from your filtered population. Pre-selection cells are used as a control.

iii. Sequencing libraries are made in a 1- or 2-step PCR that primes the guide-containing region of the lenti insert. See below for priming sites. Template + product ≈ 350 bp.

iv. Primers contain the whole Illumina adapter needed for sequencing.

The P5 primer (fwd) is usually not barcoded. It is ordered as a pool of primers staggered by 1 nt to increase library complexity. Read 1 uses this primer.
The P7 primer (rev) contains the barcode (index 1).

v. I recommend a NextSeq 500/550 75-cycle High Output kit (or a NextSeq 2000 P3 50-cycle kit) for sequencing multiple libraries. Minimum coverage should be (number of guides) × 200 reads. Allocate cycles as 75 | 8 | 0 | 0 (R1 | I1 | I2 | R2). You’re only sequencing from P5 on the U6 end.
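The coverage rule above is simple arithmetic, so it can be sketched as a small shell helper. This is only an illustration: the function name `min_reads`, the 20,000-guide library size, and the count of 8 pooled libraries are assumptions, and it assumes the 200× figure applies to each library separately.

```shell
#!/usr/bin/env bash
# Minimum reads = number of guides x per-guide coverage.
# The default of 200 follows the rule of thumb in the text.
min_reads() {
    local n_guides=$1
    local coverage=${2:-200}
    echo $(( n_guides * coverage ))
}

# Hypothetical 20,000-guide library at 200x
min_reads 20000                      # prints 4000000

# Hypothetical run with 8 such libraries pooled together
echo $(( $(min_reads 20000) * 8 ))   # prints 32000000
```

Compare the total against the read output of the kit you plan to use before pooling.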

vi. Run the sequencer in Manual mode, opting for monitoring AND storage on BaseSpace. Upload a sample sheet saved as CSV (the SampleSheet.csv in this repository is for the NextSeq 500/550 and MiniSeq; the NextSeq 2000 uses a different sample sheet format). The sheet corresponds to the primers above. The experiment name should be unique. You’ll need it to download your FASTQs.

CRISPR Library Data Analysis

Consult the relevant publication before processing to see whether your libraries are compatible with MAGeCK. It should work with all the lenti backbones. It has a published protocol, but I don’t find it particularly helpful.

You need a 3-column sgRNA library table. ID should be unique. Check the supplementary information of the article describing your library. They should have a table like this. Get these three columns, name the columns, save as “library.csv” and upload to the cluster.
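Before uploading, it can help to confirm that the table really has three columns and unique IDs. The check below is a minimal sketch, not part of MAGeCK. The function name `check_library` and the demo rows are hypothetical, and it assumes a comma-separated file with the sgRNA ID in the first column.

```shell
#!/usr/bin/env bash
# Report rows that do not have exactly 3 comma-separated fields
# and IDs (column 1) that appear more than once.
# Returns non-zero if any problem is found.
check_library() {
    awk -F',' '
        NF != 3    { printf "line %d: expected 3 columns, found %d\n", NR, NF; bad = 1 }
        seen[$1]++ { printf "line %d: duplicate ID %s\n", NR, $1; bad = 1 }
        END        { exit (bad ? 1 : 0) }
    ' "$1"
}

# Demo on a throwaway file with made-up guides (hypothetical content)
demo=$(mktemp)
printf 'id,sequence,gene\nsg1,ACGTACGTACGTACGTACGT,GENE_A\nsg2,TTGCATTGCATTGCATTGCA,GENE_A\n' > "$demo"
check_library "$demo" && echo "library looks OK"
rm -f "$demo"
```

On the real file, run `check_library library.csv` and fix any reported lines before counting.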

Log in to the cluster and start an interactive job:

#Here is a job with 15 cores, 30G RAM

bsub -n 15 -R "span[hosts=1]" -R "rusage[mem=2048]" -W 4:00 -q interactive -Is bash

Get your sequencing data onto the cluster. I recommend the Illumina command-line utility. Follow the setup instructions if this is your first time. Find your project and download it.

#Find your project and id. It’s the experiment name above
bin/bs list projects

# Download and name output folder
bin/bs download projects -i 378654281 -o CRISPRScreen_output

Your files should be named something like “Sample01_S1_L001_R1_001.fastq.gz”. If you used a NextSeq, you’ll need to combine files from all 4 lanes before proceeding. This isn’t an issue on 1-lane machines like the MiniSeq, or with FASTQs from outside vendors.

cd CRISPRScreen_output
#Schedule one job per sample to merge the lanes (all one line)
for i in *L001_*gz; do bsub -n 4 -R "span[hosts=1]" -R "rusage[mem=1024]" -oo merge_%J.out -eo merge_%J.err -W 3:59 -q short -J "$i" "zcat ${i%_L00*}_L00*.gz > ${i%_S*}_R1.fq && gzip ${i%_S*}_R1.fq";done
#Remove input FASTQs, but only after every merge job has finished (check with bjobs)
rm *L00*.gz
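The merge loop relies on two bash suffix-stripping expansions, which are easy to misread. The sketch below shows what each one produces. It assumes file names of the form `<sample>_S<n>_L00<lane>_R1_001.fastq.gz`, and the helper names `lane_prefix` and `merged_name` are made up for illustration.

```shell
#!/usr/bin/env bash
# ${f%_L00*} drops the lane number and everything after it,
# leaving the prefix shared by all lanes of one sample.
lane_prefix() {
    local f=$1
    echo "${f%_L00*}"
}

# ${f%_S*} drops the sample-sheet index and everything after it,
# leaving the bare sample name used for the merged file.
merged_name() {
    local f=$1
    echo "${f%_S*}_R1.fq.gz"
}

lane_prefix Sample01_S1_L001_R1_001.fastq.gz   # prints Sample01_S1
merged_name Sample01_S1_L001_R1_001.fastq.gz   # prints Sample01_R1.fq.gz
```

Running these two helpers on one of your own file names is a quick dry run before submitting the merge jobs.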

MAGeCK isn’t on the cluster as a module, so I made a Docker image compatible with Singularity. Enter the image and start the software. The first run takes a couple of minutes because the container is downloaded from Docker and converted to a Singularity image.

singularity shell docker://umasstr/mageckvispr:latest
source activate mageck-vispr

Align and count sgRNAs for all samples. Make sure you have your library file from step 1.

# mageck count -l <library csv file> -n <name of analysis> --sample-label <comma separated sample names>  --fastq <space separated FASTQs>
mageck count -l library.csv -n Screen --sample-label Sample1,Sample2,Sample3,Sample4,Sample5 --fastq Sample1_R1.fq.gz Sample2_R1.fq.gz Sample3_R1.fq.gz Sample4_R1.fq.gz Sample5_R1.fq.gz
#Your output is:
# Screen.count.txt
# Screen.count_normalized.txt
# Screen.count_report.Rmd
# Screen.countsummary.txt
# Screen.log
# Screen_countsummary.R
# Screen_countsummary.Rnw
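With many samples, typing the label and FASTQ lists by hand is error-prone because the two lists must be in the same order. The sketch below builds both from the merged file names and prints the command instead of running it. The function name `build_count_cmd` is hypothetical, and it assumes merged files named `<sample>_R1.fq.gz`, as produced by the lane-merging step.

```shell
#!/usr/bin/env bash
# Print a mageck count command whose sample labels are derived
# from the FASTQ file names, so labels and files stay in step.
build_count_cmd() {
    local lib=$1 name=$2
    shift 2
    local labels="" f
    for f in "$@"; do
        # Strip the _R1.fq.gz suffix and any directory to get the label
        labels+="${labels:+,}$(basename "${f%_R1.fq.gz}")"
    done
    echo "mageck count -l $lib -n $name --sample-label $labels --fastq $*"
}

# Dry run with two of the files from the example above
build_count_cmd library.csv Screen Sample1_R1.fq.gz Sample2_R1.fq.gz
```

Inspect the printed command, then paste it into the MAGeCK environment to run it. Inside the data folder, `build_count_cmd library.csv Screen *_R1.fq.gz` covers every merged sample.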

Now, run the comparisons between your experimental and control samples. Here, Sample1 is experimental and Sample2 is a control.

# mageck test -k <count table> -t <EXP> -c <CTRL> -n <name output>
# <EXP> and <CTRL> must match the labels given to --sample-label in mageck count
mageck test -k Screen.count.txt -t Sample1 -c Sample2 -n 01vs02

The output file “01vs02.gene_summary.txt” contains enrichment information for the genes targeted by your guides, ranked for both positive and negative selection.
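To pull the top hits out of the summary, one option is to sort on a rank column. The sketch below is an illustration rather than part of MAGeCK. The function name `top_genes` is hypothetical, and it assumes a tab-separated file with a header row, the gene ID in the first column, and rank columns named `pos|rank` and `neg|rank`. Check the header of your own file before relying on those names.

```shell
#!/usr/bin/env bash
# Print the n rows with the smallest value in the named column,
# as "gene<TAB>value". The column is looked up by header name,
# so its position in the file does not matter.
top_genes() {
    local file=$1 col=$2 n=${3:-10}
    awk -F'\t' -v col="$col" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            if (!c) { print "column not found: " col > "/dev/stderr"; exit 1 }
            next
        }
        { print $1 "\t" $c }
    ' "$file" | sort -t $'\t' -k2,2g | head -n "$n"
}

# Usage (uncomment once the file exists):
# top_genes 01vs02.gene_summary.txt 'pos|rank' 20   # positively selected
# top_genes 01vs02.gene_summary.txt 'neg|rank' 20   # negatively selected
```

Quote the column name, since the `|` character would otherwise be read by the shell as a pipe.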
