data_files
.gitignore
CRISPRiaDesign_example_notebook.ipynb
CRISPRiaDesign_example_notebook.md
Library_design_walkthrough.ipynb
Library_design_walkthrough.md
README.md
sgRNA_learning.py


CRISPRiaDesign

This repository hosts the sgRNA machine learning scripts used to generate the Weissman lab's next-generation CRISPRi and CRISPRa library designs (Horlbeck et al., eLife 2016). They are currently implemented as interactive scripts, accompanied by IPython notebooks that give step-by-step instructions for creating new sgRNA libraries. Future plans include adding command line functions to make library design more user-friendly.

Note that all sgRNA designs for the CRISPRi/a human and mouse protein-coding gene libraries are included as supplementary tables in the eLife paper, so to clone individual sgRNAs or construct custom sublibraries targeting protein-coding genes you can simply refer to those tables. These scripts are primarily useful for designing sgRNAs that target novel or non-coding genes, or for organisms other than human and mouse.

To apply the exact quantitative models used to generate the CRISPRi-v2 or CRISPRa-v2 libraries, follow the steps outlined in the Library_design_walkthrough (included as a Jupyter notebook or web page).

To see full example code for de novo machine learning, prediction of sgRNA activity for desired loci, and construction of new genome-scale CRISPRi/a libraries, see the CRISPRiaDesign_example_notebook (included as a Jupyter notebook or web page).

Dependencies

External command line applications required:

  • ViennaRNA
  • Bowtie (not Bowtie2)
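
Before opening the notebooks, it can help to confirm that both external applications are reachable from your shell. The following is a minimal sketch, not part of this repository: the helper name `find_missing_tools` is hypothetical, and the executable names are assumptions (`RNAfold` is installed by ViennaRNA; `bowtie` and `bowtie-build` are installed by Bowtie 1). Adjust the list if your installation or the scripts use different names.

```python
import shutil

# Assumed executable names (not taken from this repository):
# RNAfold is installed by ViennaRNA; bowtie and bowtie-build by Bowtie 1.
REQUIRED_TOOLS = ["RNAfold", "bowtie", "bowtie-build"]


def find_missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH.

    `which` is injectable so the check can be exercised without
    installing anything; by default it is shutil.which.
    """
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required command line tools were found.")
```

Note that this only checks that an executable called `bowtie` exists; it does not verify the installed version.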

Large genomic data files required:

The links below point to the human genome files used for the hCRISPRi-v2 and hCRISPRa-v2 machine learning. These files are required for the Library_design_walkthrough, but any organism or assembly may be used to design new libraries or for de novo machine learning. For convenience, the files that Library_design_walkthrough references in the folder "large_data_files" are also available here.

  • Genome sequence as FASTA (hg19)
  • FANTOM5 TSS annotation as BED (TSS_human)
  • Chromatin data as BigWig (MNase, DNase, FAIRE-seq)
  • HGNC table of gene aliases (not strictly required for the Library_design_walkthrough but useful in some steps)
  • Ensembl annotation as GTF (not strictly required for the Library_design_walkthrough but useful in some steps and in other functions; release 74 used for the published library designs)
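
Once the files are downloaded, a quick check that they are in place can save a failed notebook run. The sketch below is an illustration under stated assumptions, not code from this repository: the helper `find_missing_files` is hypothetical, and every file name in `EXPECTED_FILES` is a placeholder that you should replace with the names the Library_design_walkthrough actually references.

```python
from pathlib import Path

# Placeholder file names for illustration only; substitute the names
# used in the Library_design_walkthrough.
EXPECTED_FILES = {
    "genome FASTA": "genome.fa",
    "TSS annotation BED": "tss_annotation.bed",
    "MNase BigWig": "mnase.bigWig",
    "DNase BigWig": "dnase.bigWig",
    "FAIRE-seq BigWig": "faire.bigWig",
}


def find_missing_files(data_dir, expected):
    """Return the descriptions of expected files absent from data_dir.

    `expected` maps a human-readable description to a file name that
    should exist directly inside `data_dir`.
    """
    data_dir = Path(data_dir)
    return [
        description
        for description, file_name in expected.items()
        if not (data_dir / file_name).is_file()
    ]


if __name__ == "__main__":
    # "large_data_files" is the folder name mentioned in this README.
    missing = find_missing_files("large_data_files", EXPECTED_FILES)
    if missing:
        print("Missing data files: " + ", ".join(missing))
    else:
        print("All expected data files are present.")
```

The HGNC table and Ensembl GTF are left out of the placeholder mapping because they are not strictly required for the walkthrough; add them if you use the steps that rely on them.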