CRISPRiaDesign

This repository hosts the sgRNA machine learning scripts used to generate the Weissman lab's next-generation CRISPRi and CRISPRa library designs (Horlbeck et al., eLife 2016). They are currently implemented as interactive scripts accompanied by Jupyter (IPython) notebooks with step-by-step instructions for creating new sgRNA libraries. Future plans include adding command line functions to make library design more user-friendly. Note that all sgRNA designs for the CRISPRi/a human and mouse protein-coding gene libraries are included as supplementary tables in the eLife paper, so anyone cloning individual sgRNAs or constructing custom sublibraries that target protein-coding genes can simply refer to those tables. These scripts are primarily useful for designing sgRNAs that target novel or non-coding genes, or for organisms other than human and mouse.

To apply the exact quantitative models used to generate the CRISPRi-v2 or CRISPRa-v2 libraries, follow the steps outlined in the Library_design_walkthrough (included as a Jupyter notebook or web page).

For full example code covering de novo machine learning, prediction of sgRNA activity for desired loci, and construction of new genome-scale CRISPRi/a libraries, see the CRISPRiaDesign_example_notebook (included as a Jupyter notebook or web page).

### Dependencies

External command line applications required:

  • ViennaRNA
  • Bowtie (not Bowtie2)
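
Before running the notebooks, it can help to confirm that both command line tools are reachable from Python. The following is a minimal sketch, not part of this repository: the executable names `RNAfold` (ViennaRNA) and `bowtie` (Bowtie 1) are assumptions about a typical installation, and `find_missing_tools` is a hypothetical helper name.

```python
import shutil

# Assumed executable names: ViennaRNA usually installs `RNAfold`,
# and Bowtie 1 (not Bowtie2, whose binary is `bowtie2`) installs `bowtie`.
REQUIRED_TOOLS = {"ViennaRNA": "RNAfold", "Bowtie": "bowtie"}


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the names of tools whose executable is not found on PATH."""
    return [name for name, exe in tools.items() if shutil.which(exe) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All external tools were found on PATH.")
```

If a tool is installed under a different name or outside `PATH`, adjust the dictionary accordingly.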

Large genomic data files required:

The links point to the human genome files used for the hCRISPRi-v2 and hCRISPRa-v2 machine learning, which are also required for the Library_design_walkthrough. Any organism or assembly may be used when designing new libraries or performing de novo machine learning. For convenience, the files that Library_design_walkthrough references in the "large_data_files" folder are also available here.

  • Genome sequence as FASTA (hg19)
  • FANTOM5 TSS annotation as BED (TSS_human)
  • Chromatin data as BigWig (MNase, DNase, FAIRE-seq)
  • HGNC table of gene aliases (not strictly required for the Library_design_walkthrough but useful in some steps)
  • Ensembl annotation as GTF (not strictly required for the Library_design_walkthrough but useful in some steps and in other functions; release 74 used for the published library designs)