Genome-wide functional screens enable the prediction of high activity CRISPR-Cas9 and -Cas12a guides in Yarrowia lipolytica



  • python 3.6.10
  • keras 2.4.3
  • tensorflow 2.2.0
  • scipy 1.4.1
  • scikit-learn 0.23.1
  • numpy 1.18.5
  • pandas 1.0.5

Installation Guide

The easiest way to get the prerequisites needed to run DeepGuide is through Anaconda. If you already have Anaconda installed, you can skip this step; otherwise, consult the Anaconda documentation to learn how to install conda on your system. We used Anaconda version 4.8.5; later versions are expected to work. Once Anaconda is installed, run the following command to create an environment with the requirements for DeepGuide:

conda create -n deepguide python=3.6.10 ipykernel matplotlib pandas=1.0.5 numpy=1.18.5 scipy=1.4.1 tensorflow=2.2.0 keras=2.4.3 scikit-learn=0.23.1 biopython=1.71

Running the software

For Cas12a

Assuming you have installed the prerequisites in a conda environment called deepguide, you can run the software for Cas12a guides using the following commands:

git clone
cd DeepGuide/src/
conda activate deepguide
python path_of_fasta_file

See data/seq_sample.fasta for the expected FASTA format. Each sequence must contain at least 32 nucleotides to cover the full target site, laid out as follows:

context (1nt) -- PAM (4nt, TTTV) -- target (25nt) -- context (2nt)
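The layout above can be checked before running the tool. The following is a minimal sketch, not part of DeepGuide: the helper name `split_cas12a_site` and the dictionary keys are hypothetical, and it assumes the window is exactly 32 nt and that V follows the IUPAC convention (A, C, or G).

```python
import re

# Layout from the README:
# context (1 nt) + PAM (4 nt, TTTV) + target (25 nt) + context (2 nt)
CAS12A_SITE_LENGTH = 1 + 4 + 25 + 2  # 32 nt in total
CAS12A_PAM = re.compile(r"TTT[ACG]")  # V = A, C, or G (IUPAC code)


def split_cas12a_site(site):
    """Split a 32-nt window into its parts (hypothetical helper)."""
    site = site.upper()
    if len(site) != CAS12A_SITE_LENGTH:
        raise ValueError(
            "expected %d nt, got %d" % (CAS12A_SITE_LENGTH, len(site))
        )
    pam = site[1:5]
    if not CAS12A_PAM.fullmatch(pam):
        raise ValueError("PAM %r does not match TTTV" % pam)
    return {
        "left_context": site[0],     # 1 nt before the PAM
        "pam": pam,                  # 4 nt, TTTV
        "target": site[5:30],        # 25 nt
        "right_context": site[30:],  # 2 nt after the target
    }
```

Calling `split_cas12a_site` on each FASTA record before prediction would surface windows that are too short or lack a TTTV PAM; how DeepGuide itself handles such input is not described here.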

You will get an output file called activity_score_cas12a.csv in the data directory. This file contains the cutting score predicted by DeepGuide for each guide.

For Cas9 (Sequence + Nucleosome Occupancy as Input)

You can run DeepGuide to get prediction scores for Cas9 guides from the guide sequences and their nucleosome occupancy using the following commands:

git clone
cd DeepGuide/src/
conda activate deepguide
python path_of_fasta_file path_of_NucleosomeOccupancy_file

See data/seq_sample.fasta for the expected FASTA format. Each sequence must contain at least 28 nucleotides to cover the full target site. See data/nu_sample.csv for the nucleosome occupancy file. Each number in that file is the nucleosome occupancy at one nucleotide position of the FASTA file, so the total number of nucleosome occupancy values must equal the total number of nucleotides in the FASTA file.

context (2nt) -- target (20nt) -- PAM (3nt, NGG) --  context (3nt)
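Both constraints (the 28-nt layout and the matching number of occupancy values) can be checked up front. The sketch below is illustrative only and is not part of DeepGuide: the helper names are hypothetical, and it works on sequences and occupancy values that have already been read into Python lists, since the exact layout of the occupancy file is not described here.

```python
# Layout from the README:
# context (2 nt) + target (20 nt) + PAM (3 nt, NGG) + context (3 nt)
CAS9_SITE_LENGTH = 2 + 20 + 3 + 3  # 28 nt in total


def split_cas9_site(site):
    """Split a 28-nt window into its parts (hypothetical helper)."""
    site = site.upper()
    if len(site) != CAS9_SITE_LENGTH:
        raise ValueError(
            "expected %d nt, got %d" % (CAS9_SITE_LENGTH, len(site))
        )
    pam = site[22:25]
    if pam[1:] != "GG":  # N can be any nucleotide
        raise ValueError("PAM %r does not match NGG" % pam)
    return {
        "left_context": site[:2],    # 2 nt before the target
        "target": site[2:22],        # 20 nt
        "pam": pam,                  # 3 nt, NGG
        "right_context": site[25:],  # 3 nt after the PAM
    }


def occupancy_matches(sequences, occupancy_values):
    """Return True if there is one occupancy value per nucleotide."""
    total_nucleotides = sum(len(seq) for seq in sequences)
    return total_nucleotides == len(occupancy_values)
```

A mismatch reported by `occupancy_matches` usually means the two files were generated from different sets of guides, which is worth resolving before running the prediction.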

For Cas9 (only Sequence as Input)

To get prediction scores for Cas9 guides using the sequence only, use the following commands:

git clone
cd DeepGuide/src/
conda activate deepguide
python path_of_fasta_file

Example run

For Cas12a

git clone
cd DeepGuide/src/
conda activate deepguide
python ../data/seq_sample.fasta

This produces an output file called activity_score_cas12a.csv in the data directory. The output has the following columns:

Guide Score
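Once the scores are written, a common next step is to rank the guides. The sketch below is one way to do that with the standard library; it assumes the file is comma-separated with a header row naming the two columns `Guide` and `Score`, and that a higher score means higher predicted activity. The function name `rank_guides` is hypothetical and not part of DeepGuide.

```python
import csv


def rank_guides(lines, top_n=10):
    """Return the top_n (guide, score) pairs, highest score first.

    `lines` is any iterable of text lines, such as an open file.
    Assumes a header row with the columns "Guide" and "Score".
    """
    reader = csv.DictReader(lines)
    rows = [(row["Guide"], float(row["Score"])) for row in reader]
    rows.sort(key=lambda pair: pair[1], reverse=True)
    return rows[:top_n]
```

From DeepGuide/src/, this could be used as `with open("../data/activity_score_cas12a.csv", newline="") as handle: best = rank_guides(handle)`. If the actual file uses a different delimiter or different column names, adjust the reader accordingly.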

For Cas9 (Sequence + Nucleosome Occupancy as Input)

git clone
cd DeepGuide/src/
conda activate deepguide
python  ../data/seq_sample.fasta  ../data/nu_sample.csv

This produces an output file called activity_score_cas9.csv in the data directory. The format of the output is the same as before:

Guide Score

For Cas9 (only Sequence as Input)

git clone
cd DeepGuide/src/
conda activate deepguide
python  ../data/seq_sample.fasta

This produces an output file called activity_score_cas9_seq.csv in the data directory. The format of the output is the same as above.

For cas9(only Sequence as Input)

If you need the version that generated the scores used for Figure 5 in the paper, download that older version from DeepGuide_v0.9/saved_weights.


If you use this tool in a publication, please cite:

Genome-wide functional screens enable the prediction of high activity CRISPR-Cas9 and -Cas12a guides in Yarrowia lipolytica. Dipankar Baisya, Adithya Ramesh, Cory Schwartz, Stefano Lonardi, and Ian Wheeldon. Nature Communications, 2022.