
# 🧬 CRISPR Base Editing Practical with BEstimate

Welcome! This practical will walk you through:

✅ Preparing your environment (in Colab)  

✅ Preparing the input and running BEstimate

✅ Interpreting the results

✅ Library design  

---



## 1️⃣ Pre-Preparation (Run This Before the Practical)

> **Important:** Run the two cells below to install everything you will need for the training.


In [3]:
import os

# Install BEstimate directly from GitHub
!git clone https://github.com/CansuDincer/BEstimate.git
os.chdir("/content/BEstimate/")


Cloning into 'BEstimate'...
remote: Enumerating objects: 826, done.
remote: Counting objects: 100% (30/30), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 826 (delta 11), reused 16 (delta 7), pack-reused 796 (from 2)
Receiving objects: 100% (826/826), 5.96 MiB | 16.71 MiB/s, done.
Resolving deltas: 100% (466/466), done.


In [4]:
!pip3 install -r /content/BEstimate/requirements.txt

Collecting pandas==2.2.3 (from -r /content/BEstimate/requirements.txt (line 1))
  Downloading pandas-2.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (89 kB)
Collecting argparse==1.1 (from -r /content/BEstimate/requirements.txt (line 2))
  Downloading argparse-1.1.zip (151 kB)
  Preparing metadata (setup.py) ... done
Collecting numpy==2.2.5 (from -r /content/BEstimate/requirements.txt (line 4))
  Downloading numpy-2.2.5-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB)
Collecting biopython==1.85 (from -r /content/BEstimate/requirements.txt (line 5))
  

**Please restart your session so that the newly installed packages are loaded!**

**Important:** The genome index file will be provided separately, because downloading and indexing the genome during the practical session is not feasible. You can, however, build the index on your own in a Linux environment with the `x_genome.py` command shown below.

In [1]:
import os, pandas

In [3]:
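# Make an output folder inside the content directory (left commented out, as in the original run)
#os.makedirs("/content/output/", exist_ok=True)
# Make the off-target folder; exist_ok avoids an error if this cell is re-run
os.makedirs("/content/BEstimate/offtargets/", exist_ok=True)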

# Change the path to inside BEstimate folder
os.chdir("/content/BEstimate/BEstimate/")

In [50]:
#!python3 x_genome.py --pamseq NGG --assembly GRCh38 --ensembl_version 113

## 2️⃣ Designing gRNAs for Base Editors

🧬 To find the most appropriate gRNA for our experiments, we should decide:

1. The length of the protospacer and the PAM sequence
  - The protospacer is typically 20 nucleotides long.
  - PAM sequences are more variable; the most frequently used ones are NGG and NGN.
2. The position of the activity window
  - The activity window typically spans nucleotides 4-8 or 3-9 of the protospacer.
3. The editable nucleotides
  - CBE or ABE
  - For a novel base editor, you can specify any nucleotide change.
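To make the activity window concrete, the sketch below lists the protospacer positions (1-based, counted from the PAM-distal end) that carry an editable nucleotide inside a given window. The helper name `editable_positions` is hypothetical and not part of BEstimate; the example protospacer is one of the *SRY* guides that appears in the results later in this practical.

```python
def editable_positions(protospacer, edit="A", actwin=(4, 8)):
    """Return the 1-based protospacer positions inside the activity
    window that carry the editable nucleotide."""
    start, end = actwin
    return [
        pos
        for pos, nucleotide in enumerate(protospacer.upper(), start=1)
        if start <= pos <= end and nucleotide == edit
    ]


# ABE (A>G) with a 4-8 activity window on a 20-nt protospacer
positions = editable_positions("GTAAAATAAGTTTCGAACTC", edit="A", actwin=(4, 8))
print(positions)  # [4, 5, 6, 8]
```

The four positions agree with the `# Edits/guide` value of 4 that the *SRY* edit table reports for this guide.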

Once the base editor parameters are set, decide which gene you are interested in:

1. HUGO symbol of the gene
2. (Optionally) Ensembl transcript ID
3. (Optionally) UniProt ID
4. Any variants you want to incorporate (in HGVS notation)



## 3️⃣ Running BEstimate on Example Genes

Let's design base editor guides for *SRY* as practice.


### **Mutagenesis of the *SRY* gene**

In [17]:
# Run BEstimate with example input
!python3 BEstimate.py -gene SRY -assembly GRCh38 -pamseq NGG -pamwin 21-23 -actwin 4-8 -protolen 20 -edit A -edit_to G -vep -o /content/output/ -ofile SRY_ABE_NGG



--------------------------------------------------------------                                                                                         
		   B E s t i m a t e                                      

	       Wellcome Sanger Institute          

--------------------------------------------------------------
    

The given arguments are:
Gene: SRY
Assembl: GRCh38
Ensembl transcript ID: None
Uniprot ID: None
PAM sequence: NGG
PAM window: 21-23
Protospacer length: 20
Activity window: 4-8
Nucleotide change: A>G
VEP and Uniprot analysis: True
Mutation on genome: 
Off target analysis: False



-------------------------------------------------------------- 
		Ensembl Gene Information
-------------------------------------------------------------- 
    

Request to Ensembl REST API for Ensembl Gene ID:
>ENSG00000184895.8 chromosome:GRCh38:Y:2786855:2787682:-1
Ensembl Gene ID: ENSG00000184895

Request to Ensembl REST API for sequence information
The location of the interested gene
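If you plan to run several genes or editors, it can help to assemble the command in Python first. The sketch below is a hypothetical helper (`build_bestimate_command` is not part of BEstimate) that uses only the flags shown in the command above. It assumes that the PAM directly follows the protospacer, so the PAM window is derived from the protospacer length and the PAM length.

```python
import shlex


def build_bestimate_command(gene, out_name, pamseq="NGG", protolen=20,
                            actwin=(4, 8), edit="A", edit_to="G", vep=False,
                            assembly="GRCh38", out_dir="/content/output/"):
    """Assemble a BEstimate command line from the flags used in this practical."""
    # Assumption: the PAM sits directly after the protospacer, so a 20-nt
    # protospacer with a 3-nt PAM gives the PAM window 21-23.
    pam_start = protolen + 1
    pam_end = protolen + len(pamseq)
    args = [
        "python3", "BEstimate.py",
        "-gene", gene,
        "-assembly", assembly,
        "-pamseq", pamseq,
        "-pamwin", f"{pam_start}-{pam_end}",
        "-actwin", f"{actwin[0]}-{actwin[1]}",
        "-protolen", str(protolen),
        "-edit", edit,
        "-edit_to", edit_to,
    ]
    if vep:
        args.append("-vep")
    args += ["-o", out_dir, "-ofile", out_name]
    return shlex.join(args)


command = build_bestimate_command("SRY", "SRY_ABE_NGG", vep=True)
print(command)
```

The printed string reproduces the *SRY* command used above and can be pasted after `!` in a Colab cell.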

### **Reverting the sickle cell disease-associated variant**

Sickle cell disease is caused by a mutation in the β-globin gene (*HBB*): g.5227002A>T in GRCh38, resulting in p.Glu7Val.

In [4]:
# Generate a mutation file containing the variant in HGVS notation
with open("/content/sicle_cell_variant.txt", "w") as f:
    f.write("11:g.5227002A>T")
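Before writing a variant to the mutation file, it can be worth checking that the string is well formed. The sketch below is an illustrative check, not BEstimate's own parser: the pattern and the function name `parse_genomic_snv` are assumptions, and it only covers single-nucleotide substitutions written as `chromosome:g.positionRef>Alt`, the form used in this practical.

```python
import re

# Matches simple genomic substitutions such as "11:g.5227002A>T"
HGVS_SNV = re.compile(
    r"^(?P<chrom>[^:]+):g\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_genomic_snv(hgvs):
    """Split a genomic substitution into (chromosome, position, ref, alt)."""
    match = HGVS_SNV.match(hgvs.strip())
    if match is None:
        raise ValueError(f"Not a simple genomic substitution: {hgvs!r}")
    return match["chrom"], int(match["pos"]), match["ref"], match["alt"]


variant = parse_genomic_snv("11:g.5227002A>T")
print(variant)  # ('11', 5227002, 'A', 'T')
```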

In [15]:
# Run BEstimate with example input
!python3 BEstimate.py -gene HBB -assembly GRCh38 -transcript ENST00000335295 -mutation_file /content/sicle_cell_variant.txt -pamseq NGN -pamwin 21-23 -actwin 3-9 -protolen 20 -edit A -edit_to G -o /content/output/ -ofile HBB_variant_specific_ABE_NGN



--------------------------------------------------------------                                                                                         
		   B E s t i m a t e                                      

	       Wellcome Sanger Institute          

--------------------------------------------------------------
    

The given arguments are:
Gene: HBB
Assembl: GRCh38
Ensembl transcript ID: ENST00000335295
Uniprot ID: None
PAM sequence: NGN
PAM window: 21-23
Protospacer length: 20
Activity window: 3-9
Nucleotide change: A>G
VEP and Uniprot analysis: False
Mutation on genome: 11:g.5227002A>T
Off target analysis: False



-------------------------------------------------------------- 
		Ensembl Gene Information
-------------------------------------------------------------- 
    

Request to Ensembl REST API for Ensembl Gene ID:
>ENSG00000244734.4 chromosome:GRCh38:11:5225464:5229395:-1
Ensembl Gene ID: ENSG00000244734

Request to Ensembl REST API for sequence information
The loc


## 4️⃣ Exploring BEstimate Outputs and Interpreting Results

Your results are saved in the `/content/output/` folder.

To check what was generated, run:


In [6]:
# List results
!ls -lh /content/output/

total 824K
-rw-r--r-- 1 root root 201K Jul 15 13:04 HBB_variant_specific_ABE_NGN_crispr_df.csv
-rw-r--r-- 1 root root 618K Jul 15 13:05 HBB_variant_specific_ABE_NGN_edit_df.csv



**What to look for:**

- Summary `.csv` tables listing guides

- Editable nucleotides with annotations of predicted edits


You can download these files or open them directly in Colab for inspection.


### SRY mutagenesis results

**Let's start with the *edit table*, which lists the gRNAs together with their editable nucleotides and sequence information**

In [52]:
edit_df = pandas.read_csv("/content/output/SRY_ABE_NGG_edit_df.csv")
edit_df[:5]

Unnamed: 0,Hugo_Symbol,CRISPR_PAM_Sequence,gRNA_Target_Sequence,Location,Edit_Location,Direction,Strand,Gene_ID,Transcript_ID,Exon_ID,...,gRNA_flanking_sequences,Edit_in_Exon,Edit_in_CDS,GC%,# Edits/guide,Poly_T,mutation_on_guide,guide_change_mutation,mutation_on_window,mutation_on_PAM
0,SRY,GTAAAATAAGTTTCGAACTCTGG,GTAAAATAAGTTTCGAACTC,Y:2787642-2787664,2787661,left,-1,ENSG00000184895,ENST00000383070,ENSE00001494622,...,,True,False,30.0,4,False,False,False,False,False
1,SRY,GTAAAATAAGTTTCGAACTCTGG,GTAAAATAAGTTTCGAACTC,Y:2787642-2787664,2787660,left,-1,ENSG00000184895,ENST00000383070,ENSE00001494622,...,,True,False,30.0,4,False,False,False,False,False
2,SRY,GTAAAATAAGTTTCGAACTCTGG,GTAAAATAAGTTTCGAACTC,Y:2787642-2787664,2787659,left,-1,ENSG00000184895,ENST00000383070,ENSE00001494622,...,,True,False,30.0,4,False,False,False,False,False
3,SRY,GTAAAATAAGTTTCGAACTCTGG,GTAAAATAAGTTTCGAACTC,Y:2787642-2787664,2787657,left,-1,ENSG00000184895,ENST00000383070,ENSE00001494622,...,,True,False,30.0,4,False,False,False,False,False
4,SRY,AAGAGAATATTCCCGCTCTCCGG,AAGAGAATATTCCCGCTCTC,Y:2787517-2787539,2787536,left,-1,ENSG00000184895,ENST00000383070,ENSE00001494622,...,,True,True,45.0,3,False,False,False,False,False


In [53]:
# Check which columns the edit table contains
edit_df.columns

Index(['Hugo_Symbol', 'CRISPR_PAM_Sequence', 'gRNA_Target_Sequence',
       'Location', 'Edit_Location', 'Direction', 'Strand', 'Gene_ID',
       'Transcript_ID', 'Exon_ID', 'guide_in_CDS', 'gRNA_flanking_sequences',
       'Edit_in_Exon', 'Edit_in_CDS', 'GC%', '# Edits/guide', 'Poly_T',
       'mutation_on_guide', 'guide_change_mutation', 'mutation_on_window',
       'mutation_on_PAM'],
      dtype='object')

In [54]:
# The number of editable gRNAs
len(edit_df.CRISPR_PAM_Sequence.unique())

62

In [55]:
# Number of gRNAs within the coding sequence
len(edit_df[edit_df.guide_in_CDS].CRISPR_PAM_Sequence.unique())

53

In [56]:
# Number of gRNAs with editable nucleotide within the coding sequence
len(edit_df[edit_df.Edit_in_CDS].CRISPR_PAM_Sequence.unique())

52

In [57]:
# Number of gRNAs with editable nucleotide within the coding sequence w/out polyT
len(edit_df[(edit_df.Edit_in_CDS) & (~edit_df.Poly_T)].CRISPR_PAM_Sequence.unique())

50
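The `GC%` and `Poly_T` columns of the edit table can be sanity-checked by hand. The sketch below recomputes both for a protospacer. The function names are hypothetical, and the poly-T rule (a run of at least four consecutive Ts) is an assumption based on the commonly used Pol III terminator criterion; BEstimate's exact threshold is not stated in this practical.

```python
def gc_percent(seq):
    """Percentage of G and C nucleotides in the sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def has_poly_t(seq, run_length=4):
    """True if the sequence contains a run of `run_length` or more Ts (assumed rule)."""
    return "T" * run_length in seq.upper()


for guide in ["GTAAAATAAGTTTCGAACTC", "AAGAGAATATTCCCGCTCTC"]:
    print(guide, gc_percent(guide), has_poly_t(guide))
```

For these two guides the computed GC content is 30.0 and 45.0, which matches the `GC%` values shown for them in the edit table above.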

**Let's continue with the *protein table*, including VEP, Uniprot and Interactome Insider annotations**

In [58]:
protein_df = pandas.read_csv("/content/output/SRY_ABE_NGG_protein_df.csv", index_col=0)
protein_df[:5]

Unnamed: 0,Hugo_Symbol,Edit_Type,CRISPR_PAM_Sequence,CRISPR_PAM_Location,gRNA_Target_Sequence,gRNA_Target_Location,Total_Edit,Edit_Location,Direction,Transcript_ID,...,Protein_Position,is_disruptive_interface_EXP,is_disruptive_interface_MOD,is_disruptive_interface_PRED,disrupted_PDB_int_partners,disrupted_I3D_int_partners,disrupted_Eclair_int_partners,disrupted_PDB_int_genes,disrupted_I3D_int_genes,disrupted_Eclair_int_genes
0,SRY,individual,AAAATGGCCATTCTTCCAGGAGG,Y:2787267-2787289,AAAATGGCCATTCTTCCAGG,Y:2787264-2787289,1,2787286,left,ENST00000383070,...,106,False,False,False,,,,,,
1,SRY,individual,AAACAGTAAAGGCAACGTCCAGG,Y:2787432-2787454,AAACAGTAAAGGCAACGTCC,Y:2787429-2787454,2,2787447,left,ENST00000383070,...,53,False,False,False,,,,,,
2,SRY,individual,AAACAGTAAAGGCAACGTCCAGG,Y:2787432-2787454,AAACAGTAAAGGCAACGTCC,Y:2787429-2787454,2,2787450,left,ENST00000383070,...,52,False,False,False,,,,,,
3,SRY,multiple,AAACAGTAAAGGCAACGTCCAGG,Y:2787432-2787454,AAACAGTAAAGGCAACGTCC,Y:2787429-2787454,2,2787447-2787451,left,ENST00000383070,...,52;53,False,False,False,,,,,,
4,SRY,individual,AACGGGACCGCTACAGCCACTGG,Y:2787001-2787023,AACGGGACCGCTACAGCCAC,Y:2786998-2787023,1,2787017,left,ENST00000383070,...,196,False,False,False,,,,,,


In [59]:
# Check which columns the protein table contains
protein_df.columns

Index(['Hugo_Symbol', 'Edit_Type', 'CRISPR_PAM_Sequence',
       'CRISPR_PAM_Location', 'gRNA_Target_Sequence', 'gRNA_Target_Location',
       'Total_Edit', 'Edit_Location', 'Direction', 'Transcript_ID', 'Exon_ID',
       'guide_in_CDS', 'gRNA_flanking_sequences', 'Edit_in_Exon',
       'Edit_in_CDS', 'mutation_on_guide', 'guide_change_mutation',
       'mutation_on_window', 'mutation_on_PAM', '# Edits/guide', 'Poly_T',
       'GC%', 'HGVS', 'Protein_ID', 'VEP_input', 'allele',
       'variant_classification', 'most_severe_consequence',
       'consequence_terms', 'variant_biotype', 'Regulatory_ID', 'Motif_ID',
       'TFs_on_motif', 'cDNA_Change', 'Edited_Codon', 'New_Codon',
       'CDS_Position', 'Protein_Position_ensembl', 'Protein_Change',
       'Edited_AA', 'Edited_AA_Prop', 'New_AA', 'New_AA_Prop', 'is_Synonymous',
       'is_Stop', 'proline_addition', 'swissprot_vep', 'uniprot_provided',
       'polyphen_score', 'polyphen_prediction', 'sift_score',
       'sift_prediction', 'c

**Note: Because the activity window can contain several editable nucleotides, a single gRNA can introduce multiple edits.**
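The sketch below illustrates this point by enumerating every combination of edited positions for one guide from the protein table. It is a simplified illustration under stated assumptions: the function `edit_outcomes` is hypothetical, and whether BEstimate reports every combination or only the outcome with all sites edited as `multiple` is not shown in this practical.

```python
from itertools import combinations


def edit_outcomes(protospacer, edit="A", edit_to="G", actwin=(4, 8)):
    """Map each combination of edited positions (1-based) to the edited sequence."""
    protospacer = protospacer.upper()
    start, end = actwin
    sites = [
        pos
        for pos, nucleotide in enumerate(protospacer, start=1)
        if start <= pos <= end and nucleotide == edit
    ]
    outcomes = {}
    for size in range(1, len(sites) + 1):
        for combo in combinations(sites, size):
            edited = list(protospacer)
            for pos in combo:
                edited[pos - 1] = edit_to
            outcomes[combo] = "".join(edited)
    return outcomes


outcomes = edit_outcomes("AAACAGTAAAGGCAACGTCC")
for combo, sequence in outcomes.items():
    label = "individual" if len(combo) == 1 else "multiple"
    print(label, combo, sequence)
```

This guide has two editable adenines in the 4-8 window (positions 5 and 8), so it yields two individual outcomes and one combined outcome, in line with the `Total_Edit` value of 2 in the table above.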

In [60]:
# The most severe consequences of the SRY-targeting gRNAs
protein_df.most_severe_consequence.unique()

array(['synonymous_variant', 'missense_variant', '3_prime_UTR_variant',
       '5_prime_UTR_variant', 'start_lost', 'stop_lost'], dtype=object)

In [61]:
# Protein positions of the potential edits
protein_df.Protein_Position.unique()

array(['106', '53', '52', '52;53', '196', '160', '24', '23', '23;24',
       '155', '161', '160;161', '190', '128', '127', '127;128', nan,
       '200', '112', '65', '105', '90', '89', '88', '89;90', '43', '48',
       '172', '170', '203', '202', '93', '92', '92;93', '156', '129',
       '176', '175', '175;176', '73', '44', '43;44', '102', '34', '137',
       '152', '151', '151;152', '40', '69', '68', '68;69', '116', '124',
       '191', '110', '109', '109;110', '198', '101', '32', '145', '144',
       '144;145', '25', '154', '1'], dtype=object)

In [62]:
# Targeted functional domains
protein_df.curated_Domain.unique()

array(['Sufficient for interaction with KPNB1', nan, 'Disordered',
       'Sufficient for interaction with EP300',
       'Necessary for interaction with SLC9A3R2',
       'Required for nuclear localization',
       'Necessary for interaction with ZNF208 isoform KRAB-O'],
      dtype=object)

In [63]:
# Check whether any gRNAs introduce edits with clinical annotations
protein_df[~pandas.isna(protein_df.is_clinical) & (protein_df.is_clinical)][[
    'Hugo_Symbol', 'gRNA_Target_Sequence', 'most_severe_consequence', 'Edited_AA','New_AA', 'clinical_id']]

Unnamed: 0,Hugo_Symbol,gRNA_Target_Sequence,most_severe_consequence,Edited_AA,New_AA,clinical_id
0,SRY,AAAATGGCCATTCTTCCAGG,synonymous_variant,K,K,rs2124486056
2,SRY,AAACAGTAAAGGCAACGTCC,missense_variant,S,G,rs1223685980
6,SRY,AAGAGAATATTCCCGCTCTC,missense_variant,N,S,CD095255
15,SRY,ATTATAAGTATCGACCTCGT,missense_variant,K,R,rs375342012
17,SRY,ATTATAAGTATCGACCTCGT,missense_variant,Y,C,rs104894973
24,SRY,CCATGAACGCATTCATCGTG,missense_variant,N,D,CM136852
26,SRY,CGAAAAATGGCCATTCTTCC,synonymous_variant,K,K,rs2124486056
27,SRY,CGAAAAATGGCCATTCTTCC,missense_variant,K,R,CM920650
28,SRY,CGAAAAATGGCCATTCTTCC,missense_variant,K,E,rs2124486060
32,SRY,CTCAGAGATCAGCAAGCAGC,missense_variant,E,G,CM1210299


In [64]:
# gRNAs producing missense variants, with the affected protein position and curated domain
protein_df[protein_df.most_severe_consequence == "missense_variant"][[
    'Hugo_Symbol', 'gRNA_Target_Sequence', 'most_severe_consequence', 'Protein_Position','Protein_Change','curated_Domain']]

Unnamed: 0,Hugo_Symbol,gRNA_Target_Sequence,most_severe_consequence,Protein_Position,Protein_Change,curated_Domain
1,SRY,AAACAGTAAAGGCAACGTCC,missense_variant,53,K/E,
2,SRY,AAACAGTAAAGGCAACGTCC,missense_variant,52,S/G,
3,SRY,AAACAGTAAAGGCAACGTCC,missense_variant,52;53,SK/GE,
4,SRY,AACGGGACCGCTACAGCCAC,missense_variant,196,D/G,Disordered
5,SRY,AACTGGACAACAGGTTGTAC,missense_variant,160,D/G,
...,...,...,...,...,...,...
130,SRY,GGAATATTCTCTTGCACAGC,missense_variant,25,I/T,
132,SRY,GGAATATTCTCTTGCACAGC,missense_variant,25,NI/NT,
136,SRY,GGTGAGCTGGCTGCGTTGAT,missense_variant,191,S/P,Disordered
142,SRY,GTGCTCCATTCTTGAGTGTG,missense_variant,176,M/T,Disordered


In [None]:
# This table is only available when BEstimate is run with off-target analysis
grna_df = pandas.read_csv("/content/SRY_ABE_NGG_ot_annotated_summary_df.csv")
grna_df[:5]

In [None]:
# Find gRNAs without any off-targets: one exact match and no hits in the mm1, mm2 or mm3 columns
grna_df[(grna_df.exact == 1) & (grna_df.mm1 == 0) & (grna_df.mm2 == 0) & (grna_df.mm3 == 0)]

### Sickle cell reversion results

In [16]:
hbb_mut_df = pandas.read_csv("/content/output/HBB_variant_specific_ABE_NGN_edit_df.csv", index_col=0)
hbb_mut_df[:5]

Hugo_Symbol,CRISPR_PAM_Sequence,gRNA_Target_Sequence,Location,Edit_Location,Direction,Strand,Gene_ID,Transcript_ID,Exon_ID,guide_in_CDS,gRNA_flanking_sequences,Edit_in_Exon,Edit_in_CDS,GC%,# Edits/guide,Poly_T,mutation_on_guide,guide_change_mutation,mutation_on_window,mutation_on_PAM
HBB,ACCATTGGAAAAGCAACCCCTGC,ACCATTGGAAAAGCAACCCC,11:5229377-5229399,5229391,left,-1,ENSG00000244734,,,False,,False,False,50.0,1,False,False,False,False,False
HBB,TGGAAAAGCAACCCCTGCCTTGA,TGGAAAAGCAACCCCTGCCT,11:5229372-5229394,5229391,left,-1,ENSG00000244734,,,False,,False,False,55.0,4,False,False,False,False,False
HBB,TGGAAAAGCAACCCCTGCCTTGA,TGGAAAAGCAACCCCTGCCT,11:5229372-5229394,5229390,left,-1,ENSG00000244734,,,False,,False,False,55.0,4,False,False,False,False,False
HBB,TGGAAAAGCAACCCCTGCCTTGA,TGGAAAAGCAACCCCTGCCT,11:5229372-5229394,5229389,left,-1,ENSG00000244734,,,False,,False,False,55.0,4,False,False,False,False,False
HBB,TGGAAAAGCAACCCCTGCCTTGA,TGGAAAAGCAACCCCTGCCT,11:5229372-5229394,5229388,left,-1,ENSG00000244734,,,False,,False,False,55.0,4,False,False,False,False,False


In [17]:
# Find the gRNA(s) that can edit the introduced variant
hbb_mut_df[hbb_mut_df.guide_change_mutation]

Hugo_Symbol,CRISPR_PAM_Sequence,gRNA_Target_Sequence,Location,Edit_Location,Direction,Strand,Gene_ID,Transcript_ID,Exon_ID,guide_in_CDS,gRNA_flanking_sequences,Edit_in_Exon,Edit_in_CDS,GC%,# Edits/guide,Poly_T,mutation_on_guide,guide_change_mutation,mutation_on_window,mutation_on_PAM
HBB,ACTTCTCCACAGGAGTCAGATGC,ACTTCTCCACAGGAGTCAGA,11:5226994-5227016,5227002,right,-1,ENSG00000244734,ENST00000335295,ENSE00001829867,True,,True,True,50.0,1,False,True,True,True,False


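**Important:** The wild-type codon is GAG (Glu) and the mutant codon is GTG (Val). The mutation is at position 5227002, and the mutant sequence at 5227001-5227003 on the +1 strand is CAC. ABE converts CAC to CGC on the +1 strand, which reads GCG (Ala) on the coding strand. The edit therefore does not restore the wild-type glutamate; it produces the naturally occurring, non-sickling hemoglobin variant "Makassar" (HbG-Makassar).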
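The strand bookkeeping in this note can be followed step by step. The sketch below is a minimal illustration using only the three codons involved; the variable names are illustrative and the codon table is deliberately incomplete.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
CODON_TO_AA = {"GAG": "Glu", "GTG": "Val", "GCG": "Ala"}  # only the codons needed here


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


mutant_codon = "GTG"                             # sickle cell codon on the coding (-1) strand
plus_strand = reverse_complement(mutant_codon)   # same bases read on the +1 strand: CAC

# ABE converts the middle A of CAC to G on the +1 strand
assert plus_strand[1] == "A"
edited_plus = plus_strand[0] + "G" + plus_strand[2]   # CGC

edited_codon = reverse_complement(edited_plus)   # back on the coding strand: GCG
print(f"{mutant_codon} ({CODON_TO_AA[mutant_codon]}) -> "
      f"{edited_codon} ({CODON_TO_AA[edited_codon]})")
```

The result is a Val-to-Ala change rather than a reversion to the wild-type GAG (Glu).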

### Key points to review in your output tables



- **Base change**: Depending on your experiment, you may prioritise gRNAs that target specific domains, post-translational modification sites, splice sites or clinically important positions.
  - Prefer gRNAs whose edits in coding regions have functional consequences, such as amino acid changes. gRNAs that generate only synonymous alterations can be excluded.
  - gRNAs can also edit non-coding regions, so you may want to target regulatory regions such as promoters, or splice sites. (*Avoid gRNAs that disrupt known splice sites unless this is the intended effect.*)
  - gRNAs can recreate or revert known pathogenic SNPs, which is useful for disease modelling or correction.
  - gRNAs targeting highly conserved sequences tend to have more severe functional consequences. Check the predicted consequences and select the gRNAs relevant to your question.

- **Off-targets**: It is good practice to choose gRNAs with minimal off-target effects.

**Note (on-target efficiency):** You may also want to select gRNAs with high on-target efficiency, which you can obtain from BE-Hive. (*It is not provided by BEstimate.*)


## 5️⃣ Controls in library design


When generating a gRNA library for base editing, incorporating proper controls is essential for ensuring the reliability and interpretability of your experimental results. The controls help validate the functional outcomes of your gRNAs.

1. **Positive controls** help confirm that your base editing system is working efficiently and that the experimental conditions are suitable.

  - gRNAs targeting genes that are essential for cell viability (such as housekeeping genes), where editing should have a measurable phenotypic effect such as cell death or reduced growth.

2. **Negative controls** are critical for assessing background levels of editing and off-target effects. They ensure that observed changes are due to base editing rather than random or non-specific effects.

  - Non-targeting gRNAs help establish the baseline for off-target activity and for the general effects of transfection or editing. They are typically random sequences with no homology to the genome that otherwise resemble real gRNAs.

  - gRNAs targeting non-essential genes, where base editing is expected to have no significant phenotypic effect.
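As a starting point for non-targeting controls, the sketch below generates random candidate sequences that look like real gRNAs. The function name, the GC range and the poly-T filter are illustrative assumptions, not recommendations from BEstimate. The sketch does not check homology to the genome, so every candidate still has to be screened against the genome (for example with an off-target search) before it is used as a control.

```python
import random


def random_control_guides(n, length=20, gc_range=(40.0, 60.0), seed=0):
    """Generate n distinct random candidate sequences for non-targeting controls.

    Candidates are filtered only on GC content and poly-T runs; genome
    homology is NOT checked here.
    """
    rng = random.Random(seed)  # fixed seed so the library is reproducible
    guides = set()
    while len(guides) < n:
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        gc = 100.0 * sum(base in "GC" for base in seq) / length
        if "TTTT" in seq or not gc_range[0] <= gc <= gc_range[1]:
            continue
        guides.add(seq)
    return sorted(guides)


candidates = random_control_guides(5)
for candidate in candidates:
    print(candidate)
```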




## 🛠️ Troubleshooting Tips

❗ **No module named BEstimate** → Rerun the installation cells at the top, then restart the session. After restarting, do not run the installation cells again.


❗ **Permission errors** → Make sure you’re running in a writable Colab notebook.  


## 🎉 Wrap-up

With this practical course, you have now:

✅ Set up your environment

✅ Designed base editor gRNAs with BEstimate

✅ Learned how to interpret your results

✅ Learned things to consider while selecting your gRNAs and designing your library



**Next steps:** You can try using your own genes or variants as input!



Questions? Ask during the live session or contact me at cd7@sanger.ac.uk
