# Analysis for 9p21.3 Modifier Screen

EnAsCas12-expressing KP4 cells were transduced with dCas9-KRAB targeting either Ch2-2 or PELO (gRNA #1 or #2), after which a Cas12a library targeting 9p21.3 genes (plus positive and negative controls) was introduced. The initial pDNA pool was sequenced as the day 0 reference, and additional timepoints were collected on days 5, 9, 13, 17, and 21. In this notebook, we use the raw read counts to compute naive log fold changes (LFC) and run Chronos to obtain gene effect scores.

NOTE: Chronos is non-deterministic, so its outputs may differ slightly between runs and from the values presented in the manuscript.
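One way to judge whether a rerun agrees with an earlier result is to compare the two gene-effect matrices entry by entry. The sketch below is illustrative only: `run_a` and `run_b` are randomly generated stand-ins (assumptions, not screen data), and the noise level was chosen arbitrarily to mimic small run-to-run differences.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Stand-ins for two gene-effect matrices (cell lines x genes) from separate runs.
run_a = pd.DataFrame(rng.normal(0, 0.5, size=(3, 27)))
run_b = run_a + rng.normal(0, 0.01, size=run_a.shape)

# Flatten both matrices and compare them entry by entry.
pearson_r = np.corrcoef(run_a.values.ravel(), run_b.values.ravel())[0, 1]
max_abs_diff = (run_a - run_b).abs().values.max()
print(f"Pearson r = {pearson_r:.4f}, max |difference| = {max_abs_diff:.4f}")
```

With real outputs, the two frames would first need to be aligned on the same cell lines and genes.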

In [1]:
import pandas as pd
import numpy as np
from chronos.model import normalize_readcounts, Chronos, calculate_fold_change
from helper_funcs import read_in_data

# Import data

In [2]:
redownload = False
reads = read_in_data("readcounts_9p21_3", index_col=0, redownload=redownload)
guide_map = read_in_data("guide_map_9p21_3", redownload=redownload).rename(columns={"sgRNA1":"sgrna", "Target":"gene"})
seq_map = read_in_data("sequence_map_9p21_3", redownload=redownload)

Fetching local copy of readcounts_9p21_3
Fetching local copy of guide_map_9p21_3
Fetching local copy of sequence_map_9p21_3


# Remove multitargeting guides

In [3]:
multitargeting = guide_map.groupby('sgrna')['gene'].count().loc[lambda x: x > 1].index
print(multitargeting)
guide_map = guide_map.loc[~guide_map["sgrna"].isin(multitargeting)]
reads = reads.loc[~reads.index.isin(multitargeting)]
reads.shape

Index(['ACTGAACTTTACCAGCAACTGAA', 'CCATTTGTGCCAGGAGTATCAAG',
       'TCTGGGCTGTGATCTGCCTCAGA'],
      dtype='object', name='sgrna')


(92, 46)
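
To make the filtering step concrete, here is a minimal sketch on a toy guide map. The guide and gene names are hypothetical, and the `toy_` prefix keeps the example from overwriting the notebook's own variables. It applies the same logic as the cell above: any guide listed against more than one row of the map is dropped from both the map and the read counts.

```python
import pandas as pd

# Toy guide map (hypothetical guides and genes, not from the screen).
toy_map = pd.DataFrame({
    "sgrna": ["g1", "g2", "g2", "g3", "g4"],
    "gene":  ["A",  "A",  "B",  "B",  "C"],
})

# Toy read counts indexed by guide.
toy_reads = pd.DataFrame(
    {"pDNA": [100, 120, 90, 110], "day21": [95, 60, 100, 10]},
    index=pd.Index(["g1", "g2", "g3", "g4"], name="sgrna"),
)

# Guides that appear in more than one row of the map are treated as multitargeting.
n_targets = toy_map.groupby("sgrna")["gene"].count()
toy_multi = n_targets[n_targets > 1].index

# Drop those guides from both tables so they stay consistent.
toy_map_filtered = toy_map.loc[~toy_map["sgrna"].isin(toy_multi)]
toy_reads_filtered = toy_reads.loc[~toy_reads.index.isin(toy_multi)]

print(list(toy_multi))           # ['g2']
print(toy_reads_filtered.shape)  # (3, 2)
```

Note that `count()` counts rows per guide, so a guide listed twice against the same gene would also be removed; `nunique()` would be the alternative if only distinct genes should count.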

# Compute LFCs

In [4]:
lfc = np.log2(calculate_fold_change(reads.T, seq_map))
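
For intuition, a naive LFC can be sketched in plain pandas. This is not the implementation of `calculate_fold_change`, whose internals are not reproduced here. The reads-per-million scaling, the pseudocount of 1, the function name `naive_lfc`, and the toy counts are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Toy counts: rows are sequenced samples, columns are guides (hypothetical values).
toy_counts = pd.DataFrame(
    {"g1": [100, 200, 50], "g2": [100, 200, 350], "g3": [800, 1600, 1600]},
    index=["pDNA", "rep_A", "rep_B"],
)

def naive_lfc(counts, reference="pDNA", pseudocount=1.0):
    """Log2 fold change of each sample versus the reference sample."""
    # Scale each sample to reads per million so sequencing depth cancels out.
    rpm = counts.div(counts.sum(axis=1), axis=0) * 1e6
    # The pseudocount keeps guides with zero reads finite after the log.
    ratio = (rpm + pseudocount) / (rpm.loc[reference] + pseudocount)
    return np.log2(ratio.drop(index=reference))

toy_lfc = naive_lfc(toy_counts)
print(toy_lfc.round(2))
```

In this toy input, `rep_A` has twice the depth of the pDNA but identical guide proportions, so its LFCs are zero, while `g1` in `rep_B` drops to a quarter of its pDNA share and lands near -2.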

# Train Chronos

Because this library is small, `kernel_width` and `cell_efficacy_guide_quantile` were adjusted from their default values, which are tuned for whole-genome libraries.

In [5]:
model = Chronos(
    sequence_map={"9p21.3-genes": seq_map},
    guide_gene_map={"9p21.3-genes": guide_map},
    readcounts={"9p21.3-genes": reads.T},
    kernel_width=3,
    cell_efficacy_guide_quantile=0.15
)
model.train()

normalizing readcounts
Readcounts has less than 2000 guides, using median normalization


Finding all unique guides and genes
found 92 unique guides and 27 unique genes in 9p21.3-genes
found 92 unique guides and 27 unique genes overall

finding guide-gene mapping indices

finding all unique sequenced replicates, cell lines, and pDNA batches
found 45 unique sequences (excluding pDNA) and 3 unique cell lines in 9p21.3-genes
found 45 unique replicates and 3 unique cell lines overall

finding replicate-cell line mappings indices

finding replicate-pDNA mappings indices


assigning float constants
Estimating or aligning variances
Creating excess variance tensors
	Created excess variance tensor for 9p21.3-genes with shape [45, 1]
initializing graph

building gene effect mask

building doubling vectors
made days vector of shape [45, 1] for 9p21.3-genes

building late observed timepoints
	built normalized timepoints for 9p21.3-genes with shape (45, 92) (replicates X guides)

building t0 reads


2024-10-18 15:41:59.475450: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:388] MLIR V1 optimization pass is not enabled


initializing variables
estimating initial screen efficacy and gene effect
	 9p21.3-genes
	verifying graph integrity
verifying user inputs
verifying variables
verifying calculated terms
	9p21.3-genes _gene_effect
	9p21.3-genes _selected_efficacies
	9p21.3-genes_predicted_readcounts_unscaled
	9p21.3-genes _predicted_readcounts


2024-10-18 15:41:59.799731: W tensorflow/c/c_api.cc:305] Operation '{name:'excess_variance/9p21.3-genes/Assign' id:6 op device:{requested: '', assigned: ''} def:{{{node excess_variance/9p21.3-genes/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_DOUBLE, validate_shape=false](excess_variance/9p21.3-genes, excess_variance/9p21.3-genes/Initializer/initial_value)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.


	9p21.3-genes _normalized_readcounts
	9p21.3-genes _cost_presum
sess run
	9p21.3-genes _cost
	9p21.3-genes _full_costs
ready to train
NB2 cost 0.4140102137294511
Full cost 0.41598662502482053
relative_growth_rate
	9p21.3-genes max 1.027, min 0.98626
mean guide efficacy 0.990853919829184
t0_offset SD: [('9p21.3-genes', 7.477999527876274e-05)]

gene mean 0.0017752704167814711
SD of gene means 0.3417455449072775
Mean of gene SDs 0.05221446186476038



51 epochs trained, time taken 0:00:00, projected remaining 0:00:01
NB2 cost 0.24770009478089539
Full cost 0.26738688934908655
relative_growth_rate
	9p21.3-genes max 1.057, min 0.91008
mean guide efficacy 0.9600038241879154
t0_offset SD: [('9p21.3-genes', 0.13501652635274505)]

gene mean -0.0035153813355761417
SD of gene means 0.6038182107386302
Mean of gene SDs 0.07434223062749276



101 epochs trained, time taken 0:00:00, projected remaining 0:00:01
NB2 cost 0.2083492957086144
Full cost 0.22163034032535395
relative_growth_rate
	9p21.3-genes

# Scale gene effect

In [6]:
neg_con = guide_map.loc[guide_map['Function'] == 'Negative Control', 'gene'].unique()
pos_con = guide_map.loc[guide_map['Function'] == 'Lethal Control', 'gene'].unique()

In [7]:
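gene_effect = model.gene_effect
# Shift so that the median of the negative controls sits at 0
gene_effect = gene_effect.subtract(model.gene_effect.loc[:, neg_con].median(axis=1).median(), axis=0)
# Divide by the median magnitude of the lethal controls (taken from the unshifted model output)
gene_effect = gene_effect.divide(model.gene_effect.loc[:, pos_con].median(axis=1).abs().median(), axis=0)
gene_effect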

| cell_line_name | AAVS | BLM | CD47 | CD63 | CDKN2A | Ch22 | FOCAD | HACD4 | IFNA10 | IFNA14 | ... | POLR1C | POLR2D | RAD51 | RPA1 | RPA2 | RecQL1 | RecQL4 | RecQL5 | SF3B1 | WRN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ch2-2 | 0.19821 | -0.26419 | -0.019702 | -0.202424 | 0.161687 | 0.086504 | -0.159614 | -0.115506 | 0.030651 | -0.073329 | ... | -1.109926 | -1.337034 | -1.09935 | -1.598327 | -0.746743 | 0.005388 | -0.209613 | 0.024706 | -1.357318 | 0.022348 |
| PELO1 | 0.210632 | -0.260169 | -0.033406 | -0.183425 | 0.161934 | 0.054452 | -0.478578 | -0.090828 | 0.034737 | -0.056125 | ... | -1.036756 | -1.31039 | -1.109442 | -1.6131 | -0.682663 | -0.000709 | -0.138383 | 0.043668 | -1.343628 | 0.04351 |
| PELO2 | 0.219337 | -0.250749 | -0.072253 | -0.192066 | 0.16057 | 0.063674 | -0.541295 | -0.075546 | 0.036401 | -0.055231 | ... | -1.044134 | -1.273908 | -1.052479 | -1.624128 | -0.686925 | 0.0 | -0.145139 | 0.042951 | -1.361224 | 0.054788 |
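
The shift-and-scale step can be checked on a toy matrix. The sketch below mirrors the two operations in the scaling cell; the gene names, cell line names, and values are hypothetical.

```python
import pandas as pd

# Toy gene-effect matrix: rows are cell lines, columns are genes (hypothetical values).
toy_effect = pd.DataFrame(
    {
        "NEG1": [0.10, 0.20],
        "NEG2": [0.30, 0.20],
        "POS1": [-1.8, -2.0],
        "POS2": [-2.2, -2.0],
        "GENE_X": [-0.9, 0.2],
    },
    index=["line_A", "line_B"],
)
toy_neg, toy_pos = ["NEG1", "NEG2"], ["POS1", "POS2"]

# Shift: per-cell-line median of the negative controls, then the median across cell lines.
shift = toy_effect[toy_neg].median(axis=1).median()
# Scale: the same summary for the lethal controls, taken on the unshifted matrix.
scale = toy_effect[toy_pos].median(axis=1).abs().median()

toy_scaled = (toy_effect - shift) / scale

print(toy_scaled[toy_neg].median(axis=1).median())
print(toy_scaled[toy_pos].median(axis=1).median())
```

After scaling, the negative controls are centred on 0. Because the scale factor is computed before the shift, the lethal controls land exactly at -1 only when the negative-control median is itself close to 0; in this toy example, where the shift is 0.2, they land at about -1.1.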


# Save outputs

In [8]:
import os
os.makedirs("outputs", exist_ok=True)  # to_csv does not create missing directories
lfc.to_csv("outputs/logfold_change_9p21_3.csv")
gene_effect.to_csv("outputs/gene_effect_9p21_3.csv")