# Compute probabilities of escape
In some experiments that involve antibody selections, it is possible to spike in a "neutralization standard", which is a set of variants known not to be affected by the antibody.
In such cases, it is possible to compute the probability of escape of each variant: the ratio of its count to the neutralization standard's count in the antibody sample, divided by the same ratio in the no-antibody sample.
If such experiments are done at enough concentrations, it is even possible to reconstruct a conventional neutralization curve.

This notebook illustrates how to use `dms_variants` to compute these probabilities of escape.

First, import Python modules:

In [1]:
import io
import textwrap

import altair as alt

import dms_variants.codonvarianttable

import pandas as pd

Read in the `CodonVariantTable`.
These data correspond to snippets of the variant counts from a real experiment on the SARS-CoV-2 spike:

In [2]:
with open("spike.txt") as f:
    spike_seq = f.read().strip()

variants = dms_variants.codonvarianttable.CodonVariantTable.from_variant_count_df(
    variant_count_df_file="prob_escape_codon_variant_table.csv",
    primary_target="spike",
    geneseq=spike_seq,
    allowgaps=True,
)

Set up a data frame giving the antibody / no-antibody sample pairings for each selection:

In [3]:
selections_df = pd.read_csv(io.StringIO(textwrap.dedent(
    """\
    library,antibody_sample,no-antibody_sample
    lib1,thaw-1_REGN10933_0.037_1,thaw-1_no-antibody_control_1
    lib1,thaw-1_REGN10933_0.15_1,thaw-1_no-antibody_control_1
    lib1,thaw-1_REGN10933_0.15_2,thaw-1_no-antibody_control_2
    lib1,thaw-1_REGN10933_0.59_1,thaw-1_no-antibody_control_1
    lib1,thaw-1_REGN10933_0.59_2,thaw-1_no-antibody_control_2
    lib1,thaw-2_279C_0.00088_1,thaw-2_no-antibody_control_1
    lib1,thaw-2_279C_0.00088_2,thaw-2_no-antibody_control_2
    lib1,thaw-2_279C_0.0035_1,thaw-2_no-antibody_control_1
    lib1,thaw-2_279C_0.0035_2,thaw-2_no-antibody_control_2
    lib1,thaw-2_279C_0.014_1,thaw-2_no-antibody_control_1
    lib1,thaw-2_279C_0.014_2,thaw-2_no-antibody_control_2
    lib2,thaw-1_REGN10933_0.037_1,thaw-1_no-antibody_control_1
    lib2,thaw-1_REGN10933_0.037_2,thaw-1_no-antibody_control_2
    lib2,thaw-1_REGN10933_0.15_1,thaw-1_no-antibody_control_1
    lib2,thaw-1_REGN10933_0.15_2,thaw-1_no-antibody_control_2
    lib2,thaw-1_REGN10933_0.59_1,thaw-1_no-antibody_control_1
    lib2,thaw-1_REGN10933_0.59_2,thaw-1_no-antibody_control_2
    """
)))
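
As a quick sanity check on a table like this, the sketch below counts replicates per library, antibody, and concentration. It assumes that antibody sample names follow the pattern `<thaw>_<antibody>_<concentration>_<replicate>`, as the names above appear to; this convention is not something `dms_variants` states or enforces. The names `example_selections`, `name_parts`, and `replicate_counts` are hypothetical, and the frame holds only a few rows copied from the table.

```python
import pandas as pd

# A few rows copied from the selections table above (illustration only).
example_selections = pd.DataFrame(
    {
        "library": ["lib1", "lib1", "lib1", "lib2", "lib2"],
        "antibody_sample": [
            "thaw-1_REGN10933_0.037_1",
            "thaw-1_REGN10933_0.15_1",
            "thaw-1_REGN10933_0.15_2",
            "thaw-1_REGN10933_0.037_1",
            "thaw-1_REGN10933_0.037_2",
        ],
    }
)

# ASSUMPTION: names look like <thaw>_<antibody>_<concentration>_<replicate>.
name_parts = example_selections["antibody_sample"].str.split("_", expand=True)
name_parts.columns = ["thaw", "antibody", "concentration", "replicate"]
name_parts["concentration"] = name_parts["concentration"].astype(float)

# Count distinct replicates for each library / antibody / concentration.
replicate_counts = (
    pd.concat([example_selections[["library"]], name_parts], axis=1)
    .groupby(["library", "antibody", "concentration"], as_index=False)
    .agg(n_replicates=("replicate", "nunique"))
)
print(replicate_counts)
```

Applied to the full table above, a check like this would show that lib1 has only one replicate of REGN10933 at the 0.037 concentration, while every other combination has two.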

Now run `CodonVariantTable.prob_escape`:

In [4]:
prob_escape, neut_standard_fracs, neutralization = (
    variants.prob_escape(selections_df=selections_df, by="codon_substitutions")
)

In [5]:
prob_escape

| | library | antibody_sample | no-antibody_sample | codon_substitutions | n_codon_substitutions | aa_substitutions | n_aa_substitutions | prob_escape | prob_escape_uncensored | antibody_count | no-antibody_count | antibody_neut_standard_count | no-antibody_neut_standard_count |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | neut_standard | 0 | neut_standard | 0 | 1.000000 | 1.000000 | 11107 | 16475 | 11107 | 16475 |
| 1 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | AAG180AGG CAT653CAG | 2 | K180R H653Q | 2 | 0.509973 | 0.509973 | 1794 | 5218 | 11107 | 16475 |
| 2 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | CAC146CAG TCC151--- GCT417GCA AGC475ATC ACC111... | 6 | H146Q S151- S475I I1208T | 4 | 0.627884 | 0.627884 | 1733 | 4094 | 11107 | 16475 |
| 3 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | AGA450AAG TTG580AAC AGC744AGT CAG955AAG | 4 | R450K L580N Q955K | 3 | 0.592655 | 0.592655 | 1428 | 3574 | 11107 | 16475 |
| 4 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | AGC98CAC GTG306ATC AGC475ATC GAT1082AAT CTG119... | 5 | S98H V306I S475I D1082N L1191M | 5 | 0.668822 | 0.668822 | 1350 | 2994 | 11107 | 16475 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 79249 | lib2 | thaw-1_REGN10933_0.59_2 | thaw-1_no-antibody_control_2 | TTT2CTG GCC520AAC CTG856CTA | 3 | F2L A520N | 2 | 0.000000 | 0.000000 | 0 | 55 | 52409 | 10877 |
| 79250 | lib2 | thaw-1_REGN10933_0.59_2 | thaw-1_no-antibody_control_2 | GAG279GGG AGA993ATA | 2 | E279G R993I | 2 | 0.000000 | 0.000000 | 0 | 1 | 52409 | 10877 |
| 79251 | lib2 | thaw-1_REGN10933_0.59_2 | thaw-1_no-antibody_control_2 | AAC437ACC CTG515TTC ACC910ATC TGT1041TAT GAC11... | 5 | N437T L515F T910I C1041Y D1116C | 5 | 0.000000 | 0.000000 | 0 | 40 | 52409 | 10877 |
| 79252 | lib2 | thaw-1_REGN10933_0.59_2 | thaw-1_no-antibody_control_2 | AAG1071AGG | 1 | K1071R | 1 | 0.000000 | 0.000000 | 0 | 2 | 52409 | 10877 |
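
The counts in this output are enough to check a value by hand. The sketch below recomputes `prob_escape_uncensored` for the row with index 1 as the variant-to-standard count ratio in the antibody sample divided by the same ratio in the no-antibody sample. The helper name `prob_escape_from_counts` is hypothetical, not part of `dms_variants`. The final capping step is an assumption about how `prob_escape` relates to `prob_escape_uncensored`; the `dms_variants` documentation is the place to confirm the exact definition.

```python
def prob_escape_from_counts(
    antibody_count,
    no_antibody_count,
    antibody_neut_standard_count,
    no_antibody_neut_standard_count,
):
    """Hypothetical helper: ratio of variant-to-standard count ratios."""
    antibody_ratio = antibody_count / antibody_neut_standard_count
    no_antibody_ratio = no_antibody_count / no_antibody_neut_standard_count
    return antibody_ratio / no_antibody_ratio


# Counts copied from the row with index 1 (K180R H653Q) in the output above.
uncensored = prob_escape_from_counts(
    antibody_count=1794,
    no_antibody_count=5218,
    antibody_neut_standard_count=11107,
    no_antibody_neut_standard_count=16475,
)

# ASSUMPTION: the censored value is the uncensored one capped at 1.
censored = min(uncensored, 1.0)

print(round(uncensored, 6), round(censored, 6))
```

The result agrees with the value of 0.509973 reported for that row. By the same calculation the neutralization standard itself always has a probability of escape of 1, as in the row with index 0.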


In [6]:
x = [1, 2, 3]
x.insert(0, "target")

In [7]:
x


['target', 1, 2, 3]