# Compute probabilities of escape
In some antibody-selection experiments, it is possible to spike in a "neutralization standard": a set of variants known not to be affected by the antibody.
The probability of escape of each variant can then be computed as the change in its frequency relative to the standard between the no-antibody and antibody samples.
If such experiments are done at enough antibody concentrations, it is even possible to reconstruct a conventional neutralization curve.
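
Written out, and using notation introduced here only for illustration (it does not come from the notebook): let $n_v^{\mathrm{ab}}$ and $n_v^{\mathrm{no}}$ be the counts of variant $v$ in the antibody and no-antibody samples, and let $n_s^{\mathrm{ab}}$ and $n_s^{\mathrm{no}}$ be the counts of the neutralization standard in those same samples. The probability of escape is then the ratio

$$
p_v = \frac{n_v^{\mathrm{ab}} / n_s^{\mathrm{ab}}}{n_v^{\mathrm{no}} / n_s^{\mathrm{no}}}
$$

A variant that is unaffected by the antibody keeps pace with the standard and has $p_v \approx 1$, while a fully neutralized variant has $p_v \approx 0$. Sampling noise can push the ratio above 1, which is presumably why the output further down reports both `prob_escape_uncensored` and `prob_escape` (the assumption here being that the latter is capped at 1).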

This notebook illustrates how to use `dms_variants` to compute these probabilities of escape.

First, import Python modules:

In [1]:
import io
import textwrap

import altair as alt

import dms_variants.codonvarianttable

import pandas as pd

Read in the `CodonVariantTable`.
These data correspond to snippets of the variant counts from a real experiment on the SARS-CoV-2 spike:

In [2]:
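# read the spike gene sequence that the variants are defined against
with open("spike.txt") as f:
    spike_seq = f.read().strip()

# build the variant table from the pre-computed variant counts
variants = dms_variants.codonvarianttable.CodonVariantTable.from_variant_count_df(
    variant_count_df_file="prob_escape_codon_variant_table.csv",
    primary_target="spike",
    geneseq=spike_seq,
    allowgaps=True,  # permit gap (deletion) codons in the variants
)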

Set up a data frame giving the antibody / no-antibody sample pairings for each selection:

In [3]:
selections_df = pd.read_csv(
    io.StringIO(
        textwrap.dedent(
            """\
    library,antibody_sample,no-antibody_sample
    lib1,thaw-1_REGN10933_0.037_1,thaw-1_no-antibody_control_1
    lib1,thaw-1_REGN10933_0.15_1,thaw-1_no-antibody_control_1
    lib1,thaw-1_REGN10933_0.15_2,thaw-1_no-antibody_control_2
    lib1,thaw-1_REGN10933_0.59_1,thaw-1_no-antibody_control_1
    lib1,thaw-1_REGN10933_0.59_2,thaw-1_no-antibody_control_2
    lib1,thaw-2_279C_0.00088_1,thaw-2_no-antibody_control_1
    lib1,thaw-2_279C_0.00088_2,thaw-2_no-antibody_control_2
    lib1,thaw-2_279C_0.0035_1,thaw-2_no-antibody_control_1
    lib1,thaw-2_279C_0.0035_2,thaw-2_no-antibody_control_2
    lib1,thaw-2_279C_0.014_1,thaw-2_no-antibody_control_1
    lib1,thaw-2_279C_0.014_2,thaw-2_no-antibody_control_2
    lib2,thaw-1_REGN10933_0.037_1,thaw-1_no-antibody_control_1
    lib2,thaw-1_REGN10933_0.037_2,thaw-1_no-antibody_control_2
    lib2,thaw-1_REGN10933_0.15_1,thaw-1_no-antibody_control_1
    lib2,thaw-1_REGN10933_0.15_2,thaw-1_no-antibody_control_2
    lib2,thaw-1_REGN10933_0.59_1,thaw-1_no-antibody_control_1
    lib2,thaw-1_REGN10933_0.59_2,thaw-1_no-antibody_control_2
    """
        )
    )
)
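
The sample names in this table appear to follow the pattern `<thaw>_<antibody>_<concentration>_<replicate>`, with the control samples using `no-antibody_control` in the middle two fields. That pattern is inferred from the names above and is not something `dms_variants` is known to require. Under that assumption, a minimal sketch for pulling out the antibody and concentration (useful later when grouping by concentration) and for checking that each antibody sample is paired with a control from the same thaw and replicate could look like this; `SampleInfo` and `parse_sample` are hypothetical helper names.

```python
from typing import NamedTuple, Optional


class SampleInfo(NamedTuple):
    thaw: str
    antibody: Optional[str]  # None for no-antibody controls
    concentration: Optional[float]  # None for no-antibody controls
    replicate: int


def parse_sample(name: str) -> SampleInfo:
    """Split a sample name, assuming <thaw>_<antibody>_<concentration>_<replicate>."""
    thaw, antibody, concentration, replicate = name.split("_")
    if antibody == "no-antibody":
        return SampleInfo(thaw, None, None, int(replicate))
    return SampleInfo(thaw, antibody, float(concentration), int(replicate))


# a few (antibody_sample, no-antibody_sample) pairs copied from the table above
pairs = [
    ("thaw-1_REGN10933_0.037_1", "thaw-1_no-antibody_control_1"),
    ("thaw-1_REGN10933_0.59_2", "thaw-1_no-antibody_control_2"),
    ("thaw-2_279C_0.00088_1", "thaw-2_no-antibody_control_1"),
]

for antibody_name, control_name in pairs:
    sample, control = parse_sample(antibody_name), parse_sample(control_name)
    # each antibody sample should be paired with a control from the same thaw and replicate
    assert (sample.thaw, sample.replicate) == (control.thaw, control.replicate)
    print(sample.antibody, sample.concentration, "replicate", sample.replicate)
```

If a naming scheme differs from this pattern, the unpacking in `parse_sample` raises a `ValueError`, which makes a mismatch easy to spot.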

Now run `CodonVariantTable.prob_escape`:

In [4]:
prob_escape, neut_standard_fracs, neutralization = variants.prob_escape(
    selections_df=selections_df
)

In [5]:
neutralization

|  | library | antibody_sample | no-antibody_sample | target | n_aa_substitutions | antibody_neut_standard_count | no-antibody_neut_standard_count | antibody_count | no-antibody_count | prob_escape_uncensored | prob_escape |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | neut_standard | 0 | 11107 | 16475 | 11107 | 16475 | 1.000000 | 1.000000 |
| 1 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | spike | 0 | 11107 | 16475 | 42343 | 140698 | 0.446398 | 0.446398 |
| 2 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | spike | 1 | 11107 | 16475 | 102466 | 340357 | 0.446554 | 0.446554 |
| 3 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | spike | 2 | 11107 | 16475 | 124509 | 404439 | 0.456643 | 0.456643 |
| 4 | lib1 | thaw-1_REGN10933_0.037_1 | thaw-1_no-antibody_control_1 | spike | 3 | 11107 | 16475 | 106742 | 342394 | 0.462421 | 0.462421 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 114 | lib2 | thaw-1_REGN10933_0.59_2 | thaw-1_no-antibody_control_2 | spike | 1 | 52409 | 10877 | 179056 | 361718 | 0.102736 | 0.102736 |
| 115 | lib2 | thaw-1_REGN10933_0.59_2 | thaw-1_no-antibody_control_2 | spike | 2 | 52409 | 10877 | 201992 | 331201 | 0.126574 | 0.126574 |
| 116 | lib2 | thaw-1_REGN10933_0.59_2 | thaw-1_no-antibody_control_2 | spike | 3 | 52409 | 10877 | 141530 | 219133 | 0.134043 | 0.134043 |
| 117 | lib2 | thaw-1_REGN10933_0.59_2 | thaw-1_no-antibody_control_2 | spike | 4 | 52409 | 10877 | 78292 | 83427 | 0.194766 | 0.194766 |
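
As a sanity check on how these columns relate, the sketch below recomputes `prob_escape_uncensored` from the four count columns for a few of the rows displayed above. It is plain Python rather than a call into `dms_variants`; the helper name `recompute_prob_escape` is hypothetical, and capping the value at 1 to obtain `prob_escape` is an assumption based on the column names (in the rows shown, the two columns are identical).

```python
# Rows copied from the `neutralization` output above, as tuples of
# (antibody_neut_standard_count, no-antibody_neut_standard_count,
#  antibody_count, no-antibody_count, reported prob_escape_uncensored)
rows = [
    (11107, 16475, 11107, 16475, 1.000000),  # the neutralization standard itself
    (11107, 16475, 42343, 140698, 0.446398),
    (11107, 16475, 102466, 340357, 0.446554),
    (52409, 10877, 179056, 361718, 0.102736),
    (52409, 10877, 78292, 83427, 0.194766),
]


def recompute_prob_escape(ab_standard, no_ab_standard, ab_count, no_ab_count):
    """Ratio of counts relative to the standard, with versus without antibody."""
    return (ab_count / ab_standard) / (no_ab_count / no_ab_standard)


for ab_standard, no_ab_standard, ab_count, no_ab_count, reported in rows:
    uncensored = recompute_prob_escape(ab_standard, no_ab_standard, ab_count, no_ab_count)
    censored = min(uncensored, 1.0)  # assumed form of the censoring, not confirmed
    print(f"recomputed={uncensored:.6f} reported={reported:.6f} censored={censored:.6f}")
```

The recomputed values agree with the reported column to the six decimal places shown, and the standard has a probability of escape of exactly 1 by construction.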

