# Average antibody escape across `polyclonal` models
This notebook aggregates and averages the antibody escape computed across multiple `polyclonal` models fit to different libraries, replicates, etc.

First, import Python modules:

In [None]:
import os
import pickle

import pandas as pd

import polyclonal

import yaml

Get parameterized variables from [papermill](https://papermill.readthedocs.io/):

In [None]:
# papermill parameters cell (tagged as `parameters`)
antibody = None
escape_avg_method = None
polyclonal_config = None
avg_pickle = None
selection_groups_dict = None

Convert `selection_groups_dict` into a data frame and load all of the pickled models:

In [None]:
models_df = pd.DataFrame.from_dict(selection_groups_dict, orient="index")
print(f"Averaging the following models for {antibody=}")
display(models_df)

# load each pickled model, closing each file after it is read
assert all(map(os.path.isfile, models_df["pickle_file"]))


def load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


models_df = models_df.assign(model=lambda x: x["pickle_file"].map(load_pickle)).drop(
    columns="pickle_file"
)
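As a rough illustration of the conversion above, the sketch below shows how `pd.DataFrame.from_dict(..., orient="index")` turns a nested dictionary into one row per selection group. The group names, the `library` and `replicate` columns, and the file paths are hypothetical; only the `pickle_file` key is implied by the notebook itself.

```python
import pandas as pd

# Hypothetical input: outer keys name each selection group, inner dicts
# describe the model. Only "pickle_file" is required by the notebook code.
selection_groups_dict = {
    "LibA-rep1": {
        "library": "LibA",
        "replicate": 1,
        "pickle_file": "results/LibA-rep1.pickle",
    },
    "LibB-rep1": {
        "library": "LibB",
        "replicate": 1,
        "pickle_file": "results/LibB-rep1.pickle",
    },
}

# orient="index" makes each outer key a row label and each inner key a column
models_df = pd.DataFrame.from_dict(selection_groups_dict, orient="index")
print(models_df)
```

With this layout, the descriptor columns remain in the data frame after `pickle_file` is swapped for the loaded `model` column.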

Now build the average model, using `escape_avg_method` as the default average to plot:

In [None]:
avg_model = polyclonal.PolyclonalAverage(
    models_df,
    default_avg_to_plot=escape_avg_method,
)

Look at the correlation in escape values across replicates:

In [None]:
avg_model.mut_escape_corr_heatmap()

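Get the `times_seen` threshold for this antibody from the `polyclonal` configuration; it is used to filter mutations in the plots below: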

In [None]:
with open(polyclonal_config) as f:
    times_seen = yaml.safe_load(f)[antibody]["times_seen"]

print(f"{times_seen=}")
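The lookup above implies a configuration that maps each antibody name to a block containing a `times_seen` entry. A minimal sketch of that shape follows, assuming hypothetical antibody names and values; the real configuration file may well hold additional keys that this notebook does not read.

```python
import yaml

# Hypothetical configuration text; antibody names and values are made up.
config_text = """
antibody-1:
  times_seen: 3
antibody-2:
  times_seen: 2
"""

antibody = "antibody-1"  # stand-in for the papermill parameter

# same lookup as in the notebook, applied to the in-memory text
times_seen = yaml.safe_load(config_text)[antibody]["times_seen"]
print(f"{times_seen=}")
```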

Plot the activities:

In [None]:
avg_model.activity_wt_barplot()

Plot the site summaries of the escape:

In [None]:
avg_model.mut_escape_lineplot(
    mut_escape_site_summary_df_kwargs={"min_times_seen": times_seen},
)

Plot the mutation-level escapes averaged across replicates:

In [None]:
avg_model.mut_escape_heatmap(init_min_times_seen=times_seen)

Save the average model to a pickle file:

In [None]:
print(f"Saving model to {avg_pickle=}")

with open(avg_pickle, "wb") as f:
    pickle.dump(avg_model, f)
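A downstream step would presumably reload this file with `pickle.load`. The sketch below shows that round trip with a plain dictionary standing in for the average model; the stand-in object and the temporary path are illustrative assumptions, not part of the pipeline.

```python
import os
import pickle
import tempfile

# Stand-in for the averaged model; the real object is a polyclonal model.
model_like = {"antibody": "hypothetical-antibody", "n_models": 2}

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "avg_model.pickle")

    # write with a context manager so the file is flushed and closed
    with open(path, "wb") as f:
        pickle.dump(model_like, f)

    # read it back the same way a later notebook might
    with open(path, "rb") as f:
        reloaded = pickle.load(f)

print(reloaded == model_like)
```

Unpickling a real model requires the `polyclonal` package to be importable in the loading environment, and pickle files should only be loaded from trusted sources.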