# `preferences` method of `AbstractEpistasis`
Tests `AbstractEpistasis.preferences`.

Import Python modules:

In [1]:
import binarymap

import pandas as pd

import dms_variants.globalepistasis

Create functional score data frame:

In [2]:
func_scores_df = pd.DataFrame(
        {'aa_substitutions': ['M1A', 'M1C', 'A2M', 'A2C', 'M1C A2C', 'M1*'],
         'func_score':       [-0.1,  -2.3,   0.8,  -1.2,  -3.0,      -5.0],
         })

func_scores_df

| | aa_substitutions | func_score |
|---|---|---|
| 0 | M1A | -0.1 |
| 1 | M1C | -2.3 |
| 2 | A2M | 0.8 |
| 3 | A2C | -1.2 |
| 4 | M1C A2C | -3.0 |
| 5 | M1* | -5.0 |


Create binarymap:

In [3]:
# Use a distinct name so the `binarymap` module is not shadowed
bmap = binarymap.BinaryMap(
            func_scores_df,
            func_score_var_col=None,
            alphabet=['A', 'C', 'M', '*'])

Now initialize a `MonotonicSplineEpistasisGaussianLikelihood` model and fit it:

In [4]:
model = dms_variants.globalepistasis.MonotonicSplineEpistasisGaussianLikelihood(bmap)
_ = model.fit(ftol=1e-10)
model.phenotypes_df.round(1)

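| | aa_substitutions | func_score | func_score_var | latent_phenotype | observed_phenotype |
|---|---|---|---|---|---|
| 0 | M1A | -0.1 | NaN | 0.1 | -0.1 |
| 1 | M1C | -2.3 | NaN | -1.3 | -2.3 |
| 2 | A2M | 0.8 | NaN | 0.7 | 0.8 |
| 3 | A2C | -1.2 | NaN | -0.5 | -1.2 |
| 4 | M1C A2C | -3.0 | NaN | -1.7 | -3.0 |
| 5 | M1* | -5.0 | NaN | -2.5 | -5.0 |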


Now get the preferences for the latent phenotype using the default method of handling missing values:

In [5]:
model.preferences('latent', base=2).round(3)

| | site | A | C | M |
|---|---|---|---|---|
| 0 | 1 | 0.429 | 0.169 | 0.402 |
| 1 | 2 | 0.303 | 0.219 | 0.478 |
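
The values above are consistent with exponentiating each character's phenotype and normalizing within a site. The following is a minimal sketch under that assumption (preference proportional to `base ** phenotype`). The helper `prefs_from_scores` is hypothetical and not part of `dms_variants`, the wildtype `M` is assumed to have a latent phenotype of 0, and the inputs are the rounded latent phenotypes from `phenotypes_df`, so the result only approximates the library's output for site 1:

```python
import numpy as np


def prefs_from_scores(scores, base=2):
    """Hypothetical helper: map per-character scores at one site to preferences.

    Assumes preference is proportional to base ** score, normalized to sum to 1.
    """
    chars = list(scores)
    weights = np.power(float(base), [scores[c] for c in chars])
    return dict(zip(chars, weights / weights.sum()))


# Rounded latent phenotypes at site 1 (M1A = 0.1, M1C = -1.3);
# the wildtype M is assumed to sit at 0.
site1 = prefs_from_scores({'A': 0.1, 'C': -1.3, 'M': 0.0})
print({c: round(float(p), 2) for c, p in site1.items()})
```

Because the inputs are rounded to one decimal, agreement with the table is only expected to about two decimal places.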


The same for the observed phenotype:

In [6]:
model.preferences('observed', base=2).round(3)

| | site | A | C | M |
|---|---|---|---|---|
| 0 | 1 | 0.482 | 0.105 | 0.413 |
| 1 | 2 | 0.268 | 0.146 | 0.585 |


Now include the stop codon. Only `M1*` was measured, so the value for `*` at site 2 is missing and, by default, is guessed from the overall average:

In [7]:
model.preferences('observed', base=2, exclude_chars=[]).round(3)

| | site | A | C | M | * |
|---|---|---|---|---|---|
| 0 | 1 | 0.475 | 0.103 | 0.406 | 0.016 |
| 1 | 2 | 0.241 | 0.131 | 0.525 | 0.102 |


Now do the same but get the missing values from the **site** averages:

In [8]:
model.preferences('observed', base=2, exclude_chars=[], missing='site_average').round(3)

| | site | A | C | M | * |
|---|---|---|---|---|---|
| 0 | 1 | 0.475 | 0.103 | 0.406 | 0.016 |
| 1 | 2 | 0.208 | 0.113 | 0.453 | 0.226 |
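
Only the site 2 row changes between the two outputs, because only site 2 has an unmeasured `*`. The sketch below shows the two candidate fill values, assuming the overall average is the mean over every measured single mutant and the site average is the mean over the measured single mutants at that site. That reading is an assumption that matches the outputs shown here, not something this notebook states, and the scores are the rounded observed phenotypes from `phenotypes_df`:

```python
import pandas as pd

# Observed phenotypes of the single mutants, rounded as in `phenotypes_df`.
singles = pd.DataFrame({
    'site':   [1,    1,    2,   2,    1],
    'mutant': ['A',  'C',  'M', 'C',  '*'],
    'score':  [-0.1, -2.3, 0.8, -1.2, -5.0],
})

# Assumed overall-average fill: mean over every measured single mutant.
overall_average = singles['score'].mean()

# Assumed site-average fill: mean over the measured single mutants at each site.
site_average = singles.groupby('site')['score'].mean()

print(round(float(overall_average), 2), round(float(site_average.loc[2]), 2))
```

Under these assumptions the site 2 average is much higher than the overall average, which is pulled down by `M1*`. That fits the `*` preference at site 2 rising from 0.102 to 0.226.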


Get preferences in tidy format:

In [9]:
model.preferences('observed', base=2, returnformat='tidy').round(3)

| | wildtype | site | mutant | preference |
|---|---|---|---|---|
| 0 | M | 1 | C | 0.105 |
| 1 | A | 2 | C | 0.146 |
| 2 | A | 2 | A | 0.268 |
| 3 | M | 1 | M | 0.413 |
| 4 | M | 1 | A | 0.482 |
| 5 | A | 2 | M | 0.585 |
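
The tidy layout holds the same numbers as the wide one, with one row per site and character. As a rough illustration of how the two relate, the wide output can be reshaped with plain pandas. This is not how `dms_variants` builds the tidy frame internally; the `wildtypes` mapping is filled in by hand from the substitutions, and the sort is only there to mirror the row order shown above:

```python
import pandas as pd

# Wide-format observed preferences copied from the earlier output.
wide = pd.DataFrame({
    'site': [1, 2],
    'A': [0.482, 0.268],
    'C': [0.105, 0.146],
    'M': [0.413, 0.585],
})

# Wildtype residue per site, read off the substitutions (M1..., A2...).
wildtypes = {1: 'M', 2: 'A'}

tidy = (
    wide.melt(id_vars='site', var_name='mutant', value_name='preference')
    .assign(wildtype=lambda d: d['site'].map(wildtypes))
    .sort_values('preference')
    .reset_index(drop=True)
    [['wildtype', 'site', 'mutant', 'preference']]
)
print(tidy)
```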


Re-scale the preferences with a stringency parameter:

In [10]:
model.preferences('observed', base=2, stringency_param=2).round(3)

| | site | A | C | M |
|---|---|---|---|---|
| 0 | 1 | 0.562 | 0.027 | 0.412 |
| 1 | 2 | 0.165 | 0.049 | 0.786 |
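
These numbers are consistent with raising each preference to the power `stringency_param` and renormalizing within the site. A quick check under that assumption, using the rounded site 1 preferences from the un-scaled observed output:

```python
import numpy as np

# Observed preferences at site 1 (A, C, M) copied from the un-scaled output.
prefs = np.array([0.482, 0.105, 0.413])
stringency_param = 2

# Assumed re-scaling: raise to the stringency power, then renormalize.
powered = prefs ** stringency_param
rescaled = powered / powered.sum()
print(rescaled.round(3))
```

A stringency parameter above 1 sharpens the distribution: the most preferred character gains weight and the least preferred loses it, as the drop in `C` from 0.105 to 0.027 shows.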


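Use `missing='error'` to raise an error on missing values. The first call below succeeds because stop codons are excluded and nothing is missing; the second includes `*`, for which site 2 has no measurement, and so raises: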

In [11]:
model.preferences('observed', base=2, missing='error').round(3)

| | site | A | C | M |
|---|---|---|---|---|
| 0 | 1 | 0.482 | 0.105 | 0.413 |
| 1 | 2 | 0.268 | 0.146 | 0.585 |


In [12]:
# NBVAL_RAISES_EXCEPTION

model.preferences('observed', base=2, exclude_chars=[], missing='error')

ValueError: missing functional scores for some mutations