# Introduction to the Genotype-Phenotype Map Module

This notebook is a brief introduction to how the genotype-phenotype map module works.

External imports for plotting and other utilities.

In [3]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

Imports from the `gpmap` module.

In [5]:
from gpmap import GenotypePhenotypeMap

Let's define an arbitrary genotype-phenotype space. In practice, everything defined in the cells below (wildtype, mutations, genotypes, and phenotypes) typically comes from the experimental data.

In [22]:
utils.genotypes_to_binary?

Signature: utils.genotypes_to_binary(wildtype, genotypes, mutations)
Docstring:
Get binary representation of genotypes w.r.t. to wildtype.

Parameters
----------
wildtype : str
    wildtype sequence.

genotypes : list
    List of genotypes to transform.

mutations : dict
    mutations dictionary that maps sites to mutations.

Returns
-------
binary : list
    list of binary representations.
File:      ~/Documents/research/projects/pkgs/gpmap/gpmap/utils.py
Type:      function


In [23]:
genotypes

['AAA',
 'AAT',
 'ACA',
 'ACT',
 'AGA',
 'AGT',
 'ATA',
 'ATT',
 'CAA',
 'CAT',
 'CCA',
 'CCT',
 'CGA',
 'CGT',
 'CTA',
 'CTT',
 'GAA',
 'GAT',
 'GCA',
 'GCT',
 'GGA',
 'GGT',
 'GTA',
 'GTT',
 'TAA',
 'TAT',
 'TCA',
 'TCT',
 'TGA',
 'TGT',
 'TTA',
 'TTT']

In [24]:
from gpmap import utils

# Wildtype sequence
wt = "AAA"

# Micro-managing here, stating explicitly what substitutions are possible at each site.
# See documentation for more detail.
mutations = {
    0:utils.DNA,
    1:utils.DNA,
    2:["A","T"]
}

genotypes = utils.mutations_to_genotypes(mutations, wildtype=wt)
binary = utils.genotypes_to_binary(wt, genotypes, mutations)

# Generate random phenotype values
phenotypes = np.random.rand(len(genotypes))
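
To make the role of the `mutations` dictionary concrete, here is a minimal sketch of how a site-to-alphabet mapping can be expanded into the full list of genotypes. It is an illustration only, not the implementation of `utils.mutations_to_genotypes`: the helper name `enumerate_genotypes` and the local `DNA` list are assumptions made for this example, and the sketch ignores the `wildtype` argument that the real function accepts.

```python
from itertools import product

# Hypothetical stand-in for utils.DNA; the real constant may differ.
DNA = ["A", "C", "G", "T"]


def enumerate_genotypes(mutations):
    """Return every genotype allowed by a site -> alphabet mapping."""
    # Visit sites in order so each string position matches its site index.
    alphabets = [mutations[site] for site in sorted(mutations)]
    # The Cartesian product yields one letter per site for each genotype.
    return ["".join(letters) for letters in product(*alphabets)]


mutations = {0: DNA, 1: DNA, 2: ["A", "T"]}
all_genotypes = enumerate_genotypes(mutations)
print(len(all_genotypes))  # 4 * 4 * 2 combinations
```

With four letters at sites 0 and 1 and two letters at site 2, the space contains 4 × 4 × 2 = 32 genotypes, which matches the length of the `genotypes` list printed above.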

## Creating a Genotype-phenotype map instance

Create an instance of the `GenotypePhenotypeMap` object, passing in the wildtype sequence, the genotypes and their phenotypes, and the substitution map.

In [25]:
from gpmap import GenotypePhenotypeMap

In [26]:
gpm = GenotypePhenotypeMap(wt, # wildtype sequence
                   genotypes, # genotypes
                   phenotypes, # phenotypes
                   stdeviations=None, # errors in measured phenotypes
                   log_transform=False, # Should the map log_transform the space?
                   mutations=mutations # Substitution map to alphabet 
)

In [30]:
gpm.data

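| | genotypes | n_replicates | phenotypes | stdeviations | binary |
|---|---|---|---|---|---|
| 0 | AAA | 1 | 0.398113 | | 0 |
| 1 | AAT | 1 | 0.516832 | | 1 |
| 2 | ACA | 1 | 0.883106 | | 1000 |
| 3 | ACT | 1 | 0.930886 | | 1001 |
| 4 | AGA | 1 | 0.434212 | | 100 |
| 5 | AGT | 1 | 0.338685 | | 101 |
| 6 | ATA | 1 | 0.34692 | | 10 |
| 7 | ATT | 1 | 0.38408 | | 11 |
| 8 | CAA | 1 | 0.641701 | | 1000000 |
| 9 | CAT | 1 | 0.450876 | | 1000001 |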
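
The `binary` values in this output look like bit strings whose leading zeros were dropped when the column was read as integers. That reading is an inference from the rows shown, not something the notebook states. One encoding consistent with all ten rows gives each non-wildtype letter at each site its own bit, so the wildtype is all zeros. The sketch below illustrates that scheme; `encode_binary` is a hypothetical helper written for this example and is not the code behind `utils.genotypes_to_binary`.

```python
def encode_binary(wildtype, genotype, mutations):
    """Encode a genotype with one bit per non-wildtype letter at each site."""
    bits = []
    for site in sorted(mutations):
        # Every letter other than the wildtype letter gets its own bit.
        alternatives = [x for x in mutations[site] if x != wildtype[site]]
        # Set the bit for the letter present in this genotype, if any.
        bits.extend("1" if genotype[site] == alt else "0" for alt in alternatives)
    return "".join(bits)


# Same wildtype and substitution map as the example in this notebook.
wt = "AAA"
dna = ["A", "C", "G", "T"]
mutations = {0: dna, 1: dna, 2: ["A", "T"]}

for genotype in ["AAA", "ACA", "AGT", "CAT"]:
    encoded = encode_binary(wt, genotype, mutations)
    # int() drops leading zeros, mimicking the values in the output above.
    print(genotype, encoded, int(encoded))
```

Under this scheme each genotype maps to a 7-bit string (3 bits each for sites 0 and 1, 1 bit for site 2), and converting that string to an integer reproduces the values shown in the `binary` column.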
