# How to build a genotype-phenotype map (a.k.a. sequence space) from protein lattice models

This notebook demonstrates how to use Jesse Bloom's **protein lattice model package**, [latticeproteins](), to build a genotype-phenotype map. In this case, the phenotype is each protein's folding stability. You must have `latticeproteins` installed, since this package depends on it.

We'll begin by importing part of that package.

In [1]:
import os
from latticeproteins.conformations import Conformations

In `latticeproteins`'s `conformations` module, we can build the ensemble of all possible conformations for sequences of the same length.

In [2]:
length = 6
database_dir = "%s/database" % os.getcwd()
c = Conformations(length, database_dir)
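To make "all possible conformations" concrete, here is a minimal, dependency-free sketch that enumerates every self-avoiding walk of a given length on a 2D square lattice. It is an illustration only: the function name `enumerate_conformations` is hypothetical, it is not how `latticeproteins` builds or stores its database, and it counts rotations and reflections of the same shape as separate walks.

```python
def enumerate_conformations(length):
    """Return every self-avoiding walk of `length` residues on a 2D square lattice.

    Hypothetical helper for illustration; not part of latticeproteins.
    """
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # up, right, down, left
    walks = []

    def extend(path):
        # A complete conformation places one lattice point per residue.
        if len(path) == length:
            walks.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in path:  # self-avoidance: no two residues share a site
                path.append(nxt)
                extend(path)
                path.pop()

    extend([(0, 0)])
    return walks


all_walks = enumerate_conformations(6)
print(len(all_walks))  # 284 walks for 6 residues, symmetry copies included
```

The count grows quickly with chain length, which is presumably why the conformations are computed once and cached in a database directory.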

Here comes the new stuff...

We'll import the `LatticeConformationSpace` object, which builds a sequence space between two starting sequences that differ at all sites.

In [3]:
from latticegpm.space import LatticeConformationSpace
from latticegpm.utils import search_conformation_space

First, we need to find two sequences that have a non-zero fitness and differ at all sites. `search_conformation_space` does exactly that.

In [4]:
temperature = 1.0
threshold = -0.5
wildtype, mutant = search_conformation_space(c, temperature, threshold)
print("Wildtype sequence: " + wildtype)
print("Mutant sequence: " + mutant)

Wildtype sequence: FSQIEH
Mutant sequence: VFTTWN
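
As a quick sanity check, the "differ at all sites" requirement can be verified with a Hamming distance. This is a small sketch with a hypothetical helper, `hamming_distance`, that is not part of `latticegpm`; the two sequences are copied from the output above.

```python
def hamming_distance(seq1, seq2):
    """Count the positions at which two equal-length sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


wildtype = "FSQIEH"
mutant = "VFTTWN"

# The two sequences differ at every site when the distance equals the length.
differs_everywhere = hamming_distance(wildtype, mutant) == len(wildtype)
print(differs_everywhere)
```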


Now, we'll build a sequence space between these two sequences with the `LatticeConformationSpace` object and print out some example nodes in this space.

In [6]:
# Create an instance of LatticeConformationSpace
sequence_space = LatticeConformationSpace(wildtype, mutant, c, temperature=temperature)
# Print the first ten example sequences
sequence_space.print_sequences(sequence_space.sequences[0:10])

* * * *
       
* S-Q *
  | |  
* F I *
    |  
* H-E *
       
* * * *
* * * *
       
* S-Q *
  | |  
* F I *
    |  
* N-E *
       
* * * *
* * * *
       
* S-Q *
  | |  
* F I *
    |  
* H-W *
       
* * * *
* * * *
       
* S-Q *
  | |  
* F I *
    |  
* N-W *
       
* * * *
* * * *
       
* S-Q *
  | |  
* F T *
    |  
* H-E *
       
* * * *
* * * *
       
* S-Q *
  | |  
* F T *
    |  
* N-E *
       
* * * *
* * * *
       
* S-Q *
  | |  
* F T *
    |  
* H-W *
       
* * * *
* * * *
       
* S-Q *
  | |  
* F T *
    |  
* N-W *
       
* * * *
* * * *
       
* S-T *
  | |  
* F I *
    |  
* H-E *
       
* * * *
* * * *
       
* S-T *
  | |  
* F I *
    |  
* N-E *
       
* * * *
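
The space between two sequences that differ at every site contains every combination of wildtype and mutant residues, one choice per site. The sketch below builds that set with `itertools.product`. The helper name `binary_sequence_space` is hypothetical, and this is not necessarily how `LatticeConformationSpace` constructs its sequences internally, although the ordering does match the ten structures printed above.

```python
from itertools import product


def binary_sequence_space(wildtype, mutant):
    """List every sequence that takes the wildtype or mutant residue at each site.

    Hypothetical helper for illustration. Assumes the two sequences have the
    same length and differ at every site; otherwise duplicates would appear.
    """
    sites = zip(wildtype, mutant)  # one (wildtype, mutant) residue pair per site
    return ["".join(combo) for combo in product(*sites)]


space = binary_sequence_space("FSQIEH", "VFTTWN")
print(len(space))   # 2**6 = 64 sequences
print(space[:4])    # ['FSQIEH', 'FSQIEN', 'FSQIWH', 'FSQIWN']
```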


We can access all sequences and their folding stabilities in this space through the following properties.

In [8]:
genotypes = sequence_space.sequences
phenotypes = sequence_space.stabilities
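
For intuition about what a folding stability means in a lattice model, here is a minimal sketch under a stated assumption: the lowest-energy conformation is treated as the native state, every other conformation as unfolded, and the stability is the free energy difference between the two at a given temperature (with the Boltzmann constant set to 1). The function `folding_stability` and the example energies are hypothetical; this is not a claim about the exact expression `latticeproteins` or `latticegpm` evaluates.

```python
import math


def folding_stability(energies, temperature=1.0):
    """Free energy of folding from a list of conformation energies.

    Assumed definition (illustrative): dG = E_native + T * ln(Z - exp(-E_native / T)),
    where Z is the partition sum over all conformations. More negative values
    mean a more stable native state. Needs at least two conformations.
    """
    e_native = min(energies)
    z = sum(math.exp(-e / temperature) for e in energies)
    # Partition sum of the unfolded ensemble: everything except one native state.
    z_unfolded = z - math.exp(-e_native / temperature)
    return e_native + temperature * math.log(z_unfolded)


# Hypothetical energies for three conformations of one sequence.
dG = folding_stability([-2.0, 0.0, 0.0], temperature=1.0)
print(round(dG, 4))  # -2 + ln(2) = -1.3069
```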