# TMH Open Topology
## A tool to evaluate the topological preference of a TMH based on a population of TMHs with known topology

This runs the TMH sequence through a position-dependent matrix of residue scores and compares the total scores of the forward and reversed reads of the TMH. A greater difference indicates a stronger topological preference. The advantage of this method is that the individual contribution of each residue is calculated, so whilst the predictor may not always be the most accurate overall, it allows sensitive evaluation of the topology of a TMP without relying on the hidden layers of neural networks or the hidden states of HMMs.
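
The forward/reverse comparison described above can be sketched as follows. This is a minimal illustration, not the tool's implementation: `toy_matrix`, `directional_score` and `topology_preference` are hypothetical names, the scores are made-up placeholders rather than values derived from a TMH population, and subtracting the reversed score from the forward score is an assumed sign convention.

```python
# Hypothetical position-specific scores: NOT the tool's real matrix.
# Keys are (position, residue); any pair that is missing scores 0.0.
toy_matrix = {
    (0, "K"): 1.0,
    (0, "L"): -0.5,
    (1, "R"): 0.75,
}


def directional_score(sequence, matrix):
    """Return the total score and the per-residue contributions for one read direction."""
    contributions = [
        matrix.get((position, residue), 0.0)
        for position, residue in enumerate(sequence)
    ]
    return sum(contributions), contributions


def topology_preference(sequence, matrix):
    """Return forward score minus reversed score (assumed sign convention)."""
    forward, _ = directional_score(sequence, matrix)
    backward, _ = directional_score(sequence[::-1], matrix)
    return forward - backward


# Toy four-residue input, far shorter than a real TMH, to keep the arithmetic visible.
total, per_residue = directional_score("KRLL", toy_matrix)
print(total, per_residue)
print(topology_preference("KRLL", toy_matrix))
```

Because the per-residue contributions are returned alongside the total, each residue's effect on the final difference stays visible, which is the property the method relies on.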

The input sequence should be a single TMH with 5 flanking residues on either side. However, TMH boundaries are often fuzzy in biology, so determining the exact boundary to a single residue position is difficult, if not impossible. Users are encouraged to submit several versions of the TMH with different boundaries.
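
One way to produce several versions of a TMH with different boundaries is sketched below. It assumes that the full-length protein sequence and an approximate TMH start and end index are available, which this notebook does not require. The function `boundary_variants`, its parameters and the placeholder protein are all hypothetical.

```python
def boundary_variants(protein, tmh_start, tmh_end, flank=5, max_shift=2):
    """Return (start_shift, end_shift, subsequence) for each shifted TMH boundary.

    tmh_start and tmh_end are 0-based, end-exclusive indices of the assumed TMH core.
    Each variant includes `flank` residues on either side of the shifted core.
    """
    variants = []
    for start_shift in range(-max_shift, max_shift + 1):
        for end_shift in range(-max_shift, max_shift + 1):
            start = tmh_start + start_shift - flank
            end = tmh_end + end_shift + flank
            # Skip variants that run off the protein or collapse to nothing.
            if start < 0 or end > len(protein) or start >= end:
                continue
            variants.append((start_shift, end_shift, protein[start:end]))
    return variants


# Placeholder protein: 10 alanines, a 20-residue leucine core, 10 lysines.
protein = "A" * 10 + "L" * 20 + "K" * 10
for start_shift, end_shift, candidate in boundary_variants(protein, 10, 30, max_shift=1):
    print(start_shift, end_shift, len(candidate), candidate)
```

Each candidate can then be assigned to `input_sequence` in turn and the resulting scores compared across boundary choices.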



## Input

In [1]:
# Replace the AA sequence below with your own.
input_sequence = "FRFRVIAALGFLVGAKVLNVQVPFLFKL"

## Sequence checks

In [2]:
def characters(input_sequence):
    # The 20 standard amino acids as upper-case one-letter codes.
    amino_acids = set("IVLFCMAGTSWYPHEQDNKR")
    # Collect invalid characters so the warning is printed once, not once per character.
    invalid = [residue for residue in input_sequence if residue not in amino_acids]
    if invalid:
        print("Characters detected that do not represent an amino acid:", "".join(invalid))
        print("Please remove any non-AA characters or convert the characters to upper case.")
        return False
    return True

def length(input_sequence):
    lower_length_cutoff = 20
    higher_length_cutoff = 40
    # Both cutoffs are exclusive, so 21 to 39 residues are accepted.
    if lower_length_cutoff < len(input_sequence) < higher_length_cutoff:
        return True
    print("Sequence must be longer than", lower_length_cutoff, "and shorter than", higher_length_cutoff, "residues.")
    return False
      
        
# Sequence validity starts as False and is set to True only if both checks pass.
sequence_integrity = False

if characters(input_sequence) and length(input_sequence):
    print("Sequence valid")
    sequence_integrity = True

Sequence valid


## Score Calculation

In [8]:
def topologyscore(input_sequence):
    # Placeholder: the per-residue score lookup is not filled in yet, so nothing is printed after "scores".
    print("\nInside to outside score")
    for position, residue in enumerate(input_sequence):
        print(residue, "at position", position, "scores")
    print("\nOutside to inside score")
    for position, residue in enumerate(input_sequence[::-1]):
        print(residue, "at position", position, "scores")
    # Placeholder return value until the topology and likelihood scores are computed.
    return "Topology-score, Likelihood score"

if sequence_integrity:
    print(topologyscore(input_sequence))


 


Inside to outside score
F at position 0 scores
R at position 1 scores
F at position 2 scores
R at position 3 scores
V at position 4 scores
I at position 5 scores
A at position 6 scores
A at position 7 scores
L at position 8 scores
G at position 9 scores
F at position 10 scores
L at position 11 scores
V at position 12 scores
G at position 13 scores
A at position 14 scores
K at position 15 scores
V at position 16 scores
L at position 17 scores
N at position 18 scores
V at position 19 scores
Q at position 20 scores
V at position 21 scores
P at position 22 scores
F at position 23 scores
L at position 24 scores
F at position 25 scores
K at position 26 scores
L at position 27 scores

Outside to inside score
L at position 0 scores
K at position 1 scores
F at position 2 scores
L at position 3 scores
F at position 4 scores
P at position 5 scores
V at position 6 scores
Q at position 7 scores
V at position 8 scores
N at position 9 scores
L at position 10 scores
V at position 11 scores
K at p

## Interpreting the score

