# TMH Open Topology
## A tool to evaluate the topological preference of a TMH based on a population of TMHs with known topology

This tool runs the TMH sequence through a position-dependent matrix of residue scores and compares the total score of the forward run of the TMH with that of the reversed run. A greater difference indicates a stronger topological preference. The advantage of this method is that the individual contribution of each residue is calculated. Whilst the overall accuracy of the predictor may not always be the highest, it allows for sensitive evaluation of the topology of a TMP without the need for hidden layers in neural networks or HMMs.
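The forward/reverse comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the values in `TOY_MATRIX`, the function names `score_run` and `topology_preference`, and the default score of 0.0 for residues missing from a position are all hypothetical.

```python
# Hypothetical position-dependent scores: TOY_MATRIX[position][residue].
# Real values would be derived from a population of TMHs with known topology.
TOY_MATRIX = [
    {"K": 1.0, "R": 1.0, "D": -1.0},  # position 0
    {"L": 0.5},                       # position 1
    {"D": 0.5, "K": -1.0},            # position 2
]


def score_run(sequence, matrix):
    """Return (total score, list of per-residue contributions) for one run."""
    if len(sequence) > len(matrix):
        raise ValueError("sequence is longer than the scoring matrix")
    contributions = [
        matrix[position].get(residue, 0.0)  # assumed default: unlisted residues score 0.0
        for position, residue in enumerate(sequence)
    ]
    return sum(contributions), contributions


def topology_preference(sequence, matrix):
    """Score the sequence forwards and reversed, and return the difference."""
    forward_total, _ = score_run(sequence, matrix)
    backward_total, _ = score_run(sequence[::-1], matrix)
    return forward_total - backward_total


print(topology_preference("KLD", TOY_MATRIX))  # forward 2.0, reversed -1.5
```

Keeping the per-residue contributions alongside the total is what lets each residue's influence on the result be inspected directly.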

The input sequence should be a single TMH with 5 flanking residues on either side. However, TMH boundaries are often fuzzy in biology, so determining the exact boundary to a single residue position is difficult, if not impossible. Users are encouraged to submit several versions of the TMH with different boundaries.



## Input
Enter your input below. Note that this is neither the full protein sequence nor a FASTA-formatted sequence. The sequence should be the predicted or experimentally derived TMH with ±5 flanking residues.

For example, the ion channel (UniProt ID Q401N2) has a TMH at positions 234-254 (IIALLVPAEALLLADVCGGLL), so TALKSIIALLVPAEALLLADVCGGLLPLRAI would be the input.
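If the full protein sequence and the TMH positions are to hand, the input window can be cut out programmatically. The sketch below assumes 1-based inclusive positions, as in the example above; `tmh_with_flanks` is a hypothetical helper, and the `protein` string is a placeholder (poly-G padding around the example window), not the real Q401N2 sequence.

```python
def tmh_with_flanks(protein, start, end, flank=5):
    """Return the TMH at 1-based inclusive positions start..end plus flanks.

    Flanks are clipped where the TMH lies close to a protein terminus.
    """
    if not 1 <= start <= end <= len(protein):
        raise ValueError("start and end must lie within the protein sequence")
    left = max(start - 1 - flank, 0)
    right = min(end + flank, len(protein))
    return protein[left:right]


# Placeholder protein: the example window padded with glycines.
# In this toy string the TMH sits at positions 16-36, not 234-254.
protein = "G" * 10 + "TALKSIIALLVPAEALLLADVCGGLLPLRAI" + "G" * 10

print(tmh_with_flanks(protein, 16, 36))

# Shifting both boundaries by one residue gives alternative inputs to compare.
for shift in (-1, 0, 1):
    print(shift, tmh_with_flanks(protein, 16 + shift, 36 + shift))
```

Because exact boundaries are uncertain, running the tool on each shifted window shows how sensitive the result is to the chosen boundary.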

Enter your sequence in the input box, and press enter/return.


In [2]:
# Type or paste your own AA sequence at the prompt.
input_sequence = input()

TALKSIIALLVPAEALLLADVCGGLLPLRAI


## Source code

In [3]:
def characters(input_sequence):
    # The 20 standard amino acids as upper-case one-letter codes.
    amino_acids = ["I","V","L","F","C","M","A","G","T","S","W","Y","P","H","E","Q","D","N","K","R"]
    character_check = True
    # Check every character so that each invalid one is reported.
    for residue in input_sequence:
        if residue not in amino_acids:
            print("Character detected that does not represent an amino acid. Please remove any non-AA characters or convert the characters to upper case.")
            character_check = False
    return character_check

def length(input_sequence):
    lower_length_cutoff = 20
    higher_length_cutoff = 40
    # Both cutoffs are exclusive: valid lengths are 21 to 39 residues.
    if lower_length_cutoff < len(input_sequence) < higher_length_cutoff:
        return True
    else:
        print("Sequence must be more than", lower_length_cutoff, "and fewer than", higher_length_cutoff, "residues long.")
        return False
    
              
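def topologyscore(input_sequence):
    # The per-residue score lookup is not filled in yet, so nothing is printed after "scores".
    print("\nInside to outside score")
    for position, residue in enumerate(input_sequence):
        print(residue, "at position", position, "scores")
    # The reversed sequence is scored for the opposite orientation.
    print("\nOutside to inside score")
    for position, residue in enumerate(input_sequence[::-1]):
        print(residue, "at position", position, "scores")
    # Placeholder return value until the scores are calculated.
    return "Topology-score, Likelihood score"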

      
        
# Sequence validity starts as False and is set to True only if every check passes.
sequence_integrity = False

if characters(str(input_sequence)) and length(str(input_sequence)):
    print("Sequence valid")
    sequence_integrity = True


if sequence_integrity:
    print(topologyscore(input_sequence))


Sequence valid

Inside to outside score
T at position 0 scores
A at position 1 scores
L at position 2 scores
K at position 3 scores
S at position 4 scores
I at position 5 scores
I at position 6 scores
A at position 7 scores
L at position 8 scores
L at position 9 scores
V at position 10 scores
P at position 11 scores
A at position 12 scores
E at position 13 scores
A at position 14 scores
L at position 15 scores
L at position 16 scores
L at position 17 scores
A at position 18 scores
D at position 19 scores
V at position 20 scores
C at position 21 scores
G at position 22 scores
G at position 23 scores
L at position 24 scores
L at position 25 scores
P at position 26 scores
L at position 27 scores
R at position 28 scores
A at position 29 scores
I at position 30 scores

Outside to inside score
I at position 0 scores
A at position 1 scores
R at position 2 scores
L at position 3 scores
P at position 4 scores
L at position 5 scores
L at position 6 scores
G at position 7 scores
G at position 8 scores

## Interpreting the score
Validation pending.
