# Problem:

Implement the classical affine-gap local alignment algorithm for sequences over the 20-letter protein alphabet.

The BLOSUM scoring matrix can be found in the attachment. The attached data consist of two sets of 50 protein sequences, each divided into 25 pairs of putatively homologous sequences: one set contains pairs of human and mouse sequences, the other pairs of human and fruit fly sequences. (For the affine gap weight, you can choose $W_g = 10$ and $W_s = 2$, so that a gap of length $q$ costs $W_g + q W_s$.)

# Solution:

## Data Preparation

First, convert the doc files to plain text by hand and save them as *data_human_fly* and *data_human_mouse*.

Then clean the data and store the sequences in Python lists.

In [1]:
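import numpy as np
# np.set_printoptions(threshold=np.inf)
Wg, Ws = 10, 2  # gap-opening and gap-extension weights

def data_preparation(file):
    with open(file, 'rt') as f:
        lines = f.readlines()

    l = list()
    # fly/human hold the first and second sequence of the current pair
    fly, human, s = '', '', ''
    pre_seq = False
    # the trailing '' flushes a sequence that runs up to the end of the file
    for line in lines + ['']:
        line = line.rstrip('\n')  # drop the newline without eating a real character
        if line.isalpha() and line.isupper():
            # sequence lines are all uppercase letters; join wrapped lines
            s += line
            pre_seq = True
        elif pre_seq:
            # first non-sequence line after a sequence: the sequence is complete
            if len(fly) == 0:
                fly = s
                s = ''
            else:
                human = s
                s = ''
                l.append([fly, human])
                fly, human = '', ''
            pre_seq = False
    return l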
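
The grouping logic above can be tried without the data files. The following is a minimal sketch that applies the same rule (consecutive all-uppercase lines form one sequence, two sequences form one pair) to an in-memory list of lines. The function name `parse_pairs` and the header lines in `sample` are hypothetical; the real layout of the converted text files is not shown here.

```python
def parse_pairs(lines):
    """Group consecutive all-uppercase lines into sequences, two sequences per pair."""
    pairs, current, first = [], '', None
    # the trailing '' acts as a sentinel that flushes the last sequence
    for raw in list(lines) + ['']:
        line = raw.strip()
        if line.isalpha() and line.isupper():
            current += line          # wrapped sequence line: keep concatenating
        elif current:
            if first is None:
                first = current      # first sequence of the pair
            else:
                pairs.append([first, current])
                first = None
            current = ''
    return pairs

# Hypothetical input: header lines are assumptions, not the real file format.
sample = [
    "1. hypothetical header, first sequence",
    "MVNFTV",
    "DEIRGL",
    "",
    "1. hypothetical header, second sequence",
    "MVNFTVDQ",
]
print(parse_pairs(sample))  # [['MVNFTVDEIRGL', 'MVNFTVDQ']]
```

Note that a header made of a single all-uppercase word would be mistaken for sequence data under this rule, so the input should be checked for such lines.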

In [2]:
human_fly = data_preparation('data_human_fly')
human_mouse = data_preparation('data_human_mouse')

# the human_fly data has its sequences in the wrong order; swap the fly and human sequences in each pair
for i, v in enumerate(human_fly):
    human_fly[i][0], human_fly[i][1] = v[1], v[0]

In [3]:
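# load BLOSUM62 into a dict keyed by residue pair, e.g. blosum['AR']
with open('BLOSUM62-As1.txt', 'rt') as f:
    lines = f.readlines()

blosum = dict()
col = lines[0].split()  # header row: the column residues
for i in range(1, len(col)+1):
    tmp_row = lines[i].split()
    row = tmp_row[0]  # first field of each row is the row residue
    for j, v in enumerate(tmp_row[1:]):
        blosum[row + col[j]] = int(v)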
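
To make the resulting lookup table concrete, here is a small sketch that parses a 3x3 excerpt of BLOSUM62 written in the layout the cell above expects (a header row of column letters, then one row per residue). The names `excerpt` and `mini_blosum` are illustrative only, and the layout of the attached file is assumed from the parsing code rather than known.

```python
# Excerpt of BLOSUM62 for residues A, R, N (layout assumed, see lead-in).
excerpt = """\
   A  R  N
A  4 -1 -2
R -1  5  0
N -2  0  6
"""

lines = excerpt.splitlines()
cols = lines[0].split()
mini_blosum = {}
for row_line in lines[1:len(cols) + 1]:
    row, *values = row_line.split()
    for col, v in zip(cols, values):
        # key is the two residues concatenated, value is the substitution score
        mini_blosum[row + col] = int(v)

print(mini_blosum['AR'], mini_blosum['RA'], mini_blosum['NN'])  # -1 -1 6
```

Because the matrix is symmetric, both `'AR'` and `'RA'` are stored, so the alignment code can look up `blosum[x + y]` without ordering the residues.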

## Compute the optimal score

In [4]:
def local_alignment(s1, s2):

    # d is a tensor of 3 dims; the first dim selects matrix A, B or C:
    #   A (0): the alignment ends with s1[i] aligned to s2[j]
    #   B (1): the alignment ends with a gap in s1
    #   C (2): the alignment ends with a gap in s2
    d = np.zeros((3, len(s1)+1, len(s2)+1), dtype=np.int32)
    # "minus infinity" for forbidden border cells; halved so that subtracting
    # a gap penalty from it cannot wrap around in int32 arithmetic
    neg_inf = np.iinfo(np.int32).min // 2
    # initialize matrix A
    d[0, :, 0] = d[0, 0, :] = neg_inf
    d[0, 0, 0] = 0
    # initialize matrix B
    d[1, 0, :] = [-(Wg + Ws * k) for k in range(len(s2)+1)]
    d[1, :, 0] = neg_inf
    # initialize matrix C
    d[2, :, 0] = [-(Wg + Ws * k) for k in range(len(s1)+1)]
    d[2, 0, :] = neg_inf

    s = np.zeros((len(s1)+1, len(s2)+1), dtype=np.int32)
    s_dir = np.zeros((len(s1)+1, len(s2)+1), dtype=np.int32)
    for i, x in enumerate(s1):
        for j, y in enumerate(s2):
            # A[i+1, j+1]: extend any alignment ending at (i, j), or start a new one
            d[0, i+1, j+1] = blosum[x+y] + max(*d[:, i, j], 0)
            # B[i+1, j+1]: open (Wg + Ws) or extend (Ws) a gap in s1
            d[1, i+1, j+1] = max(*(d[:, i+1, j] - [Wg + Ws, Ws, Wg + Ws]), 0)
            # C[i+1, j+1]: open (Wg + Ws) or extend (Ws) a gap in s2
            d[2, i+1, j+1] = max(*(d[:, i, j+1] - [Wg + Ws, Wg + Ws, Ws]), 0)
            # score matrix: the best of A, B, C and which matrix attains it
            s[i+1, j+1] = d[:, i+1, j+1].max()
            s_dir[i+1, j+1] = d[:, i+1, j+1].argmax()
    return s, s_dir
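
Read off from the loop body, the recurrences computed by `local_alignment` can be written as follows. Here $s(x_i, y_j)$ denotes the BLOSUM62 score of the two residues and $S$ is the score matrix `s`; the symbol names are chosen for this summary and do not come from the assignment text.

$$
\begin{aligned}
A_{i,j} &= s(x_i, y_j) + \max\{A_{i-1,j-1},\ B_{i-1,j-1},\ C_{i-1,j-1},\ 0\} \\
B_{i,j} &= \max\{A_{i,j-1} - (W_g + W_s),\ B_{i,j-1} - W_s,\ C_{i,j-1} - (W_g + W_s),\ 0\} \\
C_{i,j} &= \max\{A_{i-1,j} - (W_g + W_s),\ B_{i-1,j} - (W_g + W_s),\ C_{i-1,j} - W_s,\ 0\} \\
S_{i,j} &= \max\{A_{i,j},\ B_{i,j},\ C_{i,j}\}
\end{aligned}
$$

Opening a gap costs $W_g + W_s$ and each further gap position costs $W_s$, so a gap of length $q$ costs $(W_g + W_s) + (q - 1) W_s = W_g + q W_s$. With $W_g = 10$ and $W_s = 2$, a gap of length 3 costs $10 + 3 \cdot 2 = 16$.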

## Traceback

In [5]:
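def traceback(s, s_dir, s1, s2):
    # NOTE: each step follows the best of A/B/C at the current cell rather than
    # per-matrix pointers, so the path may deviate from the exact affine-gap
    # optimum; `gaps` counts gap characters, not gap openings.
    ans1, ans2 = '', ''
    # start from the cell with the highest score
    i, j = np.unravel_index(s.argmax(), s.shape)
    score = s[i, j]
    gaps = 0
    # walk back until the score drops to 0 (start of the local alignment)
    while i > 0 and j > 0 and s[i, j] > 0:
        if s_dir[i, j] == 0:
            # A: s1[i-1] is aligned to s2[j-1]
            ans1 += s1[i-1]
            ans2 += s2[j-1]
            i, j = i - 1, j - 1
        elif s_dir[i, j] == 1:
            # B: gap in s1
            ans1 += '█'
            ans2 += s2[j-1]
            j = j - 1
            gaps += 1
        else:
            # C: gap in s2
            ans1 += s1[i-1]
            ans2 += '█'
            i = i - 1
            gaps += 1
    # the strings were built back to front
    ans1, ans2 = ans1[::-1], ans2[::-1]
    return score, gaps, ans1, ans2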
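
One way to sanity-check a traceback is to rescore the two gapped strings it returns and compare the result with the reported score. The sketch below does this under the same gap model ($W_g + q W_s$ per gap run of length $q$). The helper `rescore` and the toy scoring function are assumptions added for illustration and are not part of the original notebook.

```python
def rescore(a1, a2, score, Wg=10, Ws=2, gap='█'):
    """Score a gapped alignment: substitution scores minus Wg + q*Ws per gap run."""
    assert len(a1) == len(a2)
    total = 0
    prev_gap = None  # row (1 or 2) that held the gap in the previous column, if any
    for x, y in zip(a1, a2):
        if x == gap or y == gap:
            row = 1 if x == gap else 2
            # extending a gap in the same row costs Ws, opening one costs Wg + Ws
            total -= Ws if row == prev_gap else Wg + Ws
            prev_gap = row
        else:
            total += score(x, y)
            prev_gap = None
    return total

# Toy scoring (an assumption for illustration): +5 for a match, -4 for a mismatch.
toy = lambda x, y: 5 if x == y else -4
print(rescore("ACG█T", "ACGGT", toy))  # 4 matches, one gap of length 1: 20 - 12 = 8
print(rescore("AC██T", "ACGGT", toy))  # 3 matches, one gap of length 2: 15 - 14 = 1
```

In the notebook this could be called as `rescore(ans1, ans2, lambda x, y: blosum[x + y])`. If the rescored value differs from the reported `Score`, the traceback path and the dynamic-programming optimum disagree for that pair.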

## Running

In [6]:
def running(data):
    for i, (s1, s2) in enumerate(data):
        print("Pair of Sequences:", i+1)
        print("Len S1:", len(s1), "   Len S2:", len(s2))
        s, s_dir = local_alignment(s1, s2)
        score, gaps, ans1, ans2 = traceback(s, s_dir, s1, s2)
        print("Score:", score)
        print("Gaps:", gaps)
        print(ans1, ans2, '\n\n\n', sep='\n')

### Human-Fly

In [7]:
running(human_fly)

Pair of Sequences: 1
Len S1: 858    Len S2: 844
Score: 3499
Gaps: 22
MVNFTVDQIRAIMDKKANIRNMSVIAHVDHGKSTLTDSLVCKAGIIASARAGETRFTDTRKDEQERCITIKSTAISLFYELSENDLNFI████KQSKDGAGFLINLIDSPGHVDFSSEVTAALRVTDGALVVVDCVSGVCVQTETVLRQAIAERIKPVLMMNKMDRALLELQLEPEELYQTFQRIVENVNVIISTYGEGESGPMGNIMIDPVLGTVGFGSGLHGWAFTLKQFAEMYVAKFAAKGEGQLGPAERAKKVEDMMKKLWGDRYFDPANGKFSKSATSPEGKKLPRTFCQLILDPIFKVFDAIMNFKKEETAKLIEKLDIKLDSEDKDKEGKPLLKAVMRRWLPAGDALLQMITIHLPSPVTAQKYRCELLYEGPPDDEAAMGIKSCDPKGPLMMYISKMVPTSDKGRFYAFGRVFSGLVSTGLKVRIMGPNYTPGKKEDLYLKPIQRTILMMGRYVEPIEDVPCGNIVGLVGVDQFLVKTGTITTFEHAHNMRVMKFSVSPVVRVAVEAKNPADLPKLVEGLKRLAKSDPMVQCIIEESGEHIIAGAGELHLEICLKDLEEDHACIPIKKSDPVVSYRETVSEESNVLCLSKSPNKHNRLYMKARPFPDGLAEDIDKGEVSARQELKQRARYLAEKYEWDVAEARKIWCFGPDGTGPNILTDITKGVQYLNEIKDSVVAGFQWATKEGALCEENMRGVRFDVHDVTLHADAIHRGGGQIIPTARRCLYASVLTAQPRLMEPIYLVEIQCPEQVVGGIYGVLNRKRGHVFEESQVAGTPMFVVKAYLPVNESFGFTADLRSNTGGQAFPQCVFDHWQILPGDPFDNSSRPSQVVAETRKRKGLKEGIPALDNFLDKL
MVNFTVDEIRGLMDKKRNIRNMSVIAHVDHGKSTLTDSLVSKAGIIAGAKAGETRFTDTRKDEQERCI

Score: 9283
Gaps: 235
GACYNTSQKCDWKVDCRDSSDEINCTEI█CLHNEFSCGNG█ECIPRAYVCDHDNDCQDGSDEHACNYPTCGGYQFTCPSGRCIYQNWVCDGEDDCKDNGDEDGC█E█SGPHDVHKCSPREWSCPESGRCISIYKVCDGILDCPGREDENNTSTGKYCSMTLCSALNCQYQCHETPYGGACFCPPGYIINHNDSRTCVEFDDCQIWGICDQKCESRPGRHLCHCEEGYILERGQYCKANDSFGEASIIFSNGRDLLIGDIHGRSFRILVESQNRGVAVGVAFHYHLQRVFWTDTVQNKVFSVDINGLN███IQ█EVLNVSVETPENLAVDWVNNKIYLVETKVNRIDMVNLDGSYRVTLITENLGHPRGIAVDPTVGYLFFSDWESLSGEPKLERAFMDGSNRKDLVKTKLGWPAGVTLDMISKRVYWVDSRFDYIETVTYDGIQRKTVVHGGSLIPHPFGVSLFEGQVFFTDWTKMAVLKANKFT█ETNPQVYYQA█S█LRPYGVTVYHSLRQPYATNPCKDNNGGCEQVCVLSH█RTDNDGLGFRCKCTFGFQLDTDERHCIAVQNFLIFSSQVAIRG██IPFTLSTQEDVMVPVSGNPSFFVGIDFDAQDSTIFFSDMSKHMIFKQKIDGTGREILAANRVENVESLAFDWISKNLYWTDSHYKSISVMRLADKT█RRTVVQYLNNPRSVVVHPFAGYLFFTDWFRPAKIMRAWSDGSHLLPVINTTLGWPNGLAIDWAASRLYWVDAYFDKIEHSTFDGLDRRRLGHIEQMTHPFGLAIFGEHLFFTDWRLGAIIRVRKADG██GEMTVIRSGIAYILHLKSYDVNIQTGSNACNQPTH█PNGDCSHFCF██████████P█V████P███NFQRVCGCPYGMRLASNHLTCEGDPTNEPPTEQC█GLFSFPCKNGRCVPNYYLCDGVDDCHDNSDEQLCGTLNNTCSSSAFTCGHGECIPAHWRCDKRNDCVDG

Pair of Sequences: 7
Len S1: 315    Len S2: 341
Score: 964
Gaps: 1
LSCRFYQHKFPEVEDVVMVNVRSIAEMGAYVSLLEYNNIEGMILLSELSRRRIRSINKLIRIGRNECVVVIRVDKEKGYIDLSKRRVSPEEAIKCEDKFTKSKTVYSILRHVAEVLEYTKDEQLESLFQRTAWVFDDKYKRPGYGAYDAFKHAVSDPSILDSLDLNEDEREVLINNINRRLTPQAVKIRADIEVACYGYEGIDAVKEALRAGLNCSTENMPIKINLIAPPRYVMTTTTLERTEGLSVLSQAMAVIKEKIEEKRGVFNVQMEPKVVTDTDETELARQMERLERENAEVDGDDDAEE
LTSRFYNERYPEIEDVVMVNVLSIAEMGAYVHLLEYNNIEGMILLSELSRRRIRSINKLIRVGKTEPVVVIRVDKEKGYIDLSKRRVSPEDVEKCTERFAKAKAINSLLRHVADILGFEGNEKLEDLYQKTAWHFEKKYNNKTV█AYDIFKQSVTDPTVFDECNLEPETKEVLLSNIKRKLVSPTVKIRADIECSCYGYEGIDAVKASLTKGLELSTEELPIRINLIAPPLYVMTTSTTKKTDGLKALEVAIEHIRAKTSEYDGEFKVIMAPKLVTAIDEADLARRLERAEAENAQVAGDDDEED




Pair of Sequences: 8
Len S1: 463    Len S2: 462
Score: 2069
Gaps: 0
MGKEKTHINIVVIGHVDSGKSTTTGHLIYKCGGIDKRTIEKFEKEAAEMGKGSFKYAWVLDKLKAERERGITIDISLWKFETTKYYITIIDAPGHRDFIKNMITGTSQADCAVLIVAAGVGEFEAGISKNGQTREHALLAYTLGVKQLIVGVNKMDSTEPAYSEKRYDEIVKEVSAYIKKIGYNPATVPFVPISGWHGDNMLEPSPNMPWFKGWKVERKEGNASGVSLLEALDTILPPTRPTDKPLRLP

Pair of Sequences: 11
Len S1: 128    Len S2: 299
Score: 401
Gaps: 2
KKLVV█KGGKKKKQVLKFTLDCTHPVEDGIMDAANFEQFLQERIKVNGKAGNLGGGVVTIERSKSKITVTSEVPFSKRYLKYLTKKYLKKNNLRDWLRVVANSKESYELRYFQINQDEEEEEDED
KNVLRGKGQKKKKVSLRFTIDCTNIAEDSIMDVADFEKYIKARLKVNGKVNNLGNN█VTFERSKLKLIVSSDVHFSKAYLKYLTKKYLKKNSLRDWIRVVANEKDSYELRYFRISSNDDEDDDAE




Pair of Sequences: 12
Len S1: 375    Len S2: 376
Score: 1934
Gaps: 0
EEEIAALVIDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVMDSGDGVTHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVRDIKEKLCYVALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTFNSIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKYSVWIGGSILASLSTFQQMWISKQEYDESGPSIVHRKCF
DEEVAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQSKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVLDSGDGVSHTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVRDIKEKLCYVALDFEQEMATAASS

Len S1: 158    Len S2: 159
Score: 766
Gaps: 0
MSGIALSRLAQERKAWRKDHPFGFVAVPTKNPDGTMNLMNWECAIPGKKGTPWEGGLFKLRMLFKDDYPSSPPKCKFEPPLFHPNVYPSGTVCLSILEEDKDWRPAITIKQILLGIQELLNEPNIQDPAQAEAYTIYCQNRVEYEKRVRAQAKKFA
MSGIAITRLGEERKAWRKDHPFGFVARPAKNPDGTLNLMIWECAIPGKKSTPWEGGLYKLRMIFKDDYPTSPPKCKFEPPLFHPNVYPSGTVCLSLLDEEKDWRPAITIKQILLGIQDLLNEPNIKDPAQAEAYTIYCQNRLEYEKRVRAQARAMA




Pair of Sequences: 22
Len S1: 1013    Len S2: 1041
Score: 4067
Gaps: 2
GDKKDDKDSPKKNKGKERRDLDDLKKEVAMTEHKMSVEEVCRKYNTDCVQGLTHSKAQEILARDGPNALTPPPTTPEWVKFCRQLFGGFSILLWIGAILCFLAYGIQAGTEDDPSGDNLYLGIVLAAVVIITGCFSYYQEAKSSKIMESFKNMVPQQALVIREGEKMQVNAEEVVVGDLVEIKGGDRVPADLRIISAHGCKVDNSSLTGESEPQTRSPDCTHDNPLETRNITFFSTNCVEGTARGVVVATGDRTVMGRIATLASGLEVGKTPIAIEIEHFIQLITGVAVFLGVSFFILSLILGYTWLEAVIFLIGIIVANVPEGLLATVTVCLTLTAKRMARKNCLVKNLEAVETLGSTSTICSDKTGTLTQNRMTVAHMWFDNQIHEADTTEDQSGTSFDKSSHTWVALSHIAGLCNRAVFKGGQDNIPVLKRDVAGDASESALLKCIELSSGSVKLMRERNKKVAEIPFNSTNKYQLSIHETEDPNDNRYLLVMKGAPERILDRCSTILLQGKEQPLDEEMKEAFQNAYLELGGLGERVLGFCHYYLPEEQFPKGFAFDCDDV

### Human-Mouse

In [8]:
running(human_mouse)

Pair of Sequences: 1
Len S1: 1169    Len S2: 1169
Score: 5744
Gaps: 0
MAEEEVAKLEKHLMLLRQEYVKLQKKLAETEKRCALLAAQANKESSSESFISRLLAIVADLYEQEQYSDLKIKVGDRHISAHKFVLAARSDSWSLANLSSTKELDLSDANPEVTMTMLRWIYTDELEFREDDVFLTELMKLANRFQLQLLRERCEKGVMSLVNVRNCIRFYQTAEELNASTLMNYCAEIIASHWDDLRKEDFSSMSAQLLYKMIKSKTEYPLHKAIKVEREDVVFLYLIEMDSQLPGKLNEADHNGDLALDLALSRRLESIATTLVSHKADVDMVDKSGWSLLHKGIQRGDLFAATFLIKNGAFVNAATLGAQETPLHLVALYSSKKHSADVMSEMAQIAEALLQAGANPNMQDSKGRTPLHVSIMAGNEYVFSQLLQCKQLDLELKDHEGSTALWLAVQHITVSSDQSVNPFEDVPVVNGTSFDENSFAARLIQRGSHTDAPDTATGNCLLQRAAGAGNEAAALFLATNGAHVNHRNKWGETPLHTACRHGLANLTAELLQQGANPNLQTEEALPLPKEAASLTSLADSVHLQTPLHMAIAYNHPDVVSVILEQKANALHATNNLQIIPDFSLKDSRDQTVLGLALWTGMHTIAAQLLGSGAAINDTMSDGQTLLHMAIQRQDSKSALFLLEHQADINVRTQDGETALQLAIRNQLPLVVDAICTRGADMSVPDEKGNPPLWLALANNLEDIASTLVRHGCDATCWGPGPGGCLQTLLHRAIDENNEPTACFLIRSGCDVNSPRQPGANGEGEEEARDGQTPLHLAASWGLEETVQCLLEFGANVNAQDAEGRTPIHVAISSQHGVIIQLLVSHPDIHLNVRDRQGLTPFACAMTFKNNKSAEAILKRESGAAEQVDNKGRNFLHVAVQNSDIESVLFLISVHANVNSRVQDASKLTPLHLAVQAGSEIIVRNLLLAGA

Len S1: 566    Len S2: 572
Score: 36
Gaps: 0
LFQCDHVQYTLVPVSGW
LYWCSWIATDLVVVVGW




Pair of Sequences: 8
Len S1: 1480    Len S2: 1476
Score: 6018
Gaps: 6
MQRSPLEKASVVSKLFFSWTRPILRKGYRQRLELSDIYQIPSVDSADNLSEKLEREWDRELASKKNPKLINALRRCFFWRFMFYGIFLYLGEVTKAVQPLLLGRIIASYDPDNKEERSIAIYLGIGLCLLFIVRTLLLHPAIFGLHHIGMQMRIAMFSLIYKKTLKLSSRVLDKISIGQLVSLLSNNLNKFDEGLALAHFVWIAPLQVALLMGLIWELLQASAFCGLGFLIVLALFQAGLGRMMMKYRDQRAGKISERLVITSEMIENIQSVKAYCWEEAMEKMIENLRQTELKLTRKAAYVRYFNSSAFFFSGFFVVFLSVLPYALIKGIILRKIFTTISFCIVLRMAVTRQFPWAVQTWYDSLGAINKIQDFLQKQEYKTLEYNLTTTEVVMENVTAFWEEGFGELFEKAKQNNNNRKTSNGDDSLFFSNFSLLGTPVLKDINFKIERGQLLAVAGSTGAGKTSLLMMIMGELEPSEGKIKHSGRISFCSQFSWIMPGTIKENIIFGVSYDEYRYRSVIKACQLEEDISKFAEKDNIVLGEGGITLSGGQRARISLARAVYKDADLYLLDSPFGYLDVLTEKEIFESCVCKLMANKTRILVTSKMEHLKKADKILILHEGSSYFYGTFSELQNLQPDFSSKLMGCDSFDQFSAERRNSILTETLHRFSLEGDAPVSWTETKKQSFKQTGEFGEKRKNSILNPINSIRKFSIVQKTPLQMNGIEEDSDEPLERRLSLVPDSEQGEAILPRISVISTGPTLQARRRQSVLNLMTHSVNQGQNIHRKTTASTRKVSLAPQANLTELDIYSRRLSQETGLEISEEINEEDLKECFFDDMESIPAVTT

Score: 5013
Gaps: 2
MKLLKPTWVNHNGKPIFSVDIHPDGTKFATGGQGQDSGKVVIWNMSPVLQEDDEKDENIPKMLCQMDNHLACVNCVRWSNSGMYLASGGDDKLIMVWKRATYIGPSTVFGSSGKLANVEQWRCVSILRNHSGDVMDVAWSPHDAWLASCSVDNTVVIWNAVKFPEILATLRGHSGLVKGLTWDPVGKYIASQADDRSLKVWRTLDWQLETSITKPFDECGGTTHVLRLSWSPDGHYLVSAHAMNNSGPTAQIIEREGWKTNMDFVGHRKAVTVVKFNPKIFKKKQKNGSSAKPSCPYCCCAVGSKDRSLSVWLTCLKRPLVVIHELFDKSIMDISWTLNGLGILVCSMDGSVAFLDFSQDELGDPLSEEEKSRIHQSTYGKSLAIMTEAQLSTAVIENPEMLKYQRRQQQQQLDQKSAATREMGSATSVAGVVNGESLEDIRKNLLKKQVETRTADGRRRITPLCIAQLDTGDFSTAFFNSIPLSGSLAGTMLSSHSSPQLLPLDSSTPNSFGASKPCTEPVVAASARPAGDSVNKDSMNATSTPAALSPSVLTTPSKIEPMKAFDSRFTERSKATPGAPALTSMTPTAVERLKEQNLVKELRPRDLLESSSDSDEKVPLAKASSLSKRKLELEVETVEKKKKGRPRKDSRLMPVSLSVQSPAALTAEKEAMCLSAPALALKLPIPSPQRAFTLQVSSDPSMYIEVENEVTVVGGVKLSRLKCNREGKEWETVLTSRILTAAGSCDVVCVACEKRMLSVFSTCGRRLLSPILLPSPISTLHCTGSYVMALTAAATLSVWDVHRQVVVVKEESLHSILAGSDMTVSQILLTQHGIPVMNLSDGKAYCFNPSLSTWNLVSDKQDSLAQCADFRSSLPSQDAMLCSGPLAINQGRTSNSGRQAARLFSVPHVVQQETTLAYLENQVAAALTLQSSHEYRHWLLVYARYLVNEGFEYRLREICKDLLGPVHYSTGSQWESTVVG

Score: 4371
Gaps: 23
MASPTSTNPAHAHFESFLQAQLCQDVLSSFQELCGALGLEPGGGLPQYHKIKDQLNYWSAKSLWTKLDKRAGQPVYQQGRACTSTKCLVVGAGPCGLRVAVELALLGARVVLVEKRTKFSRHNVLHLWPFTIHDLRALGAKKFYGRFCTGTLDHISIRQLQLLLLKVALLLGVEIHWGVTFTGLQPPPRKGSGWRAQLQPNPPAQLANYEFDVLISAAGGKFVPEGFKVREMRGKLAIGITANFVNGRTVEETQVPEISGVARIYNQSFFQSLLKATGIDLENIVYYKDDTHYFVMTAKKQCLLRLGVLRQDWPDTNRLLGSANVVPEALQRFTRAAADFATHGKLGKLEFAQDAHGQPDVSAFDFTSMMRAESSARVQEKHGARLLLGLVGDCLVEPFWPLGTGVARGFLAAFDAAWMVKRWAEGAESLEVLAERESLYQLLSQTSPENMHRNVAQYGLDPATRYPNLNLRAVTPNQVRDLYDVLAKEPVQRNNDKTDTGMPATGSAGTQEELLRWCQEQTAGYPGVHVSDLSSSWADGLALCALVYRLQPGLLEPSELQGLGALEATAWALKVAENELGITPVVSAQAVVAGSDPLGLIAYLSHFHSAFKSMAHSPGPVSQASPGTSSAVLFLSKLQRTLQRSRAKENAEDAGGKKLRLEMEAETPSTEVPPDPEPGV█PLTPP█SQHQEAGAGDLCALCGEHLYVLERLCVNGHFFHRSCFRCHTCEATLWPGGYEQHPGDGHFYCLQHLPQTDHKAEGSDRGPESPELPTPSENSMPPGLSTPTASQEGAGPVPDPSQPTRRQIRLSSPERQRLSSLNLTPDPEMEPPPKPPRSCSALARHALESSFVGWGLPVQSPQALVAMEK███EEKESPFSSEEEEEDVPLDSDVEQALQTFAKTSGTMNNYPTWRRTLLRRAKEEEMKRFCKAQTIQRRLNEIEAALRELEAEGVKLELALRRQSSSPEQQKKLWVGQL

Score: 2969
Gaps: 20
MDEDEFELQPQEPNSFFDGIGADATHMDGDQIVVEIQEAVFVSNIVDSDITVHNFVPDDPDSVVIQDVVEDVVIEEDVQCSDILEEADVSENVIIPEQVLDSDVTEEVSLPHCTVPDDVLASDITSTSMSMPEHVLTSESMHVCDIGHVEHMVHDSVVEAEIITDPLTSDIVSEEVLVADCAPEAVIDASGISVDQQDNDKASCEDYLMISLDDAGKIEHDGSTGVTIDAESEMDPCKVDSTCPEVIKVYIFKADPGEDDLGGTVDIVESEPENDHGVELLDQNSSIRVPREKMVYMTVNDSQQEDEDLNVAEIADEVYMEVIVGEEDAAVAAAAAAVHEQQIDEDEMK█TFVPIAWAAAYGNNSDGIENRNGTASALLHIDESAGLGRLAKQKPKKKRRPDSRQYQTAIIIGPDGHPLTVYPCMICGKKFKSRGFLKRHMKNHPEHLAKKKYHCTDCDYTTNKKISLHNHLESHKLTSKAEKAIECDECGKHFSHAGALFTHKMVHKEKGANKMHKCKFCEYETAEQGLLNRHLLAVHSKNFPHICVECGKGFRHPSELRKHMRIHTGEKPYQCQYCEYRSADSSNLKTHIKTKHSKEMPFKCDICLLTFSDTKEVQQHTLVHQESKTHQCLHCDHKSSNSSDLKRHVISVHTKDYPHKCEMCEKGFHRPSELKKHVAVHKGKKMHQCRHCDFKIADPFVLSRHILSVHTKDLPFRCKRCRKGFRQQNELKKHMKTHSGRKVYQCEYCEYSTTDASGFKRHVISIHTKDYPHRCEYCKKGFRRPSEKNQHIMRHHK
MDEDEIESTPEEEKSFFDGIGADAVHMDSDQIVVEVQETVFLAN███SDVTVHNFVPDNPGSVIIQDVIENVLI█EDVHCSHILEETDISDNVIIPEQVLNLGTAEEVSLAQFLIP█DILTSGITSTSLTMPEHVLMSEAIHVSDVGHFEQVIHDSLVETEVITDPITAD██TSDILVADC