#**SUBS**

#**Finding a Motif in DNA**

#Combing Through the Haystack
Finding the same interval of DNA in the genomes of two different organisms (often taken from different species) is highly suggestive that the interval has the same function in both organisms.

We define a motif as such a commonly shared interval of DNA. A common task in molecular biology is to search an organism's genome for a known motif.

The situation is complicated by the fact that genomes are riddled with intervals of DNA that occur multiple times (possibly with slight modifications), called repeats. These repeats occur far more often than would be dictated by random chance, indicating that genomes are anything but random and in fact illustrate that the language of DNA must be very powerful (compare with the frequent reuse of common words in any human language).

The most common repeat in humans is the Alu repeat, which is approximately 300 bp long and recurs around a million times throughout every human genome (see Figure 1). However, Alu has not been found to serve a positive purpose, and appears in fact to be parasitic: when a new Alu repeat is inserted into a genome, it frequently causes genetic disorders.

Given two strings s and t, t is a substring of s if t is contained as a contiguous collection of symbols in s (as a result, t must be no longer than s).

The position of a symbol in a string is the total number of symbols found to its left, including itself (e.g., the positions of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5, 6, 15, 17, and 18). The symbol at position i of s is denoted by s[i].

A substring of s can be represented as s[j:k], where j and k represent the starting and ending positions of the substring in s; for example, if s = "AUGCUUCAGAAAGGUCUUACG", then s[2:5] = "UGCU".

The location of a substring s[j:k] is its beginning position j; note that t will have multiple locations in s if it occurs more than once as a substring of s (see the Sample below).
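
As a rough illustration of the 1-based, inclusive notation above, the sketch below maps s[j:k] onto Python's 0-based, end-exclusive slices. The helper names `substring` and `positions` are made up for this example and are not part of the problem statement.

```python
def substring(s, j, k):
    """Return s[j:k] in the 1-based, inclusive notation used in the text."""
    # Python slices are 0-based and exclude the end index,
    # so position j maps to index j - 1 and position k stays as the end bound.
    return s[j - 1:k]

def positions(s, symbol):
    """Return the 1-based positions of every occurrence of symbol in s."""
    return [i + 1 for i, c in enumerate(s) if c == symbol]

s = "AUGCUUCAGAAAGGUCUUACG"
print(substring(s, 2, 5))   # UGCU
print(positions(s, "U"))    # [2, 5, 6, 15, 17, 18]
```

The off-by-one adjustment is applied only to the start index, which is why the solution code adds 1 to each reported location.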

Given: Two DNA strings s and t (each of length at most 1 kbp).

Return: All locations of t as a substring of s.
Sample Dataset
GATATATGCATATACTT
ATAT
Sample Output
2 4 10
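
The sample can be checked with a short sketch built on `str.find`, shown here as one possible alternative to comparing slices at every index. The function name `find_locations` is hypothetical. The search restarts one character after each hit, so overlapping occurrences (such as locations 2 and 4 in the sample) are both reported.

```python
def find_locations(s, t):
    """Return the 1-based locations of t in s, counting overlapping matches."""
    locations = []
    start = s.find(t)
    while start != -1:
        locations.append(start + 1)      # convert 0-based index to 1-based location
        start = s.find(t, start + 1)     # advance by one so overlaps are not skipped
    return locations

print(*find_locations("GATATATGCATATACTT", "ATAT"))  # 2 4 10
```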

In [1]:
# Dataset input
s = "AACGTCAAACCAACTAAACCAACTAAAAACCAAGAAACCAATTAAAACCAAAAACCAAATGGTTAAACCAAAAACCAAAAACCAAGTGCGTCGTACGTATGAAACCAATTCAAACCAACAAAACCAAAGAGAAAACCAAGGGCTCGGAAACCAAAAACCAAAAGCACTCAAACCAAAAAACCAAAAACCAACAGAAAACCAAGAAACCAAATAAACCAATAAACCAACAGAAACCAAAAACCAAGCCTAATAAGAAAACCAAAGGCTATGCAAACCAAGGCCAGGAAACCAAGGGAAACCAACTGAAACCAAATTGAAAACCAACAAAAACCAATAAACCAACGAAACCAAATTAAACCAAAAACCAAGAAACCAAGAAACCAAAAACCAAAAACCAAGAAACCAAAAACCAAGGAAACCAATAAACCAAGAAACCAAAATAAACCAAACAAAAACCAAAATAAACCAACTGAAACCAATAAACCAAAGAGAAACCAAAAACCAAGATTAAACCAACTAAACCAAGGATGGAACAAACCAAAGACGAAACCAAAAACCAATCCAAACCAAAAAAACCAATAAACCAAGCCCCTAAACCAATAAAACCAAATAAACCAAGTGTAAACCAAACATAAACCAAAATACTGAAACCAAAAACCAACTCAAACCAATCACACGTCTAAACCAAAAAACCAAGCCAAACCAAAACTGAAACCAAAAACCAACTACTCCATAAAAACAAACCAAGGATCCGTAAACCAAAAACCAATGAAACCAAAAACCAACTATGAAACCAACAAAACCAATGAGAAACCAAGGCTTTAAACCAATAAACCAAAAACCAAGAAACCAAAAACCAAGACTAAACCAAGGTAAACCAACAAACCAAAAACCAAAAACCAATAAACCAAGAAACCAATCGTAAACCAAAAAACCAAAAACCAATGAAACCAAGAAACCAAAAACCAAGTAAACCAATTAAACCAA"
t = "AAACCAAAA"
def find_substring_locations(s, t):
    locations = []
    t_length = len(t)

    # Iterate through the string s
    for i in range(len(s) - t_length + 1):
        # Check if the substring matches
        if s[i:i + t_length] == t:
            # Store the 1-based index
            locations.append(i + 1)

    return locations

# Find and print the locations
locations = find_substring_locations(s, t)
print(" ".join(map(str, locations)))

45 65 72 148 155 170 178 231 355 378 385 400 432 453 492 547 564 634 648 682 700 712 756 772 832 847 883 890 924 932 956


#**PRTM**

## Calculating Protein Mass

###Chaining the Amino Acids
In “Translating RNA into Protein”, we examined the translation of RNA into an amino acid chain for the construction of a protein. When two amino acids link together, they form a peptide bond, which releases a molecule of water; see Figure 1. Thus, after a series of amino acids have been linked together into a polypeptide, every pair of adjacent amino acids has lost one molecule of water, meaning that a polypeptide containing n amino acids has had n−1 water molecules removed.

More generally, a residue is a molecule from which a water molecule has been removed; every amino acid in a protein is a residue except the leftmost and rightmost ones. These outermost amino acids are special in that one has an "unstarted" peptide bond, and the other has an "unfinished" peptide bond. Between them, the two molecules have a single "extra" molecule of water (see the atoms marked in blue in Figure 2). Thus, the mass of a protein is the sum of the masses of all its residues plus the mass of a single water molecule.

There are two standard ways of computing the mass of a residue by summing the masses of its individual atoms. Its monoisotopic mass is computed by using the principal (most abundant) isotope of each atom in the amino acid, whereas its average mass is computed by using the average mass of each atom in the molecule (averaged over all naturally occurring isotopes).

Many applications in proteomics rely on mass spectrometry, an analytical chemical technique used to determine the mass, elemental composition, and structure of molecules. In mass spectrometry, monoisotopic mass is used more often than average mass, and so all amino acid masses are assumed to be monoisotopic unless otherwise stated.

The standard unit used in mass spectrometry for measuring mass is the atomic mass unit, which is also called the dalton (Da) and is defined as one twelfth of the mass of a neutral atom of carbon-12. The mass of a protein is the sum of the monoisotopic masses of its amino acid residues plus the mass of a single water molecule (whose monoisotopic mass is 18.01056 Da).
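
The two ways of counting water described above can be compared in a small sketch. It assumes a hypothetical two-residue peptide "GG" and uses the glycine residue mass from the monoisotopic mass table. The function names are illustrative only. A free amino acid is taken to weigh one residue plus one water, following the definition of a residue given earlier.

```python
WATER = 18.01056             # monoisotopic mass of water in Da, as stated in the text
GLYCINE_RESIDUE = 57.02146   # monoisotopic residue mass of glycine (G) in Da

def mass_from_residues(residue_masses):
    """Sum of residue masses plus the single 'extra' water molecule."""
    return sum(residue_masses) + WATER

def mass_from_free_amino_acids(free_masses):
    """Sum of free amino acid masses minus the n - 1 waters lost to peptide bonds."""
    n = len(free_masses)
    return sum(free_masses) - (n - 1) * WATER

residues = [GLYCINE_RESIDUE, GLYCINE_RESIDUE]    # hypothetical peptide "GG"
free = [mass + WATER for mass in residues]       # each free amino acid = residue + water

print(f"{mass_from_residues(residues):.5f}")         # 132.05348
print(f"{mass_from_free_amino_acids(free):.5f}")     # 132.05348
```

Both routes agree because n free amino acids carry n waters and lose n−1 of them, leaving exactly one.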

In the following several problems on applications of mass spectrometry, we avoid the complication of having to distinguish between residues and non-residues by only considering peptides excised from the middle of the protein. This is a relatively safe assumption because in practice, peptide analysis is often performed in tandem mass spectrometry. In this special class of mass spectrometry, a protein is first divided into peptides, which are then broken into ions for mass analysis.

##Problem
In a weighted alphabet, every symbol is assigned a positive real number called a weight. A string formed from a weighted alphabet is called a weighted string, and its weight is equal to the sum of the weights of its symbols.

The standard weight assigned to each member of the 20-symbol amino acid alphabet is the monoisotopic mass of the corresponding amino acid.

Given: A protein string P of length at most 1000 aa.

Return: The total weight of P. Consult the monoisotopic mass table.

Sample Dataset
SKADYEK
Sample Output
821.392
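
The sample can be reproduced with a generic weighted-string helper. This is a minimal sketch: `string_weight` is a hypothetical name, and `MASSES` holds only the six table entries that the sample needs rather than the full 20-symbol table.

```python
# Subset of the monoisotopic mass table, limited to the symbols in the sample
MASSES = {
    "S": 87.03203, "K": 128.09496, "A": 71.03711,
    "D": 115.02694, "Y": 163.06333, "E": 129.04259,
}

def string_weight(string, weights):
    """Return the weight of a weighted string as the sum of its symbol weights."""
    # Indexing (rather than .get with a default) raises KeyError on a symbol
    # outside the alphabet instead of silently treating it as weightless.
    return sum(weights[symbol] for symbol in string)

print(f"{string_weight('SKADYEK', MASSES):.3f}")  # 821.392
```

Note that this weight is a plain sum of residue masses; no water molecule is added, which matches the problem's definition of a weighted string.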

In [2]:
# Sample input
protein_string = "TKRCKMVGKILAQPVEFFKHEICANNQQDNKPMKCYVEIFIQNMYWKINGQNHGPGSTQFKEHKNIHSEEGWHLPMYKTPFDLMWGIFYVYHRSHWIMGYRAYTKGCVPQEPVGPGAYDEMKFGVRCSNAMVCPYKSVEIRAPTARRRMQYTLYWHHQYCWHAQQDDNGMECYWNKTTREEPEAAYVRGTAYDTLHHANLWQNESYSPTKYACCGDAGPAQHRQMRQDVDQIMRRAAGFCIWGESFASVQLDTKDQQVNFSLGDPHIATTCDQASAKDIAKAKDQAHFILCYCFQHPSSSCACYTNPPEYQITPWFTHDRRSGMTSNCIHQLQLVFYSFEHACHFMNLAHWYRPTMNLFYLTNITVERQKIQEYIYGYEEDMEIIYIRANKQPPTWQNPVSIQPRKRCIREEPVHKCPQFDLMQWLLIPHKMRDGDGEQREKIMCTARHYEGTYIKAMCTLLNGDPRTDERINDHVHSDKSFHCDNYAQNGAAVTWTEKPAGRNLRPGIWWDLLQKPTGHNEYPPNYLYAKGEYKFNCTIPFAYMDPARANYKQRESGRTIEMNLMGADNWFAKGLHWKWHVTHTREHEFCYDQRPVQGDWAGHTLWQPDLECFRCMGHLIKMGSPYSYFDLIFQIKDMREFAQIPIHDPYGNKGVEATVCRGFHQYYFKAENRAAGYQGQAQKYPPVNTFHAATDSAAQQLIGKMVTQWLQSMQPWVEVSKAYAWDNIQERILHYDVLRMVTYLNKTHKQDTQVPSKDWTHISCQCHKLYRTRERKRLICSTQDCHWYYRKALFELIVRVDAMYKVLEDHGQSHVTALWGCESIKTNMTFFTHVIENGNQKNWAYPHITMLYKTCCLYPELPAPDGKGCNNRGLMKAMAKQFQPQNFADKRELAMVCTFPASGMNTIFKWLAKVVEAADCTLPMNALCNWIVKNYYEAQQTHDGGTDKMHNWHCDFTSLQYCIWH"
def protein_weight(protein):
    # Monoisotopic mass table for amino acids
    mass_table = {
        'A': 71.03711,  # Alanine
        'C': 103.00919, # Cysteine
        'D': 115.02694, # Aspartic acid
        'E': 129.04259, # Glutamic acid
        'F': 147.06841, # Phenylalanine
        'G': 57.02146,  # Glycine
        'H': 137.05891, # Histidine
        'I': 113.08406, # Isoleucine
        'K': 128.09496, # Lysine
        'L': 113.08406, # Leucine
        'M': 131.04049, # Methionine
        'N': 114.04293, # Asparagine
        'P': 97.05276,  # Proline
        'Q': 128.05858, # Glutamine
        'R': 156.10111, # Arginine
        'S': 87.03203,  # Serine
        'T': 101.04768, # Threonine
        'V': 99.06841,  # Valine
        'W': 186.07931, # Tryptophan
        'Y': 163.06333  # Tyrosine
    }

    total_weight = 0.0

    # Calculate the total weight of the protein string
    for amino_acid in protein:
        total_weight += mass_table.get(amino_acid, 0.0)

    return total_weight



# Calculate and print the total weight
total_weight = protein_weight(protein_string)
print(f"{total_weight:.3f}")

113414.043


#**SPLC**

##RNA Splicing

###Genes are Discontiguous

Figure 1. The elongation of a pre-mRNA by RNAP as it moves down the template strand of DNA.

Figure 2. RNA is identical to the coding strand except for the replacement of thymine with uracil.
In “Transcribing DNA into RNA”, we mentioned that a strand of DNA is copied into a strand of RNA during transcription, but we neglected to mention how transcription is achieved.

In the nucleus, an enzyme (i.e., a molecule that accelerates a chemical reaction) called RNA polymerase (RNAP) initiates transcription by breaking the bonds joining complementary bases of DNA. It then creates a molecule called precursor mRNA, or pre-mRNA, by using one of the two strands of DNA as a template strand: moving down the template strand, when RNAP encounters the next nucleotide, it adds the complementary base to the growing RNA strand, with the provision that uracil must be used in place of thymine; see Figure 1.

Because RNA is constructed based on complementarity, the second strand of DNA, called the coding strand, is identical to the new strand of RNA except for the replacement of thymine with uracil. See Figure 2 and recall “Transcribing DNA into RNA”.
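
The claim that the RNA matches the coding strand can be checked with a short sketch. It assumes both DNA strands are written 5'→3' (a convention the text does not spell out), so the template strand is the reverse complement of the coding strand. The function names are illustrative, and the coding strand is just the first 13 bases of the sample DNA from the problem below.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_PAIRING = str.maketrans("ACGT", "UGCA")  # RNA base added opposite each template base

def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(DNA_COMPLEMENT)[::-1]

def transcribe_template(template):
    """Build the RNA strand by base-pairing against the template strand."""
    # Pair every template base (using U in place of T), then reverse so the
    # RNA is written 5'->3' like the strands it is compared with.
    return template.translate(RNA_PAIRING)[::-1]

coding = "ATGGTCTACATAG"
template = reverse_complement(coding)
rna = transcribe_template(template)

print(rna)                              # AUGGUCUACAUAG
print(rna == coding.replace("T", "U"))  # True
```

This is why solutions can skip the template strand entirely and simply replace T with U in the coding strand.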

After RNAP has created several nucleotides of RNA, the first separated complementary DNA bases then bond back together. The overall effect is very similar to a pair of zippers traversing the DNA double helix, unzipping the two strands and then quickly zipping them back together while the strand of pre-mRNA is produced.

For that matter, it is not the case that an entire substring of DNA is transcribed into RNA and then translated into a peptide one codon at a time. In reality, a pre-mRNA is first chopped into smaller segments called introns and exons; for the purposes of protein translation, the introns are thrown out, and the exons are glued together sequentially to produce a final strand of mRNA. This cutting and pasting process is called splicing, and it is facilitated by a collection of RNA and proteins called a spliceosome. The fact that the spliceosome is made of RNA and proteins despite regulating the splicing of RNA to create proteins is just one manifestation of a molecular chicken-and-egg scenario that has yet to be fully resolved.

In terms of DNA, the exons deriving from a gene are collectively known as the gene's coding region.

Problem
After identifying the exons and introns of an RNA string, we only need to delete the introns and concatenate the exons to form a new string ready for translation.
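
The delete-and-concatenate step might be sketched as follows, using the sample dataset listed further down. The helper name `splice` is hypothetical. The sketch assumes each intron occurs exactly once, that introns do not overlap, and that removing one intron neither creates nor destroys an occurrence of another.

```python
def splice(dna, introns):
    """Delete each intron once; the remaining exons end up concatenated in order."""
    for intron in introns:
        dna = dna.replace(intron, "", 1)  # count=1: remove only the first occurrence
    return dna

s = ("ATGGTCTACATAGCTGACAAACAGCACGTAGCAATCGGTCGAATCTCGAGAGGCATATGG"
     "TCACATGATCGGTCGAGCGTGTTTCAAAGTTTGCGCCTAG")
introns = ["ATCGGTCGAA", "ATCGGTCGAGCGTGT"]

exons = splice(s, introns)
print(len(s), len(exons))   # 100 75
print(len(exons) % 3)       # 0
```

The spliced length is a multiple of three, so the result can be read codon by codon without a leftover fragment.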

Given: A DNA string s (of length at most 1 kbp) and a collection of substrings of s acting as introns. All strings are given in FASTA format.

Return: A protein string resulting from transcribing and translating the exons of s. (Note: Only one solution will exist for the dataset provided.)

Sample Dataset
>Rosalind_10
ATGGTCTACATAGCTGACAAACAGCACGTAGCAATCGGTCGAATCTCGAGAGGCATATGGTCACATGATCGGTCGAGCGTGTTTCAAAGTTTGCGCCTAG
>Rosalind_12
ATCGGTCGAA
>Rosalind_15
ATCGGTCGAGCGTGT
Sample Output
MVYIADKQHVASREAYGHMFKVCA

In [57]:
# Define the genetic code for translation
genetic_code = {
    'UUU': 'F', 'UUC': 'F', 'UUA': 'L', 'UUG': 'L',
    'UCU': 'S', 'UCC': 'S', 'UCA': 'S', 'UCG': 'S',
    'UAU': 'Y', 'UAC': 'Y', 'UAA': '*', 'UAG': '*',
    'UGU': 'C', 'UGC': 'C', 'UGA': '*', 'UGG': 'W',
    'CUU': 'L', 'CUC': 'L', 'CUA': 'L', 'CUG': 'L',
    'CCU': 'P', 'CCC': 'P', 'CCA': 'P', 'CCG': 'P',
    'CAU': 'H', 'CAC': 'H', 'CAA': 'Q', 'CAG': 'Q',
    'CGU': 'R', 'CGC': 'R', 'CGA': 'R', 'CGG': 'R',
    'AUU': 'I', 'AUC': 'I', 'AUA': 'I', 'AUG': 'M',
    'ACU': 'T', 'ACC': 'T', 'ACA': 'T', 'ACG': 'T',
    'AAU': 'N', 'AAC': 'N', 'AAA': 'K', 'AAG': 'K',
    'AGU': 'S', 'AGC': 'S', 'AGA': 'R', 'AGG': 'R',
    'GUU': 'V', 'GUC': 'V', 'GUA': 'V', 'GUG': 'V',
    'GCU': 'A', 'GCC': 'A', 'GCA': 'A', 'GCG': 'A',
    'GAU': 'D', 'GAC': 'D', 'GAA': 'E', 'GAG': 'E',
    'GGU': 'G', 'GGC': 'G', 'GGA': 'G', 'GGG': 'G',
}

def translate_rna_to_protein(rna_sequence):
    protein = []
    # Iterate over the RNA sequence in steps of 3 (codons)
    for i in range(0, len(rna_sequence), 3):
        codon = rna_sequence[i:i+3]
        if len(codon) == 3:  # Ensure we have a full codon
            # Index directly so an unknown codon raises KeyError
            # instead of being dropped silently
            amino_acid = genetic_code[codon]
            if amino_acid == '*':  # Stop codon
                break
            protein.append(amino_acid)
    return ''.join(protein)

# Sample dataset in FASTA format
data = """>Rosalind_10
ATGGTCTACATAGCTGACAAACAGCACGTAGCAATCGGTCGAATCTCGAGAGGCATATGGTCACATGATCGGTCGAGCGTGTTTCAAAGTTTGCGCCTAG
>Rosalind_12
ATCGGTCGAA
>Rosalind_15
ATCGGTCGAGCGTGT"""

# Parse the dataset; a FASTA record's sequence may span several lines
records = []
for line in data.strip().split('\n'):
    line = line.strip()
    if line.startswith('>'):
        records.append('')   # Start a new record at each header
    elif records:
        records[-1] += line  # Join wrapped sequence lines

main_dna = records[0]   # First record is the main DNA string
introns = records[1:]   # Remaining records are the introns

# Remove introns from the main DNA string to get exons
for intron in introns:
    main_dna = main_dna.replace(intron, '')

# Transcribe DNA to RNA (replace T with U)
rna_transcribed = main_dna.replace('T', 'U')

# Translate RNA to protein
final_protein_sequence = translate_rna_to_protein(rna_transcribed)

# Print the final protein sequence
print(final_protein_sequence)

MVYIADKQHVASREAYGHMFKVCA


#**REVP**

##Locating Restriction Sites



###The Billion-Year War


The war between viruses and bacteria has been waged for over a billion years. Viruses called bacteriophages (or simply phages) require a bacterial host to propagate, and so they must somehow infiltrate the bacterium; such deception can only be achieved if the phage understands the genetic framework underlying the bacterium's cellular functions. The phage's goal is to insert DNA that will be replicated within the bacterium and lead to the reproduction of as many copies of the phage as possible, which sometimes also involves the bacterium's demise.

To defend itself, the bacterium must either obfuscate its cellular functions so that the phage cannot infiltrate it, or better yet, go on the counterattack by calling in the air force. Specifically, the bacterium employs aerial scouts called restriction enzymes, which operate by cutting through viral DNA to cripple the phage. But what kind of DNA are restriction enzymes looking for?

The restriction enzyme is a homodimer, which means that it is composed of two identical substructures. Each of these structures separates from the restriction enzyme in order to bind to and cut one strand of the phage DNA molecule; both substructures are pre-programmed with the same target string containing 4 to 12 nucleotides to search for within the phage DNA (see Figure 1.). The chance that both strands of phage DNA will be cut (thus crippling the phage) is greater if the target is located on both strands of phage DNA, as close to each other as possible. By extension, the best chance of disarming the phage occurs when the two target copies appear directly across from each other along the phage DNA, a phenomenon that occurs precisely when the target is equal to its own reverse complement. Eons of evolution have made sure that most restriction enzyme targets now have this form.




Problem

Figure 2. Palindromic recognition site
A DNA string is a reverse palindrome if it is equal to its reverse complement. For instance, GCATGC is a reverse palindrome because its reverse complement is GCATGC. See Figure 2.
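
A reverse palindrome is not the same thing as an ordinary palindrome, and the sketch below illustrates the difference on the sample string. The helper names are hypothetical. The check on odd lengths relies on the observation that the middle base of an odd-length string would have to be its own complement, which no DNA base is.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def is_reverse_palindrome(dna):
    """True when the string equals its own reverse complement."""
    return dna == reverse_complement(dna)

print(is_reverse_palindrome("GCATGC"))  # True
print(is_reverse_palindrome("TATAT"))   # False: reads the same backwards, but its reverse complement is ATATA

# No odd-length substring of the sample can be a reverse palindrome.
sample = "TCAATGCATGCGGGTCTATATGCAT"
odd_hits = [
    (start + 1, length)
    for length in (5, 7, 9, 11)
    for start in range(len(sample) - length + 1)
    if is_reverse_palindrome(sample[start:start + length])
]
print(odd_hits)  # []
```

In practice this means a search over lengths 4 to 12 only ever finds hits at even lengths.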

Given: A DNA string of length at most 1 kbp in FASTA format.

Return: The position and length of every reverse palindrome in the string having length between 4 and 12. You may return these pairs in any order.

Sample Dataset
>Rosalind_24
TCAATGCATGCGGGTCTATATGCAT
Sample Output
4 6
5 4
6 6
7 4
17 4
18 4
20 6
21 4

In [75]:
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
    return ''.join(complement[base] for base in reversed(seq))

def is_reverse_palindrome(seq):
    """Check if a DNA string equals its own reverse complement."""
    return seq == reverse_complement(seq)

def find_reverse_palindromes(sequence):
    """Find all reverse palindromes in the sequence with lengths between 4 and 12."""
    palindromes = []
    seq_length = len(sequence)

    # Check for reverse palindromes of length between 4 and 12
    for length in range(4, 13):  # lengths from 4 to 12
        for start in range(seq_length - length + 1):
            substring = sequence[start:start + length]
            if is_reverse_palindrome(substring):
                palindromes.append((start + 1, length))  # Store position (1-based) and length

    return palindromes

# Sample dataset
header = ">Rosalind_24"
sequence = "TCAATGCATGCGGGTCTATATGCAT"

# Find and print reverse palindromes
reverse_palindromes = find_reverse_palindromes(sequence)
for position, length in reverse_palindromes:
    print(position, length)

5 4
7 4
17 4
18 4
21 4
4 6
6 6
20 6


#**PROT**

##Translating RNA into Protein

####The Genetic Code
Problem
The 20 commonly occurring amino acids are abbreviated by using 20 letters from the English alphabet (all letters except for B, J, O, U, X, and Z). Protein strings are constructed from these 20 symbols. Henceforth, the term genetic string will incorporate protein strings along with DNA strings and RNA strings.

The RNA codon table dictates the details regarding the encoding of specific codons into the amino acid alphabet.

Given: An RNA string s corresponding to a strand of mRNA (of length at most 10 kbp).

Return: The protein string encoded by s.

Sample Dataset
AUGGCCAUGGCGCCCAGAACUGAGAUCAAUAGUACCCGUAUUAACGGGUGA
Sample Output
MAMAPRTEINSTRING

In [None]:
# dataset
rna_string = ""

from typing import Dict

def rna_to_protein(rna: str) -> str:
    codon_table: Dict[str, str] = {
        "UUU": "F", "UUC": "F", "UUA": "L", "UUG": "L",
        "UCU": "S", "UCC": "S", "UCA": "S", "UCG": "S",
        "UAU": "Y", "UAC": "Y", "UAA": "Stop", "UAG": "Stop",
        "UGU": "C", "UGC": "C", "UGA": "Stop", "UGG": "W",
        "CUU": "L", "CUC": "L", "CUA": "L", "CUG": "L",
        "CCU": "P", "CCC": "P", "CCA": "P", "CCG": "P",
        "CAU": "H", "CAC": "H", "CAA": "Q", "CAG": "Q",
        "CGU": "R", "CGC": "R", "CGA": "R", "CGG": "R",
        "AUU": "I", "AUC": "I", "AUA": "I", "AUG": "M",
        "ACU": "T", "ACC": "T", "ACA": "T", "ACG": "T",
        "AAU": "N", "AAC": "N", "AAA": "K", "AAG": "K",
        "AGU": "S", "AGC": "S", "AGA": "R", "AGG": "R",
        "GUU": "V", "GUC": "V", "GUA": "V", "GUG": "V",
        "GCU": "A", "GCC": "A", "GCA": "A", "GCG": "A",
        "GAU": "D", "GAC": "D", "GAA": "E", "GAG": "E",
        "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G"
    }

    protein = ""
    for i in range(0, len(rna) - 2, 3):
        codon = rna[i:i+3]
        amino_acid = codon_table.get(codon, "")
        if amino_acid == "Stop":
            break
        protein += amino_acid

    return protein

# To compute and print the result
print(rna_to_protein(rna_string))

MPNRTVPSQSLVLHRFGKYRRSLLSTVSGDCIPAYTHASFGWRLLGEPRDLFIWRSPTVWFRSNTKSYAISITVRLRDAEPGDLPKAGAITGLDQGSEFAQNKTMPILISHQFASSLGHWSTRILHGAPDFSFECGLLATSVRGPAHDRRPWSRRFPISLHDQDFSQQDASACTCSPIRPQPLSRCCSPHPKSEGTSLSKLKLGQPRTDYRLHAIASTIIFPLRCASCHRGVYFSVPKLVASTSQISHNRPVYARFVPLEPNHLVPHHNALRNSLSEAVVMSVSRSQGALRDKSSVSESRNKGCRYIHQSTQGAVRDVRESHKLGKKVGSTDFTHIPAQRAIRRRAGGDYEQSPVFSLCPFSNDTEQLLTRDYLVQICKHLTERQLRTPALRSTISFGRGFWKLCTIVDTSKVTVQQTLSRKARLAVAPTCEHELTHITRWTSVLRRLLACQNSIWCIYKRGWGDFTRGPQTNPENVAPVLATSTRPTPVHCALRPGLGKSLKWRLNQHMSYAPRFGTMTRAQSANFPIPSATWGLTHYRTRCRRTCYVAHSSICGFGRCASFNRVFRLYYCRNLKLRATVRSQSSCARVVLRVNAHPNGVVRSLSGDRGTFYFLVLLTNGDMKDPWIVEQRNSGNVKIKAQGAVSLVRWGELGVGDGETTALGTRTVYTSMPHYFPLQFYWRDSTQPVSAIIGPRILLLIAPNLIPRPLSNIAGSRTNARNDYIKSLSQWKLTRRRTIERIRRKVAVTTFALPSSLSSRLQPALMRVLGSPLYSVASGLCVPVPPSNPRNIEKRCSQRRNHSPLEPLASLIEFSLSRVGKIVCLKVELASCSARKWETAFSQYSRYYTKRAWTMDTETIHYSQEPSRRGIGALLRAPSRVPLLRRVVEKPNQWRTARPSIDSSRDLRYVKPIRLVWPLAYSPSPVGKAFHFRFLRISHDTFLTTDLMVQMTFSCHAQCMTFPMADWQVVKKIGRGTLKLVTYTVNSTWRRSLLHYHEQA

#**LCSM**

##Finding a Shared Motif
In “Finding a Motif in DNA”, we searched a given genetic string for a motif; however, this problem assumed that we know the motif in advance. In practice, biologists often do not know exactly what they are looking for. Rather, they must hunt through several different genomes at the same time to identify regions of similarity that may indicate genes shared by different organisms or species.

The simplest such region of similarity is a motif occurring without mutation in every one of a collection of genetic strings taken from a database; such a motif corresponds to a substring shared by all the strings. We want to search for long shared substrings, as a longer motif will likely indicate a greater shared function.

###Problem
A common substring of a collection of strings is a substring of every member of the collection. We say that a common substring is a longest common substring if there does not exist a longer common substring. For example, "CG" is a common substring of "ACGTACGT" and "AACCGTATA", but it is not as long as possible; in this case, "CGTA" is a longest common substring of "ACGTACGT" and "AACCGTATA".

Note that the longest common substring is not necessarily unique; for a simple example, "AA" and "CC" are both longest common substrings of "AACC" and "CCAA".
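
Both examples above can be reproduced with a brute-force sketch that tries candidate lengths from longest to shortest. This is only an illustration under the stated size limits, not necessarily the approach used in the solution for this problem, and the function name `longest_common_substrings` is hypothetical. It returns every longest common substring, which makes the non-uniqueness visible.

```python
def longest_common_substrings(strings):
    """Return all longest common substrings of a collection, in sorted order."""
    shortest = min(strings, key=len)  # a common substring cannot be longer than this
    for length in range(len(shortest), 0, -1):
        # Every distinct substring of the shortest string with the current length
        candidates = {
            shortest[i:i + length]
            for i in range(len(shortest) - length + 1)
        }
        shared = sorted(c for c in candidates if all(c in s for s in strings))
        if shared:
            return shared  # first non-empty result has the maximum length
    return []

print(longest_common_substrings(["ACGTACGT", "AACCGTATA"]))  # ['CGTA']
print(longest_common_substrings(["AACC", "CCAA"]))           # ['AA', 'CC']
```

Taking candidates only from the shortest string keeps the search space small; any element of the returned list is an acceptable answer.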

Given: A collection of k (k ≤ 100) DNA strings of length at most 1 kbp each in FASTA format.

Return: A longest common substring of the collection. (If multiple solutions exist, you may return any single solution.)

Sample Dataset
>Rosalind_1
GATTACA
>Rosalind_2
TAGACCA
>Rosalind_3
ATACA
Sample Output
AC

In [45]:
# Sample input in FASTA format
fasta_input = """>Rosalind_2245
GCAAAAACATCCAATATTCCCGGTAACTATGTGCTGTGAAATGCGGAGATCCCCTAGCGT
ATACATGGGCAAGCTTAGTATACCGATCGACCTTTCGTTCGACCAGGGCGGTGACATGCT
TCGATTAAGAGTAATACGATCTCTACCGCCCATGCAGCCCGTTAGCCACGGTTTAACATC
TATCCTATTGCTCATTACCATGGACAGCAACTAAGGGAACTAGAAATTGAACCTAGATCG
GTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGA
AGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCC
ATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATG
TTGAACCCATGAGTGATCATCGAGTCATTCATTCTACGAAACGGAGCGCCCATATACGTA
GGCGCTCACAAAAGGATTGCGCTGAGTGAAGCGTCGCTCACCTACTGCTTCATTACGATG
AGCTAAGTCTTGCACACTTCTGATCTTCAATTACGGGGGGATGATCATCCTTTGGCCCAC
GACCTAATTTTAAATTCTTAGCATTAATGCTCGCGTAACACCGAGTGTTCCTGTCGGGGG
TCCGGACCAACCATAGCCGATGACGATCCACTCGGTTTGACGCACTGTGGCACCAGCGGA
CCTGTGAAGATTATCATCCGCATTTCAGCTCAACAGACCTCACGGCCCATAATCCTACTG
CCTAACGAGGTGGTTGACCGAGACTTCTGTGCTTTCGAAGAAATGGACAACTGCCTGCCG
TGCGGACTTGCTAAAGCTCTTGACTTTGACAACACTGTGGACGCACTCCCCGTGGTATGA
ACTTTGAGCTTAGCAGAGTCATTATTGGTGCTGCAATGCAGAGAAGTGTCAATCGAACGT
AGGGGCCTGGACCCCCAATCCCCATCCACATAACAGCTGC
>Rosalind_7521
AACTTAGGCCTCCGTTGCGTGCCATTTACTGCGTGTACAGATGCCTGCCGGCGCCAATAT
ACAGAATAGCAACCGGGAACCTGATCTAACTTCCAGGCTGGTTACCTATTCGCCATATTA
TTAACTCATGGGTGAACGGACTCCCTTGCGTAACAGACGGGTCCCGTTGACATGCGCAGC
AGTTGCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATAT
GGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAA
CATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAG
ATTCCTCTAACTGGTGGGCTTATTTAATATGAAGAATTTGTGGGATGTGGCGCTAGATTA
ACCCCGAGGAACCAGCAACCGCGAGCATAGTATATGCGGGTGACAAAATGCTGTTTTACA
CCCAAGACCCATAAAGCGGCTACGAGTGCCTCCTCTGCCTGGCTTATGTGAGAACTACCA
GACGGACGGACCCTTGCGAATAGGGACGCCAAAGTTAATAAATCTCCTATACACGGTCGC
CACACTAAGTTTATCGCGCTCCTTCTTGCACCCCCCATCTCGGATTGGCCGAGCCGGCAG
CGGCCTCGCTGCCGAAATACTACTCAATGGACGGCAGGATGGTGATATCGTGGCGTGTCG
AAATGTGTACGGTATACGCCTACTGTGCCCATACGTAAAAGTCCTGAGTCCGGTTCTAGA
AAGTTGGTTCTCTAGTTTCATTTATATAGAGTCCGCCGGCTGGGCGGTACCTGCATGGGT
TGCCGCTGACACTGACATGATTTACTGTAGGTAGCAGCGCTTCTTCTATAAAGAGACGGG
TTGACTTATTTTGCGGTATATATGGCGGTTCAATAATTACGACTTAGAGTGACACATAAT
TCCCTCAGAAAGCCCAACGACCCATTTATACTGTATTGGA
>Rosalind_3727
GCCTCCAGTCGGCAACTAGCCTTCTTCTATCACGCCCGTTGCGTCCTTTGGCCCATACGT
CACTAGCTAACGGAATAGCAAATGCTACCTTTCAACTTATTAGGTCCTTGGCACATCGAA
CCCGATTCTTCCATCGGTAGATGCACCTCACAGACAACGCCCGCGATCCGCGTCAAAGTG
AGTAACTATGGAGGCCACATAGACCATCGAACTCCGCTGGCAGCCAGCGGACTTTCAGTG
TTAGCAACGTTAAAAACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCG
CCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTA
GTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTAT
CGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATACGCGAAAAGCTGCT
ATCTGTCGGCTTTAAATGCTGGGTGCTACTTTCCACGATCTGTCTCGTGCTGAACTAGAT
GTGAGGACGTGTGATTGTATTACATGTCTGGCCCCATCCAGGATCTAACAGTTAGTCTTC
AAGCAGAACGCCTACTTCCTTTTCTGCATGCTTGACCGTCACGGGAAGCGGCGACCGGTA
TCGTCCTAAAACTATGGCGCAGTGCCCGGAACTGCCATCCGAAGCTGATTTCCAAGGGCG
TTCAACACTCCTTGGAGGTTGTCGTGTCGGGTAGTGCCCAATCAACCAGCATACACGAGC
ATTCGAAGATGTGCTTTTCTCACTCACGCGAGTGGAATTGATATCGTATGGCTAGTCCTG
TAACTCGAACATTGAGCAGTGAGATTTGCTATGCAACATCCTAGTACCCTAACTAGTACT
GACACACAAGGCCCTTAAAGTCGTTTTAACCTAGGGCTGCCCATAATCGGGAAGGGTTGG
TATCCATGTGGTACTATAGCCGTCTTTCAGAAAGTACGTG
>Rosalind_7444
GAACGATTCGGTCAGATTCAAGGTCGGCATGGCGTTATGTGATGCATGTCCTATCTGTGT
GAGGCTTATTTTTCGGGTCCTACCTGAAGTGAACCTCGGCACAGAGTGCCAAGTAATGCG
TAATATATACTTAAGGCCTATAAGAGTTTGCGTCTAAGGGAACTAGAAATTGAACCTAGA
TCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCA
CGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTC
GCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGA
ACAAGATCCAGTCAGGACGATTGATACGGACTGGCCACGCGTTCGCCGTGTACAGTGACC
AGGGATAAACCTCGCTCGCACTAAGGGGTGACTGTAAACTGGTCGAGACGTGTGAGGGTC
ACACGTGGCACATAAGGTACACCACGTGGTTACTATGTTTGACTAAATTATAATCTTTCA
TTTTCGGTATAGCCTGCTCACCTTGTCTAGACGCGTTCGTTGCATTCAGGTGGTACCAGT
GTGGGAGGGGACCGAAAGCCACCATCCTATTTAGCTTCCGTTAATCTTTCTAAATTTGTT
CCGGATTTGCTTGTGTATAAAAGGTGTGGTTAGTCAATGGGGCGATGATTGCTCTCATGG
ACACTAGGCATCGCACCAATGGGAGGGGGAGGGGTCCGCCCTTATTGCAGGTCATACAGG
CACTGCTGTTCTGGTCCACTTGTGCCCACTATACGGAGTACGGTAGGCGTACTAAGCCCA
TGGCGTTCAACCGTGTGTGGACTATGTAAGCTACAAGGCCATCTTGAAATTCTAATCACC
GCATCATAACATCTACCATACATCGTAATAGACCCTACCTGTGCGCAGCAGGAATCGGGA
CGCCCGCAGGCATCCCGGTTTTTTACGCGTAAGAGCGCTT
>Rosalind_3790
CTAGTCTTTCGATGTCTAATCTAGCTGGTCGCCACGACTATAGCTTCACGTGGTTACAAC
TAACAGGGTCTCTGGTCCTTCTGGTGATGCCGCAAAAGCTTGCCCCATGCAGCCGCGGGT
GTGAAGATATTGTACAGTTCCTGCTTGATGACGACAGGCAGCTGGCGCTCACGGTCTAGA
CCGTTAAGTAGACCAACTTGCCTGACCTGTTCATTACTAATCCCCGAAAGGCATTACCCT
TGTTCAACGACAGCTCTCTGATCTCCCGAGCGCTGGCCTGCAGCCGAACCTATTAAGCAG
CCTATGACGAACTCGTGAAGGTGGGCCGGATATACGAATCCTAATCCGCTGGCATGTACA
ACGCTAAGAAAGGCCGCACCCAAGATCTCAGGTAGCTATTGTACGTTATAGGGATACCAT
GTTGATGCGTTTGATTGGCAAGATGATATGATTATCATTCGATCGACATAAATGTGTCGT
TCTGCAGATGGCTCGCTACTAAAGAAGACCACGGGGCCCAAGCGCAGCAGCTCGAAGAAT
TCTGGGAAGGCAAGCGTACTAGGGCACCTCGTACCTACGATCTAAGGGAACTAGAAATTG
AACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATA
ATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACC
GTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATT
TAATATGAACGCTTAAAGTAACGTACATTGGTGCCGTGCCACGATGTCACCCGAAATAAC
ATGTACAACCGGCGTTCACACGCGACTTTATTCCATGGACAACTGGGTGAGTAATTTCCA
ACCCCCAAGCTCTATGTGCGCCCCGGGAGGAGAACGTCCACAAACGTTCGCGATACTTCT
GATTTCATGTATAGGACGGATAACCTCAGGTCTTAAGAAA
>Rosalind_7129
CAAATCCATGCGGGTACAAACTAGCATCGTACGTCAAGTAGGCCAGCCTACGAGCGACTA
CATAGGGGTACACTTTGAGACAAGCTCTCATGCCGGAATGAGACTAATGACCGCACACAG
CAGACGTCCTAACTGTTGAACGTGAGAGCCCGGTATAACTCCATACGATGACCACGACCC
GCAGAACTAACGCAAGAATCCTCTGCACAATGCCTAACCCCTATACTAATGGGATGCGAA
CCGGACGCACTATCATAGGCATACGGCCGGCAGCTAACCACGATCGATCGGCCTGGAGGC
GACCAAATATTAATCGCTTTGATTAGGCAGTTTTGCTAGCAGACGTCTCTCGCCGTATGG
GCACACTATATCGGGTACCTTATATGCAGTGGTTCTTCAGCTGTAGTATACCCCAAGTCT
GAAAAACATCCTAACCTCGAGGATTGAGCTGCCGCCGTATCCCGTATTGCTTCTACCCCT
AATCCAGAAGACAGGCAGGGCTATCTCTATTTCTTCCCCGTCTAAGGGAACTAGAAATTG
AACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATA
ATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACC
GTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATT
TAATATGAATCCATTCAAACCATAAGGTCCATGTAAGGTCCTTGAGCGCCGAAAGAGCAA
CACGCTGACTCCAGGGCAGTTTTACGAATGCGGTCGGCTATAATGTGCCAGTCCCAGGAG
GGTTTTAAAGATGACACATCTGTCTGTTTGGGTCAGGATTATCACTCTAGTATGCTTATA
TGAGGTTAACGCATTTCTGTGGTTAACTCCGTAAGACGGAGCAGCCCTTGCACGATCCCA
CGGCGATGTATTAAGATCAGCTGTATACTGTTCATGTACC
>Rosalind_6089
ACATGAACTCACTTTGCGCATCGTCCCAAGGACGAGCAGGGTCGAACGTTGACGAAAGCG
CAAACCCAGTCGCTCGGACTAAAGTTTTCCTTAACGCATATCTTTGCCGGAGGACCCTTA
CTTGCACTTGATCGATAGAACTACCCTCTGATAGTGCATAGGACTGCGCCGCTACTCGGC
AGGTCCTTGCGAGGATATGGCAAGGCCTATGATTGTTAATGGATCAGCATTTGGGAGGTC
CAAAGTGTATCAGCCTCACTAGGGACGACGAATCCAACAGCTGTCGGGGACCGCTTAGAG
ATCGGGACGATGTATCTCGTCTCTGCCTGTGCCTAGATCCGCTCCTAGAATATGCTAAGG
GAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAA
CAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTG
AACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAAC
TGGTGGGCTTATTTAATATGAAGATAATAAGTGCCCGCGTATGCCGAACTCTCTCATAGC
TCCATAACTCCACGGAGACGCGCTAGCAATGACAATCTTCTGATGGAGGTAAAATTCCTT
CCGTGAAAGCAATGGTTGCATCACAACTACAATTTATTATTGAGCGTCCCCAGGGACGAA
CGTACCTGCACGGAGACGCTTGGGCCCTTCCGTGGCTGAGCGCTCATATGAGGATGGTAC
TTTTACATCTTGATATAGGTCAAAGTAAGTACCGACGACAACCATCTTCCAATCGAGCCG
TAAACTAGGGAGGCCTTGTAGACATACAAATAACTGTAATAACGTCTCCCGAGGAGTGTA
GCCACGATGTCTGGTATGTTGTCGCTTTAACGGCAACTAGCCGACCCTTGTGGGCATGCG
CGTGCATGTCCTAAGGCGGATTAAAAGCGTCTGCTACTTG
>Rosalind_5402
GAGGCCACAACACAGCGCTGGGTATCAAAAGGGTTGACAGCAGCTGATACGTCTAACCAT
CACTGACCGCGTTAGTTAAGGTGCGGCTCAGCGCGAAAGTCAGATCGCATGTAGTGCTAT
CTACAGTCTGAGGTCCAGATGGCTGAGCTCAGGTGCTGGAGTGCAAAGGGCGCTCAATAC
TGTCCCTGACCGGTCCGGGTGTATAACCTATTGTTTACGAGATAGAGTCCTGCACATCGC
TGTTACGATAGTCTGTGACGTATGGCTCTAATCGCTAAGGGAACTAGAAATTGAACCTAG
ATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTC
ACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTT
CGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATG
AACAACTCCGCTGAAGCACGGGTACCGCAGGCCTATAAGGCTTTGCAGGTAGACCTCCTA
GTCCTTTATGACAGGGGAGGGTAACTCCGCGCCCGGGTTTAGAGAGTGTGAGCAACAACG
TCACTTTATGTGCGTTTACGTAACGGCTCTAATTCTGGTCGCAATGTAGGAGTTGTCTCG
ATAATCGACAACCTAATAGTCAATTGCGGCCCACATTGTACTCGAACCCCCTAGGACCTT
AGGCTCACTTGATTACCTACCCAGACCACGCGAGCCCTGGCGATAGAAATCCCTCTGAAT
CCGAGTACACGTATCACTCTCCTCTATCCGAACCCATTTAGGTCTTTCCTGTAGCGGCCG
CGGCCTTATCGGCTGATAGGCAACTAACGATTTGAATAGTTATAGTGAATTTCATGGTTG
GTTATTCATTTGGAGCGCTGCCACAGCATAATTTCAGAGACGATTTCCCATCGGTAGGTT
AGATCACCCAGTGTAATTAAATACCTATTTTATCCGTGGC
>Rosalind_3964
CACGCGAACGCAAGACACATGAATCCAAGCTTAAAACAACTACGTTCAGAATTTAGTCTC
CTTTAGAATGACCTACATCATGGTGATGACATCGTTAAATTCAGGTTCCGGGCAACGCGC
TTTTGATCTACCCGCTCTACACGCATGGAAGGGACTGCCATCAAGGCCGTCCCAGGCTTA
TGAGAAAACTAGTTAGTCAAATACCACCGGCTCCGGAGGACGCATTGTCCTTGATATTCA
AGATGCCTAGCGAAACCAGAATATGCCAGGCGCTGCTTTGTATGTAATCGTCAGAATCAG
CCAAGCGATAATGCCACTTACAGGCTCGCATCCTTCCGCAATCATCTATGTTACAGTTCA
ATAGTGACCCTGTTTCGTGCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGA
GCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATG
CTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACT
TATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAGCCTGCCCCGGAA
CCAGGTCTACTCACCTGGTTCCTTGTATTTATTCCCTCAATTCAGCAAAGCCGAAGACCA
CCAGATTTTCTGGGCAATAGGTGAATTATTTTCTTAGTATTCGCGCTTAGTTTTGTTTCT
CATAAGGCCTGGCGGTCTAAGAAGGTTGTGTGCACACAGGATGGGCTACCCCGATCGGAT
AAGTCAGGTTCGCGACGAGTGGTGATATGAAAATATTGCGCCTGGTGCCCGATCAGTCCG
GAGCCGCCCGAGATCCTTGAGCATTCGCTAAACTCAAAAGGGTTTCCTTTGGGTGCACTT
GAAAACGCACTAGGTTAGGCAGTTGTGGCGGAACCAGGTATAGCTTTCTTACCAGAGGTG
GGAGAGTCTGTGAAAACCTTACCAAGCGTTACGATGTAGA
>Rosalind_9657
TCCAGCACAAGACACCCTAAATCGAATGCTCGTGCCCGGCGGCTGGTCCGAACTATCCGT
AGTTCCTCCGGAGACAGAGCATGTCGTGATCTCACAATGTTGTCCCCTTCTCCCGCTTGG
TACGGTACTCAGTCTGTGGCGGAGTCCAAGACTGACCACGACGCGTGGGCTGTCACGATC
GCCCTCGGTTTTCGAAACTAGTCGCACTTCAGAACTGGGGACCAAGGAGAACGGCCCCCC
GCCGTTCGCCGTTTAATTACTGAAGTAACAAAACCACCATTTAACAGCCTCGACGAGTCT
CGGGCGAAGAAACTGTCTTGGAAGCAGTATCTCTGTAGCCTGTCGGGACAGAGGTGCGAC
GGAATGAGGTTGTTAGCTTGGTGCCGTCATCTCAGATATGAGCTAAGTGCGTTTGGCAGC
CGCGTGTAGCTTAAGGGCGCCATTGCCAAGCTGGTTGGAACAGAACGTTCCGCAATTTAC
CTTTCCGGACTCTGATCGGATCAATGCAATTTTTTGAGAGCAAGATAGACCCGTGTAATA
TAGTTCTGTACGTTAGGGTGCCCGTTCTAAGGGAACTAGAAATTGAACCTAGATCGGTAT
GTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGA
CTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTC
TAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATGGTGG
TCTCTACTCGCGTTATGCTCGCGTGAACCGAGCCGTCGGCCAAGAGAGAACGTCGGACAA
AGAACCTATACTAACCCTAGTATGTTAACCGGCCTAGCTGATGGGGTATACAGGCTATTT
TTCTGTATAAATAGGAATACAGAAGGGCTAGCGTAAATGACATCTAGGAGTACAGACAGA
GATCTGGAAGATCGGGGACTTTTCTTCACCAATGCAAGTT
>Rosalind_7770
ACGCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGG
CTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACA
TATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGAT
TCCTCTAACTGGTGGGCTTATTTAATATGAAGAACCAGACCGGACTCTATTAAACACGAC
TTTCGACCCCTCCCGCCTGCGCCCGAGGGCCATGACGCCCAACTAAAAGACATGTAGTGC
CCAAGGTGTAGATTGCGTGTTAGACGGCTACGACCGAGCGAAAAGATACAACCAGTTACC
ACGCTGCGCCTGCTTCACCAAATCGGCGTTCAAGCCAACGCACTTTTAAAGTTTCGGAAG
CGACGGTCTCGCTCAGAGTACACTAGCAAACTTATTCCACGGTCTGTCGCTGTGTTTTAC
TTCGATACGTGTGCCGCCAGCAGGCTACTTACAGGCAAGCCGCCACGTTTATTCTAAGTA
GGAATCAATATCCCATACGGGAGAGAGTGCCACCTTGCGCGAGCGAGACCTGGCAAATTT
CTGGGAAGCTTTGTAAGCACTTTTGGCCTAACCGGGGGATACTAGCGATAGATCCTGCTG
ATGAACCCAAAAGCACGCTCGAATTATGGTGTGTCACAGGAACGCCATTCCTCCATCGCG
TTCATTACCCCCTGGGTGGGACATCTCAAAATGGAGCGGGGCCGGTACTGAACCGAAAGC
GCTGATAGAGGTAGGTAACTGAGTTCACCTTGTACGGACCCACAACAGTATCTGGCTACT
TAGGGTATAAATTGCAACCCTCCATCCAACAGTCAGCGTCTGGGGAGCTGCAGTGTATCC
GGTGATGCTGCCGCTCCAGACCTAAGAATGTGGTCCTACTTACCGATGTTCCCATGGGCG
CTCGATTCAAAGCCTGTTCCAGTAATGAGACAATTTCGTG
>Rosalind_3962
TACGGTCAAAGGGGATCGGCGACGTTCTTAACTGTTCCTCGTCGGCTGGAATACGCGGGG
AGCGCAGAGAAGAGCCCTTCCGATGCGCTTGCACTTGGGTAGGCCCCAGAGGATCACCGT
TTGGACCGCGCCTGCAACGAAACAAATAAGGGTGCTTTCGGCGGGACTTACCCTCAGTGC
GGTAAGCGGACTTCAGGGAGGTGTCAGAGTGACAGTGTTCCACGTCCCATTAATCCACTC
ACGCGCAACGCAATGTCGCAAGCTATGGCGAATTATATCGTTCCGCCTCAAGGGGCTTTA
CCGTGAGCAAGCGGGCGGGATAAACGTAAAGATGCAAAAATTCCGCCCCTTGCCGCTTAG
TAGTCAGGAGAATACCCGTATAATGACAACTCCCCCACACAACAAGCTCTTCGTTCTTAT
GGAGGCAAGGGTGCCGATTATCAGGCCACAAGCCCTATCAGACGCGCCGATAAGAGATCA
TATATTTATTGCGAGCCTTATGTTACCATCATCTCACTCCCAGGCATCGAGTCAGAGCCG
CTTCTAACAACAGCAGTCGTACACTCCTTTATGCACAACGGTGTAATCCCATAGAACTAC
ATGAAATGGTATAAGACACCGTGAAACAAGCACAAACATAAATTAGTCGAGATATCGCCC
TTAAATGACCGTTCTCGTGCGTTGGTGGAACTCAGGAAAGTCTGTTTAGCTATCACGATC
ATTTAGATAAATACCAGAGGCAGCTAACTCTACCTAAGGGAACTAGAAATTGAACCTAGA
TCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCA
CGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTC
GCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGA
AGAGCAGAACCTGTGTGTTGGCCGGGGAAAATCCTTGAGA
>Rosalind_9878
AAAATGCTAATTGACCACCTCGTTTCACCCTATGAATTAATGTCTAATCCGGGGTGACTT
CTGACGTCTTAGATAGTTTGATAAAGTCATTGCACGTAAATAGTTAACGAATTAGCTTCG
TAATGCCTCATGTGTTTCGCTACCCGCCGCTTTGGTTCATAATGAGTGAAAGTATCTCTG
TGCCACTTGCGGGAGGATGATTTGACACGCCTCTATGATGACTACACTCTTCGGGATGTA
TTCAACTGCTTTGTCAATGACCCAGCTTATAACATTAGGTCGCTTACTCTGGGGCTTGCG
CCATCCAAGATAGGATGGGGCGCTAGATGCCAGAATGCCTGGTTGGAGATAGCTCTCCAT
GTTTGTGAATGGCGACGGCACTTTTTCGCGGTGTCCTCGTCTCTACCGTGAATATCAATT
ACCGCTATCGATACAAACGCAATTAGGACAGCTCTCGGTCGCTCGCACATTTTTGGATGA
CAGAAAATCCAGAGCACGGAGAGCTATAGATGTCTAACACTAAGGGAACTAGAAATTGAA
CCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAAT
CCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGT
ATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTA
ATATGAAGATCACGGGGCTTTGCTCGTATCCTTCACTTTCAGAGTAGGGGAAAAGCGCAA
CTAAAAATTCGCCTAGGCGGTCTTAGGGCTGACCTGCTAATCGAGCGATTGCCTGCCGTG
TGGGTTTCAGATGACGCAACGTTTTATCTCGATGTAGGAGGCAGACGCATCCGTCCTGTA
CTTTTAGAGAAAAGTTACCTTACTCGCCACGCTTCTGCGGTCCGGACAGGCCGCAGAACG
ACGCCACTAGGGCCCGGGTGCCGAGAGACCGAACTACTCA
>Rosalind_0398
CGCTAAGATAAAATCGTCATACTAAGGATTACATATTTCAAGCGCCGGATCGACCTACAA
TCCAGTCTGCCGTCAGGAGTGATCTAAATGCGCCCACCGGCGCAAAGGTCTAAGGGAACT
AGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGG
TGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTG
GGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTG
GGCTTATTTAATATGAATTCGTAAGATCTTGTCTATTACACACGTAGAGTTTAATAAGTT
CAATCGGCCCTACCCGGTTCTGCATACGTAGGCGCTAAGGTCTGGACCGTGCTGAATGGT
AAACTGTGATTCTTACAAGGCACATGGAGCGCTGCTGCCAAGTGGAGGCAGCCTTTTGTG
AACGACATAGTCGAGAATACCCTCCCGAATGATCCCAACTGCCAACTCCATACCGTCCAG
GCGATCTAAGTGCAGTTCGCGACCCACTCATTGGCCTACCGCCCCCCTAGGGTGTTCCAG
ACTATCAAAGTTAATGCAAAGACTCCCCATCACCCCCGACGACCAACCAGGTGCTGCTAA
TGTAATTTGTGGTTTGCCTTGAACCCCGGCAGTTAGCTACAGACTTTGCTGCGTCTGGTT
CGACGAGAAGGAGGGGGATGGAAACTTGAAGTAGCAGGGCCGCGCTTTAACTCGTGGTAA
AAAGTCACCCTCTCAGCCTAATCTGAGATTATCCAGGGGCACGAGCGACGGTGAGGTCCA
GTCATATTAGAGGTAGACTTGACAACGATAGGATGGAAAGTGCCGGCTGGTCTCCCAGCA
TATCATCACGGAGACCTTACATGGTGTTTCTAATCAACGGACCGTTACCGGGACATTTTT
GGACGTTAATCCGGTGAGTCTTACTGATACACAAACAAAT
>Rosalind_9978
TACCCCTCCGTACCTGGGGCGATCAACTCGGCTGCATCGGCTGCAAGACTCGTCGGAGTT
GTCTAGATATAATCACAAGTTAAGTATTATAAGCACACCCACACAGTATGTGCACAAATG
ATTAGTCATTTTCATTTGGGCTCCCGCGTGAAATTGCACTTATATTGTATGATAGTTTTT
GGGTTGACCAGTAGTCTGAAATCCCGCGAACCCACGGGACGGAGGAGCCAGGACGAACTC
GCCAATCGAGCCAAAACCCCCCGCACCCACTCAAGCCTCTGAAAATCTCTAAATGAAGAT
GAGGAGAAGTCTTCCTCAAGGTCTGACCTCGATGGGAGGACGTCCTCGTTGGGCCAAAAC
CTTATGCGGAAAGTTTCATGGCGGTTTAGGATCGCGCCTATAGAAACGTTCTCAAAGGTT
TTCTATGTTTTGCTAAACTACTTGTCGCGACCAGCACCGTATCTCGTCAAACTTCTTCCT
CTAGTGAGAGGGACTTTCGGGGGTGTGTGCTTTTAACCATAGTGCAAGAAGGGCTATGTC
GTGCGTAGGCCGTCTCTTTTCGCACGACCTGGATGTCAAACGAAGCCGCAACCATTCGTA
TCGGGTTGACAGACAGGTTAGCAATAGTCGTGGAGCGGACTGTACGCGGAATTGCCATCC
CTGCATCTCCTCTTGTAAACGCACCGGGCCCTTCATGTTCGCTTAGACGAAAAGCGCTCG
CAAACTACAGTTCAATTTAGGATCAATTTGTAACAACGGTAAGAGGGCTAAGGGAACTAG
AAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTG
ATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGG
CCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGG
CTTATTTAATATGAAAACACAGCTATTGTGGGTCCGCGAG
>Rosalind_5976
GGCTCCCCCTACTAGATGTCAGAGATTCCTTCCCTGCTCCGGGGGTTTAGTTTAGCGCTG
TTGATGCGACCGTCCGAACATGCGGGTTTCTAAATATTGTCTAAGTCGACTGTGACTCGT
AATTAGGCCAATAGGTGTCCGTGCGTGTCCCTTAGGACGGACGAATGGGTCGTAGCGTAG
TTTTACTCTAGCTAATGTCTTTGGCCATTGTAGTCCGTTCAGAGGCATGTAGAAACCGCG
TACAACCCTTCGTGGTCACTGGAAATAAAGCGTGTATGGGTGGGGGAGCTTGTACTCCCA
GGGCCATGATCGGGTGCAACAAGAGTTTGTGCGAGCGACATAGCAGCATTATTGTCAGAA
TACGTGTTCGTGTTCCAGAGTTGTCCGACCGGCTGTAACTACGTCCAAAGCAATGAAGAT
ACCCAACCCAAAGAAACGCCCTCGGTTTAAAACAAACACGGCAACTTGGAAGCTGCACCA
TTGATCGCTAGCGTGCTCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGC
GCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCT
AGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTA
TCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACTATGACGTCCTTAA
CATTTGAGAGAGGACGGGGTCTCGGTGCTCCTCCCGCTATGTCTACAGTGGTACTACCAT
CTCTGTAAATCCCGATATGTATTAGACTGATGAAATCAATTCTACGGCGCACTTACATAA
AGTTGATCTAATGAGACGGAACTAGACGCACGACTCACGGCGCCTAGCTTTTACGGCCAG
TCAATGGGTTGCGACGGAGCCGACAAGGCTTGGTGCCCAAGGAGAAACGCGCCACTGAGG
GGTATTCAATTCTATCCATGCTGCTACGTCCGAAGAACGA
>Rosalind_2339
CAGCTTTACCTCGGGAGGCGTAGTGGGCGTGTAACCCTATACTCTTGCCCTATTCCGGGC
GGGGAAAATGCAAGATTATGCTCTGCCAGGTTTTAAAGGGGCGTTTCGTCAACATGTGTC
AGCGCTATTTGCGACCCCGTTAGTCCGACATAGTGAATCCAGCGCGGGTAGTTCATATAG
TGGGGCTGCATCACACCATATCCTCATTATGATTGTACCAATCAGGATATTACTGTGCCG
CGTGTACCTTAGGTTGGTACGTCAAGAGAACAGTCCGTACTTGGTATACGGCGGGTATTC
CGGTGCTTTAAAAATTTAATTTACCACCGCTATTGCATATACCTTGGTGAGTCTTGTGAA
TCATAGCCGTGCTTTGTCATCAAAGGGGTTACCTAAGGGAACTAGAAATTGAACCTAGAT
CGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCAC
GAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCG
CCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAA
TGGTCACCATTGGCGGACGAATAGACCCACCGAGACCTTAACCAATGCATGTCGGCAAAG
GCAAATAAAACGCAGGGGTTGATTGACTACTAGGTGTCAGGAGTAGTAATCGGAGTCTGG
GGCGAGCCTAATGGCAGCTCGCCGACAGGAACCCGGACCTGATCAATCGATCCGCAGCGT
CGTAAGCAGATCAGCTTTCGCGGAGCTTGCATACGAGACCCTGCGGCCGTCTGACACAGC
AAACGTCAGACGGAGCCGCGTCCATTCTTCCGAACATTCCAACTGAGTAAGTGAGCTCGG
TAGTCGGTCCAAGCTTCCCATATTGCATACATGACTCAGTGTTCAATATGAGACCCATTG
TACCGCTGAAATGCCAGACGATTAGGGTTCTTTGCCTATC
>Rosalind_6458
CAGGTGCATCTAGAAGAGTATCTAAAGTACGGATCCCTGAGGATTTAAGCCCGCGAATAA
GTTAGACGGCGAAAGTTTGTAGTCTCGCGTAAGCGGCCAGGTGTTTAACTGCTAGAAGGC
ACGGAGAAACGGCGTATATCTAAGAACGGCCATTGGGTCAAGCAGACTAAGGCCACCGGT
CTGGACTACAACGGTAAACGGCGCAGAAACAACTGGGTGGATCTCGTATCATTGCGTGCC
TACAGCGTCCAATGGGACTTTTCGGTGCCCGGAGCATAATAGCGATTTAACTCCTGTTCG
AAGGTAGATGGATTGTGATTAAAACAGTCTACCAACGCCTACCTTGTGCGAGTAATGGCG
CCTTATCGTTCCTACTAACGCAAGGTGAACTCAATTGCTATCGGAAATTGCTATTTGCTC
TACGCATGTGGGACACTGCGCCCGGCTTTACTTGTTATTGCAGGTCGGCACTTCGGTCCT
CACAAATACGCGCAGTATTACGAAGGGGCACACGTTTTAGAGGTACCTCGATACACGTTT
TTCCATGCCTGCCCAGGCGCATAATCGCTTTGCCAAGTAGCGCTAGTGTGCGCCAGTAAT
GAGGAGAGGCCGGATTTAGAAACACAAGCAGTTCAGCGCGTCGACCTGCATACATAGTAA
CCGTAAAAAGGGACGCGGTGAAACGCTTCTACTCGGTAACATAATTTGCATGAAATCTTT
CCTGTGTGCCGTGGTAAAAATAGCGTACCACTGCAGCCTTGCTGCAGCGTTGAACCTAAG
GGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGA
ACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGT
GAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAA
CTGGTGGGCTTATTTAATATGAAAGGTTGAGGCTCAAAAC
>Rosalind_8316
GTGATTCTTATTCCTCCCCCACAACATGTACCGTCTCGGGTATCATAAACACCTAATTTA
AGAGGTCAGCTCTTTTGGCAGTTCCGAGTATGTCCGATGCGTCTTTTCGAAGAACACACA
CGGATCATAACTGGAGTCCGCAGTGCTTCAAAAGGTGTATTGATCTTAGCCCCTGTTATT
AACCGGAATACCGGCGGGCAACGACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGT
CGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACT
TCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTA
GGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAATGTTAAC
TTCTTTATCGCGTTGGGAAGTTACTGAGTTTTAATTGCGTCTAGAGAGTCAGATCTCACA
CGTTACCGACAGACATACCGGGGAGGAACGTTGTCATGCTAACTATGGCACCTTGCCTAC
CCCATGCCGGACTGCAGGCACGAAGCACCTGTGAACCCTTTTATCCTAACCTGCCTCTAA
GACACGTGGCTCCGGGTCGATGGGAATTCTAATGCCGACGTAGGCTAGTTATGGATACAG
ACTTCAGGGGCGCCGGGATCGCGTCACAATCGCCTAATCCCGGAGCGCATACGGGCGCAG
AAAGTAATATGTAACTTGACATGGCCCCTTACTCTTGTACCGTACCAGATCACGGCATAT
TGACTAGCGCCCCTATTCTGAGGCCAACTTTTAGAACTCACTCATGCCGCAGCTATTAAT
AATTTCGGTAATTCATGTTCAGGACACTTCAGTCACCGCGGTCCGCCGGAGTAGACCTAC
TTCGTCCTACTTATCTAAACAACAGAAAAGTCCTCTAAGATTCTTCAGTCATGAGCTGAT
GAGGATGAACTTACTGTGTGACTCAGCGTCTCAACTGCCC
>Rosalind_1424
ATAACCATCTCTCTTCAAGAATTCGCAACATAGGCCAGCGGAACCACCATCCGTTGTTAA
TGTACCGGGCACCCTTGCGCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGA
GCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATG
CTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACT
TATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACCGGGCATATTGA
GGTAGCCAGTGGTGATCAGCATATGCTGCTATATCCATCCAAAACTTGTTATGGCTGTTA
GAGGCCGAGATAAGAGCTATAATGACCTCTTTAGACGAGTTTAGTTGACCCCGACGTATG
AGAATGGACGGACGCCAGTGATGATAATCCTGAATAATCACGAAACCTAATGAGTCCACT
GTCGGAGCGCACTGGCGGCTCTCCCAGGAGAGCAGCATGAAACCATGAGGTCAAGGGCTT
GGGTTGAGGGGTGCATTCGAAATCGAATCTACTGGATGTGCGCTCTTTAGCATACTTATT
GAATTGCATTAATCCTCACGGGTTACGATAAGTTCTTTAAAGTGCACTGAAGTCTGGAGT
TGCACTTGTCTAGGCGGTACTCTAAGAGTAGAGGCGCGTAGTGCACTACATGACATGGCT
AGATTTGAGGGCATCTATTGTCAATACTAGAATCTAGGAGTCTAGGTTCTTCTCCACGGC
TCAGTTTGCACGGGCACAGATTTTCAAGCATATGAGTGAAGGTTCCAGGAGAGAGGGCAG
AATTATAGTAAAACCAAAAATTAAATGATGGTAGACTCGACCAGAAGTTGCACACAAGAG
GCTCAGAGCCCTAAACTTAGTCTACCCGTGCACAGACAGATGTTTTGAGCACAAGATCAC
AGCAGCTAACTCCCCCTGTCGCATATAGCAGACAGATCCC
>Rosalind_5870
GGGAAGGATTGGTTCCGCGTGAATCGATTGCGCTAGTAACAGCGCCTCGTCATCACCCGG
GCCTGGGACTGCACTAATAGATGAACGTTCTAACAGTTAGGCAAGCACCGATGGCTCCGC
AACTGGCGCAGTATTACTCGTAAATAACTTACGCAGCTAACCGGACCCATTCCCACATCC
TAACACGCGTCCGCAGGCCGAATCCGCATGGTAACCACTAGGATGACTCATCATGCCGCT
GCAGTTGTTATTAGCCGTGGCGTCCGCCCCTTGTCGGATATACACGGCGACCTACTATTG
GCAGCACCCCCGCAAACCACCCCGTCTACGAGCTAAATCCCTACACGAGCCAACGGGAGA
CCTCTTCGGATAGGGCCCAGCAACGAAGAAGTTGGTGTGTCCGTGCGGGTTCCGCGAGTA
CGAAATTGAAGCGTTCGTTGACGATGGCGAACATAAAGATTCTGAACCGAACGCAACGTA
AGCATTGGTGGAGGGCGGCATAGACGCGTCGAGTGACTGTAAAGGCGGGGAATCGCAAGC
GTCCCACGTCTTTCGGACCACGCGCAGGCAGAGCGGCTAAGGGAACTAGAAATTGAACCT
AGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCT
TCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATG
TTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATA
TGAAGCGTGCCATAAGTGAAAACCCGTCATGGGCCTTGGGATGTGTCTTGCACCGGGAGC
TTAGTGTCTCCGTCTTATCTGGTCCTTGATGCCTCGAGGATGATTCACATCGTTCGATAT
ATATCTCACGTTAGACAATTAGTCGCAGCGTCGAGTCGTTTTCGGGTGACCAGAGCTCAG
TGATAAGATTTACCGAAACGCCGCTAGAGGCGGCATTATT
>Rosalind_4686
CATGATGGTCAGGTCGATGCAGTTCGCTCGTTGCGAATCACGCCATATCCAAGAACAGTT
CGAACTGTGTTAGCGCGCTGTCAAACCCGTTCAACCAGGAAGTGGAGACGTTAAGGTGAG
AGTAGTTGTCTCTGATCTTCCCGGAAGACTAACGACTGCGGATAAAAAGGTTGGCGTTGA
TCTTGTGTCTAAATAGAGAGGTTGTCGCGCATAAAATTCAATAGCGCGGGGACAAAGGAG
GGAAGAAGCTGCTTCTGTGGTATGCCGTTACCGCTTCGTAGCGTGCTTGCCGGCTACAAC
ACAGAGTCAGTGAACCGTGACACGACTGTCTTTTAGTAACATCCGCTACCTAAGGGAACT
AGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGG
TGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTG
GGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTG
GGCTTATTTAATATGAAGGAGAAGCCCCTGCAAATCCCTATTGCCCCTACCAGCTCTGAA
GTAGTATAAGGGTGGGTAAAAATGCGTCGCATGAAGATTATCAAAGCTGCGAGATAGCAG
TGTAAAGGATGGATTACATACCCTGCTTAACACCGAGAAGGACCCGTGGGCCAGACGGAT
CACCTTGGGGAGCATGAGGAAAAGCTGGATTACTAACGTCTATCGTTAACCGTTAGAGAC
ACTACGAACCAGGGATAACTCTGGCTGTGACGGCACACAACAAGAATGCTTTTCGTTTCT
CATAATCGGCTTACACGCCTTACGCCAATGCCCCAAATCTTATCCATCCGGTCCAAGGTC
TCCCTGTGGTTCTCCCAGCTAGAGCGCCCTAAGTCTCCGATGAGAGGACTCGGGACGGCG
TTGGTAAGGACTCAGCCGACCGTAGACCCCAGCTTGCAGA
>Rosalind_4738
TGAGCATAAAGAGACGATGCACGTGCTCTGTGAAACTTTCGGTTCACGGTTAAACCATGC
GGTCGTCGTCATATCACCCGCGCGATACGAAAACTTGCGTCTCACAGCGACGGGCGCTAC
AAGACCGTTGGAATCAGAGGGTCGCGAATAGCAGATGACCGAGCGAAGACCGATCCGATA
CAATGAGAGCCATTCAGCGCAGAATTCGCAATGCGGCAGCACGCATCTTCCTCTTATGAG
CGTTTGGGTCCACCCGGGTGACCAAGTCCACCCTCCTGCACCGCGCAAAAGTGTTACAAG
GCTTCAACTGATGTCCGATTGTAAAGCTATTCTATAACCGTTGTATGTTGCGCGGGCTCC
ACACAATGCATCTCTGAAGTGACACGTGCGCTGTATCAACAGGAAAGCCAAGTAAGTCGT
TGCGGTTCAACCATCCGGCCAGCCAAAGTCTGCAACGTATGATTACTTTGCTGTTCTAGA
ATGATAGATAAAATGCCAGAATATCCTTACTACTTATATTCACGTACAAATCACTACGAC
AGTTCCGGCTACAGAATGTCCGACTATAAACTAAGGGAACTAGAAATTGAACCTAGATCG
GTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGA
AGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCC
ATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATT
GCATACAAGTGGACTGGGATGTAAAAATTGTGCAAAACAGTATTTTACCGTAGTGTCCCA
TTAGGTACGTATGCCTTCCCGAAGAGTCACTTTTATCACACGTGTTCACGCATACTATCC
GGCGTATCAGATCGAGAATGTGAGAACCTTGCACACTAAGACGGGTCCACTTCGGTATTC
TACAGAGGTAGATAGATAAACGCATCGAAGCAGCATTTCG
>Rosalind_2711
AGCGCGTGTGAAAACCCTTGCTATGTTCGTTATGGGCCATTTCTACCAGGGCTTAGTGGT
ACCCTCCGGACAGGGGTGGTGCCGTCCTCTTTTTCATGACTAAGGGAACTAGAAATTGAA
CCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAAT
CCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGT
ATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTA
ATATGAACGCGCGCGGGTTTCACCGTCCTATGTTTCCTTCGTGGAGTCGCCCTGGATTCG
AATTCGTTCTGGGCACAGGCTGGTGGAGCAAGTAATGCGCGTCGCTGATCTCAAACTGCA
TGGAATGCTAGTCTGCTTGATGTTCCCCTTAAGCAATTCTTTTTATGTTATCCTCGTAGG
CTGAGCACAACATCTCATCAGGGCACCAGAACTTATGTACACGGCCTCCTCGAAGAGTGA
CAACAGTACGCAAAAGCGGGTACTGTGAGTACCGGCATAGAACATATCCCCTTTTCGAAG
CCCTCATGTCAGGCCGAACGAGAACCTAGGTGCGCCCAAAATCCCGTCCTTGGCTAGGGG
TCAAGTATGCCCTAACTGCGAAATAGAACCTGCTCTCACAATCTTAATCATAAAAGACAT
GGAAAAGGTGGACACAGCTGCATCCGTAAGTTATTGGAAGGCTTGCTCAGTAAAGACATG
TCTACATTGATATAACCCGCCAAGTTGGGATACTGCTACGAAACGGTAATCATAAAAATA
TGAGGAGACGGTCAGGCACTTACCGAAAGATCGAGATACGCTACGTCGGAGAGCCTCTTA
TGTCCCCGGGCGGCACGTACGGTGCTTGGTGTCGTGGATACATGGACCTAGAAATGTAAA
ATTATGTTCGTGTGTACCAGTAAGAAGGCTGCCCCCAAGG
>Rosalind_0020
GGCTCGTTGACTCAATTCTGGTGAACGGGATTCGCGGCAAGAATAAGGGACTGCGTTTGA
TGCTCGTATCATACACTTCTAGGAATAACAGTTCAAACAAAGGCTCTCCGCACACTGGTA
GGTAGGCGTTCCATTTGCCGCACGGAGTGGAACGAACTTCGTCCGTCATTTAGCATCGAT
CTGATCGCTTGGAAATACTGCATTCCGTGGTTCAAAATATAAACAGAAGCAGTTTAAGAC
CACATTCTCCATAATAACGGTGTTCGGTATAATTAACCGGACGATCACAGGGAAGTAGGG
TTACGCTGTGGTCACGAATTCTGGCACCCTGACTAGATTCTCAGATCTACTTTGGATGCG
GGACGTCCCCAATAGGACTAATTGATGAATGCGAATCGGAAGGTATTAGGACTTACTGGC
CAGGTTTCCCTCGATCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGC
CGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAG
TCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATC
GGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAATGCGGTCACGGTCCTC
GGCCAGTACCATGCTTTATGGCCATTGTTAAATTTGAAAGAAATCAACAGCGAAAGCGAT
GAGTCACGACCCTAATTCGAGAAACATTCCACGGTGCAGGCCGGCGAGTGGGGCAGGCCT
CTCTTGAGACGAGACCACACTACCGATGACTATCTATTTCGATACGAGTTTGCCGCTGGG
GGTCTCTGACCGGACAATTACGGTATGCTGGTCCGTATTACCGCAACCGTGCGTCGCTTT
TTCGCCCCTTATGTTTTTCTGTGTAAGACCCAATTTTGCTATACGTTGTGCGCCGCGTAC
TCGATAGTAGTGCTTCCGAGCCCCTAACTAGGGCCAGAGG
>Rosalind_8599
ACTACATGCGTAACAGGAGGGTGTTGTGCGCACCTCTTCGATTTCCGGGGCGGACCATAA
TACAAATTTAGCTATCACGCTCGGCAGCTCCGAGGGCCTCATACTAACCGATGTTCGGCT
CGTTGGGAAGGTAAACGTATCCCCTTCAAGGTTTTATACGAGGCCTAAGCTGCCGGGAAA
TCCTCCTTGATATATTGGCCCCTGCATACTCGTACGGGTGTTCCTTTTGGTGAGATAAAG
TGTCGCGCGAAGCTCCGGCCGAGCTGATACACGCTGGGCTAATCCCGACGTATGTTGGAT
GCAAGGACTTGATAATAAGCTAGAAGCCAGTATTTGCAAAAACTCGTTGAGCAAACACGA
GTACCCCATAGCGTGCATGGAACGTTTAACTGGCGATGGACCCGTTCCATTTCATCACGG
TGAATCGCCGTGACGTGCCTGTTATGATAGAGGGGGTCAACCGCCGCGGGGAAAGCATGA
CATTAGTTGTCTCTCACGCATTATCCTGTTATGTCTATCTATTGCGCAAGTCTGACACGA
GATTGCGGAAGACTTGATGGATCACTGCGCGGCCGCAGCCTCGGCTGAGTCTCATGGATA
GGGAGAACTAGTGGAGGGGGACATCGTATTATGATAAGCTGGCGCGGATCGAATATATGC
AACATGGTGCCCAGTTTGCCAACCTATAAATACTCGTTAAGCCTTGCCTCTCCAAAATGA
AAAGGCATGGACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTA
CAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCAT
TTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCA
GCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACGATTGCGGGCAAATAATCCG
TTCTAGTCTTGGAGTTCAGCCGTCGCACTCAATGCTACAA
>Rosalind_7390
TAGGAATCCAGGTTTACACCGGACCTTGTGCGCAACTTGGTGCCGGCAATATAAATGGTT
GGAATCAGAAACAGCAACTATCGTCCTGGAAGTTACGGATGCGTGAAAAGCCTTCCTGCC
AGTCCAAAGAACTGGTCGCCCTTTTTATAGTTCCCCTCACGAATCCTGCACCAAGTCACA
TAACGGACTGTGTCCTCTATGCTTTCTGGTATCGCATCGATGAATGTGTTCCCTACCTTG
GTTCAGACCTCGTTATGTTAGGATGACAGGGGTATTATCGCATTGGAAAGCACATCCTAA
GGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGG
AACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTG
TGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTA
ACTGGTGGGCTTATTTAATATGAACCTATCAGGCAGTCAAAATGCTAAATTCAGGTCGGA
AGGCTTCAATTCGTCGGGCGTGGGTACCGCGATAATAGACGGGTCACCCGCACGGGCCAC
CCGCGTGTGACACTGGTGATAAGGCAATATTTGACCCTGCAGGCAATCTGGCCTGTTTCG
GACTGTTGCCCGCAGTGTGATGAGTAAGTTATTGGACCCCAAGCATTAAAGACCATATTA
ACTTAATTATCGCATTTCCAGGATCTCAAAAGAAAGCCTGAAGCTTGACTGTTCGCTATA
TCAGAAGAGCAATCCAGGCATAAAAGATTGCAGGACACCGTAACGACAAGGCCGGAAGTT
GCAAACTTTCTGGGAGAGACTGCTATGTTTGGCCGCGCAATGCGATGTAGTGGACAGGCT
TGGCGGGGGCACATCCGAGGGCGGAGTCTACGACACTATAAGAGAACCCTGAGCAGTGTC
TGTGGCTTGACCCGATCATTCCAAAAGTGGTTCAACCGAC
>Rosalind_0566
TAGAAAATCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAA
TATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTC
TAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCG
GAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATAGTACACTTTTTCCGTTAAGACT
CCATCGTAACCTACTACAGCTCGCGAAACAAATAGGGTACGGGCCACTACCTACATACGG
GTACCTGATGTAACATCAGGCAGACCGGTTTTAGAGCACTATAAACGACACCTGAGCTCA
TCACCATTAGGTCTTGGATACTTCCCGACGGATGGGCTATGGTGATGCGGGGAGCTTCCT
TATGTCAATAATTTCGACGACTCAACGGTTCCCTCCTTGACAGACGTGTACTCTGCGGTA
TTATTACAAAGAAACATGACTTAATGGACCACGGGCAACAGGCACCGAACTCCCAGTATT
ATACCGCGTATAGTCGAGGAAGGTGAACGGGCACGAGTCTCGATACCGATAGAGCGCATT
GGGGGCGAAGCAGGTCCGTCGTCTTAAGCTTCTGGCTCAATGATTACTTTCATTCTGGGA
ATCTCTGTACCCTGCTGGGTATTTGTTTTGATGTGCGGCGACCGTGATTATGTCATGGCC
CTTTAGTTGGATTCCACAGACTACCGACCCTCATTCGGCGCATAACTCTTGCACAACCAG
CCCTAGACCGTTACCACGTACTCACACAGAATCTTTGAAGTGGGAGCGGTGTGAGCGTTA
ATCACTGCTACGGCTTGGCTCAAAAGCAAGGCGTACGTACATCCATTTGTATATCTGCTC
GACATCTGAGCTTGGTGAGGTGGAACAAGAAGGTACTGTGTTCCCCGTTAGCCGACCCTG
TGCGGAGTTAGATGGGGCCGTGGCATGAGATTGTTAATCA
>Rosalind_8694
ACCATACAATCACGCCTAACTCGTGGCCATCACTGGACCGCCGCTGCGTTAATGCCCTGA
ATTTTCGGTCACCGGGTTTTCAGCGCACATTTGACTGGAGATGTCCTGTTGTGTGATACT
GGCTCCTTCGCCGAATTCCCGCCATCCACCTGTCGTCTTTATCAGCAGTACGTATTAACC
TCGGGTAGCCTACATGAGGATAGGGTTAGCTGGGCGACTTAGAAATTTCAACTAGTGGGC
GGAGTCCGCTACGTGTTGAAATGGTAACACGTTATCGGCGGAATAGGTTGCCTTTTAGGA
CCCTTTCCGAATGCGTTCTAGTATGGCAGTTGTCCTCACAGACAAGCCATGAGCTTCGAG
GCTAGAGCCCGCCCATAGAGAGTCGTTAGAGGAAGCGACACGTGATCTCCGGGAAATCAG
GGTCTCTGGTCGTCGAAGAGCTGCTCTATACGGTGTTCTACCGCAGAGCCCTTCCTAGTA
AGACATAAACAGATCCAAAGAGCCTGAAGGGCACATGGGCACACCTGTCGAAGTTGGATA
ATGTCTCGCGGTTCTCGCTACGGCCATTCTTGCAACGGTTCAGGCAGTCTGTTACTTTAT
CGACCGCACACAGCTTAAAGCTTCCGACTTCATCCCTACCCTCTGATGTAGGTCTGACAC
TGGGGCAATTGTATAACCAGTCTGGAAGGCATTTGACTAAGGGAACTAGAAATTGAACCT
AGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCT
TCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATG
TTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATA
TGAAGCCTGTATAAGCGAGGCTACTCACTGTTATTTGGGGCCCGGACAAACTCTTAAGAG
ATGCTTATTACTACACAAGACTTTTCTATAACCCAACAAC
>Rosalind_7043
ATGTGTCTGGGATGACCGATTTGAGAGGGTTACCAAAAGTGTCATTGCGGTCATACCACG
AATAAAAATTGTCGTGGAATTTGCAACGATAGATCAGAAATGCTGGTCGCTGATGTACCT
GCCTGTGCGCGTTTCTCGAGGTCAGGGCCTAAGGGAACTAGAAATTGAACCTAGATCGGT
ATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAG
GACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCAT
TCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACTCC
AAAGTCCCCTCGCAGGGAGTGTGGCGGTAATAGGCAATTAATACTGCGTATCCCAGGAAC
TTGGGGACGATCAACACAGCCCTGTATGTTCGATTCTTATGCTCTATGGAGGAGATGCAT
GTTTACCTATACTAGAGATGGCGCCCTCTTCATAAAGACACTGATGCAGTACGGGCCTAT
CGAGGAAGGTGGCATCGTAAAAGGATGGAGATGGAGGTATGGCCCATCTGCCCCATGGAT
CCACACGGCATTTGAGCGATTGTGTAGAGCAAAGCTTAGTGCAGCGCCTTGTTCAGGGTA
TCTGTACCAGGCCGGGGGGATCTGACTAGCCCCGCACTATATCCCCGCTCCAGAATTGAC
ACCCCAGTAAGCACAGTGATAGGATGAATGCACGAAGCCAAAATCCCACCAGTGCTTAGG
GTAATATTGGTTTTTCATAATGAAATCTAGGACATCTGAGAACGACTTGACCTTTGATAG
GTCTGCAAGAAATAGCTAACTCAGCTTAGGGAGGGGTTCTGATCGGGTTTATCAGTATGA
TTAGTGTTCGGATACACAATTTCGCTGTCCCCCGAAGCTAGACCCAACTTGTTTGCTAAT
TTCGCAGCCTCAGACAAGTGATGCGATCCGCGCTATGGAG
>Rosalind_0082
ATCGCCGTATGAGATAACTGACGGAGGACTGCGGGCCTGTCACGATGAATCGCGCATTCG
ATGCTATACCGTCAATGGGAGAGTTGCGCTATGTGAGACTCTGTGTTCGGAAATCCGCAA
CTTCTCCTGACACCAGCGCGGAAGGAGCCTCTCGACAACTGACATACTGACAGTGCCTCC
GCCTTTTCTATCCACTCAGGCGTGATTAACCAATCTTTAGAGCTAAAGGACGTGCGCCGT
CTGATACGCAGTCAGGCGATGGCTGTTGCACCATAGAAGTATCTTGAGCTCCTATCCTCC
GAACCCGCAACAGTGTGGCGACCAGCCATCCCCGCCAAGTTCGACATACGGGAGTATTCA
CGTAAGAGTGGCTCCCTGCAATAGTCAAACAGCTGATATCTTGTTATTTGCCGTAACCGC
CGGTTCAAATCTCTATCACAGTTGAACTGCGTAACTTTCCCGCTAGAGTGCGTCGTATTT
TTTAGATTGCGGAGTAATCATTCAGGTACAAGGCCTAAGGGAACTAGAAATTGAACCTAG
ATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTC
ACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTT
CGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATG
AAAAGCAATACCAGGGAAGTACGATGCTTTCTTATACCCGGCCCTGCTTGATGGAGGTTC
AATGAGAAGTTACTAAATAGTATATTGCAAGAGTACCGAATTGTCGAGGCCGCACCCCAC
CCGACTCGGCAACGGCTCTTTGCGAAGAGTGAATCATGAGAATCGTCGTCTCCTTTAAGG
CATCACATAAGAGCACTGTAATATAGTGCATTTTCACGTCACCGCCAGCGGATATCTCTC
TAGTCACTTGTTGTTTTTTCTTCATGACGTAGAGAAGGTA
>Rosalind_7197
TAGCTCACACTTAGGTCATCTTGTTAACAGGTCCCGAAAAGATCTCGCCCAGGGCTCCGG
TTGACGGAACGTCCAGTCCCAACTCGCCTATAGCGTCTCTACGTTATGCCCGGTCAATCG
GGTAAATTACGACGCCTAGGATATGTATTCGTCGCATCATGGTCGGAGCGACCAAGAAGT
TCGCCGAATGGGGGGGTACACGGTCCTACGACAGCTTCCCTAAGACACATCAATGGCATC
CTAAAAGCACGAAATGATCAACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGA
GAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCA
TGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGA
CTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATGTCTATTGAC
TATAGACGACATAGCAAAATGAACGGCGCTTCGAGAGGCTGGCAGCCGTTTCCGACCGGT
ACCGTCGGATGTGATTTGAATCCGCTGTTGTATACGCTGCCCGAAATCGTCGTTTGACAC
GATACAACTTGGATATGCGCAAGAGGGAGGGATGCCCGTGGACTCCAGTCTGGATGATAG
CTTGCGGAAGGTAACACGCAGTGTGGCAAAAACGCGTCGATGGAGCTTTACAGCAAGCTT
AGAAATCGACTGCTATACCATACCCTGTCTGGATGGGCCTCCGAACACAGATACACTTGG
TAAGCCGGTCGGAACGCAGCTATGGAAAGGGTCAGACTCGACTCCTATAGAACTAGAGCA
AGCTCAGGGTATTTTTCTCTTGACGCGTTAAACGGGTGATTGCAATAATGCATGTGGCAG
CCTCTCAAGCGACCGTTTCCCTTTAGTTGGGTATTCAAGCTCAAAGAGAAACGATACCGA
CAGGATCATTAGTCTAGCAGCCACTGCATACCTACAAAAG
>Rosalind_3448
GCAAGGATGTAAGTCATGTTCTTCCGTTATATTTCGTATTCATTGGATCGAGGGGCGTCG
AGTGTGCTGCCGCAACGTTACATCTAACGGTGAAAGCTTCCCTCGGCTCAAGCTAAGGGA
ACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACA
GGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAA
CTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTG
GTGGGCTTATTTAATATGAAGACATGGTGCATGAAACCTAAAAGCTTTAGCGACTTAGTA
AAGGGATGACTCATTCGGCAGGGATGGTGAGAGTAGTACCATTAGGTACTTGAAAACCCG
AGACACTGAAGAAGAGGCGCCACATGTTCACATATGCTGGATCCTGCTCACTGTTGCATG
CAAGAACTGCACACATTGCGCAAATAAGTCACGCCTGTCTCTGATTCCCCGCTCACGGGA
CTCTCCCGGCGGCTGCTCTGCATCAGACTTGCAGGGATGAGACTCAGCGCCCTTGTGGCT
AGAGACCGGATCGTAAGAACCAGTCACTACTACTGTGAGCCCTGGGTCGGATGGTGAGTC
CCGGCATCAGGGGCGATGTCTGCACCAGGACTGCCCTGATACATTACAGCAGGCTAAAAT
CATACACTTATGACTGGTTAAGGTATCGACGCGTCAGTAGGTAACCTCCTATGTAGGGCT
TCTTGACCTGAAGTCGCAACGAAAGTAGCGCTCGTTGATCCTGCTACCGTCGCTTCGCAA
TAACGGAGCGACGTGCATCAAGCTCAATCAATGGATGTAGATGTATCAGGGCAAAGTCGA
GAAAACCCTCACAGCGGCTCGACTGTATGACACGCTTGGAATGGATTACGGTGGTACGGT
ATGAATGGCCCCAACCTTGCTGGCTCGGACCGGTATGCAC
>Rosalind_8968
TGCGATGAAAGGACCGAGGTTAGAATGTTTTCAAGCGAGCTCGTACCTGAAAAAGGTCTT
GGGTGGTCTTGATTGCGATTGGGTCAGAAAGTGAATTCATAGAAAAAACCTAAGGGAACT
AGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGG
TGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTG
GGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTG
GGCTTATTTAATATGAATCACCAACGGTTCCCTAAGGACGGTGAGCCTATCGCCCGTGTT
CGGAGAAGCCAAATCTCAACGAGGGGTCTGGAGTCCTACCATTACGTGCGCCCCGGTCTA
CTCTGAGCCTAGGAGTGCATTGCCAATTTACGTATGTTTCCTACCAGCACATGGCCTACA
TAATGGGGAATTTCTATTCGGCATTCCTAAGTGAAAGACCGTAACTGCATCGAGGATCCT
AAGTTACACAGATATTGCCACCGCCCGTTCAGGTAAGCTGGGGAGAACCAGGGTCTTTGG
GGGACATTCTGTTCCCCCTGCCCGCTACTTACAAGAAATAACTTGGTTACATGCATACAA
ACGTCGTACACACTGCGTCGCCGGCTTTTTCAGAATTACCGCCACCGCCCTTCTTGGTAA
CTCATATCTGATCTACAGAATCGGGTACGTCATCCCTTGCGATCTCAGCGGCACGCCATC
CCATACGCGTAATTAGGCAAACAGCTAAGAGTCCGTCCCATATATCGCGTTTGGCCTAAC
TCTCTCTTAAGGAAAGAGTTATCACGTATGGTAAGTGCAGGGCGAAAACAGGAAGGATAA
TCTTATCCACTTTACGTGCTTGCGACCTAAATCTATAAATTCGTAATAAGAGATATACCT
GTAAGTAACCGGGTCTTTCGATCGGGTCAAATTCTCACGC
>Rosalind_3409
TGTATTGGACCGGCCCTCGTACGGTTCGCTCACCTTTAACCTCCTACGCCAACACTCCTC
TATGTCGAATAGAATCGGCGGCCCACTGGTTTTTGACATACTAATGGGGCACTGGAAACG
GCGACCCGGGGAGTTAATAGATCTAAGAGCGACTCGGACGGGTTAATCATGCTAAGGGAA
CTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAG
GGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAAC
TGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGG
TGGGCTTATTTAATATGAAGACGAGGCCTAGTAGCAAGTTTAACTCGTTTGGTGTACCCA
AGGGCGCACGTATACTTAGCCCATGACAGGCCTGGCCGGTTAGAAGAAACAGGAAAATGC
CAGCGGTGGGCGAGAGACGTCGCCAATCAATGGAGCCGTGTCTCTGCGCCCGGCACATGT
TGCAGGGGGCGAATAAAAGCACACGGGGTGCTAGCTAAGGTGCGAACAATAGCAGTATCA
TCAAATGGCCGACTTTGAGACGAACTCCAGGGGATTCCTAGATAATGGACGTGTCAGCAG
CGCTCAAGAGTGTTTCGTAATAGTCATGAGCGTCAGCGCGTGCTCTGCGCCGGCGGTGTT
GATACGCACTGAACGGAGAGCGCGTGAAGGTGAAGAATGGAAACGCCTGTGAGTGAAATG
TACATATGCGACCCGAATTTTAACCCCGTGGTGAGCTCCCCTCTTACACCATTAGTTGGT
GGGCGTTCCTAACCTATGGCGAATCGGCCTCCGGGCTTTAATTCCTTTCTTCGTAAGCTG
GTTACGCATTACTAGACAACGCCCCCAAGCCAGCACACTCATTCCCGGCGCAATGGATTA
CATGTCTCTCTCTATAGTGCAGTACTCTCTCTGTGGCCCA
>Rosalind_2527
TTCCTTAGCTCCAATCCCCATATCTTACCCTATTCCAGGAAGTGTGTAACCCCCTCATGA
GCGGCTTGGTTAATACCCCTTGGTTCCTAAGGGAACTAGAAATTGAACCTAGATCGGTAT
GTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGA
CTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTC
TAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACCTTGG
TATCTATTTGCCGAAAAGTTAATACAGCGCTACGAAGCCTGAATCACTGTGGAATTAGAC
GCCCGCCGGCTTGGACTTGCCCCAGAACCTGCGACGCATCAGCAAGCCGTAGAGGAATAG
AAGACCTTGTTCGATATAATGGACACTCATGATATATAATTCTCCGCTGAGTTATCACAA
TCGGCAGACTGTACGCGAGTGAACCTCCACACGAACAAAGGATCAACCCTAGGGCTCGGA
ACGGGGGCGCTGATTCAAGTGCCCCCCGGTCAGAAGAACTACAGATTCGATATCCACATG
CAATCCTACCCAGACTACAATGGCGCGGATTTTTAAAACTGAGAACGCATTGCTTCATCC
TTGAACACGAATATGAAAAACAATCTGTAGGTACGCACGTAAGCACCACTTTTCTGTTGC
GACAACTCCTATCCTGATGGTTCAAGCTTAGCTGAACCACGTCGCGCGTGTCCGCACGGA
CACAGGGCAAATGCAATTCTTGCACAGTTCTAGACAGACCCTAATCATTGTTGTCTGATC
GACCTCTACGGACTTGATCACTCGTGTAAAAAGCACTCGAATGAAATTAAACTAACAGCA
ATTCAGAAGACCTCCTGTCTGCAGATTACAGAGGTCACATGGTCACTGTAAACGGATTGT
TTGGACGCGTACTCAGAGACAAACCTCCCGGCGAATCTGT
>Rosalind_6944
GTTCTCCGCCCACGCTGTGCCATCATCAACACGTGTACACATGTTGTGGCCGGCGTCGGC
TACCTAAATATGGCTTGGGATAATGCACTTATAGCATGAATTCGGATAGGTGACAATCCA
GTGTTTGCTCTGCACCGATTGAAACCGAGAGTGAATCCCGTTTTGTACAGTACCTCATGT
CGTTACTGACCAACGCGTCGTTCGTGAGGTGCATCGTGTGCGTCTTGTCAGAGGGGTTGG
CATGGGTAAGGGACACCGCCAACGTGGCGATTACAACAACCCTTTGGCATCTCCAAAGGC
CTCATCTGACACATGATCTCTCTCTTATGTGGTGCTAAGGGAACTAGAAATTGAACCTAG
ATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTC
ACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTT
CGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATG
AACATAGGCAGTTGAGTACCTGGCTGGAATGTTCTTTACAGCCCAAGTGTAACGGAAACC
ACTCCGGAGTAATATGAACTGTGTGTCTACTTTAAGCCGAGCCACTTTCATCGACGACTG
CTGGTCCACTGCCGATGCTACCGTCGCTATACTAAGGAACGGGACCAGGCAGTAGCGTTC
GTTCGCCTACTTTCGTGTGCGAAATCTCGATAGGATGGCAAGTCGCCTGTTCTCTCCAAG
TCATAAACTAAAACGTGTGACCAGGCACGAGAGTCGCTGTTCGGCGTCAGCCCGCATTGA
GCCTGGATTGGACACGGATTCAGCAGCCGGTAATAGCCATGGGCGGTATCATCCAGCTCT
CGCAGCGCTCGAGGAGGTAAATAACGCCGGCTCATAATAGCTTGGAATCTCATTGTCATT
TGGTGTCCGTCTCCCTCCCTTAGTGCGGACCAACCAGAAC
>Rosalind_8764
GTCGTAGTAGGATTTAATGGGTTACACCCTATACTCACAACGTCATCAGACGAGAATGCA
TGGTCCCACCAGAATTTCGTACTCCTCAAACGCCGCCATGAGGCAATCCTTGAGGTGGTA
CTCTGCTTCTCAATTTCCAGAATGGCTCACCACCTTATGTTGTAAGTAACTCTTTCATAA
ATTGGAGATCGGTGTGGTCCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGA
GCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATG
CTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACT
TATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATTGAGTTAAAATT
GATCACTAATGATCGTCCGCTTCACTAAAGGACCACCTTTATCAGCGAAGAGTCACCGTC
CCACGTGTGCCCAAGCATTCCCTCTCTCGGTCAGGATGGACAGCGGCTGGGCGTCATTAA
CCGACTTCGTCATAGTGCTCTGATACTCCCGTGTCGCGGATGGTGAGACTTTTGTGCAAG
TTACTGTAGGGTAGATGTCCTTAACTCTGCTTTAGGCGGACGTTGCAAGGTTCCCCGTAA
CGGCGCACGACCGAAAAGCGCCGCACAGTGCAGATACTGGCATCTCTCAACCGGTGAGGC
TGCCCGTAGCCTCAGACATAACTCCACATTAATGAGAGACCACCGGCTGGTGATTGTTGG
GCGTCTTGCCAGCGAAAAAGCCCTTTCTCCGTAAGTACGCTCGCACTGCGAGTACCGTGC
ACGCCATTCAGTCGCCAAGTAGAATGGGCATAGTGGTCACCCGTTAGTTCAGCAGTTGCA
TGTGTATGCCTCAGCCAATACCATACGGAACCGGGGAAGGCTATTCCTTCCGACTGCGAC
GGTTAAGTTATTTATTGCATCTATCACTTGCAAACTTAAG
>Rosalind_4966
AGCCGCCTGGGCTTATTGTTAGCACAGGAAAGCCCAGTCGATCCATAACGGACTCCTGTG
CCAGGTGTCCCAGATTAGGGTGAATTAGGCATCGACAGAATAATGCGCGATCTTCCTCGG
TTCAGCCAGTTATCTGATGTCAAGCCGCGAACGCGGCGCATGTCGAAGTTAATTTCGTGC
TGAGCAAAAGCGAGGTTCGACCCGTCCTGACTGCTTGTACAGTGGCAAAAATATTCTTAA
GGGAGTTCACCTCGGGTCACTCACTACTCGTGTTAGTTGTCCATGACACCTAAGTTTGAT
ACTAGGTAATTGGAAATCTTCCTTATACACCCTTATGGTCGTACCGTAGGGACCGCTCAA
GCTGGATCGGGGACTTCGTTCGGAGCTATATTTGCAGCGGAAGGACACGCCGGTAGCGCC
TTGAAGCGCTGGGGCTGGCCTAGATGAGGTCCTTTTCGTGGCAGGAGACGCCTTTACAAC
GGACCGAGAACCTAGGCGTTTACGGAACCATAACCTCGGTATGAACGTCTAAGGGAACTA
GAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGT
GATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGG
GCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGG
GCTTATTTAATATGAAGTTTGGTCATTCTGTAGTTCATTCTGGGTGACGATTATTATAAT
CAAGCCGTGTGTTGCCTTACACTGAAGCCACTTCCTATAATCCTATGGAAGCTATAACTG
TGCCGGAGAGCGAGCCGTTGAGATAGTTCAGTGTCTGTGGACATTTACTACTCGCGGCTG
TCCCCATGTATTGTCTCGTTCAGCACTTTGTCACAAGCACAGATTGTTGGATGTTCTGTG
TAGATATAAGGTGATGGCTCTCGAGTACTGCGGAAATCAC
>Rosalind_7589
ATTTTCACCGTCCGGTTTTGCCATTCAACGTGTGCGACTGTCACGTCTGAGGAGTTTAGG
GAGTCGCAAGGATATAAGTCAAGGAAATACACGGTGATATTCCGGTCTGCGTGGTCTAAG
GCTCCGCGTGACGCTTCGGCGTTCCTCATGACAGGGCAATGCGAAAGATATATTACCACT
CAGTGTGCAGGAATGGCTGGTTAAAACGTCATAGGTCAGAACAGATATGCCTAGCTACCC
GCCACGTAGGGCCGTCCTTTGCTGATGTCCTATAATGTAGCGCCGACATGATACGGAAGT
TAGCAGGTTTGCACCGAGGTCGATGGCTGCCAGACTAAAAAAGCCAGTTGTAACGGTGCT
ATTCACGTGGTCTCGCCGTCGCAGGCTGAAGTCTCCCTAAGGGAACTAGAAATTGAACCT
AGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCT
TCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATG
TTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATA
TGAATGGGTGCTCCGGGGCCATAAGGACATTCGGGAAGGGAACTTGCCTGGGCAAAGCAT
TACGGCAGTTTCCAAAAGAGTGGTTCCTATTAGTCTACCTTTCAAGCTAACCCTATGGAA
CTATTTGACGTAGGGAACTATCAACGAACGACTGTAAGTTTGTATCTGCATCATTTTATG
GAAAAAACATAGCTCTCGTAGCCCTCAACCGAACCGGTGCATTGTTTTACTGTGATATGC
GGCATCCCATGGTGCGGTAAGATTTAATCATTGCCCCAATAAAGGGCAGCGGCACGTTTA
TTTGACATCTCTTTTGACCGAGGCAGTGTTGGTGACTGTCCCTGGTGAATAAGGAATCCG
TTACTGGATCTGTACAGCGCGCCTTGCTCGACCGCTGTCA
>Rosalind_7331
CACGACAATTTAATTTGCCAATTCCACTCAAACGTACCTGAATGCGCCCAAGAATCTTAC
CAAACCTTTGCTTCTATTCACCCGCATTTCAGGCTTGGCTTCGGGCAATAGCTTGGTTTT
ATTCGTCATCCACTTATACTCGATGGTGTTTGATTCTTGCTGGCGGGAACTAACCACCGG
TGACCGCCGCAAGTTAAAACTCGTAAGGGACAATAGAAACATGATCCATCCACGGTCGCG
GGTCCGGGGACTCGACAGCCAGCCATTGTTCGGCAGAGGCAAATGATCGGGCCGCCACCG
GGGGTACGTCTAAATGATTCAAATGTTCTGGCAGCCCCGACGAGCCACGCTGAGGTTATA
AGCTAACAACTGCTACACCTGCGTGTATCCTTTATTAAACTCATCAGGTGCGCAAAAACT
CGTCCCCCGTGATGCGTGTGAAACGAGACTGTCTGTGTGGGGGCGCTAAAGGTGCTCTTT
TTCGCCACTTGATTTTCCCGGTCAATTCCCTCTTGCTGATCAACAACTGGGCCCACCCTG
CCCCGTGTGTCTTGTCCGGCGCGACCCCACATAAACGATCTGATTGGTGTCTGATACCTT
GAGATGCTGTGCTCATAAACAGTGTATTCGCAATACCTGAGTGCCAAACGTGTGAGGCGA
GTAGACCAATAAATCTCTCATTGAGGATGACGCTGGTCGTTTCGCAAAAGTCTTGAGGAC
TAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCA
AGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATG
CTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCT
CTAACTGGTGGGCTTATTTAATATGAAACTAAAATTTTACATCATTCCATGCTCGATCCC
AAGCTTTACTATTGCTTCAGAATCCAACGAGGAGGGGTAC
>Rosalind_0237
TCCTAGAAGGGTGTGTCGAAGTCTTTTGAATTGCCAAATCTGCAATACTGAGGCATTGAA
ATCGCTGGGGTCGGGTGATAGTAACGACAAAAGCCTGTTGGAGCGCCATATGGCCTCGAC
GAGGTGTACTGGAAGATGCGATTTCGTGCTCTTACAGACTTTAAACAGTAATCAGCCTTA
AATACATTATTCAAAAACCATATGGGTCGTTACGTGCAGGGAGTTCTAAGGGAACTAGAA
ATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGAT
GATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCC
AACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCT
TATTTAATATGAATAGCCGGCAATCGACTACCCCCTAGAAAAACTCTCTCGTACTACGAC
GATTGAGTTCACAATGTCTGTCAGCTGGGAGCAGTACTCCAAAAGTAAGACCGATGTGTG
GAGAGTGGGTAACTACTCCCAGACATTAGAACTGTCTTGACTATGTGTCAACGATATTAC
GAGATATACCCGATCGTCCGCATTGGCCCGCACGCCTCCAGCCCTTCAGTATAAAAGTGC
CTTAACGTGTGTAAATCTCCGAAACCTAGGAGGGAGAATACCGAAATTACTTAGGTGCCC
GAATAAGCCCTTGGGGTTCTTAAGATAATCATATCAACTTCCATAGATTAACCGTCCAAT
GGCGAGGAAGCATGACGCTGCCCGTCTGGAGAGTAACCATCTAAATTGTCGACGTTTATA
CCGATATTCCTCATAAATACCACCGTTGATGAGACTAGTACCTATGAGGGACCTGTGAGG
GCTCCACATTGGGAAGGGTAATGGTCCCCTTTGAACTAGGCTTGTTGTGCGCTTTCTGGT
GTGGGCTACAAAATGGTTCCAACTGAACGGCCAGTCAACT
>Rosalind_0113
TCCCTGAACTATTCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCG
TACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTC
ATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGG
CAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACCGAAGATCGTTATACTCC
GCTGCAGAACTGATCCCTCTTATTCAACCTCAACCCCGCTTTATGCTGATCTTAGTGGGG
AATAATGTAACCGGTTTGCGCCACCAGGTCTACTATGTTAGGAGGTTTGCACCCAGTGCG
TCCGGAACCATGTTCAATGTTTTGCAATGTGACCATTCTATCAGACCTCCGGTCAGGGGC
CCTCTAGGGTACTTAAGTGAAGGGGAATATTCGACGGACGACCCGGTACCGTCCGTAAAG
TAGCAAGGCACGCGAGAAACCACGGCGTCTACGCGCTTCATAATGTACGTCTAACGCCGA
GCGATGCAGGGAAGGTCAGTAGACACTGTTGCTTAATTACGACATAGCGTTCGTAATTCT
AGTGTTTTAGCGCGGTTATATCAAAGAGTCATCATCTGCCACTATATAGGTTACCACGCA
CGCTGCTTGGCCCGTCCCTGCCGAGGCTTTCAAATACCCAGAAGGGATGAAGGTGACGGG
TAGGGAGAACCCCGGTCCGGTTGGCATGCGTGGGCAACGTTTATTTTGTGGACTCGTCAC
TATAATTTTTGACAGACCTGCCTGATCATTGCAAATGATGTCGGTAAGGTGTTCCTAGCG
CGCTTTGGCATGCTCGCTTTAACTCAGTGGACTGCGCTTTCCGTTGTCAGGCTTCAGGAT
TCGATATTACAGCCCGGCAAGCCAGTTAATGATTTTTCCAATGCTTCACTAGTAAGTTTC
CCATCCTAGTACTAACATTTATTCGCGCCTTGGCTAGTTA
>Rosalind_1206
GTCAACATTCAGTCCTAATGGAGTGCATGAGCAGACATGATGTTTCGCACGGAGTATTTA
TCCGAATGAAGGGAAGCAATGTGGAGTAGCAGTACGTCGCACTACGTATATACGGAGTTC
TTTCTCGTCATGTCGTGCGTTCCCGCACTGTTCGTCATTCAGATCCCCGTCTTAGGAACG
CCTTACCGTGGTCCTATGCTCGGAGACTAGTAATGGTGTTCGCACGTACTCCTAACAGCC
ATAGGAGCCCGCGTGGCAACCCAACAAAGGGCGTCGCCTTGGGTATTTGATTAATCCTAC
TAGTAGGTCTCAGTTCGTAATCTTGCACCGAATCTCTGCCGTGAAACAGTGTCTTACACA
ACAGGATGGGTACACCTGACGCGCCAAACCAGCAAAAGCGCGTTGGCTAAGGGAACTAGA
AATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGA
TGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGC
CAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGC
TTATTTAATATGAAGCAGCCGAAGATAACCGTGTACAAGAAAAGCCACCGTCTCAAGACA
TCTGTGGATATCCGGCCTAGTTAAAGCAAATTACAGACAACGAAGGTGTAGCTAATAGGG
TTAGTCCTGTCCATCCAACTAAAAAGCTGAAGTGGGTTCTTTGGGGCAAAGTTGTTCGTT
TAGAGGCAGAAGCTCAACAGTCGAAGCCGAGCTATTCTATTTTGGGGTGGGTATACTTCG
CTGCGCGAAACATTTCTATTGAGCTCTGTCCTCCTGGACTCCAAAAAACTTGCGGTTCTT
TTACCATCTTCCGGCAAACACTTGGTAATTGCTAACCCAAGTGTAACCACATATGTTTGG
CGCCGAAGCGCTCGCCAGCAATTATACGCGGTAAAGCTGC
>Rosalind_6835
ATTCCTCGTAAACCGCATTGATTGTGGGAGACTGTCATACAAAATGGATCTCGGGTCGGA
AAGCCCCGCTGATGCAAAAATTCGTTAAACACAAGCTGTCCAATACCGTGCCTCTTAGGC
GCGAATCATTGCAGGTTCCAGTATACAGTTGCTACGTACGGATGTCCTCTCCACTCTTAT
CTCGCAGTCTTGCACAGCGTGAACCAATCGGTTCTAACCCACGGTTTACCGGATCGATCC
ACGCGCGTGCTTTGTATTGTTATTCCCCACAATCTCTTGTTCCGTTTGCCTAAGGGAACT
AGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGG
TGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTG
GGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTG
GGCTTATTTAATATGAAGGAAGGAACCAGCGCGGTCCCCCCCTCCCTAAAGAAAGATGCA
GCGCTTAAGTACGCCACTGCGTCCGTCCAGTGAAATATCGGGCCGTTGATCAGGTCAAAT
CAGACGAAGGAATGATCCAGTAACCGTTACACTTGCGATTAATCTTTGTCGATATGATGG
AGGGCAGAAAGGAAGGTAACTGGCAAATCTCACCACTGACACGCCGCGCGGGAGCTGACC
TGGAAGCATGTGGTTCGATCTATAAACGCGTGAGCGCCTTAGGCTGGCGCCGAGAGCTGA
ACATAAAGTCGAACGCGCTTTTGGGGGTGCTGTTCATTGAGTAACAATGTGATGGAGCCG
CAAGTTGCTCGACGATGGAGAGCATTTTCTGCATGGAAATGTCTTGGCTGGCTCAAGACG
ACGTTGGAAAGCGAGGGTCTGTGGGATCACCAGGCGCGACCAACGAAATATCTTCGTGAT
GTCTCTCTGCAGGAGAATGGCCCCAGACTTCCTCGATGTA
>Rosalind_3239
CGTCAGGCAAAACCTAGTGAGCGCTACTGTCAATCGTGCGTCCGCCCGCGCGTTCTAGCG
ATATGAGACAATCTTGGCTTTCCCACGGAACATAAGGACCAATATGTTTGGCAGCTCTGA
TGTTTAGGTATGCACGCTAGCGCCAGATCATCTATGGTCACTATGTAGAGCCACTCTACA
TATTCTCTTTCTCATGCATCTGCCCGTGCCGGGCACGCAGCTTTACTTTAACATACCATC
CCATTGCTCTTATCTAAACCCCAAGGATGTCGGATAGAGAGGCGGTACCGCGTATCGTAG
GTCGCGAAAAGTTAAGCGGCATGACAGCCGTGCACCAGGGAGGTCGACACACATCTCCTA
GACCCCTTGTCGGCATTAACCGTGGTTGCCACTAGTTACTCGCGAGGGATGGGTTGAGGA
ATATCGACTTTACTGGCACCCTTGTTTCTTCAAAAAAGCGCCTTTTAATTCCTCATGGTA
AGCCCATGTGGACGTTCAATCTGTGTGGCCGTTCGTAGTAATAAGTGTGATGTACGATGT
CATAGAGAACGTCCAAAACAAGGGTGCTGCGTTTTTAGCCCCTTTGTCAAACGGGCGTTA
ACATGACGCCGAATATCCAGTGCACCAACCGACTGTTGCAGCCCCACCATTTCGATCCTC
AAGGTTGACCCTGGTATCACAACGTAACGAACATCTAGTATAAATGATACTTAGTACTTG
TTAGCCCAACCACAAGGGTGCTCGAAGGACGAGATTTGCTAAGGGAACTAGAAATTGAAC
CTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATC
CTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTA
TGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAA
TATGAATTACCATATATAAATCAGTTCCCCAGTTAGTGCG
>Rosalind_5593
TTACTAGGGATAGTATCTTGGCAAGCGGTGGGCATGGCTGTCGCCTATTGACCTACACAT
GTTACTATTGTATAATAATGGTGGCACGAGGTCAATTGTATATGTGCAAGTTGGTTCTTT
GGTACCTTCGAAAATGATTATTGGAAAGAGCGTATTGTGTTCTAGATTCCCCGAACCCAG
GAGAGACGCCCCACATTGTACCTTGCAGCACCTAAGGGAACTAGAAATTGAACCTAGATC
GGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACG
AAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGC
CATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAC
TAGAGTTATATCACGTGCCCGGGACACATGGCTTTTGCTACAATGTTGTCGAGTCCGTGG
CTCATTACTGAAAAATTGGTACTCCTATTCCACGGCTTATACCCAGGGAACGCGCTTTAA
CACCGGAGATCAAAGATCCTCTTCGCTAGCTCGTCTCATCTACCCAACGCAGGGCAGATT
GAGGCACCTACGAGACTCAGCACGAACTATCACTTGACATACTTTTCCGATCTCTCAACG
TTGGACGGCCTTTCCCACGTGCTTCCAATATGTATGGCTTATGGAACAAACTCAACCTGA
AAGAAGACAGATCGGGGGAACAGTTGCTGGGGTTGTGCACGACTGGACGCCCCATAGCTT
GACAGCGAGGGTTATGAGACAGTGTTCGAAGAATAGGAGATCGGCACTTTAGGATTGATC
TTTCTTCGGTCATCTAGAAATTATTGTTGGAGGTATAGCCCGCAGATAACGTCCAAGCTG
TCCATGAAAATAGCACATGACACGGGCTGCCTACGGTAACACTTCCGACTTACTCGTTCG
GTATGAACATACTCGACCCCGCCGGTTCGACCTCAATCGT
>Rosalind_1720
CGCTCTTAAAGACAGTGCCCTTCCGAGGCTGACGGTCAGAGTTGCAATGAGACCAGTTCC
TGTGGTAATCTTACAACTAGCAAGTATGTACGCTTGCTATCAGGTATGCCTGACTCCATC
ACACGGTTTCACCGCAGCAGCCTAGGAATAACGCTATCTGAAGTTAGGGGGGATGGCCTT
TTATGCCACTGATCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCG
TACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTC
ATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGG
CAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATGGCAGCGGCCGCTGGCAG
TGGGTGACCATCACAATACCCTACTGCATTCTATAGTTCACGCCCGACTGCGAAAGGTAA
AGAGGCAGGGAGACAACCACACTTTCAGCTCCCGTCAGTGGCCCAGGGGGGCCCATAATC
TGATTAGCGTTTAAGCCCGTGATAGCTAGTGACTGAATCAAGTATTCTAAGACGTAGATT
TCACGTCGGGAGTTGTTGATGTCGCCCTTTACTTCGTTCTGGCGCTATTTCCAGGATCCA
ATTCGACGGCACCCACCTTTTGACATTCTGTTTAACCGGTGTCAAAAGCGAGAAAGCCTG
TGCCGCCGTATACGGTTCGGGATTATCAGCAGCCTGTCGTCCAACATTAGTCGATCGAAG
CCGCCCACGGCTGTACGGGATATACCAGGTGGCTGACCTTGTTCAAACTATTTTAGAGAT
TGTTTCGCTCACATAGGCCTTTATTTTTAGATCGGGTGGTGGAAGATCTCCGTTTAACAA
GCGTGTCAAGTATTCTGAAATTGCCGATTCCTTGGATTATCATAGTCGTCGAGCGTTACG
TTGCCTATTAGGATACATTTTTGCAGACCGCGGATGAGCT
>Rosalind_8072
CACTCGTAAGGTTATATGTCTTTTCCGCTGTGTTAGTAGCACTCCGCGTTGTGTGCGTAT
CCCCGGCGTTATAGTCCCCGCGTGAAGCAAAAAAGTCGGAACGCAACCTAACACCGTCTC
AACAATGACCCTCTATTTTCTAGCAAGTGTAGGGCTACATCGTTTTATAGGGTATGCGCG
TGCTGTACGTGCACTCCTGCAGCTGGCAATGACATCCGTCCTAGACTATTTGTTAAGCGG
GCACCGAGGCAGGGAATACGGGGTGTATCCTCAGGGTCCAACCTAAGCCATCGAGCAGAG
GCTATTTGGATGGAGCCTTAATATCCTGGGTAGGAGGACACAAGTCATAACTCTGTCGGC
CATGGACGCTTCCTCAGAGAGCTTGGGGTAAATAGCCTGAAGTCCGGTTGATGCTGTGCA
TGACAGGTGCCACGGTATTTTGGTGCGGATGCGGACATGCAATTAGTGCTTTTTTCAAGT
GCCCCATGGCCAATCTTCACACGGCTTAAAAATCCTAGTCTAAGGGAACTAGAAATTGAA
CCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAAT
CCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGT
ATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTA
ATATGAATCTCATCCATGCCTTCAAGCGCAAGCATAACGTAGCCATGATGTAACCCTAGT
ACCCCCAGGAATGAGAACCCCGAGTATAGGCCTACTTGCGCGGAGATCGCCTTGGCTCGT
TTACACGCTTGAAAAAGTCTCCTCCACTTTACATCAAGCCTCTGAAGCATCCTCTGTTAT
AGGCGCGAACGCTAAGGGTAGAACCACAGGAGCGCGATGTGGTATAACACACCTCCATAC
TATATTGCCGACTTATGCAAATTATAAACATTTAACAGAA
>Rosalind_8958
TCATCGAAATGAGCGGGCTAGCTGTTAATCCACTCAGCGACGAGTGCTAATAACCTGTAT
TTACATTTAAGGGTAGGGCTTCAACAAAATGAATTTCAACATGGGACCATACTCGCTAAG
GGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGA
ACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGT
GAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAA
CTGGTGGGCTTATTTAATATGAACCCGCCGCTACCCGACATCCGTCATAGAACTACTTCG
ATCCACCTAACGGATTCGGCATAGCCTTGCCCCTGCTGCTCGCACCGTCACCATGCGTGG
ACAACCTATCCGGGTCTACGTCTAAAAACCGTCCGCGAGCTGGCAGTTTACGCTGGAATC
CCCTATCGGGATAAGCAGAGACCACGAAGCGCGTTAACGTACACAGGGCCCATTACTAAG
AGGTGCGTGACGAAGGTGATGGCCCAAATATAAACTTACGAGATCGGATTAACGGAGTCT
GGTAAAGCTGCGGCTCGTGCCGTAAGCAAGGAAGAACATTTTGGGGCCCACCGGACCTGT
GACTGGAAATTAGTGCGATTTAAAACGTTTGAGAGAAAGTGGTGTCGTACTTCTGGCTAC
GCGTCTCTAGCCTCTGCACTAATTCACTTGACTCGTAATTGTACCACGCTCACGAGATGT
TGTCACATGGCGGCCTATATGCGTTAACAGACATGTCCAGATTTAGGTAGGCTCTTGCCG
TGGAACTCTGGATTAGCGGTAAGTGTCACCGCCTAGCCTTGAAAACTAATGGGGCCCTTC
GTCCCGGCAAGCCGCGAGGAAATCGGCAGCAGACAATGCGTCTGCAACACATCCGGTTTT
GTAAACGGTATAGCCCGCACCACTCGACTACCCCCTATGA
>Rosalind_8116
GGATTCACGATCTCCGAAGTTCCTCTAGGAGAGCATTGCATGTCTTGAGCTTAAACATGT
CCCCGGGAGCTCGTACAAGGGGAGGACGAATGCAAAATTGTACTTTTGGCGCCGGTCACC
TAATTGCGACGTAATAGGGATCATGGCACCAAAACTTCTTCGCTAGACATCTGCGCACAG
ACCCGAACGTTTACCCGCTCTTCCCACTTCTCTATGATTGAACGAAGAATATGCACCGTG
ACCTTCTACCATCTGATTAGAGGGCGTCCCAATAGACAATAGTTACCCACACTGCCCGAT
AGCGTACAAGTAGTTGGGAACATTCTGTGCTGTGGGGCTCCAGATAGTAGGTGTACTCAC
GTTATTCTTTCGGTGCACTACGGACTCATCAACTCTGAGGATGCGCTTTGAGCCCCGCGC
GGGCCATGGCGGTACATTGGTCCCGAAGCCAGGAACATAGCCGTGTGTGGGTGTGCAAGG
GCCACGGAATAGTAGACTTAATAGTAGTTTCACTTGCCCCCGCATAGGCCCAGCCCGCGA
GCTGTTGACTCCCTGCGATTGGGCCCATGGAGCGGCCCCGGTTACCCGACTCTATCTGAA
CAAATCTCCGCACCATGTGCTGTTGTAACAGAACATGTGCATTTCGGCCCGGGGCGACGT
AATGGGCCGACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTAC
AATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATT
TCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAG
CGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAGAGCGTTAGAAGAGGGGGATTT
CCTCTGTTCCTAACTCGGGTCGTCTTTCTACCAGCAAGCACCGGGCTATGGGCCTTGTGT
CGACTTTGCCTAAATCATTATTAGTTTATTCATATGTTTT
>Rosalind_6868
ACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCT
CAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATA
TGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTC
CTCTAACTGGTGGGCTTATTTAATATGAAGAGGTACGGTAACCGTTAAGAAATTTCCACG
ATGAATTGCGTCCTGATGGGAAAGGCCTAGAGTGCCACTTTATTTCCCTCGTTCGATTGT
ACGGTCCGAGGCTTCGTCGCAGGGGTTTCGCATACGCAAGCGAGCCTAGCCCCGGTATCA
GTCGCTGTCGGCATCCAAGCGCTGCTCAGCGATTCATGCGTGCACCCAGGTGCAAAATAT
ACCAGTGAGCTGTCGTACCGGCCTCACGCCTCCTAGCCCGCAGTCGCAATCGCCGCACTA
CAGATAATGGATTAGACGCGGATCAAGATCCTAACAGCGTCGCTGTCGTTTCTGACTAGA
GGGCACAAGCCCTTTAGGTGAATTGATAGGAAGTGCGATTGCTTATCTGATATACGCGCT
CAGGTCTTCGCGGGTCGTAGGCGGGTCTCGCTAGACCTATGAACTTGTTGGGCCGGATTA
TCCAGCCTTATGAGATTATAATCATAAAACTCACATGACAAGCATTCTCCCAGTTCAGCA
GAACAATCTTTGATCGAGCTCGATCTCTATCGGGGCAGCGACTAAGTGATCACTACCAGC
CCCATGCTTCAACTTGCGGTTATGTCCAGTCATAGCGGTAGTGTTTAACTCTCAGTAAAG
CCAACGAAAACTGGTTTGTCCACCCCGACAGTGCCCTTCCCGAATGGCGAAAGTTCGACA
CCCCTTTCCAGTGCGGCCCGCTGTGGGCTCCTAGAGTTGGACACGTCAACCTACTGGATG
GGGGGTGTAACGCGGAGACCAATGTGTAGATGAGTGCCTC
>Rosalind_0240
CTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTC
AAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATAT
GCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCC
TCTAACTGGTGGGCTTATTTAATATGAACATTGCACGCCCATCCTTCATGGGCACCTTCT
CCGAGTGATGAGTATTATCTGCCTACGTAAGCAGTTAGCCATAGCCAGTTGAACCACGCG
CAAACGGTTCGGTCCCTTGCTGCCAGCCGTTCCACTCATCGTCTAGACCTTTGTACGGGA
GAAGTAGCAGAGTGCCCCAAAAAGTGCGATAAAACCGCGCTGGATAAGGATTCGAAAGCT
TCGCCAGAAGGGCTTATGTCACACCGGGAATTAGTTCAGCTATTTCCCTTGTACATTTAA
GGAGACGCCAAAATGACTACATAAAAGCCACTGTTGAGGGCCTCTTATAAGAAAACTCCG
TGAGTTCCTGGGGCTAGACGAGAAGCTACAGCTTCAAAGACGACCACGTGAGGACAAATC
GGGACCATGATAAGACGCGCATATCTCATACCTCGGGCCCTACTTTATGCATTAGAAACC
TCCTAATCTAGGGTCATCGTGAAAGCTGTTCGCTTATTAGGATCTTGACACAGAAACAGT
CCTGGGAATACTTTCGGTAAGGTAAGGCTCGAGCCAATCCGGTCTTGAGTGGCCTCTCAT
ACCGACACCGGATTGTACACGGCAGGATACCTCGCCAAGTTCAACGGAAATCGGCATCAC
CCGTGCTATCTCGAAGATTTGGTCCAGATAGACATAGGAGCCGAATGTCTCACACATAAC
TAATGTTATGGCAATTTGGACAATGTGCAACTTTTTTTGCGTGTGTCTCAGAGAAACGAC
TCGCCTTCGTCCTACATCTAAAGGGGATAGATACATGCTC
>Rosalind_8165
GTGCTACGATAGCCTAATCGAGCCCATCACCGCTCTCTTCAAGGCTACCTTTAAATCATC
GGTAGAACCTCCAAACCTTTTAATGTGATTGCGCATTTGACCCAGTAAGTGAAGTCCCTA
TAAACGCACGCAGACTTAACCACTAAGTATAAGCGCTAAATTAGTGAACGGTCGGAGAAG
TGCGTTGCTTGCACGAGCCTTGGTCTTCGGTCGCATACGTCGTCTTGGGAGGATCGTCTG
TTTCCTTGAGGTATTGTTTTCCAAGCTCACGGTTGCCAGGTTAATAGTCCGCCGCTTGCA
AGTGAGAACTGAGCACGTGCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGA
GCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATG
CTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACT
TATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAGAATTGTCCAAAG
AGGCAATACTTCAGACGGCTGCGGCACTTGAACCGTGGGGAAGTACTTGTGGTGTATGCG
CTCCACCTTCTGGAAGCTCACCGCCGTCTGCACATGGTGATTAATCTCCAGGAATAGAGT
CCCGCATGTCCTCCGAGGCCATATGAATATCATCACAATCCGTAACCCTTTTTATATCGA
GCGTCGCGCCAGGGGATGTAAGGATGCCGGGCTGCTGGTACGTGGGGAAACTTCATATGC
TGTGATCCAGTTCATCAGCGTAAGCCTGCATAAAGGCTGGCTAATTGGAGTGTATTCCTT
GGCACCGAACTGCTAAAGGATGGTACGATGGTCTCACCCAGTGTGGAGCCACATATACAA
CCCAAGACATACCGGGTTCAATTTGGTGACTACTCAGGTATATCCTGATAGCCTACAAGT
AGGACCCATACCAATTCTAAGTCGCGCGCTCGGAGTCTGG
>Rosalind_2755
GCTGGTTACAAACGAGACATCGGGTCCTGCGACACCCTATGTACCCTGGCAGAACATCTG
TACTCCTAAGTGATGTTCGACATTAGTTTAATTGCTGTTGAATCCTTCAACGTTTCCTGG
GGCTGCGGACCTTGCACGGTCTGTTTGGCGATCTTACACGTATAAACCCTCTGATGGCTC
AGGTTGTCTTCCGACTTCTAGCTGAGCTCAGAGCGTTCTCCACAGGTGCGGTTGTGATGA
ATTTGGATCAATCGTAATAAATTGGTCCTTGCAGACACTTTATCCGTCATGACTAAGCCC
ATGTCCGCATTCCGAGTCGAAGCGAGGGCTCCACTTGCTATAACATCGCCAAAGGCAATG
CTGGGTCTCTTAAGCTCTTAGCAACCCACTATCGCCTATCAGGCGTACAGATTCTAATTA
CAACAGACCTGCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTA
CAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCAT
TTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCA
GCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAATCCCAGTTATTTGTAAGGGA
CGTTTGAAGATGCCCCTGCGATCGGGAAACGTACCTTTCTCCTATTCAGTAAGGCGAGCG
TCGCGAGCATCGGTAAAGATATTAGCTGGTTATCCGGACAAGACTCCCGTCAGGCGGTTC
GCGGGGATACTCTGAAATGCAGCCTTGGAAAGGGTTACCTAAACACCTTCTCCGCAATAT
TAGACAGATTCCGGAGACCCGGGGACCATGAGAGGGCCGGACGATCTGGTCTAGCTCAAG
TCTATGAAGCCTACGCCTACCGGTCCCACTTCCCCTGCCCTCATGGAGTATGTGAGTCGG
CCGCTGCCAACCGGACAAGTCGCGTCCAGATGGACATCGC
>Rosalind_4623
TGTTTCCACCTCACTCACCGTCGTTTTGCTCATGAAGTTTCAAAGTCCCACTATGACAAA
ATACTTATAAGGACCCTACTAAGAACCCGGTTTAAGCTAGCCATTCAAAACATATCGTGT
GCGTTGTTAGTAACTGGCGGAGAAAAGTTGTGCGTCATTTCCTAGGAGGGAAAAAATATG
GAAGACCTGCGACACAATGTATCTTGCTTCCCCGTGATAACTTCCACGTTCCATGATGGC
TTACTCACGCCGCCTCTACAATCGATTAGAGTTTCCGCGTCCCTATATTGGGGCGGTTCT
TGCAATTCCATCAGTCTTGGTCCCCAATTCTGGACGCGGCAGTACGACTGCCGAGAAAAC
ATGGTCGCAACTCTCAATCGAGGCTGGTGACGGTCTACTCCCTCCATGTTAGACCCATTC
AAAAAGTAGGTATACGCCAACTATGTGTTCAGGTCTATTTTTTGGGCTGGGACATAAACT
AAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAA
GGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGC
TGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTC
TAACTGGTGGGCTTATTTAATATGAACTGCACGCGGTTTGGATACATGTGAAAGGGAACG
TTACCGCTACCCCATCTCCTAAAAGTCTAGATCGAATTGCAGACAGGGGCTAGGGTCTGT
TCAATGTCTAGCAAGTCAATTGTGAGAGCCAGTAGTCTTTGGTTCAGGGTTCTTTACTAC
GCGACGTGAACCTCGTACCACGATATGAACTGATCGTGGGATTTGTCACTCCCGCACCAT
GGTGTCACGATTGCCACTAGCGAGTCACGTTACATACGCATTAAGAGACAGACCTGATTT
GCGAAGCGCGTCGTACTAACTTCAGATCCCTACCACCGAA
>Rosalind_9519
GACTTCAGCGCGCTGTGCGGCAACATACACTGCTAAATGCCTGCATGACACTTTTGCCCC
TTAAGCCACAATCTATTATATGCAGTGGGAGATTATTACTCTAAAGCTAAACGCAAGAGT
GGGGTCTTGTTAAGCGCGCCCCGTCGAATCGTACGCGAGCTGGAATCTTGTGCATTACTG
CAATTTTACGTTTACTGAACTGTTGAAGGCTATGCTACTTGAGGCCTTCGCACTTAGGCG
AGTCGCAGCCATAAGGTAAACGGAGCCATAATCGACGGTAGTAGCGCATTGTCACTCCCT
GCTCCTTCGGACCGGCCTCTCACGTTGACAGCCCTGCCGAGCCTGACCACTATGCATTAT
AGGGGCTAGACGACGCATCCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGA
GCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATG
CTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACT
TATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACTGCTTGCCGTGA
ACGCTATGTTAACTAGCGGGGCACTAGGTGTTGCGACTTGGCCCCCAACTTGAGATCTTC
AGCGACGTGACATGTTGAAAATTAGGCTGAGGAGTTCGAGAAATTGGCTTCAGCAACTTC
ATAACGGTACTCTGAACACGTCATGGTCTTGCCGCTGTACACGGTTACAATCATGCAGAT
AAATTCAAATAGGCGAAGGGCGTAGCATAAGTGGCCAGCATCGGGGGTAGGGTAAGTGTG
GGTTGTTGCAGATGGAGAGCTACCACTGTCAACACCCTTGGGATGAGTTTGGGCTAGAGA
GGCAAAGAGAAGACTCTACTGAAATTTGGCCCAGGGCCTCAGACTTTATTAGTGGCAACG
AATTGGGTCGGACTCATAACGAAGGAAACGCAGGGCAAAT
>Rosalind_0639
GGCGCTGGAACCCAGCACTATTATCTGTGACTCTTTGGCCTGACGGCCCTGATTTACACT
AACCAGGTGCTATGGTATGATTGGGGCGGCTTAGTCGGGAGAGCTGTCGTTGTGCCTTGA
TGGATCATCGCGAATGGTACTTATATACCTCAAAAATAGCCAGGGTGATCCCGACAATTG
GACTGTGTCATGGAATATTCACCGGGAATAAACAGTGTGGAGGTAGGAAAAGAGCTTGAC
ATCTTGACACCCATCGGTTGAGAGCAACCACTGCTACCATATACAACGCCTCCAAATACG
GCTACGCGACTTTACTTTGCAAGCTCGGTTACAACTGTGCTTCAGGAAAGCGTCCATGCC
AACGTGGGAAGGTGAAAAAAGCACTGCACTGTTATGGCTTTCAGTGTAGATGCGCGGGGA
CCTATGAAGACTTAGCAAAGCCGGCTGCTTGAAGACCTCTTCCCCTGCTTTTGTCCGGTC
GAATCGAGTACCCGGAACCGCTAAACGTGGTGAAGGTCGCCCACCTGGGAATGCCAGAAC
CCGATTTAATTGGTTCGATCCTTTCTGCAGCTAAGGGAACTAGAAATTGAACCTAGATCG
GTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGA
AGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCC
ATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACG
GGTCGTACATCAGGGTTCCACACCGCCAAGGCACCCCCTAAGGGTATTTGTGGGCATTGT
ATTAACATTCGTACGCATATTAGGCCTTCTGCCGAGGTCTAGGAGTACTAGATGGTCAAG
GGGATGGACCCGAACTACTTTACGTCAACAGCGTATGCCGGCAGGACGTCCGCGTGCTGA
ATCTGACTATCCCAAAGTTTATACACTATACCTTGCATTA
>Rosalind_4187
AAAGACAAGGTTCACCTATCTCGCCGCTCTTTAAATAACGTTCAGACTTCTGTTTAGATC
CCCGGACCTAAGTACCAAGTTATAAGAGAATCACACAGGTTCACTAAGGTGAGCGGGCCG
TAATATACCGCATACACAACAACGGACCACCTAAGGGAACTAGAAATTGAACCTAGATCG
GTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGA
AGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCC
ATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAGG
CCCGTGGCAAGCGGTGGCTTCCCGCATTATCTCTACAACACGCTTATAAACTACTGGTCT
CCTATTGACAGCGATAGGAACAACTATGCTTCACTACCAAAGTAAGGAGACGGGAGCGGG
AATTATGAATTGTAGCTAAGTCGTGTATTCGCTGGTTGTGATACATGGCACGGCCTGCCT
TTCTGCGTATGTGCCATACCACGCTCCTATGTTGAAGCGCAAGTTTCGCGGGTGCTAGAG
TTAAAGCGACACTGCTGAACCGCTCTTCGAAATGGACACAAACCGTCACCAACGAGCGCG
GGTCTGTAAAGTGCGAACCCATTGGCTCTATGGTGACGAACACTTCGGCGCATATGTAAC
GACGATAGTTTACTTGGGCATGCCGCACGGCGAGGATAATGCCCCGAGTTCTATAGTGAT
GGGTGATACCGCGACAGTATGGGAGAAGAATGCCTACACGCACATACCGCTGATTTCATG
AATTCTGATCGAGGAGGTACGATATAAAGGATGGGATTGGATGGGAGCGAATCGTTCCGT
CTTTGTGCGCAACATCTCTATTCGGTGCGGTGGAGTCTCGACCCCTGAGAATCAGCATAT
GATTAGTGTTCGATTCCTGCCAGTTAAGAATCCGTACAGA
>Rosalind_2625
TGAATACGCTAATAATACCCGATATAAATGTAGAACATGACCGAAGGGTGCGTATCAGGA
CCAATACTCGCTGAGATTTGACGACGTGCATCAGGAGTCCCTCAGATCATACTTTGTGTT
CGGTACACACTCCGCAGTAAACAGCAGTGACGTTCACAACAACTGGGTCTTCGAAGCCTT
CAATACGGGAGTCAGACCGTCGGCCTTCGTTTATGGACTCGCGCTTGTCATCTATTAACC
AGTTAGATTCGGGGTCTAGGAGTTTTATGCACTTTATTCCCAATTGTGAACAGTTTTGAC
ACCTCACTTAACTGTTTAGTTAACCACGCAGGGCCGGTATGGGCTAAATCTATTCCCCCC
CGTTACTCTCCCCGTAAGGGACAATGTCATCACTCCAGGTCATCAGATAGTCTGGAACCA
ACATGCCCCAGCGCATAATCAATCGTGCAGTTAAGCTATGTTAAGAATCCTCATCTCTTG
GACCAGACCAGGGTAGGGGTAACAGCTACAACTATGTACACATGCGTGAGGAATGGCGAA
ATTGTCCCAAGACCCGGCAGAAAAGGGTTCACAGTAAGAGACCCAATACGTAACCAAGTC
CAATCAGGGAAGGCAGTCAGCTAAGGCGCCGAGAGCATTACGAAGCAATGTACGGTGATT
TGTTAGTCACGAGTCGCCATCGAAACCCCCCAGCGTCTCCTTCGGAATAGGCGCTAAAAC
AGCGACTTGGCCAGCAAAGTCAGACACTAAGGGAACTAGAAATTGAACCTAGATCGGTAT
GTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGA
CTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTC
TAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACTACGT
GTCTTGCGTTCGTCATGCACCCTTCGCTGACACATTTAGT
>Rosalind_9989
TTACCCGCCGCGGCACTCCATAAAAAACAGCCGAACAGATGCGGCTTCTTACGGCGATTG
ACAAGTGTGCGAGGCTACGGTTATGCGAATCTCTTAGCGAGAGCATCATTCGGGTCGTGC
TTCCGGATATGCCCTCGCTGGTCTCTAATGCCCACCCACTTTGACAGTACACGACACGAA
TTCGTATTGCCTAAGTGTTATAGCAGGAGTTAAAGCCTAAGACTAAGCGTCGAACAGGTC
GAGCGCCTTTCCACGTTTTCATGGACTGAGGACTGCATTCATGCGCTTTTAATCCGATGG
GGGCTGCAATCTTGACGCTAGGGTATTTGCTACAACCGGACAATCAAAGATCTGCGTCTA
AGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAG
GAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCT
GTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCT
AACTGGTGGGCTTATTTAATATGAAGCATACATGTAAACTGGGATTATTTGGCACGTTTC
GACGGTCACGAGCTCCTTGTAACGCCTGAGGAGTGAAGACAACTATGTTATATTCGGAAG
GCCGCCAACTAGTCTAGGGCAATTACTTTTCGTCGCGACCAGAACACCAGCATTCTCCAT
ATTCATGTTTCCGGTAGACCGAGGCCCTTGTGGAGATGGGAACAGATTGAGGCGGCAGCA
TGAGTCTGAATCTAATTGCCCTCGTCTACGAAGTAAACGACGAAGACACATCGAAGCTAC
GTCGCCGCAGTCGGGGCCACCACAGGATAAACGCAGATGTGAATGCAACGGGAGATTTCT
AGCGCCAACCGGCAAAGAGCGAACTGCCCTTTAGAAGGCCCGACGCGAACCATAAACAAG
AAGGAGGGTATGAGTGGTATGCTCCACAGGTTTGAGAGTC
>Rosalind_8411
CGTATCGGCCCGACTCTTACCTCGGTCGGCGGATAGATATAAAAATTTGCAAGCGCCGGC
TTTGGTGCAACCAATCAGATCACCCCTCCAGTCTTTGGGAGAGGTGCGTCGTTTCAACAA
TTCGCATCCGTATACTTATACTTTGTACCTATTGGCTACAGTGGTGTGTTATTTTCTGTC
CTTGACGCGCGACGGCACCGCCCGAAGTAAACCTGTGTTATCGTGACAAATATCATGTGT
GTTCGATTTACGGTCCGACTCAATATTACCCGATCCGTCCAAGTACCAACCTCTCCACCA
ACACGGTGCTCAAGCAAGAGTCCGAATCGAATTGGCTATCAGGAGCGTTAGGACTTAATG
CTTATTCACTCTGACGTGGGCGCCTGTCCGTCAGGGCTCTACATTTTCGGAGGTTGCTAA
TGACGAGTTTGGAACCAATTGGGACCATTGTTACTGCCTTTGCCGCTGTCCCAAGGAACC
CGATAGATCCCTTCAAGTGAATGAGGACGAGGTCGTCGCACCCATTCGTGTCGTTATAAC
AGTCACGTGGTGACTGGGAACCTCGCCGGTCGTAGCAAATTCGTCTATTGCGCCAGATGA
GCCCAAGAGTACAAGTGAGGTCACGCTGTTATTGTACCTACGCGTTACAGTACACTAACC
CAAGAAAGTTATCAGCTGTTTGTCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTC
GAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTT
CATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAG
GACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATATGTTTGG
ATAGCACCGACTAACAGGAGGGCCAGTGTACGTGCTGCAACACAAGAAAGTCCTTCGGAC
TACCTTGTCTTAATAATGCTGGGCCGGATTTATGATCCGT
>Rosalind_0705
CAGAATCTGACCACGGCATCTTGCCATCGTCGTTCCAACCTTGTAGAATGTGCTTGGAAC
GGACTGGGGGTCTTAAAATCTGTTGTTCATAATCCTGCCATGAGTTGCGTTAGTGCGCTC
CGTCGTGGGCCCTTTGTCGGGGCATAACTGCTGCTTGCACACGTAGGTATCCACGGGCAG
GTCTGTCTATCCGGGTTAGAGCGAAGCGAGTGCACACGTCGCGCCATCTTGGGATCGGCT
CCCAAGTACACCGTTTAAACGATTAACATGACGTGACTATGTAACGTCCGATTAGTGTCC
GAGAAGACATTGAAGTTCCGGGTAGCCCAATGGTATACCCGGTGGCACTAAGGGAACTAG
AAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTG
ATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGG
CCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGG
CTTATTTAATATGAATTGTACCGGCTATTGGCTAGTGTAGTGGAACACCGCTTGTGAGGG
TATACGCAGTTCTCATAGAGCATAAGGTGTTCTATCTCATTACCATGGCATGCGTGCTGC
CCCCACGGCACAAAGACATCATATAAAGTATGCCGCTGTCCACGAGAATGTCAATCCAAG
ACCCAGCCCCTGCTTATGGCCCGATAGTGCTTCGCAAGTCCGCGTTTCTAGCGCATTAAG
TCACCATGCGCATCCTGTGAACGTGTTTACGCACCCGTGGAGTTCTATCAACTGAACTCG
TGGTAAGCAGGTAGAGAGGTAATTTCAGGTGGATATAAACCGAGTTCCCGTGGGCAACGC
TAATTGTTAAGTGAATGCATATCCTTGATTCGTAGAGTGTGGAGATATAGACGCTTTAAC
AGAGTGCCACCAACGAGCCGGTTTTAAGTGCCCCGTGTCT
>Rosalind_9628
AATAGGATACGGCTCCTTCCTAATATGCAGGGCGGAGTAGTGGCCCGCTCCGCCGTAGCT
TACAATACGTGCTGCTTATGTAGAAGTAATTCAATCCGCGCAAATGGTAAGGAGGTTAGT
AAGGGGGTCGCTCCGAAGTGTAAGACGGGGCTAGTATTTCGCGCAGGCTTGTTTACTCTT
AGTTCGGCTATTCATCCAAGCACAAGTATTTCCAGACTAGATTCCCTCCAAGCGACATAC
TCAGTTTGGAAATCTTCCCATAAATCTCCCCTTTCGTCTGATGCTAACGCTGGTCAGGAT
CGACATAACGAGCCATGTGATGGCGGCCTTCCGCCTGCCTAACTTGACCTTACACGTAAT
AAATGTAGCGTCTTAACTACCTTTGCGATCGTCCACTGAGCGCTGTTAGACCTTACTGCC
TTGACACTTGGAGGCCTAAGTACAATACTATACAGCGTTGCGGTCACCGCGCAGCCATGA
CGTGAAAGGGAAAATGATCAGTATGTTCAGCGTCTCGTAACTAGACCTCCGTAAGCCAGC
ATACTGTCGACTACCGTACACCGATACGGGCTCCCTCACATGTGGCCAGTGCTAGCCGGG
GCAATCTCCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAA
TATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTC
TAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCG
GAGATTCCTCTAACTGGTGGGCTTATTTAATATGAATATGATGCTGCTTTTCTGTATCGG
GTGAAACGACTACTCTTCATTTCGCAAAACTACGACCATACCGCCCCAACTTGCTAACGT
TTCCGAGGAGTATGTATGCGACACGTTTGAGCGACGCTGCCACCACGATGAGGTGTAGTC
TACACGTTTAAGTGAGTCCTGATCCAATGCTTTTGCGTCC
>Rosalind_0215
CTCAGCCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATA
TGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTA
ACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGA
GATTCCTCTAACTGGTGGGCTTATTTAATATGAAGGCCCTACCACCCAGGGGGGGGATCA
GCTGGCTAGTGGAATCGCGGAACCTGTTCTCATAGCGCGCTACGAATAAGGTGCCCGGGG
ATACACATAGACCCCATTTCAAACACCCATACTAACCCCGGCTCATCAATGGTGAAAGAT
AAAGGTGACACGTCTTTTTTGTGGTTATAGCAGCGCTACGATCGTGGACTCGCGTTTAGG
ACGTCTGGCCAAATCGTGCGAATTGGGAGTTAAACGATGCAAAATAAAGCGGCTAGCATC
TAGCTGAGTATCGGGACAGTTTTTAGACTACACAGGAGGTACTCGCTCCCGATCAGTTAC
GTGCTTCGGGGCTCCTCTTTTTAAGGGGTTCCCTCCAACAAGGTGGATAGAGATATCCAA
TAGTGGCAAGGTCCTGATGGTGTCCCATAATGGGCTTGACACTTTTCAGCGGGTCCATCT
GGCCTTACTGTTCGATCGGTCCTCGGCAAGTGAGATTTCAGTCTTTTTCGTGCGCCATGA
ATTGTCTGTCTATCGTATCTGCGGGAGCTGTGCGACGGCTCCAAGAATTACAGGGCACTC
TCCTGTAGTTTAGGTCTCCTAAGCAAGGATTGTCTGCGGAAACAGAATACCTAGCAGCTG
CAGGGAGGTTACGCACTGACACGCCGGCTCTTACGCATTCTCTAAATAAATGGCTCCTCC
ACCCCGCGTGCATATTTTAATATCCGTCTGAACTCTAAAGTTTGAGGCGCTCCTGCAGAG
AGGTACACCACGACGCACGCTCACGTTCGACAAACCGCCG
>Rosalind_2718
TACGGATGTCAGAGATTCCAAGACCGGCATGTTCTAAGGGAACTAGAAATTGAACCTAGA
TCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCA
CGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTC
GCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGA
ACTATTATGCCCAGGTGATGACTGATGTTGACGGAACTCGTCGTCATCTGACACTCTGCG
AGCGCAGAATGGCCAGTACGCGTCAGTGGATGAGTTTTAACAAACTTACGTTGGGATCGT
TAGTAGCTGGTGTGAGCTTTCTTGCCTTTGGCAACCCGATAGGTATATCATAACATCTGG
GAGCTTATGATTCCTTAGCCGGTCCTAGGTGGCTAGCTATTCGTGGTCTCAGATTTCATC
GCGTAAGGTTCATCAGATGACGTTCTGCCACGTTTTGGTAAAGGCCTCCCACGGTCAGGG
AATCACGAGCACCTGTCATCGCCTAATAATGTTCAAGTAAACAGTACATGAAGTCGCTAC
GATATCGACAGTTGGAGGCATCCTTCGGGTCCAACCCGATAGCATGGCGAGATCATGCGC
ATAGTCGTTGGTCAGACTCAGCTTCTCAAGACGGATCAACGCAACACCCATCCGTTGGGC
GGAACACCATCCTGCGCGCTATATCCAGCGCAGGGTACGACTGCTCTTGGCCAGAATAAC
GTCGCCTGATACTCTGGTATATTTGAAAGCTTTAATAAATGGAATAGGCTGCTCGCTGGA
GCGATTTTCCCGGATGCTCTTTCAGGTATCGTGACCTAAACAGCACGTGGAGGGAGCGAT
CGCGCACAAACGAACTTCTCGTGCATGATTTTGTCCTAAGGAATCACAGATGCGGCCTCC
CCGGAGTTGTAGGCCGTAACGAACCCTCGCGTTAGCTGCG
>Rosalind_8244
GGTGTGCCCCCGTTTCCACTAGGGTTGGTCTTGAATCAACCATGAACCGGAGAAGGAGTA
TAAGCTCAACTAGTAGTATGTTTCATCGAGACAGTGCGACTCGTGCCTTCCATCGACTCC
ATTGCATCCATTCGCAGTGCCGTTCATTTACAAACTATCTCACTGACCGTGGTACTGTCC
TAGGCTGTTTATTCAGTGGTTTGGAGCGATTGAACTTTGAGTCGTAAAGATGTCGTCGTG
GGGCCCCGCCGATACGTTCCCGGTCGTGGAGAGACGATCTGCAAGACTTAGAGCACACAG
CAATAACAGTGTAACGATCTATCCTATAGTACCTTGATGAGCCCCCCGGTACTGCACAAG
AGAAGGTCCCGGTCTTTATTATAACTGTATAAACAAACTAGTCAGCGGTGCTTTACATCT
CAGAAGATGTGGTATTCCTAGAAGACCAGATCGGGTGACAGTTGAGATATGGGAGGGTAT
AACCAAAGAATATCGCCTTCCGCGTTATGGACAAGATGCTGGGTCCTTTCGAGTCAGTTA
GACGCTCATTCGTCGTCTGTATGTACTAGTGAGTGGTCTTTGCGAATTTCACTAAGCTGA
TTCGCACCCCCGCCGCCGAATAATACCAACATTCGGATTCGTTAGCGCGCTCAACTGTAG
TGCCTAAATAATTATAATCGTCTTAACCGTCCGCCGCAATCAGATCACGGGTGTCGGGAT
GCGTTATCCCGGACGCCGGGCGTACCCGTGGGGATTTCCTAAGGGAACTAGAAATTGAAC
CTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATC
CTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTA
TGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAA
TATGAAGTTAGCTGCATCAGAATTCGCTTACTGAAAGAAT
>Rosalind_1908
GTAGGCTAGAACAGAGAAAAGCTGTTGTATCCGCATAAATATAAGGGTCTGATGGCCTCG
AATCCGTAAGGCGCTTTCGGAATTATAGCAGGACTGCAGACTGGGGACCGGCTCCCAATC
GCTCTTTAACGCAGGCAGGCTGCAGGTCATTTAATAAGACCCGAGTACTATTAGCTGGAG
AATCAGCGTACTGCTCGGAAGCGGAGATTCTACTCAACGATGGTAGAACACCTCAGGAGG
TTTAAGGTCCGAAGGATATGATGGCAATCAAATTCAAGATTAGCCGGTCCCACCAATCTA
AGGGCTGCTTTATCTATTTCCTCTCATGAGTTGTCAGTACTTAAATGTATGTGTGTGCTG
GCCGGTTTTACTCTGGCTACAACTCACGTTATTCATACGAGTAAGGTACAAGAGGGCTTG
TATTTCACATTACAATACGCGACAGGCACACGGCAGGTTTACCAGGTGACGAATGAGATG
TAAATTTATAAGTCCATGAACCTTTGAAGCGAATCTGGTGCTTTAACGCGGTCGACTAAG
CCGTATCGCGCAGTTTGTATCATTCGGCAGGCTTTCGCCCTGCGGTGGGACGTTAGATAA
CCGACGTAAAGTAACAACGATGAGAAGGACACATCGCCTCTCTTAGGCTGAGAGGTTCGT
CATGCTAGGAATCGGTGATGCTGATCAAGCGCTTGCCGTAGTTAGATAGTGTTGACATAC
CTCATTAAGATTAATAACGCTAAGACTTCGTTCGCTACCCTAAGGGAACTAGAAATTGAA
CCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAAT
CCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGT
ATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTA
ATATGAATACACTACCGTATAGTTCCCTTGTGGCATAGAA
>Rosalind_0969
GGGAGTCACCCCCCTCTTTCTTCCCTATACCCGTTGAAGTCGCACGCTCTGGTGATGTAA
TCTAATTGCTCAACAGGCTAGGGTTTGTGCCCCATTACATGCGTCAAAAGTAAGGTGGTG
TAGTGTTTCGATCGGGGGGAGTCTATTAAGCGTGTTGGGTGGTATTGACCCCTAACGTAC
TTCTGCCAACGAGAGTTGCTGATACTGCCTACGACAAGAGATAACGCCATTCACTAATGC
CTGTCCGCACACACGAGCCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAG
CGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGC
TAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTT
ATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACTATCTTGGTACTT
CTGCAACGTCCAAAGTCTTTCCGCGTCCCATAAAGAATAGCAGCCCAGTGCCTTTTAACG
TGCCTGATTCCGACTAAACCGAGCGAGTAGTTAAGAGAGATGAAATGATAGGTTACCGTA
GACTTGCTAGCCGGATGTTGTTAGCGTGCCAGTGTCCGTGTCAGAATGAATTGACATACC
ATCGAGCCGCCTACGTCAAAGCCACTAAATTGGGGTTAGGGAGAAAGTGAGGCCCTGGAT
CCGCTATCTATAAAAGGTCGCTTCATCAAGTCACTCCTATCTGTATTAGATTAGCGGGAA
CTAGCCCCAGGCGTGATTCGATCAGGCACAAAAGCAGACAGGTGGGACTGATCAGATCAA
GTATCGTAACCGGACAGTTTGTCTCGTTAATAGAAAACACGTGCCAATTGGCGGCCTTCT
TCTGGTACCGCGCGACACCTTAACCACCTCGGCGTAGCGCCTTTCTTCTCTCGCTGTTCT
GGGCCTACCATTTTCATAGGGCTGTTGTCGGCTATAGAAA
>Rosalind_8395
TGGAGGAGCATTATTTATGGTCTGTGGATTGAGAACCCTTCGCTCCGCAGAGAATCCCCC
CTCACCTATGCAGCCCCCGTTCAAGTCTTCTTACGCACGAAAGGTGACGTTACCACTACG
TTCCCTTCACGGGGGCGTGAGAGTATGAACGGTCCCTTCCATTTCAGTAAATTTCAACGA
TTACAGGGCGCAACCCTGCGTCATCCACCTGGTGCACGCAGTAGGGTATCACGCCAGTCG
GTAAGGATGGCTTCTAACAGGTGCTGGTGGGGCCTCGCGCCATCCCATTACTGGAAGCGA
GAGGCCGGACCTAGCTCCCGCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAG
AGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCAT
GCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGAC
TTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAAGAGACGTGGGG
ACCGAAGTGAATCGGTGTCGGAATAAAGGCACAGAGTTTTGTTCAGGGAAAGATCTTCTG
TGTAAGGGAGATCCTTCCAGAGCACCGGTAATCCCTGTGACAATAAAGAATGCTCGCCTA
TATCCATGACACTCTCGTGTTCCAAACTGGAACTGTAATTGGGCCGACAATTCGCAAACC
CGAGACATTGACAAGGTCCTTGAGGCCCACAGGTCTTGCTGTGCTCTCGACGCCGCGAGC
GTATTCTAGAGCGCCGAAGACTTGCGCGTTATCATTATTGACGTCCAAGCGACAGCTGCA
AACATCACTTGTTCGCCGCATGCTTTACTGTGCAATCTATCCAATACTGGGAATATGTCT
CTACCAGTTCCTTGCAGTTCCCTTCGGCGATAAGACACGTTCTCGGAGAGCGACTCCGAA
TAGGTCCGAAGAGCGAGTCTTGAGCCCGCTAATACCTTAG
>Rosalind_4929
CCAGGGCAGTACTCAATGTTATTGCATCGCTATTCACCTCCCCTTCACAAGCCTGTTATA
TGCGAAATAGGAAAGTCCTAGCATGGAGTCGACGAACCATATAGGCATCTAAGTCGACAC
TACAGGGTCCGAGTTTATTCAGTCTAGAACTATATGGCCTGAGCCCGTTCAGCCGCGTTT
GTATATAGTCATGACCCAGGAATCAAGTACCGTTAGTGATGGGAATCATCTAAGGGAACT
AGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGG
TGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTG
GGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTG
GGCTTATTTAATATGAAAGCCTACGCTGCTGTTGTACCATTTCCAGTGTAAGCTTACGAG
GTACATGATCTTACGACGTTAGGCTAACGGTCGCAAGGGTCTCACTACCTACTTTTCACT
TCATCAAATTTTCAATTGTTGGACTGAACTTGTCCTTTTCGGCTGGCCGAATCCACCACT
AGGCGTCATTGGGCCCTCTTCTGTTTACGAGAGCCCGGTTGCAGACTCGTTGGAGTATTG
GAAAGGACTTTCTCGCACCCATGCTAATTGGGTAATGTCGCTAAACGAGGCGGTCGTAGA
TTCACGTTTACTAGGCTGGCTTAAGAGCCCACTCGAGATTGAAAAAACAGAGCGTTTTAG
CTAAGTACTATTTTATGTCTCTGGTCAGTTTAAGTCTCTCCGTGCATGAGCGTACACTGT
TGGAAAAGCATGTACGTTACGGTCGGACTCGTGTTCTAAACTCGGAGTGTTTGCTACTTA
GATAAGCTCCAATGCTACACCACTAGGAGATTGTCTCGATTCCTAGTGTTAAAACTAGAG
ACGACTAGCCCCCCCCGTGCCCCATGGCGTCACTAGCCCT
>Rosalind_2270
TCTAACAAATGAGTCCGGCGCGTTGCTGGTAGCCGAAGATAATCACTGATGCGTCTAGCC
TTTTAGCAGACATAGTACCTATGCTTCATAATCTCCGATCTATGCCACGAAGAAGCGAAG
AGCTTACAAATAGTAGGTCTGTGAAATTTTGCTGCTGCCGATTATGGGAGTCTTGGGTGC
TGCGTACGCGACGGGCTAGCTATGGCCCAGTACCCGAGAGTGGTTATTCCGTATGGTGCG
TATAACGGTAGCCTGGCTACTATCGGAAAGGATACCCTATCAACTAAGGGAACTAGAAAT
TGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGA
TAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAA
CCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTA
TTTAATATGAAATCTGCATGTCTCCGCTCATCAGGCTGACGTTGTGCACTGTGAAACTTC
TGTCCTTCGGCTAGCCACATATTCGCACTCCAATCCGCTATCTTGTATGGCAAGAAATAG
TGTCGCGAAGACGAGACGTAGGAAAAGGCTCCAGAGAGGTAATCTAACGTTTGACCTTTA
GTGGATGACCTTCGGTTACCATGACAAAGCTGTGTCCTCACATCAACACATACATCTCCT
AATTTTCCGGCCCTCAGTGCAACCATATATGGGTTGGCCAGAACATAAGCGAACTCCGTG
GAACCGTAGGTTTTCTCAGGAACTCCGGGCGAAGGTCGTGAGACCGAGCTAGTCCTTACG
CAGGCTGCATCCCTCGAAAGCGGATAATACGAATGTATTAATCTTAGAAGCACATTCCCC
TGCCCCGTATGAAAAGCGACACGTCAACCCTCCATAGTACAATAGAGATAGACTTTACGT
GTAATTACGACGTTCTTTAGAGACACATCCCGGTTCGCCC
>Rosalind_3796
AAGCCTAAGGAACTCGCATGACGTGTTCCCCCCCGCACGGACGAATTGATTTCGGATTCA
ACTATCGCTGAGGGAGAAGGTCCTCCCGCCATTTATAACATGTAAGACTGTAAGCTCGTT
CCTCCGATGGATTGTCGTAATGGAATCTGATTACTCCGTACGGGGTCCCCCTTGTATGGC
TGGATGGTGGTGAGCCTGTGAATTCTTGCACTTCATCGTGGCAACATTCCGCAACATACA
TAAGATCCTTCATGCGCGCAGTACACCTTGGACACATGTTGCACGTCGAAGAAACGAAAA
CGTCAATGACTCGACGAAAATAACCTGGTAATTGTCAACTCGATTAGAGAGCCCATAAAT
CGCATGGGCTTTGAATTGAATCTTCAGCAAAATGAGTACCTCCTATAAATCTTTGGAGGT
CGGCAGTACGGAGCGACTGATGAATGCACTGTTGTCCTCCAAACACAAGAGTCTAATCGC
CCAGGCACGGGTCTCAGGGACATAATGTTGTAATACGCGCGGTTACTAACGCCCGTAAGC
CGTGCGTCTCTAGCCATTAGCACAAATGCAGTGATCACCAGTTAGTCCAGCATAACACCT
ATCGAAGTGGCCCCGACATGACACGCATGCTCTGTTTTGCTAGAAACAAGCTCATCCAGT
CGGACGACTAGTCCTAACCGTCAATTCGCGTGCAAATTGCGACCGACTATGCTTTTGTTC
CAGGCTAATGCTTTTCTTCGATGTTAATACGACGGTTCCGACGCCTCGAGGAAACTAAGG
GAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAA
CAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTG
AACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAAC
TGGTGGGCTTATTTAATATGAATTGTTATCAGTGCCTTGC
>Rosalind_4176
TGGTTATGGACACCGCCCTGGTTAGAACCCGTGATTATAGGGGAAAGAGTGCGTCCGCTT
CAAGGCACGGGTCCACCAACGTTAGTACTCTCGATGGGGTCTACCCGAGCGTCCATGGAC
TTCTACAGATTCCATGCGCCGAGACGTTCGCCCGAAAGACTTGATTCCCCGTGCAAGCAC
GCCCCGCGTCCCCCCATACTTCGCGACAGCCCGGACAGGAAGTGAGAGATCCGTTCTCTC
TGGGTCCTCGTGTCACACAGAATTACCCTTCCGGGGCTATTCGTGCACTGCTTGTGATGC
GAGTGCTCTCACATTATTGAATGGTCTAGTAGTTGGTGAAGACCTTACATAAGTTTCAAA
GCGGCAGACATGACAGTCCAGGGACACTGACTTCCACGGTATGGGCATTGATCCACCCGA
GATTCATAGCAAATTTCTGATAGCCGCACCTTTAATACGGTTCTAAGATCTGAACACACA
TGACGTGCCAATCCCAGCTGGAATGTCTTATGGTCTAAGGGAACTAGAAATTGAACCTAG
ATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTC
ACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTT
CGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATG
AAAGGCGGACTGGGGAGGAGCGAGAATAGCATGACTAGTCCACCCTCCGTCCTTGGAAGT
GCGGGCGTTGTTTAACCAAGGATGAGCATCGAGAGTGCTGGCGCTGAAAAAAGGCACCAA
TTTAGGCTGCGAGTCTTGTTGATTTGTGCAGAGGGGACTACGCCAAAGCTGGTCGATACG
GCCTGCCAGCGAAGTAGAGTTGGGTGCTCAAGCCGGTTTCACATTCCGTATGCTTCTGTG
CAGAAAAAGACCAGCAGGACGAGGGTTGGCAAGCTGTAAC
>Rosalind_5864
GACCCTATGTAGCTGCGTTATAACACATATTAGTAGTGCTTCCAGTATAGCTGAAACCTT
CCGCGATGTATCAAGGCAGAAGGCCAAGTAAAACACTTTTGACCCTGTGAAATTAGTTAT
CATGAAATGGGACTTTCTGTCCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGA
GAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCA
TGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGA
CTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACTCGACACAAT
GGGTGGGTGAAAACCCGTCTTTTCTACGAGGAAGCTGTCTCAAGTCGACACCGCTGCTAA
CTATCACAGTATCATAAAAGACTTTTCTCTTAGGCAGGGATAATTACGGGCTCCGAATCT
AAACGCCAGACGAATGGACCCGATTGATTCAGCTATATAGCTCCTACGACACGGCTTGCT
CCGCCTGTAATGGTATAGGGCACATGTGCAATAATCAGGCCTATACTTTCGTGCCGTTTT
AGGATCGGCAAGCCTTGAGCTTAACGACACCGCGTTCAGCATAGCCTGTCCGCTGTAGTT
TCATTCTCGAATAGACGAGTCCGCCGCATTCAACTACGTTTAGGCCGTCTAACAGTCCCA
CTCTCACCTACAGAATCAATGTGGAAGAACGGTACCTTCTAACACTTGTGGCTCGTGAGG
TCTTTACACGTTTATTGATGCTCATGGCTATCCGCAACCGATGTAGCGACCATTCCGACT
TGGAACGTGCAACGAGTCGGCCCTGTACAGGTCTCTGGAGATGTACGCTCCACCATGCCT
AATACGGGAGGCGTGTGATGTCTTTCTACACAATGCGCTGACAATCGAGGGGTCATTCAT
TTTTCTGCGTTGACCCATCGCTGACTAGACTTTGGTCACA
>Rosalind_2263
GGATTATGACGCCCAGGTTGTATGTTCGAGAGACAATTGACTTCTCCCTACAGTATTATG
TGATGCGTCGCCGACTAGTCGTCCATTATCAGCCCTCTACCCACTTCATTGTGAGTTGCC
CTAAGCTCCGCGAATCGGCCTGAAACAACAAAGCTATCTAAATAAAGATGGCCTTCAGCA
AACCCAGTCTACGGTATTTGAGCACAAATGCCAACCGCCGACCGGGGGGCTCATCGGAAG
CGCGCGCACTTCAGCAGAGATCCCCCGCTCTGTTAGGCATAATTTGGAAATCATGGTATA
GCAGTACACGACGGATGCTTAACGCTCCTCTATGCCAGCGTGGCGATGTAGCTCGTCGTT
CAGTGTTTCGAAGCCGTCACAACGGAAGGCCTCTGCCTAAACGAGTGCCATCCATTTTTA
TTTATTCATCACATGGACTGTGGGAAAGCATCGGCATATGCTATCGTGTTATGATGATTT
CATCGCTGCTACCGGGTGATACGAACCACGTATCCTTTTTGCCCCCCCGTCAATGAATTT
GCCACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATG
GCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAAC
ATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGA
TTCCTCTAACTGGTGGGCTTATTTAATATGAATTCCATAAAGCACCTACAGTAAGAAGGA
TCTATCCCCGGAAAATGATTTGGCTTATTTAGTGAAGACTATCAAGCATTATCACTGAAA
GCTTAGGGCTGCGCCATTGTGATACCAGACACTAGGGTGTCAATCGACGGTACACCGAAA
CGACCGGGTTCAAGCCGACCTCAGAGGTGAGATGATGTAGACATCGTTGTCGAGGTAGGG
AGATTTGGTTGATGGCGACAAGCTGAATGATTCCTACTGC
>Rosalind_0326
TTTATACTCGGGAGTCGGACGTAGAGGTCTCTCCCGGTTCTCTTGCGGCACAGATTCCAA
GTCACCATTCGATAGATAGCGGGGGGTGATACAAAGTGAATCGGCACTAACCTGATAGGG
TATGCTTGCGATAAGGGGCAACTCCAGGACATACAAGCCCCATGGTTTGGAGCCGTTGCC
AGACGGCAGTGTCTAAATGCTAAGCCCCATGTGGCCAGATGTTTCGTGTGCAGGGTCGGA
GCCCCTTCGACCTGGTAGTGTCGTGCAAGAAAGCGCAGTCTCGAACTACTCCGTCATAAT
CCAAAATAGCTGCTTCTCACAGTGCTCCGCACCAATGGCGGCATTCCTTCTAGACTCGAT
CAGACGAAGGCCAAAGGATTAGCATAATTTGGGATGTACCATGTTCACGATCGCCCCTCC
CACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGC
TCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACAT
ATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATT
CCTCTAACTGGTGGGCTTATTTAATATGAACTATAGTCACCGTTCACCTCCGTGCAGACA
CCGTAATCTTACAACTCTGGTGATGTGAGTAGCCAGACAGCAGTTACTCGCGCAGGAGAC
TTAGATTATAGTTGTATGTACGTTTACTCTGGTGATGATATGTCCTGCGCTTTTAGTCTA
CTCAACTAGGCTACTGGAAGCGCCTTTTAGCCCGCTTGGGCGGTGGTATTGCCTCCGCTA
CCCTTACCTCCTTGCTCCCTTGGCACTGTCCTCGCAGGCATTTGTTAATTTGAATAAGGC
AGTAATGTCGTCTCGATATTTGTTACAGAACGTCGCCATTATTTCGTCAATCGGACTAGA
TGGTGGGCTAGAGATTTACGCCGGATTGGGAAATGACAGA
>Rosalind_2654
GTAGACGGGGTTCCGGCGACAAGCGCTGTTTGCGCGTATACGTCCAGCTAAATCTCTCAA
GTATACCTCCTACTAGGGTGTTTGATTGACACCCTCGATTCCAATGTCGGAAGCTGCCGT
AAGGAAATCAGTGGATGATAGGTCGTCATGATTCCGCCAAACGTTCGAACATGTCGCCAG
TTAGTTCAAGCGTCATCTCCCCGGCTGACAATCACGTGGTAGTAAGTATAGGACTTGGGC
CTCGCTGCTAGGCTCAGCACAATCGTTCGGGGTCGCAATCTAATGTCTGTCGCCCCCGTA
CTAACTAAAGGCGAATGCTTAGACAGGGAGGCTGTGGATTGGAAACTGCAACAGACTTAA
GACTGCGTGGGTACGACAGTACCTTCCTGCCGTGGGACGTTTTCCTTTCAACTAAGGGAA
CTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAG
GGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAAC
TGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGG
TGGGCTTATTTAATATGAACCTTTTGCAGATCTTTGCAATTCGGTAGGGAGTTCCCAGCC
TATCTCATGCGCCATTCGTGGTTATTAAGTTGACTGTTGACCAACCTCTCACGTGGCCCC
TATGCCTGCATTCTCACCGCCTATAACAAGCGAATCGACTGACAACCGTTACATGGATAG
GCCTCAATAGATTAGGTGAGACGCAACTGAGGGAAAGGAAATGAGAGCACGTTTCGTGAA
TACGATAAGGTTGTATAGCAAACGTTCTCACACGGGCTCATTGCTTCCAGCATCTAATCA
TCAGATCGGCGCGGGATCATTACATAAATCTTCACTACAAGCTCAGCGCTTGGATTACAG
CTCGATTATTACGCGTGATAGTTATGCGATAGAAATGAAC
>Rosalind_6883
GCCAAAGGCCGGCCACCATCTACGTTTTGTGGGTGTCAGGACCATAAAATTAGATGTTTA
ATTCGCTGGGCCCCTGGTAAAGTCCCGCGCTTCTGCGATAAACGGCTAAGGTGCGATTTG
GCCGCCACAACTCGTGCTTACTCTGCTCGGTTGGAGTAGAGATGCCAACTGGTGGTATTG
CAGATGACCTGACCGCCCCCTCCACCGGAGCTTAAGAGGCCCACGACTCTTTATGGCTCA
CGTTCCTCGCGACCTGGTACAGGAGCTACGAGTCAGGGCTCAGCATGTGCTTTATCAAAG
GCTGAAGTATCATGGCCCGCTAGACTGTGACCAAGTCGTTAAGTTGATATGTGGCCAACG
ATACGCCACGGGACTTGGGTAAGAACTTCATGGCCTCACTGTATTGGAGTGACGTCTGAA
AGGTACGAATACCGTCTGCTGGCTAGATTGCGATTTCCAGTCTAAGGGAACTAGAAATTG
AACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATA
ATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACC
GTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATT
TAATATGAAGCACCCAATGCCGGCTACCACTAAGGATTCGTTCGGTCGTTAAAACCCCGA
ACAGATTACTCTCGGTATTCCCACGACCCCCATGGCGTATACGTAGCGTAGAGACCTCGA
GTAGATCGGCGTGTACGAGGCCAGGAGCAGGTTTCGGGTGAGGGAATTGGCGAAGAAGCT
CGATACGACCATAGGCAGCCTCCTCGAAGGACAACAATTGACCGAACTGCTTTTTCGTTT
CAAGACACACAATGATATTTAGGTCAGGTACGCTCGAAACCAGCATATACATACACGACC
CAGTAGACTCGTTACCCTATCGCAGCCTCTCACTTCTTGA
>Rosalind_8737
TCTAGATGTCCAACGCGCATCCTCCGTCGGGCTGTGTTTTTCAAAAGAGTCGCTCCTCAC
AAACAGACTTGAACAAAGTCTACTAGCGACCATCTGACGCGCTCAAGCCTGCGACTCATC
TTAAGAGCTGTTAAGGTATGTTGCAAGAAATTCTGCTTGGCTTACGGACCAACAAGTAGC
CTTAAAGGGGGACGCCTCAGACTCCCCCCCGAGATTCTTAGGTCTCCGTTCCCGGGACTT
TGGAACGGGGGAGGAATATGTATGCCTAGTCTCGTGGCATAGCAACTGGTCCCCTGAGTG
GTGCGTCCAACTCTCAGCTTGATAGCTTTTTGGCCGGAGTTAACCGGTGAAGAGGTACCA
ACCAGAACTCAGGTAAGACCGCTACCGGAGCTGGAGTAGCTTTTGCAACTGGTGGGATAA
TTGCCTGTCGAGCTTAGTGTTGGCTCCTTGTGCAGCTGAGATGTCTAGGGTACTAAGGGA
ACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACA
GGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAA
CTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTG
GTGGGCTTATTTAATATGAATTCCCAGGTCAGTACATCATTTTCAGCCCATATGATCACC
TCGCAACGTAGGACGATAGTAGCCTTATCAAGCAAACACCGAGCGGCAATGGGATGATAA
TGCCGGGTGCCACAGGGGTCGCCTAAGTAGCTCTGGGGTCGGGTTAACTGGAATCAGAGC
ACGTTTGTTTTACGAGTTAAAGTCGTACATGATGGCGTCCGAAGTTCTCCCATTCCCAAA
CTGGACGTATCCGAAGGCGGAGAGCAGGGTCGTGGCGCCCTCGAAATAAACCGCGAACAG
CCACCTCCGCCTATTATTAATAGACTCACCCGCCTCAAAC
>Rosalind_0745
CTTGCGGAACCAGGGGGTTCGCCTGGACGCGGTCCGCATTCCATCATTTAACTTTTATAG
CAAGATTTCATAAAACTATTTATGGAAATACTTTAGATCCTCCCTACAGATACCCGTCGC
ACTACTGCAAGTATCAGCGCCAGATATAGTGGTGTCTGTTCAAAGTACTCGACGGTAGCG
TGAAGGTGTGGACAAGGGCTCAGTGATGTCACACTAAATCTATGTAGAGGAGCGGCACCA
CCAGTGATGATCCTGGAATGGATACTCCCATACGCTCCTACTTACTTTCGGAGACACGAC
CCAGTGGAAGGAGTACGCTACCGATAAGAGACTTGACGGGCACTCAGCACGAGTTAAGCG
GCGCTGAGCTTGTAAGCCAAGCACCGTGGACCTTTCCCCCGCAAGCTCCCCTCTGGTGCC
CGTGTAGTCTGGCCGCGCACTCCTCTGATGACGAGTGGGCTATAGTTAGACTCTTACCGT
GAAGCTTAATCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTAC
AATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATT
TCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAG
CGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAGATATTTGGCCCGAGTAGCGGC
CGTTAGGTACTATCTCATTTTAAAGACTCTAGTAAATTGTGAACCCTCTAGTCTTATTTG
GCAATACGTCACTAGCCCCTGGGAAATATGGCTCCCGAGCTCGATTTACCTCCTGTTGAT
GACAATAGAGCAAATATATCCCATATATTAGTCTCCAATAAGAAAAATTCTCTCCCAGTA
GCATTACACGTGTTGTGTCGTACTGTATCAAACGAAGCGGTCCCCGTACCCGCATCGCCT
ATAGTATGTATGTGCGCTGAGCCAGAATAGAAATAGCCAA
>Rosalind_7479
GTTATGGGGCCCAGATGACGGGGAAATTTAACGATTCAGGTATTGATTAGCTAAATCTCT
GCGGACATGCTACGCCAAATTGGGGAATGTCATCCCAAAGCTAGCAACGATTGCCATGTC
TGCTACACCTTGCGCCGGACGTCTGTCCTTAATTACAGAACAGTAAGAGGCATGGAGATC
AGTGGGGCATAGCTCTAATAGGGGGGCGATTCCAGACGATTCTCTTACAGTCGCAGAACC
GGACGGACAAATTCACGACCTAGAATCCCAGGTACGGTTCGATGACCCACAGCGTAGGCT
CCTAGTAAGTCCTCAAACAAGTTATGCACGGAACGGTGGGGTATGGCGTCGGGGTCGCAC
TCCTGGAACTTGTGGAATCGGGTTTTTAGATGGATAGGCCAGGTTCCTTGTGTATCAGGA
GTTGTGTGACGAACAGACCTGGGTAGTATTCCTTTCCTGTGTATGTCCGCCTAGGCAGTT
CGTTCATCAGGTGAAGTGAGAGGCCCAGTCACCCAAGGCACTCGGCCCTACATTCCTTTG
ATGCGCAGCTATCGCCACGCGTCGCAGCAACACAAGAATCATAAAGAAATGCCGCTCTTT
GACGGCCTGCTTTAATAAGCGGAACATCCAATTTATTTACATTTGTTAATAGATTAAGGT
CATAAAGCCGATAAATGTCCCCCCAAAGAGGAGTGATTCTCCGGGATGGGTGCTTTGACG
TACCACCAGTTTGATTCTACAGGGCCGGTTCTAAGGGAACTAGAAATTGAACCTAGATCG
GTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGA
AGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCC
ATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAACT
GGGATTTACGCGTCCGCGCGCTGTTCCGAACACGAGTGTT
>Rosalind_3852
ACTGCTAGAACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTAC
AATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATT
TCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAG
CGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAGGCTTTCTAAGGATGAGAACAA
ACCATTCTGCATTCTAGGATTCTAATTCACTCTTTGAGCAGATTGCCTGGCCGTGGTGTC
TTCGTTGTTGATGTGAGCTTGAAATTGAAGACTCTCTAGGTGAGGACGCCCAGACCAATT
GGGTCAAAGAAGTTAGTGAAGGTCATGTATAAAGGGGACCCTATAGCACAAGTCTGAAGG
GGTTGCCTACCCGCTCCCAGCTTTTTGGCTGCTCTACCACGGTACATGACTATTCGGTAC
CGGTCATAATTTCAGGATTGGAAGACTTGTTTGTAACCATGTCTGGTGCACGCACGCTGT
GGAGAGCTCATCCCAGCAAGAACGCACAGTAAATATCGCTAAGCGGGGGCGTGGACAGGG
GGCCTGCGCGACCAGTCTGCATGCATCGGAATAGGTGAAAACTTACAAGCCAGGGGTTCT
TTTGGCGCCGAACCATCTACGATACTATACATGATCGATAATTATTCCTGAATGTGCTCA
GAATGACAACGCATAATGTAACAAAAATGCTTAGGAAAACGTTAATAGCACTTCGGTGAA
GATGAGTCTATTGTCGAGGCAGCTCCGATTCATGCCCCCACCCGCGTCATTCCAGAGGTC
TTGCGGCGCTGTTAGTTTCAACCTCCCATCCCACAAAGTAGCGGTAATGCGTACGGTCCA
CGAAGTTGCGCACCTGTCCGGTTCCGCTAGGTTGACCGAAATCTAGTTTTTCCAGGAGGC
GGAAGAGACAAGGAACGTGGAAACCTCCGAAAATTAGCTT
>Rosalind_5890
TTATAGCTGAAACTAAGCTTACACAGCAGCTTGTGCGAAACATAGTTCTATAGATAGGCT
CATATCGTCAACCAACCCTACTCTACATCGTTTTTACGGTCGAGACCCGGATTCCGTTCA
CCTTTCTACGGAAATCCCCCAATTTCTCTCTCAAGTGCGGCTGCTCTGGGATTATGTTTC
TTTCGTCGGCTGGCAGGACATCCTCACGATATACGTCCTATCACTCCAAAAGTCTTATGA
AGAGATTCGAGTCGAAGTGTCCGCACGAGCGTCCGTTGATTAACGGAGGTGATGTTCGGG
CAGAGGACAGACTCCCTCTGACTGGCATAACTGCAAGATCCCGGGCATCCCTCCATCACA
CTGTCCTTGTTGCAGAAGGGACTCCTATCCCGCCCCTAAACCTCTGTATGTAGTACTACG
GTCAACTAGCTGGAACTGCTGCACCGCGGTATTAAAGGGCGCAGGGGCTACCAGGGGAGG
GTCCCGTGTGGTCTCGAACGAGGAGCTCATGTGGTTGGTCCGCCCCACTTCACGTTCGAC
AACCGTGCACCAGAATTATCAGTAACTCGAACGTAAAAACTCCCTCACCAGATCCGTCTT
ACCCGACGAACGGCTGTGCCTAGAGACAATAGATAACTAGTTGCAGTTGGTGTCCTATTG
AAACGCTACGATCACTCTTGTCTTACAGGGTACTCCTTTCCAATATCCACGAGCGCAAGC
CTACTACTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATA
TGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTA
ACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGA
GATTCCTCTAACTGGTGGGCTTATTTAATATGAAGAGACAGACGCAGCTACGAACCTCTA
CAAGGTCGCCGTAACATAAACAAAGCGACCCCTGACGAAT
>Rosalind_0826
TGTACCAAAGGTGTCTCGATGGCTGCATACCCATAAATTATACCCGGGGTTGAATTGCGT
GAGCGAAATTGACGTGTTCCACATATGTTCCCATGGGTTTCGCGGCATGGCTTCGTAACA
CGTCCTGATTCTCTAGCCTATGTCGGATAACGATCGCAAATGCTCGCGTGTGAAGAAAGG
ATCGTATGATAGCTTGGTGCTACGTAGACTTGATGACCACTGCTCTTCAGCGCGTGCATT
CCGCGCTTTCGTCATCATAAATGGGCGCCGGCGGGTCCGGTAGTATGGCGATAGCTCATA
GGCTCGTGAGGATCGCTTGCATAGTTAATAGCCGGCTGAACTAGAGACATCAGTCTCGAA
GACCGCAGTGTCATCAGGAAGCCTTCCGGGGCTAACTACAAGGACGGGAGCCCCAACGCG
CCCTCTCGGTGTGCGCACCCCTACCATGACCCACTCCGAGGTACCCGGGACGTAAGGGAA
GTCGAAGTAGACGCATTATCCACCGAGACGGTTTCTGCCTGGTCTCGCTTCAGACTCTAC
ACCCCCGCTACACAGAGCGCGCTAATATGTGGCGCAATCGCTAAGGGAACTAGAAATTGA
ACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAA
TCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCG
TATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTT
AATATGAACATACCTGAGAACCGGCGGTTTATCCGATCTGGAAGCTTGCAGATTAGCCAT
AATGCACGATCGCTGTTGCCCACGATCCTTATTAGAAGTTTAACGATTAGGCTCGGCTAT
ATGACAATCCTACGCTTTCGTCTCAGCGCCGAAGCATGCCCCCACAGGGCCATGGTGCCA
TTCGGTAATGCACGAGTACACCGCCAACGAATTCAGTATT
>Rosalind_6426
GCTAAAACGTCGTCTCAGAGGCATACGACGATTATCTCGTTAATATGTATGCTTTCCTCT
ACACAAGTAAAGAGTCCTAATGTATGAGTGCTCTCTTGGAACTGCCCCTCCAGAAAAGCG
GATGATCGAGGGCCTATCGCGCCCGGACTTACGCCTGGAGCGTTTGGAGGATCGCTCCCC
TTGTGGGCAGCCAACTACCTCCCTAAGACATATACTCTAACTTCACCGGTTCTTAGTGTT
CAACCGAGAAGTACACATTCTAATACTTACGCATTGGGTTTTCATGGAACTGCAGGCTTC
TACGGGTATTCACAGCATCAAGGGTCATCACCAGTAGGCGTCGTTAGTCCTTTTTCTCGG
ACAGTCCGATCTGAAACTCTATCACTTTTGGCGCTGGTAAAATCTAAACGGGGTTACAGG
CCCTCACAACCGCAGTGAGATATATCACCCTTTCGATTGGCCTCTTACTGAACAAGGTGT
CGTAGAAAACGTGTATGTAGAGAGGTCTCGCGGTGCACTCTGTACGCCCATGTTGATCTA
AGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAG
GAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCT
GTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCT
AACTGGTGGGCTTATTTAATATGAAACTTCTCTGGAAACGGATCATCGTGCTATCGGAGT
ATTACGATAATAGTCGGGCAACGAGGAGATAAGGCACCCAGCTCAGTCGAGGTTGTTGCC
GTACTCGCGCAAATCCGTAGATGCTTCTTCTCTCGGAATCGTGCAAAATGAGGAGAACGC
TTGCCTAATAGAATGTATAAGCCTTCGATGATAAGACTTGTTATAGGAAAACCCGGTTCA
CGTTAACGGGTCAGCCAAACCCTGATAGACCTGTCCGATC
>Rosalind_1115
CCAAGTGTCGGCCGACGTCTCACGCGTCCTGCCAATATGTGGAAATTAGTCATAATGGGG
GACTTAGGTATGACCTGAATTCATTAAAGGGACTCCTTACAACATGTCTGACGTCATTAT
GATGAGGCGCATGGGTCGCGTTGGTGAAAGATGAACACAACGTGTCCGCGGGAGTTTCTG
TCGTACCGTGACTACTGGAGTCAGGAGGCGTGCTTATGATCCTTACGAATCAGGGCCCCT
CTCTACCCACGCGCGAGTGTTGCCAACTAGGAATACTAAGACCTCGGAGGGTATCGTGTA
AGCTATGTGTTGTGCGGCGTTGACCAACAACCACTTAGACAGTTCGGGACTAAGGGAACT
AGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGG
TGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTG
GGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTG
GGCTTATTTAATATGAACGGAAGGGCCTTTATCAGTAATTAGAATCGAACCTCTATAGAA
CTGGTCTGTCAACTTACCATTATTGAATTCTGATTCTACGTGTACAGAACGCGTGGGTAC
TAAGGCAAAATTCGCTTCAAATGGGACACATAGGAGAGCAATGACCCCACCCAAACTCCG
CAGCCGGCCACGCTAAATACCTCCGGCTAATGTTAGAATCATAATCCTCCCTTTGCGGAT
CAGAATCAAGGACCCTCGGGTTGTTACGTGCTCTGGCTCCACCTCATCATCCGTGCCGCC
TTAGCCCAGGTAGCTAGATCAGGCAAGGTGCTGTAGATTCAAGTATGGCAACAGACGGGA
TTACATTGCCTCCCAAACTAGAGTCATGCAGCTGATTACGATTGACACGTCATTCAATGT
TCTCGATTGGCGGTTCTGCGATACATCCTCTCGGAGTGCT
>Rosalind_5958
CATATCCCCTGTAAGCGAGGAATCGCCACCACTAACGAAACTAGTTCCCTAGTTCTAGCG
AATTTTAATGTTCCTCTAGTAAAGGCCCCACGACTGATCGTAGACATATCGATAAACCGG
ATCGGCTCCACACAATGACGGATGCCTGGGACCTTGATGTGCATCCGGTGCATTTAGAGT
TTCTCCATTTCCCTGCGTGATGGACCAGTTTTACCTGCGTACAAAAAAATCCCCGTGGGC
TGAATGGTCTTGCTCCCGTTGGGCAGCCTGGATCAGGCATAAGGTATAGGTGTAGACGCG
TACTATAATAGGAACCGTCGCGTTGGACTTCAAGTTGTCGCAATAAATACGTAACGTCGC
CTTCCGAGCCCGAGCAGTAGAATCACTTTTGTGCGGCTAACCCCTCCACAACCATCGAAT
ATATATATTGCAACCTCGCCAGGAAAACGACGAATCAGCAACTCTGAGTGTAACGCAAAC
CGCACATGGCCGTGCGTACTTGTTAATGGCGGCTACGTTCATTAGTGATTTCACAATACA
TTCAGGACGTGCCGGGGACGCTCGTATTGATCTTAATCTATGGCTAAGGGAACTAGAAAT
TGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGA
TAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAA
CCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTA
TTTAATATGAAAACGTATGTCAGAAGACAGGCAATGCGTAGACGGACGAAATAAGCAGAC
ATCCCAAACTACTGGACCTGTACAAGGTGCCTGCCACGCTCAACTGGGAGACGTTGGCCC
TCTACTCGCAGGGTCCTGGCCGTACGTTAGAGCCAGGGGCGCGGTAACTTCTACTAGACA
AACCACTCAATAGGACGTCAGTCATGACTGGACGAGGCTT
>Rosalind_6563
TTACGTATTCAGGTACAAGCCTCGCGTCGCACCTCAAATAATGAGACGTGGCATGAGGCG
TGTCTCTATGGCCCAGGCCTAAGAAGCTCAATAACACTCCCGACCCCCCACCATCTATCG
GGAAAACGCAAGCGGCTCACATCTTTCAAAGGGACCTAGGATGAGGAGGTGCAATTGTAC
AAAATCGGAGGTTGAAGTATAGGAGCGTTTGTAGACCGTAGAACGCGAGAAGTTCGACCG
ATTCCGGTCACATACACCAAGACGAAATTCCGGGTTGAGAACGGCGGACCTTGGGTACTT
AACCCGTCGTACCTTGAATCCTTACCATACTTACGCGGCGGGCTGTGTGGATATAACTCC
TCGGTTCAAAAATCTGTTCGAATCTAAATAGCGACGAAGGAAGAATCACCCTGACCGACT
GTTACAGCGAGTTAAACAACATCTATTGAACAACAGTTAGCGAGGGCGACGCGGGAGAAT
GTTAGTATCGGAATAGTATTAGTACCTTCCGTAACTGGTAGGAGACGTCGGTTGCCCTGG
GAAGAATTAATTGTTGCATCAACACCAGAAGGCATTTATGACCTTCTGGGGCAGGACGCA
GCCGCCGCCTAGTTTTGGTGTATCGATACGTTGGTTGCCTCAACTTCCCTGTTCGGATGA
CAAGCGTGCGGACATATTAGGGGCCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGT
CGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACT
TCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTA
GGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAAATTTGCG
ACCTGAACCCTCAGGGATCAGATCTAAACTGCCAATTCCGCAAACGCAAGTACCCTGATG
CGATGTGTTTGTCGCCCGTACATGCTCTTTGAAGGGTCGA
>Rosalind_9173
CTTTTTGCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAAT
ATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCT
AACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGG
AGATTCCTCTAACTGGTGGGCTTATTTAATATGAATCAGGCTCGCCTGACTTTCTGATCG
TATGAAACTAAACCTCCGGCTATAACAGATCTATCGGCACTCAGAGGTCACTATGATAGT
TGGAATGTTCATCGCTTCTTGATGGCGATATCCAAGGACGGAGTGCAGCGCGGCCGTAAA
CACTCAACCAGCTTCACCGATATTTTATCAATGAGGAGGTTGCACGCAGAGTTTAGAGTT
TACTATCTAAACAGGTTGGGATATTCTACGTTACACTAAACCAAATCTCAAAGATTTCAG
AACAAAGCATGAACGGCTCTGCGTGACCATACTCAGATGATGCAGTAGCCTCCAACTTTT
GGGGAGAGAGCCGACCAAACTAGCGTCGCCGGGAGCGAGCCGGCGGGGGCAACTGTTTGA
CTGCGAGTAACCCCCTGCCCCGATTGGTACTACGTCAAGATACTAAGGCAAGCACGTTGA
CTTGCTAATAATTTTTTCCTCCATTTACTATATATGGTGAAGATGAGTAAGGACTAACAA
GTTACAGTGCGCATGAAAAGCGCACGAAGTAGACGACGGATCTGCCACCAGTGGTATACT
GAGCGAGTAGACCACATGGACTACAGAATAAATAAGAAAATAGTTGAAATTCTGGTCCTG
AGGTACCCGCACGCCCTCGGGGACTGCATGATCACCTTGCCCGCGGGATTGTGGCTGACG
CCTCAGACAACACTCAGCGTCATGACGCATGCTTTGTCACGAAGTGAAGTCTCTCAACAT
ATACCAGACGCAGACGATGAATACCGCTGGCTTTCAACTT
>Rosalind_8094
TCAGGGGTACGCCACAGTTTTCGTTCCATGGCAGGGCGAACTTGAATGTCATCAGGAGCC
CTTGGACGTCGTTAAATATGCCTTCCTCAAACGATCGCCAAGCGAGGGTAACCCACGGGC
CCGAAGGTCGAGATGTCCCTCGTCGTCTACCCGCGGTCTCTCTCTAACCTCACTCTGGAA
GTCGACGATCCCGAAATCAATCGTGACGTTACTCTTCGGTGCGACCAGGTACTGTTGAGA
CTGGTGAACGACCATAGCCCCCACTGAAAAAGACAAGATACCGGAACCCCGAAAAACGGT
AGCCCATTTCGGTCATCGGCATACGGAGAGCTGCACGGGCTAAGGGAACTAGAAATTGAA
CCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAAT
CCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGT
ATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTA
ATATGAAACGGATTGTACCCGATCAATAGGGGATATCGTGCCTTCTGCTCTATAATAGTT
GTCCGTAACAGCGCCAAGATGTAGGACCGCCTTTCAGTATTTGCCAGGAATAATGTGCCC
GAATCACTAGTTTGAGTTTTCTCTACCCAATGTATTCATCGCGTCGGTCGCTTGAACTTG
CGGGGGATTCGGTGATGATGTATAAATGTCGAGCTACCATCGTTTGTCGTTGAGTGTACC
GCCATTGCAATCTACCCTCGGATCGTATCAGAAACGAATTACGAAAGACAGCTTAGCCTT
CCGTGAAGCGCGCACTCTGGCAGGGCCCGTCCGCACGTTAGCAGACCCTGCATTCCTGGT
TCGGGGTGTTGTATGAAGGCAGACCTTTTGGGTGGTGAGCGTTACTACTCGTCGTGGGCG
GGGTCCTGTCTACGAGTAAAGAAGCCCCAGACCGGACCGT
>Rosalind_4722
TGGGTAAATGCGTAGCGACGTTGACACCAGTGCATCGTTACACTCTTCGTTTATAGGGGT
AATTGGATTAACGGGCTTGTGATTATGAGCGTCTGATCGACTAGGATCCAGATGTTGGAC
ATCCCCTCCACAGGCTGATCACAACAATATTTGACGAGACGATTCACAAATTCTCCGCGA
TTGGTGAATGGAGAGCCATCAATTTAAATAGGGGAGGCACTGATAAGGACGCCCATATCG
AGCAAGAACCCAGGGGCCGTCGTATTCTCGGGCTCTAATAGAATGGGACTGCAGCGGAAT
GCCCCGCGCCAATTATAATCTCTATGGTCAATATCGACCCTTTCCCTACTCGATCGTTGT
CTCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGC
TCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACAT
ATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATT
CCTCTAACTGGTGGGCTTATTTAATATGAATCCATTTGGTCAACGATAAATCAATCCGGC
ATCGACAAGTAGGACTGAATGTGACTGGTGAAGGCGCCTTGAAAGAGCTCCTGTTTAAGG
CGATTCACAGACTAGCCACGGTCCTAATTGAAGGCGGCCCCCTCGGGGACATCGCACTGA
AACCACTGCTAGCTCACCTCTAGTTGTGAGCCATGTCGCACGTCCACCCCCGTGCTTCCA
GTCCACGAAATTCCGACACGACATCGTCTTTGTATGCCCAATGCATACGCCGCGTTGTCC
GAGAGTGCCTTCACTTTGCCAAGCGATAGCTAGGCTACCCTAGCAGAGTCCGCGTTTCCG
GAAGAGCGTACAGTGATGCGAGTTAATCCATCGGGGGCCGCCGTTCTAATCCACTTATCG
ACTAAATTGTGGTAGCGCGAGTACGACTCACTCGAAGCGA
>Rosalind_4657
TCTCAAGTCCGCTGCCCCACGAGATTGTCTTGCTTTTTTGCTATACGTGGGATGCGACTA
TGAAGCCGTAGTAGCTATCGACTCAGCTGTGGGTATACCTAGGGCAATCGCTGGCGCTGA
GGCGCGTTATGGTTGTTGTACACCCGCCTGTGCTGAACGGTAACCGCCTGACCACTTCAC
CCATCCCGTCTGCCATTCTAGGAAGTCGTAAACGTTTGTCTAAACTTGGAGGCCGCTTTT
TCCGGTATGTGCTTGACGACAGGTGTCGCCATCAATCGACCTGTATATCTCCAGGCGCAC
ACTAGATTTCCTCTCCTGCAACCGGGGGTACAAGTTTCCACGTGAGTAGCTCCGGCTAAG
CGGGGCATGAAGAATCATCGAATCATACTGTGAGGCTGCCACGATGATCTACATTCGCGC
GGATGAAGTAGCCAGCATCGCGTTAGAATTTCTACCGCGCTCTAAATGGGGCTCGATGTC
CACTTCGATAGTCTTGATAGTAGACGCTGACCATAAACAAATAAACCTCCTCAGCAACTA
AGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAG
GAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCT
GTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCT
AACTGGTGGGCTTATTTAATATGAACGCCTTCTAAAGGGAGTCGACCCACGCATATGCCG
AGGTTTATCTCAAGTTCAGCCAAACTAAAGTCGAACCACATGAGTTGGTCATATAACTGA
ATACATAAGACGAAAGCGTCGTGATCAAGCACATTTTTGGAACAGCACCTAAACTACCAT
GGATGATCCGGGAGTAGCGGCCGGGTCGGGTTGTGCTGAGACAGGAGCGGTTATGCTTAG
TTAAGGCTAAGTCATATCGGTCGGTTTCTTCATACCTAAC
>Rosalind_1205
TTGCCCCCTGAGCCCGCCTAAGTACAGCACGATATTGCCTACTGTTGACAGGTAACGCCC
AGAGCTCAGAGTGACGAGACAACCTTCCACGACCCGGTGCTTGTCCCTATAGATCAGTGT
GGATTGTTTCTAAGAGGCCTTTCGATTCGAGGGATCACCTTACCCTCCCACGGCCTTGGA
CAGTACCATCCTTCCTTGCTAGTAGACTCGATTTTCGTGTAGAGATAATGCAAGTCCTTA
CAGGTACTTCGACGTCGGACGTGACGCAAACTGATAACAGCAGTAGTACCCAGTTGAAGT
AAGTGAGGGGGCAACTATATCTATGCGTCCTCTGGCCCCTTTCGTGGTCTGTCGATCTGC
TGCCCGAATCCACAGCGAAAGGAGCCGTCGCCGACCTTTTTTCTCAATAGAGCTAACGAA
CCGTTCATATACAAAGCTAAATGCCGAAGAAGTCAAGGCCAGCCACCCCCTAAGGGAACT
AGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGG
TGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTG
GGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTG
GGCTTATTTAATATGAAAACTCATCCTTCTTAGTGGAGGTGATATAACGCGGTGCTTAGG
AATCTATTGGTATTCATGGTGGTATTGCTGTTAGCCCCTTCTTAGTTGGGCTGAGAGAGA
GACTAAAATTATTATACCTGCATGAAGGATAAATGTATCTGGGACCCCCCAACGGAAATG
GTGTTATCAGGCTCCCGGCGCCTCGACATGCGGAGCAGCCGCGAACAGCCTATTATGGGA
GTTGGTCACTAGCTATATAAAGACTTTGATTAGTAATAGACGTGGGAGCAGTCAAACGAG
GCCTATTGCCGCAGGGGGACAGCCCCTCAGTTGTATAGAG
>Rosalind_0616
TATTGGATTAATTCAGCTCTGCGATTTATTTCCACTGTTAGGCGTATAATCATGAGGCGT
AGATGTAGCCTTTACTCCCGGTCCCTCTGATATTGCATCTCCCTAGTTGGGCAAGCAGGA
TGAGACACGTAGGGGCTAACGCACTACTATCGATGCTGGTACCCCTAGTAGCACCTCGGG
GTCGTTCGTCCCCTTTGCTTGCTATCTGAGCCAGTGTCAGTGGTCCAGAAAACGGAAACG
GCGCACGTTCTGGTAACCGGGGTGACAGCCAACGTGTAGGGTGTCGGTTAGAACCGTAAT
AGCGTTGGATAGGAAAGGACGGGATAGGATCCACAGACGTAGCACGCATAAGTAGCTCAT
GGTAATTTCCGGCCGACAAAGTTGATGTCAGCTATTGGTTCTAAACACGTTTATCGGACG
GGCTGTCGTTACTGAATGGAATAACCTGTGTGGATCTGTGTGGACTCTCGCCGAGAAGAG
TGACCCTACTTTTTCTTCTGCCTAGCCATTCTCTAAGGGAACTAGAAATTGAACCTAGAT
CGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCAC
GAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCG
CCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAA
TGGCGGGTTTTTGTCGGGGCGATCGGATCCCCAGAAATGCCTAACCCATTCAATGGTAGC
TTCGAACCGCCCCTTGATTGATAAGCTGCTGCCCTCTCCCTGACGAAGACTATATTTCCC
TTCTGATTAGAATGTGGTAATCTCGCCTGGCAACAGGTGCTAAGGGATACTTTGGACTCT
TAAAATATCTAGTTTGCAAGTCGACTAGATAACGACAGGCGCCCCGGAAAGATCGATAGG
CGAAACTACGGTTAATATCTGGAAACGACACGTTATATTC
>Rosalind_3382
AGCTAAGGCGGGCCCCAAGAATGGTAAACTGAAGTCACGGCGGCGAGTAAGACCAACTTC
AACTTCATCAGACTCAGTATATAATTCTTCTTGCCTCAGGGGAGGGTTTGCTCGACTATT
ATCGGAGCTGTGTGAATTGGAACCTCGGGACACGCGACGCGAAATTCTTGATCGTCCGAA
TGAAGCCTTAATATGTGTTCTGGGAATCGACCTCAGCGTACGGTAAGCAGCCCGCAATCT
ACAACTGTTCACTGCGCATGAGTTTTTACAATGCTGCCCTTACAGAAACTGCGAAGCCTC
GATTCAAGGGTCCAATGGTCGCCAGGTCTACGCTTTGGTATTAAGTCCTAAGGGAACTAG
AAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTG
ATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGG
CCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGG
CTTATTTAATATGAACAGCGTTCTATGTTCGCCTCTCTGACTACTTCAACGCATTATCGT
ACCGCACAAGTAGCTTGAGCATCAGAAGGTATCGCTAGGACAGTACTCCGGAACCTGGCT
ACAACACCACATCCGGCTCGGGAGGTTTCGTGGTCTTCACCAGTGGAAGGGACGCCGCAG
ATGTCCTACTAACCGCCTCAATTTGCAGTGGAGTAACCCGGTAATGACGGCATTATCCTA
AAGGGTTACTTTACCGATATAAGAGGTGCCCTGACGCAATTCAGCGTCGTGCGGATAGTT
ACAGACACAGACCTGAGCCTGATCATGAAACTCCGTTAGTGGACCACATCCCGACCCTGG
AGCCGGGACTGGGTGGGGTTGTACGGCTCATCACCCCTACCAGTGGTGGATGCGCGTAAT
TTTGAGAGAGGGGGTCTGCTAAAGTGCGTCTGCACGCTGG
>Rosalind_5878
GGGACTACAAACTGATGAGCGAAGAATAAGCCCTAACCGAAGTTGGCTGATACCCAACCT
AAGTATACACTAGGGTCTACCGCCCCTCGCGACATCCTCCTATCCGGACAATGCTCACCA
ATTGGGGAATAGAAAGGCCTACCTTGCTAGGCTAACCCAGTTCACTCAGTCTTTGATTCA
AAGTTTGAAATGGGATCTCTGCAAAATTCCCCATAGCCGGTCGCGGGCCCCGTAGAAGTT
GGCTAGATTGAGCTCATAGTTATCAAGAACTCTTCTACGGGATTATTACCGCGCCAGACA
TTGTTGGCTGTTCTCGCGAAGACTGAGACGTACCGTAACACTGTGCTCAACCATGCGCGG
GCCTGTGGCCCGTCGATGAATTTCTCTGCGTATACCTTCCCCCGGTATGGGCAGCCGTGT
CTATAGCTTAAGTCCGGGCTCCGGATTGACCTTGGAACTTAATGGTGGGGGGTCCCGACC
TGGGAGAGATGTAATAGGTTACCGACGCCACCAGCTCAGAACACGCACATAGCTTGGAAT
ACGACCCGTATGGGATCAGACGTCTCAACTTATCCACGCCTATAATTGTCCCGTGACCCA
CCCATTTAACGGCGTTCTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCG
CCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTA
GTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTAT
CGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAAGACTGAGGCGGTTGCT
CTACAAGGCATTCTAGGAGGTTTTGACTAAGGCGGCATATCGCTTGTGGCCCCGAGTCTT
ACTGTTGGGCGACCCTCTGGTCTCGGTACGCCCCCACTTAAACTATCTATTACCTCGAAC
GGGCCGCCCAAGCCAATAACCGAGTCCGTTCCAGCTGTCC
>Rosalind_4644
GACCAAAGCCCTTTGAAGTCTGCATACCGCCTTTTGGCATCAGTTGTACGGGAATCTAAG
GGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGA
ACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGT
GAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAA
CTGGTGGGCTTATTTAATATGAACGTTACGTTCCCCCCACGCCATCCTCCTCCACGCGAA
CCGGCCTAGCCCATAATGTACACAGATAACCCCGTGTATTCCGGCTCAATCGTTTAGAGA
ATGGACTCCGAACATGTCTGGGAACACTGTGCATTGTGCAACCTTGTCTGTGGGAGTACG
TAACGACCTATACTCGGCTGGTACAAGCCACTTCGAGTTCCCATATGCTCAAAAAGTGTA
TGGAAAGCTGCCATAGATCTAACTGTGCATGTTTTTCTTCCCGAAGTTAACAGCCCTAGT
TGAAACAGTCAGCTAATGAACGAAATATAATAATAGAAGTAGAAGGGGCACTTCATGCTA
AACTGATGCTTTGGATTCACTTATAAGGCACTCTCGGAAGGTCATCCCTGAAGGGGTCAC
GCGGCTTCAGGATGCTAGTTCCGCTAATGCAACGGCCGATATAGGGTTCTAAACTGGTGA
CCGACTTTAAAGGTGGAGAATGGGGACTAGGATGATCAGCGAATGAGATGGCTAAATAAC
ACCACTGTGTTACCGCCGAGATCGTAATTTTCCAATCGCGGTGCACTGCGGGCTTTTATC
CCAGGCCTCAGGGCCTCGCAGGTGCTGCCGCGCGAGCATACTGTCTGTGAGGCTCTTCAC
GCGGTGAGGTCACTGGTCTTAGAATTTGAACAATTCTGGGCCGATATGCGCGGTGTTTCA
TTTAACCGAAGATGCAATCGGAAAGATTATTGCTGCGACA
>Rosalind_1530
CGATGCCTGTCGTCACTGCCGGGCGTCAGTAACACTGGCTATACCTGGCAATAGCTAACC
ACGCGTGACGTGTACTAATGGACGAGACTTTAGGTAATTTGCCAGACACTATACTAACTC
GGCTTCACCACTACCCGATCTGCGCGCGTCGGCTTCACTCTCTATTCCATCCACCTAAGG
GAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAA
CAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTG
AACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAAC
TGGTGGGCTTATTTAATATGAACATAGTTCCCGACAAATGAAGCTAGCCGGGCTGTCTGA
TGAGATATTCGCCTGGTCTATGTTCTTCCTTTGTTTGTCCAAGCATGTGTGCGCCAGGAT
CTGAGATCTCTGCCTTGAGGTTTCTCGTTGCCTCTTGATCGCCCAATACGTGCTATAGCT
GCAATTGGGAGAGCGGACCTCATGTTGATGTTAAAGGTGTTCTTCTCCCACATAAGGCTC
AGGCGTCATTACGAGTACTGAATCTTACCGGCCTGGGGAGCATAAGCGGAGATCGACGCT
GGGCGACCCTGAAGGGTTACAGGTACGCGCTCTTAACGTAGTCTAGCAACAGGCTGCAAG
CCTGGCCAGACCCAATTCACAGTCCTACTCACCGCCATTTCCAGATGACAAAGGAAACCT
CTTGGGAAAGAGGAGAAAGAAGAAGATCTATTCATGGATAGATTAGGAAAGAGGGTAAGA
CTAATGCATCCACTAATGGATGGTATCTGTCCCGAGTAGATGTTTACACCTCCCTACCTA
ATAATGCATCAGTGACAGGGTTTCTCCTCTGGCGGGACGACTCTCGCCTGGGAAAGCCCA
CGTGTAGGGACCTGGGGAGTCAAGCATACCACGGGAAAAT
>Rosalind_3683
GGAGCACTTAGTGGCCTGGCACCACAATCGAACATACTCCCGGTTCAGGCGCTTCTAAGG
GAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAA
CAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTG
AACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAAC
TGGTGGGCTTATTTAATATGAATACACGGGAACGGGCCACGTGTAAACCGTTCAGAGGCA
GGTTCGCTTACTAATTGTTCGATAAGTACAAGCATAAGTGGCCTACGCAGGGTGCTCCAG
AGACCCGGAACATTACTAGTAGGAGCGAGAGGCTATTGACTTCGCCCGCATTAAAACTGC
AAGCGACATTCTATAGGTTGTTATGTATCACCACGGCTATGGCATCCAGAATGTTTCGCG
CAATGTTCTAAAGCACTTATTTGGGAAGTAGAGATGAGCCAGGCCCAAAATACCCCAAAG
AAGCCAGACGTGAATCTCGTTGTCATTGTTAAACGACTTGAGGGGCTAGTATAGCTTATG
GGTGAACAGATCACGAAGCCGCTGGACCTTCCAGATGATGGCATAATGATTTCGAGGCGC
GCCCGAGGGGGTTATTTTGCGGGGGTACACAGTCGGTCGCAGCTCACGAGATCTGATAAA
CCCGCAGAAGCATAAGACTGCCGACGCACGATTCTGCTCGTGCAAACCCTGGCCTACGAC
CATCGTTGCGCCAAGCGACTCGAGATTTGATAGGCAGTCTGATTAATAGCACTCCCCGTT
ATCCCTTCTCAAGGGGCAGGGTGTGGCAAGCGTCACTGAGCCGCATTCGTTAGAAATCGC
CCGTCGCCTAGCGGGATCGATAGCCTGATAGAAATATCTCACAAAGGACCATAATCATCT
AGCCTTTCATGTTGGTTCTAATGTTTCGCTTCGGCCGTCG"""


def read_fasta(fasta_input):
    """Read FASTA formatted input and return a list of DNA strings."""
    sequences = []
    lines = fasta_input.strip().split('\n')
    current_sequence = []

    for line in lines:
        if line.startswith('>'):
            if current_sequence:
                sequences.append(''.join(current_sequence))
                current_sequence = []
        else:
            current_sequence.append(line.strip())

    if current_sequence:
        sequences.append(''.join(current_sequence))

    return sequences

def longest_common_substring(sequences):
    """Find the longest common substring among a list of DNA strings."""
    if not sequences:
        return ""

    # Any common substring must occur in the first sequence, so its
    # substrings are the only candidates that need checking.
    first_sequence = sequences[0]

    # Try candidate lengths from longest to shortest and stop at the first
    # hit, so no time is spent on substrings that cannot improve the answer.
    for length in range(len(first_sequence), 0, -1):
        for start in range(len(first_sequence) - length + 1):
            substring = first_sequence[start:start + length]
            # Check if this substring is in all other sequences
            if all(substring in seq for seq in sequences[1:]):
                return substring

    return ""

# Read the sequences from the FASTA input
sequences = read_fasta(fasta_input)

# Find and print the longest common substring
result = longest_common_substring(sequences)
print(result)

CTAAGGGAACTAGAAATTGAACCTAGATCGGTATGTCGAGAGCGCCGTACAATATGGCTCAAGGAACAGGGTGATGATAATCCTTCACGAAGGACTTCATGCTAGTCATTTCTAACATATGCTGTGAACTGGGCCAACCGTATGTTCGCCATTCTAGGACTTATCGGCAGCGGAGATTCCTCTAACTGGTGGGCTTATTTAATATGAA
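
The search above enumerates substrings of the first sequence one by one. As a rough alternative sketch (not the approach used above), the answer length can be binary-searched: if the strings share a substring of length L, they also share one of length L - 1. The helper names `common_substrings_of_length` and `longest_common_substring_bsearch` are hypothetical, and ties are broken alphabetically here, so a different (equally long) motif may be returned than the one printed above.

```python
def common_substrings_of_length(sequences, length):
    """Return the set of substrings of the given length found in every sequence."""
    shared = None
    for seq in sequences:
        # All substrings of this length in the current sequence
        kmers = {seq[i:i + length] for i in range(len(seq) - length + 1)}
        shared = kmers if shared is None else shared & kmers
        if not shared:
            return set()
    return shared


def longest_common_substring_bsearch(sequences):
    """Binary-search the answer length; return one longest shared substring."""
    if not sequences:
        return ""
    lo, hi = 0, min(len(seq) for seq in sequences)
    best = ""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        shared = common_substrings_of_length(sequences, mid)
        if shared:
            best = min(shared)  # deterministic pick among equally long motifs
            lo = mid
        else:
            hi = mid - 1
    return best


print(longest_common_substring_bsearch(["GATTACA", "TAGACCA", "ATACA"]))
```

On the three short strings in the usage line, "TA", "AC", and "CA" are all shared and no length-3 substring is, so any of the three would be a valid answer.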


#**Open Reading Frames** (ORF)

In “Transcribing DNA into RNA”, we discussed the transcription of DNA into RNA, and in “Translating RNA into Protein”, we examined the translation of RNA into a chain of amino acids for the construction of proteins. We can view these two processes as a single step in which we directly translate a DNA string into a protein string, thus calling for a DNA codon table.
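
As a minimal sketch of what such a DNA codon table might look like, the snippet below builds one from the standard genetic code by enumerating codons in T, C, A, G order. The names `BASES`, `AMINO_ACIDS`, `DNA_CODON_TABLE`, and `translate` are illustrative choices, not part of the problem statement, and the input is assumed to contain only the symbols A, C, G, and T.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each DNA codon (e.g. "ATG") to a one-letter amino acid; "*" marks a stop codon
DNA_CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate dna codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = DNA_CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGGGATGACCTAA"))
```

Running it prints `MGMT`, because translation stops at the `TAA` codon.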

However, three immediate wrinkles of complexity arise when we try to pass directly from DNA to proteins. First, not all DNA will be transcribed into RNA: so-called junk DNA appears to have no practical purpose for cellular function. Second, we can begin translation at any position along a strand of RNA, meaning that any substring of a DNA string can serve as a template for translation, as long as it begins with a start codon, ends with a stop codon, and has no other stop codons in the middle. See Figure 1. As a result, the same RNA string can actually be translated in three different ways, depending on how we group triplets of symbols into codons. For example, ...AUGCUGAC... can be translated as ...AUGCUG..., ...UGCUGA..., and ...GCUGAC..., which will typically produce wildly different protein strings.
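
The three groupings in the example above can be reproduced with a short sketch. The helper name `reading_frames` is hypothetical, and incomplete trailing codons are simply dropped, which is one possible convention.

```python
def reading_frames(s):
    """Return the three ways of grouping s into codons, dropping incomplete trailing codons."""
    frames = []
    for offset in range(3):
        # Start at the given offset and take consecutive, non-overlapping triplets
        codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
        frames.append(codons)
    return frames


for offset, codons in enumerate(reading_frames("AUGCUGAC")):
    print(offset, codons)
```

For "AUGCUGAC" this yields the codon groups AUG CUG, UGC UGA, and GCU GAC, matching the three translations listed in the text.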

Problem
Either strand of a DNA double helix can serve as the coding strand for RNA transcription. Hence, a given DNA string implies six total reading frames, or ways in which the same region of DNA can be translated into amino acids: three reading frames result from reading the string itself, whereas three more result from reading its reverse complement.
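
To make the count of six concrete, here is a small sketch that enumerates the reading frames of a DNA string and of its reverse complement. The function names `reverse_complement` and `six_reading_frames` are assumptions made for illustration, and the input is assumed to use only uppercase A, C, G, and T.

```python
def reverse_complement(dna):
    """Reverse the string, then swap A<->T and C<->G."""
    return dna[::-1].translate(str.maketrans("ACGT", "TGCA"))


def six_reading_frames(dna):
    """Yield (strand_name, offset, sequence) for all six reading frames of dna."""
    strands = (("forward", dna), ("reverse", reverse_complement(dna)))
    for strand_name, strand in strands:
        for offset in range(3):
            # Each frame is the strand read from one of the first three positions
            yield strand_name, offset, strand[offset:]


for strand_name, offset, frame in six_reading_frames("AGCCATGTAG"):
    print(strand_name, offset, frame)
```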

An open reading frame (ORF) is one that starts with a start codon and ends with a stop codon, with no other stop codons in between. Thus, a candidate protein string is derived by translating an open reading frame into amino acids until a stop codon is reached.
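
The definition can be written as a small predicate. This is only a sketch under the assumption that the candidate is a DNA string already cut out of one reading frame; the names `STOP_CODONS` and `is_open_reading_frame` are illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(dna):
    """Return True if dna starts with ATG, ends with a stop codon, and has no stop codon in between."""
    # An ORF needs at least a start and a stop codon, and a whole number of codons
    if len(dna) < 6 or len(dna) % 3 != 0:
        return False
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return (
        codons[0] == "ATG"
        and codons[-1] in STOP_CODONS
        and not any(codon in STOP_CODONS for codon in codons[1:-1])
    )


print(is_open_reading_frame("ATGTAG"))        # start codon directly followed by a stop codon
print(is_open_reading_frame("ATGACCTGATAA"))  # contains an internal stop codon (TGA)
print(is_open_reading_frame("ATGACCGGG"))     # never reaches a stop codon
```

The first call prints `True` (this ORF yields the candidate protein "M"), while the other two print `False`.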

Given: A DNA string s of length at most 1 kbp in FASTA format.

Return: Every distinct candidate protein string that can be translated from ORFs of s. Strings can be returned in any order.

Sample Dataset
>Rosalind_99
AGCCATGTAGCTAACTCAGGTTACATGGGGATGACCCCGCGACTTGGATTAGAGTCTCTTTTGGAATAAGCCTGAATGATCCGAGTAGCATCTCAG
Sample Output
MLLGSFRLIPKETLIQVAGSSPCNLS
M
MGMTPRLGLESLLE
MTPRLGLESLLE

In [27]:
# Sample input in FASTA format
fasta_input = """>Rosalind_0109
TGGATCACCTCGGTATTTTGATTCAGATCCATAATTGATCTCCGCCTCCCGACTTGTAAT
AAGCATATACGTTTGTGCCTCTTATCGAACTGGGCAATACGAGGTAAGGGTTTTGCGTTG
ATACCGACTCGCGACCAGTGGCCTGGTTTCGCGCGCCCTCCACGGCCTTCCACTTTTGGA
AAAAGGGCAAACTAGGAAAATAATGGGCGTTAGGCGTACCATAACCGTGGTCCCATATCA
ACGCAAACATTAAGACCGTGATGCGGCCTGGGTGTGAGTTCCTGCGCCGACCTTGTGCCG
CCGCGCAACCTCATAACTAACGGTGCGCCGACGGGTGTTACGGTACTTCGCAGGATCTTC
GTGGGATGGCGCCTCTATTTACTTCACGCACATACTCTTGCTTTATCCAGCTGCTAGCAT
CTACCGTACACGCCCGAGAACATGTCATAATATACAAATAGTTAGCTAACTATTTGTATA
TTATGACATTAGTTCATAAACCGAATTGATCTTGAGGGTGAGTGCTGGCATAGAGATAAC
AGGGCGGTCAATAGTCGCAGTTGGCGCAAGTAACTTAACCCTAAACGATGCTGAAGCTGG
GTGAGGACAAGCTAAATTCCTGCGACGCTGCGTGCCGCGGTTGTAGGACTAGAGTCTCGG
AATTTACGCCTTTGGAGAACCGCCGAGGAAAGACCGTTCGGTCGTGAATAGAGACTATAG
CCACGTTATGGCAAGTTCTATGATTTGGGCCCACTGGTTCTCTTATCGAGCAGACCCTGA
TTAGTGTCTATAAGTCTATATTCGGTGCGCATGCGTCGCACATGGACTAGACGAAATGGC
CGGCGGGATCCGGAAAGGAGGTAGAGAGTAGTTCTGTATACCGGGGTAATGTCGCCTTGC
AATCGATGAACCAAACACGCGCTTACTCTA"""

def read_fasta(fasta_input):
    """Read FASTA formatted input and return the DNA string."""
    lines = fasta_input.strip().split('\n')
    return ''.join(line.strip() for line in lines[1:])  # Join all lines after the header

def translate_dna_to_protein(dna):
    """Translate a DNA sequence to a protein string."""
    # Codon to amino acid mapping
    codon_table = {
        'ATA': 'I', 'ATC': 'I', 'ATT': 'I', 'ATG': 'M',
        'ACA': 'T', 'ACC': 'T', 'ACG': 'T', 'ACT': 'T',
        'AAC': 'N', 'AAT': 'N', 'AAA': 'K', 'AAG': 'K',
        'AGC': 'S', 'AGT': 'S', 'AGA': 'R', 'AGG': 'R',
        'CTA': 'L', 'CTC': 'L', 'CTG': 'L', 'CTT': 'L',
        'CCA': 'P', 'CCC': 'P', 'CCG': 'P', 'CCT': 'P',
        'CAC': 'H', 'CAT': 'H', 'CAA': 'Q', 'CAG': 'Q',
        'CGA': 'R', 'CGC': 'R', 'CGG': 'R', 'CGT': 'R',
        'GTA': 'V', 'GTC': 'V', 'GTG': 'V', 'GTT': 'V',
        'GCA': 'A', 'GCC': 'A', 'GCG': 'A', 'GCT': 'A',
        'GAC': 'D', 'GAT': 'D', 'GAA': 'E', 'GAG': 'E',
        'GGA': 'G', 'GGC': 'G', 'GGG': 'G', 'GGT': 'G',
        'TCA': 'S', 'TCC': 'S', 'TCG': 'S', 'TCT': 'S',
        'TTC': 'F', 'TTT': 'F', 'TTA': 'L', 'TTG': 'L',
        'TAC': 'Y', 'TAT': 'Y', 'TAA': '',   'TAG': '',   # Stop codons
        'TGC': 'C', 'TGT': 'C', 'TGA': '',   'TGG': 'W',
    }

    protein = []

    # Iterate through the DNA sequence in steps of 3 (codons)
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon in codon_table:
            amino_acid = codon_table[codon]
            if amino_acid:  # Non-empty entry means a regular amino acid
                protein.append(amino_acid)
            else:
                break  # Stop translation at stop codon

    return ''.join(protein)

def find_orfs(dna):
    """Find distinct candidate protein strings from ORFs on the given strand only; the reverse complement is not searched."""
    proteins = set()
    n = len(dna)

    # Search for start codons (ATG)
    for i in range(n - 2):
        if dna[i:i + 3] == 'ATG':  # Start codon
            # Check for stop codons
            for j in range(i, n - 2, 3):
                codon = dna[j:j + 3]
                if codon in ['TAA', 'TAG', 'TGA']:  # Stop codons
                    orf = dna[i:j + 3]  # Extract the ORF
                    protein = translate_dna_to_protein(orf)
                    if protein:  # Only add non-empty proteins
                        proteins.add(protein)
                    break  # Stop searching after the first stop codon
    return proteins  # Return the set of distinct proteins


# Read the DNA string from the FASTA input
dna_string = read_fasta(fasta_input)

# Find all distinct candidate protein strings from ORFs
proteins = find_orfs(dna_string)

# Print the results
for protein in proteins:
    print(protein)

MD
MIWAHWFSYRADPD
MRPGCEFLRRPCAAAQPHN
MLKLGEDKLNSCDAACRGCRTRVSEFTPLENRRGKTVRS
MTLVHKPN
MGVRRTITVVPYQRKH
MRRTWTRRNGRRDPERR
MSPCNR
MS
MAGGIRKGGRE
MAPLFTSRTYSCFIQLLASTVHAREHVIIYK
MASSMIWAHWFSYRADPD


In [40]:
# Sample input in FASTA format
fasta_input = """>Rosalind_99
AGCCATGTAGCTAACTCAGGTTACATGGGGATGACCCCGCGACTTGGATTAGAGTCTCTTTTGGAATAAGCCTGAATGATCCGAGTAGCATCTCAG"""

def read_fasta(fasta_input):
    """Read FASTA formatted input and return the DNA string."""
    lines = fasta_input.strip().split('\n')
    return ''.join(line.strip() for line in lines[1:])  # Join all lines after the header

def translate_dna_to_protein(dna):
    """Translate a DNA sequence to a protein string."""
    # Codon to amino acid mapping
    codon_table = {
        'ATA': 'I', 'ATC': 'I', 'ATT': 'I', 'ATG': 'M',
        'ACA': 'T', 'ACC': 'T', 'ACG': 'T', 'ACT': 'T',
        'AAC': 'N', 'AAT': 'N', 'AAA': 'K', 'AAG': 'K',
        'AGC': 'S', 'AGT': 'S', 'AGA': 'R', 'AGG': 'R',
        'CTA': 'L', 'CTC': 'L', 'CTG': 'L', 'CTT': 'L',
        'CCA': 'P', 'CCC': 'P', 'CCG': 'P', 'CCT': 'P',
        'CAC': 'H', 'CAT': 'H', 'CAA': 'Q', 'CAG': 'Q',
        'CGA': 'R', 'CGC': 'R', 'CGG': 'R', 'CGT': 'R',
        'GTA': 'V', 'GTC': 'V', 'GTG': 'V', 'GTT': 'V',
        'GCA': 'A', 'GCC': 'A', 'GCG': 'A', 'GCT': 'A',
        'GAC': 'D', 'GAT': 'D', 'GAA': 'E', 'GAG': 'E',
        'GGA': 'G', 'GGC': 'G', 'GGG': 'G', 'GGT': 'G',
        'TCA': 'S', 'TCC': 'S', 'TCG': 'S', 'TCT': 'S',
        'TTC': 'F', 'TTT': 'F', 'TTA': 'L', 'TTG': 'L',
        'TAC': 'Y', 'TAT': 'Y', 'TAA': '',   'TAG': '',   # Stop codons
        'TGC': 'C', 'TGT': 'C', 'TGA': '',   'TGG': 'W',
    }

    protein = []

    # Iterate through the DNA sequence in steps of 3 (codons)
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon in codon_table:
            amino_acid = codon_table[codon]
            if amino_acid:  # Non-empty entry means a regular amino acid
                protein.append(amino_acid)
            else:
                break  # Stop translation at stop codon

    return ''.join(protein)

def find_orfs(dna):
    """Find all distinct candidate protein strings from ORFs on both strands of the DNA string."""
    stop_codons = {'TAA', 'TAG', 'TGA'}
    reverse_dna = dna[::-1].translate(str.maketrans('ATGC', 'TACG'))
    proteins = set()

    # Scan the string itself and its reverse complement (six reading frames in total)
    for strand in (dna, reverse_dna):
        n = len(strand)
        for i in range(n - 2):
            if strand[i:i + 3] != 'ATG':  # Only a start codon can open an ORF
                continue
            # Walk codon by codon from the start codon to the first in-frame stop codon
            for j in range(i, n - 2, 3):
                if strand[j:j + 3] in stop_codons:
                    orf = strand[i:j + 3]  # Extract the ORF
                    proteins.add(translate_dna_to_protein(orf))
                    break  # Stop searching after the first stop codon

    return proteins  # Return the set of distinct proteins

# Read the DNA string from the FASTA input
dna_string = read_fasta(fasta_input)

# Find all distinct candidate protein strings
proteins = find_orfs(dna_string)

# Print the results
for protein in proteins:
    print(protein)

MGMTPRLGLESLLE
MLLGSFRLIPKETLIQVAGSSPCNLS
MTPRLGLESLLE
M


In [39]:
# Sample input in FASTA format
fasta_input = """>Rosalind_4682
CACTAGTGTGCCCAAAGGTTGCCAGACGGTATCCTACGGTGTCTTGGGTTAGGCCTGAAA
ACTAAAGCGGCTCCGAGATAACCGTCGCGCGTTCAGGCCCGATATTAAATCCAAGGATCG
CGTCCACGGCTACAGCCATAACTGAGGGTCTGGCCGGGCTTTTTTATTTCCTTCGTCCGA
TAGTAACCCCTTTCGGAGCGAAATGGCCAAATTGATTATTCGGGCTCACTTGAAATCAGT
GTACTACTGCTGAGGTTACGGGACTGGCGTTAATCGGCGGGGCCGGCTAGCAGGTTCTTG
AAAGAAGCTCCTGCGTACTATATTTGGTATGTATCATGACATAGGGATGATACCGTCCCC
GGTCAACATAGGCGACATCGTAGTTTATTTACGGTCGGCAGGTCGGCTTGGATGCCCCAA
TGACCATCAATTGAAGGTCCGATCGTATATAGCTATATACGATCGGACCTTCAATTGATG
GTCATCTGTTTTATAACCTACGGCGGTACCCCATTCGATCTGCCAAACCGTAGCCATGTC
CGTATGCTGAAACATGGTATACAGGATTTGGGACAAGCCCCCCAGCCCATCCGTGGTGCT
CCGGGTGCGAACCTCAGGATGCGATTTGACAATACCGGCACTAAAGCCACGGATAGCCGT
CTCCGACCTCGAGGCTTGGGCACCAAGCTGAGCCCATGCTTTCTATGTCGTTTCATTCGA
ACCGAGCAGAGGGTTCAGTGGCTGTAGAACGCAACGGTTGAGTAAAGGTCTTGACTTAGA
ATCGTATCTCCAACTAATGATGTTATCCCGAAGGCCCCCACCGCAGCGCTTATCAACGGA
TCAACCCGACATCACAGAAGATAACGCGTGTAGGCCAGCCAGAACTACGATTCTGACTTT
CCGT"""

def read_fasta(fasta_input):
    """Read FASTA formatted input and return the DNA string."""
    lines = fasta_input.strip().split('\n')
    return ''.join(line.strip() for line in lines[1:])  # Join all lines after the header

def translate_dna_to_protein(dna):
    """Translate a DNA sequence to a protein string."""
    # Codon to amino acid mapping
    codon_table = {
        'ATA': 'I', 'ATC': 'I', 'ATT': 'I', 'ATG': 'M',
        'ACA': 'T', 'ACC': 'T', 'ACG': 'T', 'ACT': 'T',
        'AAC': 'N', 'AAT': 'N', 'AAA': 'K', 'AAG': 'K',
        'AGC': 'S', 'AGT': 'S', 'AGA': 'R', 'AGG': 'R',
        'CTA': 'L', 'CTC': 'L', 'CTG': 'L', 'CTT': 'L',
        'CCA': 'P', 'CCC': 'P', 'CCG': 'P', 'CCT': 'P',
        'CAC': 'H', 'CAT': 'H', 'CAA': 'Q', 'CAG': 'Q',
        'CGA': 'R', 'CGC': 'R', 'CGG': 'R', 'CGT': 'R',
        'GTA': 'V', 'GTC': 'V', 'GTG': 'V', 'GTT': 'V',
        'GCA': 'A', 'GCC': 'A', 'GCG': 'A', 'GCT': 'A',
        'GAC': 'D', 'GAT': 'D', 'GAA': 'E', 'GAG': 'E',
        'GGA': 'G', 'GGC': 'G', 'GGG': 'G', 'GGT': 'G',
        'TCA': 'S', 'TCC': 'S', 'TCG': 'S', 'TCT': 'S',
        'TTC': 'F', 'TTT': 'F', 'TTA': 'L', 'TTG': 'L',
        'TAC': 'Y', 'TAT': 'Y', 'TAA': '',   'TAG': '',   # Stop codons
        'TGC': 'C', 'TGT': 'C', 'TGA': '',   'TGG': 'W',
    }

    protein = []

    # Iterate through the DNA sequence in steps of 3 (codons)
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        if codon in codon_table:
            amino_acid = codon_table[codon]
            if amino_acid:  # Non-empty entry means a regular amino acid
                protein.append(amino_acid)
            else:
                break  # Stop translation at stop codon

    return ''.join(protein)

def find_orfs(dna):
    """Find distinct candidate protein strings from ORFs on the given strand only; the reverse complement is not searched."""
    proteins = set()
    n = len(dna)

    # Search for start codons (ATG)
    for i in range(n - 2):
        if dna[i:i + 3] == 'ATG':  # Start codon
            # Check for stop codons
            for j in range(i, n - 2, 3):
                codon = dna[j:j + 3]
                if codon in ['TAA', 'TAG', 'TGA']:  # Stop codons
                    orf = dna[i:j + 3]  # Extract the ORF
                    protein = translate_dna_to_protein(orf)
                    if protein:  # Only add non-empty proteins
                        proteins.add(protein)
                    break  # Stop searching after the first stop codon
    return proteins  # Return the set of distinct proteins

# Replace the dataset above with the sample input so the result can be compared with the Sample Output
fasta_input = """>Rosalind_99
AGCCATGTAGCTAACTCAGGTTACATGGGGATGACCCCGCGACTTGGATTAGAGTCTCTTTTGGAATAAGCCTGAATGATCCGAGTAGCATCTCAG"""

# Read the DNA string from the FASTA input
dna_string = read_fasta(fasta_input)

# Find all distinct candidate protein strings
proteins = find_orfs(dna_string)

# Print the results
for protein in proteins:
    print(protein)


MGMTPRLGLESLLE
MTPRLGLESLLE
M


In [33]:

# Sample dataset
dna_sequence = "AGCCATGTAGCTAACTCAGGTTACATGGGGATGACCCCGCGACTTGGATTAGAGTCTCTTTTGGAATAAGCCTGAATGATCCGAGTAGCATCTCAG"
# Define the genetic code for translation
genetic_code = {
    'ATA': 'I', 'ATC': 'I', 'ATT': 'I', 'ATG': 'M',
    'ACA': 'T', 'ACC': 'T', 'ACG': 'T', 'ACT': 'T',
    'AAC': 'N', 'AAT': 'N', 'AAA': 'K', 'AAG': 'K',
    'AGC': 'S', 'AGT': 'S', 'AGA': 'R', 'AGG': 'R',
    'CTA': 'L', 'CTC': 'L', 'CTG': 'L', 'CTT': 'L',
    'CCA': 'P', 'CCC': 'P', 'CCG': 'P', 'CCT': 'P',
    'CAC': 'H', 'CAT': 'H', 'CAA': 'Q', 'CAG': 'Q',
    'CGA': 'R', 'CGC': 'R', 'CGG': 'R', 'CGT': 'R',
    'GTA': 'V', 'GTC': 'V', 'GTG': 'V', 'GTT': 'V',
    'GCA': 'A', 'GCC': 'A', 'GCG': 'A', 'GCT': 'A',
    'GAC': 'D', 'GAT': 'D', 'GAA': 'E', 'GAG': 'E',
    'GGA': 'G', 'GGC': 'G', 'GGG': 'G', 'GGT': 'G',
    'TCA': 'S', 'TCC': 'S', 'TCG': 'S', 'TCT': 'S',
    'TTC': 'F', 'TTT': 'F', 'TTA': 'L', 'TTG': 'L',
    'TAC': 'Y', 'TAT': 'Y', 'TAA': '*', 'TAG': '*',
    'TGC': 'C', 'TGT': 'C', 'TGA': '*', 'TGG': 'W',
}

def translate_dna_to_protein(dna_sequence):
    protein = []
    # Iterate over the DNA sequence in steps of 3 (codons)
    for i in range(0, len(dna_sequence), 3):
        codon = dna_sequence[i:i+3]
        if len(codon) == 3:  # Ensure we have a full codon
            amino_acid = genetic_code.get(codon, '')
            if amino_acid == '*':  # Stop codon
                break
            protein.append(amino_acid)
    return ''.join(protein)

# Translate the DNA sequence to protein
protein_sequence = translate_dna_to_protein(dna_sequence)

# Print the protein sequence
print(protein_sequence)

# Slices of the frame-0 translation, annotated with the values they actually print.
# Only ORFs that start in frame 0 show up here, as the suffixes
# protein_sequence[8:] and protein_sequence[10:].
print(protein_sequence[0:15])  # SHVANSGYMGMTPRL
print(protein_sequence[0])      # S
print(protein_sequence[1:15])   # HVANSGYMGMTPRL
print(protein_sequence[2:15])   # VANSGYMGMTPRL

SHVANSGYMGMTPRLGLESLLE
SHVANSGYMGMTPRL
S
HVANSGYMGMTPRL
VANSGYMGMTPRL


