# Bioinformatics with Python

Examples taken from: http://hplgit.github.io/bioinf-py/doc/pub/html/main_bioinf.html. Also available from: http://hplgit.github.io/bioinf-py/doc/pub/bioinf-py.html.

## Counting Letters in DNA Strings


### List Iteration

In [1]:
rna = list('AUGC')
rna

['A', 'U', 'G', 'C']

In [2]:
for nucleobase in rna:
    print(nucleobase)

A
U
G
C


### String Iteration


In [3]:
def count_v1(dna, base):
    """
    Count how many times a given nucleobase occurs in a DNA chain.

    DNA contains adenine (A), cytosine (C), guanine (G), and thymine (T);
    uracil (U) replaces thymine in RNA.

    :param dna: DNA chain as a string
    :param base: single-letter code of the nucleobase of interest
    :return: number of times the specified base occurs in the DNA chain
    """

    # First convert string to list of letters
    dna = list(dna)
    # Initialise counter
    counter = 0            
    for nucleobase in dna:
        if nucleobase == base:
            counter += 1
    return counter
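
A Python string is itself iterable, so the list conversion in `count_v1` is not strictly required. The following is a minimal sketch, assuming a hypothetical function name `count_v2` and a hypothetical variable `sample_dna` (neither is defined elsewhere in this notebook). It iterates over the string directly and compares the result with the built-in `str.count`:

```python
def count_v2(dna, base):
    """Count occurrences of base in dna by iterating over the string itself."""
    counter = 0
    for nucleobase in dna:  # a str yields one character per iteration
        if nucleobase == base:
            counter += 1
    return counter


sample_dna = 'ATGCGGACCTATCC'  # illustrative input, same chain as the example below
# For a single-character base, str.count should give the same answer
print(count_v2(sample_dna, 'C') == sample_dna.count('C'))
```

Skipping the conversion avoids building a second copy of the sequence in memory, which may matter for long chains.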

In [4]:
# Create a Nucleobase dictionary
nucleobases_dict = {
    "A": "Adenine",
    "C": "Cytosine",
    "G": "Guanine",
    "T": "Thymine",
    # "U": "Uracil",  # found in RNA, not DNA, so left out here
} 

# Display the dictionary
print(nucleobases_dict)

{'A': 'Adenine', 'C': 'Cytosine', 'G': 'Guanine', 'T': 'Thymine'}
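
Since the `"U"` entry is commented out, indexing the dictionary with `'U'` would raise a `KeyError`. One possible way to handle unknown symbols is sketched below, assuming a hypothetical helper `base_name` and a placeholder default string (both are illustrative, not part of the original notebook):

```python
# Same mapping as the cell above, repeated so the sketch is self-contained
nucleobases_dict = {
    "A": "Adenine",
    "C": "Cytosine",
    "G": "Guanine",
    "T": "Thymine",
}


def base_name(symbol, default="Unknown base"):
    """Return the full name for a one-letter code, or default if it is missing."""
    # dict.get returns the default instead of raising KeyError
    return nucleobases_dict.get(symbol.upper(), default)


print(base_name("g"))  # lower-case input is normalised before the lookup
print(base_name("U"))  # not in the DNA dictionary, so the default is returned
```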


### First example

In [5]:
# Count how many times cytosine occurs in the following chain.
dna = 'ATGCGGACCTATCC'
base = 'C'
n = count_v1(dna, base)

print(f'{nucleobases_dict[base]} appears {n} times in {dna}.')

Cytosine appears 5 times in ATGCGGACCTATCC.
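
Counting one base at a time requires a separate pass over the chain for each base. As a sketch of an alternative (swapping in the standard library's `collections.Counter` rather than the loop used above, with `frequencies` as an illustrative variable name), all bases can be tallied in a single pass:

```python
from collections import Counter

dna = 'ATGCGGACCTATCC'

# Counter iterates over the string once and tallies every character it sees
frequencies = Counter(dna)

for base in 'ACGT':
    # A Counter returns 0 for missing keys, so absent bases are safe to look up
    print(f'{base}: {frequencies[base]}')
```

The per-base counts should add up to the length of the chain, which gives a quick sanity check on the result.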
