# String Manipulation

Strings come with plenty of built-in methods for inspecting and changing them.

## Count Elements of a String

Taking a DNA sequence as an example, you can count how many times each base occurs in the chain. The sequence below contains 3 A, 5 C, 2 G and 4 T.

In [1]:
# Initialise variables
string = 'A C T A G C T G C T A C T C' # 3A, 5C, 2G, 4T
base = ['A', 'C', 'G', 'T']
counts = []
pcages = []
total = 0

# Loop over the `base` list to build the `counts` list
for i in base:
    counts.append(string.count(i))

# Loop over the `counts` list to get the total number of bases
for j in counts:
    total += j

# Loop over the `counts` list again to get each base's share of the total
# (a fraction between 0 and 1; multiply by 100 for a percentage)
for k in counts:
    pcages.append(k / total)

print(counts, total, pcages, sep = '\n')

[3, 5, 2, 4]
14
[0.21428571428571427, 0.35714285714285715, 0.14285714285714285, 0.2857142857142857]
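
The same tally can be sketched more compactly with `collections.Counter` from the standard library. This is an alternative to the loops above rather than a replacement for them, and the names `sequence`, `tally` and `fractions` are made up for this illustration.

```python
from collections import Counter

# Same sequence as above, with the separating spaces removed
sequence = 'A C T A G C T G C T A C T C'.replace(' ', '')

# Counter tallies every character in a single pass
tally = Counter(sequence)

# Report the bases in a fixed order so the result lines up with the lists above
bases = ['A', 'C', 'G', 'T']
counts = [tally[b] for b in bases]
total = sum(counts)
fractions = [c / total for c in counts]

print(counts, total, fractions, sep='\n')
```

Stripping the spaces first means `total` could also be taken as `len(sequence)`, since every remaining character is a base.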


## Replace Characters in String

There are a couple of ways of doing this. The `maketrans` method builds a translation table that `translate` then applies, which is handy for one-to-one character swaps.

In [2]:
# Initialise the string
string = 'C A T G G T A T A C C A T G'
# base = ['A', 'C', 'G', 'T']
# comp = ['T', 'G', 'C', 'A']

# Use `str.maketrans()` to map each base to its complement (A <-> T, C <-> G)
myTable = str.maketrans('ACGT', 'TGCA')
print(string.translate(myTable))

G T A C C A T A T G G T A C


You can also do it by specifying a dictionary, since that's essentially all `myTable` is in the example above: a mapping from each character's Unicode code point to the code point of its replacement.
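
As a rough sketch of the dictionary form: `str.maketrans` also accepts a single dictionary whose keys are single characters (or code points) and whose values are the replacements. The names `complement`, `table` and `result` are assumptions for this illustration.

```python
# Map each base to its complement with a plain dictionary
complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A'}

# With one dict argument, maketrans converts the character keys to code points
table = str.maketrans(complement)

sequence = 'C A T G G T A T A C C A T G'
result = sequence.translate(table)
print(result)
```

Characters with no entry in the table, such as the spaces here, pass through `translate` unchanged.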

Below, we swap the characters for Old Persian Cuneiform characters using their Unicode values.

In [6]:
# Initialise the string `alphabet` and make translation dictionary
alphabet = 'A B C D E F G H I J K L M N O P Q R S T U V W X Y Z'
myTable = alphabet.maketrans('A B C D E F G H I J K L M N O P Q R S T U V W X Y Z', '\U000103A0 \U000103A1 \U000103A2 \U000103A3 \U000103A4 \U000103A5 \U000103A6 \U000103A7 \U000103A8 \U000103A9 \U000103AA \U000103AB \U000103AC \U000103AD \U000103AE \U000103AF \U000103B0 \U000103B1 \U000103B2 \U000103B3 \U000103B4 \U000103B5 \U000103B6 \U000103B7 \U000103B8 \U000103B9')

# Print out the dictionary
print('myTable =', myTable)

# Add blank line
print('')

# Perform translation
print(alphabet, '\n', alphabet.translate(myTable))

myTable = {65: 66464, 32: 32, 66: 66465, 67: 66466, 68: 66467, 69: 66468, 70: 66469, 71: 66470, 72: 66471, 73: 66472, 74: 66473, 75: 66474, 76: 66475, 77: 66476, 78: 66477, 79: 66478, 80: 66479, 81: 66480, 82: 66481, 83: 66482, 84: 66483, 85: 66484, 86: 66485, 87: 66486, 88: 66487, 89: 66488, 90: 66489}

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
 𐎠 𐎡 𐎢 𐎣 𐎤 𐎥 𐎦 𐎧 𐎨 𐎩 𐎪 𐎫 𐎬 𐎭 𐎮 𐎯 𐎰 𐎱 𐎲 𐎳 𐎴 𐎵 𐎶 𐎷 𐎸 𐎹
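
The 26 target characters sit on consecutive code points starting at U+103A0 (the printed table shows 65 mapping to 66464, 66 to 66465, and so on), so the long literal could also be generated rather than typed out. A minimal sketch, with `FIRST_TARGET`, `letters`, `table` and `translated` as names invented for this illustration:

```python
# First target code point used in the example above (0x103A0 == 66464)
FIRST_TARGET = 0x103A0

letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

# Pair each capital letter's code point with the matching target code point
table = {ord(letter): FIRST_TARGET + i for i, letter in enumerate(letters)}

# Rebuild the space-separated alphabet and translate it
alphabet = ' '.join(letters)
translated = alphabet.translate(table)
print(alphabet, translated, sep='\n')
```

Unlike the table printed above, this one has no entry for the space character (code point 32). That makes no difference to the result, because `translate` leaves unmapped characters alone.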
