# ILLUSTRATING PYTHON VIA BIOINFORMATICS
To illustrate the use of Python in bioinformatics, one should show how it can be used to analyze and manipulate biological data, such as DNA sequences. 
For example, Python libraries such as Biopython can perform tasks such as sequence alignment, gene annotation, and phylogenetic analysis. 
Python can also be used in conjunction with other bioinformatics tools, such as BLAST, to automate and streamline bioinformatics workflows. Another use of Python in bioinformatics is data visualization, which is supported by libraries such as matplotlib, seaborn, and plotly.

## Table of Contents
1. ### Basics
    1.1 Python DNA Strings Manipulation  
    1.2 Python Symmetrical or Palindrome string Confirmation  
2. ### Python Libraries For Bioinformatics
    2.1 Gene Annotation  
        2.1.1 Biopython  
        2.1.2 Gffutils  
        2.1.3 HTSeq  
        2.1.4 Anvi'o  
    2.2 Sequence Alignment  
    2.3 Phylogenetic Analysis [opt]


In [2]:
from Bio import pairwise2

seq1 = "ACTGACG"
seq2 = "ACTAGCAG"

alignments = pairwise2.align.globalxx(seq1, seq2)

# print the best alignment
print(pairwise2.format_alignment(*alignments[0]))


ACTGA-C-G
||| | | |
ACT-AGCAG
  Score=6
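
The `globalxx` scoring scheme gives 1 point per identical character and applies no penalty for mismatches or gaps, so the best score equals the length of the longest common subsequence of the two sequences. The sketch below checks that with plain Python; the helper name `lcs_length` is an illustrative assumption and is not part of Biopython.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    # table[i][j] holds the LCS length of a[:i] and b[:j]
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, char_a in enumerate(a, start=1):
        for j, char_b in enumerate(b, start=1):
            if char_a == char_b:
                # a match extends the best alignment of the two shorter prefixes
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # otherwise skip one character from either sequence (a gap)
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


print(lcs_length("ACTGACG", "ACTAGCAG"))  # 6, the same value as Score=6 above
```

This only reproduces the score, not the alignment itself; recovering the aligned strings would need a traceback through the table.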





### Counting Letters in a DNA String
One way to count the number of occurrences of each letter in a DNA string is to use the collections module, specifically the Counter class. Here's an example of how you might use it:

In [1]:
from collections import Counter

dna_string = input("Enter DNA String: ")
#for example - AGCTAGCTAGCTAGCT

counts = Counter(dna_string) 
#The output will be the count of each letter in the DNA string, in the form of a dictionary
print(counts)

Counter({'A': 4, 'G': 4, 'C': 4, 'T': 4})


Another way is to use Python's built-in string method `count()`:

In [2]:
dna_string = "AGCTAGCTAGCTAGCT"
print("A:",dna_string.count("A")) 
#This can be repeated for each individual letter in the DNA String

A: 4
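
Repeating the `count()` call for each base can be written compactly with a dictionary comprehension. This is a minimal sketch that assumes the sequence contains only the uppercase letters A, C, G, and T; the name `base_counts` is illustrative.

```python
dna_string = "AGCTAGCTAGCTAGCT"

# One str.count() call per base, collected into a dictionary
base_counts = {base: dna_string.count(base) for base in "ACGT"}

print(base_counts)  # {'A': 4, 'C': 4, 'G': 4, 'T': 4}
```

Each `count()` call scans the whole string, so this makes four passes over the sequence.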


We can also write our own function in which a for loop iterates over each letter in the string. Within the loop, the print() function prints the current letter.
Method One below ("dna_count") simply prints every letter and counts them all. Method Two tests whether the current letter equals the desired base and, if so, increases the counter. 
A Python string can be iterated over directly, so it does not need to be converted to a list first.

In [12]:
#Method One:
#
# When the function is called and passed a DNA string, it prints out each letter in the string 
# and returns the total number of letters; Counter then gives the frequency of each letter
#
from collections import Counter

def dna_count(dna_string):
    i = 0
    for letter in dna_string:
        print(letter)
        i += 1
    return i

dna_string = input("Enter DNA String: ")
#for example - ATGCGGACCTAT
dna_count(dna_string)
print(Counter(dna_string))

A
T
G
C
G
G
A
C
C
T
A
T
Counter({'A': 3, 'T': 3, 'G': 3, 'C': 3})


# Method Two
This is a Python function called count() that takes two inputs: a DNA string and a base (A, T, C, or G).

The function uses a for loop to iterate over each letter in the input DNA string. Inside the loop, an if statement checks whether the current letter is equal to the input base. If it is, the variable i is incremented by 1.

The function ends with a return statement that returns the value of i, which is the number of times the base is found in the DNA string.

When the function is called with a DNA string and a base, it searches the DNA string for the base and returns the total number of times the base is found.

In [23]:

def count(dna, base):
    i = 0
    for letter in dna:
        if letter == base:
            i += 1
    return i  # number of times base occurs in dna

#Take for example
dna = 'ATGCGGACCTAT'
base = input("Enter a DNA Letter: ")
n = count(dna, base)
print(n)

# Adding a Check
We can also add a check to make sure that the base passed is valid, i.e. one of the four letters of DNA. If the base passed is not valid, the function returns an error message instead of a count.

In [None]:
def count(dna, base):
    if base not in ['A', 'T', 'C', 'G']:
        return "Invalid base, please enter A, T, C, or G"
    i=0
    for letter in dna:
        if letter == base:
            i+=1
    return i
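
Returning an error string means the caller gets either a number or a message from the same function and has to tell them apart. One common alternative, sketched here as a design option rather than the original approach, is to raise `ValueError` for an invalid base. The names `count_strict` and `VALID_BASES` are hypothetical and introduced only for this example.

```python
VALID_BASES = ("A", "T", "C", "G")


def count_strict(dna, base):
    """Count occurrences of base in dna; raise ValueError for an invalid base."""
    if base not in VALID_BASES:
        raise ValueError(f"Invalid base {base!r}, please enter A, T, C, or G")
    total = 0
    for letter in dna:
        if letter == base:
            total += 1
    return total


print(count_strict("ATGCGGACCTAT", "G"))  # 3

try:
    count_strict("ATGCGGACCTAT", "X")
except ValueError as error:
    # The invalid base is reported without returning a misleading value
    print(error)
```

With this variant the return value is always an integer, and invalid input is handled in one place by the caller.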

Both methods can be used to count the letters in DNA strings, but the collections.Counter method is more concise and efficient for longer DNA strings.

In [8]:
string = input("Enter a string: ")
half = len(string) // 2

if len(string) % 2 == 0: #even
    first_str = string[:half]
    second_str = string[half:]
else: #odd: skip the middle character
    first_str = string[:half]
    second_str = string[half+1:]

#symmetric
if first_str == second_str:
    print(string,'String is symmetrical')
else:
    print(string,'String is not symmetrical')

#Palindrome
if first_str == second_str[::-1]:
    print(string,'String is Palindrome')
else:
    print(string, 'String is not Palindrome')

amaama String is symmetrical
amaama String is Palindrome


In [13]:
string = input("Enter String: ")
rev_str = string[::-1] #Reverse the string
if string == rev_str:
    print(string, 'is equal to', rev_str)
else:
    print('Not Palindrome')

racecar is equal to racecar
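
The two checks above can also be wrapped in small reusable functions that return a boolean instead of printing. This is a sketch under the same definitions used in the cells above (symmetrical: the first half equals the second half, ignoring a middle character; palindrome: the string equals its reverse). The function names are illustrative.

```python
def is_symmetrical(text):
    """True if the first half of text equals the second half (middle character ignored)."""
    half = len(text) // 2
    # text[len(text) - half:] is the last `half` characters, for even and odd lengths
    return text[:half] == text[len(text) - half:]


def is_palindrome(text):
    """True if text reads the same forwards and backwards."""
    return text == text[::-1]


for word in ("amaama", "racecar", "ATGC"):
    print(word, is_symmetrical(word), is_palindrome(word))
```

Returning booleans keeps the checks testable and lets the caller decide what to print.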


#### Function to reverse words in a string
This function will take a string as an input, reverse the order of the words in the string and return the reversed string.

In [6]:
def rev_words(string):
    words = string.split(' ') 
    rev_string = ' '.join(reversed(words))
    return rev_string
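
A short usage example for `rev_words`, with the function repeated so the block runs on its own. The sample sentences are illustrative. Because the function splits on a single space, runs of several spaces are preserved in the result.

```python
def rev_words(string):
    words = string.split(' ')
    rev_string = ' '.join(reversed(words))
    return rev_string


print(rev_words("DNA makes RNA makes protein"))  # protein makes RNA makes DNA

# split(' ') keeps the empty strings between consecutive spaces,
# so the double space survives the reversal
print(repr(rev_words("a  b")))  # 'b  a'
```

If extra whitespace should be collapsed instead, calling `string.split()` with no argument would be one option.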