## Plotting in Python

Matplotlib is a Python 2D plotting library that produces publication-quality figures.

You can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, etc., with just a few lines of code.

Check the link for more details: http://matplotlib.org/


-----------------------------------------------------------------------------------------

Use the following line to display plots inline, as part of the Jupyter notebook:

    %matplotlib inline

In [None]:
# The following line allows plots to be displayed as part of the Jupyter notebook
%matplotlib inline

# Import matplotlib's pyplot module to make simple plots
import matplotlib.pyplot as plt # Alias the module as plt to avoid typing the long name
plt.plot([1,2,3,4],[2,4,6,12]) 
plt.show()

### Plotting

Add a second line to the plot. Try to use different line colors, markers and linestyles.

Check the link for plotting: http://matplotlib.org/users/pyplot_tutorial.html
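
As a small sketch of one way to vary the style, `plot` accepts either keyword arguments or a short format string that combines a color, a marker, and a line style. The data values and labels below are made up for illustration.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]

# Keyword arguments spell out each style property
line_kw, = plt.plot(x, [2, 4, 6, 12], color='b', marker='s', linestyle='-', label='keywords')

# Format string shorthand: 'r' = red, 'o' = circle markers, '--' = dashed line
line_fmt, = plt.plot(x, [1, 4, 9, 16], 'ro--', label='format string')

plt.legend()
plt.show()
```

Both forms set the same properties; the keyword form is usually easier to read when several options are combined.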

In [None]:
# Import matplotlib's pyplot module to make simple plots

import matplotlib.pyplot as plt

# Plot two lines; the second uses a custom marker, linestyle, and color
plt.plot([1,2,3,4],[2,4,6,12],label='Series1') 
plt.plot([1,2,3,4],[1,4,9,16],label='Series2', marker='o', linestyle='--', color='r')
plt.legend()

# Add Label for the X axis 
plt.xlabel('X') 

# Add Label for the Y axis 
plt.ylabel('Y') 

# Add Title to the plot 
plt.title('Test Plot') 

plt.show()

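## Line Plots: Exercise

Plot the sine and cosine of x on the same axes for x from 0 to 5.9 in steps of 0.1, using different markers, linestyles, and colors for the two curves.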



In [None]:
# Import matplotlib's pyplot module to make simple plots
# Import the math module to access the sine and cosine functions

import matplotlib.pyplot as plt
import math

x=[] # Empty list to store x axis values
cos_y=[] # Empty list to store cosine values
sin_y=[] # Empty list to store sine values

# Evaluate cosine and sine at x = 0.0, 0.1, ..., 5.9
for i in range(60):
    x.append(0.1*i)
    cos_y.append(math.cos(0.1*i))
    sin_y.append(math.sin(0.1*i))

plt.plot(x, cos_y, label='cos', marker='s', color='b') 
plt.plot(x, sin_y, label='sin', marker='o', linestyle='--', color='r') 
plt.legend() 

plt.xlabel('X') 
plt.ylabel('Y') 
plt.title('Sin-Cos Plot') 

plt.show()

## Histograms

To make a histogram, let us first create some normally distributed data.

To create the histogram, use the hist() function of matplotlib.pyplot.
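
To illustrate what a histogram tallies, the sketch below bins values by hand into equal-width bins between the minimum and maximum. The helper name `bin_counts` is hypothetical and is not part of matplotlib. By default, `hist()` uses 10 equal-width bins, so the counts here should roughly match the bar heights it draws for the same data.

```python
import random

def bin_counts(values, n_bins=10):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    if width == 0:
        # All values are identical: place everything in the first bin
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value is assigned to the last bin instead of a new one
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

random.seed(0)  # fixed seed so the sketch is repeatable
data = [random.normalvariate(10, 3) for _ in range(500)]
counts = bin_counts(data)
print(counts)
```

For normally distributed data, the middle bins should hold the largest counts and the outer bins the smallest.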

In [None]:
import matplotlib.pyplot as plt
import random 

# Create an empty list to store data
data=[]

# Generate 500 values from a normal distribution with mean 10 and standard deviation 3
for i in range(500):
    data.append(random.normalvariate(10, 3))

#Plot a histogram 
plt.hist(data) 

#Set X and Y labels and plot title 
plt.xlabel('Values') 
plt.ylabel('Frequency') 
plt.title('Normal Distribution') 

plt.show()

## Plotting: Exercise

* Exercise 1:
	- Create 2 random DNA sequences (random_seq1 and random_seq2) of length 500.
	- Use your own dna_tools module to count nucleotide usage (A, T, G, and C) in random_seq1 and random_seq2.
	- Make a line plot to display the nucleotide usage.
	- Use different markers for different sequences.
* Exercise 2:
	- Generate 100 random DNA sequences of length 500.
	- Plot a histogram of 'A' nucleotide usage in the 100 random DNA sequences.
	- Add histograms of the other nucleotides' usage to the same histogram plot.
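
If your dna_tools module is not at hand, a counting helper along the following lines could stand in for it. This is only a sketch: the name `count_nucleotides` is hypothetical, and it does not reproduce the actual signature of your `get_counts` function (for example, its `base` argument), which is not shown in this notebook.

```python
import random

def count_nucleotides(seq):
    """Hypothetical stand-in for dna_tools.get_counts: count A, T, G, and C in seq."""
    counts = {'A': 0, 'T': 0, 'G': 0, 'C': 0}
    for base in seq:
        if base in counts:  # ignore any other characters
            counts[base] += 1
    return counts

random.seed(1)  # fixed seed so the example is repeatable
random_seq = ''.join(random.choice('ATGC') for _ in range(500))
seq_counts = count_nucleotides(random_seq)
print(seq_counts)
```

Because the dictionary is created with a fixed key order, the counts for different sequences come back in the same A, T, G, C order, which keeps the x-axis labels consistent when several sequences are plotted together.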


In [None]:
# Import module to help us generate random numbers/entries
# Import pyplot for plotting
# Import the counting function you created earlier
import random
import matplotlib.pyplot as plt
from dna_tools import get_counts

random_seq1 = '' 
random_seq2 = '' 

# Use for loop to create two random sequences, each of length = 500
for i in range(500): 
    random_seq1 += random.choice('ATGC') 
    random_seq2 += random.choice('ATGC') 

# Use the function that you imported to get the A, T, G, and C composition of the random sequences
random_seq1_counts = get_counts(random_seq1, base=1)
random_seq2_counts = get_counts(random_seq2, base=1)

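# Plot the A, T, G, and C compositions of the two random sequences in a single plot
# Use different markers, labels etc. to distinguish between the two lines
# Assumes get_counts returns a dict keyed by base; list() turns the dict views
# into plain sequences, which is more robust across Python and matplotlib versions
plt.xticks([1,2,3,4], list(random_seq1_counts.keys()))
plt.plot([1,2,3,4], list(random_seq1_counts.values()), marker='s', label='seq_1')
plt.plot([1,2,3,4], list(random_seq2_counts.values()), marker='o', label='seq_2')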
plt.legend()
plt.show()

In [None]:
# Import module to help us generate random numbers/entries
# Import pyplot for plotting
import random
import matplotlib.pyplot as plt

# Make a list to store 100 sequences, each of length 500
seqs = [None]*100 

# Use for loop to create 100 random sequences, each of length = 500
for n in range(100):
    seqs[n] = ''
    for i in range(500):
        seqs[n] += random.choice('ATGC')

labels = ['A','C','G','T'] 
colors = ['r','g','b','y'] 
counts = [[],[],[],[]] 

# Use for loop to count bases in each sequence
for seq in seqs:
    counts[0].append(seq.count('A'))
    counts[1].append(seq.count('C'))
    counts[2].append(seq.count('G'))
    counts[3].append(seq.count('T'))

# Plot a histogram of the 'A' counts only
plt.figure()
plt.hist(counts[0], bins=15)

# In a second figure, plot histograms of all four nucleotides side by side
plt.figure()
plt.hist(counts, bins=15, label=labels, color=colors)
plt.legend()

plt.show()