Python for Bioinformatics
-----------------------------

![title](https://s3.amazonaws.com/py4bio/tapabiosmall.png)

This Jupyter notebook is intended to be used alongside the book [Python for Bioinformatics](http://py3.us/).



**Note:** The listings in this notebook read sample data files, so those files must be accessible from the notebook before the listings are run. The setup commands below download the files from GitHub and extract them into a directory called samples.

Chapter 14: Graphics in Python
-----------------------------

**USING BOKEH**

In [0]:
!pip install holoviews

In [0]:
!pip install -U ipykernel

In [0]:
!curl https://raw.githubusercontent.com/Serulab/Py4Bio/master/samples/samples.tar.bz2 -o samples.tar.bz2
!mkdir -p samples
!tar xvfj samples.tar.bz2 -C samples
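
Where shell commands such as `tar` are not available, the extraction step can be approximated with Python's standard `tarfile` module. This is a minimal sketch and not one of the book's listings: the function name `extract_samples` is hypothetical, and it assumes that samples.tar.bz2 has already been downloaded into the working directory.

```python
import tarfile
from pathlib import Path


def extract_samples(archive="samples.tar.bz2", dest="samples"):
    """Extract a bzip2-compressed tar archive into dest.

    Returns the list of member names found in the archive.
    """
    dest_path = Path(dest)
    # Create the target directory if needed (like `mkdir -p`)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:bz2") as tar:
        names = tar.getnames()
        tar.extractall(dest_path)
    return names


# Only run the extraction if the archive is actually present
if Path("samples.tar.bz2").exists():
    members = extract_samples()
    print(f"Extracted {len(members)} entries into samples/")
```

Calling `extract_samples()` with no arguments mirrors the `tar xvfj samples.tar.bz2 -C samples` command above; the returned list can be used to check that files such as fishdata.csv are present before running the listings.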

**Listing 14.1:** basiccircle.py: A circle made with Bokeh

In [0]:
from bokeh.plotting import figure, output_file, show

p = figure(width=400, height=400)
p.circle(2, 3, radius=.5, alpha=0.5)
output_file("out.html")
show(p)

**Listing 14.2:** fourcircles.py: 4 circles made with Bokeh

In [0]:
from bokeh.plotting import figure, output_file, show

p = figure(width=500, height=500)
x = [1, 1, 2, 2]
y = [1, 2, 1, 2]
p.circle(x, y, radius=.35, alpha=0.5, color='red')
output_file("out.html")
show(p)

**Listing 14.3:** plot1.py: A minimal plot

In [0]:
from bokeh.plotting import figure, output_file, show

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [.7, 1.4, 2.1, 3, 3.85, 4.55, 5.8, 6.45]

p = figure(title='Mean wt increased vs. time',
           x_axis_label='Time in days',
           y_axis_label='% Mean WT increased')
# legend_label replaces the deprecated legend keyword in recent Bokeh releases
p.circle(x, y, legend_label='Subject 1', size=10)
output_file('test.html')
show(p)

**Listing 14.4:** plot2.py: Two data series plot

In [0]:
from bokeh.plotting import figure, output_file, show

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [.7, 1.4, 2.1, 3, 3.85, 4.55, 5.8, 6.45]
z = [.5, 1.1, 1.9, 2.5, 3.1, 3.9, 4.85, 5.2]

p = figure(title='Mean wt increased vs. time',
           x_axis_label='Time in days',
           y_axis_label='% Mean WT increased')
# legend_label replaces the deprecated legend keyword in recent Bokeh releases
p.circle(x, y, legend_label='Subject 1', size=10)
p.circle(x, z, legend_label='Subject 2', size=10, line_color='red',
         fill_color='white')
p.legend.location = 'top_left'
output_file('test.html')
show(p)

**Listing 14.5:** fishpc.py: Scatter plot

In [0]:
from bokeh.plotting import figure, show, output_file
from pandas import read_csv
# MarkerType lists the marker names; bokeh.models.markers is gone in Bokeh 3
from bokeh.core.enums import MarkerType
from bokeh.transform import factor_cmap, factor_mark

df = read_csv('samples/fishdata.csv')
df['feeds_and_species'] = df['feeds'] + ', ' + df['species']
all_markers = list(MarkerType)
SPECIES = list(set(df['species']))
MARKERS = all_markers[:len(SPECIES)]
feeds = list(set(df['feeds']))
ttl = 'Metabolic variations based on 1H NMR profiling of fishes'

p = figure(height=600, width=700, title=ttl)
p.xaxis.axis_label = 'Principal Component 1: 35.8%'
p.yaxis.axis_label = 'Principal Component 2: 15.1%'
p.scatter('PC1', 'PC2', source=df, size=12, fill_alpha=0.3,
          marker=factor_mark('species', MARKERS, SPECIES),
          color=factor_cmap('feeds', 'Category10_3', feeds),
          legend_field='feeds_and_species')
p.legend.location = 'top_left'
p.legend.click_policy = 'hide'
output_file('scatter.html')
show(p)




**Listing 14.6:** heatmap.py: Plot a gene expression file

In [0]:
import numpy as np
import pandas as pd
import holoviews as hv
from holoviews import opts
hv.extension('bokeh')

DATA_FILE = 'samples/GSM188012.CEL'
dtype = {'x': int, 'y': int, 'lux': float}
ds = pd.read_csv(DATA_FILE, sep='\t', dtype=dtype)
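# Keep only the coordinates and the 'lux' value column
ds = ds[['x', 'y', 'lux']]
# Average the values of rows that share the same (x, y) cell
heatmap = hv.HeatMap(ds).aggregate(function=np.mean)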
heatmap.opts(opts.HeatMap(tools=['hover'], colorbar=True,
             width=400, height=400, toolbar='above'))

**Listing 14.7:** chord.py: A Chord diagram

In [0]:
import holoviews as hv
from holoviews import opts, dim
import pandas as pd

data = pd.read_csv('samples/test3.csv')
links = pd.DataFrame(data[['name_x','name_y', 'value']])
hv.extension('bokeh')
hv.output(size=400)
chord = hv.Chord(links)
chord.opts(opts.Chord(
           cmap='Category20', edge_cmap='Category20',
           edge_color=dim('name_x').str(), labels='name_x',
           node_color=dim('index').str()))
