# Plotting the pulsar data
I'm using Bokeh for this exercise. First, I'll import all the required modules: pandas for data structures and Bokeh for plotting.

In [1]:
import pandas as pd
import bokeh.plotting as plt
import bokeh.models as mdls
import bokeh.models.tools as tls

Then, I'll prepare the environment. First, I'll read the CSV data from the file, keeping the column names unchanged and letting pandas infer the data types automatically. Second, I'll direct Bokeh's output to the IPython notebook.

In [2]:
data = pd.read_csv('pulsar_data_test.csv')
plt.output_notebook()
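
To see what automatic type inference produces, here is a minimal sketch that reads a small in-memory CSV instead of `pulsar_data_test.csv`. The column names `Period`, `Period Derivative`, `Binary` and `TOAs` are the ones used later in this notebook; the `Pulsar` column and all the values are invented for illustration.

```python
import io
import pandas as pd

# Hypothetical sample standing in for pulsar_data_test.csv (all values are invented).
sample_csv = io.StringIO(
    "Pulsar,Period,Period Derivative,Binary,TOAs\n"
    "A,0.005,1.0e-20,Y,1000\n"
    "B,0.750,2.5e-15,,250\n"
)

sample = pd.read_csv(sample_csv)

# Column names are kept exactly as they appear in the header row.
print(list(sample.columns))
# Numeric columns are inferred as float64 or int64; the empty Binary cell becomes NaN.
print(sample.dtypes)
```

Note that a column name containing a space, such as `Period Derivative`, survives unchanged, which matters later when the hover tooltips refer to columns by name.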

The plot I'm going to construct will show the pulsar period along the X axis and its derivative along the Y axis. Additionally, I want all the binary systems to be colored red and all the non-binary systems blue. I also want the size of each plot point to scale with the number of times of arrival (TOAs) we have for a given pulsar.

To accomplish all this, I'll add some new columns to the dataset based on the existing ones. I'll prefix their names with `_`, since I want to distinguish between technical and non-technical columns later when building the text for the mouse hover tool.

In [3]:
data.loc[data['Binary'] == 'Y','_color'] = 'red'
data.loc[data['Binary'] != 'Y','_color'] = 'blue'
data['_size'] = data['TOAs'].astype(float) / 1000 + 4
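
As a quick check of what these derived columns look like, here is a small sketch on a made-up frame. Only the column names `Binary` and `TOAs` come from the dataset; the values are invented. It also uses a one-pass alternative for the color column based on `Series.map`, which is a variation for illustration and not the code used in the cell above.

```python
import pandas as pd

# Invented rows; a missing Binary value counts as non-binary, as with `!= 'Y'`.
toy = pd.DataFrame({'Binary': ['Y', None, 'N'], 'TOAs': [1000, 250, 12000]})

# One-pass alternative to the two .loc assignments: True -> red, False -> blue.
toy['_color'] = toy['Binary'].eq('Y').map({True: 'red', False: 'blue'})

# Same size formula as in the notebook: a base of 4 plus 1 per 1000 TOAs.
toy['_size'] = toy['TOAs'].astype(float) / 1000 + 4

print(toy)
```

The constant offset of 4 keeps pulsars with very few TOAs visible, so the marker size grows linearly with the TOA count without ever collapsing to zero.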

Finally, I'll plot the data as described above. The `HoverTool` will output the values of all the columns except the ones prefixed by `_`.

In [4]:
p = plt.figure()

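source = mdls.ColumnDataSource(data)
p.circle(x='Period', y='Period Derivative', size='_size', alpha=0.4, color='_color', source=source)
# Show every column in the hover text except the technical ones prefixed with `_`.
tooltips = [(n, '@{%s}' % n) for n in data.columns if not n.startswith('_')]
p.add_tools(tls.HoverTool(tooltips=tooltips))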
plt.show(p)

<bokeh.io._CommsHandle at 0x8bb42b0>
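
To make the tooltip expression easier to follow, the sketch below runs the same filtering on a plain list of column names, with no Bokeh involved. The `Pulsar` name is a hypothetical extra column; the others are the ones used in this notebook. The `@{...}` form is Bokeh's way of referring to a column of the data source, and the braces are needed when the name contains spaces.

```python
# Column names as they might appear in the DataFrame; 'Pulsar' is hypothetical.
columns = ['Pulsar', 'Period', 'Period Derivative', 'Binary', 'TOAs', '_color', '_size']

# Keep only the non-technical columns and pair each label with its @{column} reference.
tooltips = [(name, '@{%s}' % name) for name in columns if not name.startswith('_')]

for label, field in tooltips:
    print(label, '->', field)
```

Each tuple is a (label, value) pair: the first element is shown as text in the tooltip and the second is replaced with the value from the hovered row.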