# Creating a mass plot from CERN OpenData
This is a Jupyter notebook. It lets us type text, format it with headings and links, and keep code in separate cells. You can click the "play" icon above to run the highlighted cell and move on to the next one. The really cool thing is that you can edit the code and run it again to see how the output changes.

In this example, we'll import some detector data and make a plot of the masses of the particles detected. To begin, click on this cell, then click the "play" icon to run each cell and see the output.

In [None]:
# First, we'll "import" the software packages needed.
import pandas
import numpy
%matplotlib inline
import matplotlib.pyplot as plt

# Starting a line with a hashtag tells the program not to read the line.
# That way we can write "comments" to humans trying to figure out what the code does.
# Blank lines don't do anything either, but they can make the code easier to read.

## Importing a data set
Now let's choose some data to plot. In this example we'll pull data from CERN's CMS detector and make a histogram of invariant mass. You can find more at [CERN OpenData](http://opendata.cern.ch).

This next cell will take a little while to run since it's grabbing a pretty big data set. This one contains 100,000 collision events. The cell label will look like "In [\*]" while it's still thinking and "In [2]" when it's finished.

In [None]:
# Whenever you type "something =" it defines a new variable, "something", 
# and sets it equal to whatever follows the equals sign. That could be a number, 
# another variable, or in this case an entire table of numbers.
dataset = pandas.read_csv('http://opendata.cern.ch/record/303/files/dimuon.csv')

# To analyze dielectron data instead, use this URL in read_csv:
# http://opendata.cern.ch/record/304/files/dielectron.csv

We can view the first few rows of the file we just imported.

In [None]:
# The .head(n) command displays the first n rows of the file.
dataset.head(5)
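
If you'd like to see what `read_csv` and `.head(n)` do without waiting for the big download, here is a minimal sketch using a tiny table typed right into the cell. The numbers are made up for illustration and are not real CMS events; the name `toy` and the three column names are just assumptions for this example.

```python
import io
import pandas

# Made-up values for illustration only; these are not real CMS events.
csv_text = """Q1,Q2,M
1,-1,91.2
-1,-1,35.4
-1,1,88.7
1,1,60.1
"""

# read_csv accepts a file-like object as well as a URL or a file path.
toy = pandas.read_csv(io.StringIO(csv_text))

print(toy.head(2))         # the first two rows
print(list(toy.columns))   # the column names
print(len(toy))            # the number of rows
```

Checking the column names and the row count is a quick way to confirm that a file loaded the way you expected.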

## Plotting a histogram

In [None]:
# Create variables for histogram parameters
mass = dataset.M
nbins = 50
xmin = 30
xmax = 120

# Calculate the source data (a frequency table) for the histogram
mass_plot = numpy.histogram(mass, bins=nbins, range=(xmin, xmax))

# Formatting how the histogram will display the frequency data.
hist, bins = mass_plot
width = 1.0*(bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2

plt.bar(center, hist, align='center', width=width,  color='k', linewidth=1, edgecolor='g')
plt.xlim(xmin, xmax)
plt.xlabel('invariant mass (GeV)')
plt.ylabel('number of events')

# This actually shows the histogram
plt.show()

You can find more information on how numpy makes histograms [here](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.histogram.html).
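
To see what `numpy.histogram` hands back, it can help to try it on a handful of numbers you can count by hand. This is a small sketch with made-up mass values (the name `toy_mass` is just for this example), using 3 bins between 30 and 120 so each bin is 30 GeV wide.

```python
import numpy

# Made-up mass values (GeV), for illustration only.
toy_mass = numpy.array([31.0, 45.0, 47.0, 88.0, 90.0, 91.0, 92.0, 119.0, 150.0])

# 3 bins from 30 to 120, so the bin edges are 30, 60, 90 and 120.
counts, edges = numpy.histogram(toy_mass, bins=3, range=(30, 120))

print(counts)  # how many values landed in each bin: [3 1 4]
print(edges)   # the bin edges; always one more edge than there are bins

# Bin width and bin centers, computed the same way as in the plotting cell.
width = edges[1] - edges[0]
centers = (edges[:-1] + edges[1:]) / 2
print(width, centers)
```

Notice that the value 150 is not counted at all, because it falls outside the `range` we asked for. The same thing happens to any masses below `xmin` or above `xmax` in the plot above.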

## Modify and re-plot
If we're looking for particles that decay into a pair of oppositely charged particles, we'll need to filter out the events with two like-charge products.
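
Here is a small sketch of how that kind of filter works, using a made-up four-row table (the values and the names `toy`, `is_opposite`, and `opposite` are assumptions for illustration only). Comparing two columns produces a True/False value for each row, and using that result as an index keeps only the True rows.

```python
import pandas

# Made-up values for illustration only; these are not real CMS events.
toy = pandas.DataFrame({
    "Q1": [1, -1, -1, 1],
    "Q2": [-1, -1, 1, 1],
    "M": [91.2, 35.4, 88.7, 60.1],
})

# Comparing two columns gives one True/False value per row.
is_opposite = toy.Q1 != toy.Q2
print(is_opposite.tolist())  # [True, False, True, False]

# Indexing with that True/False list keeps only the rows marked True.
opposite = toy[is_opposite]
print(opposite.M.tolist())   # [91.2, 88.7]
```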

In [None]:
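# Keep only the events where the two charges differ (opposite-sign pairs).
oppQ_events = dataset[dataset.Q1 != dataset.Q2]
oppQ_mass = oppQ_events['M']

# Recalculate the frequency table using only the filtered masses.
mass_plot = numpy.histogram(oppQ_mass, bins=nbins, range=(xmin, xmax))

# Unpack the counts and bin edges, then find the bin width and bin centers.
hist, bins = mass_plot
width = 1.0*(bins[1] - bins[0])
center = (bins[:-1] + bins[1:]) / 2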

# Formatting the histogram
plt.bar(center, hist, align='center', width=width,  color='k', linewidth=1, edgecolor='g')
plt.xlim(xmin, xmax)
plt.xlabel('invariant mass (GeV)')
plt.ylabel('number of events')

# This actually shows the histogram
plt.show()

And now you've made a mass plot! Try editing some code and re-running the cell to see the effects. For more information on formatting the markdown text in a cell like this one, go to Help > Markdown > Basic Writing and Formatting Text.

To save your work: go to File > Save and Checkpoint. That only saves your edits as long as you're working in this notebook. All is lost after 10 minutes of inactivity.

To really save your work: go to File > Download as > IPython notebook (or save as PDF if you just want to show someone a snapshot of what your code and output look like).