Now that we've seen some basic operations in Python and read in a .csv file, the next logical step is to plot the data in the file in as many combinations as is useful to get an idea of what the data looks like.  Knowing what the data looks like will give us insight into what the data means and what we might want to investigate further.

While the method of reading in .csv files we used previously was easy and didn't use anything that's not part of standard Python, we'll need to do some more work to make plots with it.  One way to do this is to add the data points to lists as we read them in.  We can't use tuples this time because they are immutable!

We'll start by creating blank lists that we'll populate with data later.  We need to preserve the order we had previously to keep the data types straight since we're not relying on the .csv reader to do that for us.

In [1]:
Type = []
RunNo = []
EvNo = []
E1 = []
px1 = []
py1 = []
pz1 = []
pt1 = []
eta1 = []
phi1 = []
Q1 = []
E2 = []
px2 = []
py2 = []
pz2 = []
pt2 = []
eta2 = []
phi2 = []
Q2 = []
M = []

Then we want to populate these lists with data from the .csv file.

In [2]:
import csv

with open('dimuon.csv') as csvfile:
    READdimu = csv.reader(csvfile, delimiter = ',')
    
    rowcount = 0
    for row in READdimu:
        if rowcount > 0:
            Type.append(row[0])
            RunNo.append(float(row[1]))
            EvNo.append(float(row[2]))
            E1.append(float(row[3]))
            px1.append(float(row[4]))
            py1.append(float(row[5]))
            pz1.append(float(row[6]))
            pt1.append(float(row[7]))
            eta1.append(float(row[8]))
            phi1.append(float(row[9]))
            Q1.append(float(row[10]))
            E2.append(float(row[11]))
            px2.append(float(row[12]))
            py2.append(float(row[13]))
            pz2.append(float(row[14]))
            pt2.append(float(row[15]))
            eta2.append(float(row[16]))
            phi2.append(float(row[17]))
            Q2.append(float(row[18]))
            M.append(float(row[19]))
        rowcount = rowcount + 1
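As an aside, the same idea can be written more compactly.  The sketch below is only one possible alternative, not the method used in this notebook: it assumes the first row of the file holds the column names, and the helper name `read_columns` and its `text_columns` parameter are made up for illustration.  It uses `csv.DictReader` from the standard library to build one list per column.

```python
import csv


def read_columns(path, text_columns=("Type",)):
    """Read a .csv file into a dict mapping each header name to a list of values.

    Hypothetical helper: assumes the first row of the file is a header row.
    """
    columns = {}
    with open(path, newline="") as csvfile:
        for row in csv.DictReader(csvfile):
            for name, value in row.items():
                # Keep text columns as strings; convert everything else to float.
                converted = value if name in text_columns else float(value)
                columns.setdefault(name, []).append(converted)
    return columns
```

With this approach, `data = read_columns('dimuon.csv')` would give access to a column as `data['px1']`, provided the header in the file really uses that name.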

We will want to use Python to create two kinds of plots, one which you are certainly familiar with and one which you might not be.

We'll want to create x-y scatter plots and histograms.  Scatter plots let us see the relationship of one variable to another and decide if that relationship means anything useful to us.  Histograms let us see how many times a certain value of a certain variable occurs in our data and see if the pattern of that variable's variation has useful meaning.  In some cases we want to make plots using the whole data set with the raw variables we start with.  Sometimes we want to create plots from manipulated data.  We'll see examples of all these things in this notebook.

We'll start with scatter plots.  These plot two variables against each other on x-y axes so we can see what relationships exist between them.  Remember that Type isn't graphable data and that RunNo and EvNo are indexing data.  As an example, we'll plot the x momentum versus the y momentum for muon 1.

This code will label our axes, give us a title, and open the plot in its own window.  If the window does not appear on top of this notebook, check your desktop: it might have opened there!

In [4]:
%matplotlib
import matplotlib.pyplot as plt
plt.scatter(px1, py1)
plt.xlabel('px1')
plt.ylabel('py1')
plt.title('Muon 1: momentum in x vs. momentum in y')
plt.show()

Using matplotlib backend: Qt5Agg


Note that there are some handy tools in this window.  You can zoom in on an area of the plot, you can save the plot as a .png file (note that this will not preserve the tools), and you can use the arrows to toggle between views you've chosen.

Now we'll plot a histogram.  Histograms show the distribution of values of a variable, allowing us to see whether some values occur more often than others.  Sometimes the distribution will reflect the shape or size of the detectors, sometimes it will be even in a way that might reflect random chance, and sometimes it will show more interesting peaks or gaps in the values present in the data.

Histograms are a little like bar graphs in that they use the height of a plotted segment to indicate a count.  In a histogram, the segments are called "bins"; each bin covers a range from a lowest to a highest value of the variable, and its height is the number of data points that fall in that range.  The code below lets matplotlib determine how wide those bins should be rather than using its default of 10 bins or making us write code to optimize the bin size.  It also gives us a labeled x-axis (labeling the y-axis isn't vital because it's always a count in a histogram) and a title.

In [5]:
plt.hist(px1, bins='auto')
plt.xlabel('px1')
plt.title('Muon 1: momentum in x')
plt.show()
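To make the idea of bins concrete, here is a minimal sketch that counts values into equal-width bins by hand.  It is an illustration only, not how matplotlib chooses bins with `bins='auto'`; the function name `count_in_bins` and its parameters are made up for this example.

```python
def count_in_bins(values, low, high, n_bins):
    """Count how many values fall in each of n_bins equal-width bins on [low, high]."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if low <= v <= high:
            # A value exactly on the top edge is counted in the last bin.
            index = min(int((v - low) / width), n_bins - 1)
            counts[index] += 1
    return counts


# Five bins of width 2 covering the range 0 to 10.
print(count_in_bins([0, 1, 2, 3, 4, 5, 5, 10], 0, 10, 5))
```

Each number in the returned list is the height of one bar in the corresponding histogram.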

On to plotting the results of calculations!

Unlike in the notebook on reading in data, we're not going to re-read the data and calculate with it while we do.  We already have lists containing our data!  We'll instead use one of many methods to get new lists by performing calculations with our existing lists.  This is not the only method for doing this, but it should work with any calculation you need to do.

In [7]:
# magnitude of total momentum: |p| = sqrt(px^2 + py^2 + pz^2)
p1 = [((px1[i]**2)+(py1[i]**2)+(pz1[i]**2))**0.5 for i in range(len(px1))]
plt.hist(p1, bins='auto')
plt.xlabel('total p')
plt.title('Muon 1: magnitude of total momentum')
plt.show()
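As one example of another method, the same calculation can be written with `zip`, which walks through the three component lists together so we never have to index by position.  This is a sketch under the assumption that all three lists have the same length; the function name `magnitudes` is made up for illustration.

```python
import math


def magnitudes(xs, ys, zs):
    """Return sqrt(x^2 + y^2 + z^2) for each (x, y, z) taken from the three lists."""
    return [math.sqrt(x**2 + y**2 + z**2) for x, y, z in zip(xs, ys, zs)]


# Small made-up check values, not taken from the data file.
print(magnitudes([3.0, 0.0], [4.0, 0.0], [12.0, 2.0]))
```

With the lists from this notebook, the call would be `p1 = magnitudes(px1, py1, pz1)`.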

Now make scatter plots of each pair of graphable, relevant variables and a histogram of each variable on its own to see its distribution.  Save the histograms as files with different names so you can compare them later.
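If you would rather not write every plot by hand, one way to organize the work is sketched below.  Treat it as a starting point, not the required solution: the function names, the `prefix` parameter, and the file-name pattern are all made up for illustration, and `data` is assumed to be a dictionary you build yourself, such as `{'px1': px1, 'py1': py1}`.  It uses `itertools.combinations` to list each pair of variables once and `plt.savefig` to write each plot to its own file.

```python
import itertools


def variable_pairs(names):
    """Return every unordered pair of variable names, e.g. ('px1', 'py1')."""
    return list(itertools.combinations(names, 2))


def save_scatter_plots(data, prefix="scatter"):
    """Save one scatter plot per pair of variables in `data` (a dict of lists).

    Returns the list of file names written (hypothetical naming scheme).
    """
    # Imported here so the pairing helper above works without matplotlib.
    import matplotlib.pyplot as plt

    saved = []
    for x_name, y_name in variable_pairs(sorted(data)):
        plt.figure()
        plt.scatter(data[x_name], data[y_name])
        plt.xlabel(x_name)
        plt.ylabel(y_name)
        plt.title(f"{x_name} vs. {y_name}")
        filename = f"{prefix}_{x_name}_{y_name}.png"
        plt.savefig(filename)
        plt.close()  # close the figure so many plots don't pile up in memory
        saved.append(filename)
    return saved
```

Keep in mind that the number of pairs grows quickly: 17 variables give 136 pairs, so it is worth deciding which variables are relevant before plotting everything.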

In [None]:
# insert your own code here!