Before working with Python and .csv data files, we should first learn a few Python basics.

The code box below contains the first data line of the 100k dimuon set, read into Python as a construct called a "tuple."  A tuple is a comma-separated sequence of values, much like a list.  It is not a proper Python "list," however: items can be added to or removed from a list, while a tuple cannot be changed once it is created.  This keeps us from accidentally altering our data while we work with it.
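
As a rough sketch of the difference between a list and a tuple, the short example below tries to change both.  The names `example_list` and `example_tuple` and their values are made up for illustration and are not part of the dimuon data.

```python
# Hypothetical three-item examples; these are not part of the dimuon data.
example_list = [1.0, 2.0, 3.0]
example_tuple = (1.0, 2.0, 3.0)

example_list.append(4.0)   # allowed: a list can grow
example_list[0] = 10.0     # allowed: a list item can be replaced

try:
    example_tuple[0] = 10.0   # not allowed: a tuple cannot be changed
    tuple_changed = True
except TypeError as error:
    tuple_changed = False
    print("Tuple left unchanged:", error)

print(example_list)
print(example_tuple)
```

The list ends up modified, while the attempt to modify the tuple raises a `TypeError` and leaves the tuple as it was.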

The values in the code box correspond, in order, to the following labels:
Type, RunNo, EvNo, E1, px1, py1, pz1, pt1, eta1, phi1, Q1, E2, px2, py2, pz2, pt2, eta2, phi2, Q2, M

In [None]:
data_line_1 = ("GT", 146511, 25343052, 7.3339, 2.06042, 5.8858, -3.85836, 6.23602, -0.584812, 1.23406, -1, 5.20755, -1.55016, -1.81976, 4.62525, 2.3905, 1.41411, -2.27636, 1, 11.8282)

As a quick note, many of the code boxes here contain no comment lines, which is unusual for code.  Most of what would normally go in comments appears in the text boxes instead, so comments have been left out where possible to avoid redundancy.

To work with data in a tuple, we need the index of each item, that is, its position in the tuple, counting from zero.  To avoid having to count, we can use the data labels as variable names and set each one equal to its item's position in the tuple.

In [None]:
Type = 0
RunNo = 1
EvNo = 2
E1 = 3
px1 = 4
py1 = 5
pz1 = 6
pt1 = 7
eta1 = 8
phi1 = 9
Q1 = 10
E2 = 11
px2 = 12
py2 = 13
pz2 = 14
pt2 = 15
eta2 = 16
phi2 = 17
Q2 = 18
M = 19

This allows us to access any item in the tuple by putting the name of the variable in square brackets, rather than having to remember the item's position number.
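
To make the idea concrete, here is a minimal sketch using a shortened copy of the data line.  The name `short_line` is hypothetical and holds only the first four items, so only the first four label variables are needed.

```python
# Shortened, hypothetical copy of the first four items of the data line.
short_line = ("GT", 146511, 25343052, 7.3339)

Type = 0
RunNo = 1
EvNo = 2
E1 = 3

# Indexing by name and indexing by number pick out the same item.
print(short_line[EvNo])
print(short_line[2])
print(short_line[EvNo] == short_line[2])
```

Both forms return the event number; the named form is simply easier to read and harder to get wrong.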

For example, the following line prints an individual piece of data; put the desired variable in the square brackets.  It starts out printing the event number (EvNo).

In [None]:
print(data_line_1[EvNo])

Now we can start working with the data.  For example, if we want the magnitude of the first muon's momentum, we can use the following code.

In [None]:
p1 = ((data_line_1[px1] ** 2) + (data_line_1[py1] ** 2) + (data_line_1[pz1] ** 2)) ** 0.5
print(p1)

Note that the double asterisk (**) means "to the power of" in Python, so raising a value to the power 0.5 takes its square root.
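
A few small cases may help show how the power operator behaves.  The numbers below are arbitrary and chosen only for illustration; `math.sqrt` from the standard library is included as a cross-check on the 0.5 power.

```python
import math

squared = 3 ** 2             # 3 to the power of 2
root = 16 ** 0.5             # a power of 0.5 is a square root
root_check = math.sqrt(16)   # the same square root from the math module

print(squared)      # 9
print(root)         # 4.0
print(root_check)   # 4.0

# The power is applied before a leading minus sign, so use parentheses
# when squaring a negative number written out directly.
print((-2) ** 2)    # 4
print(-2 ** 2)      # -4
```

When a negative value comes out of the tuple, as with `data_line_1[pz1]`, it is already a single value, so squaring it gives a positive result without extra care.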

If we want the magnitude of the combined momentum of muon 1 and muon 2, we can add the components first and then combine the sums, as follows.

In [None]:
px = data_line_1[px1] + data_line_1[px2]
py = data_line_1[py1] + data_line_1[py2]
pz = data_line_1[pz1] + data_line_1[pz2]
ptot = ((px**2) + (py**2) + (pz**2))**0.5
print(ptot)

Note that it is sometimes better to do a calculation in separate steps (in this case, adding the vector components and then combining the sums via the Pythagorean theorem) than to do it all at once.  This makes the code easier to follow.
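
One way to take this idea a step further is to wrap the repeated step in a small function.  The sketch below is only one possible arrangement: the function name `magnitude` and the tuples `muon1` and `muon2` are hypothetical, and the component values are copied from the data line above.

```python
def magnitude(x, y, z):
    """Return the length of a vector from its three components."""
    return (x**2 + y**2 + z**2) ** 0.5

# Momentum components (px, py, pz) of each muon, copied from the data line.
muon1 = (2.06042, 5.8858, -3.85836)
muon2 = (-1.55016, -1.81976, 4.62525)

# Step 1: add the components.  Step 2: take the magnitude of the sum.
px = muon1[0] + muon2[0]
py = muon1[1] + muon2[1]
pz = muon1[2] + muon2[2]
ptot = magnitude(px, py, pz)

# The same calculation written all at once, for comparison.
all_at_once = ((muon1[0] + muon2[0])**2 + (muon1[1] + muon2[1])**2
               + (muon1[2] + muon2[2])**2) ** 0.5

print(ptot)
print(all_at_once)
```

Both versions print the same number, but the stepwise version is easier to check one line at a time.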

You can try any manipulations of the data you wish in the code box below.  You may want to test a calculation on this single line of data before you run it on the full data file.
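
As one possible trial calculation, the sketch below compares a value computed from the data with a value already listed in it.  It assumes that pt1 is the part of muon 1's momentum perpendicular to the z axis, which the text above does not state, and the variable names are made up for this example.

```python
# Values copied from the data line above: px1, py1, and the listed pt1.
px1_value = 2.06042
py1_value = 5.8858
pt1_listed = 6.23602

# Assumption: pt1 is the momentum perpendicular to the z axis,
# so it should follow from px1 and py1 alone.
pt1_calculated = (px1_value**2 + py1_value**2) ** 0.5

print(pt1_calculated)
print(abs(pt1_calculated - pt1_listed) < 0.001)
```

If the assumption holds, the calculated and listed values agree to within rounding, which is a quick way to confirm that the tuple positions are being read correctly.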

In [None]:
# Type your own code here!