<a href="https://colab.research.google.com/github/Peter-Apps/data-camp/blob/main/Theta_%26_Phi_Graphing_Activity.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Theta & Phi Graphing Activity
This is a Jupyter notebook made up of blocks of code called cells. You can press SHIFT+ENTER to run a cell and move on to the next one. You can also edit the code and run it again to see how the output changes.

You may see a popup window the first time saying "Warning". Don't worry, it's safe. Click on "run anyway".

Try running the following cells by pressing SHIFT and ENTER (at the same time) for each one.

You won't hurt anything by experimenting. If you break it, close the tab and open the activity again to start over.

### Here's some background info on the detector that was used to collect this data

The CMS detector uses several angles to describe the path of the collision products.

<div>
<img src="https://tikz.net/wp-content/uploads/2021/09/axis3D_CMS-001.png" width="300"/>
</div>

The beamline is aligned with the $z$ axis, and the $x$ and $y$ axes define the plane transverse to the beamline.
The polar angle $\theta$ measures the angle between the particle's path and the beamline (the $z$ axis), while the azimuthal angle $\phi$ measures the particle's direction in the transverse plane, starting from the $x$ axis.

Here are a couple of sites to browse if you need a bit more detail.
* [Geometry of a Collider Detector](https://quarknet.org/page/geometry-collider-detector)
* [CMS Coordinate System](https://tikz.net/axis3d_cms/)
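
To make the two angles concrete, here is a minimal sketch that computes them from a particle's momentum components. The function name `angles_from_momentum` and the inputs `px`, `py`, `pz` are made up for illustration; they are not columns in the dataset used in this activity.

```python
import math

def angles_from_momentum(px, py, pz):
    """Return (theta, phi) in radians for a momentum vector (hypothetical helper)."""
    pt = math.hypot(px, py)      # size of the momentum in the transverse (x-y) plane
    theta = math.atan2(pt, pz)   # polar angle measured from the +z axis (the beamline)
    phi = math.atan2(py, px)     # angle in the transverse plane, measured from the +x axis
    return theta, phi

# A particle moving straight along +y: perpendicular to the beamline,
# and a quarter turn away from the +x axis
theta, phi = angles_from_momentum(0.0, 1.0, 0.0)
print(math.degrees(theta), math.degrees(phi))
```

Try a few other directions, such as a particle travelling along the beamline, to see how each angle responds.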

In [None]:
# Importing some useful tools
import pandas as pd # <-- Pandas lets us easily create and use data tables
import numpy as np # <-- Numpy is useful for doing math on a large amount of data
from matplotlib import pyplot as plt # <-- Pyplot to create pretty graphs

In [None]:
# Creating our 3 data files. 
# Reading in the csv file. 
# This CMS dataset has over 1 million events with dilepton products

# Here's the electron dataset we'll be using
data = pd.read_csv("https://drive.google.com/uc?export=download&id=17I_r75CK-19ixyG-czq9iphHwD3p4qsV&confirm=t")

# If you're interested, you can uncomment this line to use the muon dataset instead
#data = pd.read_csv("https://drive.google.com/uc?export=download&id=1ykN2rQ2MsNt-F_Ky4lliaaWx5O_8oiPn&confirm=t")

# Removing columns we don't need and creating a few we will use
data["theta_rad"] = 2 * np.arctan(np.exp(-1 * data.eta))
data["theta_deg"] = np.degrees(data["theta_rad"])

# Comment out this line if you're using the muon dataset
data = data.drop(["entry", "subentry", "run", "luminosityBlock", "event","nElectron", "pt", "eta", "charge", "pfRelIso03_all", "dxy", "dxyErr", "dz", "dzErr", "cutBasedId", "pfId", "sip3d"], axis = 1)

# Uncomment this line if you're using the muon dataset
#data = data.drop(["entry", "subentry", "luminosityBlock", "event", "nMuon", "pt", "charge", "pfRelIso03_all", "pfRelIso04_all"], axis = 1)

# Splitting the dataset up into 3 different sizes
small = data.iloc[:5000,:].copy()
medium = data.iloc[:300_000,:].copy()
large = data.copy()
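
The `theta_rad` column comes from the pseudorapidity $\eta$ through $\theta = 2\arctan(e^{-\eta})$, which is the inverse of $\eta = -\ln\tan(\theta/2)$. The sketch below applies the same formula to a few made-up $\eta$ values to show how the two quantities relate. The helper name `eta_to_theta_deg` is hypothetical and is not part of the notebook.

```python
import numpy as np

def eta_to_theta_deg(eta):
    """Convert pseudorapidity to the polar angle in degrees (hypothetical helper)."""
    theta_rad = 2 * np.arctan(np.exp(-eta))  # same formula used for the theta_rad column
    return np.degrees(theta_rad)

# Made-up example values: backward, perpendicular to the beamline, and forward
etas = np.array([-2.0, 0.0, 2.0])
thetas = eta_to_theta_deg(etas)
print(thetas)
```

An $\eta$ of zero corresponds to a path perpendicular to the beamline, and flipping the sign of $\eta$ mirrors $\theta$ about 90 degrees.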


In [None]:
# You can look at the size of a dataset by using .shape to see its number of rows and columns
data.shape

In [None]:
# You can see the column headers and the first few rows by using the .head() method
data.head(5)

# Checkpoint:
* How many data points are there in each dataset? Hint: Use the ```.shape``` command
* What are the column titles in each dataset? What do they refer to? Hint: Use the ```.head()``` command


In [None]:
# Create a histogram of the small data set. In this case, we're using the phi column.
plt.hist(small["phi"], bins = 10, range = [0, 10])
plt.title("I'm a title")
plt.xlabel("I'm an X-axis label")
plt.ylabel("I'm a Y-axis label")
plt.show()
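
The `bins` and `range` settings together fix the bin width: the width is the size of the range divided by the number of bins. Here is a minimal sketch of that relationship, along with one way to go from a chosen bin width to a set of bin edges. The helper `edges_from_width` is a made-up name for illustration.

```python
import numpy as np

def edges_from_width(low, high, width):
    """Return evenly spaced bin edges from low to high (hypothetical helper)."""
    n_bins = int(round((high - low) / width))  # number of bins that fit in the range
    return np.linspace(low, high, n_bins + 1)  # n_bins bins need n_bins + 1 edges

# The histogram above uses bins = 10 and range = [0, 10]
default_edges = np.linspace(0, 10, 10 + 1)
default_width = default_edges[1] - default_edges[0]

# Halving the bin width over the same range doubles the number of bins
fine_edges = edges_from_width(0, 10, 0.5)
print(default_width, len(fine_edges) - 1)
```

`plt.hist` also accepts a list of edges for `bins`, so you could pass `bins = fine_edges` instead of a bin count.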

In [None]:
# Here's the same histogram from above but with error bars added.
# For this example, we're using the square root of the count as the uncertainty

# Creating the histogram and saving the count in each bin and the bin edges as variables
n, bin_edges, _ = plt.hist(small["phi"], bins = 10, range = [0, 10], color = 'r')

# Calculating the center of each bin
bin_centers = 0.5 * (bin_edges[1:] + bin_edges[:-1])

# Adding error bars to the plot. Feel free to experiment with changing the fmt, ecolor and capsize
plt.errorbar(bin_centers, n, yerr = np.sqrt(n), fmt = ".", ecolor = 'k', capsize = 5)

plt.show()
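
Using the square root of the count as the uncertainty means the error bars get longer as a bin fills up, but they shrink relative to the height of the bin. The sketch below illustrates this with made-up, uniformly distributed random numbers rather than the CMS data. The function name `relative_uncertainty` and the sample sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the sketch is repeatable

def relative_uncertainty(n_events, n_bins=10, value_range=(0, 10)):
    """Histogram made-up uniform data and return the average sqrt(n) / n over the bins."""
    values = rng.uniform(value_range[0], value_range[1], size=n_events)
    counts, _ = np.histogram(values, bins=n_bins, range=value_range)
    errors = np.sqrt(counts)         # counting uncertainty on each bin
    return np.mean(errors / counts)  # fractional uncertainty, averaged over the bins

small_rel = relative_uncertainty(5_000)
large_rel = relative_uncertainty(300_000)
print(small_rel, large_rel)
```

Keep this in mind when you compare histograms made from different amounts of data.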

# Your Task:
* Make a claim about how phi is distributed in the histogram
* Adjust the bin width, plot range, and amount of data. Add error bars to your plot.
 * Did any of these changes affect your claim? If so, how?
* Repeat this process using theta

