# Luminosity at the FCC-ee Collider - Data Analysis

This notebook describes the steps needed to compare Standard Model predictions with **simulated** data for Bhabha scattering.

The data we will examine come from a simulation of an experimental analysis for the process $e^+e^- \rightarrow e^+ e^-$. We aim to study the distribution as a function of scattering angle:
$$\frac{\mathrm{d}\sigma}{\mathrm{d}\theta},$$
which is measured in units of $\mathrm{pb}$. This will be plotted in bins of $\theta$, measured in radians.

In this notebook you will use small-angle Bhabha scattering to determine the integrated luminosity of the FCC-ee.

___
## Structuring data

Information can be stored in many forms on computing systems. **Data** need to be **structured** in sensible and comprehensible ways that make them easier to work with and to interpret.

A useful data structure in Python (similar structures exist in other programming languages) is the **array**, provided by the `numpy` library.

Arrays are **containers** of variables which can be accessed individually, looped over, or altered by functions. Examples of how to use them are shown below - experiment with different values and functionalities!

In [None]:
## Example 1: Simple array manipulations
import numpy as np

# Construct arrays of numbers
array_1 = np.array([1,2,3,4])
array_2 = np.array([5,6,7,8])

print("Printing arrays")
print(array_1)
print(array_2)

# Operations for arrays
print("\nSimple array operations")
print("array_1 * array_2 = ", array_1 * array_2) # multiplies each element of array_1 by the corresponding element of array_2
print("array_1 + array_2 = ", array_1 + array_2) # adds each element of array_1 to the corresponding element of array_2
print("array_1 - array_2 = ", array_1 - array_2) # subtracts from each element of array_1 the corresponding element of array_2
print("array_1 / array_2 = ", array_1 / array_2) # divides each element of array_1 by the corresponding element of array_2

# Indexing for access to specific elements
# The first element of the array has index 0, the second has index 1, ...
print("\nIndexing arrays")
print("array_1[0] = ", array_1[0])
print("array_1[1] = ", array_1[1])
print("array_1[2] = ", array_1[2])
print("array_1[3] = ", array_1[3])

Functions can be used to modify components of arrays in many ways:

In [None]:
## Example 2: The following function
# transforms an array by multiplying
# each element by its index
import numpy as np

def transform_array(array):
    # enumerate gives access to index
    # and elements of a container
    for idx, num in enumerate(array):
        # in-place multiplication multiplies the element
        # by the index, changing the input array itself
        array[idx] *= idx

    return array

test_array = np.array([1,2,3,4])
print("test array = ", test_array)
print("transformed array = ", transform_array(test_array))
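
Note that `transform_array` changes `test_array` itself rather than building a copy. As a sketch of an alternative (the name `transform_array_vectorised` is made up for this illustration), the same result can be obtained without a loop by multiplying with `np.arange`, which leaves the input untouched:

```python
import numpy as np

def transform_array_vectorised(array):
    # np.arange(len(array)) builds the indices [0, 1, 2, ...];
    # the product is a new array, so the input is not modified
    return array * np.arange(len(array))

test_array = np.array([1, 2, 3, 4])
transformed = transform_array_vectorised(test_array)
print("test array = ", test_array)
print("transformed array = ", transformed)
```

Whether to modify an array in place or return a new one is a design choice; returning a new array makes it harder to overwrite data by accident.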

Arrays can also be **multi-dimensional**!

In [None]:
## Example 3: Multi-dimensional arrays
import numpy as np

multi_array_1 = np.array([[1,2],[3,4]]) # This forms a 2x2 matrix!
multi_array_2 = np.array([[5,6],[7,8]])

# Multi-dimensional array operations:
print("Multi-dimensional array operations")
print("multi_array_1 = \n", multi_array_1)
print("\nmulti_array_2 = \n", multi_array_2)
print("\nmulti_array_1 + multi_array_2 = \n", multi_array_1 + multi_array_2)

# Indexing is more complicated:
print("\n---\n\nIndexing multi-dimensional arrays")
print("multi_array_1[0,0] = \n", multi_array_1[0,0])
print("\nmulti_array_1[0,1] = \n", multi_array_1[0,1])
print("\nmulti_array_1[1,0] = \n", multi_array_1[1,0])
print("\nmulti_array_1[1,1] = \n", multi_array_1[1,1])

# The 'shape' function tells you the dimensions of the array
print("\n---\n\nShape of multi_array_1 = ", np.shape(multi_array_1)) # 2x2 matrix
print("\nShape of multi_array_2 = ", np.shape(multi_array_2)) # 2x2 matrix
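
Whole rows and columns can be selected with the slice `:`, which is handy when each row of an array is a bin and each column a quantity. A short sketch with the same kind of 2x2 array as in the example above:

```python
import numpy as np

multi_array = np.array([[1, 2], [3, 4]])

first_row = multi_array[0, :]      # all columns of row 0
second_column = multi_array[:, 1]  # all rows of column 1

print("first row = ", first_row)
print("second column = ", second_column)
```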


___
#### Exercise 4a: Array transformations

Your task is to compare data published by an experiment with theoretical predictions. The experimental data are provided as a list of data points in the following format:
$$[\mathrm{ \tt x\_low \quad x\_high \quad value \quad error}],$$
where $\mathrm{\tt x\_low}$ and $\mathrm{\tt x\_high}$ are the edges of the bin in $x$, $\mathrm{\tt value}$ is the value of the histogram in that bin, and $\mathrm{\tt error}$ is the uncertainty on $\mathrm{\tt value}$.
The theoretical predictions, similar to those you produced in Exercise 3d, will be in the form:
$$[\mathrm{ \tt x\_centre \quad value \quad error}].$$

Write a function that combines one experimental data point with the matching theory prediction and returns it in the form $$[\mathrm{ \tt x\_centre \quad x\_width \quad experiment/theory \quad error}].$$

In [None]:
# Define function to calculate the ratio of experimental data to theoretical predictions
import numpy as np

def convert_bin_contribs(experimental_data, theory_data):
    # Write some code here
    return # Return value here

# TODO: Change data values
experimental_example = np.array([0.05, 0.1, 5, 0.5])
theory_example = np.array([0.075, 5.1, 0.0001])
print(convert_bin_contribs(experimental_example, theory_example))

___

## Reading external files

Data from experiments and from theoretical predictions can be imported in a variety of formats, often from external files.

We have included data in a format similar to the one from Exercise 4a, in the `.dat` files in the same folder as this notebook.
We can read these in Python in a number of different ways:

In [None]:
import numpy as np

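# Native Python open() function
# 'r' argument means 'read-only'; the 'with' block closes the file afterwards
with open("small-angle-data.dat", "r") as f:
    contents = f.read() # a second call to read() would return an empty string, so store the result
print(contents)
print(type(contents),"\n\n") # The native Python 'open' function gives the file contents as a string - not a useful format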

# Try with numpy - which has some smarter functions!
example_histogram = np.genfromtxt("small-angle-data.dat", dtype=float)
print(example_histogram)
print(type(example_histogram)) # The method has converted the data to a numpy array!
print(np.shape(example_histogram)) # The dimensions are right too - 5 bins, 4 columns for each bin!
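
`np.genfromtxt` also accepts file-like objects, so the parsing can be tried without a file on disk. The sketch below uses made-up numbers (they are not the contents of `small-angle-data.dat`) in the four-column layout of Exercise 4a, and shows how to pick out one row or one column of the result:

```python
import io
import numpy as np

# Invented two-bin example in the layout: x_low  x_high  value  error
fake_file = io.StringIO("0.05 0.10 5.0 0.5\n0.10 0.15 3.0 0.3\n")

histogram = np.genfromtxt(fake_file, dtype=float)
print(np.shape(histogram))  # (number of bins, number of columns)

first_bin = histogram[0]   # one row: all four entries of the first bin
values = histogram[:, 2]   # one column: the 'value' entry of every bin
print(first_bin)
print(values)
```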

___
#### Exercise 4b: Complex array transformations

Using the solution to Exercise 4a (i.e. not writing a new function from scratch), write a function to convert the entire data file to the format we want.

In [None]:
# Define function to convert the whole datafile,
# using the function(s) you have created earlier.
import numpy as np

def convert_datafile(datafile):
    # Hints:
    # Can you use some of the tools
    # we introduced in the previous
    # notebook to enable the use of
    # the function from Exercise 4a?
    return # Return converted data here

___
#### Exercise 4c: Reading files and comparing predictions with data

Using the solution to Exercise 4b, and the example for reading files, create a new variable that holds the converted experimental data.
Next, use the solution to Exercise 3c to calculate the theory prediction for the cross-section, as integrated over the experimental angular bins.

In [1]:
# Load the data from the two text files, and use your function from Exercise 4b to combine them

___

## Plotting revisited

We will now plot our estimate of the integrated luminosity from each different bin of the data from `small-angle-data.dat`.
Remember from the first notebook that integrated luminosity is the ratio of the number of events to the cross-section.
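
In symbols, the luminosity estimate from bin $i$ is $L_i = N_i / \sigma_i$, so event counts divided by cross-sections in $\mathrm{pb}$ give luminosities in $\mathrm{pb}^{-1}$. A minimal sketch with invented numbers (not taken from `small-angle-data.dat`), assuming the uncertainty on the predicted cross-section is negligible compared with that on the counts:

```python
import numpy as np

# Invented numbers for illustration only
n_events = np.array([1000.0, 400.0])  # event counts in two bins
n_errors = np.array([10.0, 8.0])      # uncertainties on the counts
sigma_pb = np.array([50.0, 20.0])     # predicted cross-section per bin, in pb

luminosity = n_events / sigma_pb        # in pb^-1
# Assumption: the theory uncertainty is neglected, so only
# the uncertainty on the counts is propagated
luminosity_error = n_errors / sigma_pb  # in pb^-1

print("luminosity per bin = ", luminosity)
print("uncertainty per bin = ", luminosity_error)
```

If the theory uncertainty is not negligible, it would need to be included in the propagated error as well.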

Also remember the guidance from the introduction of what goes into a good plot!

#### Exercise 4d: Producing the plot

Produce the plot described above, paying attention to the guidance from the introduction notebook.

You can reuse the variables and functions you created earlier.

Some useful links for matplotlib:
- [Documentation for errorbar function](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.errorbar.html)
- [Example 1](https://matplotlib.org/stable/gallery/statistics/errorbar.html) and [Example 2](https://matplotlib.org/stable/gallery/statistics/errorbar_features.html) of using the errorbar function
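
As a rough sketch of how `plt.errorbar` can be called for binned data, the example below uses invented numbers and placeholder axis labels. Here `xerr` is set to half the bin width so that the horizontal bars span each bin:

```python
import numpy as np
import matplotlib.pyplot as plt

# Invented three-bin example, for illustration only
x_centre = np.array([0.075, 0.125, 0.175])
x_width = np.array([0.05, 0.05, 0.05])
y_value = np.array([1.02, 0.98, 1.01])
y_error = np.array([0.03, 0.02, 0.04])

plt.figure()
# fmt="o" draws markers only, without a line joining the points
plt.errorbar(x_centre, y_value, xerr=x_width / 2, yerr=y_error, fmt="o", capsize=3)
plt.xlabel("x (placeholder label)")
plt.ylabel("y (placeholder label)")
```

In a notebook the figure is normally displayed when the cell finishes; calling `plt.show()` at the end of the cell does this explicitly.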

In [7]:
import matplotlib.pyplot as plt

# Tell matplotlib to use LaTeX rendering, and a large font size
# (usetex=True needs a working LaTeX installation; set it to False if rendering fails)
plt.rc('text', usetex=True)
plt.rc('font', family='serif', size=18)

# Load the data from the text files

# Plot the ratio of experimental data to theoretical prediction with errors using the plt.errorbar function

# Add x and y axis labels

# Now show the plot you have made
plt.show()

Does the value for the luminosity in each bin look consistent?
If so, average the five values to produce a combined luminosity value.