# Model prediction maker

The setup is as follows: we have prepared dijet asymmetry data, where the asymmetry AJ is defined as the difference between the transverse momenta of the two jets divided by their sum. Specifically,

$$A_{\mathrm{j}} = \frac{p_{\mathrm{T, 1}} - p_{\mathrm{T, 2}}}{p_{\mathrm{T, 1}} + p_{\mathrm{T, 2}}}$$
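
As a quick sanity check of the definition, here is a minimal sketch that evaluates AJ for one made-up jet pair. The helper name `dijet_asymmetry` and the 100 GeV / 60 GeV values are illustrative assumptions, not part of the provided data.

```python
def dijet_asymmetry(pt_leading, pt_subleading):
    """Return AJ = (pT1 - pT2) / (pT1 + pT2) for one dijet pair."""
    return (pt_leading - pt_subleading) / (pt_leading + pt_subleading)

# Illustrative values only: a 100 GeV leading jet and a 60 GeV subleading jet.
aj_example = dijet_asymmetry(100.0, 60.0)
print(aj_example)  # (100 - 60) / (100 + 60) = 0.25
```

A perfectly balanced pair gives AJ = 0, and the value approaches 1 as the subleading jet loses nearly all of its momentum.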

We will construct a model to describe the energy loss observed in the dijet asymmetry. For this model, we consider back-to-back dijets. Each jet can lose energy, and the fractional energy loss of each jet is drawn from a Gaussian with mean A and width B:

$$ \Delta p_{\mathrm{T}} / p_{\mathrm{T}} \sim \mathrm{Gaus}(A, B)$$

In addition to the energy loss contribution, there is extra "apparent" smearing of AJ because other processes occur in the events (three-jet events, etc.). It is parameterized as a Gaussian smearing of AJ with width C. So there are three parameters in total: A, B, and C.

The measurement is done in two centrality bins: one in central events, where A, B, and C are all relevant, and another in very peripheral events, where only C is relevant.
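
The model described above can be sketched in vectorized form as follows. This is only one possible reading of the setup: it assumes that a lost fraction f leaves a final pT of pT0 * (1 - f), and the function name `sample_aj`, the event count, the seed, and the parameter values are all hypothetical choices made for illustration.

```python
import numpy as np

def sample_aj(a, b, c, n_events=10000, seed=0):
    """Sample |AJ| for n_events back-to-back dijets (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    pt_initial = 100.0  # GeV; this value cancels in AJ
    # Assumption: fractional loss f ~ Gaus(a, b), final pT = pT0 * (1 - f).
    j1 = pt_initial * (1.0 - rng.normal(a, b, n_events))
    j2 = pt_initial * (1.0 - rng.normal(a, b, n_events))
    aj = (j1 - j2) / (j1 + j2)
    # Extra "apparent" Gaussian smearing of width c on AJ.
    aj = aj + rng.normal(0.0, c, n_events)
    # AJ is leading minus subleading, so it is non-negative.
    return np.abs(aj)

# Central events use all three parameters; peripheral events only use c.
central = sample_aj(0.2, 0.1, 0.05)
peripheral = sample_aj(0.0, 0.0, 0.05)
print(central.mean(), peripheral.mean())
```

With A = B = 0 both jets keep their initial pT, so the peripheral sample is just the absolute value of the width-C smearing. For a very large B the denominator can come close to zero, which this sketch does not guard against.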

The goal of this notebook is to produce the inputs needed for Bayesian inference, which will be used to learn about A, B, and C from the provided data.

In [None]:
import numpy as np
import os

Folder = 'input/AJHomework/'

# makedirs also creates the parent 'input/' directory if it is missing
os.makedirs(Folder, exist_ok=True)

In [None]:
DataXMin        = 0.000
DataXMax        = 1.000
DataXBin        = 0.025

# round before int() so floating-point error cannot drop a bin
DataNBin        = int(round((DataXMax - DataXMin) / DataXBin))

# how many design points do you want to generate?
NDesign         = ______

# What is the upper parameter range (one each for A, B, C)?
# The lower range for each parameter is 0 by construction.
# Hint: start with a large-range guess!  Then we can come back and reduce the range
ParameterRanges = [______]

## The "prediction" function

Let's write a function that does the required smearing, fills a histogram of the final AJ, and returns the prediction.

In [None]:
def Predict(A, B, C):
    N = 100000
    
    Hist = np.zeros(DataNBin)
    
    for i in range(N):
        # Jet 1 and jet 2 PT (J1 and J2) after quenching.
        # Assuming the initial energy is 100 GeV and (delta PT / PT) ~ gaus(A, B), calculate the final energy:
        # Jet PT = 100 GeV * (?)
        # Note that the initial energy cancels out in AJ.
        # Useful function: np.random.normal(1, 2) returns one random sample from a Gaussian with mean 1 and sigma 2
        J1 = ______
        J2 = ______
        # Calculate AJ from the PTs
        AJ = (J1 - J2) / (J1 + J2)
        # Adding extra gaussian smearing from parameter C
        AJ = AJ + ______
        # AJ is defined to be leading - subleading -> positive!
        AJ = np.abs(AJ)

        # put things into bins
        Bin = int((AJ - DataXMin) / DataXBin)
        if Bin < 0:   # underflow
            Bin = 0
        if Bin >= DataNBin:   # overflow
            Bin = DataNBin - 1
        
        Hist[Bin] = Hist[Bin] + 1
        
    return Hist / N
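
The per-event binning in the loop above can also be done on a whole array at once. The following is a minimal sketch of that idea, not a replacement for the exercise: the helper name `bin_fractions` is hypothetical, its defaults simply mirror `DataXMin`, `DataXMax`, and `DataNBin`, and the input values are made up.

```python
import numpy as np

def bin_fractions(values, x_min=0.0, x_max=1.0, n_bins=40):
    """Bin values into n_bins equal bins, folding under/overflow into the edge bins."""
    values = np.asarray(values, dtype=float)
    width = (x_max - x_min) / n_bins
    idx = np.floor((values - x_min) / width).astype(int)
    # Same treatment as the loop: underflow goes to bin 0, overflow to the last bin.
    idx = np.clip(idx, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    return counts / len(values)

# Made-up AJ values; 1.70 is out of range and lands in the last bin.
fractions = bin_fractions([0.01, 0.02, 0.31, 1.70])
print(fractions[0], fractions[12], fractions[39])
```

Because the result is divided by the number of entries, the bins sum to one, just like `Hist / N`.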

### Test the prediction (cross check for yourself)

In [None]:
# Test predicting one point - to see if the output makes sense or not
# Once you are happy, we move on!
example_prediction = Predict(______, ______, ______)
example_prediction

In [None]:
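# Alternatively (or in addition), plot the AJ distribution for our single point
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(5,5))
# Bin centers computed from the bin count, so the x array always has DataNBin entries
BinCenters = DataXMin + (np.arange(DataNBin) + 0.5) * DataXBin
ax.plot(BinCenters, example_prediction, marker="o", linestyle="")
ax.set_xlabel("AJ")
ax.set_ylabel("Fraction of dijets per bin")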

## Making the design points

Let's start with a very simple random array :D

In reality we would use a more sophisticated scheme to spread the points more evenly over the parameter space, but let's start simple. The fancier methods are just better ways to achieve the same goal.
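
One common example of such a scheme is a Latin hypercube, which puts exactly one point in each of N equal slices along every parameter axis. The sketch below is a hand-rolled NumPy version for illustration only; the function name `latin_hypercube`, the point count, and the upper bounds are assumptions, and the notebook itself uses the plain random array.

```python
import numpy as np

def latin_hypercube(n_points, upper_bounds, seed=0):
    """Return an (n_points, n_dim) design on [0, upper) with one point per slice per axis."""
    rng = np.random.default_rng(seed)
    upper = np.asarray(upper_bounds, dtype=float)
    n_dim = upper.size
    # Row k starts in slice k of every axis, at a random position inside that slice.
    unit = (np.arange(n_points)[:, None] + rng.random((n_points, n_dim))) / n_points
    # Shuffle each column independently so the slices are paired at random.
    for d in range(n_dim):
        unit[:, d] = rng.permutation(unit[:, d])
    # Scale from the unit cube to [0, upper) for each parameter.
    return unit * upper

# Hypothetical upper ranges for (A, B, C).
design = latin_hypercube(8, [1.0, 0.5, 0.5])
print(design.shape)
```

Compared with purely random points, this avoids clusters and gaps along any single parameter, which tends to matter most when the number of design points is small.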

In [None]:
Design = np.random.rand(NDesign, 3) * ParameterRanges

## Preparing the model predictions

Let's loop over the design points, and call the predict function we just wrote to make a big table!

This step can take a few minutes.  Please be patient.
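
The shape of the resulting table is easiest to see with a fast stand-in. In the sketch below, `toy_predict` is a hypothetical placeholder that returns a dummy normalized histogram, and the design values are random; only the looping pattern and the array shapes are the point.

```python
import numpy as np

def toy_predict(a, b, c, n_bins=40):
    """Hypothetical stand-in for the prediction function: a dummy normalized histogram."""
    centers = (np.arange(n_bins) + 0.5) / n_bins
    width = np.sqrt(b**2 + c**2) + 1e-3
    shape = np.exp(-0.5 * (centers / width) ** 2)
    return shape / shape.sum()

rng = np.random.default_rng(1)
design = rng.random((5, 3)) * [0.5, 0.5, 0.5]  # 5 design points, 3 parameters

# One histogram per design point: each row of the design unpacks into (a, b, c).
central_table = np.array([toy_predict(a, b, c) for a, b, c in design])
# Peripheral events: a and b are set to 0, only c is taken from the design.
peripheral_table = np.array([toy_predict(0.0, 0.0, c) for _, _, c in design])

print(central_table.shape)                # (5, 40): design points x bins
print(np.transpose(central_table).shape)  # (40, 5): bins x design points
```

The transposed layout, with one row per AJ bin and one column per design point, is what `np.transpose` produces when the tables are written out.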

In [None]:
# Generate prediction for "central" data
Y1 = [Predict(______, ______, ______) for i in Design]
# Generate prediction for "peripheral" data.  Note here A and B are irrelevant.  So we set them to 0
Y2 = [Predict(______, ______, ______) for i in Design]

## Write everything out

In [None]:
with open(Folder + 'Prediction_Selection1.dat', 'w') as f:
    f.write('# Version 1.0\n')
    f.write('# Data Data_Selection1.dat\n')
    f.write('# Design Design.dat\n')
    np.savetxt(f, np.transpose(Y1))

In [None]:
with open(Folder + 'Prediction_Selection2.dat', 'w') as f:
    f.write('# Version 1.0\n')
    f.write('# Data Data_Selection2.dat\n')
    f.write('# Design Design.dat\n')
    np.savetxt(f, np.transpose(Y2))

In [None]:
with open(Folder + 'Design.dat', 'w') as f:
    f.write('# Version 1.0\n')
    f.write('# Parameter A B C\n')
    np.savetxt(f, Design)
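
To check that a written file can be read back, a round trip along the following lines may help. This is a sketch under stated assumptions: it writes to a temporary directory rather than `Folder`, and the design values are made up. `np.loadtxt` skips lines starting with `#` by default, so the header lines do not interfere with the numbers.

```python
import os
import tempfile
import numpy as np

design = np.array([[0.1, 0.2, 0.3],
                   [0.4, 0.5, 0.6]])  # made-up design points

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'Design.dat')
    # Same layout as above: two '#' header lines, then the table.
    with open(path, 'w') as f:
        f.write('# Version 1.0\n')
        f.write('# Parameter A B C\n')
        np.savetxt(f, design)
    # Collect the header lines separately from the numeric content.
    with open(path) as f:
        header = [line.strip() for line in f if line.startswith('#')]
    loaded = np.loadtxt(path)

print(header)
print(loaded.shape)
```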