# [ATLAS-CONF-2014-054](https://cds.cern.ch/record/1951322/files/ATLAS-CONF-2014-054.pdf)

## Combination of ATLAS and CMS top quark pair cross section measurements in the $e\mu$ final state using proton-proton collisions at $\sqrt{s} = 8$ TeV

In [1]:
from blue import Blue
import numpy as np
import pandas as pd

This notebook performs the BLUE combination of the $t\bar{t}$ cross-section measurements made by ATLAS and CMS at $\sqrt{s} = 8$ TeV.

The first step is to read in the data: the measured values, their uncertainties, and the correlations between uncertainties. These are stored in a CSV file, which we read with `pandas`. The information is essentially the same as Table 1 of the combination note, without the combined-result column.

In [2]:
df = pd.read_csv('./data/lhc_ttbar_xsec.csv')
df.T

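|  | ATLAS | CMS | Correlations |
|---|---|---|---|
| Experiment | 242.4 | 239.0 | 0.0 |
| Stat | 1.7 | 2.6 | 0.0 |
| Trig | 0.4 | 3.6 | 0.0 |
| LepS | 1.2 | 0.2 | 0.0 |
| LepI | 1.7 | 4.0 | 0.0 |
| JetR | 1.2 | 3.0 | 0.0 |
| JetI | 0.1 | 0.0 | 0.0 |
| btag | 1.0 | 1.7 | 0.0 |
| Pile | 0.0 | 2.0 | 0.0 |
| JESu | 0.6 | 4.3 | 0.0 |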


Next we need to put this into the format that `Blue` expects. The `Blue` class takes a pandas dataframe holding the measured values and their uncertainties, and a mapping of correlations. We get these by separating the `Correlations` row from the data read in above (it appears as a column in the transposed view) and dropping its dummy `Experiment` entry.

In [3]:
correlations = df.loc['Correlations'].drop('Experiment')
df = df.drop('Correlations')

We can peek at the first five rows of the transposed dataframe using `head`. The `Correlations` column is no longer there.

In [4]:
df.T.head()

|  | ATLAS | CMS |
|---|---|---|
| Experiment | 242.4 | 239.0 |
| Stat | 1.7 | 2.6 |
| Trig | 0.4 | 3.6 |
| LepS | 1.2 | 0.2 |
| LepI | 1.7 | 4.0 |


Next, we construct our instance of the `Blue` class with our data and correlation assumptions.

In [5]:
comb = Blue(df, correlations)
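To make the role of the two inputs concrete, here is a minimal sketch of how a total covariance matrix is commonly assembled in a BLUE combination: each uncertainty source contributes $\sigma_i^2$ on the diagonal and $\rho\,\sigma_i\sigma_j$ off the diagonal, and the contributions are summed over sources. The function `total_covariance` and the toy numbers are invented for illustration; this is not taken from the `blue` package and may differ from what `Blue` does internally.

```python
import numpy as np
import pandas as pd


def total_covariance(uncertainties, correlations):
    """Sum per-source covariance matrices (hypothetical helper).

    uncertainties: rows are measurements, columns are uncertainty sources.
    correlations: one correlation coefficient per source.
    """
    n = len(uncertainties)
    cov = np.zeros((n, n))
    for source in uncertainties.columns:
        sigma = uncertainties[source].to_numpy(dtype=float)
        rho = float(correlations[source])
        # Correlation matrix for this source: 1 on the diagonal, rho elsewhere
        corr = np.full((n, n), rho)
        np.fill_diagonal(corr, 1.0)
        cov += corr * np.outer(sigma, sigma)
    return cov


# Toy inputs, invented for illustration (not the values from the CSV file)
toy_unc = pd.DataFrame({'Stat': [1.0, 2.0], 'Lumi': [3.0, 4.0]}, index=['A', 'B'])
toy_corr = pd.Series({'Stat': 0.0, 'Lumi': 0.5})
toy_cov = total_covariance(toy_unc, toy_corr)
print(toy_cov)
```

In the toy case the uncorrelated `Stat` source only adds to the diagonal, while the half-correlated `Lumi` source also fills the off-diagonal elements.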

The combined cross-section value obtained with the BLUE method is:

In [6]:
comb.combined_result

241.46454311755085

which can be compared with the combined value quoted in the combination note of 241.5. So far so good!

We can create a new `pandas.Series` from our result and the combined uncertainties and then append this to our input data. We can even add a total uncertainty column by summing the individual uncertainties of each measurement (including the combination) in quadrature.

In [7]:
res = pd.Series(
    {comb.results_column: comb.combined_result, 
     **comb.combined_uncertainties}, name='Combination',
)

In [8]:
# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
final_df = pd.concat([df, res.to_frame().T])
final_df['Total uncertainty'] = np.sqrt((final_df.drop('Experiment', axis=1)**2).sum(axis=1))
np.round(final_df[[*df.columns, 'Total uncertainty']].T, 1)

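|  | ATLAS | CMS | Combination |
|---|---|---|---|
| Experiment | 242.4 | 239.0 | 241.5 |
| Stat | 1.7 | 2.6 | 1.4 |
| Trig | 0.4 | 3.6 | 1.0 |
| LepS | 1.2 | 0.2 | 0.9 |
| LepI | 1.7 | 4.0 | 1.7 |
| JetR | 1.2 | 3.0 | 1.2 |
| JetI | 0.1 | 0.0 | 0.1 |
| btag | 1.0 | 1.7 | 0.9 |
| Pile | 0.0 | 2.0 | 0.6 |
| JESu | 0.6 | 4.3 | 1.3 |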


This now closely matches Table 1 of the combination note. Some values differ very slightly, which could be due to the precision of the input data.

We can look at the weights each experiment contributed.

In [9]:
comb.weights

array([ 0.72486562,  0.27513438])

and we can test the consistency of the measurements with the combination using a $\chi^2$ test:

In [10]:
from scipy.stats import chi2
chi2_ndf = comb.chi2_ndf
print('chi2 =', chi2_ndf[0], ', ndf =', chi2_ndf[1], ', p-value =', chi2.sf(*chi2_ndf))

chi2 = 0.054031315728 , ndf = 1 , p-value = 0.816191335626
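As a cross-check, the same $\chi^2$ and p-value can be reproduced by hand. The sketch below assumes the $\chi^2$ is the usual quadratic form of the residuals with the inverse total covariance matrix, with one degree of freedom (two measurements minus one fitted value). The covariance values are copied from the `comb.total_covariance` output printed later in this notebook, so the agreement is limited by that printed precision.

```python
import numpy as np
from scipy.stats import chi2

# Inputs copied from the notebook outputs
values = np.array([242.4, 239.0])
cov = np.array([[89.0, 30.135], [30.135, 185.22]])
combined = 241.46454311755085

# chi2 = r^T Sigma^{-1} r, with r the residuals to the combined value
residuals = values - combined
chi2_value = residuals @ np.linalg.solve(cov, residuals)
ndf = len(values) - 1

# Survival function gives the probability of a chi2 at least this large
p_value = chi2.sf(chi2_value, ndf)
print('chi2 =', chi2_value, ', ndf =', ndf, ', p-value =', p_value)
```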


In [11]:
print('Correlation between ATLAS and CMS measurements = {:.1%}'.format(comb.total_correlations[0, 1]))

Correlation between ATLAS and CMS measurements = 23.5%


These numbers are in good agreement with the values quoted in the combination note.
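The quoted correlation follows directly from the total covariance matrix: divide the off-diagonal element by the product of the two total uncertainties. A minimal sketch, using the matrix values copied from the `comb.total_covariance` output printed later in this notebook (the variable names are illustrative only):

```python
import numpy as np

# Total covariance matrix copied from the notebook output
cov = np.array([[89.0, 30.135], [30.135, 185.22]])

# Total uncertainty of each measurement is the square root of the diagonal
total_unc = np.sqrt(np.diag(cov))

# Normalise the covariance to obtain the correlation matrix
corr = cov / np.outer(total_unc, total_unc)
print('Correlation between ATLAS and CMS measurements = {:.1%}'.format(corr[0, 1]))
```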

### Result from $\chi^2$ minimization

The BLUE method is equivalent to a $\chi^2$ minimization. The $\chi^2$ as a function of the combined value $x$, for the vector of measurements $\sigma$ (with $x$ subtracted from each component), can be written:

$\chi^2(x) = (\sigma - x)^{T} \Sigma^{-1} (\sigma - x)$ ,

where $\Sigma^{-1}$ is the inverse of the total covariance matrix. We can get the total covariance matrix using:

In [12]:
cov = comb.total_covariance
cov

array([[  89.   ,   30.135],
       [  30.135,  185.22 ]])
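With this matrix in hand, the weights and combined value reported earlier can be reproduced from the standard BLUE expressions $w = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^{T}\Sigma^{-1}\mathbf{1})$ and $x = w^{T}\sigma$. This is a sketch using the rounded values printed above, not a description of how the `blue` package computes them internally.

```python
import numpy as np

# Values copied from the notebook outputs
values = np.array([242.4, 239.0])
cov = np.array([[89.0, 30.135], [30.135, 185.22]])

ones = np.ones(len(values))
# Solve Sigma y = 1 rather than forming the inverse explicitly
inv_cov_ones = np.linalg.solve(cov, ones)

# Weights sum to one by construction
weights = inv_cov_ones / (ones @ inv_cov_ones)
combined = weights @ values

# Variance of the combined value: 1 / (1^T Sigma^{-1} 1)
combined_variance = 1.0 / (ones @ inv_cov_ones)
print(weights, combined, np.sqrt(combined_variance))
```

ATLAS receives the larger weight because its total variance is smaller.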

We'll use `scipy` to minimize the $\chi^2$. The function to minimize is written as:

In [13]:
from scipy.optimize import minimize

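def chi_square_fit(res, cov):
    # Invert the covariance matrix once, outside the function being minimized
    inv_cov = np.linalg.inv(cov)
    def func(x):
        # Residuals between each measurement and the trial combined value x
        diff = res.values - x
        return diff.T @ inv_cov @ diff
    return func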

which can then be minimized using the following function call:

In [14]:
min_res = minimize(chi_square_fit(df.Experiment, cov), [0])
min_res

      fun: 0.054031315727974004
 hess_inv: array([[ 36.40208917]])
      jac: array([  4.65661287e-10])
  message: 'Optimization terminated successfully.'
     nfev: 21
      nit: 3
     njev: 7
   status: 0
  success: True
        x: array([ 241.46454312])

The result from the minimization can then be compared with the result from the BLUE method:

In [15]:
comb.combined_result, min_res.x[0]

(241.46454311755085, 241.46454312233828)

which agree to about ten significant figures.
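The fit output also carries uncertainty information. For a $\chi^2$ that is quadratic in $x$, the variance of the fitted value is twice the inverse of the second derivative, so `2 * hess_inv` should match the analytic BLUE variance $1 / (\mathbf{1}^{T}\Sigma^{-1}\mathbf{1})$. The sketch below checks this under the assumption that `minimize` uses its default BFGS method, whose `hess_inv` is only an approximation built up during the iterations; the inputs are copied from the outputs above.

```python
import numpy as np
from scipy.optimize import minimize

# Values copied from the notebook outputs
values = np.array([242.4, 239.0])
cov = np.array([[89.0, 30.135], [30.135, 185.22]])
inv_cov = np.linalg.inv(cov)


def chi_square(x):
    # x is a length-1 array holding the trial combined value
    residuals = values - x[0]
    return residuals @ inv_cov @ residuals


# Same starting point as in the notebook
fit = minimize(chi_square, x0=[0.0])

# Analytic variance of the BLUE combination
ones = np.ones(len(values))
analytic_variance = 1.0 / (ones @ inv_cov @ ones)

# Variance estimated from the approximate inverse Hessian of the chi2
hessian_variance = 2.0 * fit.hess_inv[0, 0]
print(np.sqrt(analytic_variance), np.sqrt(hessian_variance))
```

Both numbers estimate the total uncertainty on the combined cross-section, and the factor of two is consistent with the `hess_inv` value shown in the fit output.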