<p style="font-family: Arial; font-size:3em;color:purple; font-style:bold"><br>Cuts Optimization using Extreme Gradient Boosting
<br></p><br>

In recent years, **machine learning** tools have been applied successfully to problems in high-energy physics, for example to the classification of physics objects. Supervised machine learning algorithms can significantly improve classification by taking correlations between observables into account and by learning the optimal selection from examples, e.g. from Monte Carlo simulations.


# Importing the Libraries

**NumPy** provides fast array operations in Python; we import it as `np`. **Pandas** is built on NumPy and provides two central objects, *Series* and *DataFrame*. It works well for *data ingestion* and also has *data visualization* features. The *Analysis Tree* is read into the notebook with the local `tree_importer` helper and with **uproot**.

**Matplotlib** comes in handy for plotting, while the machine learning itself is performed by **XGBoost**. We import the data splitter *train_test_split* from **Scikit-learn**. **Evaluation metrics** such as the *confusion matrix*, the *receiver operating characteristic (ROC)* curve, and the *area under the ROC curve (ROC AUC)* will be used to assess our models.

A **Confusion Matrix** $C$ is such that $C_{ij}$ is equal to the number of observations known to be in group $i$ and predicted to be in group $j$. With signal as the first group (the label order `[1, 0]` used later in this notebook), the count of true positives in binary classification is $C_{00}$, false negatives $C_{01}$, false positives $C_{10}$, and true negatives $C_{11}$.

If $y^{'}_{i}$ is the predicted value of the $i$-th sample and $y_{i}$ is the corresponding true value, then the number of true positives among $n_{samples}$ samples is defined as
$$
\text{True positives}(y, y^{'}) = \sum_{i=1}^{n_{samples}} 1(y^{'}_{i} = y_{i} = 1)
$$
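As a minimal sketch, the indicator sum above can be written directly with NumPy. The arrays `y` and `y_pred` are made-up toy labels for illustration, not notebook data.

```python
import numpy as np

# Toy labels (hypothetical): y = MC truth, y_pred = predicted class
y = np.array([1, 1, 1, 0, 0, 0, 0, 1])
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 1])

# Indicator 1(y'_i = y_i = 1), summed over all samples
true_positives = int(np.sum((y_pred == 1) & (y == 1)))
print(true_positives)
```

The other three cells of the confusion matrix follow from the same pattern with the conditions changed.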

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

from sklearn.model_selection import RandomizedSearchCV, cross_val_score
from scipy.stats import uniform

import weakref 

from bayes_opt import BayesianOptimization
#from root_pandas import read_root


from data_cleaning import clean_df
from KFPF_lambda_cuts import KFPF_lambda_cuts
from plot_tools import AMS, preds_prob, plot_confusion_matrix
from tree_importer import tree_importer
import uproot


#To save some memory we will delete unused variables
class TestClass(object): 
    def check(self): 
        print ("object is alive!") 
    def __del__(self): 
        print ("object deleted") 
        
from concurrent.futures import ThreadPoolExecutor
executor = ThreadPoolExecutor(8)

import gc

# Importing the data
CBM uses a modified version of CERN's ROOT software that contains the simulated CBM setup. Normally, an input file produced by an event generator, for example URQMD at 12 AGeV, is passed through a series of macros. These macros represent the CBM setup, which amounts to passing the generated particles through the detector, where they are registered as hits. The particle tracks are then reconstructed from these hits using a cellular automaton and a Kalman filter.


CBM uses the ROOT **tree** format to store information. To reduce the size of these ROOT files, a slimmed-down tree format called Analysis Tree was created. An Analysis Tree file contains most of the information that we need for physics analysis.

In this example, we load three Analysis Trees. The first one contains mostly background candidates for lambda, i.e. protons and pions which do not come from a lambda decay. The second file contains mostly lambda signal candidates, i.e. protons and pions which do come from a lambda decay. The third one contains 10k events generated with the URQMD generator at 12 AGeV.

In [None]:
# We import three root files into our jupyter notebook
signal = tree_importer('/home/shahid/cbmsoft/Data/PFSimplePlainTreeSignal.root','PlainTree')
# We only select lambda candidates
sgnal = signal[(signal['LambdaCandidates_is_signal']==1) & (signal['LambdaCandidates_mass']>1.108)
               & (signal['LambdaCandidates_mass']<1.1227)]
del signal

# Parallel processing

In [None]:
df_clean_signal = uproot.open('dcm_100k_signal.root:t1').arrays(library='pd')
gc.collect()
#df_clean_signal['issignal']=((df_clean_signal['issignal']>0)*1)
signal = df_clean_signal[(df_clean_signal['issignal']==1) & (df_clean_signal['mass']>1.108)
               & (df_clean_signal['mass']<1.1227)]
#del df_clean_signal

In [None]:
signal

In [None]:
del signal

In [None]:
signal['mass'].hist()

In [None]:
df_clean_urqmd = tree_importer('/home/shahid/Mount/gsi/u/flat_trees/PFSimplePlainTree_urqmd.root','PlainTree')
gc.collect()

In [None]:
del df_clean_urqmd

In [None]:
# Branches to read from the tree, and the shorter column names used in the analysis
labels=["LambdaCandidates_chi2geo", "LambdaCandidates_chi2primneg", "LambdaCandidates_chi2primpos",
         "LambdaCandidates_distance", "LambdaCandidates_ldl","LambdaCandidates_mass", "LambdaCandidates_pT", "LambdaCandidates_rapidity", "LambdaCandidates_is_signal"]

new_labels=['chi2geo', 'chi2primneg','chi2primpos', 'distance', 'ldl','mass', 'pT', 'rapidity','issignal']

In [None]:
# labels has to be defined before this cell runs; uproot.open itself takes no library argument
file = uproot.open('/home/shahid/Mount/gsi/u/flat_trees/PFSimplePlainTree_urqmd_5k.root:PlainTree').arrays(labels,library='np')

df_urqmd_5k= pd.DataFrame(data=file)
del file
df_urqmd_5k.columns = new_labels
df_urqmd_5k['issignal']=((df_urqmd_5k['issignal']>0)*1)
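# Treat +/-inf as missing values, then drop incomplete rows
# (the 'mode.use_inf_as_na' option is deprecated in recent pandas versions)
df_urqmd_5k = df_urqmd_5k.replace([np.inf, -np.inf], np.nan).dropna()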

In [None]:
plot_labels=[r'$\chi^{2}_{geometrical}$', r'$\chi^{2}_{primary\ \pi^-}$',r'$\chi^{2}_{primary\ proton}$', 'DCA (cm)', r'L/$\Delta$L','mass', '$p_{T}$', '$y_{LAB}$','issignal']

In [None]:
df3_base

In [None]:
fig, ax = plt.subplots(figsize=(8,6))
bin1 = 300 
plt.hist(df3_base[df3_base['issignal']==0]['distance'],bins = bin1, color = 'red',alpha = 0.3,label='Background')
plt.hist(df3_base[df3_base['issignal']==1]['distance'],bins = bin1, color = 'blue',label='Signal', alpha =0.3)
plt.yscale('log')
plt.grid()
plt.ylabel('counts (log scale)', fontsize = 18)
#plt.xlabel('$\chi^{2}_{geometrical}$', fontsize = 18)
plt.legend(fontsize=18)
ax.tick_params(axis='both', which='major', labelsize=18)
#ax.text(0, 1500, r'CBM Performance', fontsize=15)
#ax.text(0, 500, r'URQMD, Au+Au @ 12 $A$GeV/$c$', fontsize=15)
plt.xlabel('DCA (cm)', fontsize = 18)
plt.xlim([0,0.2])
fig.tight_layout()
fig.savefig('hists.png')
plt.show()

In [None]:
fig, ax = plt.subplots(figsize=(8,6))
bin1 = 300 
range1=[0,4000]
plt.hist(df_urqmd_5k[df_urqmd_5k['issignal']==0]['ldl'],bins = bin1,range=range1, color = 'red',alpha = 0.3,label='Background')
plt.hist(df_urqmd_5k[df_urqmd_5k['issignal']==1]['ldl'],bins = bin1, range=range1, color = 'blue',label='Signal', alpha =0.3)
#plt.vlines(x=4,ymin=-1,ymax=10000, color='r', linestyle='-')
plt.yscale('log')
plt.grid()
plt.ylabel('counts (log scale)', fontsize = 18)
plt.xlabel(plot_labels[4], fontsize = 18)
plt.legend(loc='upper right',fontsize=18)
ax.tick_params(axis='both', which='major', labelsize=18)
ax.text(0.3, 15000, r'CBM Performance', fontsize=15)
ax.text(0.3, 5000, r'URQMD, Au+Au @ 12 $A$GeV/$c$', fontsize=15)
#plt.ticklabel_format(style='sci', axis='x', scilimits=(0,0))
#ax.text(4, 10000, r'$PFSimple$', fontsize=20, color ='r')
#plt.xlim([0,20])
fig.tight_layout()
fig.savefig('hists.png')
plt.show()

In [None]:
df_clean = tree_importer('/home/shahid/Mount/gsi/u/flat_trees/apr20_fr_18.2.1_fs_jun19p1/dcmqgsm_smm_pluto/auau/12agev/mbias/sis100_electron_target_25_mkm/PFSimplePlainTree_dcm.root','PlainTree')
gc.collect()

In [None]:
bg = df_original[(df_original['LambdaCandidates_is_signal'] == 0)
                & ((df_original['LambdaCandidates_mass'] > 1.07)
                & (df_original['LambdaCandidates_mass'] < 1.108) | (df_original['LambdaCandidates_mass']>1.1227) 
                   & (df_original['LambdaCandidates_mass'] < 2))]

## Renaming the columns

In [None]:
#The column labels in the data frame carry the prefix LambdaCandidates_, so we rename them
new_labels= ['chi2geo', 'chi2primneg', 'chi2primpos', 'chi2topo', 'cosineneg',
       'cosinepos', 'cosinetopo', 'distance', 'eta', 'l', 'ldl',
       'mass', 'p', 'pT', 'phi', 'px', 'py', 'pz', 'rapidity',
             'x', 'y', 'z', 'daughter1id', 'daughter2id', 'isfrompv', 'pid', 'issignal']



sgnal.columns = new_labels
#bg.columns = new_labels

#Let's see what the data frame looks like
#df_original.columns=new_labels

The data frame above holds one column per feature; the last column carries the true Monte Carlo (MC) information. It tells us whether the reconstructed candidate really originates from a decaying particle: a value of 1 marks a true candidate and 0 marks a candidate that is not.

# Data Cleaning
Sometimes a data set contains entries which do not make sense, for example infinite values or NaN entries. We clean the data by removing these entries. Of course, we lose some data points, but such entries can cause problems during the analysis.

Since our experiment is a fixed-target experiment, certain additional constraints have to be applied to the data as well.
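The removal of infinite and NaN entries can be sketched on a toy data frame. The column values below are invented for illustration; only the cleaning step itself mirrors what the notebook does.

```python
import numpy as np
import pandas as pd

# Toy candidates (hypothetical values): one NaN and one infinite entry
toy = pd.DataFrame({
    "chi2geo": [1.2, np.inf, 0.7, 3.1],
    "mass": [1.115, 1.116, np.nan, 1.120],
})

# Turn +/-inf into NaN, then drop every row that has a missing value
toy_clean = toy.replace([np.inf, -np.inf], np.nan).dropna()

print(len(toy), "->", len(toy_clean))
```

Replacing infinities explicitly avoids relying on the pandas option `mode.use_inf_as_na`, which is deprecated in recent pandas versions.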

In [None]:
def clean_df(df):
    # Treat +/-inf as missing values and drop every row with a missing entry
    # (numpy and pandas are already imported at the top of the notebook)
    return df.replace([np.inf, -np.inf], np.nan).dropna()


In [None]:
#Creating a new data frame and saving the results in it after cleaning of the original dfs
#Also keeping the original one
#bcknd = clean_df(bg)
signal = clean_df(sgnal)

#del bg
del sgnal
gc.collect()

In [None]:
signal = df_clean[(df_clean['issignal']==1) & (df_clean['mass']>1.108)
               & (df_clean['mass']<1.1227)]

In [None]:
signal_selected= signal
background_selected = df_clean_urqmd[(df_clean_urqmd['issignal'] == 0)
                & ((df_clean_urqmd['mass'] > 1.07)
                & (df_clean_urqmd['mass'] < 1.108) | (df_clean_urqmd['mass']>1.1227) 
                   & (df_clean_urqmd['mass'] < 1.3))].sample(n=3*(signal_selected.shape[0]))
gc.collect()

# Selecting Background and Signal
Our sample contains a lot of background (2178718 candidates) and comparatively few signal candidates (36203). For the analysis we will use a signal set of 4000 candidates and a background set of 12000 candidates, i.e. three times as many. The background and signal candidates are selected using the MC information.
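A minimal sketch of this selection, assuming a toy data frame in which `issignal` plays the role of the MC flag. The sizes and the names `signal_toy`, `background_toy` and `mixed` are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Toy candidates (hypothetical): 'issignal' stands in for the MC truth flag
toy = pd.DataFrame({
    "mass": rng.uniform(1.07, 1.3, size=1000),
    "issignal": np.r_[np.ones(50, dtype=int), np.zeros(950, dtype=int)],
})

signal_toy = toy[toy["issignal"] == 1]
# Draw three background candidates for every signal candidate
background_toy = toy[toy["issignal"] == 0].sample(n=3 * len(signal_toy), random_state=0)

# Combine and shuffle so signal and background rows are interleaved
mixed = pd.concat([signal_toy, background_toy]).sample(frac=1, random_state=0)
print(mixed["issignal"].value_counts())
```

Fixing `random_state` makes the random draw reproducible, which helps when comparing different cut values on the same training sample.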

In [None]:
# We randomly choose our signal set of 4000 candidates
#signal_selected= signal

#background = 3 times the signal is also done randomly
#background_selected = bcknd.sample(n=3*(signal_selected.shape[0]))


#del signal
#del bcknd
#gc.collect()

#Let's combine signal and background
dfs = [signal_selected, background_selected]
df_scaled = pd.concat(dfs)

# Let's shuffle the rows randomly
df_scaled = df_scaled.sample(frac=1)
del dfs, signal_selected, background_selected
# Let's take a look at the top 10 entries of the df
df_scaled.iloc[0:10,:]
#del signal

In [None]:
print(df_scaled.shape)
df_scaled[df_scaled['issignal']==1].shape


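The rapidity is defined as

$$
y = \frac{1}{2} \ln\left(\frac{E + p}{E - p}\right)
$$

or, in code, `y = 0.5 * np.log((E + p) / (E - p))`. The beam energies are listed at https://cbm-wiki.gsi.de/foswiki/bin/view/PWG/CbmCollisionEnergies

For $E_{beam} = 12.04$ and $p_{beam} = 12$ this gives a beam rapidity of about 3.199, so mid-rapidity is at $y/2 \approx 1.600$.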
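The numbers can be checked with a few lines of Python. The helper name `rapidity` is introduced here for illustration only; the beam values are the ones quoted in the text.

```python
import math

def rapidity(energy, momentum):
    """Rapidity y = 0.5 * ln((E + p) / (E - p)) along the beam axis."""
    return 0.5 * math.log((energy + momentum) / (energy - momentum))

# Beam values quoted in the text: E = 12.04, p = 12
y_beam = rapidity(12.04, 12.0)
y_mid = y_beam / 2  # mid-rapidity in the laboratory frame

print(f"y_beam = {y_beam:.3f}, y_mid = {y_mid:.3f}")
```

Note the parentheses around the numerator and the denominator: without them, operator precedence gives `E + p/E - p`, which is a different quantity.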

In [None]:
range1 = (1.077, 1.18)
fig, axs = plt.subplots(figsize=(10, 6))
#df_scaled['mass'].plot.hist(bins = 300, range=range1,grid=True,sharey=True)
(df_scaled[df_scaled['issignal']==0])['mass'].plot.hist(bins = 300, facecolor='yellow',grid=True,range=range1, label='Background')
(df_scaled[df_scaled['issignal']==1])['mass'].plot.hist(bins = 300, facecolor='magenta',grid=True, range=range1, label ='Signal')
#plt.vlines(x=1.108,ymin=-1,ymax=48000, color='black', linestyle='-')
#plt.vlines(x=1.1227,ymin=-1,ymax=48000, color='black', linestyle='-')
plt.ylabel("Counts (log scale)", fontsize=15)
plt.xlabel("Mass in GeV/$c^2$", fontsize= 15)
plt.xticks(fontsize=15)
plt.yticks(fontsize=15)
#plt.title('Test and Train Lambda Invariant Mass', fontsize = 15)
plt.legend( fontsize = 15)
axs.tick_params(axis='both', which='major', labelsize=18)
axs.text(1.13, 9500, r'CBM Performance', fontsize=15)
axs.text(1.13, 6000, r'DCM-QGSM-SMM, Au+Au @ 12 $A$GeV/$c$', color = 'magenta',  fontsize=15)
axs.text(1.13, 4000, r'URQMD, Au+Au @ 12 $A$GeV/$c$', fontsize=15)
plt.yscale("log")
fig.tight_layout()
fig.savefig("hists.png")

# Creating Train and Test sets
To judge how well a machine learning algorithm performs on unseen data, we divide our data into two sets: one for training the algorithm and one for testing it. Without an independent test set we could not tell whether the algorithm has overfitted the training sample instead of capturing the general trends in the data.
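The notebook performs the split with scikit-learn's `train_test_split`. The sketch below does the same kind of random 80/20 split with plain pandas on made-up data, so it runs without extra dependencies; it is an illustration of the idea, not the notebook's own call.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Toy data set (hypothetical): two features and a binary label
toy = pd.DataFrame({
    "ldl": rng.exponential(5.0, size=500),
    "distance": rng.exponential(0.1, size=500),
    "issignal": rng.integers(0, 2, size=500),
})

# Hold out a random 20% of the rows; the model never sees them during training
test = toy.sample(frac=0.2, random_state=324)
train = toy.drop(test.index)

print(len(train), len(test))
```

A large gap between the score on `train` and the score on `test` is the usual sign of overfitting.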

In [None]:
# The following columns will be used to predict whether a reconstructed candidate is a lambda particle or not
cuts = [ 'chi2primneg', 'chi2primpos', 'ldl', 'distance', 'chi2geo']


x = df_scaled[cuts].copy()

# The MC information is saved in this y variable
y =pd.DataFrame(df_scaled['issignal'], dtype='int')

## Whole set

In [None]:
# The following columns will be used to predict whether a reconstructed candidate is a lambda particle or not
x_whole = df_clean[cuts].copy()
# The MC information is saved in this y variable
y_whole = pd.DataFrame(df_clean['issignal'], dtype='int')

# KFPF

In [None]:
#returns a new df 

#new_check_set=KFPF_lambda_cuts(df_original)
new_check_set=KFPF_lambda_cuts(df_clean)
del df_original
gc.collect()

<p style="font-family: Arial; font-size:3em;color:purple; font-style:bold"><br>XGBoost
<br></p><br>

## Bayesian
To find the best XGBoost parameters for our data we use Bayesian optimization. Grid search and random search could do the same job, but Bayesian optimization usually needs fewer evaluations and is therefore more time efficient.

In [None]:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=324)
dtrain = xgb.DMatrix(x_train, label = y_train)
dtest = xgb.DMatrix(x_whole, label = y_whole)
dtest1=xgb.DMatrix(x_test, label = y_test)
gc.collect()

In [None]:
x_whole_1 = df_clean_urqmd[cuts].copy()
# The MC information is saved in this y variable
y_whole_1 = pd.DataFrame(df_clean_urqmd['issignal'], dtype='int')
dtest2 = xgb.DMatrix(x_whole_1, label = y_whole_1)

### Hyper parameters

*subsample* [default=1]
Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly samples half of the training data prior to growing trees, which helps to prevent overfitting. Subsampling occurs once in every boosting iteration.
range: (0,1]

*eta* [default=0.3, alias: learning_rate]
Step size shrinkage used in the update to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative.
range: [0,1]


*gamma* [default=0, alias: min_split_loss]
Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger gamma is, the more conservative the algorithm will be.
range: [0,∞)


*alpha* [default=0, alias: reg_alpha]
L1 regularization term on weights. Increasing this value makes the model more conservative.

*Lasso Regression* (Least Absolute Shrinkage and Selection Operator) adds the absolute value of the magnitude of each coefficient as a penalty term to the loss function.
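For reference, `gamma` and `alpha` both enter the regularization term that XGBoost adds to the training loss for each tree $f$. In the notation of the XGBoost documentation, with $T$ the number of leaves, $w_j$ the leaf weights, and $\lambda$ the L2 term (which is not tuned in this notebook):

$$
\Omega(f) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^{2} + \alpha \sum_{j=1}^{T} |w_j|
$$

The $\gamma T$ term penalizes every additional leaf, which is why a larger `gamma` makes splits harder to justify. The $\alpha$ term is the Lasso-like penalty and tends to push small leaf weights towards zero.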

In [None]:
#Bayesian Optimization function for xgboost
#specify the parameters you want to tune as keyword arguments
def bo_tune_xgb(max_depth, gamma, alpha, n_estimators ,learning_rate):
    # learning_rate is an alias of eta, so eta is not set separately
    params = {'max_depth': int(max_depth),
              'gamma': gamma,
              'alpha':alpha,
              'learning_rate':learning_rate,
              'subsample': 0.8,
              'eval_metric': 'auc', 'objective':'binary:logistic', 'nthread' : 6}
    # The native API ignores 'n_estimators'; the number of trees is set by num_boost_round
    cv_result = xgb.cv(params=params, dtrain=dtrain, num_boost_round=int(n_estimators), nfold=5)
    return  cv_result['test-auc-mean'].iloc[-1]

#Invoking the Bayesian Optimizer with the specified parameters to tune
xgb_bo = BayesianOptimization(bo_tune_xgb, {'max_depth': (4, 10),
                                             'gamma': (0, 1),
                                            'alpha': (2,20),
                                             'learning_rate':(0,1),
                                             'n_estimators':(100,500)
                                            })

#performing Bayesian optimization for 15 iterations after 8 steps of random exploration, with expected improvement as the acquisition function
xgb_bo.maximize(n_iter=15, init_points=8, acq='ei')
#0.9951

In [None]:
from sklearn.metrics import roc_auc_score, roc_curve
import numpy as np
import matplotlib.pyplot as plt
from numpy import sqrt, log, argmax
import itertools

"""
A receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot which illustrates the performance of a binary classifier system as its
discrimination threshold is varied. This function requires the true binary value and the target scores, which can either be probability estimates of
the positive class, confidence values, or binary decisions.
The function roc_auc_score computes Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores.

To find the threshold that gives the best signal-to-background separation for lambda candidates, we use the parameter S0, called the approximate median significance
in the Higgs boson ML challenge (http://higgsml.lal.in2p3.fr/documentation,9.)
"""
def AMS(y_true, y_predict, y_true1, y_predict1):
    roc_auc=roc_auc_score(y_true, y_predict)
    fpr, tpr, thresholds = roc_curve(y_true, y_predict,drop_intermediate=False ,pos_label=1)
    S0 = sqrt(2 * ((tpr + fpr) * log((1 + tpr/fpr)) - tpr))
    # Ignore NaNs instead of dropping them, so that xi stays aligned with thresholds, fpr and tpr
    xi = np.nanargmax(S0)
    S0_best_threshold = (thresholds[xi])

    roc_auc1=roc_auc_score(y_true1, y_predict1)
    fpr1, tpr1, thresholds1 = roc_curve(y_true1, y_predict1,drop_intermediate=False ,pos_label=1)
    S01 = sqrt(2 * ((tpr1 + fpr1) * log((1 + tpr1/fpr1)) - tpr1))
    xi1 = np.nanargmax(S01)
    # Use the thresholds of the test set, not those of the training set
    S0_best_threshold1 = (thresholds1[xi1])

    fig, ax = plt.subplots(figsize=(10, 6), dpi = 100)
    plt.plot(fpr, tpr, linewidth=3 ,linestyle=':',color='darkorange',label='ROC curve train (area = %0.4f)' % roc_auc)
    plt.plot(fpr1, tpr1, color='green',label='ROC curve test (area = %0.4f)' % roc_auc1)
    plt.plot([0, 1], [0, 1], color='navy', linestyle='--', label='Random guess')
    #plt.scatter(fpr[xi], tpr[xi], marker='o', color='black', label= 'Best Threshold train set = '+"%.4f" % S0_best_threshold +'\n AMS = '+ "%.2f" % S0[xi])
    plt.scatter(fpr1[xi1], tpr1[xi1], marker='o', s=80, color='blue', label= 'Best Threshold test set = '+"%.4f" % S0_best_threshold1 +'\n AMS = '+ "%.2f" % S01[xi1])
    plt.xlabel('False Positive Rate', fontsize = 18)
    plt.ylabel('True Positive Rate', fontsize = 18)
    plt.legend(loc="lower right", fontsize = 18)
    plt.title('Receiver operating characteristic', fontsize = 18)
    ax.tick_params(axis='both', which='major', labelsize=18)
    plt.xlim([-0.01, 1.0])
    plt.ylim([0, 1.02])
    #axs.axis([-0.01, 1, 0.9, 1])
    fig.tight_layout()
    fig.savefig('hists.png')
    plt.show()
    return S0_best_threshold, S0_best_threshold1
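
The AMS scan inside the function above can be sketched in isolation. The ROC points below are invented, and the helper `ams_curve` is a hypothetical name. Unlike the notebook function, this sketch marks points with a false positive rate of zero as undefined (NaN), which is an assumption made here to keep the example well behaved.

```python
import numpy as np

def ams_curve(tpr, fpr):
    """Approximate median significance for arrays of TPR and FPR.

    Points with fpr == 0 are undefined (division by zero) and returned as NaN.
    """
    tpr = np.asarray(tpr, dtype=float)
    fpr = np.asarray(fpr, dtype=float)
    ams = np.full_like(tpr, np.nan)
    ok = fpr > 0
    ams[ok] = np.sqrt(2 * ((tpr[ok] + fpr[ok]) * np.log(1 + tpr[ok] / fpr[ok]) - tpr[ok]))
    return ams

# Hypothetical ROC points, ordered like the output of roc_curve
thresholds = np.array([0.9, 0.7, 0.5, 0.3, 0.1])
tpr = np.array([0.0, 0.6, 0.8, 0.95, 1.0])
fpr = np.array([0.0, 0.01, 0.1, 0.4, 1.0])

ams = ams_curve(tpr, fpr)
best = np.nanargmax(ams)  # index stays aligned with thresholds
print(thresholds[best], ams[best])
```

Using `np.nanargmax` on the full array keeps the index valid for `thresholds`, `tpr` and `fpr` alike, whereas filtering the NaNs out first shifts the index.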

# XGB models

In [None]:
max_param = xgb_bo.max['params']
param= {'alpha': max_param['alpha'], 'gamma': max_param['gamma'], 'learning_rate': max_param['learning_rate'],
        'max_depth': int(round(max_param['max_depth'],0)), 'objective': 'binary:logistic'}

#Fit/train on training data; the native API takes the number of trees as num_boost_round
bst = xgb.train(param, dtrain, num_boost_round=int(round(max_param['n_estimators'],0)))

#predictions on training set
bst_train= pd.DataFrame(data=bst.predict(dtrain, output_margin=False),  columns=["xgb_preds"])
y_train=y_train.set_index(np.arange(0,bst_train.shape[0]))
bst_train['issignal']=y_train['issignal']

#predictions on test set
bst_test = pd.DataFrame(data=bst.predict(dtest1, output_margin=False),  columns=["xgb_preds"])
y_test=y_test.set_index(np.arange(0,bst_test.shape[0]))
bst_test['issignal']=y_test['issignal']

#ROC curves for the predictions on train and test sets
train_best, test_best = AMS(y_train, bst_train['xgb_preds'],y_test, bst_test['xgb_preds'])

#The first argument should be a data frame, the second a column in it, in the form 'preds'
preds_prob(bst_test,'xgb_preds', 'issignal','test')

#Applying XGB on the 10k events data-set
df_clean['xgb_preds'] = bst.predict(dtest, output_margin=False)
#preds_prob(df_clean,'xgb_preds', 'issignal','test')

df_clean_urqmd['xgb_preds'] = bst.predict(dtest2, output_margin=False)

In [None]:
def preds_prob(df, preds, true, dataset):
    if dataset =='train':
        label1 = 'XGB Predictions on the training data set'
    else:
        label1 = 'XGB Predictions on the test data set'
    fig, ax = plt.subplots(figsize=(12, 8))
    bins1=100
    plt.hist(df[preds], bins=bins1,facecolor='green',alpha = 0.3, label=label1)
    TP = df[(df[true]==1)]
    TN = df[(df[true]==0)]
    #TP[preds].plot.hist(ax=ax, bins=bins1,facecolor='blue', histtype='stepfilled',alpha = 0.3, label='True Positives/signal in predictions')
    # Same [0, 1] binning for both classes, so one set of bin centres serves both
    hist, bins = np.histogram(TP[preds], bins=bins1, range=(0, 1))
    err = np.sqrt(hist)
    center = (bins[:-1] + bins[1:]) / 2

    hist1, _ = np.histogram(TN[preds], bins=bins1, range=(0, 1))
    err1 = np.sqrt(hist1)
    plt.errorbar(center, hist1, yerr=err1, fmt='o',
                 c='Red', label='Background in predictions')

    plt.errorbar(center, hist, yerr=err, fmt='o',
                 c='blue', label='Signal in predictions')
    
    
    ax.annotate('Cut on probability', xy=(0, 10),  xycoords='data',
            xytext=(0.13, 0.65), textcoords='axes fraction',
            arrowprops=dict(facecolor='black', shrink=0.08),
            horizontalalignment='right', verticalalignment='top',fontsize=15
            )

    ax.set_yscale('log')
    plt.xlabel('Probability',fontsize=18)
    plt.ylabel('Counts', fontsize=18)
    plt.legend(fontsize=18)
    ax.set_xticks(np.arange(0,1.1,0.1))
    ax.tick_params(axis='both', which='major', labelsize=18)
    ax.tick_params(axis='both', which='minor', labelsize=18)
    fig.tight_layout()
    fig.savefig('0.png')
    plt.show()

In [None]:
preds_prob(bst_test,'xgb_preds', 'issignal', 'test')

In [None]:
# The following function displays the invariant mass histogram of the original 10k event set along with the mass histogram after we apply a cut
# on the probability prediction of xgb
def cut_visualization(df, variable,cut, range1=(1.09, 1.19), bins1= 300 ):
    mask1 = df[variable]>cut
    df3=df[mask1]
    
    fig, ax2 = plt.subplots(figsize=(12, 8), dpi = 300)
    color = 'tab:blue'
    ax2.hist(df['mass'],bins = bins1, range=range1, facecolor='blue' ,alpha = 0.35, label='before selection')
    ax2.set_ylabel('Counts', fontsize = 15, color=color)
    ax2.tick_params(axis='y', labelcolor=color)
    ax2.legend( fontsize = 15, loc='upper left')
    ax2.tick_params(axis='both', which='major', labelsize=15)
    ax2.grid()
    ax2.set_xlabel("Mass (GeV/${c^2}$)", fontsize = 18)
    
    
    
    color = 'tab:red'
    ax1 = ax2.twinx()
    ax1.hist(df3['mass'], bins = bins1, range=range1, facecolor='red',alpha = 0.35, label="XGB (with a cut > %.2f"%cut+')')
    ax1.set_xlabel('Mass in GeV', fontsize = 15)
    ax1.set_ylabel('Counts ', fontsize = 15, color=color)
    ax1.tick_params(axis='y', labelcolor=color)
    ax1.tick_params(axis='both', which='major', labelsize=15)
    ax1.legend( fontsize = 18,loc='upper right' )

    plt.title("The original sample's Invariant Mass along with mass after selection of XGB", fontsize = 15)
    plt.text(1.14, 8000, 'CBM Performance', fontsize=18)
    plt.text(1.14, 7000, 'URQMD, Au+Au @ 12A GeV/$c$', fontsize=18)
    #plt.text(0.02, 0.1, r'cut > %.4f'%cut, fontsize=15)
    fig.tight_layout()
    fig.savefig("test_best.png")
    plt.show()

In [None]:
cut_visualization(df_clean_urqmd,'xgb_preds',0.954)

In [None]:
cut3 = 0.95
mask1 = df_clean_urqmd['xgb_preds']>cut3
df3_base=df_clean_urqmd[mask1]
fig, axs = plt.subplots(figsize=(12, 8))

range1= (1.105, 1.14)
bins1 = 150

#xgb

#issignal has 0,1,2 . So we convert all signals above zero to 1



df3_base['mass'].plot.hist(bins = bins1, range=range1, facecolor='red',alpha = 0.3,grid=True,sharey=True, label=r'XGB selected $\Lambda$s')
#df3_base[df3_base['issignal']==1]['mass'].plot.hist(bins = 300, range=range1,facecolor='blue',alpha = 0.3,grid=True,sharey=True, '\n True positives = \n (MC =1)\n signal in \n the distribution')
#df3_base[df3_base['issignal']==1]['mass'].plot.hist(bins = bins1, range=range1,facecolor='magenta',alpha = 0.3,grid=True,sharey=True )
df3_base[df3_base['issignal']==0]['mass'].plot.hist(bins = bins1, range=range1,facecolor='green',alpha = 0.3,grid=True,sharey=True, label ='\n False positives = \n (MC =0)\n background in \n the distribution')

plt.legend( fontsize = 18, loc='upper right')
#plt.rcParams["legend.loc"] = 'upper right'
plt.title(r"XGB selected $\Lambda$ candidates with a cut of %.3f "%cut3 +"on the XGB probability distribution", fontsize = 18)
plt.xlabel("Mass (GeV/${c^2}$)", fontsize = 18)
plt.ylabel("Counts", fontsize = 18)
axs.text(1.123, 4000, 'CBM Performance', fontsize=18)
axs.text(1.123, 3500, 'URQMD, Au+Au @ 12A GeV/$c$', fontsize=18)
axs.tick_params(labelsize=18)
fig.tight_layout()
fig.savefig("whole_sample_invmass_with_ML.png")

## Confusion Matrix

By definition a confusion matrix $C$ is such that $C_{i, j}$ is equal to the number of observations known to be in group $i$ and predicted to be in group $j$.

Thus in binary classification with the label order `[1, 0]` (signal first, as passed to `confusion_matrix` below), the count of true positives is $C_{0,0}$, false negatives is $C_{0,1}$, false positives is $C_{1,0}$, and true negatives is $C_{1,1}$.
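As a rough sketch, the matrix can be built from classifier scores by applying a cut and counting each combination of true and predicted class. The scores and labels below are invented, and the row-wise normalization at the end is one common choice assumed here, not necessarily what the plotting helper does.

```python
import numpy as np

# Hypothetical classifier scores and MC truth labels
scores = np.array([0.98, 0.40, 0.97, 0.10, 0.96, 0.20, 0.99, 0.30])
truth = np.array([1, 1, 1, 0, 0, 0, 1, 0])

cut = 0.95
pred = (scores > cut).astype(int)  # 1 = selected as signal

# Signal first: rows are the true class, columns the predicted class
labels = [1, 0]
cm = np.array([[np.sum((truth == t) & (pred == p)) for p in labels] for t in labels])

# Row-normalize: each row then gives the fractions of one true class
cm_norm = cm / cm.sum(axis=1, keepdims=True)
print(cm)
print(cm_norm)
```

With this normalization the top-left entry is the signal efficiency at the chosen cut and the bottom-left entry is the fraction of background that survives it.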

The following function prints and plots the confusion matrix. Normalization can be applied by setting `normalize=True`.

In [None]:
df_clean['xgb_preds1'] = ((df_clean['xgb_preds']>cut3)*1)
cnf_matrix = confusion_matrix(y_whole, df_clean['xgb_preds1'], labels=[1,0])
#cnf_matrix = confusion_matrix(new_check_set['issignal'], new_check_set['new_signal'], labels=[1,0])
np.set_printoptions(precision=2)
fig, axs = plt.subplots(figsize=(10, 8))
axs.yaxis.set_label_coords(-0.04,.5)
axs.xaxis.set_label_coords(0.5,-.005)
plot_confusion_matrix(cnf_matrix, classes=['signal','background'], title='Confusion Matrix for XGB for cut > '+str(cut3))
plt.savefig('confusion_matrix_extreme_gradient_boosting_whole_data.png')

In [None]:
30269/ (50269+120980)

In [None]:
ax = xgb.plot_importance(bst)
plt.rcParams['figure.figsize'] = [5, 3]
ax.figure.tight_layout()
ax.figure.savefig("hits.png")
plt.show()

In [None]:
xgb.plot_tree(bst,num_trees=2)
plt.rcParams['figure.figsize'] = [80, 160]
plt.rcParams['figure.dpi']=300
#plt.show()
plt.savefig("hists.pdf")

In [None]:
# to_graphviz is a module-level function that takes the trained booster as its first argument
xgb.to_graphviz(bst, fmap='', num_trees=0, rankdir=None, yes_color=None, no_color=None, condition_node_params=None, leaf_node_params=None)

# Cut visualization

In [None]:
from matplotlib import gridspec

range1= (1.08, 1.22)


fig, axs = plt.subplots(2, 1,figsize=(15,10), sharex=True,  gridspec_kw={'width_ratios': [10],
                           'height_ratios': [8,4]})

ns, bins, patches=axs[0].hist((df3_base['mass']),bins = 300, range=range1,fill=True, color='red', facecolor='red',alpha = 0.3)
ns1, bins1, patches1=axs[0].hist((new_check_set['mass']),bins = 300, fill=True, range=range1,facecolor='blue',alpha = 0.3)
#plt.xlabel("Mass in GeV", fontsize = 15)
axs[0].set_ylabel("counts", fontsize = 15)
#axs[0].grid()
axs[0].legend((r'XGBoost Selected $\Lambda$s',r'KFPF selected $\Lambda$s'), fontsize = 15, loc='upper right')

#plt.rcParams["legend.loc"] = 'upper right'
axs[0].set_title("The lambda's Invariant Mass histogram with KFPF and XGB selection criteria on KFPF variables", fontsize = 15)
axs[0].grid()
axs[0].tick_params(axis='both', which='major', labelsize=15)
#fig.savefig("whole_sample_invmass_with_ML.png")


hist1, bin_edges1 = np.histogram(df3_base['mass'],range=(1.09, 1.17), bins=300)
hist2, bin_edges2 = np.histogram(new_check_set['mass'],range=(1.09, 1.17), bins=300)

#makes sense to have only positive values 
diff = (hist1 - hist2)
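# Ratio per bin; bins with no KFPF entries are set to 0 to avoid dividing by zero
ratio = np.divide(ns, ns1, out=np.zeros_like(ns), where=ns1 > 0)
axs[1].bar(bins[:-1],     # this is what makes it comparable
        ratio,
        width=0.001)
plt.xlabel(r"Mass in $\dfrac{GeV}{c^2}$", fontsize = 15)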
axs[1].set_ylabel("XGB / KFPF", fontsize = 15)
axs[1].grid()
axs[1].tick_params(axis='both', which='major', labelsize=15)

fig.tight_layout()
fig.savefig("whole_sample_invmass_with_ML.png")
plt.show()

In [None]:
del dtest, dtrain, dtest1, df_scaled, x, y, x_whole, y_whole, x_train, x_test, y_train, y_test
gc.collect()

In [None]:
dcm_100k = df_clean.copy()
del df_clean

In [None]:
#bdt cut 0.7
df0 = df3_base
df0 = df0[(df0['mass']>1.07)&(df0['mass']<1.3)]
df0 = df0[['rapidity', 'mass', 'pT', 'issignal']]
del df3_base

In [None]:
#bdt cut 0.8
df1 = df3_base
df1 = df1[(df1['mass']>1.07)&(df1['mass']<1.3)]
df1 = df1[['rapidity', 'mass', 'pT', 'issignal']]
del df3_base

In [None]:
#bdt cut 0.9
df2 = df3_base
df2 = df2[(df2['mass']>1.07)&(df2['mass']<1.3)]
df2 = df2[['rapidity', 'mass', 'pT', 'issignal']]
del df3_base

In [None]:
#0.92
df3 = df3_base
df3 = df3[(df3['mass']>1.07)&(df3['mass']<1.3)]
df3 = df3[['rapidity', 'mass', 'pT', 'issignal']]
del df3_base

In [None]:
#test_best
df4 = df3_base
df4 = df4[(df4['mass']>1.07)&(df4['mass']<1.3)]
df4 = df4[['rapidity', 'mass', 'pT', 'issignal']]
del df3_base

In [None]:
#test_best
df4_urqmd = df3_base
df4_urqmd = df4_urqmd[(df4_urqmd['mass']>1.07)&(df4_urqmd['mass']<1.3)]
df4_urqmd = df4_urqmd[['rapidity', 'mass', 'pT', 'issignal']]
del df3_base

In [None]:
df4=df4[df4['issignal']==1]
#df4 = df4[['rapidity', 'mass', 'pT']]

In [None]:
del df_clean

## Curve Fitting

# PyROOT

In [None]:
import sys, ROOT
from ROOT import TF1, TCanvas,TMath, TColor

class Linear:
    def __call__( self, x, par ):
        return par[0] + x[0]*par[1]

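class lorenztian:
    # __call__ needs double underscores, otherwise the object is not callable
    def __call__(self, x, p):
        # p[0] = amplitude, p[1] = full width, p[2] = peak position
        return 0.5*p[0]*p[1] / (((x[0]-p[2])**2) + (0.5*p[1])**2)

class gaus:
    def __call__(self, x ,p):
        # p[0] = amplitude, p[1] = sigma, p[2] = mean
        return p[0]*np.exp(-0.5*((x[0]-p[2])/p[1])**2)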

In [None]:
import math
def truncate(number, decimals=2):
    """
    Returns a value truncated to a specific number of decimal places.
    """
    if not isinstance(decimals, int):
        raise TypeError("decimal places must be an integer.")
    elif decimals < 0:
        raise ValueError("decimal places has to be 0 or more.")
    elif decimals == 0:
        return math.trunc(number)

    factor = 10.0 ** decimals
    return math.trunc(number * factor) / factor


def background_selector(df):
    df1 = df[(df['mass']<1.108)]
    df2 = df[df['mass']>1.13]
    df3 = pd.concat([df1, df2])
    return df3['mass'] 

In [None]:
truncate(0.39)

In [None]:
#percentile binning
#df0 = df3_base
#df = df0[(df0['mass']>1.07)&(df0['mass']<1.3)]
df = df4
out0, bins0 =pd.qcut(df['rapidity'], q=4, retbins=1)
lowest_rapidity = df[df['rapidity']<bins0[1]]
low_rapidity = df[(df['rapidity']>bins0[1])&(df['rapidity']<bins0[2])]
mid_rapidity = df[(df['rapidity']>bins0[2])&(df['rapidity']<bins0[3])]
high_rapidity = df[(df['rapidity']>bins0[3])]
    
out1, bins1 =pd.qcut(lowest_rapidity['pT'], q=3, retbins=1)
low_pT_lowest_rapidity = lowest_rapidity[lowest_rapidity['pT']<bins1[1]]
mid_pT_lowest_rapidity = lowest_rapidity[(lowest_rapidity['pT']>bins1[1]) & (lowest_rapidity['pT']<bins1[2])]
high_pT_lowest_rapidity =lowest_rapidity[(lowest_rapidity['pT']>bins1[2])]

out2, bins2 =pd.qcut(low_rapidity['pT'], q=3, retbins=1)
low_pT_low_rapidity = low_rapidity[low_rapidity['pT']<bins2[1]]
mid_pT_low_rapidity = low_rapidity[(low_rapidity['pT']>bins2[1]) & (low_rapidity['pT']<bins2[2])]
high_pT_low_rapidity= low_rapidity[(low_rapidity['pT']>bins2[2])]
    
out3, bins3 =pd.qcut(mid_rapidity['pT'], q=3, retbins=1)
low_pT_mid_rapidity = mid_rapidity[mid_rapidity['pT']<bins3[1]]
mid_pT_mid_rapidity = mid_rapidity[(mid_rapidity['pT']>bins3[1]) & (mid_rapidity['pT']<bins3[2])]
high_pT_mid_rapidity=mid_rapidity[(mid_rapidity['pT']>bins3[2])]
    
out4, bins4 =pd.qcut(high_rapidity['pT'], q=3, retbins=1)
low_pT_high_rapidity = high_rapidity[high_rapidity['pT']<bins4[1]]
mid_pT_high_rapidity = high_rapidity[(high_rapidity['pT']>bins4[1]) & (high_rapidity['pT']<bins4[2])]
high_pT_high_rapidity=high_rapidity[(high_rapidity['pT']>bins4[2])]

#del out0, lowest_rapidity, mid_rapidity, high_rapidity, out1, out2, out3, out4, df
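The selections above use strict `<` and `>` on both sides of every quantile edge, so a candidate whose value equals an edge exactly is not assigned to any bin. The following sketch illustrates the effect on toy integers. It uses `statistics.quantiles` from the standard library as a stand-in for `pd.qcut`, which is an assumption made to keep the example dependency-free; the edges pandas returns may differ in detail.

```python
from bisect import bisect_left
from statistics import quantiles

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # toy data, not taken from the analysis tree

# Three inner quartile edges; method="inclusive" interpolates between data points
edges = quantiles(values, n=4, method="inclusive")

# Strict comparisons on both sides, as in the cell above
strict = [
    [v for v in values if v < edges[0]],
    [v for v in values if edges[0] < v < edges[1]],
    [v for v in values if edges[1] < v < edges[2]],
    [v for v in values if v > edges[2]],
]

# Right-closed intervals (lo, hi]: every value falls in exactly one bin
closed = [[] for _ in range(4)]
for v in values:
    closed[bisect_left(edges, v)].append(v)

print("edges:", edges)
print("kept with strict cuts:", sum(len(b) for b in strict), "of", len(values))
print("kept with right-closed bins:", sum(len(b) for b in closed), "of", len(values))
```

With continuous quantities such as rapidity and pT, exact ties with an edge are rare, but an interpolated quantile can coincide with a data point, so using `<=` on one side of each bin is the safer convention.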

In [None]:
def background_selector(df):
    df1 = df[(df['mass']<1.108)]
    df2 = df[df['mass']>1.13]
    df3 = pd.concat([df1, df2])
    return df3['mass'] 

list1 = [low_pT_lowest_rapidity, mid_pT_lowest_rapidity, high_pT_lowest_rapidity, low_pT_low_rapidity,
         mid_pT_low_rapidity, high_pT_low_rapidity, low_pT_mid_rapidity, mid_pT_mid_rapidity,
        high_pT_mid_rapidity, low_pT_high_rapidity, mid_pT_high_rapidity, high_pT_high_rapidity]

In [None]:
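# describe() entries are read by label ('mean', 'std') rather than by position
#df4['mass'].describe()['mean']-1.2*(df4['mass'].describe()['std'])+0.2* (df4['mass'].describe()['std'])
df4['mass'].describe()['mean']+1.2*(df4['mass'].describe()['std'])+0.2* (df4['mass'].describe()['std'])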

## Lorentzian
Lorentzian with a second-order polynomial background (the fit formulas below use an ordinary polynomial, not a Chebyshev basis).
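The peak shape used in the fit formulas of this section is $0.5\,A\,\Gamma / ((x-m_0)^2 + 0.25\,\Gamma^2)$, and the yields are obtained by integrating it over a window of $\pm k\,\Gamma$ around the fitted peak position. For this shape the integral has the closed form $2A\arctan(2k)$. The sketch below cross-checks that expression numerically; the amplitude is an arbitrary illustrative value, and the function names are not part of the notebook.

```python
import math

def lorentzian(x, amplitude, gamma, m0):
    """Peak shape from the fit formulas: 0.5*A*Gamma / ((x - m0)^2 + 0.25*Gamma^2)."""
    return 0.5 * amplitude * gamma / ((x - m0) ** 2 + 0.25 * gamma ** 2)

def window_integral(amplitude, k):
    """Closed-form integral of the peak over [m0 - k*Gamma, m0 + k*Gamma]."""
    return 2.0 * amplitude * math.atan(2.0 * k)

def midpoint_integral(f, lo, hi, steps=20000):
    """Plain midpoint rule, used only to cross-check the closed form."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# Width and peak position as fixed in the step-1 fit; amplitude is illustrative
A, GAMMA, M0 = 1.0, 0.0014, 1.115683

for k in (2.0, 2.5, 3.0, 3.5):
    numeric = midpoint_integral(
        lambda x: lorentzian(x, A, GAMMA, M0), M0 - k * GAMMA, M0 + k * GAMMA
    )
    exact = window_integral(A, k)
    fraction = exact / (math.pi * A)  # share of the full Lorentzian area in the window
    print(f"k={k}: numeric={numeric:.6f} exact={exact:.6f} fraction={fraction:.3f}")
```

Because the window is measured in units of $\Gamma$ and the Lorentzian has long tails, a $\pm 3\Gamma$ window contains roughly 89% of the peak area, noticeably less than the 99.7% that $\pm 3\sigma$ would cover for a Gaussian.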

Calling `describe()` on the candidates with BDT score > 70 shows that the mass distribution has a mean of 1.178052 and a standard deviation of 0.059818. So 1.5$\sigma$ below the mean is 1.088325 and 1.5$\sigma$ above the mean is 1.267779. The fit limits are therefore varied over 1.45$\sigma$, 1.5$\sigma$ and 1.55$\sigma$ on either side of the mean.
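The window edges follow directly from mean $\pm\, k \cdot$ std. A short check of the arithmetic, using only the two numbers quoted above:

```python
mean, std = 1.178052, 0.059818  # values quoted in the paragraph above

for k in (1.45, 1.5, 1.55):
    low = mean - k * std
    high = mean + k * std
    print(f"k={k}: [{low:.6f}, {high:.6f}]")
```

For k = 1.5 this gives the interval from 1.088325 to 1.267779.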

In [None]:
# NOTE: mm, mmm and fit_limit_low are defined in the fitting cell below; run that cell first.
# Each TF1 gets its own name so that ROOT does not replace one function with another.
pol2 = TF1("pol2","[0]+[1]*x+[2]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
pol3 = TF1("pol3","[0]+[1]*x+[2]*x*x+[3]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
one_var_pol2=TF1("one_var_pol2","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
one_var_pol3=TF1("one_var_pol3","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x+[4]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])

In [None]:
#lorentzian + second order pol
lorentzian_pol2 = []
pt_min=[]
y_min = []
#lorentzian_3rd_order_pol = []


df = df4_urqmd


mass_mean = df['mass'].mean()
mass_std = df['mass'].std()  # sample standard deviation, the same value describe() reports as 'std'

# Lower edge of the mass histograms
mass_range_min = [mass_mean-1.2*mass_std]
# Entries 0-2 are offsets added to the lower fit limit; entries 3-5 are the matching upper fit limits
fit_limit_low=[0,0.1*mass_std,   0.2*mass_std,
               mass_mean+1.2*mass_std,
               mass_mean+1.2*mass_std+0.1*mass_std,
               mass_mean+1.2*mass_std+0.2*mass_std]

y_bin_low = -0.2
y_bin_up =0.0
for i in range(0,15,1):
    y_bin_low = truncate(y_bin_low + 0.2)
    y_bin_up = truncate(y_bin_up+0.2)
    df_y = df[(df['rapidity']>y_bin_low) & (df['rapidity']<y_bin_up)]
    
    pt_bin_low =-0.2
    pt_bin_up =0
    for i in range(0,15,1):
        pt_bin_low = truncate(pt_bin_low+0.2)
        pt_bin_up = truncate(pt_bin_up+0.2)
        df_pt = df_y[(df_y['pT']>pt_bin_low) & (df_y['pT']<pt_bin_up)]
        mc_counts = df_pt[df_pt['issignal']>0].shape[0]
        
        for mm in mass_range_min:
            for mmm in range(0,3,1):
                
                #canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
                #canvas.Draw()
                #canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
                binning = [40,70,100,130]
                for b in binning:
                    tot_sig_3_sigma = 0
                    tot_bac_3_sigma = 0
                    tot_sig_3_point_5_sigma = 0
                    tot_bac_3_point_5_sigma = 0
                    tot_sig_2_point_5_sigma = 0
                    tot_bac_2_point_5_sigma = 0
                    tot_sig_2_sigma = 0


                    #step 0
                    if df_pt.shape[0]>500:
                        data0 = background_selector(df_pt)
                        h0 = ROOT.TH1F("Background","Background without peak",b,mm,fit_limit_low[5])
                        for i in range(0,data0.shape[0]):
                            h0.Fill(data0.iloc[i])
                        fb = TF1("fb","[0]+[1]*x+[2]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        fb.SetParameters(0,0,0);
                        #fb =TF1("fb","[0]+[1]*x+[2]*x*x+[3]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #fb.SetParameters(0,0,0,0);
                        h0.Fit(fb,"RIEMQN");
                        par = fb.GetParameters()
                        #Step 1
                        data = df_pt['mass']
                        
                #the minimum x (lower edge of the first bin)=mm        
                        h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(df_pt['rapidity'].min(),df_pt['rapidity'].max(),df_pt['pT'].min(),df_pt['pT'].max(), mm, b),b,mm,fit_limit_low[5])
                        for i in range(0,data.shape[0]):
                            h1.Fill(data.iloc[i])
                        f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        f1.SetParameters(1,par[0], par[1], par[2]);
                        #f1=TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x+[4]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #f1.SetParameters(1,par[0], par[1], par[2],par[3]);
                        h1.Fit(f1,"RNIQ");
                        par1 = f1.GetParameters()

                        #canvas .Clear ()
                        #pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                        #pad1 . Draw ()
                        #pad1 . cd ()
                        #pad1. Clear()





                #step 2
                        f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3]);
                        #f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x+[6]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #f2.SetNpx(100000);
                        #f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3],par1[4]);
                        #f2.SetLineColor(ROOT.kRed)
                        r= ROOT.TFitResultPtr(h1.Fit(f2,"MNIRQ"))
                        par2 = f2.GetParameters()

                        fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        #fs.SetNpx(100000);
                        #fs.SetLineColor(ROOT.kGreen)
                        #fb.SetLineStyle(4)
                        #fb.SetLineColor(ROOT.kBlue)
                        #fb.SetNpx(100000);
                        fs.SetParameters(par2[0],par2[1],par2[2]);
                        fb.SetParameters(par2[3],par2[4],par2[5]);
                        #fb.SetParameters(par2[3],par2[4],par2[5],par2[6]);


                        #h1.SetTitleOffset(-1)
                        #h1.SetFillStyle(3003);
                        #h1.SetLineWidth(2)
                        #h1.SetStats (0)
                        #h1.SetYTitle("Entries")
                        #h1.SetLineColor(ROOT.kBlack)
                        h2 = ROOT.TH1F("h2", "", b, mm, 1.23);
                        h3 = ROOT.TH1F("h3", "", b, mm, 1.23);
                        #h3.SetLineWidth(2)
                        #h3.SetStats (0)
                        #h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                        #h1.Draw("pe")
                        #fs.Draw("SAME")
                        #fb.Draw("SAME")
                        #f2.Draw("SAME")

                        bin1 = h1.FindBin(fit_limit_low[mmm]+mm);
                        bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                        for i in range(bin1,bin2):
                            f_value= f2.Eval(h1.GetBinCenter(i));
                            t_value = h1.GetBinContent(i)
                            h2.SetBinContent(i,f_value)
                            if (h1.GetBinError(i) > 0):
                                h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                        #h2.Sumw2()

                        integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                        integral_max = par2[2] + (TMath.Abs(3*par2[1]));
                        binwidth = h1.GetXaxis().GetBinWidth(1);
                        #tot = f2.Integral(integral_min,integral_max)/binwidth;
                        #sigma_integral = f2.IntegralError(integral_min,integral_max);

                        signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                        if signal_under_peak>0:
                            tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak                
                        #sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                        #man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
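                        # sigma_signal_under_peak only exists when the IntegralError line above is enabled,
                        # so this diagnostic stays disabled together with it.
                        #if sigma_signal_under_peak!=0:
                            #print("Integral errors ",sigma_signal_under_peak)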

                        
                        backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                        #if backgnd_under_peak<0:
                            #print('Negative background')

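                        # Total (signal + background) counts in the window, taken from the full fit function
                        tot = f2.Integral(integral_min,integral_max)/binwidth;
                        Significance = signal_under_peak/TMath.Sqrt(tot) if tot>0 else 0.0;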

                        signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        #bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        if signal_under_peak_3_point_5_sigma>0:
                            tot_sig_3_point_5_sigma= tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma                
                        #tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                        #sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                        #man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                        signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        #bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        if signal_under_peak_2_point_5_sigma>0:
                            tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                        #tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                        #sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                        #man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)

                        signal_under_peak_2_sigm = (fs.Integral(par2[2] - (TMath.Abs(2*par2[1])),par2[2] + (TMath.Abs(2.*par2[1])))/binwidth);
                        #bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        if signal_under_peak_2_sigm>0:
                            tot_sig_2_sigma = tot_sig_2_sigma+signal_under_peak_2_sigm

                        #std = par2 [1]
                        #estd = f2.GetParError(1)
                        del h0, h1, h2, h3, f1, f2, fb, fs
                        #latex = ROOT . TLatex ()
                        #latex . SetNDC ()
                        #latex . SetTextSize (0.02)
                        #latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                        #latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                        #latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                        #latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(std,estd ))
                        #latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))
                        #latex . DrawLatex (0.4 ,0.55," True signal (MC=1) = %.f"%(mc_counts))

                        #legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                        #legend.AddEntry(h1,"Invariant mass of lambda","l");
                        #legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+B+Cx+Dx^{2}","l");
                        #legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
                        #legend.AddEntry(fb,"B+Cx+Dx^{2}","l");
                        #legend . SetLineWidth (0)
                        #legend.Draw()

                        #canvas . cd ()
                        #pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                        #pad2 . Draw ()
                        #pad2 . cd ()
                        #pad2.Clear()


                        #h3.SetLineColor(TColor.GetColor(5))
                        #h3.SetYTitle("d-f/#Deltad")
                        #h3.Draw()
                        #line = ROOT . TLine (mm,0 ,1.23 ,0)
                        #line . SetLineColor ( ROOT . kRed )
                        #line . SetLineWidth (2)
                        #line . Draw (" same ")


                        #pad1 . SetBottomMargin (0)
                        #pad2 . SetTopMargin (0)
                        #pad2 . SetBottomMargin (0.25)

                        #h1 . GetXaxis (). SetLabelSize (0)
                        #h1 . GetXaxis (). SetTitleSize (0)
                        #h1 . GetYaxis (). SetTitleSize (0.05)
                        #h1 . GetYaxis (). SetLabelSize (0.03)
                        #h1 . GetYaxis (). SetTitleOffset (0.6)

                        #h3 . SetTitle ("")
                        #h3 . GetXaxis (). SetLabelSize (0.12)
                        #h3 . GetXaxis (). SetTitleSize (0.12)
                        #h3 . GetYaxis (). SetLabelSize (0.1)
                        #h3 . GetYaxis (). SetTitleSize (0.15)
                    #ratio . GetYaxis (). SetTitle (" Data /MC")
                        #h3 . GetYaxis (). SetTitleOffset (0.17)
                    #207,512 divisions
                        #h3 . GetYaxis (). SetNdivisions (207)
                        #h1 . GetYaxis (). SetRangeUser (0.5 ,3000)
                        #h1 .GetYaxis().SetNdivisions(107)
                        #h3 . GetXaxis (). SetNdivisions (207)
                        gc.collect()
                        #canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
                    else:
                        tot_sig_2_point_5_sigma=tot_sig_2_point_5_sigma+0
                        tot_sig_3_sigma=tot_sig_3_sigma+0
                        tot_sig_3_point_5_sigma=tot_sig_3_point_5_sigma+0
                        #tot_sig_2_sigma = tot_sig_2_sigma+0
            #lorentzian_pol2.append(tot_sig_2_sigma)
                    lorentzian_pol2.append(tot_sig_2_point_5_sigma)
                    lorentzian_pol2.append(tot_sig_3_sigma)
                    lorentzian_pol2.append(tot_sig_3_point_5_sigma)
                    pt_min.append(pt_bin_low+0.2)
                    pt_min.append(pt_bin_low+0.2)
                    pt_min.append(pt_bin_low+0.2)
                    y_min.append(y_bin_low+0.2)
                    y_min.append(y_bin_low+0.2)
                    y_min.append(y_bin_low+0.2)
                    
            gc.collect()
#canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")       
print(y_bin_low)
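The significance computed in the loop is the signal yield divided by the square root of the total counts in the same window, $S/\sqrt{S+B}$. A minimal stand-alone sketch of that quantity, with a guard for windows where the fitted total is not positive, could look as follows. The function name and the toy counts are illustrative and are not analysis results.

```python
import math

def significance(signal, background):
    """S / sqrt(S + B); returns 0.0 when the total is not positive."""
    total = signal + background
    if total <= 0:
        return 0.0
    return signal / math.sqrt(total)

# Toy counts: 900 signal and 700 background candidates in the window
print(significance(900.0, 700.0))  # prints 22.5
```

The guard matters here because a polynomial background is free to go negative within the integration window, which can make the fitted total zero or negative in sparsely populated bins.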

In [None]:
#total yield
#lorentzian + second order pol
lorentzian_pol2 = []
#lorentzian_3rd_order_pol = []


df = df4_urqmd


mass_mean = df['mass'].mean()
mass_std = df['mass'].std()  # sample standard deviation, the same value describe() reports as 'std'

# Lower edge of the mass histograms
mass_range_min = [mass_mean-1.2*mass_std]
# Entries 0-2 are offsets added to the lower fit limit; entries 3-5 are the matching upper fit limits
fit_limit_low=[0,0.1*mass_std,   0.2*mass_std,
               mass_mean+1.2*mass_std,
               mass_mean+1.2*mass_std+0.1*mass_std,
               mass_mean+1.2*mass_std+0.2*mass_std]
for mm in mass_range_min:
    for mmm in range(0,3,1):
        #canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
        #canvas.Draw()
        #canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")


        binning = [70,100,130]
        for b in binning:
            tot_sig_3_sigma = 0
            tot_bac_3_sigma = 0
            tot_sig_3_point_5_sigma = 0
            tot_bac_3_point_5_sigma = 0
            tot_sig_2_point_5_sigma = 0
            tot_bac_2_point_5_sigma = 0
            tot_sig_2_sigma = 0
            y_bin_low=-0.2
            y_bin_up =0
            for i in range(0,15,1):
                y_bin_low = truncate(y_bin_low+0.2)
                y_bin_up = truncate(y_bin_up+0.2)
                df_y = df[(df['rapidity']>y_bin_low) & (df['rapidity']<y_bin_up)]
                pt_bin_low =-0.2
                pt_bin_up =0
                for i in range(0,15,1):
                    pt_bin_low = truncate(pt_bin_low+0.2)
                    pt_bin_up = truncate(pt_bin_up+0.2)
                    df_pt = df_y[(df_y['pT']>pt_bin_low) & (df_y['pT']<pt_bin_up)]
                    mc_counts = df_pt[df_pt['issignal']>0].shape[0]
                    #step 0
                    if df_pt.shape[0]>500:
                        data0 = background_selector(df_pt)
                        h0 = ROOT.TH1F("Background","Background without peak",b,mm,fit_limit_low[5])
                        for i in range(0,data0.shape[0]):
                            h0.Fill(data0.iloc[i])
                        fb = TF1("fb","[0]+[1]*x+[2]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        fb.SetParameters(0,0,0);
                        #fb =TF1("fb","[0]+[1]*x+[2]*x*x+[3]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #fb.SetParameters(0,0,0,0);
                        h0.Fit(fb,"RIEMQ");
                        par = fb.GetParameters()
                        #Step 1
                        data = df_pt['mass']
                        
                #the minimum x (lower edge of the first bin)=mm        
                        h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(df_pt['rapidity'].min(),df_pt['rapidity'].max(),df_pt['pT'].min(),df_pt['pT'].max(), mm, b),b,mm,fit_limit_low[5])
                        for i in range(0,data.shape[0]):
                            h1.Fill(data.iloc[i])
                        f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        f1.SetParameters(1,par[0], par[1], par[2]);
                        #f1=TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x+[4]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #f1.SetParameters(1,par[0], par[1], par[2],par[3]);
                        h1.Fit(f1,"RNIQ");
                        par1 = f1.GetParameters()

                        #canvas .Clear ()
                        #pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                        #pad1 . Draw ()
                        #pad1 . cd ()
                        #pad1. Clear()





                #step 2
                        f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3]);
                        #f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x+[6]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #f2.SetNpx(100000);
                        #f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3],par1[4]);
                        #f2.SetLineColor(ROOT.kRed)
                        r= ROOT.TFitResultPtr(h1.Fit(f2,"MNIRQ"))
                        par2 = f2.GetParameters()

                        fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        #fs.SetNpx(100000);
                        #fs.SetLineColor(ROOT.kGreen)
                        #fb.SetLineStyle(4)
                        #fb.SetLineColor(ROOT.kBlue)
                        #fb.SetNpx(100000);
                        fs.SetParameters(par2[0],par2[1],par2[2]);
                        fb.SetParameters(par2[3],par2[4],par2[5]);
                        #fb.SetParameters(par2[3],par2[4],par2[5],par2[6]);


                        #h1.SetTitleOffset(-1)
                        #h1.SetFillStyle(3003);
                        #h1.SetLineWidth(2)
                        #h1.SetStats (0)
                        #h1.SetYTitle("Entries")
                        #h1.SetLineColor(ROOT.kBlack)
                        h2 = ROOT.TH1F("h2", "", b, mm, 1.23);
                        h3 = ROOT.TH1F("h3", "", b, mm, 1.23);
                        #h3.SetLineWidth(2)
                        #h3.SetStats (0)
                        #h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                        #h1.Draw("pe")
                        #fs.Draw("SAME")
                        #fb.Draw("SAME")
                        #f2.Draw("SAME")

                        bin1 = h1.FindBin(fit_limit_low[mmm]+mm);
                        bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                        for i in range(bin1,bin2):
                            f_value= f2.Eval(h1.GetBinCenter(i));
                            t_value = h1.GetBinContent(i)
                            h2.SetBinContent(i,f_value)
                            if (h1.GetBinError(i) > 0):
                                h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                        #h2.Sumw2()

                        integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                        integral_max = par2[2] + (TMath.Abs(3*par2[1]));
                        binwidth = h1.GetXaxis().GetBinWidth(1);
                        #tot = f2.Integral(integral_min,integral_max)/binwidth;
                        #sigma_integral = f2.IntegralError(integral_min,integral_max);

                        signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                        if signal_under_peak>0:
                            tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak                
                        #sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                        #man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
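                        # sigma_signal_under_peak only exists when the IntegralError line above is enabled,
                        # so this diagnostic stays disabled together with it.
                        #if sigma_signal_under_peak!=0:
                            #print("Integral errors ",sigma_signal_under_peak)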

                        
                        backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                        #if backgnd_under_peak<0:
                            #print('Negative background')

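                        # Total (signal + background) counts in the window, taken from the full fit function
                        tot = f2.Integral(integral_min,integral_max)/binwidth;
                        Significance = signal_under_peak/TMath.Sqrt(tot) if tot>0 else 0.0;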

                        signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        #bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        if signal_under_peak_3_point_5_sigma>0:
                            tot_sig_3_point_5_sigma= tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma                
                        #tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                        #sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                        #man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                        signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        #bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        if signal_under_peak_2_point_5_sigma>0:
                            tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                        #tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                        #sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                        #man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)

                        signal_under_peak_2_sigm = (fs.Integral(par2[2] - (TMath.Abs(2*par2[1])),par2[2] + (TMath.Abs(2.*par2[1])))/binwidth);
                        #bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        if signal_under_peak_2_sigm>0:
                            tot_sig_2_sigma = tot_sig_2_sigma+signal_under_peak_2_sigm

                        #std = par2 [1]
                        #estd = f2.GetParError(1)
                        del h0, h1, h2, h3, f1, f2, fb, fs
                        #latex = ROOT . TLatex ()
                        #latex . SetNDC ()
                        #latex . SetTextSize (0.02)
                        #latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                        #latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                        #latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                        #latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(std,estd ))
                        #latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))
                        #latex . DrawLatex (0.4 ,0.55," True signal (MC=1) = %.f"%(mc_counts))

                        #legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                        #legend.AddEntry(h1,"Invariant mass of lambda","l");
                        #legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+B+Cx+Dx^{2}","l");
                        #legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
                        #legend.AddEntry(fb,"B+Cx+Dx^{2}","l");
                        #legend . SetLineWidth (0)
                        #legend.Draw()

                        #canvas . cd ()
                        #pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                        #pad2 . Draw ()
                        #pad2 . cd ()
                        #pad2.Clear()


                        #h3.SetLineColor(TColor.GetColor(5))
                        #h3.SetYTitle("d-f/#Deltad")
                        #h3.Draw()
                        #line = ROOT . TLine (mm,0 ,1.23 ,0)
                        #line . SetLineColor ( ROOT . kRed )
                        #line . SetLineWidth (2)
                        #line . Draw (" same ")


                        #pad1 . SetBottomMargin (0)
                        #pad2 . SetTopMargin (0)
                        #pad2 . SetBottomMargin (0.25)

                        #h1 . GetXaxis (). SetLabelSize (0)
                        #h1 . GetXaxis (). SetTitleSize (0)
                        #h1 . GetYaxis (). SetTitleSize (0.05)
                        #h1 . GetYaxis (). SetLabelSize (0.03)
                        #h1 . GetYaxis (). SetTitleOffset (0.6)

                        #h3 . SetTitle ("")
                        #h3 . GetXaxis (). SetLabelSize (0.12)
                        #h3 . GetXaxis (). SetTitleSize (0.12)
                        #h3 . GetYaxis (). SetLabelSize (0.1)
                        #h3 . GetYaxis (). SetTitleSize (0.15)
                    #ratio . GetYaxis (). SetTitle (" Data /MC")
                        #h3 . GetYaxis (). SetTitleOffset (0.17)
                    #207,512 divisions
                        #h3 . GetYaxis (). SetNdivisions (207)
                        #h1 . GetYaxis (). SetRangeUser (0.5 ,3000)
                        #h1 .GetYaxis().SetNdivisions(107)
                        #h3 . GetXaxis (). SetNdivisions (207)
                        gc.collect()
                        #canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
                    else:
                        tot_sig_2_point_5_sigma=tot_sig_2_point_5_sigma+0
                        tot_sig_3_sigma=tot_sig_3_sigma+0
                        #tot_sig_3_point_5_sigma=tot_sig_3_point_5_sigma+0
                        #tot_sig_2_sigma = tot_sig_2_sigma+0
            #lorentzian_pol2.append(tot_sig_2_sigma)
            lorentzian_pol2.append(tot_sig_2_point_5_sigma)
            lorentzian_pol2.append(tot_sig_3_sigma)
            #lorentzian_pol2.append(tot_sig_3_point_5_sigma)
            gc.collect()
canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")       

In [None]:
len(lorentzian_pol2)
#lorentzian_pol2
#lorentzian_3rd_order_pol
#lorentzian_pol2.mean()
#15*15*3*9

In [None]:
configurations = 15*15
binning = 4
size = configurations*3*3*binning
#yields = {'yields':np.zeros(size)}
#df_yields = pd.DataFrame(yields, columns = ['yields'])
#df_yields['yields']= lorentzian_pol2
#df_yields['pt_min']= pt_min
#df_yields['y_min']= y_min
new_yy = df_yields[(df_yields['pt_min']>0.4) & (df_yields['pt_min']<0.8)]
new_yy[(new_yy['y_min']>1) & (new_yy['y_min']<1.4) & (new_yy['yields']>1)]

In [None]:
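# Write through a single positional indexer; chained df[col].iloc[i] = ... may act on a copy
yields_col = df_yields.columns.get_loc('yields')
for i in range(0,27,1):
    df_yields.iloc[i+2*27, yields_col] = lorentzian_pol2[i]
    df_yields.iloc[i+3*27, yields_col] = lorentzian_3rd_order_pol[i]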

In [None]:

df_new = df_yields[(df_yields['yields']<119844+3000)&(df_yields['yields']>119844-3000)]
df_new

In [None]:
#df_yields[df_yields['yields']>0]['yields'].hist(bins=20)
#df_yields[df_yields['yields']>100000]['yields'].mean()
plt.hist(lorentzian_pol2, bins=20)
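
The spread of the yields across the fit variations can also be summarised numerically. The following is a minimal sketch, assuming the yields are available as a plain list of numbers. The helper name `summarise_yields` and the example values are hypothetical, and configurations without a usable fit are assumed to be stored as zero or negative entries.

```python
import statistics

def summarise_yields(yields):
    """Return mean, sample standard deviation and relative spread of the positive yields."""
    # Assumption: zero or negative entries mark configurations without a usable fit.
    good = [y for y in yields if y > 0]
    if len(good) < 2:
        raise ValueError("need at least two positive yields")
    mean = statistics.mean(good)
    spread = statistics.stdev(good)
    return mean, spread, spread / mean

# Hypothetical yields from three fit variations and one empty configuration
example = [100.0, 110.0, 90.0, 0.0]
mean, spread, relative = summarise_yields(example)
print(f"mean = {mean:.1f}, spread = {spread:.1f}, relative = {relative:.3f}")
```

The relative spread is one possible way to quote how strongly the extracted yield depends on the fit configuration; it is not taken from the analysis itself.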

In [None]:
configurations = 3
size = configurations*3*3*3*2
yields = {'yields':np.zeros(size)}
df_yields = pd.DataFrame(yields, columns = ['yields'])
# Label columns start as empty strings; filling a float column with text would mix dtypes
df_yields['sigma'] = ''
df_yields['fit_lim'] = np.zeros(size)
df_yields['bins'] = np.zeros(size)
df_yields['numbering'] = np.arange(0,size,1)
df_yields['function'] = ''
df_yields['BDT_cut'] = ''
# The index is the default 0..size-1 range, so .loc[row, column] writes in place by row number
for i in range(0,27,1):
    df_yields.loc[i, 'function'] = 'lorentzian_pol2'
    df_yields.loc[i+27, 'function'] = 'lorentzian_pol3'
    df_yields.loc[i, 'yields'] = lorentzian_pol2[i]
    df_yields.loc[i+27, 'yields'] = lorentzian_3rd_order_pol[i]
    df_yields.loc[i+2*27, 'function'] = 'lorentzian_pol2'
    df_yields.loc[i+3*27, 'function'] = 'lorentzian_pol3'

# The integration window cycles through the three widths row by row
for i in range(0,size,3):
    df_yields.loc[i, 'sigma'] = '2.5 sigma'
    df_yields.loc[i+1, 'sigma'] = '3 sigma'
    df_yields.loc[i+2, 'sigma'] = '3.5 sigma'
#for i in range(0,162,1):
#    df_yields['yields'].iloc[162+i] = third_order_pol[i]

# The number of histogram bins changes every three rows
for k in range(0,3,1):
    for l in range (0,12,1):
        df_yields.loc[k+l*9, 'bins'] = 70
        df_yields.loc[k+3+l*9, 'bins'] = 100
        df_yields.loc[k+6+l*9, 'bins'] = 130

# The lower fit limit changes every nine rows
for i in range(0,4,1):
    for j in range(0,9,1):
        df_yields.loc[i*27+j, 'fit_lim'] = mass_range_min[0]+fit_limit_low[0]
        df_yields.loc[i*27+9+j, 'fit_lim'] = mass_range_min[0]+fit_limit_low[1]
        df_yields.loc[i*27+18+j, 'fit_lim'] = mass_range_min[0]+fit_limit_low[2]

# The BDT cut changes every 54 rows
for i in range(0,2*27,1):
    df_yields.loc[i, 'BDT_cut'] = 'test_best'
    df_yields.loc[i+2*27, 'BDT_cut'] = '0.9'
    df_yields.loc[i+4*27, 'BDT_cut'] = '0.8'
#for aj in range(0,int(size/2),1):
#    df_yields['function'].iloc[aj]='Lorentzian plus 2nd order chebyshev'
#    df_yields['function'].iloc[aj+int(size/2)]='Lorentzian plus 3rd order chebyshev'

    
    
import matplotlib
import matplotlib.cm as cm
def yield_plot(variable1, variable2):
    # Plot the extracted yield against the running configuration number
    fig, axs = plt.subplots(figsize=(12,10))
    axs.plot(variable1, variable2, alpha=0.3)
    axs.set_xlabel('Configuration number')
    axs.set_ylabel('Yield')

    #axs.legend(loc=(1.04,0.7), fontsize=13)

    
    
yield_plot(df_yields['numbering'],df_yields['yields'])


#df_yields[(df_yields['yields']>(df_yields['yields'].mean()-10)) & (df_yields['yields']<(df_yields['yields'].mean()+10))]
df_yields
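
The row layout filled by the index loops above (window label changing every row, bin count every 3 rows, fit limit every 9, function every 27, BDT cut every 54) can also be generated in one pass. This is a minimal sketch using only the standard library. The values in `fit_offsets` are hypothetical placeholders for the fit-limit variations, and the names `rows` and `bin_counts` are illustrative rather than part of the notebook.

```python
from itertools import product

bdt_cuts = ["test_best", "0.9", "0.8"]
functions = ["lorentzian_pol2", "lorentzian_pol3"]
fit_offsets = [0.0, 0.1, 0.2]   # hypothetical: lower fit-limit shift in units of the mass std
bin_counts = [70, 100, 130]
windows = ["2.5 sigma", "3 sigma", "3.5 sigma"]

# product() varies its last argument fastest, so the window label changes every row,
# the bin count every 3 rows, the fit offset every 9, the function every 27
# and the BDT cut every 54 rows.
rows = [
    {"numbering": n, "BDT_cut": cut, "function": func,
     "fit_lim": offset, "bins": nbins, "sigma": window}
    for n, (cut, func, offset, nbins, window) in enumerate(
        product(bdt_cuts, functions, fit_offsets, bin_counts, windows))
]
print(len(rows))
```

Passing `rows` to `pd.DataFrame` gives a table with one column per key, to which a `yields` column can then be added.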

In [None]:
#Lorentzian peak + third-order polynomial background
lorentzian_3rd_order_pol = []

df = df4


# Fit-range variations are built from the mean and standard deviation of the mass distribution
mass_mean = df['mass'].mean()
mass_std = df['mass'].std()
mass_range_min = [mass_mean - 1.2*mass_std]
fit_limit_low = [0, 0.1*mass_std, 0.2*mass_std,
                 mass_mean + 1.2*mass_std,
                 mass_mean + 1.2*mass_std + 0.1*mass_std,
                 mass_mean + 1.2*mass_std + 0.2*mass_std]
for mm in mass_range_min:
    for mmm in range(0,3,1):
        canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
        canvas.Draw()
        canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")


        binning = [70,100,130]
        for b in binning:
            tot_sig_3_sigma = 0
            tot_bac_3_sigma = 0
            tot_sig_3_point_5_sigma = 0
            tot_bac_3_point_5_sigma = 0
            tot_sig_2_point_5_sigma = 0
            tot_bac_2_point_5_sigma = 0
            y_bin_low=-0.2
            y_bin_up =0
            for iy in range(0,18,1):
                y_bin_low = truncate(y_bin_low+0.2)
                y_bin_up = truncate(y_bin_up+0.2)
                df_y = df[(df['rapidity']>y_bin_low) & (df['rapidity']<y_bin_up)]
                pt_bin_low =-0.2
                pt_bin_up =0
                for ipt in range(0,18,1):
                    pt_bin_low = truncate(pt_bin_low+0.2)
                    pt_bin_up = truncate(pt_bin_up+0.2)
                    df_pt = df_y[(df_y['pT']>pt_bin_low) & (df_y['pT']<pt_bin_up)]
                    #step 0
                    if df_pt.shape[0]>1000:
                        data0 = background_selector(df_pt)
                        h0 = ROOT.TH1F("Background","Background without peak",b,mm,fit_limit_low[5])
                        for i in range(0,data0.shape[0]):
                            h0.Fill(data0.iloc[i])
                        fb = TF1("fb","[0]+[1]*x+[2]*x*x+[3]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        fb.SetParameters(0,0,0,0);
                        h0.Fit(fb,"EM");
                        par = fb.GetParameters()
                        #Step 1
                        data = df_pt['mass']
                #the minimum x (lower edge of the first bin)=mm        
                        h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(df_pt['rapidity'].min(),df_pt['rapidity'].max(),df_pt['pT'].min(),df_pt['pT'].max(), mm, b),b,mm,fit_limit_low[5])
                        for i in range(0,data.shape[0]):
                            h1.Fill(data.iloc[i])
                        f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x+[4]*x*x*x",low_limit,upper_limit);
                        #f1 = TF1("step1","[0]*exp(-0.5*((x-1.115683)/0.0014)^2)+[1]+[2]*x+[3]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        f1.SetParameters(1,par[0], par[1], par[2], par[3]);
                        h1.Fit(f1,"RNI");
                        par1 = f1.GetParameters()

                        canvas .Clear ()
                        pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                        pad1 . Draw ()
                        pad1 . cd ()
                        pad1. Clear()





                #step 2
                        f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x+[6]*x*x*x",low_limit,upper_limit)
                        #f2 = TF1("full","[0]*exp(-0.5*((x-[2])/[1])^2)+[3]+[4]*x+[5]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        f2.SetNpx(100000);
                        f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3], par1[4]);
                        f2.SetLineColor(ROOT.kRed)
                        r= ROOT.TFitResultPtr(h1.Fit(f2,"MNIR"))
                        par2 = f2.GetParameters()

                        fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",low_limit,upper_limit);
                        #fs = TF1("fs","[0]*exp(-0.5*((x-[2])/[1])^2)",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        fs.SetNpx(100000);
                        fs.SetLineColor(ROOT.kGreen)
                        fb.SetLineStyle(4)
                        fb.SetLineColor(ROOT.kBlue)
                        fb.SetNpx(100000);
                        fs.SetParameters(par2[0],par2[1],par2[2]);
                        fb.SetParameters(par2[3],par2[4],par2[5],par2[6]);


                        h1.SetTitleOffset(-1)
                        h1.SetFillStyle(3003);
                        h1.SetLineWidth(2)
                        h1.SetStats (0)
                        h1.SetYTitle("Entries")
                        h1.SetLineColor(ROOT.kBlack)
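                        # h2 (fit values) and h3 (pulls) are filled with h1 bin numbers, so they share h1's axis
                        h2 = ROOT.TH1F("h2", "", b, mm, fit_limit_low[5]);
                        h3 = ROOT.TH1F("h3", "", b, mm, fit_limit_low[5]);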
                        h3.SetLineWidth(2)
                        h3.SetStats (0)
                        h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                        h1.Draw("pe")
                        fs.Draw("SAME")
                        fb.Draw("SAME")
                        f2.Draw("SAME")

                        bin1 = h1.FindBin(low_limit);
                        bin2 = h1.FindBin(upper_limit);
                        for i in range(bin1,bin2):
                            f_value= f2.Eval(h1.GetBinCenter(i));
                            t_value = h1.GetBinContent(i)
                            h2.SetBinContent(i,f_value)
                            if (h1.GetBinError(i) > 0):
                                h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                        h2.Sumw2()

                        # Integrate over the Lorentzian peak from 3 widths (Gamma = par2[1]) below
                        # the fitted peak position (par2[2]) to 3 widths above it.
                        integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                        integral_max = par2[2] + (TMath.Abs(3*par2[1]));
                        # Signal plus background in the same window, converted to counts
                        binwidth = h1.GetXaxis().GetBinWidth(1);
                        tot = f2.Integral(integral_min,integral_max)/binwidth;
                        sigma_integral = f2.IntegralError(integral_min,integral_max);
                        # Signal alone: integrate only the Lorentzian component over the window
                        signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                        if signal_under_peak<0:
                            print('Negative signal')                
                        sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                        man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                        if sigma_signal_under_peak!=0:
                            print("Integral errors ",sigma_signal_under_peak)

                        tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
                    #Background
                        backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                        if backgnd_under_peak<0:
                            print('Negative background')
                        sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                        tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
                    #Significance = signal/(signal+background)^0.5
                        Significance = signal_under_peak/TMath.Sqrt(tot);

                        #3.5 sigma
                        signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                        tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                        sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                        man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                        signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                        tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                        sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                        man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)


                        std = par2 [1]
                        estd = f2.GetParError(1)

                        latex = ROOT . TLatex ()
                        latex . SetNDC ()
                        latex . SetTextSize (0.02)
                        latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                        latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                        latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                        latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(std,estd ))
                        latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))


                        legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                        legend.AddEntry(h1,"Invariant mass of lambda","l");
                        legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+B+Cx+Dx^{2}+Ex^{3}","l");
                        legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
                        legend.AddEntry(fb,"B+Cx+Dx^{2}+Ex^{3}","l");
                        legend . SetLineWidth (0)
                        legend.Draw()

                        canvas . cd ()
                        pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                        pad2 . Draw ()
                        pad2 . cd ()
                        pad2.Clear()


                        h3.SetLineColor(TColor.GetColor(5))
                        h3.SetYTitle("(d-f)/#Deltad")
                        h3.Draw()
                        # Zero line across the full x-range of the pull histogram
                        line = ROOT.TLine(h3.GetXaxis().GetXmin(), 0, h3.GetXaxis().GetXmax(), 0)
                        line . SetLineColor ( ROOT . kRed )
                        line . SetLineWidth (2)
                        line . Draw (" same ")


                        pad1 . SetBottomMargin (0)
                        pad2 . SetTopMargin (0)
                        pad2 . SetBottomMargin (0.25)

                        h1 . GetXaxis (). SetLabelSize (0)
                        h1 . GetXaxis (). SetTitleSize (0)
                        h1 . GetYaxis (). SetTitleSize (0.05)
                        h1 . GetYaxis (). SetLabelSize (0.03)
                        h1 . GetYaxis (). SetTitleOffset (0.6)

                        h3 . SetTitle ("")
                        h3 . GetXaxis (). SetLabelSize (0.12)
                        h3 . GetXaxis (). SetTitleSize (0.12)
                        h3 . GetYaxis (). SetLabelSize (0.1)
                        h3 . GetYaxis (). SetTitleSize (0.15)
                    #ratio . GetYaxis (). SetTitle (" Data /MC")
                        h3 . GetYaxis (). SetTitleOffset (0.17)
                    #207,512 divisions
                        h3 . GetYaxis (). SetNdivisions (207)
                        h1 . GetYaxis (). SetRangeUser (0.5 ,3000)
                        h1 .GetYaxis().SetNdivisions(107)
                        h3 . GetXaxis (). SetNdivisions (207)

                        canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")

            lorentzian_3rd_order_pol.append(tot_sig_2_point_5_sigma)
            lorentzian_3rd_order_pol.append(tot_sig_3_sigma)
            lorentzian_3rd_order_pol.append(tot_sig_3_point_5_sigma)
#canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")       
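
The windows labelled 2.5, 3 and 3.5 sigma in the fit above are multiples of the Lorentzian width $\Gamma$, and the integral of the signal shape over such a window has a closed form: $\int_{m_0-k\Gamma}^{m_0+k\Gamma} \frac{0.5\,A\,\Gamma}{(m-m_0)^2+0.25\,\Gamma^2}\,dm = 2A\arctan(2k)$. The sketch below cross-checks this against a numerical integral and evaluates $S/\sqrt{S+B}$. It uses only the standard library; the function names and the parameter values are hypothetical and do not come from a fit.

```python
import math

def lorentzian(x, amplitude, gamma, m0):
    """Signal shape used in the fits: 0.5*A*Gamma / ((x - m0)^2 + 0.25*Gamma^2)."""
    return 0.5 * amplitude * gamma / ((x - m0) ** 2 + 0.25 * gamma ** 2)

def window_integral(amplitude, k):
    """Analytic integral of the Lorentzian over [m0 - k*Gamma, m0 + k*Gamma]."""
    return 2.0 * amplitude * math.atan(2.0 * k)

def significance(signal, background):
    """S / sqrt(S + B); NaN when the total is not positive."""
    total = signal + background
    return signal / math.sqrt(total) if total > 0 else float("nan")

def midpoint_integral(func, lower, upper, steps=20000):
    """Plain midpoint rule, used only to cross-check the analytic result."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (j + 0.5) * width) for j in range(steps))

# Hypothetical parameters: amplitude, width (GeV) and peak position (GeV)
amplitude, gamma, m0 = 1.0, 0.0014, 1.115683
for k in (2.5, 3.0, 3.5):
    analytic = window_integral(amplitude, k)
    numeric = midpoint_integral(lambda x: lorentzian(x, amplitude, gamma, m0),
                                m0 - k * gamma, m0 + k * gamma)
    fraction = analytic / (math.pi * amplitude)
    print(f"k = {k}: analytic = {analytic:.6f}, numeric = {numeric:.6f}, fraction = {fraction:.4f}")
```

Because the full integral is $\pi A$, a window of $\pm 3\Gamma$ contains only about 89% of a Lorentzian peak (about 87% for $\pm 2.5\Gamma$ and 91% for $\pm 3.5\Gamma$), in contrast to a Gaussian $3\sigma$ window. As in the fit code, dividing the integral by the bin width converts it to a number of counts.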

In [None]:
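#Lorentzian peak + second-order polynomial background (the Gaussian variants are commented out below; the list keeps the name gaussian_2nd)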
gaussian_2nd = []

df = df4


# Fit-range variations are built from the mean and standard deviation of the mass distribution
mass_mean = df['mass'].mean()
mass_std = df['mass'].std()
mass_range_min = [mass_mean - 1.2*mass_std]
fit_limit_low = [0, 0.1*mass_std, 0.2*mass_std,
                 mass_mean + 1.2*mass_std,
                 mass_mean + 1.2*mass_std + 0.1*mass_std,
                 mass_mean + 1.2*mass_std + 0.2*mass_std]
for mm in mass_range_min:
    for mmm in range(0,3,1):
        canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
        canvas.Draw()
        canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")


        binning = [70,100,130]
        for b in binning:
            tot_sig_3_sigma = 0
            tot_bac_3_sigma = 0
            tot_sig_3_point_5_sigma = 0
            tot_bac_3_point_5_sigma = 0
            tot_sig_2_point_5_sigma = 0
            tot_bac_2_point_5_sigma = 0
            y_bin_low=-0.2
            y_bin_up =0
            for iy in range(0,18,1):
                y_bin_low = truncate(y_bin_low+0.2)
                y_bin_up = truncate(y_bin_up+0.2)
                df_y = df[(df['rapidity']>y_bin_low) & (df['rapidity']<y_bin_up)]
                pt_bin_low =-0.2
                pt_bin_up =0
                for ipt in range(0,18,1):
                    pt_bin_low = truncate(pt_bin_low+0.2)
                    pt_bin_up = truncate(pt_bin_up+0.2)
                    df_pt = df_y[(df_y['pT']>pt_bin_low) & (df_y['pT']<pt_bin_up)]
                    #step 0
                    if df_pt.shape[0]>1000:
                        data0 = background_selector(df_pt)
                        h0 = ROOT.TH1F("Background","Background without peak",b,mm,fit_limit_low[5])
                        for i in range(0,data0.shape[0]):
                            h0.Fill(data0.iloc[i])
                        fb = TF1("fb","[0]+[1]*x+[2]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        fb.SetParameters(0,0,0);
                        h0.Fit(fb,"EM");
                        par = fb.GetParameters()
                        #Step 1
                        data = df_pt['mass']
                #the minimum x (lower edge of the first bin)=mm        
                        h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(df_pt['rapidity'].min(),df_pt['rapidity'].max(),df_pt['pT'].min(),df_pt['pT'].max(), mm, b),b,mm,fit_limit_low[5])
                        for i in range(0,data.shape[0]):
                            h1.Fill(data.iloc[i])
                        f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x",low_limit,upper_limit);
                        #f1 = TF1("step1","[0]*exp(-0.5*((x-1.115683)/0.0014)^2)+[1]+[2]*x+[3]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        f1.SetParameters(1,par[0], par[1], par[2]);
                        h1.Fit(f1,"RNI");
                        par1 = f1.GetParameters()

                        canvas .Clear ()
                        pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                        pad1 . Draw ()
                        pad1 . cd ()
                        pad1. Clear()





                #step 2
                        f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x",low_limit,upper_limit)
                        #f2 = TF1("full","[0]*exp(-0.5*((x-[2])/[1])^2)+[3]+[4]*x+[5]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        f2.SetNpx(100000);
                        f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3]);
                        f2.SetLineColor(ROOT.kRed)
                        r= ROOT.TFitResultPtr(h1.Fit(f2,"MNIR"))
                        par2 = f2.GetParameters()

                        fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",low_limit,upper_limit);
                        #fs = TF1("fs","[0]*exp(-0.5*((x-[2])/[1])^2)",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        fs.SetNpx(100000);
                        fs.SetLineColor(ROOT.kGreen)
                        fb.SetLineStyle(4)
                        fb.SetLineColor(ROOT.kBlue)
                        fb.SetNpx(100000);
                        fs.SetParameters(par2[0],par2[1],par2[2]);
                        fb.SetParameters(par2[3],par2[4],par2[5]);


                        h1.SetTitleOffset(-1)
                        h1.SetFillStyle(3003);
                        h1.SetLineWidth(2)
                        h1.SetStats (0)
                        h1.SetYTitle("Entries")
                        h1.SetLineColor(ROOT.kBlack)
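                        # h2 (fit values) and h3 (pulls) are filled with h1 bin numbers, so they share h1's axis
                        h2 = ROOT.TH1F("h2", "", b, mm, fit_limit_low[5]);
                        h3 = ROOT.TH1F("h3", "", b, mm, fit_limit_low[5]);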
                        h3.SetLineWidth(2)
                        h3.SetStats (0)
                        h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                        h1.Draw("pe")
                        fs.Draw("SAME")
                        fb.Draw("SAME")
                        f2.Draw("SAME")

                        bin1 = h1.FindBin(low_limit);
                        bin2 = h1.FindBin(upper_limit);
                        for i in range(bin1,bin2):
                            f_value= f2.Eval(h1.GetBinCenter(i));
                            t_value = h1.GetBinContent(i)
                            h2.SetBinContent(i,f_value)
                            if (h1.GetBinError(i) > 0):
                                h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                        h2.Sumw2()

                        # Integrate over the Lorentzian peak from 3 widths (Gamma = par2[1]) below
                        # the fitted peak position (par2[2]) to 3 widths above it.
                        integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                        integral_max = par2[2] + (TMath.Abs(3*par2[1]));
                        # Signal plus background in the same window, converted to counts
                        binwidth = h1.GetXaxis().GetBinWidth(1);
                        tot = f2.Integral(integral_min,integral_max)/binwidth;
                        sigma_integral = f2.IntegralError(integral_min,integral_max);
                        # Signal alone: integrate only the Lorentzian component over the window
                        signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                        if signal_under_peak<0:
                            print('Negative signal')                
                        sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                        man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                        if sigma_signal_under_peak!=0:
                            print("Integral errors ",sigma_signal_under_peak)

                        tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
                    #Background
                        backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                        if backgnd_under_peak<0:
                            print('Negative background')
                        sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                        tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
                    #Significance = signal/(signal+background)^0.5
                        Significance = signal_under_peak/TMath.Sqrt(tot);

                        #3.5 sigma
                        signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                        tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                        sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                        man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                        signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                        tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                        sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                        man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)


                        std = par2 [1]
                        estd = f2.GetParError(1)

                        latex = ROOT . TLatex ()
                        latex . SetNDC ()
                        latex . SetTextSize (0.02)
                        latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                        latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                        latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                        latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(std,estd ))
                        latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))


                        legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                        legend.AddEntry(h1,"Invariant mass of lambda","l");
                        legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+B+Cx+Dx^{2}","l");
                        legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
                        legend.AddEntry(fb,"B+Cx+Dx^{2}","l");
                        legend . SetLineWidth (0)
                        legend.Draw()

                        canvas . cd ()
                        pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                        pad2 . Draw ()
                        pad2 . cd ()
                        pad2.Clear()


                        h3.SetLineColor(TColor.GetColor(5))
                        h3.SetYTitle("(d-f)/#Deltad")
                        h3.Draw()
                        # Zero line across the full x-range of the pull histogram
                        line = ROOT.TLine(h3.GetXaxis().GetXmin(), 0, h3.GetXaxis().GetXmax(), 0)
                        line . SetLineColor ( ROOT . kRed )
                        line . SetLineWidth (2)
                        line . Draw (" same ")


                        pad1 . SetBottomMargin (0)
                        pad2 . SetTopMargin (0)
                        pad2 . SetBottomMargin (0.25)

                        h1 . GetXaxis (). SetLabelSize (0)
                        h1 . GetXaxis (). SetTitleSize (0)
                        h1 . GetYaxis (). SetTitleSize (0.05)
                        h1 . GetYaxis (). SetLabelSize (0.03)
                        h1 . GetYaxis (). SetTitleOffset (0.6)

                        h3 . SetTitle ("")
                        h3 . GetXaxis (). SetLabelSize (0.12)
                        h3 . GetXaxis (). SetTitleSize (0.12)
                        h3 . GetYaxis (). SetLabelSize (0.1)
                        h3 . GetYaxis (). SetTitleSize (0.15)
                    #ratio . GetYaxis (). SetTitle (" Data /MC")
                        h3 . GetYaxis (). SetTitleOffset (0.17)
                    #207,512 divisions
                        h3 . GetYaxis (). SetNdivisions (207)
                        h1 . GetYaxis (). SetRangeUser (0.5 ,3000)
                        h1 .GetYaxis().SetNdivisions(107)
                        h3 . GetXaxis (). SetNdivisions (207)

                        canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")

            gaussian_2nd.append(tot_sig_2_point_5_sigma)
            gaussian_2nd.append(tot_sig_3_sigma)
            gaussian_2nd.append(tot_sig_3_point_5_sigma)
#canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")       
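
The nested rapidity and $p_T$ loops above walk through an 18 by 18 grid of cells of width 0.2 and filter the data frame once per cell. An alternative is to compute the cell of each candidate directly. The sketch below does this with the standard library only; the function names and the example candidates are hypothetical. A value lying exactly on a bin edge is assigned to the upper bin here (subject to floating-point rounding), whereas the strict inequalities in the loops above drop such values.

```python
import math
from collections import defaultdict

BIN_WIDTH = 0.2   # width of the rapidity and pT bins
N_BINS = 18       # bins cover 0 to 3.6 in both variables

def bin_index(value, width=BIN_WIDTH, n_bins=N_BINS):
    """Return the bin number containing value, or None outside [0, n_bins * width)."""
    index = math.floor(value / width)
    return index if 0 <= index < n_bins else None

def group_by_bin(candidates):
    """Map (rapidity bin, pT bin) to the list of masses of the candidates in that cell."""
    cells = defaultdict(list)
    for rapidity, pt, mass in candidates:
        iy, ipt = bin_index(rapidity), bin_index(pt)
        if iy is not None and ipt is not None:
            cells[(iy, ipt)].append(mass)
    return dict(cells)

# Hypothetical candidates given as (rapidity, pT, mass); the last one lies outside the grid
candidates = [(0.10, 0.30, 1.115), (0.15, 0.35, 1.116), (1.25, 0.50, 1.120), (4.00, 0.10, 1.100)]
cells = group_by_bin(candidates)
for cell, masses in sorted(cells.items()):
    print(cell, len(masses))
```

With a data frame, the same bin numbers can be computed column-wise with `np.floor` and passed to `groupby`, after which the entry threshold used above can be applied to each group.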

## Third-order Chebyshev background

In [None]:
third_order_pol = []

mass_range_min = [1.0853341]
fit_limit_low=[0,0.05*0.059818,0.1*0.059818,1.2707699000000001,1.2707699000000001+(0.05*0.059818),1.2707699000000001 +(0.1*0.059818)]
for mm in mass_range_min:
    canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
    canvas.Draw()
    canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
    

    binning = [70,100,130]
    for b in binning:

        bins=b
        for mmm in range(0,3,1):
            tot_sig_3_sigma = 0
            tot_bac_3_sigma = 0
            tot_sig_3_point_5_sigma = 0
            tot_bac_3_point_5_sigma = 0
            tot_sig_2_point_5_sigma = 0
            tot_bac_2_point_5_sigma = 0
            for entry in list1:
                distribution = entry
            #Step 1
                data0 = background_selector(entry)
                h0 = ROOT.TH1F("Background","Background without peak",bins,mm,1.23)
                for i in range(0,data0.shape[0]):
                    h0.Fill(data0.iloc[i])
                fb = TF1("fb","[0]*x*x*x-[1]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fb.SetParameters(0,0);
                h0.Fit(fb,"RNIFCWW");
                par = fb.GetParameters()

            #Step 1
                data = distribution['mass']
        #the minimum x (lower edge of the first bin)=mm        
                h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(distribution['rapidity'].min(),distribution['rapidity'].max(),distribution['pT'].min(),distribution['pT'].max(), mm, bins),bins,mm,1.23)
                for i in range(0,data.shape[0]):
                    h1.Fill(data.iloc[i])
                f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]*x*x*x-[2]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                f1.SetParameters(1,par[0],par[1]);
                h1.Fit(f1,"RNI");
                par1 = f1.GetParameters()


            #Step2
                canvas .Clear ()
                pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                pad1 . Draw ()
                pad1 . cd ()
                pad1. Clear()


                h1.SetTitleOffset(-1)
                h1.SetFillStyle(3003);
                h1.SetLineWidth(2)
                h1.SetStats (0)
                h1.SetYTitle("Entries")
                h1.SetLineColor(ROOT.kBlack)
                h2 = ROOT.TH1F("h2", "", bins, mm, 1.23);
                h3 = ROOT.TH1F("h3", "", bins, mm, 1.23);
                h3.SetLineWidth(2)
                h3.SetStats (0)
                h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ 0.25*[1]*[1]) +[3]*x*x*x-[4]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                f2.SetNpx(100000);
                f2.SetParameters(par1[0],0.0001,1.115,par1[1], par1[2]);
                f2.SetLineColor(ROOT.kRed)
                h1.Fit(f2,"MNIR");
                par2 = f2.GetParameters()


                h1.Draw("pe")
                fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fs.SetNpx(100000);
                fs.SetLineColor(ROOT.kGreen)
                fb.SetLineStyle(4)
                fb.SetLineColor(ROOT.kBlue)
                fb.SetNpx(100000);
                fs.SetParameters(par2[0],par2[1],par2[2]);
                fb.SetParameters(par2[3],par2[4]);
                fs.Draw("SAME")
                fb.Draw("SAME")
                f2.Draw("SAME")

                bin1 = h1.FindBin(mm+fit_limit_low[mmm]);
                bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                for i in range(bin1,bin2):
                    f_value= f2.Eval(h1.GetBinCenter(i));
                    t_value = h1.GetBinContent(i)
                    h2.SetBinContent(i,f_value)
                    if (h1.GetBinError(i) > 0):
                        h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                h2.Sumw2()

            #To integrate over the Lorentzian peak we take the integral limits 3 widths (Gamma, i.e. parameter 1) below the mean value
            #(i.e. parameter 2) as the minimum limit and 3 widths above the mean as the maximum limit of the integral
                integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                integral_max = par2[2] + (TMath.Abs(3*par2[1]));
            #To integrate area under the signal plus background curve we take 3 sigma and integrate
                binwidth = h1.GetXaxis().GetBinWidth(1);
                tot = f2.Integral(integral_min,integral_max)/binwidth;
                sigma_integral = f2.IntegralError(integral_min,integral_max);
            #To find the signal, we integrate just the Lorentzian peak within 3 widths of the mean
                signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                if signal_under_peak<0:
                    print('Negative signal')                

                sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                if sigma_signal_under_peak!=0:
                    print("Integral errors ",sigma_signal_under_peak)

                tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
            #Background
                backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                if backgnd_under_peak<0:
                    print('negative background')                
                sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
            #Significance = signal/(signal+background)^0.5
                Significance = signal_under_peak/TMath.Sqrt(tot);

                #3.5 sigma
                signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)




                std = par2 [1]
                estd = f2.GetParError(1)

                latex = ROOT . TLatex ()
                latex . SetNDC ()
                latex . SetTextSize (0.02)
                latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(std,estd ))
                latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))


                legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                legend.AddEntry(h1,"Invariant mass of lambda","l");
                legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+Bx^{3}-Cx","l");
                legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
                legend.AddEntry(fb,"Bx^{3}-Cx","l");
                legend . SetLineWidth (0)
                legend.Draw()
                
                canvas . cd ()
                pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                pad2 . Draw ()
                pad2 . cd ()
                pad2.Clear()


                h3.SetLineColor(TColor.GetColor(5))
                h3.SetYTitle("d-f/#Deltad")
                h3.Draw()
                line = ROOT . TLine (mm,0 ,1.23 ,0)
                line . SetLineColor ( ROOT . kRed )
                line . SetLineWidth (2)
                line . Draw (" same ")


                pad1 . SetBottomMargin (0)
                pad2 . SetTopMargin (0)
                pad2 . SetBottomMargin (0.25)

                h1 . GetXaxis (). SetLabelSize (0)
                h1 . GetXaxis (). SetTitleSize (0)
                h1 . GetYaxis (). SetTitleSize (0.05)
                h1 . GetYaxis (). SetLabelSize (0.03)
                h1 . GetYaxis (). SetTitleOffset (0.6)

                h3 . SetTitle ("")
                h3 . GetXaxis (). SetLabelSize (0.12)
                h3 . GetXaxis (). SetTitleSize (0.12)
                h3 . GetYaxis (). SetLabelSize (0.1)
                h3 . GetYaxis (). SetTitleSize (0.15)
            #ratio . GetYaxis (). SetTitle (" Data /MC")
                h3 . GetYaxis (). SetTitleOffset (0.17)
            #207,512 divisions
                h3 . GetYaxis (). SetNdivisions (207)
                h1 . GetYaxis (). SetRangeUser (0.5 ,2500)
                h1 .GetYaxis().SetNdivisions(107)
                h3 . GetXaxis (). SetNdivisions (207)

                canvas.Update()
                canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf")

            third_order_pol.append(tot_sig_2_point_5_sigma)
            third_order_pol.append(tot_sig_3_sigma)
            third_order_pol.append(tot_sig_3_point_5_sigma)

# Close the multi-page PDF after the last page has been written
canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf]")

## 2nd-order ordinary polynomial background
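
Each fitting cell repeats the fit for three binnings and three fit windows, so the extracted yields can be compared across variations. A minimal sketch of how the loop index `mmm` selects a window from `fit_limit_low` is given here. `fit_windows` is a hypothetical helper introduced for illustration, while the numbers are copied from the fitting cells.

```python
# Values copied from the fitting cells in this notebook.
mass_range_min = [1.0853341]
fit_limit_low = [0, 0.05 * 0.059818, 0.1 * 0.059818,
                 1.2707699000000001,
                 1.2707699000000001 + (0.05 * 0.059818),
                 1.2707699000000001 + (0.1 * 0.059818)]
binning = [70, 100, 130]

def fit_windows(mass_min, limits):
    """Hypothetical helper: pair the first three entries (offsets added to the
    lower histogram edge) with the last three entries (absolute upper edges)."""
    half = len(limits) // 2
    return [(mass_min + limits[i], limits[i + half]) for i in range(half)]

# One entry per (binning, fit window) combination, in the order the loops run.
variations = [(bins, low, high)
              for mm in mass_range_min
              for bins in binning
              for (low, high) in fit_windows(mm, fit_limit_low)]

for bins, low, high in variations:
    print(f"bins={bins:3d}  fit range=[{low:.5f}, {high:.5f}]")
```

With one minimum mass, three binnings and three windows there are nine variations. Each variation appends three yields (for the 2.5, 3 and 3.5 width windows), so every result list ends up with 27 entries.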

In [None]:
second_order_pol = []
mass_range_min = [1.0853341]
fit_limit_low=[0,0.05*0.059818,0.1*0.059818,1.2707699000000001,1.2707699000000001+(0.05*0.059818),1.2707699000000001 +(0.1*0.059818)]
for mm in mass_range_min:
    canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
    canvas.Draw()
    canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf[")
    

    binning = [70,100,130]
    for b in binning:

        bins=b
        for mmm in range(0,3,1):
            tot_sig_3_sigma = 0
            tot_bac_3_sigma = 0
            tot_sig_3_point_5_sigma = 0
            tot_bac_3_point_5_sigma = 0
            tot_sig_2_point_5_sigma = 0
            tot_bac_2_point_5_sigma = 0
            for entry in list1:
                distribution = entry
            #Step 1
                data0 = background_selector(entry)
                h0 = ROOT.TH1F("Background","Background without peak",bins,mm,1.23)
                for i in range(0,data0.shape[0]):
                    h0.Fill(data0.iloc[i])
                fb = TF1("fb","[0]+[1]*x+[2]*x*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fb.SetParameters(0,0,0);
                h0.Fit(fb,"EM");
                par = fb.GetParameters()

            #Step 1
                data = distribution['mass']
        #the minimum x (lower edge of the first bin)=mm        
                h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(distribution['rapidity'].min(),distribution['rapidity'].max(),distribution['pT'].min(),distribution['pT'].max(), mm, bins),bins,mm,1.23)
                for i in range(0,data.shape[0]):
                    h1.Fill(data.iloc[i])
                f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                f1.SetParameters(1,par[0],par[1],par[2]);
                h1.Fit(f1,"EM");
                par1 = f1.GetParameters()


            #Step2
                canvas .Clear ()
                pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                pad1 . Draw ()
                pad1 . cd ()
                pad1. Clear()


                h1.SetTitleOffset(-1)
                h1.SetFillStyle(3003);
                h1.SetLineWidth(2)
                h1.SetStats (0)
                h1.SetYTitle("Entries")
                h1.SetLineColor(ROOT.kBlack)
                h2 = ROOT.TH1F("h2", "", bins, mm, 1.23);
                h3 = ROOT.TH1F("h3", "", bins, mm, 1.23);
                h3.SetLineWidth(2)
                h3.SetStats (0)
                h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3])
                f2.SetNpx(100000);
                f2.SetParameters(par1[0],0.0001,1.115,par1[1],par1[2],par1[3]);
                f2.SetLineColor(ROOT.kRed)
                h1.Fit(f2,"EM");
                par2 = f2.GetParameters()


                h1.Draw("pe")
                fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fs.SetNpx(100000);
                fs.SetLineColor(ROOT.kGreen)
                fb.SetLineStyle(4)
                fb.SetLineColor(ROOT.kBlue)
                fb.SetNpx(100000);
                fs.SetParameters(par2[0],par2[1],par2[2]);
                fb.SetParameters(par2[3],par2[4],par2[5]);
                fs.Draw("SAME")
                fb.Draw("SAME")
                f2.Draw("SAME")

                bin1 = h1.FindBin(mm+fit_limit_low[mmm]);
                bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                for i in range(bin1,bin2):
                    f_value= f2.Eval(h1.GetBinCenter(i));
                    t_value = h1.GetBinContent(i)
                    h2.SetBinContent(i,f_value)
                    if (h1.GetBinError(i) > 0):
                        h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                h2.Sumw2()

            #To integrate over the Lorentzian peak we take the integral limits 3 widths (Gamma, i.e. parameter 1) below the mean value
            #(i.e. parameter 2) as the minimum limit and 3 widths above the mean as the maximum limit of the integral
                integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                integral_max = par2[2] + (TMath.Abs(3*par2[1]));
            #To integrate area under the signal plus background curve we take 3 sigma and integrate
                binwidth = h1.GetXaxis().GetBinWidth(1);
                tot = f2.Integral(integral_min,integral_max)/binwidth;
                sigma_integral = f2.IntegralError(integral_min,integral_max);
            #To find the signal, we integrate just the Lorentzian peak within 3 widths of the mean
                signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                if sigma_signal_under_peak!=0:
                    print("Integral errors ",sigma_signal_under_peak)

                tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
            #Background
                backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                if backgnd_under_peak<0:
                    print('Negative background')
                sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
            #Significance = signal/(signal+background)^0.5
                Significance = signal_under_peak/TMath.Sqrt(tot);

                #3.5 sigma
                signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)




                std = par2 [1]
                estd = f2.GetParError(1)

                latex = ROOT . TLatex ()
                latex . SetNDC ()
                latex . SetTextSize (0.02)
                latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(std,estd ))
                latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))


                legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                legend.AddEntry(h1,"Invariant mass of lambda","l");
                legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+B+Cx+Dx^{2}","l");
                legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
                legend.AddEntry(fb,"B+Cx+Dx^{2}","l");
                legend . SetLineWidth (0)
                legend.Draw()

                canvas . cd ()
                pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                pad2 . Draw ()
                pad2 . cd ()
                pad2.Clear()


                h3.SetLineColor(TColor.GetColor(5))
                h3.SetYTitle("d-f/#Deltad")
                h3.Draw()
                line = ROOT . TLine (mm,0 ,1.23 ,0)
                line . SetLineColor ( ROOT . kRed )
                line . SetLineWidth (2)
                line . Draw (" same ")


                pad1 . SetBottomMargin (0)
                pad2 . SetTopMargin (0)
                pad2 . SetBottomMargin (0.25)

                h1 . GetXaxis (). SetLabelSize (0)
                h1 . GetXaxis (). SetTitleSize (0)
                h1 . GetYaxis (). SetTitleSize (0.05)
                h1 . GetYaxis (). SetLabelSize (0.03)
                h1 . GetYaxis (). SetTitleOffset (0.6)

                h3 . SetTitle ("")
                h3 . GetXaxis (). SetLabelSize (0.12)
                h3 . GetXaxis (). SetTitleSize (0.12)
                h3 . GetYaxis (). SetLabelSize (0.1)
                h3 . GetYaxis (). SetTitleSize (0.15)
            #ratio . GetYaxis (). SetTitle (" Data /MC")
                h3 . GetYaxis (). SetTitleOffset (0.17)
            #207,512 divisions
                h3 . GetYaxis (). SetNdivisions (207)
                h1 . GetYaxis (). SetRangeUser (0.5 ,7000)
                h1 .GetYaxis().SetNdivisions(107)
                h3 . GetXaxis (). SetNdivisions (207)

                canvas.Update()
                canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf")

            second_order_pol.append(tot_sig_2_point_5_sigma)
            second_order_pol.append(tot_sig_3_sigma)
            second_order_pol.append(tot_sig_3_point_5_sigma)

# Close the multi-page PDF after the last page has been written
canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf]")

In [None]:
del h0, canvas, h1, h2, h3, pad1, pad2, f1,f2, fs, fb
gc.collect()

## Linear
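
The lower pad of each page shows the pull, (data - fit)/error, for every bin, and the annotation quotes the significance $S/\sqrt{S+B}$. The following is a ROOT-free sketch of both quantities. The function names and the toy numbers are assumptions made for illustration.

```python
import math

def pulls(bin_contents, fit_values, bin_errors):
    """(data - fit) / error for each bin; bins with zero error are left at 0,
    mirroring the `GetBinError(i) > 0` guard in the fitting cells."""
    return [(d - f) / e if e > 0 else 0.0
            for d, f, e in zip(bin_contents, fit_values, bin_errors)]

def significance(signal, background):
    """S / sqrt(S + B); returns NaN when S + B is not positive."""
    total = signal + background
    return signal / math.sqrt(total) if total > 0 else float("nan")

# Toy numbers, not taken from the analysis.
data = [100.0, 144.0, 0.0]
fit = [90.0, 150.0, 1.0]
errors = [math.sqrt(d) for d in data]  # Poisson errors, as for an unweighted histogram

pull_values = pulls(data, fit, errors)
print(pull_values)
print(significance(900.0, 700.0))
```

Guarding the square root matters in practice: a fit that returns a negative signal or background integral would otherwise produce a NaN or raise an error in the annotation.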

In [None]:
linear_pol = []
mass_range_min = [1.0853341]
fit_limit_low=[0,0.05*0.059818,0.1*0.059818,1.2707699000000001,1.2707699000000001+(0.05*0.059818),1.2707699000000001 +(0.1*0.059818)]
for mm in mass_range_min:
    canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
    canvas.Draw()
    canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf[")
    

    binning = [70,100,130]
    for b in binning:

        bins=b
        for mmm in range(0,3,1):
            tot_sig_3_sigma = 0
            tot_bac_3_sigma = 0
            tot_sig_3_point_5_sigma = 0
            tot_bac_3_point_5_sigma = 0
            tot_sig_2_point_5_sigma = 0
            tot_bac_2_point_5_sigma = 0
            for entry in list1:
                distribution = entry
            #Step 1
                data0 = background_selector(entry)
                h0 = ROOT.TH1F("Background","Background without peak",bins,mm,1.23)
                for i in range(0,data0.shape[0]):
                    h0.Fill(data0.iloc[i])
                fb = TF1("fb","[0]+[1]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fb.SetParameters(0,0);
                h0.Fit(fb,"RNIFCWW");
                par = fb.GetParameters()

            #Step 1
                data = distribution['mass']
        #the minimum x (lower edge of the first bin)=mm        
                h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(distribution['rapidity'].min(),distribution['rapidity'].max(),distribution['pT'].min(),distribution['pT'].max(), mm, bins),bins,mm,1.23)
                for i in range(0,data.shape[0]):
                    h1.Fill(data.iloc[i])
                f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                f1.SetParameters(1,par[0],par[1]);
                h1.Fit(f1,"RNI");
                par1 = f1.GetParameters()


            #Step2
                canvas .Clear ()
                pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                pad1 . Draw ()
                pad1 . cd ()
                pad1. Clear()


                h1.SetTitleOffset(-1)
                h1.SetFillStyle(3003);
                h1.SetLineWidth(2)
                h1.SetStats (0)
                h1.SetYTitle("Entries")
                h1.SetLineColor(ROOT.kBlack)
                h2 = ROOT.TH1F("h2", "", bins, mm, 1.23);
                h3 = ROOT.TH1F("h3", "", bins, mm, 1.23);
                h3.SetLineWidth(2)
                h3.SetStats (0)
                h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3])
                f2.SetNpx(100000);
                f2.SetParameters(par1[0],0.0001,1.115,par1[1],par1[2]);
                f2.SetLineColor(ROOT.kRed)
                h1.Fit(f2,"MNIR");
                par2 = f2.GetParameters()


                h1.Draw("pe")
                fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fs.SetNpx(100000);
                fs.SetLineColor(ROOT.kGreen)
                fb.SetLineStyle(4)
                fb.SetLineColor(ROOT.kBlue)
                fb.SetNpx(100000);
                fs.SetParameters(par2[0],par2[1],par2[2]);
                fb.SetParameters(par2[3],par2[4]);
                fs.Draw("SAME")
                fb.Draw("SAME")
                f2.Draw("SAME")

                bin1 = h1.FindBin(mm+fit_limit_low[mmm]);
                bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                for i in range(bin1,bin2):
                    f_value= f2.Eval(h1.GetBinCenter(i));
                    t_value = h1.GetBinContent(i)
                    h2.SetBinContent(i,f_value)
                    if (h1.GetBinError(i) > 0):
                        h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                h2.Sumw2()

            #To integrate over the Lorentzian peak we take the integral limits 3 widths (Gamma, i.e. parameter 1) below the mean value
            #(i.e. parameter 2) as the minimum limit and 3 widths above the mean as the maximum limit of the integral
                integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                integral_max = par2[2] + (TMath.Abs(3*par2[1]));
            #To integrate area under the signal plus background curve we take 3 sigma and integrate
                binwidth = h1.GetXaxis().GetBinWidth(1);
                tot = f2.Integral(integral_min,integral_max)/binwidth;
                sigma_integral = f2.IntegralError(integral_min,integral_max);
            #To find the signal, we integrate just the Lorentzian peak within 3 widths of the mean
                signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                if sigma_signal_under_peak!=0:
                    print("Integral errors ",sigma_signal_under_peak)

                tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
            #Background
                backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                if backgnd_under_peak<0:
                    print('Negative background')
                sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
            #Significance = signal/(signal+background)^0.5
                Significance = signal_under_peak/TMath.Sqrt(tot);

                #3.5 sigma
                signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)




                std = par2 [1]
                estd = f2.GetParError(1)

                latex = ROOT . TLatex ()
                latex . SetNDC ()
                latex . SetTextSize (0.02)
                latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(std,estd ))
                latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))


                legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                legend.AddEntry(h1,"Invariant mass of lambda","l");
                legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+B+Cx","l");
                legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
                legend.AddEntry(fb,"B+Cx","l");
                legend . SetLineWidth (0)
                legend.Draw()

                canvas . cd ()
                pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                pad2 . Draw ()
                pad2 . cd ()
                pad2.Clear()


                h3.SetLineColor(TColor.GetColor(5))
                h3.SetYTitle("d-f/#Deltad")
                h3.Draw()
                line = ROOT . TLine (mm,0 ,1.23 ,0)
                line . SetLineColor ( ROOT . kRed )
                line . SetLineWidth (2)
                line . Draw (" same ")


                pad1 . SetBottomMargin (0)
                pad2 . SetTopMargin (0)
                pad2 . SetBottomMargin (0.25)

                h1 . GetXaxis (). SetLabelSize (0)
                h1 . GetXaxis (). SetTitleSize (0)
                h1 . GetYaxis (). SetTitleSize (0.05)
                h1 . GetYaxis (). SetLabelSize (0.03)
                h1 . GetYaxis (). SetTitleOffset (0.6)

                h3 . SetTitle ("")
                h3 . GetXaxis (). SetLabelSize (0.12)
                h3 . GetXaxis (). SetTitleSize (0.12)
                h3 . GetYaxis (). SetLabelSize (0.1)
                h3 . GetYaxis (). SetTitleSize (0.15)
            #ratio . GetYaxis (). SetTitle (" Data /MC")
                h3 . GetYaxis (). SetTitleOffset (0.17)
            #207,512 divisions
                h3 . GetYaxis (). SetNdivisions (207)
                h1 . GetYaxis (). SetRangeUser (0.5 ,2500)
                h1 .GetYaxis().SetNdivisions(107)
                h3 . GetXaxis (). SetNdivisions (207)

                canvas.Update()
                # Append the current canvas as a new page (a trailing "[" only opens the file and writes no page)
                canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf")

            linear_pol.append(tot_sig_2_point_5_sigma)
            linear_pol.append(tot_sig_3_sigma)
            linear_pol.append(tot_sig_3_point_5_sigma)

canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
#canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")

In [None]:
gaussian_2nd_order_pol

In [None]:
df_clean[df_clean['issignal']==1].shape

## Gaussian with second order pol

In [None]:
gaussian_2nd_order_pol = []
mass_range_min = [1.0853341]
fit_limit_low=[0,0.05*0.059818,0.1*0.059818,1.2707699000000001,1.2707699000000001+(0.05*0.059818),1.2707699000000001 +(0.1*0.059818)]
for mm in mass_range_min:
    canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
    canvas.Draw()
    canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
    

    binning = [70,100,130]
    for b in binning:

        bins=b
        for mmm in range(0,3,1):
            tot_sig_3_sigma = 0
            tot_bac_3_sigma = 0
            tot_sig_3_point_5_sigma = 0
            tot_bac_3_point_5_sigma = 0
            tot_sig_2_point_5_sigma = 0
            tot_bac_2_point_5_sigma = 0
            for entry in list1:
                distribution = entry
            #Step 1
                data0 = background_selector(entry)
                h0 = ROOT.TH1F("Background","Background without peak",bins,mm,1.23)
                for i in range(0,data0.shape[0]):
                    h0.Fill(data0.iloc[i])
                fb = TF1("fb","[0]+[1]*x+[2]*x*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fb.SetParameters(0,0,0);
                h0.Fit(fb,"LRNIFCWW");
                par = fb.GetParameters()

                # Step 1 (continued): fit signal + background with the Gaussian mean and width held fixed
                data = distribution['mass']
                # mm is the lower edge of the first histogram bin
                h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(distribution['rapidity'].min(),distribution['rapidity'].max(),distribution['pT'].min(),distribution['pT'].max(), mm, bins),bins,mm,1.23)
                for i in range(0,data.shape[0]):
                    h1.Fill(data.iloc[i])
                f1 = TF1("step1","[0]*exp(-0.5*((x-1.115683)/0.0014)^2)+[1]+[2]*x+[3]*x*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                f1.SetParameters(1,par[0],par[1], par[2]);
                h1.Fit(f1,"LRNI");
                par1 = f1.GetParameters()


            #Step2
                canvas .Clear ()
                pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                pad1 . Draw ()
                pad1 . cd ()
                pad1. Clear()


                h1.SetTitleOffset(-1)
                h1.SetFillStyle(3003);
                h1.SetLineWidth(2)
                h1.SetStats (0)
                h1.SetYTitle("Entries")
                h1.SetLineColor(ROOT.kBlack)
                h2 = ROOT.TH1F("h2", "", bins, mm, 1.23);
                h3 = ROOT.TH1F("h3", "", bins, mm, 1.23);
                h3.SetLineWidth(2)
                h3.SetStats (0)
                h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                f2 = TF1("full","[0]*exp(-0.5*((x-[2])/[1])^2)+[3]+[4]*x+[5]*x*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3])
                f2.SetNpx(100000);
                f2.SetParameters(par1[0],0.001,1.115,par1[1],par1[2], par1[3]);
                f2.SetLineColor(ROOT.kRed)
                h1.Fit(f2,"LMNIR");
                par2 = f2.GetParameters()


                h1.Draw("pe")
                fs = TF1("fs","[0]*exp(-0.5*((x-[2])/[1])^2)",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fs.SetNpx(100000);
                fs.SetLineColor(ROOT.kGreen)
                fb.SetLineStyle(4)
                fb.SetLineColor(ROOT.kBlue)
                fb.SetNpx(100000);
                fs.SetParameters(par2[0],par2[1],par2[2]);
                fb.SetParameters(par2[3],par2[4], par2[5]);
                fs.Draw("SAME")
                fb.Draw("SAME")
                f2.Draw("SAME")

                bin1 = h1.FindBin(mm+fit_limit_low[mmm]);
                bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                for i in range(bin1,bin2):
                    f_value= f2.Eval(h1.GetBinCenter(i));
                    t_value = h1.GetBinContent(i)
                    h2.SetBinContent(i,f_value)
                    if (h1.GetBinError(i) > 0):
                        h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                h2.Sumw2()

                # Integrate over the Gaussian peak from 3 sigma (par2[1]) below the mean (par2[2])
                # to 3 sigma above it.
                integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                integral_max = par2[2] + (TMath.Abs(3*par2[1]));
            #To integrate area under the signal plus background curve we take 3 sigma and integrate
                binwidth = h1.GetXaxis().GetBinWidth(1);
                tot = f2.Integral(integral_min,integral_max)/binwidth;
                sigma_integral = f2.IntegralError(integral_min,integral_max);
            #To find the signal, we integrate just the gaussian peak with 3 sigma 
                signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                if sigma_signal_under_peak!=0:
                    print("Integral errors ",sigma_signal_under_peak)

                tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
            #Background
                backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                if backgnd_under_peak<0:
                    print('Warning: negative background yield under the peak:', backgnd_under_peak)
                sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
            #Significance = signal/(signal+background)^0.5
                Significance = signal_under_peak/TMath.Sqrt(tot);

                #3.5 sigma
                signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)




                std = par2 [1]
                estd = f2.GetParError(1)

                latex = ROOT . TLatex ()
                latex . SetNDC ()
                latex . SetTextSize (0.02)
                latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.70, " #sigma = %.4f #pm %.5f GeV"%(std,estd ))
                latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))


                legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                legend.AddEntry(h1,"Invariant mass of lambda","l");
                legend.AddEntry(f2,"Ae^{#frac{-1}{2} #frac{(x-#mu)^{2}}{#sigma^{2}}}+B+Cx+Dx^{2}","l");
                legend.AddEntry(fs,"Ae^{#frac{-1}{2} #frac{(x-#mu)^{2}}{#sigma^{2}}}","l");
                legend.AddEntry(fb,"B+Cx+Dx^{2}","l");
                legend . SetLineWidth (0)
                legend.Draw()

                canvas . cd ()
                pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                pad2 . Draw ()
                pad2 . cd ()
                pad2.Clear()


                h3.SetLineColor(TColor.GetColor(5))
                h3.SetYTitle("(d-f)/#Deltad")
                h3.Draw()
                line = ROOT . TLine (mm,0 ,1.23 ,0)
                line . SetLineColor ( ROOT . kRed )
                line . SetLineWidth (2)
                line . Draw (" same ")


                pad1 . SetBottomMargin (0)
                pad2 . SetTopMargin (0)
                pad2 . SetBottomMargin (0.25)

                h1 . GetXaxis (). SetLabelSize (0)
                h1 . GetXaxis (). SetTitleSize (0)
                h1 . GetYaxis (). SetTitleSize (0.05)
                h1 . GetYaxis (). SetLabelSize (0.03)
                h1 . GetYaxis (). SetTitleOffset (0.6)

                h3 . SetTitle ("")
                h3 . GetXaxis (). SetLabelSize (0.12)
                h3 . GetXaxis (). SetTitleSize (0.12)
                h3 . GetYaxis (). SetLabelSize (0.1)
                h3 . GetYaxis (). SetTitleSize (0.15)
            #ratio . GetYaxis (). SetTitle (" Data /MC")
                h3 . GetYaxis (). SetTitleOffset (0.17)
            #207,512 divisions
                h3 . GetYaxis (). SetNdivisions (207)
                h1 . GetYaxis (). SetRangeUser (0.5 ,2950)
                h1 .GetYaxis().SetNdivisions(107)
                h3 . GetXaxis (). SetNdivisions (207)

                canvas.Update()
                # Append the current canvas as a new page (a trailing "[" only opens the file and writes no page)
                canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf")

            gaussian_2nd_order_pol.append(tot_sig_2_point_5_sigma)
            gaussian_2nd_order_pol.append(tot_sig_3_sigma)
            gaussian_2nd_order_pol.append(tot_sig_3_point_5_sigma)

canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")
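
The yields and significances printed on each page can be cross-checked without ROOT, because both fit components have closed-form integrals. The sketch below is only an illustration: the helper names (`gaussian_yield`, `polynomial_yield`, `significance`) and the numerical fit values are hypothetical and are not taken from the fits above. It uses the same convention as the cell above, dividing the integral by the bin width to convert an area into a number of entries, and evaluates $S/\sqrt{S+B}$ in the 2.5, 3 and 3.5 sigma windows.

```python
import math


def gaussian_yield(amplitude, sigma, n_sigma, bin_width):
    """Entries under A*exp(-0.5*((x-mu)/sigma)**2) within mu +/- n_sigma*sigma."""
    area = (amplitude * abs(sigma) * math.sqrt(2.0 * math.pi)
            * math.erf(n_sigma / math.sqrt(2.0)))
    return area / bin_width


def polynomial_yield(coeffs, lo, hi, bin_width):
    """Entries under sum(c_k * x**k) between lo and hi (analytic antiderivative)."""
    area = sum(c * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, c in enumerate(coeffs))
    return area / bin_width


def significance(signal, background):
    """Significance defined as S / sqrt(S + B)."""
    return signal / math.sqrt(signal + background)


# Hypothetical fit result (illustrative numbers only, not from the notebook)
amplitude, sigma, mu = 1000.0, 0.0015, 1.1157
coeffs = [50.0, 0.0, 0.0]   # B + C*x + D*x**2
bin_width = 0.001

windows = {}
for n_sigma in (2.5, 3.0, 3.5):
    lo, hi = mu - n_sigma * sigma, mu + n_sigma * sigma
    s = gaussian_yield(amplitude, sigma, n_sigma, bin_width)
    b = polynomial_yield(coeffs, lo, hi, bin_width)
    windows[n_sigma] = (s, b, significance(s, b))
    print("%.1f sigma: S = %.1f, B = %.1f, S/sqrt(S+B) = %.2f"
          % (n_sigma, s, b, windows[n_sigma][2]))
```

Comparing these numbers with the `TF1.Integral` results for the same parameters is a quick way to catch a swapped parameter index or a forgotten bin-width division.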

## Gaussian with linear

In [None]:
gaussian_linear = []
mass_range_min = [1.08,1.085, 1.09, 1.092, 1.094, 1.096]
fit_limit_low=[0,0.001,0.005,1.21,1.22,1.23]
for mm in mass_range_min:
    canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
    canvas.Draw()
    canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
    

    binning = [70,100,130]
    for b in binning:

        bins=b
        for mmm in range(0,3,1):
            tot_sig_3_sigma = 0
            tot_bac_3_sigma = 0
            tot_sig_3_point_5_sigma = 0
            tot_bac_3_point_5_sigma = 0
            tot_sig_2_point_5_sigma = 0
            tot_bac_2_point_5_sigma = 0
            for entry in list1:
                distribution = entry
            #Step 1
                data0 = background_selector(entry)
                h0 = ROOT.TH1F("Background","Background without peak",bins,mm,1.23)
                for i in range(0,data0.shape[0]):
                    h0.Fill(data0.iloc[i])
                fb = TF1("fb","[0]+[1]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fb.SetParameters(0,0);
                h0.Fit(fb,"LRNIFCWW");
                par = fb.GetParameters()

                # Step 1 (continued): fit signal + background with the Gaussian mean and width held fixed
                data = distribution['mass']
                # mm is the lower edge of the first histogram bin
                h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(distribution['rapidity'].min(),distribution['rapidity'].max(),distribution['pT'].min(),distribution['pT'].max(), mm, bins),bins,mm,1.23)
                for i in range(0,data.shape[0]):
                    h1.Fill(data.iloc[i])
                f1 = TF1("step1","[0]*exp(-0.5*((x-1.115683)/0.0014)^2)+[1]+[2]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                f1.SetParameters(1,par[0],par[1]);
                h1.Fit(f1,"LRNI");
                par1 = f1.GetParameters()


            #Step2
                canvas .Clear ()
                pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                pad1 . Draw ()
                pad1 . cd ()
                pad1. Clear()


                h1.SetTitleOffset(-1)
                h1.SetFillStyle(3003);
                h1.SetLineWidth(2)
                h1.SetStats (0)
                h1.SetYTitle("Entries")
                h1.SetLineColor(ROOT.kBlack)
                h2 = ROOT.TH1F("h2", "", bins, mm, 1.23);
                h3 = ROOT.TH1F("h3", "", bins, mm, 1.23);
                h3.SetLineWidth(2)
                h3.SetStats (0)
                h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

                f2 = TF1("full","[0]*exp(-0.5*((x-[2])/[1])^2)+[3]+[4]*x",mm+fit_limit_low[mmm],fit_limit_low[mmm+3])
                f2.SetNpx(100000);
                f2.SetParameters(par1[0],0.001,1.115,par1[1],par1[2]);
                f2.SetLineColor(ROOT.kRed)
                h1.Fit(f2,"LMNIR");
                par2 = f2.GetParameters()


                h1.Draw("pe")
                fs = TF1("fs","[0]*exp(-0.5*((x-[2])/[1])^2)",mm+fit_limit_low[mmm],fit_limit_low[mmm+3]);
                fs.SetNpx(100000);
                fs.SetLineColor(ROOT.kGreen)
                fb.SetLineStyle(4)
                fb.SetLineColor(ROOT.kBlue)
                fb.SetNpx(100000);
                fs.SetParameters(par2[0],par2[1],par2[2]);
                fb.SetParameters(par2[3],par2[4]);
                fs.Draw("SAME")
                fb.Draw("SAME")
                f2.Draw("SAME")

                bin1 = h1.FindBin(mm+fit_limit_low[mmm]);
                bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                for i in range(bin1,bin2):
                    f_value= f2.Eval(h1.GetBinCenter(i));
                    t_value = h1.GetBinContent(i)
                    h2.SetBinContent(i,f_value)
                    if (h1.GetBinError(i) > 0):
                        h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                h2.Sumw2()

                # Integrate over the Gaussian peak from 3 sigma (par2[1]) below the mean (par2[2])
                # to 3 sigma above it.
                integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                integral_max = par2[2] + (TMath.Abs(3*par2[1]));
            #To integrate area under the signal plus background curve we take 3 sigma and integrate
                binwidth = h1.GetXaxis().GetBinWidth(1);
                tot = f2.Integral(integral_min,integral_max)/binwidth;
                sigma_integral = f2.IntegralError(integral_min,integral_max);
            #To find the signal, we integrate just the gaussian peak with 3 sigma 
                signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
                sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                if sigma_signal_under_peak!=0:
                    print("Integral errors ",sigma_signal_under_peak)

                tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
            #Background
                backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                if backgnd_under_peak<0:
                    print('Warning: negative background yield under the peak:', backgnd_under_peak)
                sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
            #Significance = signal/(signal+background)^0.5
                Significance = signal_under_peak/TMath.Sqrt(tot);

                #3.5 sigma
                signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)




                std = par2 [1]
                estd = f2.GetParError(1)

                latex = ROOT . TLatex ()
                latex . SetNDC ()
                latex . SetTextSize (0.02)
                latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                latex . DrawLatex (0.4 ,0.70, " #sigma = %.4f #pm %.5f GeV"%(std,estd ))
                latex . DrawLatex (0.4 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))


                legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                legend.AddEntry(h1,"Invariant mass of lambda","l");
                legend.AddEntry(f2,"Ae^{#frac{-1}{2} #frac{(x-#mu)^{2}}{#sigma^{2}}}+B+Cx","l");
                legend.AddEntry(fs,"Ae^{#frac{-1}{2} #frac{(x-#mu)^{2}}{#sigma^{2}}}","l");
                legend.AddEntry(fb,"B+Cx","l");
                legend . SetLineWidth (0)
                legend.Draw()

                canvas . cd ()
                pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                pad2 . Draw ()
                pad2 . cd ()
                pad2.Clear()


                h3.SetLineColor(TColor.GetColor(5))
                h3.SetYTitle("(d-f)/#Deltad")
                h3.Draw()
                line = ROOT . TLine (mm,0 ,1.23 ,0)
                line . SetLineColor ( ROOT . kRed )
                line . SetLineWidth (2)
                line . Draw (" same ")


                pad1 . SetBottomMargin (0)
                pad2 . SetTopMargin (0)
                pad2 . SetBottomMargin (0.25)

                h1 . GetXaxis (). SetLabelSize (0)
                h1 . GetXaxis (). SetTitleSize (0)
                h1 . GetYaxis (). SetTitleSize (0.05)
                h1 . GetYaxis (). SetLabelSize (0.03)
                h1 . GetYaxis (). SetTitleOffset (0.6)

                h3 . SetTitle ("")
                h3 . GetXaxis (). SetLabelSize (0.12)
                h3 . GetXaxis (). SetTitleSize (0.12)
                h3 . GetYaxis (). SetLabelSize (0.1)
                h3 . GetYaxis (). SetTitleSize (0.15)
            #ratio . GetYaxis (). SetTitle (" Data /MC")
                h3 . GetYaxis (). SetTitleOffset (0.17)
            #207,512 divisions
                h3 . GetYaxis (). SetNdivisions (207)
                h1 . GetYaxis (). SetRangeUser (0.5 ,2650)
                h1 .GetYaxis().SetNdivisions(107)
                h3 . GetXaxis (). SetNdivisions (207)

                canvas.Update()
                # Append the current canvas as a new page (a trailing "[" only opens the file and writes no page)
                canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf")

            gaussian_linear.append(tot_sig_2_point_5_sigma)
            gaussian_linear.append(tot_sig_3_sigma)
            gaussian_linear.append(tot_sig_3_point_5_sigma)

canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
#canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")
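
The lower pad of every page shows the pull of each bin, $(d-f)/\Delta d$, where $d$ is the bin content, $f$ the fitted function at the bin centre and $\Delta d$ the bin error. A minimal pure-Python sketch of that calculation follows. It assumes Poisson errors $\Delta d=\sqrt{d}$, which is what an unweighted histogram reports; the function names and the sample numbers are hypothetical.

```python
import math


def pulls(bin_contents, fit_values):
    """Return (d - f) / sqrt(d) for each bin; empty bins get a pull of 0."""
    result = []
    for d, f in zip(bin_contents, fit_values):
        err = math.sqrt(d) if d > 0 else 0.0
        result.append((d - f) / err if err > 0 else 0.0)
    return result


def sum_of_squared_pulls(pull_values):
    """Chi-square-like figure of merit built from the pulls."""
    return sum(p * p for p in pull_values)


# Hypothetical bin contents and fit values, for illustration only
data = [100, 121, 0, 81]
fit = [110.0, 110.0, 2.0, 90.0]

pull_values = pulls(data, fit)
print(pull_values)
print(sum_of_squared_pulls(pull_values))
```

Pulls that scatter around zero with a spread of about one indicate that the chosen signal and background shapes describe the histogram; a systematic trend in the pulls points to a mismodelled background.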

In [None]:
#sigma_signal_under_peak
sigma_integral
#sigma_backgnd_under_peak

In [None]:
np.linspace(h[0].min(), h[0].max(), 4, endpoint=True)

In [None]:
def pT_vs_rapidity(df, var_xaxis , var_yaxis , range_var_xaxis, range_var_yaxis):
    import matplotlib as mpl
    fig, axs = plt.subplots(figsize=(8, 6),dpi = 300)
    h=plt.hist2d(df[var_xaxis],df[var_yaxis],range=[range_var_xaxis,range_var_yaxis], bins=np.arange(0,17)*0.2+0, norm=mpl.colors.LogNorm())
    v1 = np.linspace(0, h[0].max(), 4, endpoint=True)
    cbar = fig.colorbar(h[3], ticks = v1 )
    #cbar.set_ticks([h[0].min(),(h[0].max()-h[0].min())/2,h[0].max()])
    #cbar.set_ticklabels([h[0].min(),(h[0].max()-h[0].min())/2,h[0].max()])
    
    #v1 = np.linspace(Z.min(), Z.max(), 8, endpoint=True)
    #cbar=plt.colorbar(ticks=v1)              # the mystery step ???????????
    cbar.ax.set_yticklabels(['%d' % v for v in v1]) # label the ticks with the actual bin counts
    

    
    plt.vlines(x=1.59,ymin=-1,ymax=2.4, color='r', linestyle='-')
    #plt.hlines(y=bins4[1], xmin=bins0[3], xmax=3.162, colors='b', linestyles='solid', label='')
    #plt.hlines(y=bins4[2], xmin=bins0[3], xmax=3.162, colors='b', linestyles='solid', label='')

    #plt.hlines(y=0.4, xmin=-0.1, xmax=df[var_xaxis].max(), colors='b', linestyles='solid', label='')
    #plt.hlines(y=0.2, xmin=-0.1, xmax=1.5996, colors='b', linestyles='solid', label='')
    #plt.hlines(y=0.9, xmin=-0.1, xmax=3.5, colors='b', linestyles='solid', label='')
    plt.xlabel('$y_{Lab}$', fontsize=20)
    plt.ylabel('$p_{T}$ (GeV/$c$)', fontsize=18)
    axs.text(0.02, 3, r'CBM Performance', fontsize=15)
    axs.text(0.02, 2.8, r'DCM-QGSM-SMM, Au+Au @ 12 $A$GeV/$c$', fontsize=15, color ='r')
    axs.text(1.2, 0.6, r'$y_{CM}$', fontsize=20, color ='r')
    axs.tick_params(axis='both', which='major', labelsize=18)
    axs.grid(True, animated=True )
    axs.set_xticks(np.arange(0,17)*0.2+0)
    axs.set_xticklabels(['0' ,'' ,'' ,'0.6','','', '1.2','','', '1.8','' ,'' ,'2.4','','' ,'3' , ''])
    axs.set_yticks(np.arange(0,16)*0.2+0)
    axs.set_yticklabels(['0' ,'' ,'' ,'0.6','','', '1.2','','', '1.8','' ,'' ,'2.4','','' ,'3'])
    #plt.title("  y-$p_{T}$ plot for signal candidates (MC=1) with a cut = %.2f"%0.95,  fontsize=18)
    #plt.grid(which='both', ydata =yy)
    # Lay out and save the figure before showing it
    fig.tight_layout()
    fig.savefig("/home/shahid/cbmsoft/Cut_optimization/uncut_data/pT_vs_rapidity.png")
    plt.show()
    return h
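
The tick labels in `pT_vs_rapidity` are typed by hand, which makes it easy for the number of labels to drift away from the number of ticks. As a sketch of an alternative, the hypothetical helper below (`sparse_tick_labels` is not part of the notebook) builds the same pattern, one label on every third tick, directly from the tick count and spacing.

```python
def sparse_tick_labels(n_ticks, step, every=3):
    """Label every `every`-th tick with its value and leave the others blank."""
    return [f"{i * step:g}" if i % every == 0 else "" for i in range(n_ticks)]


x_labels = sparse_tick_labels(17, 0.2)  # matches np.arange(0, 17) * 0.2
y_labels = sparse_tick_labels(16, 0.2)  # matches np.arange(0, 16) * 0.2
print(x_labels)
print(y_labels)
```

Because the labels come from the same count as the ticks, `set_xticks` and `set_xticklabels` always receive lists of equal length.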

In [None]:
range1=[-0., 3.2]
range2=[-0.01, 3.]

h =pT_vs_rapidity(df4[df4['issignal']==1],'rapidity','pT', range1, range2)

In [None]:
pt_rapidity_cut = df3_base[(df3_base['rapidity']<2) & (df3_base['rapidity']>0.8) &(df3_base['pT']>0.15)
                           &(df3_base['pT']<0.9)]
pT_vs_rapidity(pt_rapidity_cut,'rapidity','pT', range1, range2)

In [None]:
high_pt_below_mid_rapidity_cut = df3_base[(df3_base['rapidity']<1.5996) & (df3_base['pT']>0.4)]
pT_vs_rapidity(high_pt_below_mid_rapidity_cut,'rapidity','pT', range1, range2)

In [None]:
del  h1, h2 ,h3

In [None]:
def pyf_tf1_params(x, p):
    return p[0] * x[0] + p[1]

npars = 2
f = ROOT.TF1("tf1_params", pyf_tf1_params, 0.0, 1.0, npars)

In [None]:
np.exp(1)

## Fitting Signal
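
A Lorentzian has much heavier tails than a Gaussian, so a window of three widths around the peak holds a smaller share of the signal. For the parametrisation $A\,\frac{0.5\,\Gamma}{(m-m_0)^2+0.25\,\Gamma^2}$ the integral over $m_0 \pm k\Gamma$ is $2A\arctan(2k)$, and the integral over the whole axis is $A\pi$. The sketch below checks this numerically with a simple midpoint rule. The helper names are hypothetical, the amplitude is an arbitrary illustrative value, and $\Gamma$ and $m_0$ are the fixed starting values used in the first fit of this section.

```python
import math


def lorentzian(x, amplitude, gamma, m0):
    """A * 0.5*Gamma / ((x - m0)**2 + 0.25*Gamma**2)."""
    return 0.5 * amplitude * gamma / ((x - m0) ** 2 + (0.5 * gamma) ** 2)


def lorentzian_area(amplitude, n_gamma):
    """Analytic integral over m0 +/- n_gamma * Gamma."""
    return 2.0 * amplitude * math.atan(2.0 * n_gamma)


def midpoint_integral(func, lo, hi, steps=200000):
    """Plain midpoint-rule integration, used here only as a cross-check."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


amplitude, gamma, m0 = 1.0, 0.0014, 1.115683  # amplitude is illustrative

numeric = midpoint_integral(lambda x: lorentzian(x, amplitude, gamma, m0),
                            m0 - 3 * gamma, m0 + 3 * gamma)
analytic = lorentzian_area(amplitude, 3.0)

fraction_lorentz = analytic / (math.pi * amplitude)   # share inside +/- 3 Gamma
fraction_gauss = math.erf(3.0 / math.sqrt(2.0))       # share inside +/- 3 sigma

print("numeric = %.6f, analytic = %.6f" % (numeric, analytic))
print("Lorentzian fraction = %.4f, Gaussian fraction = %.4f"
      % (fraction_lorentz, fraction_gauss))
```

Roughly a tenth of a Lorentzian signal lies outside the three-width window, so yields extracted with the two peak shapes are not directly comparable unless this is taken into account.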

In [None]:
def lorenztian( x ,p):
    return 0.5*p[0]*p[1] /( ((x[0]-p[2])**2) + ((0.5 * p[1])**2)) 

def gaus_fit( x ,p):
    return p[0]*np.exp(-0.5*((x[0]-p[2])/p[1])**2)


f2 = ROOT.TF1 (" gaussfit", "[0]*exp(-0.5*((x-[2])/[1])^2)"  ,1.1 ,1.13)

#def lorenztian( x ,p):
#    return p[0]*2*np.sqrt(2)*p[1]*p[2]*np.sqrt(p[2]*(p[2]**2 + p[1]**2)) /(np.pi*np.sqrt(p[2]+np.sqrt(p[2]*(p[2]**2 + p[1]**2)))) /( ((x[0]**2) - (p[2]**2))**2 +(p[1]*p[2])**2 )
mm= 1.105
bins =100
canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
canvas.Draw()
canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
distribution = mid_pT_high_rapidity
data = distribution['mass']
#the minimum x (lower edge of the first bin)=mm        
h1 = ROOT.TH1F("B_&_S","", bins,mm,1.13)
for i in range(0,data.shape[0]):
    h1.Fill(data.iloc[i])
f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014)",mm,1.13);
#f1.SetParameters(1);
h1.Fit(f1,"RNI");
par1 = f1.GetParameters()


#Step2
canvas .Clear ()
pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
pad1 . Draw ()
pad1 . cd ()
pad1. Clear()


h1.SetTitleOffset(-1)
h1.SetFillStyle(3003);
h1.SetLineWidth(2)
h1.SetStats (0)
h1.SetYTitle("Entries")
h1.SetLineColor(ROOT.kBlack)
h2 = ROOT.TH1F("h2", "", bins, mm, 1.13);
h3 = ROOT.TH1F("h3", "", bins, mm, 1.13);
h3.SetLineWidth(2)
h3.SetStats (0)
h3.GetXaxis().SetTitle("Mass (GeV/c^2)")

f2 = TF1("full",lorenztian,mm,1.13,3);
f2.SetNpx(100000);
f2.SetParameters(par1[0],0.0001,1.115);
f2.SetLineColor(ROOT.kRed)
h1.Fit(f2,"E");
par2 = f2.GetParameters()


h1.Draw("pe")

f2.Draw("SAME")

bin1 = h1.FindBin(mm);
bin2 = h1.FindBin(1.13);
for i in range(bin1,bin2):
    f_value= f2.Eval(h1.GetBinCenter(i));
    t_value = h1.GetBinContent(i)
    h2.SetBinContent(i,f_value)
    if (h1.GetBinError(i) > 0):
        h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

h2.Sumw2()

# Integrate over the peak from 3 widths (par2[1]) below the peak position (par2[2])
# to 3 widths above it.
integral_min = par2[2] - (TMath.Abs(3*par2[1]));
integral_max = par2[2] + (TMath.Abs(3*par2[1]));
#To integrate area under the signal plus background curve we take 3 sigma and integrate
binwidth = h1.GetXaxis().GetBinWidth(1);
tot = f2.Integral(integral_min,integral_max)/binwidth;
sigma_integral = f2.IntegralError(integral_min,integral_max);
#To find the signal, we integrate just the gaussian peak with 3 sigma 
signal_under_peak = (fs.Integral(integral_min,integral_max)/binwidth);
sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
if sigma_signal_under_peak!=0:
    print("Integral errors ",sigma_signal_under_peak)

tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
#Background
backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
#Significance = signal/(signal+background)^0.5
#Significance = signal_under_peak/TMath.Sqrt(tot);

#3.5 sigma
signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)




std =  f2.GetParameter(1)
estd = f2.GetParError(1)

latex = ROOT . TLatex ()
latex . SetNDC ()
latex . SetTextSize (0.02)
#latex . DrawLatex (0.4 ,0.85, "Significance in 2.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
#latex . DrawLatex (0.4 ,0.80, "Significance in 3#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
#latex . DrawLatex (0.4 ,0.75, "Significance in 3.5#sigma region around peak = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
latex . DrawLatex (0.2 ,0.75, " #Gamma = %.4f #pm %.5f GeV"%(std,estd ))
latex . DrawLatex (0.2 ,0.70, " m_{0} = %.4f #pm %.5f GeV"%(par2 [2],f2.GetParError(2) ))
#latex . DrawLatex (0.2 ,0.65," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))


legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
legend.AddEntry(h1,"Invariant mass of lambda","l");
legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
legend . SetLineWidth (0)
legend.Draw()

canvas . cd ()
pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
pad2 . Draw ()
pad2 . cd ()
pad2.Clear()


h3.SetLineColor(TColor.GetColor(5))
h3.SetYTitle("(d-f)/#Deltad")
h3.Draw()
line = ROOT . TLine (mm,0 ,1.125 ,0)
line . SetLineColor ( ROOT . kRed )
line . SetLineWidth (2)
line . Draw (" same ")


pad1 . SetBottomMargin (0)
pad2 . SetTopMargin (0)
pad2 . SetBottomMargin (0.25)

h1 . GetXaxis (). SetLabelSize (0)
h1 . GetXaxis (). SetTitleSize (0)
h1 . GetYaxis (). SetTitleSize (0.05)
h1 . GetYaxis (). SetLabelSize (0.03)
h1 . GetYaxis (). SetTitleOffset (0.6)

h3 . SetTitle ("")
h3 . GetXaxis (). SetLabelSize (0.12)
h3 . GetXaxis (). SetTitleSize (0.12)
h3 . GetYaxis (). SetLabelSize (0.1)
h3 . GetYaxis (). SetTitleSize (0.15)
h1 . GetXaxis (). SetRangeUser (1. ,1.126)
h3 . GetXaxis (). SetRangeUser (1.11 ,1.122)
#ratio . GetYaxis (). SetTitle (" Data /MC")
h3 . GetYaxis (). SetTitleOffset (0.17)
#207,512 divisions
h3 . GetYaxis (). SetNdivisions (207)
h1 . GetYaxis (). SetRangeUser (0.5 ,1000)
h1 .GetYaxis().SetNdivisions(107)
h3 . GetXaxis (). SetNdivisions (207)

canvas.Update()
canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.png")



canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
#canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")
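
The significance quoted in the cell above is $S/\sqrt{S+B}$, with $S$ and $B$ the signal and background yields inside the chosen mass window. A minimal sketch of that formula with a guard against empty or negative yields; the helper name `significance` and the numbers in the usage line are illustrative assumptions, not values from this analysis.

```python
import math

def significance(signal, background):
    """Return S / sqrt(S + B), or 0.0 when the yields are not usable."""
    total = signal + background
    if signal <= 0 or total <= 0:
        # A negative fitted yield would otherwise give a negative value or a math error.
        return 0.0
    return signal / math.sqrt(total)

# Toy numbers: 900 signal and 700 background counts in the window.
print(significance(900.0, 700.0))  # 900 / sqrt(1600) = 22.5
```

Returning 0.0 for unusable yields is a design choice of this sketch; raising an exception would be an equally valid way to flag a failed fit.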

In [None]:
df = sgnal[sgnal['issignal']==1]
lowest_rapidity = df[df['rapidity']<0.5]
low_rapidity = df[(df['rapidity']>0.5)   & (df['rapidity']<1)]
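# Note: candidates with 1 <= rapidity <= 1.5 lie between low_rapidity and mid_rapidity and enter no bin
mid_rapidity = df[(df['rapidity']>1.5)   & (df['rapidity']<2)]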
high_rapidity = df[(df['rapidity']>2)    & (df['rapidity']<2.5)]
higher_rapidity = df[(df['rapidity']>2.5)]

    

low_pT_lowest_rapidity = lowest_rapidity[lowest_rapidity['pT']<1]
mid_pT_lowest_rapidity = lowest_rapidity[(lowest_rapidity['pT']>1) & (lowest_rapidity['pT']<2)]
high_pT_lowest_rapidity =lowest_rapidity[(lowest_rapidity['pT']>2)]


low_pT_low_rapidity = low_rapidity[low_rapidity['pT']<1]
mid_pT_low_rapidity = low_rapidity[(low_rapidity['pT']>1) & (low_rapidity['pT']<2)]
high_pT_low_rapidity= low_rapidity[(low_rapidity['pT']>2)]
    

low_pT_mid_rapidity = mid_rapidity[mid_rapidity['pT']<1]
mid_pT_mid_rapidity = mid_rapidity[(mid_rapidity['pT']>1) & (mid_rapidity['pT']<2)]
high_pT_mid_rapidity=mid_rapidity[(mid_rapidity['pT']>2)]
    

low_pT_high_rapidity = high_rapidity[high_rapidity['pT']<1]
mid_pT_high_rapidity = high_rapidity[(high_rapidity['pT']>1) & (high_rapidity['pT']<2)]
high_pT_high_rapidity=high_rapidity[(high_rapidity['pT']>2)]

low_pT_higher_rapidity = higher_rapidity[higher_rapidity['pT']<1]
mid_pT_higher_rapidity = higher_rapidity[(higher_rapidity['pT']>1) & (higher_rapidity['pT']<2)]
high_pT_higher_rapidity=higher_rapidity[(higher_rapidity['pT']>2)]


del  lowest_rapidity, mid_rapidity, high_rapidity, higher_rapidity, df
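
The boolean masks in the cell above can also be written with `pandas.cut`, which assigns every row to an interval in one call. A minimal sketch, assuming the column names `rapidity` and `pT` from the cell above; the helper name `bin_kinematics`, the bin edges and the toy rows are illustrative assumptions. Note that `pandas.cut` uses right-closed intervals by default, whereas the masks above use strict inequalities on both sides.

```python
import numpy as np
import pandas as pd

def bin_kinematics(df, y_edges, pt_edges):
    """Return a copy of df with interval labels for rapidity and pT."""
    out = df.copy()
    out["y_bin"] = pd.cut(out["rapidity"], bins=y_edges)
    out["pt_bin"] = pd.cut(out["pT"], bins=pt_edges)
    return out

# Toy candidates (hypothetical values, not analysis data).
toy = pd.DataFrame({"rapidity": [0.2, 0.7, 1.2, 2.7],
                    "pT":       [0.5, 1.5, 2.5, 0.3]})

binned = bin_kinematics(toy,
                        y_edges=[0, 0.5, 1, 1.5, 2, 2.5, np.inf],
                        pt_edges=[0, 1, 2, np.inf])

# Number of candidates per (rapidity, pT) cell; empty cells are omitted.
counts = binned.groupby(["y_bin", "pt_bin"], observed=True).size()
print(counts)
```

Because the edges form one contiguous list, every candidate above the first edge lands in exactly one interval.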

In [None]:
list1 = [low_pT_lowest_rapidity, mid_pT_lowest_rapidity, high_pT_lowest_rapidity, low_pT_low_rapidity,
         mid_pT_low_rapidity, high_pT_low_rapidity, low_pT_mid_rapidity, mid_pT_mid_rapidity,
        high_pT_mid_rapidity, low_pT_high_rapidity, mid_pT_high_rapidity, high_pT_high_rapidity, 
         low_pT_higher_rapidity, mid_pT_higher_rapidity, high_pT_higher_rapidity]

In [None]:
signal[signal['issignal']==1]['rapidity'].describe()

In [None]:
df4['mass'].iloc[0]

In [None]:
from ROOT import TFile, TTree
from array import array
from ROOT import std

f = TFile('pt_y_yield_bdt_cut_0.95.root','recreate')
t = TTree('t1','tree with df')


rapidity = array('f',[0])
mass = array('f',[0])
pT = array('f',[0])
issignal = array('f',[0])

t.Branch('rapidity', rapidity,'y/F')
t.Branch('mass', mass,'mass/F')
t.Branch('pT', pT,'pT/F')
t.Branch('issignal', issignal,'issignal/F')

for i in range(len(df4['mass'])):
    rapidity[0] = df4['rapidity'].iloc[i]
    mass[0] = df4['mass'].iloc[i]
    pT[0] = df4['pT'].iloc[i]
    issignal[0] = df4['issignal'].iloc[i]
    t.Fill()
f.Write()
f.Close()

In [None]:
gc.collect()

In [None]:
df4[(df4['rapidity']>1.4)& (df4['pT']>1.4)]['pT'].shape

# Efficiency 
Efficiency correction for a single configuration only, i.e. a Lorentzian signal plus a second-order polynomial background, with 100 mass bins.
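
As an illustration of what such a correction involves, the sketch below uses one common definition: the efficiency in a kinematic bin is the number of reconstructed true signal candidates divided by the number of simulated ones, and the corrected yield is the extracted yield divided by that efficiency. This definition, the binomial uncertainty and the names `efficiency_with_error` and `corrected_yield` are assumptions made for the example; the numbers are toy values.

```python
import math

def efficiency_with_error(n_reco, n_sim):
    """Efficiency n_reco / n_sim with a simple binomial uncertainty."""
    if n_sim <= 0:
        return 0.0, 0.0
    eff = n_reco / n_sim
    # Binomial variance eff * (1 - eff) / n_sim, clamped at zero for eff > 1.
    err = math.sqrt(max(eff * (1.0 - eff), 0.0) / n_sim)
    return eff, err

def corrected_yield(raw_yield, eff):
    """Divide the extracted yield by the efficiency; 0.0 if the bin is empty."""
    return raw_yield / eff if eff > 0 else 0.0

# Toy bin: 250 reconstructed out of 1000 simulated signal particles.
eff, err = efficiency_with_error(n_reco=250, n_sim=1000)
print(eff, err)
print(corrected_yield(240.0, eff))  # 240 / 0.25 = 960.0
```

Bins with no simulated signal return zero efficiency here, so they drop out of the corrected spectrum instead of producing a division by zero.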

In [None]:
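def draw_hist(h1, f2, fs, fb, h3):
    """Draw the mass histogram with the fit components (upper pad) and the pulls (lower pad).

    Besides its arguments, the function reads notebook globals set by the fit loop:
    par2, mm, mc_counts, Significance and the *_under_peak* yields.
    """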
    c = ROOT . TCanvas (" canvas ","", 1200,1000)
    c . Draw() 
    c.Clear ()
    
    pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
    pad1 . Draw ()
    pad1 . cd ()
    pad1 . Clear()
    pad1 . SetBottomMargin (0)
    
    h1 . SetTitleOffset(-1)
    h1 . SetFillStyle(3003);
    h1 . SetLineWidth(2)
    h1 . SetStats (0)
    h1 . SetYTitle("Entries")
    h1 . SetLineColor(ROOT.kBlack)
    h1 . GetXaxis (). SetLabelSize (0)
    h1 . GetXaxis (). SetTitleSize (0)
    h1 . GetYaxis (). SetTitleSize (0.05)
    h1 . GetYaxis (). SetLabelSize (0.03)
    h1 . GetYaxis (). SetTitleOffset (0.6)
    h1 . GetYaxis().SetNdivisions(107)
    
    
    fs.SetNpx(100000);
    fs.SetLineColor(ROOT.kGreen)
    
    fb.SetLineStyle(4)
    fb.SetLineColor(ROOT.kBlue)
    fb.SetNpx(100000);
    
    f2.SetNpx(100000);
    f2.SetLineColor(ROOT.kRed)
    
    
    h1.Draw("pe")
    fs.Draw("SAME")
    fb.Draw("SAME")
    f2.Draw("SAME")
    
    latex = ROOT . TLatex ()
    latex . SetNDC ()
    latex . SetTextSize (0.02)
    latex . DrawLatex (0.4 ,0.85, "Significance in m_{0} #pm 2.5#Gamma  = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
    latex . DrawLatex (0.4 ,0.80, "Significance in m_{0} #pm 3#Gamma = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
    latex . DrawLatex (0.4 ,0.75, "Significance in m_{0} #pm 3.5#Gamma = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
    latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(par2 [1],f2.GetParError(1) ))
    latex . DrawLatex (0.4 ,0.65, " m_{0} = %.4f #pm %.5f GeV"%(par2 [2],f2.GetParError(2) ))
    latex . DrawLatex (0.4 ,0.6," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))
    latex . DrawLatex (0.4 ,0.55," True signal (MC=1) = %.f"%(mc_counts))


    legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
    legend.AddEntry(h1,"Invariant mass of lambda","l");
    legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+B+Cx+Dx^{2}","l");
    legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
    legend.AddEntry(fb,"B+Cx+Dx^{2}","l");
    legend . SetLineWidth (0)
    legend.Draw()
    
    c . cd ()
    pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
    pad2 . Draw ()
    pad2 . cd ()
    pad2.Clear()
    pad2.SetGrid()
    pad2 . SetTopMargin (0)
    pad2 . SetBottomMargin (0.25)

    
    h3.SetLineWidth(2)
    h3.SetStats (0)
    h3.GetXaxis().SetTitle("Mass [GeV/#it{c}^{2}]")
    h3 . SetTitle ("")
    h3 . GetXaxis (). SetLabelSize (0.12)
    h3 . GetXaxis (). SetTitleSize (0.12)
    h3 . GetYaxis (). SetLabelSize (0.1)
    h3 . GetYaxis (). SetTitleSize (0.15)
    #ratio . GetYaxis (). SetTitle (" Data /MC")
    h3 . GetYaxis (). SetTitleOffset (0.17)
    #207,512 divisions
    h3 . GetYaxis (). SetNdivisions (207)
    h3 . GetXaxis (). SetNdivisions (207)
    h3.SetLineColor(TColor.GetColor(5))
    h3.SetYTitle("(d-f)/#Deltad")
    
    h3.Draw()
    
    
    line = ROOT . TLine (mm,0 ,1.23 ,0)
    line . SetLineColor ( ROOT . kRed )
    line . SetLineWidth (2)
    line . Draw (" same ")
    c . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf [")
    
    #c . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")

In [None]:
#lorentzian + second order pol
a = []
pt_y_bin_for_yield_min=[]
pt_y_bin_for_yield_max=[]
y_bin_for_yield_max=[]
y_bin_for_yield_min=[]
true_mc_in_recons =[]


df = df3_base
mass_range_min = [1.08]
mass_mean = df['mass'].mean()
mass_std = df['mass'].std()
# Entries 0-2 are offsets added to the lower fit edge mm; entries 3-5 are upper fit edges
fit_limit_low=[0, 0.1*mass_std, 0.2*mass_std,
               1.23,
               mass_mean+1.2*mass_std+0.1*mass_std,
               mass_mean+1.2*mass_std+0.2*mass_std]


for mm in mass_range_min:
    for mmm in range(0,1,1):

        binning = [100]
        for b in binning:

            y_bin_low=-0.2
            y_bin_up =0
            for i in range(0,15,1):
                tot_sig_3_sigma = 0
                tot_bac_3_sigma = 0
                tot_sig_3_point_5_sigma = 0
                tot_bac_3_point_5_sigma = 0
                tot_sig_2_point_5_sigma = 0
                tot_bac_2_point_5_sigma = 0
                tot_sig_2_sigma = 0
                
                y_bin_low = truncate(y_bin_low+0.2)
                y_bin_up = truncate(y_bin_up+0.2)
                df_y = df[(df['rapidity']>y_bin_low) & (df['rapidity']<y_bin_up)]
                pt_bin_low =-0.2
                pt_bin_up =0
                
                for i in range(0,15,1):
                    pt_bin_low = truncate(pt_bin_low+0.2)
                    #print(pt_bin_low)
                    pt_bin_up = truncate(pt_bin_up+0.2)
                    df_pt = df_y[(df_y['pT']>pt_bin_low) & (df_y['pT']<pt_bin_up)]
                    mc_counts = df_pt[df_pt['issignal']>0].shape[0]
                    #print(y_bin_low, y_bin_up, " pT ", pt_bin_low,pt_bin_up)
                    if df_pt.shape[0]>500:
                        data0 = background_selector(df_pt)
                        h0 = ROOT.TH1F("Background","Background without peak",b,mm,fit_limit_low[5])
                        for i in range(0,data0.shape[0]):
                            h0.Fill(data0.iloc[i])
                        fb = TF1("fb","[0]+[1]*x+[2]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        #fb =TF1("fb","[0]+[1]*x+[2]*x*x+[3]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #fb.SetParameters(0,0,0);
                        #fb.SetParameters(0,0,0,0);
                        h0.Fit(fb,"RIEM");
                        par = fb.GetParameters()
                        #Step 1
                        data = df_pt['mass']
                        
                #the minimum x (lower edge of the first bin)=mm        
                        h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(df_pt['rapidity'].min(),df_pt['rapidity'].max(),df_pt['pT'].min(),df_pt['pT'].max(), mm, b),b,mm,fit_limit_low[5])
                        for i in range(0,data.shape[0]):
                            h1.Fill(data.iloc[i])
                        f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        #f1=TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x+[4]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        f1.SetParameters(1,par[0], par[1], par[2]);
                        #f1.SetParameters(1,par[0], par[1], par[2],par[3]);
                        h1.Fit(f1,"RNI");
                        par1 = f1.GetParameters()



                #step 2
                        f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x+[6]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3], par1[4]);
                        f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3]);

                        r= ROOT.TFitResultPtr(h1.Fit(f2,"MNIR"))
                        par2 = f2.GetParameters()

                        fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        #fs = TF1("fs","[0]*exp(-0.5*((x-[2])/[1])^2)",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);

                        fs.SetParameters(par2[0],par2[1],par2[2]);
                        # f2 has six parameters (0-5); the background part is parameters 3-5
                        fb.SetParameters(par2[3],par2[4],par2[5]);



                        h2 = ROOT.TH1F("h2", "", b, mm, 1.23);
                        h3 = ROOT.TH1F("h3", "", b, mm, 1.23);


                        bin1 = h1.FindBin(fit_limit_low[mmm]+mm);
                        bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                        for i in range(bin1,bin2):
                            f_value= f2.Eval(h1.GetBinCenter(i));
                            t_value = h1.GetBinContent(i)
                            h2.SetBinContent(i,f_value)
                            if (h1.GetBinError(i) > 0):
                                h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))

                        h2.Sumw2()

                        # Integrate over the Lorentzian peak within m_0 +/- 3*Gamma,
                        # where m_0 = par2[2] is the peak position and Gamma = par2[1] the width
                        integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                        integral_max = par2[2] + (TMath.Abs(3*par2[1]));
                        # Integrate the full fit (signal + background) over the same window
                        binwidth = h1.GetXaxis().GetBinWidth(1);
                        tot = f2.Integral(integral_min,integral_max)/binwidth;
                        sigma_integral = f2.IntegralError(integral_min,integral_max);
                        # To find the signal, integrate only the Lorentzian peak over the same window
                        #params.integral = fit->GetParameter(0) * sqrt(2*3.1415) * fit->GetParameter(2) / h->GetBinWidth(1);
                        #signal_under_peak = par2[1] * np.sqrt(2*3.1415) *3 *par2[2]/ binwidth
                        signal_under_peak = fs.Integral(integral_min,integral_max)/binwidth
                        if signal_under_peak<0:
                            signal_under_peak = 0
                            print('Negative signal')                
                        sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                        man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                        if sigma_signal_under_peak!=0:
                            print("Integral errors ",sigma_signal_under_peak)

                        tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
                    #Background
                        backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                        if backgnd_under_peak<0:
                            print('Negative background')
                        sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                        tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
                    #Significance = signal/(signal+background)^0.5
                        Significance = signal_under_peak/TMath.Sqrt(tot);

                        #3.5 sigma
                        signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                        tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                        sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                        man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                        signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                        tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                        sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                        man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)

                        signal_under_peak_2_sigma = (fs.Integral(par2[2] - (TMath.Abs(2*par2[1])),par2[2] + (TMath.Abs(2*par2[1])))/binwidth);
                        
                        draw_hist(h1, f2, fs, fb, h3)
                        

                        
            #a.append(tot_sig_2_point_5_sigma)
                        a.append(signal_under_peak_2_point_5_sigma)
                        y_bin_for_yield_min.append(truncate(y_bin_low))
                        y_bin_for_yield_max.append(truncate(y_bin_up))
                        pt_y_bin_for_yield_min.append(pt_bin_low)
                        pt_y_bin_for_yield_max.append(pt_bin_up)
                        true_mc_in_recons.append(mc_counts)
                    else:
                        a.append(0)
                        y_bin_for_yield_min.append(truncate(y_bin_low))
                        y_bin_for_yield_max.append(truncate(y_bin_up))
                        pt_y_bin_for_yield_min.append(pt_bin_low)
                        pt_y_bin_for_yield_max.append(pt_bin_up)
                        true_mc_in_recons.append(mc_counts)
            #a.append(tot_sig_3_point_5_sigma)
#c . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")
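
The signal shape fitted above is $A\,\frac{0.5\,\Gamma}{(m-m_0)^2+0.25\,\Gamma^2}$, and its integral over the window $m_0 \pm k\Gamma$ has the closed form $2A\arctan(2k)$, independent of $\Gamma$. The sketch below checks this against a plain midpoint sum. It is a cross-check under stated assumptions: the helper names and the parameter values are illustrative, and dividing by the bin width, as done in the loop, still has to be applied to turn the integral into a count.

```python
import math

def lorentzian(m, amplitude, gamma, m0):
    """Signal shape used in the fit: A * 0.5*Gamma / ((m - m0)^2 + 0.25*Gamma^2)."""
    return 0.5 * amplitude * gamma / ((m - m0) ** 2 + 0.25 * gamma ** 2)

def window_integral(amplitude, k):
    """Closed-form integral of the Lorentzian over m0 - k*Gamma .. m0 + k*Gamma."""
    return 2.0 * amplitude * math.atan(2.0 * k)

def midpoint_integral(func, lo, hi, steps=100000):
    """Numerical cross-check with the midpoint rule."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (j + 0.5) * h) for j in range(steps))

# Hypothetical fit results, chosen only for the check.
amplitude, gamma, m0 = 5.0, 0.0014, 1.1157

for k in (2.5, 3.0, 3.5):
    numeric = midpoint_integral(lambda m: lorentzian(m, amplitude, gamma, m0),
                                m0 - k * gamma, m0 + k * gamma)
    analytic = window_integral(amplitude, k)
    # Fraction of the total Lorentzian area (pi * A) inside the window.
    fraction = analytic / (math.pi * amplitude)
    print(k, numeric, analytic, fraction)
```

Because a Lorentzian has long tails, a $\pm 3\Gamma$ window contains only about 89% of the total area $\pi A$, which is worth keeping in mind when comparing the extracted yield with the true MC count.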

In [None]:
dcm_clean_mc = true_mc_in_recons
#len(dcm_clean_mc)
#sum(true_mc_in_recons)
len(dcm_clean_mc)
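
The fit loop above walks through 15 rapidity and 15 pT windows by adding 0.2 to the previous edge at every step. A sketch of an alternative that builds each edge from an integer index, so that floating-point rounding cannot accumulate from bin to bin; `bin_edges` and its parameters are hypothetical names introduced for this example.

```python
def bin_edges(n_bins=15, width=0.2, start=0.0, ndigits=1):
    """Return [(low, up), ...] for n_bins consecutive bins of equal width."""
    return [(round(start + i * width, ndigits),
             round(start + (i + 1) * width, ndigits))
            for i in range(n_bins)]

edges = bin_edges()
print(edges[0], edges[-1])  # (0.0, 0.2) (2.8, 3.0)

# The upper edge of each bin is exactly the lower edge of the next one.
contiguous = all(edges[i][1] == edges[i + 1][0] for i in range(len(edges) - 1))
print(contiguous)
```

The same list can be reused for both rapidity and pT, which keeps the two axes of the yield table on an identical grid.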

In [None]:
%jsroot off
#lorentzian + second order pol
a = []
pt_y_bin_for_yield_min=[]
pt_y_bin_for_yield_max=[]
y_bin_for_yield_max=[]
y_bin_for_yield_min=[]
true_mc_in_recons =[]



df = df4

for mm in mass_range_min:
    for mmm in range(0,1,1):
        canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
        canvas.Draw()

        binning = [100]
        for b in binning:

            y_bin_low=1
            y_bin_up =1.2
            for i in range(0,1,1):
                
                y_bin_low = truncate(y_bin_low+0.2)
                y_bin_up = truncate(y_bin_up+0.2)
                df_y = df[(df['rapidity']>y_bin_low) & (df['rapidity']<y_bin_up)]
                pt_bin_low =-0.2
                pt_bin_up =0.
                for i in range(0,1,1):
                    pt_bin_low = truncate(pt_bin_low+0.2)
                    #print(pt_bin_low)
                    pt_bin_up = truncate(pt_bin_up+0.2)
                    df_pt = df_y[(df_y['pT']>pt_bin_low) & (df_y['pT']<pt_bin_up)]
                    mc_counts = df_pt[df_pt['issignal']==1].shape[0]
                    #step 0
                    if df_pt.shape[0]>400:
                        data0 = background_selector(df_pt)
                        h0 = ROOT.TH1F("Background","Background without peak",b,mm,fit_limit_low[5])
                        for i in range(0,data0.shape[0]):
                            h0.Fill(data0.iloc[i])
                        fb = TF1("fb","[0]+[1]*x+[2]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        #fb =TF1("fb","[0]+[1]*x+[2]*x*x+[3]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #fb.SetParameters(0,0,0);
                        #fb.SetParameters(0,0,0,0);
                        h0.Fit(fb,"RIEM");
                        par = fb.GetParameters()
                        #Step 1
                        data = df_pt['mass']
                        
                #the minimum x (lower edge of the first bin)=mm        
                        h1 = ROOT.TH1F("B_&_S","rapidity=[%.2f,%.2f] & p_{T}=[%.2f,%.2f] & Min Mass= %.3f & bins=%.0f"%(df_pt['rapidity'].min(),df_pt['rapidity'].max(),df_pt['pT'].min(),df_pt['pT'].max(), mm, b),b,mm,fit_limit_low[5])
                        for i in range(0,data.shape[0]):
                            h1.Fill(data.iloc[i])
                        f1 = TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        #f1=TF1("step1","((0.5)*[0]*0.0014) /((x-1.115683)*(x-1.115683)+ .25*0.0014*0.0014) +[1]+[2]*x+[3]*x*x+[4]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        f1.SetParameters(1,par[0], par[1], par[2]);
                        #f1.SetParameters(1,par[0], par[1], par[2],par[3]);
                        h1.Fit(f1,"RNI");
                        par1 = f1.GetParameters()

                        canvas .Clear ()
                        pad1 = ROOT . TPad (" pad1 "," pad1 " ,0 ,0.3 ,1 ,1)
                        pad1 . Draw ()
                        pad1 . cd ()
                        pad1. Clear()

                #step 2
                        f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #f2 = TF1("full","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1]) +[3]+[4]*x+[5]*x*x+[6]*x*x*x",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3])
                        #f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3], par1[4]);
                        f2.SetNpx(100000);
                        f2.SetParameters(par1[0],0.001,1.115,par1[1], par1[2], par1[3]);
                        f2.SetLineColor(ROOT.kRed)
                        r= ROOT.TFitResultPtr(h1.Fit(f2,"MNIR"))
                        par2 = f2.GetParameters()

                        fs = TF1("fs","((0.5)*[0]*[1]) /((x-[2])*(x-[2])+ .25*[1]*[1])",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        #fs = TF1("fs","[0]*exp(-0.5*((x-[2])/[1])^2)",fit_limit_low[mmm]+mm,fit_limit_low[mmm+3]);
                        fs.SetNpx(100000);
                        fs.SetLineColor(ROOT.kGreen)
                        fb.SetLineStyle(4)
                        fb.SetLineColor(ROOT.kBlue)
                        fb.SetNpx(100000);
                        fs.SetParameters(par2[0],par2[1],par2[2]);
                        # f2 has six parameters (0-5); the background part is parameters 3-5
                        fb.SetParameters(par2[3],par2[4],par2[5]);


                        h1.SetTitleOffset(0)
                        h1.SetFillStyle(3003);
                        h1.SetLineWidth(2)
                        h1.SetStats (0)
                        h1.SetYTitle("Entries")
                        h1.SetLineColor(ROOT.kBlack)
                        h1.GetYaxis().SetTitle("Counts")
                        # Give each histogram its own name so ROOT does not replace an existing object
                        h2 = ROOT.TH1F("h2", "", b, mm, 1.23);
                        h3 = ROOT.TH1F("h3", "", b, mm, 1.23);
                        h4 =  ROOT.TH1F("h4", "", b, mm, 1.23)
                        h5 = ROOT.TH1F("h5", "", b, mm, 1.23);
                        h3.SetLineWidth(2)
                        h3.SetStats (0)
                        h3.GetXaxis().SetTitle("Mass [GeV/#it{c}^{2}]")

                        h1.Draw("pe")
                        h_mc.SetLineColor(ROOT.kMagenta)
                        h_mc.SetLineWidth(2)
                        #h_mc.Draw("SAMEpe")
                        fs.Draw("SAME")
                        fb.Draw("SAME")
                        f2.Draw("SAME")
                        bin1 = h1.FindBin(fit_limit_low[mmm]+mm);
                        bin2 = h1.FindBin(fit_limit_low[mmm+3]);
                        for i in range(bin1,bin2):
                            f_value= f2.Eval(h1.GetBinCenter(i));
                            fs_values = fs.Eval(h3.GetBinCenter(i))
                            t_value = h1.GetBinContent(i)
                            t_value_mc = h_mc.GetBinContent(i)
                            h2.SetBinContent(i,f_value)
                            h4.SetBinContent(i,fs_values)
                            if (h1.GetBinError(i) > 0):
                                h3.SetBinContent(i,(t_value-f_value)/h1.GetBinError(i))
                            if (h_mc.GetBinError(i) > 0):
                                h5.SetBinContent(i,(t_value_mc-fs_values)/h_mc.GetBinError(i))


                        h2.Sumw2()
                        #h4.Sumw2()
                        h5.SetLineColor(ROOT.kBlue)
                        h5.SetLineWidth(2)

                        integral_min = par2[2] - (TMath.Abs(3*par2[1]));
                        integral_max = par2[2] + (TMath.Abs(3*par2[1]));

                        binwidth = h1.GetXaxis().GetBinWidth(1);
                        tot = f2.Integral(integral_min,integral_max)/binwidth;
                        sigma_integral = f2.IntegralError(integral_min,integral_max);
                        #signal_under_peak = par2[0] * np.sqrt(2*3.1415*par2[1]*par2[1])/binwidth
                        signal_under_peak = fs.Integral(integral_min,integral_max)/binwidth
                        if signal_under_peak<0:
                            signal_under_peak = 0
                            print('Negative signal')                
                        sigma_signal_under_peak = fs.IntegralError(integral_min,integral_max);
                        man_sigma_signal_under_peak = TMath.Sqrt(signal_under_peak)
                        if sigma_signal_under_peak!=0:
                            print("Integral errors ",sigma_signal_under_peak)

                        tot_sig_3_sigma= tot_sig_3_sigma+signal_under_peak
                    #Background
                        backgnd_under_peak = (fb.Integral(integral_min,integral_max)/binwidth)
                        if backgnd_under_peak<0:
                            print('Negative background')
                        sigma_backgnd_under_peak = fb.IntegralError(integral_min,integral_max);
                        tot_bac_3_sigma = tot_bac_3_sigma+backgnd_under_peak
                    #Significance = signal/(signal+background)^0.5
                        Significance = signal_under_peak/TMath.Sqrt(tot);
                        #print("total - background = ",tot-backgnd_under_peak)
                        #3.5 sigma
                        signal_under_peak_3_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        bac_under_peak_3_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])))/binwidth);
                        tot_sig_3_point_5_sigma = tot_sig_3_point_5_sigma+signal_under_peak_3_point_5_sigma
                        tot_bac_3_point_5_sigma = tot_bac_3_point_5_sigma + bac_under_peak_3_point_5_sigma

                        sigma_signal_under_peak_3_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(3.5*par2[1])),par2[2] + (TMath.Abs(3.5*par2[1])));
                        man_sigma_signal_under_peak_3_point_5_sigma = TMath.Sqrt(signal_under_peak_3_point_5_sigma)

                        signal_under_peak_2_point_5_sigma = (fs.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        bac_under_peak_2_point_5_sigma = (fb.Integral(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])))/binwidth);
                        tot_sig_2_point_5_sigma = tot_sig_2_point_5_sigma+signal_under_peak_2_point_5_sigma
                        tot_bac_2_point_5_sigma = tot_bac_2_point_5_sigma + bac_under_peak_2_point_5_sigma

                        sigma_signal_under_peak_2_point_5_sigma = fs.IntegralError(par2[2] - (TMath.Abs(2.5*par2[1])),par2[2] + (TMath.Abs(2.5*par2[1])));
                        man_sigma_signal_under_peak_2_point_5_sigma = TMath.Sqrt(signal_under_peak_2_point_5_sigma)


                        std = par2 [1]
                        estd = f2.GetParError(1)
                        
                        latex = ROOT . TLatex ()
                        latex . SetNDC ()
                        latex . SetTextSize (0.02)
                        latex . DrawLatex (0.4 ,0.85, "Significance in m_{0} #pm 2.5#Gamma  = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_2_point_5_sigma, man_sigma_signal_under_peak_2_point_5_sigma, signal_under_peak_2_point_5_sigma,bac_under_peak_2_point_5_sigma,signal_under_peak_2_point_5_sigma/TMath.Sqrt(bac_under_peak_2_point_5_sigma+signal_under_peak_2_point_5_sigma) ))
                        latex . DrawLatex (0.4 ,0.80, "Significance in m_{0} #pm 3#Gamma = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak,man_sigma_signal_under_peak, signal_under_peak,backgnd_under_peak,Significance ))
                        latex . DrawLatex (0.4 ,0.75, "Significance in m_{0} #pm 3.5#Gamma = #frac{%.1f #pm %.1f}{#sqrt{%.1f+%.1f}} = %.1f"%(signal_under_peak_3_point_5_sigma,man_sigma_signal_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma,bac_under_peak_3_point_5_sigma,signal_under_peak_3_point_5_sigma/TMath.Sqrt(signal_under_peak_3_point_5_sigma+bac_under_peak_3_point_5_sigma) ))
                        latex . DrawLatex (0.4 ,0.70, " #Gamma = %.4f #pm %.5f GeV"%(par2 [1],f2.GetParError(1) ))
                        latex . DrawLatex (0.4 ,0.65, " m_{0} = %.4f #pm %.5f GeV"%(par2 [2],f2.GetParError(2) ))
                        latex . DrawLatex (0.4 ,0.6," #frac{#chi^{2}}{ndf} = %.1f/%d = %.4f"%(f2.GetChisquare() , f2.GetNDF() , f2.GetChisquare() / f2.GetNDF() ))
                        latex . DrawLatex (0.4 ,0.55," True signal (MC=1) = %.f"%(mc_counts))
                        
                        latex1 = ROOT . TLatex ()
                        latex1 . SetNDC ()
                        latex1 . SetTextSize (0.035)
                        latex1. DrawLatex (0.4 ,0.25, "CBM performance")
                        latex1. DrawLatex (0.4 ,0.15, "URQMD, Au+Au @ 12#it{A} GeV/#it{c}")
                        latex1.Draw()

                        
                        legend = ROOT.TLegend(0.87,0.3,0.6,0.6);
                        legend.AddEntry(h1,"Invariant mass of lambda","l");
                        legend.AddEntry(f2,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}+B+Cx+Dx^{2}","l");
                        legend.AddEntry(fs,"A #frac{0.5 #Gamma}{(m-m_{0})^{2} + 0.25#Gamma^{2}}","l");
                        legend.AddEntry(fb,"B+Cx+Dx^{2}","l");
                        legend . SetLineWidth (0)
                        legend.Draw()

                        canvas . cd ()
                        pad2 = ROOT . TPad (" pad2 "," pad2 " ,0 ,0.05 ,1 ,0.3)
                        pad2 . Draw ()
                        pad2 . cd ()
                        pad2.Clear()


                        h3.SetLineColor(TColor.GetColor(5))
                        h3.SetYTitle("(d-f)/#Deltad")
                        #h5.Draw()
                        h3.Draw("SAME")
                        line = ROOT . TLine (mm,0 ,1.2 ,0)
                        line . SetLineColor ( ROOT . kRed )
                        line . SetLineWidth (2)
                        line . Draw (" same ")


                        pad1 . SetBottomMargin (0)
                        pad2 . SetTopMargin (0)
                        pad2 . SetBottomMargin (0.25)

                        h1 . GetXaxis (). SetLabelSize (0)
                        #h1.SetTitle("")
                        h1 . GetXaxis (). SetTitleSize (0)
                        h1 . GetYaxis (). SetTitleSize (0.05)
                        h1 . GetYaxis (). SetLabelSize (0.03)
                        h1 . GetYaxis (). SetTitleOffset (0.6)

                        h3 . SetTitle ("")
                        h3 . GetXaxis (). SetLabelSize (0.12)
                        h3 . GetXaxis (). SetTitleSize (0.12)
                        h3 . GetYaxis (). SetLabelSize (0.1)
                        h3 . GetYaxis (). SetTitleSize (0.15)
                    #ratio . GetYaxis (). SetTitle (" Data /MC")
                        h3 . GetYaxis (). SetTitleOffset (0.17)
                    #207,512 divisions
                        h3 . GetYaxis (). SetNdivisions (207)
                        #h_mc.GetYaxis (). SetRangeUser (0.5 ,700)
                        h1 . GetYaxis (). SetRangeUser (0.5 ,400)
                        h3 . GetXaxis (). SetRangeUser (1.08 ,1.2)
                        h1 . GetXaxis (). SetRangeUser (1.08 ,1.2)
                        h1 .GetYaxis().SetNdivisions(107)
                        h3 . GetXaxis (). SetNdivisions (207)

                        canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/c.png")

            #a.append(tot_sig_2_point_5_sigma)
                        a.append(signal_under_peak)
                        y_bin_for_yield_min.append(truncate(y_bin_low+0.2))
                        y_bin_for_yield_max.append(truncate(y_bin_up+0.2))
                        pt_y_bin_for_yield_min.append(pt_bin_low)
                        pt_y_bin_for_yield_max.append(pt_bin_up)
                        true_mc_in_recons.append(mc_counts)
                    else:
                        a.append(0)
                        y_bin_for_yield_min.append(truncate(y_bin_low+0.2))
                        y_bin_for_yield_max.append(truncate(y_bin_up+0.2))
                        pt_y_bin_for_yield_min.append(pt_bin_low)
                        pt_y_bin_for_yield_max.append(pt_bin_up)
                        true_mc_in_recons.append(mc_counts)
            #a.append(tot_sig_3_point_5_sigma)
#canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.pdf ]")       
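
The significance drawn on the canvas above is $S/\sqrt{S+B}$, with the hand-computed signal uncertainty taken as $\sqrt{S}$. A minimal standalone sketch of that arithmetic follows; the function names `significance` and `poisson_error` and the example yields are illustrative assumptions, not part of the fit code.

```python
import math


def significance(signal, background):
    """Return S / sqrt(S + B), or 0.0 when the mass window is empty."""
    total = signal + background
    if total <= 0:
        return 0.0
    return signal / math.sqrt(total)


def poisson_error(counts):
    """Counting uncertainty sqrt(N) for a non-negative yield."""
    return math.sqrt(counts) if counts > 0 else 0.0


# Hypothetical yields in the m0 +/- 2.5 Gamma window (not taken from the fit).
s, b = 900.0, 700.0
print(f"S = {s:.1f} +/- {poisson_error(s):.1f}, "
      f"significance = {significance(s, b):.1f}")
```

Guarding the empty window avoids a division by zero in bins where the fit finds neither signal nor background.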

In [None]:
a
#mc_counts
#sum(a)
#sum(true_mc_in_recons)

In [None]:
df_pt[df_pt['issignal']==1].shape[0]

In [None]:
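import uproot
# Simulated (MC) Lambda yields as a 2D rapidity-pT histogram.
file = uproot.open("lambda_qa_dcm.root")
# to_numpy() returns a tuple: (bin contents, x edges, y edges).
array1 = file["SimParticles_McLambda/SimParticles_rapidity_SimParticles_pT_McLambda"].to_numpy()
#for i in range(0,14,1):
array1[0][0]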

In [None]:
size = 15*15
pt_y_yields = pd.DataFrame(data=np.arange(0,size,1),columns = ['numbering'])
pt_y_yields['rapidity_min_MC'] = np.zeros(size)
pt_y_yields['pT_min_MC'] = np.zeros(size)

pt_y_yields['ratio_recons_sim']=np.zeros(size)
pt_y_yields['ratio_recons_mc']=np.zeros(size)
pt_y_yields['pT_min'] = np.zeros(size)
pt_y_yields ['pt_y_yields_MC']=np.zeros(size)
pt_y_yields['pt_y_yields_recons']=a
pt_y_yields['true_mc_in_recons'] = true_mc_in_recons
pt_y_yields['total_mc_in_recons'] = dcm_clean_mc

# Flat row index k = i + 15*j, where i is the pT bin and j the rapidity bin (bin width 0.2).
# Whole columns are assigned at once, which avoids chained .iloc assignment.
n_bins = 15
# Lower rapidity edge: constant within each block of 15 rows (0.0, 0.2, ..., 2.8).
pt_y_yields['rapidity_min_MC'] = np.repeat(np.arange(n_bins) * 0.2, n_bins)
# Lower pT edge: cycles through 0.0, 0.2, ..., 2.8 inside each rapidity block.
pt_y_yields['pT_min_MC'] = np.tile(np.arange(n_bins) / 5, n_bins)
# Simulated yields: element [j][i] of the 2D histogram contents, flattened row by row.
pt_y_yields['pt_y_yields_MC'] = np.asarray(array1[0], dtype=float)[:n_bins, :n_bins].ravel()

# Per-bin ratios computed column-wise; an empty denominator yields inf/NaN instead of raising.
yields_recons = np.asarray(a, dtype=float)
pt_y_yields['ratio_recons_mc'] = yields_recons / pt_y_yields['true_mc_in_recons'].to_numpy(dtype=float)
pt_y_yields['ratio_recons_sim'] = yields_recons / pt_y_yields['pt_y_yields_MC'].to_numpy(dtype=float)
pt_y_yields['pT_min'] = pt_y_bin_for_yield_min
#print("%.2f"%pt_y_yields['rapidity_min_MC'].iloc[i],"       ",pt_y_yields['pT_min_MC'].iloc[i],"    ", pt_y_yields['ratio'].iloc[i] )
#plt.plot(pt_y_yields['numbering'], pt_y_yields['ratio_recons_sim'], label='Reconstructed/Sim')
plt.plot(pt_y_yields['numbering'], pt_y_yields['ratio_recons_mc'], label='Reconstructed/MC')
plt.legend()
plt.ylim([0.9,1.1])
plt.savefig("hists")
#pt_y_yields[(pt_y_yields['rapidity_min_MC']>1) & (pt_y_yields['rapidity_min_MC']<1.4) &(pt_y_yields['pT_min_MC']<1)&(pt_y_yields['pT_min_MC']>0)]
pt_y_yields[(pt_y_yields['numbering']>100) & (pt_y_yields['numbering']<120)]
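
The yield table above stores the 15 x 15 rapidity-pT grid in one flat column, and the ratios divide by counts that can be zero in sparsely populated bins. The sketch below shows how a flat index decodes back into bin edges and how a ratio can be guarded against empty bins. The helper names `bin_edges_from_index` and `safe_ratio` and the sample numbers are hypothetical; only the bin count and width are taken from the cell above.

```python
import math

N_BINS = 15      # bins per axis, as in the 15 x 15 table
BIN_WIDTH = 0.2  # bin width in rapidity and in pT (GeV/c)


def bin_edges_from_index(k):
    """Decode flat index k = i + N_BINS * j into (rapidity_min, pT_min)."""
    j, i = divmod(k, N_BINS)
    return j * BIN_WIDTH, i * BIN_WIDTH


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is empty."""
    return numerator / denominator if denominator else math.nan


# Hypothetical reconstructed and true yields for three bins.
recons = [0.0, 45.0, 30.0]
truth = [0, 50, 40]
ratios = [safe_ratio(r, t) for r, t in zip(recons, truth)]
print(bin_edges_from_index(16), ratios)
```

Returning NaN for empty bins keeps them distinguishable from bins with a genuine ratio of zero when the table is plotted or summed.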

In [None]:
h4 = ROOT.TH2F("recons", "recons", 15,0,3,15,0,3);
h5 = ROOT.TH2F("Mc", "Mc", 15,0,3,15,0,3);
h6 = ROOT.TH2F("Mc in reconstructed", "Mc in reconstructed", 15,0,3,15,0,3);
h7 = ROOT.TH2F("DCM Efficiency", "DCM Efficiency", 15,0,3,15,0,3);
h8 = ROOT.TH2F("total mc in reconstructed", "total mc in reconstructed", 15,0,3,15,0,3);


h4.SetStats(0)
h5.SetStats(0)
h6.SetStats(0)

c = ROOT . TCanvas (" canvas ","", 950,800)
c.Draw()
bin1 = h4.FindBin(0);
bin2 = h4.FindBin(3);
for i in range(225):  # all 15*15 flat bins, including index 0 (y = 0, pT = 0)
    #recons.SetBinContent( (pt_y_yields1['rapidity_min'].iloc[i]), (pt_y_yields1['pT_min'].iloc[i]) ,pt_y_yields1['pt_y_yields'].iloc[i])
    y= (pt_y_yields['rapidity_min_MC'].iloc[i])
    pT=(pt_y_yields['pT_min_MC'].iloc[i])
    y_bin = int((y+0.1)/0.2 + 1);
    pT_bin = int((pT+0.1)/0.2 + 1);
    h4.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_recons'].iloc[i]);
    h5.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_MC'].iloc[i]);
    h6.SetBinContent(y_bin, pT_bin, pt_y_yields['true_mc_in_recons'].iloc[i]);
    h7.SetBinContent(y_bin, pT_bin, a[i]);
    h8.SetBinContent(y_bin, pT_bin, pt_y_yields['total_mc_in_recons'].iloc[i]);

c.SetGrid()
#h4.Draw('colz')

#h5.Draw('colz')
#hist_2d.Draw('colz')
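# TH1::Divide works in place: h4 now holds recons / total MC in reconstructed; the return value is only a success flag.
ratio_recons_to_recons_mc=h4.Divide(h8)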

#h6.Draw('colz')
#ratio_recons_to_mc=h4.Divide(h5)
h4.Draw('colz')

h4.GetZaxis().SetRangeUser (0 ,0.8)
h4.SetTitleOffset(-1)
latex = ROOT . TLatex ()
latex . SetNDC ()
latex . SetTextSize (0.039)
#latex . DrawLatex (0.4 ,0.7, "#Lambda_{Reconstructed} / #Lambda_{MC} =  %.f / %.f = %.3f"%(sum(a),df4[df4['issignal']>0].shape[0], sum(a) / (df4[df4['issignal']>0].shape[0])))
latex . DrawLatex (0.12 ,0.68, "ML algorithm Efficiency = #Lambda_{Reconstructed} / #Lambda_{Reconstructable} " )

latex.Draw()

latex1 = ROOT . TLatex ()
latex1 . SetNDC ()
latex1 . SetTextSize (0.035)
latex1. DrawLatex (0.12 ,0.84, "CBM performance")
latex1. DrawLatex (0.12 ,0.76, "DCM-QGSM-SMM, Au+Au @ 12#it{A} GeV/#it{c}")
latex1 . DrawLatex (0.45 ,.61, "= %.f / %.f = %.3f"%(sum(a),sum(pt_y_yields ['total_mc_in_recons']), sum(a) / (sum(pt_y_yields ['total_mc_in_recons']))))
latex1.Draw()


h4 . SetTitle ("")
h4 .GetXaxis().SetTitle("#it{y}_{Lab}")
h4. GetXaxis().SetTitleSize(0.06)
h4 .GetXaxis().SetTitleOffset(0.7)
h4 .GetXaxis().SetLabelSize(0.05)
h4 .GetYaxis().SetTitle("p_{T} (GeV/#it{c})")
h4. GetYaxis().SetTitleSize(0.06)
h4 .GetYaxis().SetTitleOffset(0.7)
h4 .GetYaxis().SetLabelSize(0.05)
h4 .GetZaxis().SetLabelSize(0.05)

c.SetRightMargin(0.13);
c. Update()
c . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.png")
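
The fill loop above converts a bin's lower edge into a 1-based ROOT bin number with `int((y+0.1)/0.2 + 1)`. The half-bin offset of 0.1 moves the value to the bin centre before truncation, which protects against floating-point round-off at the edges. A small pure-Python sketch of that mapping follows; `root_bin` is a hypothetical helper name and the axis is assumed to start at 0.

```python
BIN_WIDTH = 0.2  # bin width used for both axes


def root_bin(lower_edge, width=BIN_WIDTH):
    """1-based bin number of the bin whose lower edge is lower_edge."""
    return int((lower_edge + 0.5 * width) / width + 1)


# Lower edges of the 15 bins on [0, 3) map to bins 1..15.
edges = [j * BIN_WIDTH for j in range(15)]
bins = [root_bin(e) for e in edges]
print(bins)

# Without the half-bin offset, round-off can push an edge into the bin below.
naive = int(0.6 / 0.2) + 1
print(naive, root_bin(0.6))
```

The last line illustrates the point of the offset: `0.6 / 0.2` evaluates to slightly less than 3 in binary floating point, so plain truncation lands one bin too low.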

In [None]:
h8 = ROOT.TH2F("recons_urqmd", "recons_urqmd", 15,0,3,15,0,3);
h9 = ROOT.TH2F("Mc_urqmd", "Mc_urqmd", 15,0,3,15,0,3);
h10 = ROOT.TH2F("Mc in reconstructed_urqmd", "Mc in reconstructed_urqmd", 15,0,3,15,0,3);
h11 = ROOT.TH2F("urqmd_Efficiency", "Efficiency", 15,0,3,15,0,3);
h11.SetStats(0)
h9.SetStats(0)
h10.SetStats(0)

canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
canvas.Draw()
bin1 = h8.FindBin(0);
bin2 = h8.FindBin(3);
for i in range(225):  # all 15*15 flat bins, including index 0 (y = 0, pT = 0)
    #recons.SetBinContent( (pt_y_yields1['rapidity_min'].iloc[i]), (pt_y_yields1['pT_min'].iloc[i]) ,pt_y_yields1['pt_y_yields'].iloc[i])
    y= (pt_y_yields['rapidity_min_MC'].iloc[i])
    pT=(pt_y_yields['pT_min_MC'].iloc[i])
    y_bin = int((y+0.1)/0.2 + 1);
    pT_bin = int((pT+0.1)/0.2 + 1);
    h8.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_recons'].iloc[i]);
    h9.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_MC'].iloc[i]);
    h10.SetBinContent(y_bin, pT_bin, pt_y_yields['true_mc_in_recons'].iloc[i]);
    h11.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_recons'].iloc[i]);

canvas.SetGrid()
#h4.Draw('colz')

#h5.Draw('colz')
#hist_2d.Draw('colz')
ratio_recons_to_recons_mc=h11.Divide(h9)

#h6.Draw('colz')
#ratio_recons_to_mc=h4.Divide(h5)
h11.Draw('colz')

h11.GetZaxis().SetLabelSize (0.02)

h11.SetTitleOffset(-1)
latex = ROOT . TLatex ()
latex . SetNDC ()
latex . SetTextSize (0.03)
#latex . DrawLatex (0.4 ,0.7, "#Lambda_{Reconstructed} / #Lambda_{MC} =  %.f / %.f = %.3f"%(sum(a),df4[df4['issignal']>0].shape[0], sum(a) / (df4[df4['issignal']>0].shape[0])))
latex . DrawLatex (0.3 ,0.7, "#Lambda_{Reconstructed} / #Lambda_{Simulated} =  %.f / %.f = %.3f"%(sum(a),sum(pt_y_yields ['pt_y_yields_MC']), sum(a) / (sum(pt_y_yields ['pt_y_yields_MC']))))
latex . DrawLatex (0.3 ,0.6, "URQMD")

latex.Draw()


h11 . SetTitle ("")
h11 .GetXaxis().SetTitle("y_{Lab}")
h11 .GetXaxis().SetTitleOffset(0)
h11 .GetYaxis().SetTitle("p_{T} GeV/c")
canvas . Print ("/home/shahid/cbmsoft/Cut_optimization/uncut_data/Project/pT_rapidity_distribution_XGB_extracted_signal.png")

In [None]:
from ROOT import TFile, TTree
from array import array
from ROOT import std

f = TFile('new_urqmd_efficiency_pt_y_yield_bdt_cut_0.9.root','recreate')
t = TTree('t1','tree')


h8 = ROOT.TH2F("recons_urqmd", "recons_urqmd", 15,0,3,15,0,3);
h9 = ROOT.TH2F("Mc_urqmd", "Mc_urqmd", 15,0,3,15,0,3);
h10 = ROOT.TH2F("Mc in reconstructed_urqmd", "Mc in reconstructed_urqmd", 15,0,3,15,0,3);
h11 = ROOT.TH2F("urqmd_Efficiency", "Efficiency", 15,0,3,15,0,3);
h8.SetStats(0)
h9.SetStats(0)
h10.SetStats(0)


bin1 = h8.FindBin(0);
bin2 = h8.FindBin(3);
for i in range(225):  # all 15*15 flat bins, including index 0 (y = 0, pT = 0)
    #recons.SetBinContent( (pt_y_yields1['rapidity_min'].iloc[i]), (pt_y_yields1['pT_min'].iloc[i]) ,pt_y_yields1['pt_y_yields'].iloc[i])
    y= (pt_y_yields['rapidity_min_MC'].iloc[i])
    pT=(pt_y_yields['pT_min_MC'].iloc[i])
    y_bin = int((y+0.1)/0.2 + 1);
    pT_bin = int((pT+0.1)/0.2 + 1);
    h8.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_recons'].iloc[i]);
    h9.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_MC'].iloc[i]);
    h10.SetBinContent(y_bin, pT_bin, pt_y_yields['true_mc_in_recons'].iloc[i]);
    h11.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_recons'].iloc[i]);
    


#h4.Draw('colz')

#h5.Draw('colz')
#hist_2d.Draw('colz')
#ratio_recons_to_recons_mc=h8.Divide(h9)

#h6.Draw('colz')
ratio_recons_to_mc=h11.Divide(h9)
#h8.Draw('colz')





h8 . SetTitle ("")
h8 .GetXaxis().SetTitle("y_{Lab}")
h8 .GetXaxis().SetTitleOffset(0)
h8 .GetYaxis().SetTitle("p_{T} GeV/c")

f.Write()
f.Close()

In [None]:
from ROOT import TFile, TTree
from array import array
from ROOT import std

f = TFile('new_dcm_100_efficiency_pt_y_yield_bdt_cut_0.9.root','recreate')
t = TTree('t1','tree')


h4 = ROOT.TH2F("recons", "recons", 15,0,3,15,0,3);
h5 = ROOT.TH2F("Mc", "Mc", 15,0,3,15,0,3);
h6 = ROOT.TH2F("Mc in reconstructed", "Mc in reconstructed", 15,0,3,15,0,3);
h7 = ROOT.TH2F("Efficiency", "Efficiency", 15,0,3,15,0,3);
h8 = ROOT.TH2F("reconstructable_mc", "reconstructable_mc", 15,0,3,15,0,3);

bin1 = h4.FindBin(0);
bin2 = h4.FindBin(3);
for i in range(225):  # all 15*15 flat bins, including index 0 (y = 0, pT = 0)
    #recons.SetBinContent( (pt_y_yields1['rapidity_min'].iloc[i]), (pt_y_yields1['pT_min'].iloc[i]) ,pt_y_yields1['pt_y_yields'].iloc[i])
    y= (pt_y_yields['rapidity_min_MC'].iloc[i])
    pT=(pt_y_yields['pT_min_MC'].iloc[i])
    y_bin = int((y+0.1)/0.2 + 1);
    pT_bin = int((pT+0.1)/0.2 + 1);
    h4.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_recons'].iloc[i]);
    h5.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_MC'].iloc[i]);
    h6.SetBinContent(y_bin, pT_bin, pt_y_yields['true_mc_in_recons'].iloc[i]);
    h7.SetBinContent(y_bin, pT_bin, pt_y_yields['pt_y_yields_recons'].iloc[i]);
    h8.SetBinContent(y_bin, pT_bin, dcm_clean_mc[i]);
    


#h4.Draw('colz')

#h5.Draw('colz')
#hist_2d.Draw('colz')
#ratio_recons_to_recons_mc=h4.Divide(h5)

#h6.Draw('colz')
ratio_recons_to_mc=h7.Divide(h5)




f.Write()
f.Close()

In [None]:

inFile = ROOT.TFile.Open("lambda_qa_urqmd.root", "READ")
inFile.ls()

In [None]:

hist_2d = inFile.Get("SimParticles_McLambda/SimParticles_rapidity_SimParticles_pT_McLambda")

In [None]:
inFile.Print()

In [None]:
h4 = ROOT.TH2F("recons", "recons", 15,0,3,15,0,3);
h4.SetStats(0)
canvas = ROOT . TCanvas (" canvas ","", 1200,1000)
canvas.Draw()

for i in range(1,225):
    #recons.SetBinContent( (pt_y_yields1['rapidity_min'].iloc[i]), (pt_y_yields1['pT_min'].iloc[i]) ,pt_y_yields1['pt_y_yields'].iloc[i])
    y= (df4['rapidity'].iloc[i])
    pT=(df4['pT'].iloc[i])
    y_bin = int((y+0.1)/0.2 + 1);
    pT_bin = int((pT+0.1)/0.2 + 1);
    h4.SetBinContent(y_bin, pT_bin, df4['issignal'].iloc[i]);
    
    
    




h4.Draw('colz')

In [None]:
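import matplotlib as mpl
# bins1 is otherwise only defined in a later cell: 15 bins of width 0.2 on [0, 3].
bins1 = np.linspace(0, 3, 16)
signal = df4[df4['issignal'] == 1]
fig, axs = plt.subplots(figsize=(12, 10))
h = axs.hist2d(signal['rapidity'], signal['pT'], bins=(bins1, bins1), norm=mpl.colors.LogNorm())
cbar = fig.colorbar(h[3], ax=axs)
# Save before plt.show(): some backends release the figure once it has been shown.
fig.savefig('hists')
plt.show()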

In [None]:
bins1 = np.linspace(0,3,16)
bins1

In [None]:
from scipy.stats import binned_statistic as b_s
bin_means, bin_edges, binnumber = b_s(df[variable_xaxis],df[variable_yaxis], statistic='mean', bins=non_uniform_binning)
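
The call above averages one variable in bins of another, using non-uniform bin edges. As a rough illustration of what a binned mean computes, here is a standard-library sketch. The function `binned_mean`, its edge conventions (half-open bins with a closed last bin, out-of-range points ignored, NaN for empty bins) and the sample data are assumptions chosen for the example, not a reimplementation of SciPy's internals.

```python
import math
from bisect import bisect_right


def binned_mean(x, values, edges):
    """Mean of values in each bin of x; NaN for empty bins.

    Bins are half-open [edge_k, edge_{k+1}) except the last one, which
    also includes its upper edge. Points outside the edges are ignored.
    """
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for xi, vi in zip(x, values):
        if xi < edges[0] or xi > edges[-1]:
            continue
        # Index of the bin containing xi; clamp the top edge into the last bin.
        k = min(bisect_right(edges, xi) - 1, n_bins - 1)
        sums[k] += vi
        counts[k] += 1
    return [s / c if c else math.nan for s, c in zip(sums, counts)]


# Hypothetical non-uniform binning and data.
edges = [0.0, 0.5, 1.0, 3.0]
x = [0.1, 0.4, 0.7, 2.0, 3.0]
v = [1.0, 3.0, 5.0, 2.0, 4.0]
means = binned_mean(x, v, edges)
print(means)
```

Wider bins at large values, as in `edges` here, are a common way to keep enough entries per bin where the distribution is sparse.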
