# Tutorial 1: Basics

This tutorial shows how to use this software from your own Python project or Jupyter notebook.
There is also a command line interface that lets you do the same with just two commands.

**NOTE FOR CONTRIBUTORS: Always clear all output before committing (``Cell`` > ``All Output`` > ``Clear``)**!

In [None]:
# Magic
%matplotlib inline
# Reload modules whenever they change
%load_ext autoreload
%autoreload 2

# Make bclustering package available even without installation
import sys
sys.path = ["../../"] + sys.path

In [None]:
import numpy as np
import flavio
import functools

In [None]:
import bclustering.physics.models.bdlnu.distribution as bdlnu

## Scanning

### Setting it up

In [None]:
from bclustering.scan import Scanner

Let's set up a scanner object and configure it.

In [None]:
s = Scanner()

First we set up the function/distribution that we want to consider. Here we look at the differential decay rate with respect to $q^2$ of $B\longrightarrow D \tau \bar\nu_\tau$. This is implemented in ``bdlnu.dGq2``:

In [None]:
# Using our own implementation

s.set_dfunction(
    bdlnu.dGq2,
    binning=np.linspace(bdlnu.q2min, bdlnu.q2max, 3),
    normalize=True
)
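
To make the ``binning`` and ``normalize`` arguments more concrete, here is a minimal sketch of what binning and normalizing a distribution can look like. It is not the scanner's actual implementation: ``toy_distribution`` and the numerical limits are hypothetical stand-ins, and the assumption is that each bin content is the integral of the distribution over that bin, divided by the sum over all bins.

```python
import numpy as np

# Hypothetical kinematic limits, for illustration only;
# in the tutorial these come from bdlnu.q2min and bdlnu.q2max.
q2min, q2max = 3.0, 11.0


def toy_distribution(q2):
    """Toy stand-in for a differential distribution (not the physics)."""
    return (q2 - q2min) * (q2max - q2)


# 3 equally spaced edges define 2 bins.
edges = np.linspace(q2min, q2max, 3)

# Integrate the toy distribution over each bin with the trapezoidal rule.
bin_contents = []
for lo, hi in zip(edges[:-1], edges[1:]):
    x = np.linspace(lo, hi, 201)
    y = toy_distribution(x)
    bin_contents.append(np.sum((y[1:] + y[:-1]) / 2 * np.diff(x)))
bin_contents = np.array(bin_contents)

# Normalizing makes the bin contents sum to 1.
normalized = bin_contents / bin_contents.sum()
```

Note that ``np.linspace(a, b, 3)`` returns three bin edges, so the scan above uses two bins in $q^2$.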

In [None]:
# Uncomment to use flavio's implementation

# def dBrdq2(w, q):
#     return flavio.sm_prediction("dBR/dq2(B+->Dtaunu)", q) + \
#         flavio.np_prediction("dBR/dq2(B+->Dtaunu)", w, q)

# s.set_dfunction(
#     dBrdq2,
#     binning=np.linspace(bdlnu.q2min, bdlnu.q2max, 3),
#     normalize=True
# )

Next, let's set up the Wilson coefficients (alias "benchmark points") that need to be sampled. The Wilson coefficients are implemented using the Wilson package (https://wilson-eft.github.io/), which supports a variety of bases and EFTs and matches them to user-specified scales.
Using the example of $B\longrightarrow D \tau \bar\nu_\tau$, we sample the coefficients ``CVL_bctaunutau``, ``CSL_bctaunutau`` and ``CT_bctaunutau`` from the ``flavio`` basis with 2 points each between $-1$ and $1$:

In [None]:
s.set_wpoints_equidist(
    {
        "CVL_bctaunutau": (-1, 1, 2),
        "CSL_bctaunutau": (-1, 1, 2),
        "CT_bctaunutau": (-1, 1, 2)
    },
    scale=5,
    eft='WET',
    basis='flavio'
)
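
The following sketch illustrates what an equidistant grid of benchmark points looks like. It assumes that each tuple is read as (minimum, maximum, number of points), in the same way as the arguments of ``np.linspace``, and that the grid is the Cartesian product of the individual axes; the variable names are only for illustration and are not part of the package.

```python
import itertools

import numpy as np

# Same specification as above, read as (minimum, maximum, number of points).
ranges = {
    "CVL_bctaunutau": (-1, 1, 2),
    "CSL_bctaunutau": (-1, 1, 2),
    "CT_bctaunutau": (-1, 1, 2),
}

# One equidistant axis per coefficient.
axes = {name: np.linspace(lo, hi, n) for name, (lo, hi, n) in ranges.items()}

# Cartesian product of all axes: one dictionary per benchmark point.
grid = [
    dict(zip(axes, values))
    for values in itertools.product(*axes.values())
]

n_points = len(grid)  # 2 * 2 * 2 = 8 benchmark points
```

Under these assumptions the number of benchmark points grows as the product of the points per coefficient, so finer grids quickly become expensive to scan.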

### Running it

In [None]:
from bclustering.data.dwe import DataWithErrors

In [None]:
d = DataWithErrors()

In [None]:
s.run(d)

The results are saved in a dataframe, ``d.df``. Let's have a look:

In [None]:
d.df.head()

The configuration of the scanner is saved in a metadata object, which is a nested dictionary of config items.
As an example, this lets us quickly look up the number of bins in $q^2$ later:

In [None]:
d.md["scan"]

The metadata also contains information about the source code version you're using (git hash, commit messages, etc.).
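
Working with a nested dictionary of this kind can be sketched as follows. The layout and all keys below are hypothetical and chosen only for illustration; they need not match what the scanner actually stores in ``d.md``.

```python
# Hypothetical metadata layout; these keys are for illustration only
# and need not match what the scanner actually stores.
md = {
    "scan": {
        "dfunction": {"name": "dGq2", "nbins": 2},
        "wpoints": {"scale": 5, "eft": "WET", "basis": "flavio"},
    },
    "git": {"hash": "<commit hash>"},
}

# Nested items are reached by chaining keys.
nbins = md["scan"]["dfunction"]["nbins"]


def flatten(nested, prefix=""):
    """Turn a nested dict into {"a/b/c": value} pairs for quick inspection."""
    flat = {}
    for key, value in nested.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


flat = flatten(md)
```

Flattening is just one convenient way to get an overview of all stored items at once.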

### Output files

Now it's time to write out the results for later use.

In [None]:
d.md

In [None]:
d.write(directory="output/scan", name="tutorial_basics", overwrite="overwrite")

You can find more information about the output files in the next part of the tutorial. 

## Clustering

### Setting it up

In [None]:
from bclustering.cluster.hierarchy_cluster import HierarchyCluster

In [None]:
c = HierarchyCluster()

### Running it 

In [None]:
c.build_hierarchy(d)

In [None]:
c.cluster(d, max_d=0.2)

The cluster numbers are directly added as a new column to the dataframe:

In [None]:
d.df.head()
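
For intuition about the two steps above, here is a minimal sketch of hierarchical clustering with a distance cutoff, written directly with ``scipy``. It assumes that building the hierarchy and cutting it at ``max_d`` behave like ``scipy.cluster.hierarchy.linkage`` and ``fcluster``; the toy data, the linkage method and the metric are illustrative choices, not necessarily the ones used by ``HierarchyCluster``.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Toy normalized two-bin distributions, one row per benchmark point.
data = np.array([
    [0.50, 0.50],
    [0.52, 0.48],
    [0.90, 0.10],
    [0.88, 0.12],
])

# Build the hierarchy (agglomerative clustering, complete linkage).
hierarchy = linkage(data, method="complete", metric="euclidean")

# Cut the tree: merges at a linkage distance above max_d are not applied,
# so every resulting cluster has a diameter of at most max_d.
max_d = 0.2
labels = fcluster(hierarchy, t=max_d, criterion="distance")
```

In this toy example the first two rows end up in one cluster and the last two in another, since the distance between the two groups is well above ``max_d``.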

In [None]:
d.write("output/cluster", "tutorial_basics", overwrite="overwrite")