In [7]:
import warnings
warnings.filterwarnings("ignore")

# Why did I look to other tools?

## Fourier analysis

## Principal Component Analysis

## Change of air

# The tools: python and jupyter notebooks

## Quick tour of python tools {.allowframebreaks}

**Python offers so (too?) many tools for data analysis** (*non-exhaustive* list)

<div class="notes">
It's true that python is a very dynamic language and offers many tools, for data analysis and beyond. In the following part, I'll focus on packages that are relevant for HEP, but there are many more, covering things like website creation, APIs for Google Maps or geographical data.
</div>

+ data visualization (interactive) [matplotlib](https://matplotlib.org/), [plotly](https://plot.ly/python/), [seaborn](https://seaborn.pydata.org/), [bokeh](https://bokeh.pydata.org/en/latest/), ...

+ scientific, numeric and symbolic calculation [scipy](https://www.scipy.org/), [numpy](http://www.numpy.org/), [sympy](http://www.sympy.org/en/index.html)

+ machine learning [scikit-learn](http://scikit-learn.org/stable/), [keras](https://keras.io/), [tensorflow](https://www.tensorflow.org/), [pytorch](https://pytorch.org/), etc.

+ data manipulation [pandas](https://pandas.pydata.org/)

+ **pure HEP**:
   
    + interfaced with ROOT in several ways [pyROOT](https://root.cern.ch/pyroot), [rootpy](http://www.rootpy.org/), [root_numpy](http://scikit-hep.org/root_numpy/), [uproot](https://github.com/scikit-hep/uproot), [root_pandas](https://github.com/scikit-hep/root_pandas)

    + and a few more HEP-oriented libraries in [scikit-hep](http://scikit-hep.org/) (an effort that is just starting)

![Gallery of bokeh with interactive plots, as in [this example](https://bokeh.pydata.org/en/latest/docs/gallery/hexbin.html) or [this one](https://demo.bokehplots.com/apps/movies)](figures/bokeh.png){width=65%}

![Screenshot of the [scikit-hep website](http://scikit-hep.org/) showing the affiliated packages, on top of the actual content of scikit-hep ([pyjet](https://github.com/scikit-hep/pyjet), [numpythia](https://github.com/scikit-hep/numpythia)). Inspired by [astropy](http://www.astropy.org/)](figures/scikit-hep.png){width=90%}

## NumFOCUS {.allowframebreaks}

According to the [NumFOCUS website](https://numfocus.org/):

> The mission of NumFOCUS is to promote sustainable high-level programming languages, open code development, and reproducible scientific research. We accomplish this mission through our educational programs and events as well as through fiscal sponsorship of open source scientific computing projects. We aim to increase collaboration and communication within the data science and scientific computing community. 

Projects cover data visualization, astrophysics, thermodynamics, fluid mechanics, economics, data analysis, scientific computation, etc. [@LHCDMreport]

![Supported projects](figures/numfocus.png){width=30%} 


**Example of an electromagnetic problem** solved with [FEniCS](https://fenicsproject.org/) (in 92 lines of code)

![Magnetostatics geometry](figures/magnetostatics_geometry.png){width=49%}![Magnetostatics field](figures/magnetostatics_field.png){width=49%} 

## Quick tour of notebooks {.allowframebreaks}

<div class="notes">
The Jupyter notebook environment lets you combine code, plots and notes in one friendly place.
</div>

**Jupyter notebooks** (or how to *try to* get back analysis reproducibility?)
 
 + a single environment combining source code, plots and notes
 
 + great for exploring data or learning new concepts and *documenting them*

 + many nice features:
    + export (html, python, article, slides) -- I'll come back to this
    + sharing with [nbviewer](https://nbviewer.jupyter.org/) & online execution with [mybinder](https://mybinder.org/) (beta)
    + [SWAN](https://swan.web.cern.ch/) online notebooks service at CERN (connected to CERNbox)

 + the [jupyter project](http://jupyter.org/) has **many** tutorials via [nbviewer](https://nbviewer.jupyter.org/). E.g.:
    + [signal processing tutorials](https://nbviewer.jupyter.org/github/unpingco/Python-for-Signal-Processing/tree/master/): about 20 tutorials including filtering, Markov chains, maximum likelihood approach, etc.
    + [probabilistic programming](https://nbviewer.jupyter.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter1_Introduction/Ch1_Introduction_PyMC3.ipynb): a 20-page tutorial with code, plots and explanations.
   

**Example with Fourier analysis**

![](figures/FFT0.png){width=49%}![](figures/FFT1.png){width=49%} 

 + view on [nbviewer](http://nbviewer.jupyter.org/github/rmadar/ExamplesWithPython/blob/master/NotebookExamples/ExampleFFT.ipynb) or [execute on binder](https://mybinder.org/v2/gh/rmadar/ExamplesWithPython/master?filepath=NotebookExamples)
 + clone via [github](https://github.com/rmadar/ExamplesWithPython/)
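
As a rough idea of what a cell in such a notebook can look like, here is a minimal sketch (not taken from the linked notebook) that recovers the frequencies of a toy signal with `numpy.fft`. The signal, its two frequencies (5 Hz and 12 Hz) and the sampling rate are assumptions chosen purely for illustration.

```python
import numpy as np

# Toy signal (assumed for illustration): two sine waves at 5 Hz and 12 Hz
fs = 200                   # sampling frequency [Hz]
n_samples = 400            # 2 seconds of data
t = np.arange(n_samples) / fs
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 12 * t)

# FFT for real input, with the matching frequency axis
spectrum = np.abs(np.fft.rfft(signal)) / n_samples
freqs = np.fft.rfftfreq(n_samples, d=1 / fs)

# Frequencies of the two largest peaks in the spectrum
top_two = np.argsort(spectrum)[-2:]
peaks = sorted(float(f) for f in freqs[top_two])
print(peaks)
```

In a notebook, the natural next step would be to plot `spectrum` against `freqs` right below the cell, next to a short note explaining the result.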

    
**Example with Gaussian processes**

![](figures/GP1.png){width=49%}![](figures/GP2.png){width=49%} 


 + view on [nbviewer](http://nbviewer.jupyter.org/github/rmadar/ExamplesWithPython/blob/master/NotebookExamples/GaussianProcesses.ipynb) or [execute on binder](https://mybinder.org/v2/gh/rmadar/ExamplesWithPython/master?filepath=NotebookExamples)
 + clone via [github](https://github.com/rmadar/ExamplesWithPython/)
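
To give a flavour of what a Gaussian process regression involves, here is a minimal NumPy-only sketch (not the code of the linked notebook). The squared-exponential kernel, its length scale, the jitter term and the toy data points are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length=1.0):
    """Squared-exponential kernel between two 1D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

# Toy training data (assumed): a few samples of sin(x)
x_train = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y_train = np.sin(x_train)
x_test = np.linspace(0, 5, 51)

jitter = 1e-6  # small diagonal term for numerical stability
K = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
K_s = rbf_kernel(x_test, x_train)
K_ss = rbf_kernel(x_test, x_test)

# Posterior mean and covariance of the GP at the test points
mean = K_s @ np.linalg.solve(K, y_train)
cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
std = np.sqrt(np.clip(np.diag(cov), 0, None))
```

The posterior `mean` passes (almost exactly) through the training points, and `std` shrinks towards zero there, which is the behaviour usually shown as a band around the fitted curve.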
 


# In practice: what is great and less great?

## What's great about python {.allowframebreaks}

Python is nice because code is very quick to write!

**Example 1: get all possible pairs**

In [4]:
import itertools 
mu_pt,el_pt = [23,42,55,137],[24,32,61,172]

# Get all pairs
all_pairs = list(itertools.product(mu_pt, el_pt))

# Print all pairs
print('all pairs: {}'.format(str(all_pairs)))

# Print every second pair
print('Every second pair: {}'.format(all_pairs[::2]))

all pairs: [(23, 24), (23, 32), (23, 61), (23, 172), (42, 24), (42, 32), (42, 61), (42, 172), (55, 24), (55, 32), (55, 61), (55, 172), (137, 24), (137, 32), (137, 61), (137, 172)]
Every second pair: [(23, 24), (23, 61), (42, 24), (42, 61), (55, 24), (55, 61), (137, 24), (137, 61)]
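
`itertools.product` pairs every element of one list with every element of the other. For pairs taken within a single collection, `itertools.combinations` is the analogous tool. The sketch below reuses the `mu_pt` values from above as toy input; the threshold of 40 is a hypothetical cut introduced only for illustration.

```python
import itertools

mu_pt = [23, 42, 55, 137]  # same toy values as above

# All unordered pairs of distinct elements: n*(n-1)/2 = 6 of them
mu_pairs = list(itertools.combinations(mu_pt, 2))
print('muon pairs: {}'.format(mu_pairs))

# Keep only the pairs where both values pass a (hypothetical) threshold
hard_pairs = [(a, b) for a, b in mu_pairs if a > 40 and b > 40]
print('pairs above threshold: {}'.format(hard_pairs))
```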


**Example 2: generate random binnings**

In [5]:
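import numpy as np

def generate_bins(n, xmin, xmax, step=1.):
    # Work in units of `step` so that every edge is a multiple of it
    lo, hi = int(xmin // step), int(xmax // step)
    # Draw n sorted random inner edges; randint excludes its upper
    # bound, hence hi + 1 (random_integers is deprecated)
    r = np.sort(np.random.randint(lo, hi + 1, n)) * step
    # Add the lower and upper boundaries
    r = np.insert(r, 0, lo * step)
    r = np.insert(r, len(r), hi * step)
    return r

# Five random binnings of [0, 500] with 10 inner edges, in steps of 5
bins = [generate_bins(10, 0, 500, 5) for i in range(5)]
for b in bins:
    print(b)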

[  0  20  70 105 170 215 310 335 385 440 480 500]
[  0  50  55 105 260 310 335 385 400 460 480 500]
[  0  25 185 205 230 245 290 315 355 410 450 500]
[  0  60 110 225 240 260 310 435 460 475 500 500]
[  0  80 115 115 190 260 315 350 385 450 470 500]
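
Because the edges are random, each run of the cell above gives different binnings. If reproducibility matters, one option is NumPy's seeded `Generator` API. The sketch below is a variant of the function above, not the original code; the name `generate_bins_seeded` and the `seed` argument are assumptions introduced for illustration.

```python
import numpy as np

def generate_bins_seeded(n, xmin, xmax, step=1, seed=None):
    """Return xmin, n sorted random edges (multiples of step), and xmax."""
    rng = np.random.default_rng(seed)
    lo, hi = int(xmin // step), int(xmax // step)
    # endpoint=True makes the upper bound inclusive
    inner = np.sort(rng.integers(lo, hi, size=n, endpoint=True)) * step
    return np.concatenate(([xmin], inner, [xmax]))

# Same seed, same binning
b1 = generate_bins_seeded(10, 0, 500, step=5, seed=42)
b2 = generate_bins_seeded(10, 0, 500, step=5, seed=42)
print(b1)
print(np.array_equal(b1, b2))
```

As in the original function, repeated edges can occur, so a check for duplicates may be needed before using the result as histogram bins.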


In [40]:
print('[test](if this is done in markdown)')

[test](if this is done in markdown)


## What is not so great about python

## What is great about notebooks

## What is not so great about notebooks

# Concrete examples in ATLAS analysis


# Side discovery: pandoc


# References