# 1: Basics

In this Advanced Python Tutorial we will cover some useful Python skills and tips. The lessons are as follows:

* Basics
* Loading data and plotting with matplotlib
* Cut-based selection
* Multivariate Analysis with Scikit Learn
* uBoost with hep_ml
* Neural Network Demo
* Multivariate kinematic reweighting
* The sPlot technique

The first lesson covers some Python basics:

* Mutable and immutable objects in Python
* Lists, dictionaries and comprehensions
* Writing code in markdown
* Jupyter notebook basics
* Importing modules

## Basics

**Mutability**

* Let's start by comparing some **mutable** and **immutable** objects.
* In Python, lists and dictionaries are **mutable**, while strings and tuples are **immutable**.
* What happens when you run the code below?

In [None]:
a = ['a', 'b', 'c']
b = a
b[1] = 'hello'

print(a)
print(b)

In [None]:
a = {'a': '0', 'b': '1', 'c': '2'}
b = a
b['b'] = 'hello'

print(a)
print(b)

In [None]:
a = 'foo'
b = 'bar'
for c in [a, b]:
    c += '!'

print(a)
print(b)
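
To make the behaviour of the cells above concrete, here is a minimal sketch (the names `original`, `copied`, `s` and `t` are illustrative and not part of the lesson). Assigning `b = a` binds a second name to the same object, so an in-place change to a mutable list is visible through both names. A copy is a separate object, and `+=` on an immutable string builds a new string instead of changing the old one.

```python
# Two names bound to the same list: an in-place change shows up under both.
a = ['a', 'b', 'c']
b = a
b[1] = 'hello'
print(a is b)  # True: both names refer to one list object
print(a)       # ['a', 'hello', 'c']

# A shallow copy is a separate list, so the original is left untouched.
original = ['a', 'b', 'c']
copied = original.copy()
copied[1] = 'hello'
print(original)  # ['a', 'b', 'c']

# Strings are immutable: `+=` creates a new string and rebinds only `c`.
s = 'foo'
t = 'bar'
for c in [s, t]:
    c += '!'
print(s, t)  # foo bar
```

If you need an independent copy of a nested structure, `copy.deepcopy` from the standard library copies the inner objects as well.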

**List comprehensions**

In [None]:
N = 10

list_of_squares = [i**2 for i in range(N)]
sum_of_squares = sum(list_of_squares)

print('Sum of squares for', N, 'is', sum_of_squares)
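
As a rough illustration of what the comprehension above does, the sketch below writes the same list with an explicit loop and then adds an optional `if` clause to filter items. The names `squares_loop`, `squares_comp` and `even_squares` are made up for this example.

```python
N = 10

# Explicit loop version of [i**2 for i in range(N)].
squares_loop = []
for i in range(N):
    squares_loop.append(i**2)

# The comprehension builds the same list in a single expression.
squares_comp = [i**2 for i in range(N)]
print(squares_loop == squares_comp)  # True

# An `if` clause filters the items: keep only the squares of even numbers.
even_squares = [i**2 for i in range(N) if i % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]
```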

**Dictionary comprehensions**

In [None]:
squares = {i: i**2 for i in range(10)}
print(squares)

In [None]:
N = 5
print('The square of', N, 'is', squares[N])
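
Dictionary comprehensions accept the same building blocks as list comprehensions. The sketch below, which uses made-up variable names and numbers purely for illustration, pairs two lists with `zip` and filters keys with an `if` clause.

```python
names = ['pt', 'eta', 'phi']  # hypothetical variable names
values = [1789.2, 3.1, 0.5]   # made-up numbers for illustration

# Pair each name with its value using zip.
lookup = {name: value for name, value in zip(names, values)}
print(lookup['eta'])  # 3.1

# Filter while building: keep only the odd numbers and their squares.
odd_squares = {i: i**2 for i in range(10) if i % 2 == 1}
print(odd_squares)  # {1: 1, 3: 9, 5: 25, 7: 49, 9: 81}
```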

## Markdown

Write comments inline about your code:

Use LaTeX:

$A = \frac{1}{B+C}$

Show lists:

* A wonderful
* List
* This is

Show code with syntax highlighting:

**Python:** (but in a sad grey world)

```
print('Hello world')
```

**Python:**

```python
print('Hello world')
```

**C++:**

```cpp
#include <iostream>

std::cout << "Hello world" << std::endl;
```

**Bash:**

```bash
echo "Hello world"
```

**f-strings**

In [None]:
pt_cut = 1789.234567890987654
eta_low = 2
eta_high = 5

cut_string = f'(PT > {pt_cut:.2f}) & ({eta_low} < ETA < {eta_high})'
print(cut_string)
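
The part after the colon inside the braces (`.2f` above) is a format specification. A few other common ones are sketched below; the variable names on the left are illustrative only.

```python
pt_cut = 1789.234567890987654
eta_low = 2

two_decimals = f'{pt_cut:.2f}'   # fixed point, two decimal places
scientific = f'{pt_cut:.3e}'     # scientific notation, three decimal places
padded = f'{pt_cut:10.1f}'       # right-aligned in a field ten characters wide
thousands = f'{pt_cut:,.0f}'     # thousands separator, no decimals
zero_filled = f'{eta_low:03d}'   # integer padded with zeros to width three

print(two_decimals)  # 1789.23
print(scientific)    # 1.789e+03
print(repr(padded))  # '    1789.2'
print(thousands)     # 1,789
print(zero_filled)   # 002
```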

## Jupyter

Jupyter has some very useful features included that can help make trying things out faster...

Cells have a return value, which is shown after they finish running if it's not `None`:

In [None]:
"Hello starterkitters"

In [None]:
None

Run a shell command:

In [None]:
!ls

In [None]:
!wget https://example.com/index.html

Time how long something takes for one line:

In [None]:
%time sum([i**2 for i in range(10000)])

Time how long an entire cell takes:

In [None]:
%%time
a = sum([i**2 for i in range(10000)])
b = sum([i**2 for i in range(10000)])
c = sum([i**2 for i in range(10000)])

If something takes longer than you expect, you can profile it to find out where it spends its time.
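
In a notebook, the IPython magics `%prun` (one line) and `%%prun` (a whole cell) are one way to profile code. The sketch below does something similar with the standard-library `cProfile` and `pstats` modules so that it also runs outside Jupyter. The function `slow_sum` is a hypothetical stand-in for whatever you want to profile.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Hypothetical function to profile: sum of squares below n."""
    return sum([i**2 for i in range(n)])


# Record profiling information only while the function of interest runs.
profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(10000)
profiler.disable()

# Write the report to a string, sorted by cumulative time,
# and limit it to the five most expensive entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

The report lists, for each function, how often it was called and how much time was spent in it, which is usually enough to spot the slow part.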

Jupyter also makes it easy to look at documentation: just add a question mark to the end of the line.

In [None]:
def my_print(my_string):
    print(my_string)

In [None]:
my_print?

Two question marks allow you to see the source code of the function:

In [None]:
my_print??

In [None]:
range?

Note that this is done without running the actual line of code, so sometimes you need to use a junk variable to make it work:

In [None]:
{'a': 'b'}.get?

In [None]:
{'a': 'b'}.get

In [None]:
junk = {'a': 'b'}.get
junk?
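
The `?` output is most useful when a function has a docstring, since that text is displayed along with the signature. As a small sketch, here is `my_print` again with a docstring (the docstring wording is an illustrative addition), inspected with the standard-library `inspect` module, which works outside Jupyter too.

```python
import inspect


def my_print(my_string):
    """Print my_string to standard output.

    The docstring text here is an illustrative addition.
    """
    print(my_string)


# The signature and docstring are stored on the function object itself.
signature = str(inspect.signature(my_print))
docstring = inspect.getdoc(my_print)

print(signature)  # (my_string)
print(docstring)
```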

## Importing modules

* It is good practice to import all modules at the beginning of your Python script or notebook
* Avoid using **wildcard imports**, for example ```from math import *```, as they make it unclear where things come from
* Below we end up with two ```max``` functions: the one imported from numpy shadows the built-in one, so calling ```max(10, 15)``` raises an error because numpy treats the second argument as an axis

In [None]:
max(10, 15)

In [None]:
from numpy import max

max(10, 15)

To avoid this, import `numpy as np` and then use `np.max`.

In [None]:
import numpy as np

np.max([0, 1, 2])
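
The same kind of clash can be checked for the `from math import *` example mentioned earlier. The sketch below uses only the standard library and does not actually perform the wildcard import; it lists the public names of `math` that also exist as built-ins. It assumes `math` defines no `__all__`, in which case a wildcard import brings in every name that does not start with an underscore.

```python
import builtins
import math

# Public names that a wildcard import of math would bring in
# (assumption: math has no __all__, so "_"-prefixed names are skipped).
math_public = {name for name in dir(math) if not name.startswith('_')}

# Names that would silently replace a built-in of the same name.
clashes = sorted(math_public & set(dir(builtins)))
print(clashes)

# The two pow functions do not behave the same way.
print(builtins.pow(2, 3))  # 8, an int
print(math.pow(2, 3))      # 8.0, always a float
```

Keeping the module prefix (`math.pow`, `np.max`) makes it obvious which version is being called.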

Some common abbreviations for packages are:

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import ROOT as R
```

Typically the nicest code uses a mixture of `import X` and `from X import Y` like we have above.

If you're interested in following best practices for style, there is an official style guide called [PEP8](https://www.python.org/dev/peps/pep-0008/). The document itself is quite long, but you can also get automated style checkers called 'linters'. Look into [flake8](https://gitlab.com/pycqa/flake8/), either as a command line application or as a plugin for your favourite text editor. Take care though, it's occasionally better to break style rules to make code easier to read!

* Restart the kernel to fix ```max```

In [None]:
max(10, 15)