# 1: Basics

This Python tutorial will cover the basics of
 - Jupyter Notebooks
 - basic types in Python
 - container types
 - control flow

# Jupyter

A Jupyter Notebook consists of cells that can be executed. There are two fundamentally different types of cells: markdown cells and code cells. The former render their text as markdown, while the latter are run by the Python interpreter.

Notebooks can be converted to Python scripts.

Be aware that notebooks are stateful: executing a cell a second time is like writing its lines twice in a script.

Useful commands:
 - to run a cell and go to the next one, press Shift + Enter
 - to move from editing mode to navigation (move up and down in cells), press Esc
 - to enter editing mode from navigation, press Enter
 
Let's start! Execute the lines below

In [7]:
a = 0

In [9]:
a = a + 1
print(a)

2


If we rerun the cell above, `a` will have changed again! If you want to rerun everything from a clean state, restart the kernel first.

Notebooks offer convenient functionality, similar to IPython:
 - `!command` executes a shell command
 - tab completion
 - the value of the last line of a cell is displayed automatically if it is not assigned to a variable
 - magic commands like `%timeit`

We will use these more later on.

## Basic types and operations

Python has several basic types
 - numerical (float, int, complex)
 - string
 - bool
 
There are several operations defined on them, as we have already seen in examples.


In [42]:
a = 1  # creates an integer

b = 3.4  # float

# several ways for strings
c = "hello"
d = 'world'
cd = "welcome to this 'world' here"  # we can now use '' inside (or vice versa)
e = """hello world"""  # which we can also wrap
e2 = """hello
world
come here!"""

g = True

In [16]:
type(a)

int

With `type(...)`, we can determine the type of an object.
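
As a small sketch of how this looks for each of the basic types listed above (the list `values` and its contents are made up for illustration), `type(...)` can be combined with `isinstance(...)`, which checks an object against a given type:

```python
# One example value for each basic type (illustrative values only).
values = [1, 3.4, 2 + 1j, "hello", True]

# type(v).__name__ gives the name of the type as a string.
type_names = [type(v).__name__ for v in values]
print(type_names)  # ['int', 'float', 'complex', 'str', 'bool']

# isinstance checks whether an object is of a given type (or a subclass of it).
is_float = isinstance(3.4, float)
is_int = isinstance("hello", int)
print(is_float)  # True
print(is_int)    # False
```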

## Strong typing

Python is **strongly typed** (as opposed to weakly typed). This means that the type of the variable _matters_ and some interactions between certain types are not directly possible.

In [17]:
a = 1
b = 2

In [41]:
a + b

3

These are two integers. We are not astonished that this works. What about the following?

In [29]:
mix_str_int = a + "foo"

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Maybe the following works?

In [31]:
mix_str_int2 = a + "5"

TypeError: unsupported operand type(s) for +: 'int' and 'str'

Python is strict on the types, but we can always convert from one type to another (if a conversion is possible):

In [33]:
a + int("5")

6

...which works because `int("5") -> 5`.
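
A short sketch of a few more explicit conversions, including one that is not possible. The variable names are chosen for illustration only:

```python
# Explicit conversions between basic types.
as_int = int("5")        # str -> int
as_float = float("3.4")  # str -> float
as_str = str(1) + "foo"  # int -> str, then string concatenation
print(as_int, as_float, as_str)  # 5 3.4 1foo

# A conversion that is not possible raises a ValueError.
conversion_failed = False
try:
    int("foo")
except ValueError as error:
    conversion_failed = True
    print("conversion failed:", error)
```

Here `str(1) + "foo"` works because both operands are strings after the conversion, whereas `int("foo")` fails since the text does not represent a number.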

There are, however, some implicit conversions in Python. Let's look at the following:

In [20]:
f = 1.2
print(type(f))

<class 'float'>


In [27]:
int_plus_float = a + f
print(type(int_plus_float))

<class 'float'>


This is one of the few cases where Python converts a type automatically, here the integer to a float. The above addition _actually_ reads as

In [34]:
int_plus_float = float(a) + f

The same holds for booleans, since they behave like the integers 1 (`True`) and 0 (`False`):

In [37]:
True + 5

6

For readability, it is usually better to write an explicit conversion.
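
To illustrate the difference, here is a minimal sketch that counts the `True` entries of a list once implicitly and once with an explicit conversion. The list `flags` is a made-up example:

```python
flags = [True, False, True, True]

# Implicit: each True counts as 1 and each False as 0.
n_true_implicit = sum(flags)

# Explicit: convert every bool to an int first, which states the intent.
n_true_explicit = sum(int(flag) for flag in flags)

print(n_true_implicit, n_true_explicit)  # 3 3

explicit_sum = int(True) + 5
print(explicit_sum)  # 6
```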

## Container types

Python has several container types as also found in other languages. The most important ones are:
 - list  (~array in other languages)
 - dict  (~hash table in other languages)
 
They can contain other objects which can then be assigned and accessed via the `[]` operator (_we will have a closer look at operators later on_)

A list stores elements by index, which is an integer, while a dict stores elements by a `key`, which can be any hashable object (to be precise: the key is looked up by its hash, so immutable basic types such as `int`, `float`, `str` and `bool` all qualify).

Let's look at examples!

In [49]:
# creating a list
list1 = [1, 2, 3]
print(list1)

[1, 2, 3]


We can access these elements by index, starting from 0

In [52]:
list1[0]

1

We can also assign a value to a position in the list

In [54]:
list1[1] = 42
print(list1)

[1, 42, 3]


and it can be extended with elements

In [65]:
list1.append(-5)
print(list1)

[1, 42, 3, -5]


Accessing an index that is outside the list raises an error. The error message is verbose: read and understand it.

Being able to understand and interpret errors correctly is key to becoming better at coding.

In [67]:
list1[14]

IndexError: list index out of range
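
As a hedged sketch of how to work with this error rather than being stopped by it, the example below uses a made-up list `numbers`. It shows which indices are valid and catches the `IndexError` with `try`/`except`:

```python
numbers = [1, 42, 3, -5]  # illustrative list

# Valid indices run from 0 to len(numbers) - 1.
print(len(numbers))  # 4
print(numbers[-1])   # -5, negative indices count from the end

# Catch the error instead of letting it stop the program.
try:
    value = numbers[14]
except IndexError as error:
    value = None
    print("caught:", error)  # caught: list index out of range
```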

We can play a similar game with dicts

In [59]:
person = {'name': "Jonas Eschle", 'age': 42, 5: True, 11: "hi"}  # we can use strings but also other elements
print(person)

{'name': 'Jonas Eschle', 'age': 42, 5: True, 11: 'hi'}


In [70]:
print(person['name'])
print(person[5])
print(person[11])

Jonas Eschle
True
hi
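
The keys above mix strings and integers. As a sketch of what is and is not allowed as a key (the dict `grid` is a made-up example), a tuple works because it is hashable, while a list does not:

```python
# Tuples are immutable and hashable, so they can be used as keys.
grid = {(0, 0): "origin", (1, 2): "somewhere else"}
print(grid[(1, 2)])  # somewhere else

# Lists are mutable and not hashable, so using one as a key fails.
list_key_error = None
try:
    bad = {[0, 0]: "origin"}
except TypeError as error:
    list_key_error = str(error)
    print("caught:", error)
```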


We can also assign a new value to a key.

In [72]:
person['age'] = '42.00001'

... or even extend it by assigning to a key that does not yet exist in the dict

In [74]:
person['alias'] = "Mayou36"

As we see, this works. Again, selecting a key that is not contained in the dict raises an error.

In [69]:
person['nationality']

KeyError: 'nationality'

As with any object in Python, there are many useful methods on `list` and `dict` that help you accomplish things. For example, what if we want to retrieve a value from a dict _only_ if the key is there and otherwise return a default value? We can use `get`:

In [77]:
hair_color = person.get('hair_color', 'unknown')  # the second argument gets returned if key is not in dict
print(hair_color)

unknown
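
A few more `dict` tools in the same spirit, as a minimal sketch. The small `person` dict is redefined here so that the example is self-contained:

```python
person = {"name": "Jonas Eschle", "age": 42}

# `in` tests whether a key exists, without raising a KeyError.
has_age = "age" in person
has_nationality = "nationality" in person
print(has_age, has_nationality)  # True False

# items() yields (key, value) pairs for iterating over the dict.
for key, value in person.items():
    print(key, "->", value)

# setdefault returns the existing value, or stores and returns the default.
alias = person.setdefault("alias", "unknown")
print(alias)  # unknown
```

Unlike `get`, `setdefault` also inserts the default into the dict when the key is missing.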


## Dynamic typing

Python is dynamically typed (as opposed to statically typed). This means that a variable, which once was an int, such as `a`, can be assigned a value of another type.

In [44]:
a = 1

In [46]:
a = "one"
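
A small sketch that makes the change visible: the type belongs to the object, and the name `a` simply refers to a different object after the second assignment. The helper names `type_before` and `type_after` are illustrative:

```python
a = 1
type_before = type(a)

a = "one"  # the name `a` now refers to a str object
type_after = type(a)

print(type_before.__name__, "->", type_after.__name__)  # int -> str
```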

**Mutability**

* Let's start by comparing some **mutable** and **immutable** objects.
* In Python, lists and dicts are **mutable**, while strings and tuples are **immutable**.
* What happens when you run the code below?

In [1]:
a = ['a', 'b', 'c']
b = a
b[1] = 'hello'

print(a)
print(b)

['a', 'hello', 'c']
['a', 'hello', 'c']


In [2]:
a = {'a': '0', 'b': '1', 'c': '2'}
b = a
b['b'] = 'hello'

print(a)
print(b)

{'a': '0', 'b': 'hello', 'c': '2'}
{'a': '0', 'b': 'hello', 'c': '2'}


In [3]:
a = 'foo'
b = 'bar'
for c in [a, b]:
    c += '!'

print(a)
print(b)

foo
bar
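
In the list and dict cells above, `b = a` does not copy anything: both names refer to the same object. A minimal sketch of how to get an independent list instead, using the list method `copy()` (the names `alias` and `independent` are chosen for illustration):

```python
a = ['a', 'b', 'c']
alias = a               # second name for the same list
independent = a.copy()  # new list with the same elements (shallow copy)

alias[1] = 'hello'

print(a)                 # ['a', 'hello', 'c']
print(independent)       # ['a', 'b', 'c']
print(alias is a)        # True
print(independent is a)  # False

# Strings are immutable: += builds a new string and rebinds the name.
s = 'foo'
t = s
t += '!'
print(s, t)  # foo foo!
```

This is also why the loop over `[a, b]` above leaves the original strings unchanged: `c += '!'` only rebinds the loop variable `c`.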


**List comprehensions**

In [4]:
N = 10

list_of_squares = [i**2 for i in range(N)]
sum_of_squares = sum(list_of_squares)

print('Sum of squares for', N, 'is', sum_of_squares)

Sum of squares for 10 is 285
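
A comprehension can also filter its input with an `if` clause. The following sketch keeps only the squares of even numbers and compares the result with the equivalent explicit loop (variable names are illustrative):

```python
N = 10

# Keep only the squares of even numbers by adding an `if` clause.
even_squares = [i**2 for i in range(N) if i % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]

# The equivalent explicit loop, for comparison.
even_squares_loop = []
for i in range(N):
    if i % 2 == 0:
        even_squares_loop.append(i**2)

print(even_squares == even_squares_loop)  # True
```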


**Dictionary comprehensions**

In [5]:
squares = {i: i**2 for i in range(10)}
print(squares)

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}


In [6]:
N = 5
print('The square of', N, 'is', squares[N])

The square of 5 is 25
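
Dictionary comprehensions are not limited to `range`. As a sketch with made-up lists `names` and `values`, they can pair up two lists with `zip` or swap the keys and values of an existing dict:

```python
names = ['a', 'b', 'c']
values = [0, 1, 2]

# Build a dict from two lists by pairing them with zip.
lookup = {name: value for name, value in zip(names, values)}
print(lookup)  # {'a': 0, 'b': 1, 'c': 2}

# Swap keys and values (works because the values are unique and hashable).
inverse = {value: name for name, value in lookup.items()}
print(inverse)  # {0: 'a', 1: 'b', 2: 'c'}
```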


## Markdown

Write comments inline about your code:

Use LaTeX:

$A = \frac{1}{B+C}$

Show lists:

* A wonderful
* List
* This is

Show code with syntax highlighting:

**Python:** (but in a sad grey world)

```
print('Hello world')
```

**Python:**

```python
print('Hello world')
```

**C++:**

```cpp
#include <iostream>

int main() {
    std::cout << "Hello world" << std::endl;
}
```

**Bash:**

```bash
echo "Hello world"
```

**f-strings**

In [7]:
pt_cut = 1789.234567890987654
eta_low = 2
eta_high = 5

cut_string = f'(PT > {pt_cut:.2f}) & ({eta_low} < ETA < {eta_high})'
print(cut_string)

(PT > 1789.23) & (2 < ETA < 5)
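
The part after the colon, such as `.2f` above, is a format specification. A few other common ones, as a minimal sketch with made-up values:

```python
n_events = 1234567
efficiency = 0.87654
label = 'PT'

with_separator = f'{n_events:,}'   # thousands separator
as_percent = f'{efficiency:.1%}'   # percentage with one decimal
right_aligned = f'{label:>6}'      # right-aligned in a field of width 6
scientific = f'{n_events:.3e}'     # scientific notation, three decimals

print(with_separator)  # 1,234,567
print(as_percent)      # 87.7%
print(right_aligned)   #     PT
print(scientific)      # 1.235e+06
```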


## Jupyter

Jupyter has some very useful features included that can help make trying things out faster...

Cells have a return value, which is shown after they finish running if it's not `None`:

In [8]:
"Hello starterkitters"

'Hello starterkitters'

In [9]:
None

Run a shell command:

In [10]:
!ls 

1Basics.ipynb               8sPlot.ipynb
2DataAndPlotting.ipynb      README.md
3Classification.ipynb       data
4Extension.ipynb            index.html
5BoostingToUniformity.ipynb real_data.root
6DemoNeuralNetworks.ipynb   simulated_data.root
7DemoReweighting.ipynb


In [11]:
!wget https://example.com/index.html

--2019-10-11 13:23:11--  https://example.com/index.html
Resolving example.com (example.com)... 2606:2800:220:1:248:1893:25c8:1946, 93.184.216.34
Connecting to example.com (example.com)|2606:2800:220:1:248:1893:25c8:1946|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1270 (1.2K) [text/html]
Saving to: ‘index.html.1’


2019-10-11 13:23:12 (37.8 MB/s) - ‘index.html.1’ saved [1270/1270]



Time how long something takes for one line:

In [12]:
%time sum([i**2 for i in range(10000)])

CPU times: user 3.42 ms, sys: 340 µs, total: 3.76 ms
Wall time: 3.8 ms


333283335000
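
`%time` is only available in IPython and Jupyter. In a plain Python script, a comparable measurement can be sketched with the standard library modules `time` and `timeit`. The function name `sum_of_squares` is an illustrative helper, and the timings will differ from machine to machine:

```python
import time
import timeit


def sum_of_squares(n):
    """Return the sum of i**2 for i in range(n)."""
    return sum([i**2 for i in range(n)])


# One-off measurement with a high-resolution clock.
start = time.perf_counter()
result = sum_of_squares(10000)
elapsed = time.perf_counter() - start
print(f'result={result}, elapsed={elapsed * 1e3:.2f} ms')

# timeit repeats the call many times to average out fluctuations.
total = timeit.timeit(lambda: sum_of_squares(10000), number=100)
print(f'average over 100 runs: {total / 100 * 1e3:.2f} ms')
```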

Time how long an entire cell takes:

In [13]:
%%time
a = sum([i**2 for i in range(10000)])
b = sum([i**2 for i in range(10000)])
c = sum([i**2 for i in range(10000)])

CPU times: user 12.2 ms, sys: 491 µs, total: 12.7 ms
Wall time: 12.6 ms


If something takes longer than you expect, you can profile it to find out where it spends its time.
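
One possible way to profile, sketched here with the standard library's `cProfile` and `pstats` modules (IPython also offers a `%prun` magic for the same purpose). The functions `slow_part`, `fast_part` and `work` are made up for illustration:

```python
import cProfile
import io
import pstats


def slow_part():
    """Deliberately expensive: squares many numbers."""
    return sum([i**2 for i in range(200000)])


def fast_part():
    """Cheap by comparison."""
    return sum(range(1000))


def work():
    slow_part()
    fast_part()


# Record which functions are called and how long they take.
profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()

# Write the five most expensive entries, sorted by cumulative time, to a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```

In the printed table, `slow_part` should appear near the top with a much larger cumulative time than `fast_part`.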

Jupyter also makes it easy to look at documentation: just add a question mark to the end of the line

In [14]:
def my_print(my_string):
    print(my_string)

In [15]:
my_print?

Signature: my_print(my_string)
Docstring: <no docstring>
File:      ~/Development/analysis-essentials-1/advanced-python/<ipython-input-14-5048fdec001b>
Type:      function


Two question marks allow you to see the source code of the function

In [16]:
my_print??

Signature: my_print(my_string)
Docstring: <no docstring>
Source:   
def my_print(my_string):
    print(my_string)
File:      ~/Development/analysis-essentials-1/advanced-python/<ipython-input-14-5048fdec001b>
Type:      function


In [17]:
range?

Init signature: range(self, /, *args, **kwargs)
Docstring:     
range(stop) -> range object
range(start, stop[, step]) -> range object

Return an object that produces a sequence of integers from start (inclusive)
to stop (exclusive) by step.  range(i, j) produces i, i+1, i+2, ..., j-1.
start defaults to 0, and stop is omitted!  range(4) produces 0, 1, 2, 3.
These are exactly the valid indices for a list of 4 elements.
When step is given, it specifies the increment (or decrement).
Type:           type
Subclasses:     


Note that this lookup is done without actually running the line of code, so sometimes you need to assign the object to a junk variable first to make it work

In [18]:
{'a': 'b'}.get?

Object `get` not found.


In [19]:
{'a': 'b'}.get

<function dict.get(key, default=None, /)>

In [24]:
junk = {'a': 'b'}.get
junk?

Signature: junk(key, default=None, /)
Docstring: Return the value for key if key is in the dictionary, else default.
Type:      builtin_function_or_method


## Importing modules

* It is good practice to import all modules at the beginning of your Python script or notebook
* Avoid **wildcard imports**, for example ```from math import *```, as they make it unclear where things come from
* Below, importing ```max``` from ```numpy``` replaces the built-in ```max```, so the same call that worked before now raises an error

In [25]:
max(10, 15)

15

In [26]:
from numpy import max
max(10, 15)

AxisError: axis 15 is out of bounds for array of dimension 0

To avoid this, people often import `numpy` and then use `numpy.max` or, if they don't like typing, import it as `np` as shown below:

In [27]:
import numpy as np
np.max([0, 1, 2])

2
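
The same shadowing effect can be reproduced with the standard library alone. In this sketch, `from math import pow` stands in for a wildcard import such as `from math import *`, which would rebind `pow` in the same way. The original built-in remains reachable through the `builtins` module:

```python
import builtins
import math

# This rebinds the name `pow`, hiding the built-in function of the same name.
from math import pow

shadowed = pow(2, 3)
original = builtins.pow(2, 3)

print(shadowed)  # 8.0, math.pow always returns a float
print(original)  # 8, the built-in keeps integers

# Importing the module itself keeps both functions clearly separated.
print(math.pow(2, 3))  # 8.0
```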

Some common abbreviations for packages are:

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
import ROOT as R
```

Typically the nicest code uses a mixture of `import X` and `from X import Y`, like we have above.

If you're interested in following best practices for style, there is an official style guide called [PEP8](https://www.python.org/dev/peps/pep-0008/). The document itself is quite long, but you can also get automated style checkers called 'linters'. Look into [flake8](https://gitlab.com/pycqa/flake8/), either as a command line application or as a plugin for your favourite text editor. Take care though, it's occasionally better to break style rules to make code easier to read!

* Restart the kernel to fix ```max```

In [1]:
max(10, 15)

15