##### BEP 09/07/2019

# Day-to-day python3 recipes

### Juan Esteras | jestglez@gmail.com

#### EESY in Altran |  Vehicle Performance in Toyota Motor Europe

* Thanks for coming
* first technical talk since **Thesis defense**
* jupyter notebook running python3.7
    * Good for teaching/user manuals, check it out!

# For the enthusiasts

## https://github.com/j14x/python3-tricks-101

* **_Jupyter notebook_** --> **`.ipynb`**
* **_Anaconda_** virtual environment --> **.yaml**
    * `conda-env create -f=path2venv.yaml -y`
* Dependencies for non-Anaconda users:
    * python3.7
    * jupyter
    * rise (only for slides view)

* Electronics background
    * C for embedded systems
    * VHDL
    * Matlab
* Consultant for Altran since 2017, mainly in automotive
* Toyota:
    * Vehicle Performance/enhancing Driving simulator

# A brief introduction to python
* First released in **1991**
* Created by **Guido van Rossum**
* Named after the famous BBC comedy **"Monty Python's Flying Circus"**
* Written in **C**
* **python2** support will be dropped in **2020**
* 3rd most **popular** language ([June 2019](https://www.tiobe.com/tiobe-index/)) 

# Why python?
![alt text](https://proxy.duckduckgo.com/iu/?u=https%3A%2F%2Fupload.wikimedia.org%2Fwikipedia%2Fcommons%2Fthumb%2Fc%2Fc3%2FPython-logo-notext.svg%2F480px-Python-logo-notext.svg.png&f=1 "python logo")

* What is all this **hype** about python?
* Why would I **pick** python?
* First, python is **fun**
* **4 reasons** we are gonna discuss

## Batteries included

Extensive standard library:
* **Operating System** and **filesystem paths** interfaces --> `import os, pathlib`
* **Command Line Interface** ease of creation --> `import argparse`
* **Regular expressions** --> `import re`
* **Mathematics** --> `import math, statistics`
* **Internet access** for URL and mail server --> `import urllib, smtplib`
* **Quality and testing** --> `import doctest, unittest`

* **Tooling** to build an app
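
As a minimal sketch of what "batteries included" means, the snippet below combines three of the modules listed above (`re`, `statistics` and `pathlib`) without installing anything. The log lines, the `lap_times` name and the `reports/laps.txt` path are made up for illustration.

```python
import re
import statistics
from pathlib import Path

# Hypothetical log lines, made up for this example
log_lines = [
    "lap 1: 92.4 s",
    "lap 2: 90.1 s",
    "lap 3: 91.3 s",
]

# re: extract the lap time from every line
pattern = re.compile(r"lap \d+: (\d+\.\d+) s")
lap_times = [float(pattern.match(line).group(1)) for line in log_lines]

# statistics: summarise the numbers
mean_time = statistics.mean(lap_times)

# pathlib: build a path in an OS-independent way (nothing is written to disk)
report = Path("reports") / "laps.txt"

print(f"{report.name}: mean lap time {mean_time:.2f} s")
```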

## Extremely generative open-source community
* [> 175k libraries](https://pypistats.org/) in [PyPI](https://pypi.org/)
![alt text](https://proxy.duckduckgo.com/iu/?u=https%3A%2F%2Findianpythonista.files.wordpress.com%2F2017%2F01%2Fpypi.png&f=1 "PyPI logo")

>**`pip install opensource_library`**

* PyPI --> **The Python Package Index**
* Tremendously generous
* Some libraries are sponsored, most are not

![alt text](https://d1jnx9ba8s6j9r.cloudfront.net/blog/wp-content/uploads/2018/07/Python-Domains-Edureka-1.png "python application domains")

* **Multidisciplinary** ecosystem
* scripting language --> **general purpose lang**
* **standard library is the foundation** of most of these libraries

## A glue language
Python can be **extended** with:
* **C/C++** --> [CPython API](https://docs.python.org/3/c-api/index.html) (e.g. _Numpy_)
* **Java** --> [Jython](https://www.jython.org/)

## When do you want to extend python?
* Python keeps the **higher-level control of the program**
* Extensions take over the **performance-critical** parts of an app

# What is _pythonic_ code?
![alt text](https://proxy.duckduckgo.com/iu/?u=http%3A%2F%2Fsegmentfault.com%2Fimg%2FbVcHLM&f=1 "Guido van Rossum holding a python snake")

## This talk is also about idiomatic python code == pythonic
* First, we need to understand what idiomatic code is and why it is important

* Is **compartmentalised** into small pieces


* Every piece **does one thing** well


* This one thing is what you would **expect**

## Dude, my python code works alright

* **Readability** --> favours **knowledge transfer**

* **Efficiency** --> no need to **reinvent the wheel** --> **standard library** to the rescue

* Better **performance** --> **faster** and fewer **silent exceptions**

* **superpower/problem** --> we can write **non-pythonic code** and it is most likely gonna **work**
* interpreted language --> does not stop us from trying
>**Idiomatic code** and **clear design (architecture)** are the best documentation
* pythonic code --> getting to know the **standard lib**  --> **Good habits**
* Performance --> **Big data/file sets** and **batch processing**

## _The Zen of Python_, the definitive guide-to-pythonic

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


* Tim Peters used it to challenge Guido van Rossum to write a **guide to pythonic code**
* Guido turned the challenge around and included Tim's poem in the standard library as an **easter egg**

## Does anyone know another python easter egg?
* `import antigravity`

# Quiz time

* Python is my main programming language

* Use a specific library (_pandas, django_...)

* know most of the built-in types

* know parts of the standard library

I'd like you to raise your hand if you feel represented by the sentence

# Index of contents
1. `string` formatting in python:
    * Legacy formats
    * f-strings
2. Python's _sequences_ (`string`, `tuple` & `list`):
    * Indexing
    * Iterating
    * Unpacking
    * Mutability
    * Tips & tricks

# Index of contents:
3. Conditional expressions & function arguments
4. `set`s, the great forgotten:
    * Getting unique _sequences_
    * Member testing
5. _Mappings_ (`Enum` & `dict`):
    * Enums in python
    * Dictionaries tips & tricks
6. Bonus track: demonstration time

# String formatting in python:

## * Legacy formats
## * f-strings

# String formatting in python: legacy formats
Over time, python has supported up to **3 different ways of formatting strings**:

* `printf`-like --> **`%`** formatting:

In [2]:
event = 'BEP'
"Today, we are attending to a %s" % event

'Today, we are attending to a BEP'

* object-like --> **`"{}".format()`** formatting:

In [3]:
"Today, we are attending to a {}".format(event)

'Today, we are attending to a BEP'

### f-string --> **f"{}"** since python3.6

In [4]:
f"Today, we are attending to a {event}"

'Today, we are attending to a BEP'

>Now is better than never.

In [5]:
import sys

my_os = sys.platform
print(my_os)

linux


In [6]:
%timeit f"Jupyter notebook running on a {my_os} OS"

79.9 ns ± 2.16 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


`f-string` compared with the legacy formats:

In [7]:
%timeit "Jupyter notebook running on a %s OS" % my_os

172 ns ± 8.69 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)


In [8]:
%timeit "Jupyter notebook running on a {} OS".format(my_os)

249 ns ± 5.66 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
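
Outside of Jupyter, where the `%timeit` magic is not available, a similar comparison can be sketched with the standard `timeit` module. This is a rough sketch, assuming `my_os` holds any string; the absolute numbers depend on the machine and the python version, so no timings are quoted here.

```python
import timeit

my_os = "linux"  # assumed value, any string works

# The three formatting styles, written as source code to be timed
candidates = {
    "f-string": 'f"Jupyter notebook running on a {my_os} OS"',
    "% formatting": '"Jupyter notebook running on a %s OS" % my_os',
    "str.format": '"Jupyter notebook running on a {} OS".format(my_os)',
}

runs = 100_000
timings = {}
for label, stmt in candidates.items():
    # total seconds spent executing the statement `runs` times
    timings[label] = timeit.timeit(stmt, globals={"my_os": my_os}, number=runs)

for label, seconds in timings.items():
    print(f"{label:>12}: {seconds / runs * 1e9:.1f} ns per loop")
```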


# _Sequences_ `string`, `tuple` & `list`:
## * Indexing
## * Iterating
## * Unpacking
## * Operators
## * Mutability
## * Tips & tricks

## Indexing _sequences_

**_Sequences_** (`string`, `tuple` & `list`) support both **positive** and **negative** indexing:

![alt text](./indexing.png "indexing in python")

>Special cases aren't special enough to break the rules.

>Although practicality beats purity.

python's _sequences_ (`string`, `tuple` & `list`) are **0 indexed**

In [9]:
'pythonic'[0]

'p'

In [10]:
len('pythonic')

8

In [11]:
'pythonic'[0] == 'pythonic'[-8]

True

In [12]:
'pythonic'[-1]

'c'

**Tip**: by using **[0]** and **[-1]** as index for the _sequence_ start and end respectively, **we do not need to know the length!**

_Sequences_ can also be **_sliced_**, following this rule [**start**:**end+1**] (the index written after the colon is itself excluded)

In [13]:
my_str = 'pythonic'

my_str[0:6]  # == my_str[:6]

'python'

In [14]:
my_str[6:8]  # == my_str[6:]

'ic'

**Tip**: by using an **empty start or end index** we can indicate the beginning and the end of the _sequence_ respectively

**slicing** supports the **step** parameter [start:end+1:**step**]:

In [15]:
my_str[0:8:3]  # == my_str[::3]

'phi'

slicing with **step -1** walks the _sequence_ backwards, which is equal to **reversing** it:

In [16]:
my_str[::-1]

'cinohtyp'

**Tip**: by using **empty start and end index** we can indicate we are stepping over the full _sequence_
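
Slices also accept negative indices, and a slice can be stored for reuse through the built-in `slice()` object. A minimal sketch reusing the `'pythonic'` string; the variable names are only illustrative.

```python
my_str = 'pythonic'

# Negative indices also work inside a slice
last_two = my_str[-2:]           # the last two characters
without_last_two = my_str[:-2]   # everything except the last two

# A slice can be stored and reused through the built-in slice() object
first_six = slice(0, 6)               # same as [0:6]
every_third = slice(None, None, 3)    # same as [::3]

print(last_two, without_last_two, my_str[first_six], my_str[every_third])
```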

## Iterating _sequences_ (`string`, `tuple` & `list`)

_Sequences_ are **iterable**: calling `iter()` on them returns an **iterator**. So the `for` loop enables us to **iterate** over them:

### iteration is the way to traverse any array-like structure in python
* the `for` loop gets fed with iterables

In [17]:
my_str

'pythonic'

In [18]:
for char in my_str[:6]:
    print(char)

p
y
t
h
o
n


**`enumerate()`** yields a `tuple` with the **index** and the item on every iteration:

In [19]:
for idx, char in enumerate(my_str[:6]):
    print(f"Index {idx} --> {char}")

Index 0 --> p
Index 1 --> y
Index 2 --> t
Index 3 --> h
Index 4 --> o
Index 5 --> n


In [20]:
for idx, char in enumerate(my_str[6:]):
    print(f"Index {idx} --> {char}")

Index 0 --> i
Index 1 --> c


In [21]:
for idx, char in enumerate(my_str[6:]):
    print(f"Index {idx+6} --> {char}")

Index 6 --> i
Index 7 --> c


we can use the optional argument **`start`** on `enumerate` to **compensate the index offset**:

In [22]:
for idx, char in enumerate(my_str[6:], start=6):
    print(f"Index {idx} --> {char}")

Index 6 --> i
Index 7 --> c
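
For contrast, here is a sketch of the manual index bookkeeping that `enumerate()` replaces. Both loops build the same pairs; the `manual` and `pythonic` names are only illustrative.

```python
my_str = 'pythonic'

# Non-pythonic: manual index bookkeeping
manual = []
for idx in range(len(my_str)):
    manual.append((idx, my_str[idx]))

# Pythonic: enumerate yields (index, item) pairs directly
pythonic = list(enumerate(my_str))

print(manual == pythonic)
```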


## Unpacking  _sequences_ (`string`, `tuple` & `list`)

In [23]:
person = ('Alice', 'artist', 42)  # == tuple(['Alice', 'artist', 42])

person

('Alice', 'artist', 42)

Since we know this _sequence_ holds 3 items, **we can unpack them into 3 variables**:

In [24]:
name, profession, age = person

print(f"{name} works as {profession} and is {age} years old")

Alice works as artist and is 42 years old


**`_`** is used as **convention for dropping values** when unpacking:

In [25]:
name, _, age = person

print(f"{name} is {age} years old")

Alice is 42 years old


**`*`** is used for **gathering items in a single variable** when unpacking:

In [26]:
name, *rest = person  # without `*`, it'll raise ValueError

rest

['artist', 42]

**Tip**: we can also use a combination of both **`*_`** to gather multiple drops
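
A short sketch of the `*_` combination, using a made-up `record` tuple. It also shows that a starred name always receives a `list`, even when nothing is left to gather.

```python
person = ('Alice', 'artist', 42)
record = ('header', 1, 2, 3, 'footer')  # hypothetical data for this example

# Keep only the first and last items, drop everything in between
first, *_, last = record

# The starred name always receives a list, even when it captures nothing
name, profession, age, *extra = person

print(first, last, extra)
```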

## Operators for  _sequences_ (`string`, `tuple` & `list`)

**`in`/`not in`** --> **member testing**:

In [27]:
months = ['Jan', 'Feb', 'Mar', 'Jun']

In [28]:
'Jan' in months

True

In [29]:
'Jul' not in months

True

### Question time: Are these expressions equivalent?

In [30]:
'Jan' and 'Feb' in months

True

In [31]:
'Jan' in months and 'Feb' in months

True

In [32]:
'Dec' and 'Feb' in months

True

In [33]:
'Dec' in months and 'Feb' in months

False

>In the face of ambiguity, refuse the temptation to guess.

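**`in`** has **higher precedence** than **`and`**, and a non-empty string is truthy: `'Dec' and 'Feb' in months` is evaluated as `'Dec' and ('Feb' in months)`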
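
The grouping can be made visible with explicit parentheses. The sketch below also uses the built-in `all()`, which is one possible way to test several members at once; the `wanted` list is an illustrative name.

```python
months = ['Jan', 'Feb', 'Mar', 'Jun']

implicit = 'Dec' and 'Feb' in months       # parsed as 'Dec' and ('Feb' in months)
explicit = 'Dec' and ('Feb' in months)     # same expression, grouping made visible
both_members = 'Dec' in months and 'Feb' in months

# all() reads better when there are many candidates to test
wanted = ['Jan', 'Feb']
all_members = all(month in months for month in wanted)

print(implicit, explicit, both_members, all_members)
```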

## Mutability in _sequences_ (`string`, `tuple` & `list`)
>Object's ability to **change state dynamically** _after_ creation

* **Mutable** _sequences_: `list`

* **Immutable** _sequences_: `string` & `tuple`

In [34]:
abba = 'abba'

### Question time: Can we access the last item `abba[-1]` and overwrite it by `'c'`?

In [35]:
try:
    abba[-1] = 'c'  # raises TypeError
except TypeError as e:
    print(e)

'str' object does not support item assignment


* **faster read/write** of references --> **fixed and unchanging** storage
* CPython **caches** some small strings, such as single characters, and **reuses** them

* mutable --> can change in place

* immutable --> cannot change in place

In [36]:
a = 'a'

In [37]:
abba[-1]

'a'

In [38]:
id(a) == id(abba[-1])

True

**Both references to 'a'** are pointing to the **same memory location**:
* Faster access

### Question time: Why can we afford to reference `'a'` multiple times?

* immutable _sequences_ **cannot be modified in place**
* `id` is convenient for **debugging**
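
A small sketch contrasting the two behaviours with `id()`: a `list` keeps its identity while it changes, a `tuple` rejects item assignment, and `string` methods return a new object. The variable names are only illustrative.

```python
numbers = [1, 2, 3]
before = id(numbers)
numbers.append(4)             # mutable: changed in place, same object
same_object = id(numbers) == before

point = (1, 2)
try:
    point[0] = 10             # immutable: item assignment is rejected
except TypeError as error:
    message = str(error)

greeting = 'abba'
shouted = greeting.upper()    # string methods return a new object

print(same_object, message, greeting, shouted)
```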

## _Sequences_ tips & tricks

### Faster `string` manipulation

In [39]:
labels = 'Date, Time, Client Name, Product, Purchase Price'.split(', ')

labels

['Date', 'Time', 'Client Name', 'Product', 'Purchase Price']

In [40]:
labels = 'Date, Time, Client Name, Product, Purchase Price'.split(', ', maxsplit=3)

labels

['Date', 'Time', 'Client Name', 'Product, Purchase Price']

In [41]:
joined = ', '.join(labels)

joined

'Date, Time, Client Name, Product, Purchase Price'

### How to modify an immutable _sequence_:

we can **convert** it to a **mutable** _sequence_:

In [42]:
mutable_abba = [char for char in abba]

mutable_abba

['a', 'b', 'b', 'a']

In [43]:
mutable_abba[-1] = 'c'
 
''.join(mutable_abba)

'abbc'
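
Two alternative sketches of the same replacement: `list()` builds the list without a comprehension, and slicing plus concatenation skips the intermediate list altogether. In every case a new string is created and the original `abba` stays untouched.

```python
abba = 'abba'

# list() builds the same list as the comprehension [char for char in abba]
mutable_abba = list(abba)
mutable_abba[-1] = 'c'
via_list = ''.join(mutable_abba)

# Slicing plus concatenation builds the new string without an intermediate list
via_slice = abba[:-1] + 'c'

print(via_list, via_slice, abba)
```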

### Numeric lists using **`list comprehension`** syntax and **`range(start, stop, step)`**:

In [44]:
octet = []
for n in range(1, 9, 1):
    octet.append(n)
    
octet

[1, 2, 3, 4, 5, 6, 7, 8]

In [45]:
octet = [n for n in range(1, 9, 1)]

octet

[1, 2, 3, 4, 5, 6, 7, 8]

>Flat is better than nested.

If we only pass **one argument** to `range()`, it takes **(start=0, stop=your_arg, step=1)**:

In [46]:
octet = [n + 1 for n in range(8)]

octet

[1, 2, 3, 4, 5, 6, 7, 8]

An **`if`** clause can be used as a filter:

In [47]:
evens = [n for n in octet if not n%2]

evens

[2, 4, 6, 8]
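
A sketch of the two places where a condition can appear in a comprehension: a trailing `if` filters items, while a conditional expression at the front transforms each item. The `parity` name is only illustrative.

```python
octet = list(range(1, 9))

# Filter: the trailing `if` decides which items are kept
evens = [n for n in octet if n % 2 == 0]

# Transform: a conditional expression decides what each item becomes
parity = ['even' if n % 2 == 0 else 'odd' for n in octet]

print(evens)
print(parity)
```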

Initializing a **nested `list`** to create a **_tic-tac-toe_ board**:

In [48]:
tic_tac_toe_board1 = [['_'] * 3] * 3

for row in tic_tac_toe_board1:
    print(row)

['_', '_', '_']
['_', '_', '_']
['_', '_', '_']


In [49]:
tic_tac_toe_board2 = [['_'] * 3 for i in range(3)]

for row in tic_tac_toe_board2:
    print(row)

['_', '_', '_']
['_', '_', '_']
['_', '_', '_']


### Question time: Are the _tic-tac-toe_ board methods equivalent?

**Tip**: python lets us assign the same value to multiple targets at once

In [50]:
tic_tac_toe_board1[1][1] = tic_tac_toe_board2[1][1] = 'X'

In [51]:
for row in tic_tac_toe_board2:
    print(row)

['_', '_', '_']
['_', 'X', '_']
['_', '_', '_']


In [52]:
for row in tic_tac_toe_board1:
    print(row)

['_', 'X', '_']
['_', 'X', '_']
['_', 'X', '_']


**`_board1`** holds **3 references to one single list**, while **`_board2`** holds **3 independent lists**
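
The aliasing can be checked directly with the `is` operator and `id()`. A minimal sketch with shorter, illustrative names (`board1`, `board2`) standing in for the two boards:

```python
board1 = [['_'] * 3] * 3                  # the outer * copies references to one row
board2 = [['_'] * 3 for _ in range(3)]    # the comprehension builds a new row each time

shared = board1[0] is board1[1] is board1[2]
independent = board2[0] is not board2[1]

# Count the distinct row objects behind each board
unique_rows1 = len({id(row) for row in board1})
unique_rows2 = len({id(row) for row in board2})

print(shared, independent, unique_rows1, unique_rows2)
```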

# Conditional expressions and function arguments

>functions in python have **_regular, arbitrary, default and keyword_** arguments

## * _Regular_ arguments
## * _Arbitrary_ arguments
## * _Default_ arguments

### **_Regular_** args: functions raise exception when missing

In [53]:
def person(name, profession, age):
    return f"{name} works as {profession} and has {age} years old"

**Tip**: python lets us make **multiple in-line assignments**:

In [54]:
nm, pfss, age = 'Alice', 'artist', '42'

There are **different ways of passing _mandatory_ arguments** to a function:

In [55]:
person(name=nm, profession=pfss, age=age)

'Alice works as artist and has 42 years old'

In [56]:
person(nm, profession=pfss, age=age)

'Alice works as artist and has 42 years old'

In [57]:
person(nm, pfss, age)

'Alice works as artist and has 42 years old'

**Packing!**

In [58]:
alice_attrs = (nm, pfss, age)

**Unpacking!**

In [59]:
person(*alice_attrs)

'Alice works as artist and has 42 years old'
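
The same idea works for keyword arguments: `**` unpacks a `dict` whose keys match the parameter names. A minimal sketch with a hypothetical `alice` record and a reworded copy of the `person` function:

```python
def person(name, profession, age):
    return f"{name} works as {profession} and is {age} years old"

# Hypothetical record, the keys must match the parameter names
alice = {'name': 'Alice', 'profession': 'artist', 'age': 42}

# Same as person(name='Alice', profession='artist', age=42)
sentence = person(**alice)

print(sentence)
```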

### The _sequences_ dilemma: _arbitrary_ args to the rescue

In [60]:
def batch_processor(files):
    for idx, file in enumerate(files, start=1):
        print(f"[{idx:2d}] Processing file: {file}")

In [61]:
files = [f"text_file{n}.txt" for n in range(3)]

files

['text_file0.txt', 'text_file1.txt', 'text_file2.txt']

In [62]:
batch_processor(files)

[ 1] Processing file: text_file0.txt
[ 2] Processing file: text_file1.txt
[ 3] Processing file: text_file2.txt


What if we pass a single string? **It is a _sequence_, so we can iterate over it**, right?

In [63]:
files[0]

'text_file0.txt'

In [64]:
batch_processor(files[0])

[ 1] Processing file: t
[ 2] Processing file: e
[ 3] Processing file: x
[ 4] Processing file: t
[ 5] Processing file: _
[ 6] Processing file: f
[ 7] Processing file: i
[ 8] Processing file: l
[ 9] Processing file: e
[10] Processing file: 0
[11] Processing file: .
[12] Processing file: t
[13] Processing file: x
[14] Processing file: t


**`*args`** to the rescue!

In [65]:
def batch_processor(*files):
    for idx, file in enumerate(files, start=1):
        print(f"[{idx}] Processing file: {file}")

In [66]:
batch_processor(files[0])

[1] Processing file: text_file0.txt


In [67]:
batch_processor(files[0], files[1], files[2])

[1] Processing file: text_file0.txt
[2] Processing file: text_file1.txt
[3] Processing file: text_file2.txt


In [68]:
batch_processor(*files)

[1] Processing file: text_file0.txt
[2] Processing file: text_file1.txt
[3] Processing file: text_file2.txt


**`files` is now a `tuple`**; by convention this parameter is named **`*args`**
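
A tiny sketch that makes the `tuple` visible. The `collect` function is a hypothetical helper that simply returns whatever it gathered:

```python
def collect(*args):
    # args is always a tuple, whatever the number of positional arguments
    return args

nothing = collect()
single = collect('text_file0.txt')
several = collect(*['text_file0.txt', 'text_file1.txt'])

print(nothing, single, several)
```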

## The _one-liner_ `if-else` statement

In [104]:
def greater_than_5(n):
    if n > 5:
        return True
    return False

In [105]:
greater_than_5(3)

False

In [71]:
def greater_than_6(n):
    return True if n > 6 else False

In [72]:
greater_than_6(7)

True

### Where is the difference besides the number of lines?

>Explicit is better than implicit.
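
One more variant worth considering, sketched here as a suggestion rather than as the talk's answer: a comparison already evaluates to `True` or `False`, so it can be returned directly. The conditional expression pays off when the two results are not booleans; `sign` is a hypothetical example of that.

```python
def greater_than_5(n):
    # n > 5 is already True or False, no if-else needed
    return n > 5

def sign(n):
    # a conditional expression is useful when the results are not booleans
    return 'negative' if n < 0 else 'non-negative'

print(greater_than_5(3), greater_than_5(7), sign(-1), sign(0))
```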


### _Default_ args and conditional expressions, the perfect marriage

In [73]:
files = [f"text_file{n}.txt" for n in range(3)]

files

['text_file0.txt', 'text_file1.txt', 'text_file2.txt']

**_default_ arguments** declare a **default value** in the signature, so the caller may omit them:

In [106]:
def batch_processor(*files, reversed=False):
    iterable = files if not reversed else files[::-1]
    
    for idx, file in enumerate(iterable):
        print(f"[{idx}] Processing file: {file}")

In [107]:
batch_processor(*files, reversed=False)

[0] Processing file: text_file0.txt
[1] Processing file: text_file1.txt
[2] Processing file: text_file2.txt


In [108]:
batch_processor(*files)

[0] Processing file: text_file0.txt
[1] Processing file: text_file1.txt
[2] Processing file: text_file2.txt


In [109]:
batch_processor(*files, reversed=True)

[0] Processing file: text_file2.txt
[1] Processing file: text_file1.txt
[2] Processing file: text_file0.txt
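
A parameter declared after `*files` is keyword-only: it can only be set by name. The sketch below uses a simplified variant of `batch_processor` that returns a list instead of printing, to show what happens when the flag is passed positionally.

```python
def batch_processor(*files, reversed=False):
    # simplified variant: returns the processing order instead of printing it
    iterable = files if not reversed else files[::-1]
    return list(iterable)

files = ['text_file0.txt', 'text_file1.txt']

# Passed by keyword: interpreted as the option
by_keyword = batch_processor(*files, reversed=True)

# Passed positionally: swallowed by *files as one more "file"
positional = batch_processor(*files, True)

print(by_keyword)
print(positional)
```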


# `set`, the great forgotten

>[...]an **unordered** collection of **unique objects**[...]

## * Membership testing
## * Making a sequence _unique_
## * Mathematical operations (useful for big data-sets)

>Simple is better than complex.

In [78]:
palindrome = "madam"

uniques = []
for char in palindrome:
    if char not in uniques:
        uniques.append(char)
        
uniques

['m', 'a', 'd']

In [79]:
palindrome = "madam"
uniques = set(palindrome)  # no order is guaranteed

uniques

{'a', 'd', 'm'}

**`in`** on a **`list`** tests whether the left operand is **one single item** of the list. It does not unpack a _sequence_ into individual items and check every one of them, so a `list` has no subset check for iterables.

**`dir()`** lists the **names of the attributes and methods of an object**:

In [80]:
dir(list)[-11:]

['append',
 'clear',
 'copy',
 'count',
 'extend',
 'index',
 'insert',
 'pop',
 'remove',
 'reverse',
 'sort']

In [81]:
mutable_methods = ['append', 'pop', 'extend', 'remove']

In [82]:
mutable_methods in dir(list)

False

In [83]:
setted = set(mutable_methods)

setted

{'append', 'extend', 'pop', 'remove'}

In [84]:
setted.issubset(dir(list))

True
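
The mathematical operations mentioned at the start of this section can be sketched with the same `dir()` data. The example compares the public method names of `list` and `tuple`; the variable names are only illustrative.

```python
# Public method names of each type (names starting with '_' are skipped)
list_methods = {name for name in dir(list) if not name.startswith('_')}
tuple_methods = {name for name in dir(tuple) if not name.startswith('_')}

shared = list_methods & tuple_methods         # intersection
only_in_list = list_methods - tuple_methods   # difference
everything = list_methods | tuple_methods     # union

print(sorted(shared))
print(sorted(only_in_list))
```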

# _Mappings_ (`Enum` & `dict`)
## * Enums
## * `dict` trick

## Enums
>A read-only dictionary

* Immutable
* Members are exposed through `__members__`, a **`types.MappingProxyType`**

A proper way of defining **read-only "macros"**

In [85]:
from enum import Enum

class Server(Enum):
    BUFFER_LEN = 110
    PORT = 1024
    IP = 'localhost'

print(f"Buffer length: {Server.BUFFER_LEN.value}, listening to {Server.IP.value}:{Server.PORT.value}")

Buffer length: 110, listening to localhost:1024


What if we try to modify an enumerator?

In [86]:
try:
    Server.PORT.value = 42
except AttributeError as e:
    print(e)
    

can't set attribute
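
A few more things an `Enum` offers, sketched with the same `Server` class: lookup by name, lookup by value, and iteration in definition order. Rebinding a member on the class is rejected as well; the exact error text varies between python versions, so it is printed rather than quoted.

```python
from enum import Enum

class Server(Enum):
    BUFFER_LEN = 110
    PORT = 1024
    IP = 'localhost'

by_name = Server['PORT']     # lookup by member name
by_value = Server(1024)      # lookup by value
names = [member.name for member in Server]   # iteration keeps definition order

try:
    Server.PORT = 42         # rebinding a member is rejected as well
except AttributeError as error:
    message = str(error)

print(by_name, by_value, names, message)
```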


## A `dict` trick: how to create a `switch`
In python **there is no** built-in `switch`

In [87]:
cases = 'Jan Apr Aug Oct'.split()

cases

['Jan', 'Apr', 'Aug', 'Oct']

In [88]:
def winter():
    return 'this month is on winter season'

def spring():
    return 'this month is on spring season'

def summer():
    return 'this month is on summer season'

def autumn():
    return 'this month is on autumn season'

blocks = [winter, spring, summer, autumn]

blocks

[<function __main__.winter()>,
 <function __main__.spring()>,
 <function __main__.summer()>,
 <function __main__.autumn()>]

In [89]:
mapping = {case: block for case, block in zip(cases, blocks)}

In [90]:
def default():
    return 'Option not available'

def switch(case):
    result = mapping.get(case, default)
    return result()

In [91]:
switch('Jan')

'this month is on winter season'

In [92]:
switch('Aug')

'this month is on summer season'

In [93]:
switch('Dec')

'Option not available'
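
The same dispatch idea also works when every case needs arguments. A sketch under the assumption that each case is a two-operand arithmetic operation; `operations` and `calculate` are hypothetical names, and raising `ValueError` for an unknown case is one design choice among several (the `default` function used above is another).

```python
import operator

# Each case maps to a callable taking two operands
operations = {
    '+': operator.add,
    '-': operator.sub,
    '*': operator.mul,
}

def calculate(symbol, left, right):
    try:
        operation = operations[symbol]
    except KeyError:
        # unknown case: fail loudly instead of returning a default
        raise ValueError(f"Unsupported operation: {symbol!r}") from None
    return operation(left, right)

print(calculate('+', 2, 3), calculate('*', 4, 5))
```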

# Bonus track: demonstration time

### How does the `for` loop consume iterables internally?

In [94]:
sequence = [n**2 for n in range(3)]

sequence

[0, 1, 4]

Now we **_activate_ the iterating capabilities** of the _sequence_ by calling **`iter()`** on it, which returns an _iterator_:

In [95]:
iterable = iter(sequence)

Next we can start **_consuming_ the items** of the _iterator_ one by one by calling **`next()`**:

In [96]:
next(iterable)

0

In [97]:
next(iterable)

1

In [98]:
next(iterable)

4

We've already **exhausted** the _iterator_ so **`StopIteration`** will be raised:

In [99]:
try:
    next(iterable)  # raises StopIteration
except StopIteration:
    print('This exception is special, it is not an error as such but a STOP message')

This exception is special, it is not an error as such but a STOP message


>If the implementation is easy to explain, it may be a good idea.

In [100]:
def for_loop_print(sequence):
    iterable = iter(sequence)
    idx = 0
    
    while True:
        try:
            result = next(iterable)
        except StopIteration:
            break
        else:
            print(f"Index [{idx}] --> {result}")
            idx += 1
    print("\nIterable is exhausted")

In [101]:
for_loop_print(list(range(4)))

Index [0] --> 0
Index [1] --> 1
Index [2] --> 2
Index [3] --> 3

Iterable is exhausted
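
Any object can take part in this protocol by implementing `__iter__` and `__next__`. A minimal sketch with a hypothetical `Countdown` class, which the `for` machinery consumes exactly like a built-in _sequence_:

```python
class Countdown:
    """Iterator counting down from `start` to 1 (hypothetical example class)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # an iterator returns itself from __iter__
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # tells the for loop to stop
        value = self.current
        self.current -= 1
        return value

counted = [n for n in Countdown(3)]

print(counted)
```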
