# Basic Python workout

This section provides a set of basic exercises to refresh core Python concepts so that you can code in Python with confidence.

The section assumes that a valid kernel is available for the Jupyter code blocks to run.
Some familiarity with Python is also expected.

## Table of Contents

+ [Section 1 &mdash; Hello, Python!](#section-1--hello-python)
+ [Section 2 &mdash; Hello, Lists!](#section-2--hello-lists)
+ [Section 3 &mdash; Tuples](#section-3--tuples)
+ [Section 4 &mdash; Sets](#section-4--sets)
+ [Section 5 &mdash; Dictionaries](#section-5--dictionaries)
+ [Section 6 &mdash; Ranges](#section-6--ranges)
+ [Section 7 &mdash; List comprehensions](#section-7--list-comprehensions)
+ [Section 8 &mdash; Generators](#section-8--generators)
+ [Section 9 &mdash; The *zip* operation](#section-9--the-zip-operation)
+ [Section 10 &mdash; The `*` (star or asterisk) operator](#section-10--the--star-or-asterisk-operator)
+ [Section 11 &mdash; Exceptions](#section-11--exceptions)
+ [Section 12 &mdash; String formatting/templating](#section-12--string-formattingtemplating)
+ [Section 13 &mdash; Functions](#section-13--functions)
+ [Section 14 &mdash; OOP](#section-14--oop)
+ [Section 15 &mdash; Creating and importing libraries](#section-15--creating-and-importing-libraries)
+ [Section 16 &mdash; Documenting your code](#section-16--documenting-your-code)
+ [Section 17 &mdash; Files](#section-17--files)
+ [Section 18 &mdash; The `with` statement](#section-18--the-with-statement)
+ [Section 19 &mdash; Interacting with the underlying OS](#section-19--interacting-with-the-underlying-os)
+ [Section 20 &mdash; Date and Time](#section-20--date-and-time)
+ [Section 21 &mdash; Python Style Guide](#section-21--python-style-guide)
+ [Section 22 &mdash; More on Operators](#section-22--more-on-operators)
+ [Section 23 &mdash; More on strings](#section-23--more-on-strings)
+ [Section 24 &mdash; More on Booleans](#section-24--more-on-booleans)
+ [Section 25 &mdash; Enums](#section-25--enums)
+ [Section 26 &mdash; Emulating constants](#section-26--emulating-constants)
+ [Section 27 &mdash; Reading user input](#section-27--reading-user-input)
+ [Section 28 &mdash; More on `if`](#section-28--more-on-if)
+ [Section 29 &mdash; More on functions](#section-29--more-on-functions)
+ [Section 30 &mdash; More on loops](#section-30--more-on-loops)
+ [Section 31 &mdash; The Python Standard Library](#section-31--the-python-standard-library)
+ [Section 32 &mdash; The PEP8 Python Style Guide](#section-32--the-pep8-python-style-guide)
+ [Section 33 &mdash; Variable scope rules in Python](#section-33--variable-scope-rules-in-python)
+ [Section 34 &mdash; Decorators](#section-34--decorators)
+ [Section 35 &mdash; Introspection/Reflection](#section-35--introspectionreflection)
+ [Section 36 &mdash; More on Exceptions](#section-36--more-on-exceptions)
+ [Section 37 &mdash; More on operator overloading](#section-37--more-on-operator-overloading)
+ [Section 38 &mdash; Collection functions](#section-38--collection-functions)
+ [Section 39 &mdash; CLI/Terminal Arguments basics in Python](#section-39--cliterminal-arguments-basics-in-python)
+ [Section 40 &mdash; Regular Expressions (regex/regexp)](#section-40--regular-expressions-regexregexp)
+ [Section 41 &mdash; More on Data Containers](#section-41--more-on-data-containers)
+ [Section 42 &mdash; More on Slicing](#section-42--more-on-slicing)
+ [Section 43 &mdash; Finding items in a sequence](#section-43--finding-items-in-a-sequence)
+ [Section 44 &mdash; More on Iterables and Iterators](#section-44--more-on-iterables-and-iterators)
+ [Section 45 &mdash; More on list comprehensions, dictionary comprehensions, and set comprehensions](#section-45--more-on-list-comprehensions-dictionary-comprehensions-and-set-comprehensions)
+ [Section 46 &mdash; Type Hinting: A gentle and focused introduction](#section-46--a-lighter-and-focused-introduction-into-type-hinting)
+ [Section 47 &mdash; Increasing function flexibility with `*args`, `**kwargs` and `/`](#section-47--increasing-function-flexibility-with-args-kwargs-and)
+ [Section 48 &mdash; More on OOP](#section-48--more-on-oop)
+ [Section 49 &mdash; Processing JSON](#section-49--processing-json)
+ [Section 50 &mdash; More on files](#section-50--more-on-files)
+ [Section 51 &mdash; Logging](#section-51--logging)
+ [Section 52 &mdash; Unit Testing](#section-52--unit-testing)
+ [Section 53 &mdash; The *walrus* `:=` operator](#section-53--the-walrus--operator)

## Section 1 &mdash; Hello, Python!

### Hello, World! in Python

Write the canonical "Hello, World!" program in Python. To make it a bit more interesting, define a variable for the greeting message.

In [1]:
message = "Hello, World!"

print(message)

Hello, World!


### Hello functions

Functions are first-class citizens in Python. Declare a function `f(x)` that returns the square of the value received as an argument.

Then, use the function to compute and show the result of $ 5^2 $ and $ 7^2 $.

In [2]:
def f(x):
    return x**2

print(f(5))
print(f(7))

25
49


### Functions as arguments to other functions

Create a function `add(x, y)` that sums up the numbers received as arguments, and a function `sub(x, y)` that subtracts those numbers.

Define the function `compute(x, y, op)` that receives as arguments two numbers and a function such as `add()` and `sub()` and use it to perform some calculations.

In [3]:
def add(x, y):
    return x + y

def sub(x, y):
    return x - y

def compute(x, y, op):
    return op(x, y)

res1 = compute(3, 2, add)
res2 = compute(3, 2, sub)

print(res1)
print(res2)

5
1


### Using functions from the `math` library

Write a program that imports the `math` library to:
+ print the value of $ \pi $ and $ e $.
+ compute $ \sqrt{144} $
+ compute $ \sin(2 \cdot \pi) $

In [5]:
from math import pi, e, sqrt, sin  # better get only what's needed

print(pi)
print(e)

print(sqrt(144))
print(sin(2 * pi))

3.141592653589793
2.718281828459045
12.0
-2.4492935982947064e-16


### Using complex numbers

Python supports complex numbers out of the box. Define the complex numbers $ 3 + i $ and $ 100 + 10 i $, with $ i $ being the imaginary unit.

HINT: use `j` to represent the imaginary unit

In [6]:
c1 = 3 + 1j
c2 = 100 + 10j

print(c1)
print(c2)

(3+1j)
(100+10j)


### Generating random numbers

Using Python's `random` library that packs several utility functions to generate random numbers:
+ produce a random integer between 0 and 10 (inclusive)
+ produce a random floating point number between 7.5 and 10.5 (HINT: use `uniform`)

In [7]:
from random import randint, uniform

print(randint(0, 10))
print(uniform(7.5, 10.5))

9
8.979032581597249


### Objects in Python

Everything in Python is an object, even primitive types such as integers, strings, and floats.

Objects in Python have attributes and methods that can be accessed using the `.` operator.

There are a few interesting functions and methods you can use on Python objects:
+ the `int` method `bit_length()` returns the number of bits needed to represent a given integer.
+ the built-in function `id()` returns the identity of a given object, which in CPython is its memory address.

In [2]:
n1 = 5
n2 = 3.14

print(n1.bit_length())
# print(n2.bit_length())  # ERROR: float has no bit length method

print(id(n1))
print(id(n2))

3
140137336275312
140137140078576


Immutability is an essential concept in many programming languages. If an object's internal state can be changed after it has been created, the object is mutable; otherwise, it is immutable.

Many of Python's built-in types, such as numbers, strings, and tuples, are immutable, even when it doesn't feel so.
The following example shows that `n1 += 1` does not modify the integer object: it creates a new object and rebinds the name `n1` to it, so `id(n1)` changes.

In [3]:
n1 = 5
print("&n1: ", id(n1))

n1 += 1
print("&n1: ", id(n1))

&n1:  140137336275312
&n1:  140137336275344

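For contrast, here is a minimal sketch of how a mutable object behaves: appending to a list changes its contents but not its identity, while `+=` on an integer rebinds the name to a different object. The variable names are illustrative, and the actual `id()` values vary between runs.

```python
# Mutable object: the list is modified in place, so its identity is unchanged
nums = [1, 2, 3]
id_before = id(nums)
nums.append(4)
list_identity_unchanged = id(nums) == id_before

# Immutable object: += creates a new int and rebinds the name to it
count = 5
original = count          # keep a second reference to the original object
count += 1
int_identity_changed = count is not original

print(list_identity_unchanged)  # True
print(int_identity_changed)     # True
print(original)                 # 5
```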

## Section 2 &mdash; Hello, Lists!

Lists are ordered collections of elements. Lists in Python are a fundamental concept on which many other advanced constructs are based.

### Creating a list and accessing its elements

Create a named collection `months` containing the names of the months from January to June. Print to the screen the first month and the 4th month. Using negative indexes, get the month before last and the last month.


In [11]:
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]

print("First month:", months[0])
print("Fourth month:", months[3])

print("Last month:", months[-1])
print("Penultimate month:", months[-2])

First month: Jan
Fourth month: Apr
Last month: Jun
Penultimate month: May


### Unpacking a list into named variables

Use *unpacking* (also known as destructuring in other languages) to create a named variable for each of the months from January through June.

In [9]:
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]

jan, feb, mar, apr, may, jun = months

print(jan)
print(jun)

Jan
Jun


### Slicing lists

Python lets you extract a *slice* from a list using the following syntax:

```python
my_list[start_inclusive:end_exclusive]
```

Python is flexible enough to let you omit the start index (meaning start from the first element) or the end index (meaning continue through the last element).

Create a list with the numbers 1 thru 10 (inclusive). 
+ Obtain the sublist of elements from the second to the 5th (inclusive) and print the obtained list. Print also the length of the list using the `len()` function.

+ Obtain the sublist of elements from the 2nd to the one before last. Print the list and the len.

+ Do the same with the sublist of elements from the 2nd to last.

+ Repeat with the sublist of elements from the first to the one before last.

In [17]:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def printListAndLen(l):
    print(f"sublist: {l}, len={len(l)}")

sublist = nums[1:5]
printListAndLen(sublist)

# note that nums[-1] is 10, but the slice nums[s:-1] stops before it because the end index is exclusive
sublist = nums[1:-1]
printListAndLen(sublist)

sublist = nums[1:]
printListAndLen(sublist)

sublist = nums[:-1]
printListAndLen(sublist)


sublist: [2, 3, 4, 5], len=4
sublist: [2, 3, 4, 5, 6, 7, 8, 9], len=8
sublist: [2, 3, 4, 5, 6, 7, 8, 9, 10], len=9
sublist: [1, 2, 3, 4, 5, 6, 7, 8, 9], len=9


### More unpacking and indexing

Given the list of strings `["jane", "john", "jill", "jack"]`, define a variable that gets jack, and another variable that gets the rest of the names.

In [18]:
names = ["jane", "john", "jill", "jack"]

jack = names[-1]
rest = names[:-1]

print(jack)
print("rest:", rest)

jack
rest: ['jane', 'john', 'jill']


### More unpacking and indexing (II)

| NOTE: |
| :---- |
| This exercise requires tuples. |

Given the following list:

```python
[("jane", 21), ("john", 32), ("jill", 45), ("jack", 23)]
```

Define a variable that gets jack's name and his associated value.

Use the following syntax `print(f"Hello to {jack!r} who turns {jack_value} today!")` to print the results.

In [30]:
friends = [("jane", 21), ("john", 32), ("jill", 45), ("jack", 23)]

name, age = friends[-1]
print("Jack's name:", name)
print("Jack's age:", age)

print(f"Hello to {name!r} who turns {age} today!")

Jack's name: jack
Jack's age: 23
Hello to 'jack' who turns 23 today!


Alternatively, you can use unpacking with `*_` which is used to discard all but the last value:

In [31]:
*_, (jack_name, jack_age) = friends

print(f"Hello to {jack_name!r} who turns {jack_age} today!")

Hello to 'jack' who turns 23 today!


### More on using the starred `*` expression when unpacking

When a sequence contains many items, it is common to use the `*` operator.

Consider the following example, in which we have a player's scores for a gymnastics event. The scores are sorted in ascending order.

Let's calculate:
+ min score
+ max score
+ middle scores (all but the first and last)
+ average score

In [5]:
player_scores = [6.1, 6.5, 6.8, 7.1, 7.3, 7.6, 8.2, 8.9]

min_score = player_scores[0]
max_score = player_scores[-1]
middle_scores = player_scores[1:-1]

avg_score = sum(player_scores) / len(player_scores)

print(f"min_score={min_score}")
print(f"max_score={max_score}")
print(f"middle_scores={middle_scores}")
print(f"average_score={avg_score}")


min_score=6.1
max_score=8.9
middle_scores=[6.5, 6.8, 7.1, 7.3, 7.6, 8.2]
average_score=7.312499999999999


A more Pythonic way makes use of the starred operator while unpacking:

In [6]:
player_scores = [6.1, 6.5, 6.8, 7.1, 7.3, 7.6, 8.2, 8.9]

min_score, *middle_scores, max_score = player_scores
avg_score = sum(player_scores) / len(player_scores)

print(f"min_score={min_score}")
print(f"max_score={max_score}")
print(f"middle_scores={middle_scores}")
print(f"average_score={avg_score}")

min_score=6.1
max_score=8.9
middle_scores=[6.5, 6.8, 7.1, 7.3, 7.6, 8.2]
average_score=7.312499999999999


When using the starred expression `*var_name`:
+ all items not denoted by other variables are captured by `var_name`
+ a starred expression produces a list object of the captured items
+ the number of captured items might be zero
+ one assignment can only use one starred expression

In [7]:
first_item, *mid_items, last_item = ["a", "z"]
assert mid_items == []

### Denoting unwanted items with underscore `_` when unpacking

It's common to use `_` to remove distraction when capturing unwanted items while unpacking:

In [8]:
task = (1001, "Laundry", "Wash Clothes", "completed")
task_id, _, _, task_status = task

assert task_id == 1001
assert task_status == "completed"

But it is even more Pythonic to use the starred expression:

In [9]:
task = (1001, "Laundry", "Wash Clothes", "completed")
task_id, *_, task_status = task

assert task_id == 1001
assert task_status == "completed"

### Concatenating lists

Create two lists with the numbers 1, 2, 3 and 4, 5, 6. Concatenate them in a new list and print the results.

In [20]:
nums1 = [1, 2, 3]
nums2 = [4, 5, 6]

union_list = nums1 + nums2
print(union_list)


[1, 2, 3, 4, 5, 6]


### Indexing from the back of a list

Create a list with the numbers 1 thru 10 (inclusive).

Use negative indices to obtain the elements from the back of the list so that you can obtain the last element, and the one before the last element.

Using the syntax used for slicing, obtain the sublist containing the elements from the second to the one before last (included).

In [21]:
nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

last = nums[-1]
before_last = nums[-2]
sublist = nums[1:-1]

print("last:", last)
print("before_last:", before_last)
print("sublist:", sublist)

last: 10
before_last: 9
sublist: [2, 3, 4, 5, 6, 7, 8, 9]


### Creating and accessing list of lists

Create a list with 3 elements, those elements being:
+ 1, 2, 3
+ 4, 5, 6
+ 7, 8, 9

Then print the following elements:
+ third element of the first list
+ first element of the second list
+ second element of the third list

In [24]:
l = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

print(l[0][2]) # 3
print(l[1][0]) # 4
print(l[2][1]) # 8

3
4
8


### Iterating over the elements of a list

Create a list with the months from January thru June. Iterate over the elements of the list using `for`. Print the results.

In [25]:
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]

for m in months:
    print(m)

Jan
Feb
Mar
Apr
May
Jun


### Appending items to a list programmatically

The Python expression `range(start, end)` returns the collection of numbers from `start` to `end - 1`.

Define an empty list and use a `for` loop to populate it programmatically using `append`. Then display the resulting list.

In [26]:
nums = []
for num in range(0, 10):
    nums.append(num)

nums

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

### Sorting a list of numbers

The `sorted()` built-in function can be used to sort a list. It optionally allows you to pass a function that identifies the key by which objects can be sorted.

Use the `sorted` function to sort a list of 10 random floating point numbers.

In [29]:
from random import random


nums = [random() for _ in range(0, 10)]
print("unsorted:", nums)

sorted_nums = sorted(nums)
print("sorted:", sorted_nums)

unsorted: [0.9930853928735712, 0.6051511170458547, 0.2898236134665778, 0.14020962361460065, 0.8587867475618877, 0.5484743695017377, 0.22068647740013325, 0.30042732381492077, 0.6243074788122748, 0.18171398934428906]
sorted: [0.14020962361460065, 0.18171398934428906, 0.22068647740013325, 0.2898236134665778, 0.30042732381492077, 0.5484743695017377, 0.6051511170458547, 0.6243074788122748, 0.8587867475618877, 0.9930853928735712]


### Sorting a list of objects

| NOTE: |
| :---- |
| The following exercise requires OOP. |

The `sorted()` function allows you to pass a `key` function that determines how the objects within a list are sorted.

Consider the following class that models some person attributes:


```python
class Person:
  def __init__(self, name, age):
    self.name = name
    self.age = age

  def __repr__(self):
    return "Person(name=" + self.name + ", age=" + str(self.age) + ")"
```

Create a list of objects, and sort them according to their age using the `sorted()` function.

In [33]:
class Person:
  def __init__(self, name, age):
    self.name = name
    self.age = age

  def __repr__(self):
    return "Person(name=" + self.name + ", age=" + str(self.age) + ")"


friends = [Person("Jane", 21), Person("John", 32), Person("Jill", 45), Person("Jack", 23)]
print("unsorted:", friends)

sorted_by_age = sorted(friends, key=lambda f : f.age)
print("sorted:", sorted_by_age)


unsorted: [Person(name=Jane, age=21), Person(name=John, age=32), Person(name=Jill, age=45), Person(name=Jack, age=23)]
sorted: [Person(name=Jane, age=21), Person(name=Jack, age=23), Person(name=John, age=32), Person(name=Jill, age=45)]


### Reversing a list elements with `reverse()`

The elements within a list can be *reversed* using the `reverse()` method. Note that this method *mutates* the given list, that is, it performs the operation in place.

Use the `reverse()` method to reverse the contents of a list containing the numbers 1 thru 5.

| NOTE: |
| :---- |
| There is also a built-in `reversed()` function, which returns a reverse iterator and leaves the original list unchanged. |

In [36]:
l = [1, 2, 3, 4, 5]

l.reverse()
print(l)

[5, 4, 3, 2, 1]

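As a complement to the note above, here is a minimal sketch of the built-in `reversed()` function, which returns an iterator over the elements in reverse order and leaves the original list untouched. The variable names are illustrative.

```python
nums = [1, 2, 3, 4, 5]

# reversed() returns an iterator; wrap it in list() to materialize the items
reversed_nums = list(reversed(nums))

print(reversed_nums)  # [5, 4, 3, 2, 1]
print(nums)           # [1, 2, 3, 4, 5] (unchanged)
```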

### Concatenating the elements of a list to a string using `str.join()`

The elements of a list can be concatenated into a single string using `str.join()`.

Using this approach, convert a list of characters, a list of strings, and a list of numbers into a single string each using `join`.

HINT 1: Consider using `"".join()`

HINT 2: To convert a list of numbers into a single string you will need to convert each of its elements to a string. Consider using `map()` to do so. As the conversion is simple, you might define an anonymous lambda function for that.

In [38]:
chars = ['a', 'b', 'c']
strings = ["alpha", "beta", "gamma"]
nums = [1, 2, 3, 4, 5]

print("".join(chars))
print("".join(strings))
print("".join(map(lambda num : str(num), nums)))

abc
alphabetagamma
12345


Note that the string on which `join` is called acts as the delimiter, so that you can write:

In [63]:
style_settings = ["font-size=large", "font=Arial", "color=black", "align=center"]
print(", ".join(style_settings))

font-size=large, font=Arial, color=black, align=center


### Splitting a string to create a list of strings

The `split` method locates the specified delimiter within a string and returns the list of substrings found between its occurrences.

In [2]:
task_data = """1001,Homework,5
1002,Laundry,3
1003,Grocery,4"""

processed_tasks = []
for data_line in task_data.split("\n"):
    processed_task = data_line.split(",")
    processed_tasks.append(processed_task)

print(processed_tasks)

[['1001', 'Homework', '5'], ['1002', 'Laundry', '3'], ['1003', 'Grocery', '4']]


The `split` and `rsplit` methods share the same calling signature: both take an argument that specifies the separator and another that specifies the maximum number of splits to perform, so the resulting list has at most one more item than that number.

The difference shows up when the string contains more separators than the maximum number of splits: `split` starts from the left and `rsplit` from the right.

In [4]:
lines_data = """This is line 1
This is line 2
This is line 3
This is line 4
This is line 5"""

lines_using_split = lines_data.split("\n", 3)
lines_using_rsplit = lines_data.rsplit("\n", 3)

print(f"using split : {lines_using_split}")
print(f"using rsplit: {lines_using_rsplit}")


using split : ['This is line 1', 'This is line 2', 'This is line 3', 'This is line 4\nThis is line 5']
using rsplit: ['This is line 1\nThis is line 2', 'This is line 3', 'This is line 4', 'This is line 5']


### Find a first match in a list (or iterable)

Given a list containing the names "Linda", "Tiffany", "Florina", and "Jovann", use the method `index()` to find the first name whose length is 7.

| NOTE: |
| :---- |
| `list.index(val)` returns the index of the first element of the list equal to `val`, and raises a `ValueError` if there is none. |

In [40]:
names = ["Linda", "Tiffany", "Florina", "Jovann"]
lengths = [len(s) for s in names]

print(lengths.index(7))
print(names[lengths.index(7)])

1
Tiffany

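`index()` raises a `ValueError` when the value is absent. A possible alternative, sketched below, is `next()` over a generator expression with a default value; this finds the first match by a condition without building the intermediate `lengths` list. The `None` default and the name `"Zoe"` are choices made for this example.

```python
names = ["Linda", "Tiffany", "Florina", "Jovann"]

# First name with 7 characters, or None if there is no such name
first_match = next((name for name in names if len(name) == 7), None)
print(first_match)  # Tiffany

# No name has 10 characters, so the default is returned instead of raising
no_match = next((name for name in names if len(name) == 10), None)
print(no_match)  # None

# index() raises ValueError when the value is not in the list
try:
    names.index("Zoe")
except ValueError as err:
    print("not found:", err)
```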

### Using `in` to check if an item belongs to a list

You can use `in` to check if an item is contained in a list.

Illustrate with an example.

In [2]:
l = ["one", 2, "a", False]

print("foo" in l)
print("a" in l)

False
True


### The empty list `[]`

You can define an empty list using `[]`. This is useful as a starting point when you want to add items to the list later.
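
A minimal sketch of this pattern follows; the list name and values are illustrative.

```python
squares = []            # start with an empty list
print(len(squares))     # 0

for n in range(1, 4):
    squares.append(n * n)

print(squares)          # [1, 4, 9]

# An empty list is falsy, which allows a concise emptiness check
if not []:
    print("the list is empty")
```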

### Lists are mutable

Python lists are mutable, and therefore their items can be modified after having been inserted into the list. The list itself is also mutable, so you can append or remove elements.

Illustrate with a simple example.

In [6]:
l = ["one", 2, "a", False]

# elements can be modified
l[1] = 14
print(l)

# the list itself can be modified
l.append(3.1415)
print(l)

# remove by index
l.pop(0)
print(l)

['one', 14, 'a', False]
['one', 14, 'a', False, 3.1415]
[14, 'a', False, 3.1415]


### Adding elements to a list

Items can be added to a list using its `append()` and `extend()` methods, and the `+` and `+=` operators.

The details are:
+ `append()` &mdash; adds a new item at the end of the list.
+ `extend()` &mdash; appends the items of the given iterable to the existing list.
+ `+=` &mdash; appends the items of the list on the right-hand side to the list on the left-hand side, in place.
+ `+` &mdash; concatenates the given lists into a new list.

Additionally, you can use the `insert()` method to insert an item at a specific position in the list.

Write a program following these steps:
+ define an empty list named `items`
+ Add an element `"one"` using `append`.
+ Add an element `"two"` after the previous one using `append`.
+ Add the element `"three"` using `extend`.
+ Add the elements `"four"` and `"five"` in one shot using `extend`.
+ Do the same using `+=` with element `"six"`, and then `"seven"`, and `"eight"`.
+ Create a list by combining `[1, 2, 3]`, and `[4, 5, 6, 7, 8]`.
+ Insert the element `0` at the beginning of the previous list.
+ Insert the element `3.5` between `3` and `4`.


In [12]:
items = []

items.append("one")
print(items)

items.append("two")
print(items)

items.extend(["three"])
print(items)

items.extend(["four", "five"])
print(items)

items += ["six"]
print(items)

items += ["seven", "eight"]
print(items)

l = [1, 2, 3] + [4, 5, 6, 7, 8]
print(l)

l.insert(0, 0)
print(l)

l.insert(4, 3.5)
print(l)

['one']
['one', 'two']
['one', 'two', 'three']
['one', 'two', 'three', 'four', 'five']
['one', 'two', 'three', 'four', 'five', 'six']
['one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight']
[1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 3.5, 4, 5, 6, 7, 8]


### Removing items by value

Items can be removed from a list using `remove(item)`.

Define a list `n` with contents `["one", "two", "three", "two"]`, and issue a `remove("two")` statement. Confirm that only the first element is removed.

In [13]:
n = ["one", "two", "three", "two"]
n.remove("two")
print(n)

['one', 'three', 'two']


### Copying a list using slicing

Copy the contents of a list using slices.

In [14]:
nums = [1, 2, 3, 4, 5]
nums_copy = nums[:]

print(nums)
print(nums_copy)

[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]


### More on sorting lists

A list can be sorted in place using the `sort` method. The method requires the items of the list to be comparable with each other. If they are not, you'll get a `TypeError`.

In [17]:
items = [1, 4, 2, 6, 3, 5]
print(items)

items.sort()
print(items)

[1, 4, 2, 6, 3, 5]
[1, 2, 3, 4, 5, 6]


In [19]:
try:
    l = ["one", "two", 'a', False]
    l.sort()
except Exception as e:
    print(e)

'<' not supported between instances of 'bool' and 'str'


The `sorted()` built-in function returns a new sorted list and leaves the original list unchanged:

In [21]:
l = [1, 4, 2, 6, 3, 5]

print(sorted(l))
print(l)

[1, 2, 3, 4, 5, 6]
[1, 4, 2, 6, 3, 5]


Both `sort` and `sorted` accept a `key` argument (and a `reverse` flag) to control how the comparison is performed:

In [22]:
friends = ["Eloy", "carlos", "Antonio", "ascen", "gloria"]
orig_l = friends[:]

friends.sort()  # default

print(orig_l)
print(friends)
print(sorted(orig_l, key=str.lower))

['Eloy', 'carlos', 'Antonio', 'ascen', 'gloria']
['Antonio', 'Eloy', 'ascen', 'carlos', 'gloria']
['Antonio', 'ascen', 'carlos', 'Eloy', 'gloria']

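Besides `key`, `sort()` and `sorted()` accept a `reverse` flag. The following short sketch combines both arguments, reusing the `friends` list from the example above; the name `descending` is made up for this example.

```python
friends = ["Eloy", "carlos", "Antonio", "ascen", "gloria"]

# Case-insensitive, descending order; sorted() returns a new list
descending = sorted(friends, key=str.lower, reverse=True)
print(descending)  # ['gloria', 'Eloy', 'carlos', 'ascen', 'Antonio']

# Sort in place by length; ties keep their original relative order (stable sort)
friends.sort(key=len)
print(friends)  # ['Eloy', 'ascen', 'carlos', 'gloria', 'Antonio']
```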

## Section 3 &mdash; Tuples

Unlike lists, tuples are immutable collections of elements. The elements can be of any type.

Tuple elements can be accessed pretty much like list elements.

### Creating tuples statically

Create:
+ `tuple_1`: the tuple containing the elements 1 and 2
+ `tuple_2`: the tuple containing the elements a, b, c, and d
+ `tuple_3`: the tuple containing the elements 1 through 5 without using parentheses

In [1]:
tuple_1 = (1, 2)  # normal syntax
tuple_2 = ('a', 'b', 'c', 'd')
tuple_3 = 1, 2, 3, 4, 5

print(tuple_1)
print(tuple_2)
print(tuple_3)

(1, 2)
('a', 'b', 'c', 'd')
(1, 2, 3, 4, 5)


### Accessing tuple elements

Using the tuples defined in the previous exercise, print:
+ The first and second element of `tuple_1`
+ The one before last and last element of `tuple_2`
+ The tuple consisting of elements 2nd thru 4th of `tuple_3`

In [4]:
print(tuple_1[0]) # 1
print(tuple_1[1]) # 2

print(tuple_2[-2]) # c
print(tuple_2[-1]) # d

print(tuple_3[1:4])

1
2
c
d
(2, 3, 4)


## Section 4 &mdash; Sets

Sets are another built-in collection available in Python. They are used to hold distinct elements when the order of elements in the container is not important.

### Creating Sets statically

Create a set with the elements 0 through 5

In [8]:
num_set = {0, 1, 2, 3, 4, 5}
print(num_set)

{0, 1, 2, 3, 4, 5}


### Removing duplicates from a List with a Set

Generate a list of 100 random integers from 0 to 9. Create a set out of the list and print the results.

In [1]:
from random import randint

nums = [randint(0, 9) for _ in range(0, 100)]
distinct_nums = set(nums)
print(distinct_nums)

{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}


### Complement, Union, and Intersection operations on Sets

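Sets support union, intersection, and difference. There is no dedicated complement operation, but the complement of a set can be computed as the difference between the sample space and that set.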

Consider the following sets:
+ The sample space of an experiment $ S $, denoted as $ S = \{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 \}$.

+ The events:
  + $ A = \{ 0, 2, 4, 6, 8 \}$, 
  + $ B = \{ 1, 3, 5, 7, 9 \}$, 
  + $ C = \{ 2, 3, 4, 5 \} $, 
  + and $ D = \{ 1, 6, 7 \} $.


Use Python to calculate:

1. $ A \cup C $
2. $ A \cap B $
3. $ C' $
4. $ (C' \cap D) \cup B $
5. $ (S \cap C)' $
6. $ A \cap C \cap D' $

In [12]:
S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
A = {0, 2, 4, 6, 8}
B = {1, 3, 5, 7, 9}
C = {2, 3, 4, 5}
D = {1, 6, 7}

print("A ∪ C =", A.union(C))
print("A ∩ B =", A.intersection(B)) # Note that the empty set is denoted as set()
print("C' =", S.difference(C))
print("(C' ∩ D) ∪ B =", (S.difference(C).intersection(D)).union(B))
print("(S ∩ C)' =", S.difference(S.intersection(C)))
print("A ∩ C ∩ D' =", A.intersection(C).intersection(S.difference(D)))


A ∪ C = {0, 2, 3, 4, 5, 6, 8}
A ∩ B = set()
C' = {0, 1, 6, 7, 8, 9}
(C' ∩ D) ∪ B = {1, 3, 5, 6, 7, 9}
(S ∩ C)' = {0, 1, 6, 7, 8, 9}
A ∩ C ∩ D' = {2, 4}


| NOTE: |
| :---- |
| You can use `&` for intersection, `\|` for union, and `-` for difference. |

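As a short sketch of the operator forms, some of the expressions above can be rewritten as follows. The result names (`union_ac`, `complement_c`, `expr`) are made up for this example.

```python
S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
A = {0, 2, 4, 6, 8}
B = {1, 3, 5, 7, 9}
C = {2, 3, 4, 5}
D = {1, 6, 7}

union_ac = A | C                  # A ∪ C
complement_c = S - C              # C'
expr = ((S - C) & D) | B          # (C' ∩ D) ∪ B

print(union_ac)
print(complement_c)
print(expr)
```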
### Checking if a set is a superset of other

You can check if a set is a proper superset of another using `>`; use `>=` (or the `issuperset()` method) when the two sets may also be equal.

In [32]:
set1 = { "Idris", "Jason", "Kenneth"}
set2 = { "Jason" }

print(set1 > set2)

True

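The difference between the comparison operators can be sketched as follows, reusing the sets from the example above:

```python
set1 = {"Idris", "Jason", "Kenneth"}
set2 = {"Jason"}

print(set1 > set2)            # True: proper superset
print(set1 >= set1)           # True: every set is a superset of itself
print(set1 > set1)            # False: a set is never a proper superset of itself
print(set1.issuperset(set2))  # True: method form of >=
print(set2 <= set1)           # True: subset check
```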

## Section 5 &mdash; Dictionaries

Dictionaries are collections of key-value pairs.

### Creating static dictionaries

Create a dictionary for a dog whose keys are name and age and populate it with values.

Then access the individual values to print a message for the dog.

Try to access a non-existing key such as "breed". Try to see how to prevent the runtime error.

In [12]:
dog = {
    "name": "Mara",
    "age": 7
}

print(dog)

print(f"Congrats to {dog['name']} who turns {dog['age']}")

# print("breed:", dog["breed"]) # raises KeyError: 'breed'
if "breed" in dog:
    print(dog["breed"])
else:
    print("There is no breed info for", dog["name"])

{'name': 'Mara', 'age': 7}
Congrats to Mara who turns 7
There is no breed info for Mara


### Using `get` and specifying default values

You can get elements from a dictionary using the syntax `d["key"]`, but that operation fails if the key does not exist in the dictionary.

Alternatively, you can use the `get` method which allows you to specify a default value:

In [25]:
p = { "name": "Jason", "age": 54}

print(p["name"])
print(p.get("name"))
#  print(p["nationality"])  # key error: "nationality"
print(p.get("nationality")) # None
print(p.get("nationality", "n/a"))

Jason
Jason
None
n/a


### Converting a dictionary into a list

A dictionary can be converted into a list using the `list()` built-in function.

Convert the `dog` dictionary defined earlier into a list and print the results. Did you get the expected results?

Try again using the `items` method on the dictionary before converting it into a list.

Then iterate through the corresponding list generating a report like the following:

```
key=<key-from-dict>, value=<value-from-dict>
```

In [7]:
l = list(dog)
print(l)        # prints the dictionary keys

list_items = list(dog.items())
print(list_items)   # prints tuples (key, val)

for key, val in list_items:
    print(f"key={key}, value={val}")

['name', 'age']
[('name', 'Mara'), ('age', 7)]
key=name, value=Mara
key=age, value=7


### Getting a list of a dictionary keys and values

In the same way that you can use `items()` to get the key-value pairs, you can use `keys()` to get the keys and `values()` to get the values. These methods return view objects rather than lists.

The result can be further transformed into a list using `list`.

In [28]:
p = { "name": "Jason", "age": 54}

print(p.keys())
print(list(p.keys()))

print(p.values())
print(list(p.values()))

dict_keys(['name', 'age'])
['name', 'age']
dict_values(['Jason', 54])
['Jason', 54]


### Removing keys from a dictionary using `del`

You can remove keys from a dictionary using `del`:

In [29]:
p = { "name": "Jason", "age": 54}

del p["age"]

print(p)



{'name': 'Jason'}


### Copying a dictionary using `copy`

You can create a copy of a dictionary using `copy`:



In [31]:
capitals_by_country = {
    'USA': 'Washington D.C.',
    'France': 'Paris',
    'Germany': 'Berlin'
    }

capitals_copy = capitals_by_country.copy()
capitals_by_country["Spain"] = "Madrid"

print(capitals_by_country)
print(capitals_copy)


{'USA': 'Washington D.C.', 'France': 'Paris', 'Germany': 'Berlin', 'Spain': 'Madrid'}
{'USA': 'Washington D.C.', 'France': 'Paris', 'Germany': 'Berlin'}


### Dictionary generators

| NOTE: |
| :---- |
| This exercise requires generators and list comprehensions. |

In the same way that you can use list comprehensions as a compact syntax to generate lists, you can use a *dictionary comprehension* with the following syntax to generate dictionary objects.

```python
{key:value for elem in elems}
```

Use this approach to create a frequency map for the characters found in a string.

For example, the string "jason isaacs" should produce the following dictionary

```python
{
  ' ': 1,
  'j': 1,
  'a': 3,
  's': 3,
  'o': 1,
  'n': 1,
  'i': 1,
  'c': 1
}
```

In [13]:
def get_freq_map(text):
    # initialize every distinct character with a zero count
    freq_map = {c: 0 for c in text}
    for c in text:
        freq_map[c] += 1
    return freq_map

print(get_freq_map("jason isaacs"))

{'j': 1, 'a': 3, 's': 3, 'o': 1, 'n': 1, ' ': 1, 'i': 1, 'c': 1}
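
For comparison, the standard library already ships a class for this kind of counting. A minimal sketch using `collections.Counter`, which is a `dict` subclass:

```python
from collections import Counter

freq_map = Counter("jason isaacs")

print(freq_map["s"])            # count for a single character
print(freq_map.most_common(2))  # the two most frequent characters
```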


### Creating a dictionary from two lists

In Python, it is fairly common to work with lists in parallel. For example, for plotting libraries you might have a list that keeps track of all the X-axis values, and another list that holds the Y-axis values.

At some point, you might need to create a dictionary that *zips* them together.

Consider the data below, in which you have a list with the IDs of certain tasks to be performed, and the corresponding titles for those tasks. The exercise consists of creating a dictionary whose keys are the task IDs and whose values are the corresponding titles.

That is, the desired output is:

```python
desired_output = {101: "Laundry", 102: "Homework", 103: "Soccer"}
```

In [5]:
id_numbers = [101, 102, 103]
titles = ["Laundry", "Homework", "Soccer"]

# First attempt: works, but not very Pythonic
desired_output = dict()
for i in range(len(id_numbers)):
    desired_output[id_numbers[i]] = titles[i]

# Dict equality compares keys and values only, so insertion order does not affect this assertion
assert desired_output == {101: "Laundry", 102: "Homework", 103: "Soccer"}

The previous solution can be improved with the `zip` function and dictionary comprehension:

In [6]:
id_numbers = [101, 102, 103]
titles = ["Laundry", "Homework", "Soccer"]

desired_output = {k: v for k, v in zip(id_numbers, titles)}

# Dict equality compares keys and values only, so insertion order does not affect this assertion
assert desired_output == {101: "Laundry", 102: "Homework", 103: "Soccer"}
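
Because `dict` accepts an iterable of key-value pairs, the comprehension can be shortened even further. A sketch of the same exercise:

```python
id_numbers = [101, 102, 103]
titles = ["Laundry", "Homework", "Soccer"]

# zip yields (id, title) pairs, which dict consumes directly
desired_output = dict(zip(id_numbers, titles))

assert desired_output == {101: "Laundry", 102: "Homework", 103: "Soccer"}
```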

### Using `dict.fromkeys`

The class method `dict.fromkeys` lets you create a dictionary whose keys are given in an iterable passed as an argument. By default, every key is mapped to `None`:

In [7]:
reading_params = dict.fromkeys(["statuses", "urgencies", "content"])
print(reading_params)

{'statuses': None, 'urgencies': None, 'content': None}


You can also pass the initial value in another argument. That value will be assigned to each and every key:

In [9]:
nums = dict.fromkeys(["one", "two", "three", "catorce"], 5)

print(nums)

{'one': 5, 'two': 5, 'three': 5, 'catorce': 5}
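
One caveat worth knowing: the initial value is not copied, so every key refers to the *same* object. That is harmless for immutable values such as `5`, but can be surprising for mutable ones. A small sketch (the key names are arbitrary):

```python
shared = dict.fromkeys(["a", "b"], [])
shared["a"].append(1)
print(shared)        # both keys see the change because they share one list

# A dictionary comprehension creates a fresh list per key
independent = {key: [] for key in ["a", "b"]}
independent["a"].append(1)
print(independent)
```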


## Section 6 &mdash; Ranges

A range is an immutable sequence of integers. The `range()` function returns an object that you can iterate over; note that the upper bound is not included.

### Creating and iterating over ranges

Create a range object containing the numbers 0 through 9, print it, and iterate over it printing only the even numbers.

In [15]:
nums = range(0, 10)  # the upper bound is exclusive, so this covers 0 through 9

print(nums) # ranges are not materialized by default

for num in nums:
    if num % 2 == 0:
        print(f"{num} is an even number")

range(0, 10)
0 is an even number
2 is an even number
4 is an even number
6 is an even number
8 is an even number


### Materializing ranges

Create a range of the odd numbers from 5 to 15. Materialize it and print the resulting list.

In [16]:
odd_nums = range(5, 16, 2)
print(list(odd_nums))

[5, 7, 9, 11, 13, 15]
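
Ranges also support a negative step, `len`, indexing, and membership tests without being materialized. A short sketch (the `countdown` name is just for illustration):

```python
countdown = range(10, 0, -2)   # 10, 8, 6, 4, 2

print(list(countdown))
print(len(countdown))          # number of items, computed without iterating
print(countdown[1])            # indexing works as with lists
print(7 in range(5, 16, 2))    # membership test on the odd numbers from 5 to 15
```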


## Section 7 &mdash; List comprehensions

List comprehensions are a Python technique that lets you build lists in a succinct and declarative manner.

Mastering list comprehensions is fundamental to making your Python programs more idiomatic.

### Hello, list comprehensions

Create a list containing the cubes of the numbers from 0 to 9, first with an imperative approach based on a loop, and then with a declarative approach based on a list comprehension.

In [18]:
cubes = []
for num in range(0, 10):
    cubes.append(num ** 3)
print(cubes)

cubes = [ num ** 3 for num in range(0, 10)]
print(cubes)

[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
[0, 1, 8, 27, 64, 125, 216, 343, 512, 729]
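
A comprehension can also filter items with a trailing `if`. As a sketch, the following keeps only the cubes of the even numbers:

```python
even_cubes = [num ** 3 for num in range(0, 10) if num % 2 == 0]
print(even_cubes)
```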


### Nested loops in list comprehensions

Using list comprehensions, create a list of all the months in 2020, 2021, 2022, and 2023, so that the resulting list looks like:

```
Jan, 2020
Feb, 2020
Mar, 2020
...
Dec, 2023
```

In [19]:
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
years = range(2020, 2024)

c = [m + ", " + str(y) for y in years for m in months]
print(c)

['Jan, 2020', 'Feb, 2020', 'Mar, 2020', 'Apr, 2020', 'May, 2020', 'Jun, 2020', 'Jul, 2020', 'Aug, 2020', 'Sep, 2020', 'Oct, 2020', 'Nov, 2020', 'Dec, 2020', 'Jan, 2021', 'Feb, 2021', 'Mar, 2021', 'Apr, 2021', 'May, 2021', 'Jun, 2021', 'Jul, 2021', 'Aug, 2021', 'Sep, 2021', 'Oct, 2021', 'Nov, 2021', 'Dec, 2021', 'Jan, 2022', 'Feb, 2022', 'Mar, 2022', 'Apr, 2022', 'May, 2022', 'Jun, 2022', 'Jul, 2022', 'Aug, 2022', 'Sep, 2022', 'Oct, 2022', 'Nov, 2022', 'Dec, 2022', 'Jan, 2023', 'Feb, 2023', 'Mar, 2023', 'Apr, 2023', 'May, 2023', 'Jun, 2023', 'Jul, 2023', 'Aug, 2023', 'Sep, 2023', 'Oct, 2023', 'Nov, 2023', 'Dec, 2023']


### More on nested list comprehensions

Using list comprehensions, create a list of lists with all the months from 2020 to 2023, so that the months of each year are kept in their own inner list.

That is, the list should look like:

```
[
  ["Jan, 2020", "Feb, 2020", ..., "Dec, 2020"],
  ["Jan, 2021", "Feb, 2021", ..., "Dec, 2021"],
  ["Jan, 2022", "Feb, 2022", ..., "Dec, 2022"],
  ["Jan, 2023", "Feb, 2023", ..., "Dec, 2023"],
]
```

In [23]:
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
years = range(2020, 2024)

# One inner list per year: the inner comprehension builds the months of a single year
c = [[m + ", " + str(y) for m in months] for y in years]
print(c)

# Note that this is also possible but different: it produces 48 single-item lists
c = [[m + ", " + str(y)] for y in years for m in months]
print(c)

[['Jan, 2020', 'Feb, 2020', 'Mar, 2020', 'Apr, 2020', 'May, 2020', 'Jun, 2020', 'Jul, 2020', 'Aug, 2020', 'Sep, 2020', 'Oct, 2020', 'Nov, 2020', 'Dec, 2020'], ['Jan, 2021', 'Feb, 2021', 'Mar, 2021', 'Apr, 2021', 'May, 2021', 'Jun, 2021', 'Jul, 2021', 'Aug, 2021', 'Sep, 2021', 'Oct, 2021', 'Nov, 2021', 'Dec, 2021'], ['Jan, 2022', 'Feb, 2022', 'Mar, 2022', 'Apr, 2022', 'May, 2022', 'Jun, 2022', 'Jul, 2022', 'Aug, 2022', 'Sep, 2022', 'Oct, 2022', 'Nov, 2022', 'Dec, 2022'], ['Jan, 2023', 'Feb, 2023', 'Mar, 2023', 'Apr, 2023', 'May, 2023', 'Jun, 2023', 'Jul, 2023', 'Aug, 2023', 'Sep, 2023', 'Oct, 2023', 'Nov, 2023', 'Dec, 2023']]
[['Jan, 2020'], ['Feb, 2020'], ['Mar, 2020'], ['Apr, 2020'], ['May, 2020'], ['Jun, 2020'], ['Jul, 2020'], ['Aug, 2020'], ['Sep, 2020'], ['Oct, 2020'], ['Nov, 2020'], ['Dec, 2020'], ['Jan, 2021'], ['Feb, 2021'], ['Mar, 2021'], ['Apr, 2021'], ['May, 2021'], ['Jun, 2021'], ['Jul, 2021'], ['Aug, 2021'], ['Sep, 2021'], ['Oct, 2021'], ['Nov, 2021'], ['Dec, 2021'], ['Jan, 2022'

### Combination of elements using list comprehensions and tuples

Create all the combinations of elements:
$$
(x, y) \text{ where } -1 \le x \le 5 \text{ and } 0 \le y \le 1
$$

In [24]:
combos = [(x, y) for x in range(-1, 6) for y in range(0, 2)]
print(combos)

[(-1, 0), (-1, 1), (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1), (4, 0), (4, 1), (5, 0), (5, 1)]
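
The same Cartesian product is also available in the standard library. A sketch using `itertools.product`, checked against the comprehension above:

```python
from itertools import product

# product yields every (x, y) pair, with the rightmost iterable varying fastest
combos = list(product(range(-1, 6), range(0, 2)))

assert combos == [(x, y) for x in range(-1, 6) for y in range(0, 2)]
print(len(combos))
```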


## Section 8 &mdash; Generators

Generators allow you to create iterables that are not materialized (i.e., the values are not stored in memory; they are produced on demand).

### Compute the sum of the squares of the first 1,000,000 integers

In order to illustrate when generators are needed, write a snippet that computes the sum of the squares of the first 1,000,000 integers.

In [2]:
upper_limit = 1_000_000 # a good practice to use _ as thousands separator

squares = [x * x for x in range(1, upper_limit + 1)]
sum_squares = sum(squares)
print(f"sum: {sum_squares}; memory used: {squares.__sizeof__()}")

sum: 333333833333500000; memory used: 8448712


A generator is a special kind of iterator that renders its items one by one. It doesn't need to *store* its items, which makes it very memory efficient when dealing with large amounts of data.

The previous memory-hungry implementation can be improved with a generator:

In [4]:
def perfect_squares(limit):
    n = 1
    while n <= limit:
        yield n * n
        n += 1

upper_limit = 1_000_000
squares_gen = perfect_squares(upper_limit)
sum_squares = sum(squares_gen)

assert sum_squares == 333333833333500000

print(squares_gen.__sizeof__())

88


`yield` is different from `return`. `return` terminates the current execution and gives control back to the caller for good. By contrast, `yield` pauses the current execution and gives control back to the caller temporarily: the function resumes right after the `yield` the next time a value is requested.

A *generator* is an iterator at its core, so consuming it involves the `next` function: each call runs the function body up to the next `yield`. When the body finishes, the generator is exhausted.

That's the reason why calling a generator function does not give you a yielded value: the call returns a generator object. To get the values, you can pass that object to the built-in `next()` (which ends up calling its `__next__` method), or use it in a construct that calls `__next__` behind the scenes (for loops, comprehensions...).

### Infinite generator

Create an iterable represented by a function `count()` that yields the integers starting from 0. Print the first three elements. Did you get the results you were expecting to get?

Use the same iterable in a `for` loop to print the elements up to 10. Can you explain the results?

In [30]:
def count():
    n = 0
    while True:
        yield n
        n += 1

# Printing the first three elems of the generator function
print(count())
print(count())
print(count())

# Using a for loop
for n in count():
    if n > 10:
        break
    print(n)

<generator object count at 0x7fa99a4a5a10>
<generator object count at 0x7fa99a4a5a10>
<generator object count at 0x7fa99a4a5a10>
0
1
2
3
4
5
6
7
8
9
10


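Calling a generator function does not run its body: each call returns a new generator object, which is why `print(count())` shows the object three times rather than the first three numbers. The `for` loop works because it requests values from a single generator object until the `break`.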
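
To actually print the first three elements, you can keep a single generator object and pull values from it with the built-in `next`. The sketch below redefines the same `count()` function so that it is self-contained, and also shows `itertools.islice` as one way to take a bounded slice of an infinite generator:

```python
from itertools import islice

def count():
    n = 0
    while True:
        yield n
        n += 1

counter = count()      # one generator object; the body has not started running yet
print(next(counter))   # runs until the first yield
print(next(counter))   # resumes right after the yield
print(next(counter))

# islice stops after five items, so the infinite generator is safe to consume
first_five = list(islice(count(), 5))
print(first_five)
```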

### Generators and list comprehensions

List comprehensions are a great way to materialize and collect values from generators.

Using the previously defined `count()` function as inspiration, create a new `count(start, end)` generator function that produces the numbers from `start` to `end`. Then use a list comprehension to *collect* the values from the generator.

In [28]:
def count(start, end):
    for n in range(start, end + 1):
        yield n

nums = [n for n in count(0, 10)]
print(nums)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


### Generator comprehensions

A generator comprehension is a special type of syntax that lets you define generators in a much more compact way.

Use the generator comprehension syntax and the regular syntax to create a generator that produces the squares from zero to 9.

In [31]:
# func syntax for generators
def gen_squares():
    for n in range(0, 10):
        yield n * n

some_squares = [s for s in gen_squares()]
print(some_squares)

# compact way: generator comprehension
squares = (x * x for x in range(0, 10))
some_squares = [s for s in squares]
print(some_squares)


[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


#### Using generator expressions (generator comprehensions)

Use the *generator expression*/*generator comprehension* syntax to create a generator that returns the squares of the first million integers.

In [5]:
upper_limit = 1_000_000
squares_gen_expr = (x * x for x in range(1, upper_limit + 1))

sum_squares = sum(squares_gen_expr)
assert sum_squares == 333333833333500000

#### Compact generator expression/generator comprehension

When a generator expression is the only argument in the invocation of a function, you don't need the extra parentheses:

In [6]:
upper_limit = 1_000_000

sum_squares = sum(x * x for x in range(1, upper_limit + 1))
assert sum_squares == 333333833333500000
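
One thing to watch for is that a generator can be consumed only once. A small sketch with a shorter range:

```python
squares = (x * x for x in range(1, 4))

first_pass = sum(squares)    # consumes every item: 1 + 4 + 9
second_pass = sum(squares)   # the generator is already exhausted

print(first_pass, second_pass)
```

If the values are needed more than once, either recreate the generator or materialize it into a list.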

### Fibonacci sequence using a generator

Create a generator function that returns the Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13...

In [7]:
def fib():
    # a and b hold two consecutive Fibonacci numbers
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


for n in fib():
    if n > 100:
        break
    print(n)

0
1
1
2
3
5
8
13
21
34
55
89


## Section 9 &mdash; The *zip* operation

The `zip()` built-in function lets you combine two or more iterables into a single iterable of tuples.

| NOTE: |
| :---- |
| By itself, invoking `zip()` does not materialize the resulting iterable. If you need to materialize the results, you will need to collect them by invoking a function such as `list()`, or similar. |

### Hello, zip!

Use the `zip` function to combine the list of numbers from 1 through 3 and the list of characters 'a', 'b', and 'c'.

In [32]:
zipped = zip(range(1, 4), ['a', 'b', 'c'])

# The zip operation does not materialize the resulting iterable
print(zipped)

# But you can materialize it invoking list() or using list comprehensions
print(list(zipped))

<zip object at 0x7fa99a4e3600>
[(1, 'a'), (2, 'b'), (3, 'c')]
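
The operation can be reversed by combining `zip` with the `*` operator. A sketch of this "unzip" idiom:

```python
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]

# *pairs passes each tuple as a separate argument to zip
numbers, letters = zip(*pairs)

print(numbers)
print(letters)
```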


### Zipping iterables with different numbers of items

By default, the `zip` function stops zipping when the shortest iterable is exhausted.

In [24]:
assert list(zip(range(3), range(4))) == [(0, 0), (1, 1), (2, 2)]

In Python 3.10, an optional parameter `strict` was introduced to change this behavior.

When it is set to `True`, a `ValueError` exception is raised if the iterables don't have the same length.

In [28]:
try:
    list(zip(range(3), range(4), strict=True))
except ValueError as e:
    print(f"Ooops: {e}")

Ooops: zip() argument 2 is longer than argument 1


### Using `zip_longest` to zip with the longer iterable

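In some cases, you might need to keep zipping until the longest iterable is exhausted. In those cases, you can rely on the `zip_longest` function from `itertools`, which fills in the missing values with `None` by default: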

In [29]:
from itertools import zip_longest

assert list(zip_longest(range(3), range(4))) == [
    (0, 0), (1, 1), (2, 2), (None, 3)
]
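
`zip_longest` also accepts a `fillvalue` argument to use something other than `None` for the missing items. A short sketch (the `"?"` placeholder is an arbitrary choice):

```python
from itertools import zip_longest

pairs = list(zip_longest("ab", [1, 2, 3], fillvalue="?"))
print(pairs)
```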

## Section 10 &mdash; The `*` (star or asterisk) operator

The `*` operator (aka star or asterisk operator) can mean different but related things in Python:

+ In a function declaration, it lets a single parameter collect a variable number of positional arguments. That is, it lets you implement variadic functions.

+ In a function call, it unpacks a list (or any other iterable) into separate positional arguments, so that it can be fed to a variadic function.

For more information see the corresponding section on [Increasing function flexibility with `*args`, `**kwargs` and `/`](#section-47--increasing-function-flexibility-with-args-kwargs-and).

### Hello, `*`

Create a function that adds a variable number of arguments received under a single parameter name. Then declare a list of the numbers from 0 to 10 and pass that list to the function.

In [3]:
# add is a variadic function
def add(*nums):
    total = 0
    for n in nums:
        total += n
    return total

# You can call it as a normal function
print(add(1, 2, 3))

# Creating a list and feeding it to the function
nums = [n for n in range(0, 11)]
# print(add(nums)) # raises TypeError: nums arrives as a single list argument, and int += list is unsupported
print(add(*nums))

6
55
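
The same operator also appears on the left-hand side of an assignment, where it collects the remaining items into a list. A sketch of this extended unpacking, along with unpacking inside a list literal:

```python
nums = list(range(0, 6))

first, *middle, last = nums   # middle collects whatever is left over
print(first, middle, last)

merged = [*nums, *"ab"]       # unpacking works with any iterable inside a list literal
print(merged)
```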


## Section 11 &mdash; Exceptions

Python provides exception handling with the *try-except* construct:

```python
try:
  # block that can raise
except Exception as ex:
  # handle exception
```

You can also optionally add a `finally` clause to execute some code independently of whether an exception was raised or not:

```python
try:
  # code for the happy path
except Exception as ex:
  # code for the sad path
finally:
  # code to execute after happy/exception path
```
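
As a runnable sketch of that flow (the `safe_divide` helper is hypothetical, introduced only for illustration), the following records which clauses are executed. It also uses the optional `else` clause, which runs only when no exception was raised:

```python
def safe_divide(a, b):
    steps = []
    try:
        result = a / b
    except ZeroDivisionError:
        steps.append("except")    # only when the division fails
        result = None
    else:
        steps.append("else")      # only when the try block succeeded
    finally:
        steps.append("finally")   # always runs
    return result, steps

print(safe_divide(6, 3))
print(safe_divide(6, 0))
```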

### Hello, exceptions!

Write a try-except block to capture the exception that is raised when you invoke the function `add` from the previous exercise with the list of numbers itself, that is, `add(nums)` instead of `add(*nums)`.

In [4]:
nums = [n for n in range(0, 10)]
try:
    add(nums)
except Exception as err:
    print("add failed:", err)

add failed: unsupported operand type(s) for +=: 'int' and 'list'


## Section 12 &mdash; String formatting/templating

Python provides several ways to format strings, including templatized strings.

### Hello, old-school formatting

Old-school formatting in Python was done using format specifiers such as the ones used in C and Go (`%s` for strings, `%d` for integers) and the `%` operator.

Use this old-school formatting to create a function `birthday(name, age)` that returns a string greeting that interpolates the received parameters.

In [6]:
def birthday(name, age):
    return "Hello to %s who turns %d tomorrow" % (name, age)

birthday("Adri", 15)

'Hello to Adri who turns 15 tomorrow'

### The `format()` method

Python strings also provide the `format()` method for string formatting purposes.

Create a program that uses `format` to:

+ print the message "My favorite vector is (2, 5)", with `(2, 5)` being a variable.
+ interpolate values in the greeting message "Hello to {0}, who is turning {2} tomorrow, and whose favorite number is {1}".

In [8]:
v = (2, 5)
print("My favorite vector is {0}".format(v))

print("Hello to {0}, who is turning {2} tomorrow, and whose favorite number is {1}".format("Adri", 5, 15))

My favorite vector is (2, 5)
Hello to Adri, who is turning 15 tomorrow, and whose favorite number is 5
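
Besides positional indexes, `format` also accepts named placeholders, which tend to be easier to read. A sketch (the `template` and `person` names are illustrative):

```python
template = "Hello to {name}, who is turning {age} tomorrow"
print(template.format(name="Adri", age=15))

person = {"name": "Adri", "age": 15}
print(template.format(**person))   # a dictionary can be unpacked into the named fields
```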


### Using `f"..."` for templatized strings

Modern versions of Python (3.6 and later) also support templatized strings, known as f-strings, using the `f"...{var}..."` syntax.

Repeat the previous exercise using templatized strings.

In [9]:
v = (2, 5)
print(f"My favorite vector is {v}")

class Person:
    def __init__(self, name, age, favorite_num):
        self.name = name
        self.age = age
        self.favorite_num = favorite_num

adri = Person("Adri", 15, 5)
print(f"Hello to {adri.name}, who is turning {adri.age} tomorrow, and whose favorite number is {adri.favorite_num}")

My favorite vector is (2, 5)
Hello to Adri, who is turning 15 tomorrow, and whose favorite number is 5


### Using `b"hello"` for getting the string bytes

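The syntax `b"...string..."` creates a `bytes` literal instead of a `str`. Iterating over a `bytes` object yields the integer value of each byte. Note that bytes literals can only contain ASCII characters.

Use this syntax to get the bytes out of the string " ABC Hello".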

In [14]:
str_bytes = b" ABC Hello"

print(str_bytes)

for b in str_bytes:
    print(b)

b' ABC Hello'
32
65
66
67
32
72
101
108
108
111
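
The `b"..."` prefix only works for literals. For an existing string, or for text with non-ASCII characters, `str.encode` produces the bytes and `bytes.decode` reverses the operation. A sketch, assuming UTF-8 as the encoding (the sample text is made up):

```python
text = "Adrián"                  # contains a non-ASCII character
encoded = text.encode("utf-8")   # str -> bytes

print(encoded)
print(len(text), len(encoded))   # number of characters vs. number of bytes

assert encoded.decode("utf-8") == text  # bytes -> str round trip
```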


### Applying specifiers to format f-strings

You can use the following approach to define how an interpolated expression should be formatted:

```python
f"Hello, {expr:[padding_char]{<,^,>}width}"
```

+ `expr` is the interpolated expression.
+ `padding_char` is an optional character used to fill the extra space (a space by default).
+ `<`, `^`, `>` set the alignment to left, center, and right respectively.
+ `width` is an integer that sets the minimum width of the formatted field.
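
A quick sketch of each alignment applied to a three-character string with a width of 7 (the `word` variable is just an example value):

```python
word = "abc"

assert f"{word:<7}" == "abc    "    # left-aligned, padded with spaces
assert f"{word:^7}" == "  abc  "    # centered
assert f"{word:>7}" == "    abc"    # right-aligned
assert f"{word:*^7}" == "**abc**"   # centered, padded with '*'
```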

As an exercise, the following snippet should output something like:

```
task_id  task_name  task_urgency
   1     Homework         5
   2     Laundry          3
```

In [17]:
task_ids = [1, 2, 3, 99999]
task_names = ["Do homework", "Laundry", "Pay bills", "012345678901"]
task_urgencies = [5, 3, 4, 999]

print("task_id  task_name     task_urgency")
# zip walks the three parallel lists together, avoiding index bookkeeping
for task_id, name, urgency in zip(task_ids, task_names, task_urgencies):
    print(f"{task_id:^7}  {name:<12}  {urgency:^12}")

task_id  task_name     task_urgency
   1     Do homework        5      
   2     Laundry            3      
   3     Pay bills          4      
 99999   012345678901      999     


Note that this can be further refactored into a function:

In [22]:
task_ids = [1, 2, 3, 99999]
task_names = ["Do homework", "Laundry", "Pay bills", "012345678901"]
task_urgencies = [5, 3, 4, 999]


def print_formatted_records(fmt):
    headers = ["Task ID", "Task Name", "Urgency"]
    print(f"{headers[0]:{fmt}}{headers[1]:{fmt}}{headers[2]:{fmt}}")
    for i in range(len(task_ids)):
        task_id = task_ids[i]
        name = task_names[i]
        urgency = task_urgencies[i]
        print(f"{task_id:{fmt}}{name:{fmt}}{urgency:{fmt}}")

print_formatted_records("^18")
print()
print_formatted_records("*^18")


     Task ID          Task Name          Urgency      
        1            Do homework            5         
        2              Laundry              3         
        3             Pay bills             4         
      99999          012345678901          999        

*****Task ID**********Task Name**********Urgency******
********1************Do homework************5*********
********2**************Laundry**************3*********
********3*************Pay bills*************4*********
******99999**********012345678901**********999********


### Formatting numbers

The following snippet illustrates how you can use different types of formats on numbers:

In [34]:
large_prime_number = 1000000007
print(f"Large integer with separators: {large_prime_number:,d}")

dec_number = 1.23456
print(f"decimal number with 2 decimal digits: {dec_number:.2f}")
print(f"decimal number with 4 decimal digits: {dec_number:.4f}")

sci_number = 0.00000000412733
print(f"scientific notation: {sci_number:e}")
print(f"scientific notation (2 dec digits): {sci_number:.2e}")

# General format (uses e or f depending on length, etc.)
print(f"general notation: {sci_number:g}")
print(f"general notation (2 dec digits): {sci_number:.2g}")

pct_number = 0.179323
print(f"Percentage: {pct_number:%}")
print(f"Percentage (2 dec digits): {pct_number:.2%}")

hex_value = 12
print(f"Hex: {hex_value:x}")
print(f"Hex: {hex_value:X}")

Large integer with separators: 1,000,000,007
decimal number with 2 decimal digits: 1.23
decimal number with 4 decimal digits: 1.2346
scientific notation: 4.127330e-09
scientific notation (2 dec digits): 4.13e-09
general notation: 4.12733e-09
general notation (2 dec digits): 4.1e-09
Percentage: 17.932300%
Percentage (2 dec digits): 17.93%
Hex: c
Hex: C


### Escaping curly braces

Suppose that a product's data is saved as a dict object such as:

```python
{"name": "Vacuum", "price": 130.675}
```

The desired output should look like:

```
Vacuum: {130.68}
```

| HINT: |
| :---- |
| Use an extra curly brace: `{{` => `{` |

In [39]:
product = {"name": "Vacuum", "price": 130.675}

print(f"{product['name']}: {{{product['price']:.2f}}}")



Vacuum: {130.68}


## Section 13 &mdash; Functions

This section deals with more concepts on functions. For the most basic examples see [Section 1 &mdash; Hello, Python!](#section-1--hello-python)

### Named Parameters

In Python, every parameter (the placeholder defined in the function definition) has a name, and arguments (the actual values passed when invoking the function) can be matched to parameters either by position or by name.

Create a function `birthday_greet(name, age)` and invoke the function using both named and unnamed (i.e., positional) arguments.

In [15]:
def birthday_greet(name, age):
    print(f"Hello to {name} who turns {age} tomorrow!")

birthday_greet("Adri", 15)          # positional
birthday_greet(age=15, name="Adri") # named



Hello to Adri who turns 15 tomorrow!
Hello to Adri who turns 15 tomorrow!


### The `**` operator

The `**` operator in Python (as in `**kwargs`, which stands for *keyword arguments*) is used to pass a variable number of keyword arguments to a function.

As with the `*` operator, it behaves differently in a function declaration than in a function call, but it follows the same pattern:

+ When defining a function, it marks a parameter that collects any extra keyword arguments into a dictionary.

+ When invoking a function, it unpacks a dictionary into individual keyword arguments.

Create a dictionary object representing the `name` and `age` of a person. Then use the `**` operator to pass that object to the `birthday_greet` function defined in the previous exercise.

Then, create a new version of the function that declares a single parameter `**kwargs` and produces the same result. Invoke it from client code.

In [18]:
person = {"name": "Adri", "age": 15}

# the function receives a variable-length set of keyword arguments
def print_birthday_greet(**kwargs):
    print(f'Hello to {kwargs["name"]} who turns {kwargs["age"]} tomorrow!')

# ** unpacks the dictionary into keyword arguments, which end up in kwargs
print_birthday_greet(**person)

# note that ** also works with functions that declare regular named parameters
def print_birthday_greet(name, age):
    print(f"Hello to {name} who turns {age} tomorrow!")
print_birthday_greet(**person)


Hello to Adri who turns 15 tomorrow!
Hello to Adri who turns 15 tomorrow!
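
`**` unpacking also works inside a dictionary literal, which gives a compact way to merge dictionaries. A sketch (the `extra` dictionary is made up for illustration):

```python
person = {"name": "Adri", "age": 15}
extra = {"age": 16, "city": "Madrid"}

merged = {**person, **extra}   # later entries win on duplicate keys
print(merged)
```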


### Default argument values

Python supports default argument values. In conjunction with named arguments, this makes it practical to define functions with many parameters, most of which have sensible defaults and can be omitted when invoking the function.

Create a third version of `birthday_greet` in which `age` is an optional parameter, and the default name is `"stranger"`. Adapt the function implementation as required.

| HINT: |
| :---- |
| Use `param=None` to indicate an optional parameter. |

In [25]:
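def print_birthday_greet(name="stranger", age=None):
    # `greeting` is used instead of `str` to avoid shadowing the built-in type
    greeting = f"Hello to {name}"
    if age is not None:
        greeting += f" who will turn {age} tomorrow"
    greeting += "!"
    print(greeting)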

print_birthday_greet()
print_birthday_greet(age=15)
print_birthday_greet(name="Adri")
print_birthday_greet(name="Adri", age=15)

Hello to stranger!
Hello to stranger who will turn 15 tomorrow!
Hello to Adri!
Hello to Adri who will turn 15 tomorrow!
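
A common pitfall with default values is that they are evaluated only once, when the function is defined. The `param=None` idiom from the hint avoids sharing a mutable default between calls. A sketch with two hypothetical helpers, `add_task_shared` and `add_task`:

```python
def add_task_shared(task, tasks=[]):   # the same list object is reused on every call
    tasks.append(task)
    return tasks

def add_task(task, tasks=None):        # a new list is created when none is given
    if tasks is None:
        tasks = []
    tasks.append(task)
    return tasks

add_task_shared("Laundry")
print(add_task_shared("Homework"))  # still contains "Laundry" from the first call

add_task("Laundry")
print(add_task("Homework"))         # only contains "Homework"
```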


### Default argument values and `**kwargs`

Create an implementation of `birthday_greet` using `**kwargs` where the arguments have the same default values as in the previous exercise.

| HINT: |
| :---- |
| You can use `"key" in obj` to check if a particular key is available in a dictionary. |

In [37]:
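def print_birthday_greet(**kwargs):
    # `greeting` is used instead of `str` to avoid shadowing the built-in type
    greeting = "Hello "
    if "name" in kwargs:
        greeting += kwargs["name"]
    else:
        greeting += "stranger"

    if "age" in kwargs and kwargs["age"] is not None:
        greeting += f' who turns {kwargs["age"]} tomorrow'

    greeting += "!"
    print(greeting)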

print_birthday_greet()
print_birthday_greet(age=15)
print_birthday_greet(name="Adri")
print_birthday_greet(name="Adri", age=15)

Hello stranger!
Hello stranger who turns 15 tomorrow!
Hello Adri!
Hello Adri who turns 15 tomorrow!


### Hello unnamed, inline functions aka lambdas!

Python supports inline (anonymous) functions that can be passed as arguments to higher-order functions, although the syntax is not as succinct as in other programming languages.

The syntax is:

```python
lambda arg1, arg2, ..., argN: expression
```

| NOTE: |
| :---- |
| The body of a lambda is a single expression whose value is returned implicitly. Unlike regular functions, lambda functions must not use `return`. If you do, you'll get a `SyntaxError`. |

Create a lambda function that computes the result of adding three numbers. Use that lambda function as an argument to a compute function `compute(n1, n2, n3, op)`.

In [38]:
add_nums = lambda n1, n2, n3 : n1 + n2 + n3

def compute(n1, n2, n3, op):
    return op(n1, n2, n3)

compute(1, 2, 3, add_nums)

6

### Immediately applying parameters to a lambda

Create a lambda function that returns the integer following the one given, and invoke it immediately.

In [39]:
(lambda x : x + 1)(4)

5

### Using lambdas in functions that accept functions as arguments

Consider the following list of named tuples:

```python
from collections import namedtuple

Task = namedtuple("Task", "title, description, urgency")
 
tasks = [
    Task("Homework", "Physics and math", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Utility", "Pay bills", 5)    
]
```

Use a lambda to get the list sorted by urgency, in reverse order.

In [31]:
from collections import namedtuple

Task = namedtuple("Task", "title, description, urgency")

tasks = [
    Task("Homework", "Physics and math", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Utility", "Pay bills", 5)
]

tasks.sort(key=lambda task: task.urgency, reverse=True)
print(tasks)

[Task(title='Homework', description='Physics and math', urgency=5), Task(title='Homework', description='Physics and math', urgency=5), Task(title='Internet', description='Upgrade plan', urgency=5), Task(title='Utility', description='Pay bills', urgency=5), Task(title='Museum', description='Egypt exhibit', urgency=4), Task(title='Camera', description='Export photos', urgency=4), Task(title='Laundry', description='Wash clothes', urgency=3), Task(title='Floor', description='Mop the floor', urgency=3), Task(title='Toaster', description='Clean the toaster', urgency=2)]
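
Note that `list.sort` modifies the list in place. When the original order must be kept, the built-in `sorted` returns a new list instead. The sketch below uses a shortened, made-up task list and `operator.attrgetter` as an alternative to the lambda:

```python
from collections import namedtuple
from operator import attrgetter

Task = namedtuple("Task", "title, description, urgency")

tasks = [
    Task("Laundry", "Wash clothes", 3),
    Task("Homework", "Physics and math", 5),
    Task("Museum", "Egypt exhibit", 4),
]

# sorted() returns a new list and leaves `tasks` untouched
by_urgency = sorted(tasks, key=attrgetter("urgency"), reverse=True)
print([task.title for task in by_urgency])
```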


### Some lambda pitfalls and caveats

In Python, lambdas are used for simple one-time jobs. If you plan to reuse the function, it is much better to use the `def` keyword and name the function. That will help with debugging, because the function name shows up in tracebacks.

Additionally, sometimes you don't even need a lambda and can simply refer to an existing function.

Consider the following list of numbers: `[-4, 3, 7, 0, -6]`. Sort them using their absolute value.

As inexperienced Python developers who have just learned about lambdas, we might write:

In [2]:
integers = [-4, 3, 7, 0, -6]

integers.sort(key=lambda n: abs(n))
print(integers)

[0, 3, -4, -6, 7]


A more experienced programmer would have written:

In [3]:
integers = [-4, 3, 7, 0, -6]

integers.sort(key=abs)
print(integers)

[0, 3, -4, -6, 7]


The snippet above is much more concise and Pythonic than our first approach.

Consider the following list of tuples, each representing a student's scores in Math, Science, and Art. Find out which tuple has the highest total score.

```python
scores = [(93, 95, 94), (92, 95, 96), (94, 97, 91), (95, 97, 99)]
```

In [4]:
scores = [(93, 95, 94), (92, 95, 96), (94, 97, 91), (95, 97, 99)]

print(max(scores, key=lambda x: x[0] + x[1] + x[2]))

(95, 97, 99)


Or in a more Pythonic and succinct way:

In [6]:
scores = [(93, 95, 94), (92, 95, 96), (94, 97, 91), (95, 97, 99)]

print(max(scores, key=sum))

(95, 97, 99)


### Functions as objects

Everything in Python is an object, and that applies to functions too. As a result, functions can be used as arguments to other functions, kept in data containers such as dicts and lists, etc.

The following example illustrates this idea:

In [7]:
def get_mean(data):
    return "this calculates the mean"

def get_min(data):
    return "this calculates the min"

def get_max(data):
    return "this calculates the max"

actions = {"mean": get_mean, "min": get_min, "max": get_max}


def fallback_action(data):
    return "error: unknown action"


def process_data(data, action):
    action_to_apply = actions.get(action, fallback_action)
    result = action_to_apply(data)
    return result

print(process_data([1, 2, 3], "mean"))

this calculates the mean
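
The placeholder functions above only return descriptive strings. As a sketch of the same dispatch idea with real computations, the dictionary can hold existing functions directly (here `statistics.mean` and the built-ins `min` and `max`; the inline fallback is an illustrative choice):

```python
from statistics import mean

actions = {"mean": mean, "min": min, "max": max}

def process_data(data, action):
    # look up the function object; fall back to a function that reports the problem
    action_to_apply = actions.get(action, lambda _data: "error: unknown action")
    return action_to_apply(data)

print(process_data([1, 2, 3], "mean"))
print(process_data([1, 2, 3], "max"))
print(process_data([1, 2, 3], "median"))
```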


### The `map`, `filter`, and `reduce` higher-order functions

`map` and `filter` are built-in functions. `reduce` is available in the `functools` module of the standard library.

1. Use `map` to create the list of squares given a list of integers.
2. Use `filter` to remove the odd numbers from a given list of integers (that is, keep the even ones).
3. Use `reduce` to calculate the sum of a given list of numbers.

In [49]:
# As with zip, `map` does not materialize the results
print(map(lambda n : n * n, range(0, 11)))

# You need to force materialization with list
print(list(map(lambda n : n * n, range(0, 11))))

print(list(filter(lambda n : n % 2 == 0, range(0, 11))))

from functools import reduce

print(reduce(lambda acc, n : acc + n, range(0, 11), 0))


<map object at 0x7fde6157db70>
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[0, 2, 4, 6, 8, 10]
55
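
For simple cases like these, comprehensions and built-ins such as `sum` are often considered more readable. A sketch of the equivalent expressions:

```python
squares = [n * n for n in range(0, 11)]            # instead of map
evens = [n for n in range(0, 11) if n % 2 == 0]    # instead of filter
total = sum(range(0, 11))                          # instead of reduce

print(squares)
print(evens)
print(total)
```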


#### Using the `map` function

The `map` function creates a `map` iterator that applies a function to each and every element of an iterable.

Consider the following list of strings: `["1.23", "4.56", "7.89"]`. Create a program that transforms that list into the equivalent list of floats.
 

In [8]:
numbers_str = ["1.23", "4.56", "7.89"]

numbers = list(map(float, numbers_str))
assert numbers == [1.23, 4.56, 7.89]


### Hello, closures

Using closures, implement a function `make_power_fn(power)` that returns a function `fn(base)` that produces `base ** power` when invoked.

In [51]:
def make_power_fn(power):
    def power_fn(base):
        return base ** power
    return power_fn

square_fn = make_power_fn(2)
print(square_fn(5))

cube_fn = make_power_fn(3)
cubes = list(map(cube_fn, range(0, 11)))
print(cubes)

25
[0, 1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
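
To see what a closure actually captures, you can inspect the returned function. The sketch below reuses `make_power_fn` and looks at the `__closure__` and `__code__.co_freevars` attributes that CPython attaches to function objects:

```python
def make_power_fn(power):
    def power_fn(base):
        return base ** power  # `power` is looked up in the enclosing scope
    return power_fn

square_fn = make_power_fn(2)
cube_fn = make_power_fn(3)

# Each returned function keeps its own captured `power` in a closure cell
captured_by_square = square_fn.__closure__[0].cell_contents
captured_by_cube = cube_fn.__closure__[0].cell_contents

print(captured_by_square, captured_by_cube)  # 2 3
print(square_fn.__code__.co_freevars)        # ('power',)
```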


#### Another closure example

Create a function `increment_maker(num)` that returns a function that increments its argument by `num`.

In [10]:
def increment_maker(num):
    def increment(num0):
        return num + num0

    return increment

plus_one = increment_maker(1)
assert plus_one(5) == 6

plus_five = increment_maker(5)
assert plus_five(19) == 24

### A function with a weird signature

Given the function signature:

```python
def weird(param1, param2, *, prefix=None, **kwargs)
```

Explore the parameter list of the function.

| NOTE: |
| :---- |
| See the explanation below for `*` in the function signature. |

In [63]:

def weird(param1, param2, *, prefix=None, **kwargs):
    print("== invocation results")
    print("param1:", param1)
    print("param2:", param2)
    print("prefix:", prefix)
    print("kwargs:", kwargs)
    print()

# param1 and param2 are regular args, non-optional
# weird() # this fails because param1 and param2 not passed
# weird("p1") # this fails because param2 not passed
# weird(param2="p2") # this fails because param1 not passed


weird("p1", "p2")
weird(param2="p2", param1="p1")

# the '*' prevents any other positional args from being sent
# weird("p1", "p2", "other",) # this fails
weird("p1", "p2", some="some", other="other", values="params") # this works: extra keyword args are collected in kwargs
weird("p1", "p2", prefix="yay", some="some", other="other", values="params") # this works: prefix is passed by keyword

# For example
def weird2(param1, param2, prefix=None, **kwargs):
    print("== invocation results")
    print("param1:", param1)
    print("param2:", param2)
    print("prefix:", prefix)
    print("kwargs:", kwargs)
    print()

weird2("p1", "p2", "other") # this works: without '*', "other" binds to prefix

== invocation results
param1: p1
param2: p2
prefix: None
kwargs: {}

== invocation results
param1: p1
param2: p2
prefix: None
kwargs: {}

== invocation results
param1: p1
param2: p2
prefix: None
kwargs: {'some': 'some', 'other': 'other', 'values': 'params'}

== invocation results
param1: p1
param2: p2
prefix: yay
kwargs: {'some': 'some', 'other': 'other', 'values': 'params'}

== invocation results
param1: p1
param2: p2
prefix: other
kwargs: {}



Thus, `*` in the function signature prevents any other positional arguments from being passed: every parameter declared after it (such as `prefix`) can only be supplied by keyword.
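
The commented-out failing call can be made observable by catching the error. This is a small sketch in which `weird` returns its arguments instead of printing them, to keep the example short:

```python
def weird(param1, param2, *, prefix=None, **kwargs):
    return (param1, param2, prefix, kwargs)

# A third positional argument is rejected because of the bare `*`
try:
    weird("p1", "p2", "other")
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# `prefix` can only be supplied by keyword
result = weird("p1", "p2", prefix="yay")
print(result)  # ('p1', 'p2', 'yay', {})
```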

## Section 14 &mdash; OOP

Python supports object-oriented programming with additional keywords and syntax.

### Classes basics

The following example illustrates how to declare a class with a method, and how to instantiate it:

In [9]:
class Duck:
    def quack(self):
        print("quack!")

don = Duck()
don.quack()

quack!


Methods are defined using the `def` keyword, and instance methods must take `self` as their first parameter. The `self` argument is a reference to the current object instance.
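
One way to see what `self` is: calling a method on an instance is equivalent to calling the function on the class and passing the instance explicitly. In this sketch `quack` returns the string instead of printing it, so that both calls can be compared:

```python
class Duck:
    def quack(self):
        return "quack!"

don = Duck()

# Calling a method on an instance...
via_instance = don.quack()
# ...is equivalent to calling the function on the class and passing the instance
via_class = Duck.quack(don)

print(via_instance == via_class)  # True
```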

### Constructors

Constructors in Python are declared using the function name `__init__`:

In [13]:
class Duck:

    def __init__(self, name, color):
        self.name = name
        self.color = color

    def quack(self):
        print(f"The {self.color} duck named {self.name} says: quack!")

duck = Duck("Howie", "pearl white")
duck.quack()

The pearl white duck named Howie says: quack!


You can access an object's properties using `.`.

In [14]:
print("howie's name:", duck.name)

howie's name: Howie


### Hello, OOP: A `Rectangle` class

Create a `Rectangle` class with the following capabilities:

+ A `Rectangle` object can be instantiated by passing its width and height dimensions. (HINT: Python constructors are named `def __init__(self, param1, param2...)`)

+ A `Rectangle` must feature the following instance methods:
  + `scale`: which returns a new `Rectangle` with its dimensions scaled by the given factor
  + `area`: which returns the area of the rectangle
  + `__eq__`: which checks for equality
  + `__repr__`: which provides the string representation of a rectangle (used in `print`)

In [7]:
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def scale(self, factor):
        return Rectangle(self.width * factor, self.height * factor)

    def area(self):
        return self.width * self.height

    def __eq__(self, other):
        return self.width == other.width and self.height == other.height

    def __repr__(self):
        return f"Rectangle(w: {self.width}, h: {self.height})"


r = Rectangle(2, 3)
print(r)

print(r.scale(2))
print(r.area())

r2 = Rectangle(2, 3)
print(r == r2)
print(r == r2.scale(2))



Rectangle(w: 2, h: 3)
Rectangle(w: 4, h: 6)
6
True
False


### Operator overloading

Python supports operator overloading using special method names such as:
+ `__mul__`: when the class instance comes on the left-hand side of `*`
+ `__rmul__`: when the class instance comes on the right-hand side of `*`

Enhance the `Rectangle` class to support things such as:

+ `Rectangle(2, 3) * 2`, which requires implementing `__mul__`
+ `2 * Rectangle(2, 3)`, which requires implementing `__rmul__`

In [8]:
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def scale(self, factor):
        return Rectangle(self.width * factor, self.height * factor)

    def area(self):
        return self.width * self.height

    def __mul__(self, factor):
        return self.scale(factor)

    def __rmul__(self, factor):
        return self.scale(factor)

    def __eq__(self, other):
        return self.width == other.width and self.height == other.height

    def __repr__(self):
        return f"Rectangle(w: {self.width}, h: {self.height})"


r = Rectangle(2, 3)
print(2 * r)
print(r * 3)

Rectangle(w: 4, h: 6)
Rectangle(w: 6, h: 9)
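
The reason `__rmul__` is needed can be sketched as follows: for `2 * r`, Python first asks the integer to do the multiplication, `int` returns `NotImplemented` because it knows nothing about rectangles, and only then does Python try the reflected method on the right operand. The trimmed-down `Rectangle` below is just for illustration:

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def __mul__(self, factor):
        return Rectangle(self.width * factor, self.height * factor)

    def __rmul__(self, factor):
        return self * factor  # delegate to __mul__

r = Rectangle(2, 3)

# int does not know how to multiply by a Rectangle...
int_result = (2).__mul__(r)
print(int_result)  # NotImplemented

# ...so Python falls back to the reflected method on the right operand
scaled = 2 * r
print(scaled.width, scaled.height)  # 4 6
```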


### Class/Static methods

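Python supports class methods through the `@classmethod` decorator (and static methods, which receive neither the instance nor the class, through `@staticmethod`). Class methods receive the class itself as their first argument and should be declared as:

```python
@classmethod
def method(cls, <param1>, ...):
    ...
```

Enhance the `Rectangle` class by defining a class method `square(side)` which returns a rectangle whose dimensions are both equal to the given side.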


In [1]:
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def scale(self, factor):
        return Rectangle(self.width * factor, self.height * factor)

    def area(self):
        return self.width * self.height

    def __mul__(self, factor):
        return self.scale(factor)

    def __rmul__(self, factor):
        return self.scale(factor)

    def __eq__(self, other):
        return self.width == other.width and self.height == other.height

    def __repr__(self):
        return f"Rectangle(w: {self.width}, h: {self.height})"

    @classmethod
    def square(cls, side):
        return cls(side, side)  # cls is the class the method was called on


sq = Rectangle.square(2)
print(sq)

Rectangle(w: 2, h: 2)
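
The difference between the two decorators can be sketched as follows. `LabeledRectangle` and `is_valid_side` are hypothetical names introduced only for this illustration:

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    @classmethod
    def square(cls, side):
        # `cls` is the class the method was called on
        return cls(side, side)

    @staticmethod
    def is_valid_side(side):
        # No `self` and no `cls`: just a function namespaced in the class
        return side > 0


class LabeledRectangle(Rectangle):  # hypothetical subclass, for illustration only
    pass

sq = LabeledRectangle.square(2)
print(type(sq).__name__)            # LabeledRectangle
print(Rectangle.is_valid_side(-1))  # False
```

Using `cls(...)` rather than hard-coding `Rectangle(...)` lets subclasses that keep the same constructor signature get instances of their own type.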


### The `__dict__` property

The `__dict__` property, when applied to a class instance, returns the instance fields; when applied to a class, it returns the class attributes, including its methods.

Use this property on the `Rectangle` class and on an instance of the class.

In [3]:
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def scale(self, factor):
        return Rectangle(self.width * factor, self.height * factor)

    def area(self):
        return self.width * self.height

    def __mul__(self, factor):
        return self.scale(factor)

    def __rmul__(self, factor):
        return self.scale(factor)

    def __eq__(self, other):
        return self.width == other.width and self.height == other.height

    def __repr__(self):
        return f"Rectangle(w: {self.width}, h: {self.height})"

    @classmethod
    def square(cls, side):
        return cls(side, side)  # cls is the class the method was called on

print(Rectangle.__dict__)  # shows the class attributes, including all the methods

r = Rectangle(2, 3)
print(r.__dict__)   # shows the instance fields (width and height)

{'__module__': '__main__', '__init__': <function Rectangle.__init__ at 0x7fb35bc92dd0>, 'scale': <function Rectangle.scale at 0x7fb35bc92e60>, 'area': <function Rectangle.area at 0x7fb35bc92ef0>, '__mul__': <function Rectangle.__mul__ at 0x7fb35bc92f80>, '__rmul__': <function Rectangle.__rmul__ at 0x7fb35bc93010>, '__eq__': <function Rectangle.__eq__ at 0x7fb35bc930a0>, '__repr__': <function Rectangle.__repr__ at 0x7fb35bc93130>, 'square': <classmethod(<function Rectangle.square at 0x7fb35bc931c0>)>, '__dict__': <attribute '__dict__' of 'Rectangle' objects>, '__weakref__': <attribute '__weakref__' of 'Rectangle' objects>, '__doc__': None, '__hash__': None}
{'width': 2, 'height': 3}


### Inheritance

Python syntax for inheritance is:

```python
class ClassName(SuperClassName):
  ...
```

Create a `Square` class that inherits from `Rectangle`.

| HINT: |
| :---- |
| You will need to use `super().__init__(...)` to invoke the constructor of the superclass. |

In [4]:
# Rectangle is the super-class for Square
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def scale(self, factor):
        return Rectangle(self.width * factor, self.height * factor)

    def area(self):
        return self.width * self.height

    def __mul__(self, factor):
        return self.scale(factor)

    def __rmul__(self, factor):
        return self.scale(factor)

    def __eq__(self, other):
        return self.width == other.width and self.height == other.height

    def __repr__(self):
        return f"Rectangle(w: {self.width}, h: {self.height})"

class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)

    def scale(self, factor):
        return Square(self.width * factor)

    def __repr__(self):
        return f"Square(s: {self.width})"

sq = Square(1)
print(sq)
print(sq.area())
print(sq.scale(2))


Square(s: 1)
1
Square(s: 2)
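
Inherited methods are resolved on the actual type of the object, so an operator defined once in the superclass picks up the overridden `scale` of the subclass. The sketch below adds `__mul__` back to a trimmed-down `Rectangle` to show this:

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def scale(self, factor):
        return Rectangle(self.width * factor, self.height * factor)

    def __mul__(self, factor):
        return self.scale(factor)  # resolved on the actual type of `self`

class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)

    def scale(self, factor):
        return Square(self.width * factor)

tripled = Square(1) * 3
print(type(tripled).__name__, tripled.width)     # Square 3

# The method resolution order lists where attributes are looked up
print([cls.__name__ for cls in Square.__mro__])  # ['Square', 'Rectangle', 'object']
```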


### Abstract classes

Python supports abstract classes by inheriting from a special class named `ABC`.

Create a simple class hierarchy following these guidelines:
+ Create an abstract base class `Shape`. (HINT: you will need `from abc import ABC`)

    + Create an empty implementation for the methods `area` and `scale`. This will set the interface. (HINT: to create empty implementations you can either use `pass` or `...`. Also, use the `@abstractmethod` decorator to tag the method as abstract)

    + Create an implementation of `__eq__` that relies on `__dict__` to check for equality, based on the underlying properties.

    + Create an implementation of `__mul__` and `__rmul__` that rely on `scale()`.

+ Create a concrete class `Rectangle` respecting the behavior implemented in the previous exercises.

+ Create a concrete class `Square` respecting the behavior implemented in the previous exercises.

+ Create a concrete class `Circle`.

In [9]:
from abc import ABC, abstractmethod
from math import pi

class Shape(ABC):

    @abstractmethod
    def area(self):
        ...         # same as pass

    @abstractmethod
    def scale(self, factor):
        ...

    def __eq__(self, other):
        return self.__dict__ == other.__dict__

    def __mul__(self, factor):
        return self.scale(factor)

    def __rmul__(self, factor):
        return self.scale(factor)

class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height

    def scale(self, factor):
        return Rectangle(self.width * factor, self.height * factor)

    def __repr__(self):
        return f"Rectangle(w: {self.width}, h: {self.height})"

class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)

    def scale(self, factor):
        return Square(self.width * factor)

    def __repr__(self):
        return f"Square(s: {self.width})"

class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return pi * self.radius * self.radius

    def scale(self, factor):
        return Circle(self.radius * factor)

    def __repr__(self):
        return f"Circle(r: {self.radius})"

# s = Shape() # Err: can't instantiate abstract class

# Rectangle
r = Rectangle(2, 3)
print(r)
print(r.area())
print(r.scale(2))
print(r == Rectangle(3, 4))
print(r == Rectangle(2, 3))

# Square
s = Square(1)
print(s)
print(s.area())
print(s.scale(2))
print(s == Square(2))
print(s == Square(1))
print(s == Rectangle(1, 1)) # d'oh!

# Circle
c = Circle(1)
print(c)
print(c.area())
print(c.scale(2))
print(c == Circle(2))
print(c == Circle(1))
print(c == Square(1))
print(c == Rectangle(2, 3))

Rectangle(w: 2, h: 3)
6
Rectangle(w: 4, h: 6)
False
True
Square(s: 1)
1
Square(s: 2)
False
True
True
Circle(r: 1)
3.141592653589793
Circle(r: 2)
False
True
False
False
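
The commented-out `Shape()` call can be explored safely by catching the error. In this sketch, `Unfinished` and `UnitSquare` are hypothetical subclasses: the first forgets to implement the abstract method and therefore cannot be instantiated either.

```python
from abc import ABC, abstractmethod

class Shape(ABC):
    @abstractmethod
    def area(self):
        ...

class Unfinished(Shape):  # hypothetical subclass that forgets to implement area()
    pass

class UnitSquare(Shape):  # hypothetical concrete subclass
    def area(self):
        return 1

failed = []
for cls in (Shape, Unfinished):
    try:
        cls()
    except TypeError as err:
        failed.append(cls.__name__)
        print(f"{cls.__name__}: {err}")

print(failed)               # ['Shape', 'Unfinished']
print(UnitSquare().area())  # 1
```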


### Static properties

You can create static properties (known as class attributes in Python) by declaring them in the class body, outside of any method.

Create a class with a static property `class_name` set to the name of the class, and another property `num_instances` to track the number of instances created.

In [12]:
class MyClass:
    class_name = "MyClass"
    num_instances = 0

    def __init__(self):
        MyClass.num_instances += 1

    def __repr__(self):
        return f"{MyClass.class_name} has {MyClass.num_instances} live instances"

print(MyClass.num_instances)
c1 = MyClass()
print(MyClass.num_instances)

0
1
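
A common pitfall is assigning to a class attribute through an instance. The sketch below shows that such an assignment creates a new instance attribute that shadows the class attribute for that instance only:

```python
class MyClass:
    num_instances = 0

    def __init__(self):
        MyClass.num_instances += 1  # updates the attribute on the class

c1 = MyClass()
c2 = MyClass()
print(c1.num_instances, c2.num_instances)  # 2 2 (read through the class)

# Assigning through an instance creates an instance attribute
# that shadows the class attribute for that instance only
c1.num_instances = 99
print(c1.num_instances)       # 99
print(c2.num_instances)       # 2
print(MyClass.num_instances)  # 2
```

This is why the constructor updates `MyClass.num_instances` rather than `self.num_instances`.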


### Setters and Getters

There are two ways of creating setters and getters in Python.

+ using the `property()` function which identifies the functions that will act as setters, getters, and delete functions:

    ```python
    def set_something(self, value):
      self.__something = value

    def get_something(self):
      return self.__something

    def del_something(self):
      del self.__something

    something = property(get_something, set_something, del_something)
    ```

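+ using the `@property` decorator on the getter, and the `@<name>.setter` and `@<name>.deleter` decorators on the setter and delete functions. All three functions must share the name of the property:

    ```python
    @property
    def something(self):
      return self.__something

    @something.setter
    def something(self, value):
      self.__something = value

    @something.deleter
    def something(self):
      del self.__something
    ```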

Create a simple `Person` class with name and age properties using the two approaches described above.

In [15]:
# Using `property`
class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age

    def set_name(self, name):
        self._name = name

    def get_name(self):
        return self._name

    def set_age(self, age):
        self._age = age

    def get_age(self):
        return self._age

    def __repr__(self):
        return f"Person(name={self._name}, age={self._age})"

    name = property(get_name, set_name, None)
    age = property(get_age, set_age, None)

p = Person("Adri", 14)
print(p)

p.age = 15
print(p.age)
print(p)


Person(name=Adri, age=14)
15
Person(name=Adri, age=15)


In [19]:
# using decorators

class Person:
    def __init__(self, name, age):
        self._name = name
        self._age = age

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, name):
        self._name = name

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, age):
        self._age = age


    def __repr__(self):
        return f"Person(name={self._name}, age={self._age})"

p = Person("Adri", 14)
print(p)

p.age = 15
print(p.age)
print(p)

Person(name=Adri, age=14)
15
Person(name=Adri, age=15)
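
Setters become useful when they do more than store the value. The following sketch adds a validation rule to `age`; the rule itself (no negative ages) is an assumption made up for this example:

```python
class Person:
    def __init__(self, name, age):
        self._name = name
        self.age = age  # goes through the setter, so validation also runs here

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, age):
        if age < 0:
            raise ValueError("age cannot be negative")
        self._age = age

p = Person("Adri", 14)
p.age = 15
print(p.age)  # 15

try:
    p.age = -1
except ValueError as err:
    print("rejected:", err)

print(p.age)  # 15 (the invalid assignment left the value untouched)
```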


### Creating read-only/write-only attributes

Building on `property`, it becomes very easy to create read-only and write-only managed attributes: omit the setter for a read-only attribute, or the getter for a write-only one.

Create a `Person` class that includes:
  + a `password` attribute that is a write-only attribute
  + a `name` attribute that is read-only and can only be set in the constructor.

In [24]:
class Person:
    def __init__(self, name, password):
        self._name = name
        self._password = password

    def set_password(self, new_password):
        self._password = new_password

    password = property(fset=set_password)

    def get_name(self):
        return self._name

    name = property(fget=get_name)

user = Person("sergio", "supersecret")
# print(user.password)    # unreadable attribute "password"
print(user.name)

# user.name = "sergio74"  # can't set attribute name
user.password = "tiger"

sergio


### Private fields in Python classes

Python does not support private/public qualifiers for class methods and attributes.

However, it is customary to prefix internal implementation methods and attributes with `_`. That approach gives a visual indication to the reader that those methods and attributes should not be used from consumer code. Note that this does not prevent client code from using them.

Python also supports prefixing your methods and attributes with a double underscore `__`. This approach triggers *name mangling* (inside a class `ClassName`, `__attr` is stored as `_ClassName__attr`), so that it is much more difficult for the class consumer to use that method or attribute (yet, it is still possible).

As a result, it is conventional to:
+ Use `_prefix` for names used in internal implementation details that you **want** to keep available to subclasses and end-consumer code.

+ Use `__prefix` for names used in internal implementation details that you **don't want** to make available to any code outside of the current class.

Create a class that has fields of both kinds and illustrate the concepts above:
+ when using `_prefix` the attribute/method is available to subclasses and consumer code.
+ when using `__prefix` the attribute/method is not available under that name to subclasses or consumer code.

We will create a simple class hierarchy to illustrate name mangling, etc.

In [31]:
class Vehicle:
    def __init__(self, num_wheels, has_motor):
        self._num_wheels = num_wheels
        self.__has_motor = has_motor

class Car(Vehicle):
    def __init__(self):
        super().__init__(4, True)

    def __repr__(self):
        # return f"Car(num_wheels: {self._num_wheels}, has_motor: {self.__has_motor})" # Error: 'Car' object has no attribute '_Car__has_motor'
        return f"Car(num_wheels: {self._num_wheels} and maybe a motor)" # works: _num_wheels is not mangled

c = Car()
print(c)

Car(num_wheels: 4 and maybe a motor)
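
Name mangling can be made visible by listing the attributes that are actually stored on the instance. The sketch below reuses the same `Vehicle` and `Car` classes without `__repr__`:

```python
class Vehicle:
    def __init__(self, num_wheels, has_motor):
        self._num_wheels = num_wheels
        self.__has_motor = has_motor  # stored as _Vehicle__has_motor

class Car(Vehicle):
    def __init__(self):
        super().__init__(4, True)

c = Car()
stored_names = sorted(vars(c))
print(stored_names)  # ['_Vehicle__has_motor', '_num_wheels']

print(hasattr(c, "__has_motor"))  # False: no attribute with that exact name
print(c._Vehicle__has_motor)      # True: still reachable through the mangled name
```

This is why the double underscore makes access harder but not impossible.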


### Checking the type of an instance with `isinstance()`

The built-in function `isinstance()` lets you check if an object is an instance of a particular class or of any of its subclasses.

Create a simple class hierarchy (e.g., Vehicle, Car) and instantiate an object of each type.

Use `isinstance` to check:
+ whether it returns `True` when checking the `Vehicle` instance against `Vehicle` class.
+ whether it returns `True` when checking the `Car` instance against `Car` class.
+ whether it returns `True` when checking the `Vehicle` instance against `Car` class.
+ whether it returns `True` when checking the `Car` instance against `Vehicle` class.

What can you derive from the results?

In [7]:
class Vehicle:
    ...

class Car(Vehicle):
    ...

vObj = Vehicle()
cObj = Car()

print(isinstance(vObj, Vehicle)) # expected True, got True
print(isinstance(cObj, Car))     # expected True, got True
print(isinstance(vObj, Car))     # expected False, got False
print(isinstance(cObj, Vehicle)) # expected True, got True

True
True
False
True


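The results are consistent with the expectations: an instance of a subclass is also an instance of its superclasses, but not the other way around. You can therefore use `isinstance` to check if a particular object is part of a class hierarchy.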

### `isinstance()` for built-in types

You can also use `isinstance()` to check whether a value is of a built-in type using:
+ `str` for strings
+ `int` for integers
+ `float` for floating-point numbers
+ `complex` for complex numbers
+ `bool` for booleans
+ `list` for lists
+ `tuple` for tuples
+ `range` for ranges
+ `dict` for dictionaries
+ `set` for sets



Use `isinstance` to check:
+ that a string variable is actually a string
+ that a tuple is actually a tuple
+ that a dictionary is actually a dictionary

In [4]:
s = "Hello!"
t = (1, "uno")
d = { "one": 1, "two": 2}

# string
print(isinstance(s, str))
print(isinstance(t, str))
print(isinstance(d, str))
print()

# tuple
print(isinstance(s, tuple))
print(isinstance(t, tuple))
print(isinstance(d, tuple))
print()

# dictionary
print(isinstance(s, dict))
print(isinstance(t, dict))
print(isinstance(d, dict))
print()

True
False
False

False
True
False

False
False
True
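
Two details are worth knowing when checking built-in types: `isinstance` accepts a tuple of types, and `bool` is a subclass of `int`. A short sketch:

```python
# A tuple of types checks "is it any of these?"
print(isinstance(3.14, (int, float)))    # True
print(isinstance("3.14", (int, float)))  # False

# bool is a subclass of int, so booleans also pass an int check
print(isinstance(True, int))  # True
print(issubclass(bool, int))  # True

# When booleans must be excluded, compare the exact type instead
print(type(True) is int)      # False
```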



### Using `issubclass()` to check if a class is a subclass

The function `issubclass()` lets you check if a class is a subclass of another class (a class is also considered a subclass of itself).

| NOTE: |
| :---- |
| `issubclass` requires classes not instances. |

Create a simple class hierarchy (e.g., Person, Student) and validate the behavior of `issubclass`.

How would you use `issubclass` if you only have access to a particular instance and not the class? (HINT: look for extra properties on the instance)

In [14]:
class Person:
    ...

class Student(Person):
    ...


print(issubclass(Person, Person))   # expected True, got True
print(issubclass(Student, Person))  # expected True, got True
print(issubclass(Person, Student))  # expected False, got False

# if you have an instance
sObj = Student()
pObj = Person()

print(issubclass(sObj.__class__, Person))
print(issubclass(Student, pObj.__class__))
print(issubclass(sObj.__class__, pObj.__class__))

True
True
False
True
True
True


## Section 15 &mdash; Creating and importing libraries

You can import your own custom modules using the same syntax used for core packages. As these files will not be available in a central location, you will need to specify where the code for those modules can be located:

```python
from [<dir>.]<lib> import <lib_fn_or_property>
```

If the library sits in the same directory from where you're running your code you can omit the `[<dir>.]` part.


### Hello, custom libraries

Create a library in a source file `my_lib.py` in the same directory as this notebook. In it, define a function `greet_me` that prints a message. Also define a `square` function that returns the square of a given number.
Then import that library in a cell and make sure you can invoke the function.

| NOTE: |
| :---- |
| If you change the library after having imported it, you might need to restart the Jupyter kernel to see the change reflected. |

In [1]:
from my_lib import greet_me, square

greet_me()
greet_me("sergio")
print(f"5^2={square(5)}")

Hello, stranger!
Hello, sergio!
5^2=25
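
The contents of `my_lib.py` are not shown above. A minimal sketch that is consistent with the output could look like this; the default value `"stranger"` is inferred from the output, and the rest of the implementation is an assumption:

```python
# my_lib.py (hypothetical contents, inferred from the output above)

def greet_me(name="stranger"):
    """Print a greeting for the given name."""
    print(f"Hello, {name}!")


def square(num):
    """Return the square of the given number."""
    return num * num
```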


### Importing custom libraries from a folder

Saving your libraries in the same directory in which you keep your notebooks is not usually a very good idea.

Python allows you to reference libraries that sit in subdirectories using the syntax introduced at the beginning of this section.

Create a `utils/` folder and define a new source file `my_lib.py` for another library in which you declare a function `cube(num)`. Import it and use it in a cell.

In [4]:
from utils.my_lib import cube

print(f"5^3={cube(5)}")

5^3=125


### The concept of **"main"** in modules

Some Python files can be used both as libraries, whose individual elements might be imported into a larger program, and as standalone scripts.

In those use cases, you will find the following piece of code useful:

```python
if __name__ == "__main__":
    # ... things to run as standalone script ...
```

Create a module `./utils/db_module.py` that exposes two functions `delete_db()` and `create_db()` that announce themselves using `print()`.
Include a code snippet such as the one above so that when invoking the module as a standalone program using `python ./utils/db_module.py` the main section is executed, but when importing it in a notebook cell, it is not.

Confirm that when removing the guard `if __name__ == "__main__"` the main section is executed as a side effect when the module is imported.
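
The module itself is not listed in this notebook. A possible sketch, assuming a helper `do_some_db_stuff()` that calls the two functions (the printed messages are assumptions as well), could be:

```python
# utils/db_module.py (hypothetical contents)

def delete_db():
    print("> in delete_db")


def create_db():
    print("> in create_db")


def do_some_db_stuff():
    print("> doing some db stuff")
    delete_db()
    create_db()


if __name__ == "__main__":
    # Runs only with `python ./utils/db_module.py`, not when the module is imported
    do_some_db_stuff()
```

When the file is imported, `__name__` is set to the module name (`utils.db_module`), so the guarded block is skipped.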

In [5]:
from utils.db_module import do_some_db_stuff

do_some_db_stuff()

> doing some db stuff
> in delete_db
> in create_db


## Section 16 &mdash; Documenting your code

As Python is a dynamically typed programming language, documenting your code is key for a good developer experience (DX).

There are two fundamental pieces you should master to properly document your code in Python:
+ DocStrings: conventions about comments in files, functions, classes, etc. DocStrings let the IDEs provide additional information to the consumers of your code.

+ Type hints: available from Python 3.5, type hints allow you to add type annotations to your code to declare the expected types, as in `def hello(name: str) -> str:`

### DocStrings

DocStrings can be used for files, classes, methods, and standalone functions.

They typically follow this approach:

```python
"""Summary line for the "thing" being documented

Details spread across multiple lines describing the
thing, how to use it, recommendations, examples, etc.
"""
```

For functions, this is one of the most common templates:

```python
"""Summary line describing what the function does

Parameters
----------
param1 : str
  The description for param1, whose type is string
param2 : bool, optional
  The description for param2, which is an optional boolean

Returns
-------
list
  a list of strings
"""
```

Using this approach, define, and document a function `square` that returns the square of a given number.

In [1]:
def square(num):
    """square returns the square of a number

    Parameters
    ----------
    num : number
        The number whose square value is about to be computed

    Returns
    -------
    number
        The square of the number given
    """
    return num * num


square(5)

25
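
A docstring is attached to a function only when it is the first statement inside the function body. The sketch below uses a hypothetical `cube` function to show how the docstring can then be read back through `__doc__` and `inspect.getdoc`:

```python
import inspect

def cube(num):
    """Return the cube of a number.

    Parameters
    ----------
    num : int or float
        The number to raise to the third power

    Returns
    -------
    int or float
        The cube of the given number
    """
    return num ** 3

# inspect.getdoc cleans up the indentation of the raw __doc__ string
summary_line = inspect.getdoc(cube).splitlines()[0]
print(summary_line)              # Return the cube of a number.
print(cube.__doc__ is not None)  # True
```

The same text is what `help(cube)` and most IDEs display.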

An alternative docstring style would be the following:

```python
def quotient(dividend, divisor, taking_int=False):
  """
  Calculate the quotient of two numbers

  :param dividend: int | float, the dividend in the division
  :param divisor: int | float, the divisor in the division
  :param taking_int: bool, whether only taking the integer part of the quotient;
  default: False, which calculates the precise quotient of the two numbers

  :return: float | int, the quotient of the dividend and divisor

  :raises ZeroDivisionError: when the divisor is 0
  """
  if divisor == 0:
    raise ZeroDivisionError("division by zero")
  result = dividend / divisor
  if taking_int:
    result = int(result)
  return result
```

### Documenting a class attribute

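A class attribute can be documented by placing a string literal right after its assignment. This is a convention recognized by documentation tools; the string is not available at runtime through `__doc__`: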

```python
class AClass:

    c = 'class attribute'
    """This is AClass.c's docstring."""
```

### Type Hints

Type hints are type annotations that can be added to functions to indicate the types of the parameters received and the result of executing the function among other things.

Note that the Python runtime does not enforce type hints. To catch type errors you need a separate type checker (such as mypy), which is not included in the Python runtime.

#### Type annotations

Type hints are used to add types to variables, parameters, function arguments, and their corresponding return values, class attributes and methods.

##### Variable annotations

The following examples illustrate how variables can be annotated with types:

```python
my_list: list = ['A', 'B', 'C']
my_num: float = 2.4
```

with the type being, for instance, one of the following:
+ `int`
+ `float`
+ `str`
+ `bool`
+ `bytes`
+ `list`
+ `tuple`
+ `dict`
+ `set`
+ `frozenset`
+ `None`

##### Function annotations

The following example illustrates how to annotate a function:

```python
def add_nums(x: int, y: int, z: float) -> float:
  return x + y + z
```

If a function does not return anything, you can set the return type to `None`.

##### Class annotations

You can annotate attributes and methods inside classes using the following approach:

```python
class MyClass:
  static_val: str = "Static Value"
  num_instances: int = 0

  def say_hello(self, s: str) -> str:
    return f"Hello to {s}"

```

##### Annotating lists of complex types

Complex types, such as a list of floats, can be annotated using the `typing` package (since Python 3.9 the built-in `list[float]` can be used as well):

```python
from typing import List

def my_fun(l: List[float]) -> float:
  return sum(l)
```

You can also create a type alias, so that it can be used to annotate variables and make the declaration easier to read:

```python
from typing import List

NestedList = List[List[str]] # list of lists of strings

my_super_list: NestedList = [["Hello", "To"], ["Jason"]]
```

##### Annotating dicts of complex types

The following example illustrates how to annotate the types of a dictionary's keys and values:

```python
from typing import Dict

my_dict_type = Dict[str, float]

my_dict: my_dict_type = {"key": 1.1}
```

##### Annotating unions

A *union* lets you state that a value can be of any one of several types:

```python
from pathlib import Path
from typing import Union

def load_model(filename: str, cache_folder: Union[str, Path]) -> None:
  ...
```

Since Python 3.10, unions can also be written with the `|` operator:

```python
from pathlib import Path

def load_model(filename: str, cache_folder: str | Path) -> None:
  ...
```
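
Inside a function that accepts a union, a common pattern is to normalize the argument to a single type before using it. The helper `resolve_cache_folder` below is hypothetical and only sketches that idea:

```python
from pathlib import Path
from typing import Union

def resolve_cache_folder(cache_folder: Union[str, Path]) -> Path:
    # hypothetical helper: normalize both accepted types to a Path
    if isinstance(cache_folder, str):
        return Path(cache_folder)
    return cache_folder

from_str = resolve_cache_folder("cache")
from_path = resolve_cache_folder(Path("cache"))
print(from_str == from_path)  # True
```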


##### Dictionaries with fixed schema using `TypedDict`

You can use `TypedDict` to model dictionaries with fixed schemas (known string keys).

```python
from typing import List, TypedDict

class InterestsTypedDict(TypedDict):
  name: str
  interests: List[str]

my_interests: InterestsTypedDict = {"name": "sergio", "interests": ["movies", "Golang", "Python", "in that order"]}
```

##### Annotating function parameters with Callable

You can use `Callable` to annotate arguments that are functions:

```python
from typing import Callable

def sum_numbers(x: int, y: int) -> int:
  return x + y

def compute(x: int, y: int, fn: Callable) -> int:
  return fn(x, y)
```

You can go above and beyond and also document the expected args using:

```python
from typing import Callable

def compute(x: int, y: int, fn: Callable[[int, int], int]) -> int:
  return fn(x, y)
```

##### Using `Any` when nothing else matches

You can use the `Any` type annotation when you want to explicitly state that the function doesn't care about the type it receives or returns.

```python
from typing import Any

def foo(x: Any) -> None:
  print(x)
```

##### Using `Optional` for optional parameters

You can use the following syntax to give type hints for parameters that also accept `None`; `Optional[bool]` is equivalent to `Union[bool, None]`:

```python
from typing import Optional

def foo(x: Optional[bool] = False) -> None:
  ...
```

##### Using `Sequence` for indexed types

You can use `Sequence` as a type hint for anything that can be indexed such as lists, tuples, strings, etc.

```python
from typing import Sequence

def print_sequence_elem(sequence: Sequence[str]):
  for i, s in enumerate(sequence):
    print(f"{i}: {s}")
```

| NOTE: |
| :---- |
| Sets and dictionaries cannot be indexed by position, and therefore do not qualify as sequences. |

##### Typed tuples with `Tuple`

While you can use `tuple` when you don't care about the types of the tuple elements, you can also create typed tuples with `Tuple`:

```python
from typing import Tuple

t: tuple = (1, 2, 3, "catorce")

t_2: Tuple[int, int, int, int] = (1, 2, 3, 14)
```

#### Confirming that Python doesn't enforce type hints

Create a Python script that violates its own type hints and check:
+ Whether the notebook cell reports the problem.
+ Whether the same script, when used outside the notebook, is flagged with an error, and whether it still runs.

In [1]:
def greet(name: int) -> None:
    return f"Hello, {name}!!!"


print(greet("sergio"))

Hello, sergio!!!


Note how the type hints are completely ignored in the notebook.

In [breaking_type_hints.py](exercises/section_16-docs/breaking_type_hints/breaking_type_hints.py) I included the same program, and there the following is displayed:

![breaking type hints](pics/type_hint_errors.png)

Even so, the script runs without problems.
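
The reason the program still runs is that type hints are stored as plain metadata on the function and are never checked at call time. A small sketch using the same `greet` function:

```python
def greet(name: int) -> None:
    return f"Hello, {name}!!!"

# The hints are stored as metadata on the function...
print(greet.__annotations__)  # {'name': <class 'int'>, 'return': None}

# ...but nothing checks them when the function is called
result = greet("sergio")
print(type(result).__name__)  # str
```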

#### Basic type hints

Create and annotate `add_nums(a, b)` that receives two numbers and returns the sum of such numbers.

In [2]:
def add_nums(a: int, b: int) -> int:
    return a + b

print(add_nums(2, 3))

5


## Section 17 &mdash; Files

Python's standard library provides a large number of functions to deal with file operations.

### Building paths

Python allows you to create paths by joining a path and a string using the `/` operator (which `pathlib` overloads for this purpose).

Create the file path `path/to/file.ext` using the `/` operator to concatenate a path created with `pathlib.Path` along with a string for the file.

In [1]:
import pathlib

path_prefix = pathlib.Path("path/to")
file_suffix = "file.ext"

path = path_prefix / file_suffix
print(path)

path/to/file.ext
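
Once a path is built, `pathlib` also lets you take it apart again. A short sketch using the same example path:

```python
import pathlib

path = pathlib.Path("path/to") / "file.ext"

print(path.name)    # file name with extension
print(path.stem)    # file name without extension
print(path.suffix)  # extension, including the leading dot
print(path.parent)  # directory part
```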


### Renaming files

Create a simple file renaming program that, given a path, a prefix pattern, and a wildcard of files, scans that path and renames the files matching the wildcard using the rule:

```
{out_prefix_pattern}_{counter}.{original_extension}
```

For example, if you have a directory with the files:

```
IMG_1642.jpg
IMG_4598.jpg
IMG_1763.jpg
```

you should be able to transform them into:
```
photo_001.jpg
photo_002.jpg
photo_003.jpg
```

by using `photo` as the `out_prefix_pattern`.

In [7]:
import pathlib
import shutil

path = pathlib.Path("./exercises/section_17-files/renaming_files/")
print(f"using path: {path}")

# We start by copying the original files to the out directory
orig_path = path / "orig"
out_path = path / "out"
for orig_file in orig_path.glob('*'):
    shutil.copy2(orig_file, out_path)

out_prefix = "file"
file_list = []
counter = 1

for file in sorted(out_path.glob('*')):
    file_list.append(file)

for file in file_list:
    resulting_file_name = f"{out_prefix}_{counter}{file.suffix.lower()}"
    file.rename(out_path / resulting_file_name)
    print(f"{orig_path}/{file.name} -> {out_path}/{resulting_file_name}")
    counter += 1



using path: exercises/section_17-files/renaming_files
exercises/section_17-files/renaming_files/orig/a_file.txt -> exercises/section_17-files/renaming_files/out/file_1.txt
exercises/section_17-files/renaming_files/orig/another_file.txt -> exercises/section_17-files/renaming_files/out/file_2.txt
exercises/section_17-files/renaming_files/orig/yet_another_file.txt -> exercises/section_17-files/renaming_files/out/file_3.txt
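
The solution above numbers files as `file_1`, `file_2`, and so on, while the exercise statement shows zero-padded counters (`photo_001`). A minimal sketch of the padding step, which only builds names and does not touch the file system (`build_new_name` and its `width` parameter are hypothetical):

```python
import pathlib


def build_new_name(original: pathlib.Path, out_prefix: str, counter: int, width: int = 3) -> str:
    # {counter:0{width}d} left-pads the counter with zeros up to `width` digits
    return f"{out_prefix}_{counter:0{width}d}{original.suffix.lower()}"


originals = [pathlib.Path(n) for n in ("IMG_1642.jpg", "IMG_4598.jpg", "IMG_1763.jpg")]
for counter, original in enumerate(sorted(originals), start=1):
    print(f"{original.name} -> {build_new_name(original, 'photo', counter)}")
```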


## Section 18 &mdash; The `with` statement

The `with` statement is used in exception handling code to simplify the management of resources such as files and database connections, so that they are correctly handled in error situations.

### Using `with` to control file exceptions

Consider a block of code that opens a file for writing, writes a string into the file, and then closes the file.

Write the block of code using three different approaches:

1. Don't use any exception control. Explain why the approach is weak.

2. Use `try`/`except`/`finally`, solving all the problems of the first approach.

3. Use `with` and discuss the functionality and readability of the approach.

In [8]:
# Option 1
path = "./exercises/section_17-files/using_with/delete_me.txt"

file = open(path, "x") # x: exclusive creation (will fail if file already exists)
file.write("Hello to Jason Isaacs!\n")
file.close()

The previous snippet has no exception handling, and yet each of the statements can fail:
+ the file might already exist
+ the disk might be read-only or full, which will make `file.write` fail.
+ the close operation might fail for some reason.

In [None]:
# Option 2

path = "./exercises/section_17-files/using_with/delete_me.txt"

file = None
try:
    file = open(path, "x") # x: exclusive creation (will fail if file already exists)
    file.write("Hello to Jason Isaacs!\n")
except Exception as err:
    print(f"error found while writing to file: {err}")
finally:
    if file is not None:  # open() itself may have failed
        file.close()

The second snippet solves the problems of the previous approach: the file is closed whenever it was successfully opened, and the error is reported so that it's not lost.

In [11]:
# Option 3
path = "./exercises/section_17-files/using_with/delete_me.txt"

with open(path, "x") as file:
    file.write("Hello to Jason Isaacs!\n")


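The previous snippet closes the file under all circumstances, like the second snippet, but it is much more succinct because the cleanup happens behind the scenes. Unlike the second snippet, it does not catch the exception: any error still propagates to the caller.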

### Providing support to `with` in custom classes

Write a simple class `MessageWriter` that supports the following syntax:

```python
with MessageWriter("filename") as xfile:
  xfile.write(str)
```

When using the previous approach, `MessageWriter` should write the given string to a file, doing proper resource management with the file.

In [16]:
import traceback


class MessageWriter:
    def __init__(self, name):
        self.name = name

    def __enter__(self):
        self.file = open(self.name, "x")
        return self  # so that xfile.write() calls MessageWriter.write

    def __exit__(self, exception_type, exception_value, exception_tb):
        if exception_type is not None:
            traceback.print_exception(exception_type, exception_value, exception_tb)
        self.file.close()

    def write(self, message):
        self.file.write(message)

path = "./exercises/section_17-files/using_with/delete_me.txt"

with MessageWriter(path) as xfile:
    xfile.write("Hello to Jason Isaacs!\n")
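
As an alternative to writing `__enter__` and `__exit__` by hand, the standard library offers `contextlib.contextmanager`. The sketch below is one possible equivalent, not the exercise's reference solution; `message_writer` and the temporary target path are assumptions made for this example:

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def message_writer(name):
    file = open(name, "x")  # exclusive creation, as in the class-based version
    try:
        yield file          # the body of the `with` block runs here
    finally:
        file.close()        # always runs, even if the body raises


# Hypothetical target path inside a fresh temporary directory
target = os.path.join(tempfile.mkdtemp(), "delete_me.txt")

with message_writer(target) as xfile:
    xfile.write("Hello to Jason Isaacs!\n")

with open(target) as check:
    print(check.read(), end="")
```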


## Section 19 &mdash; Interacting with the underlying OS

This section illustrates different ways in which you can interact with the underlying OS.

### Exiting a program with `quit()`

You can exit from a running program/script using the function `quit()`.

Create a program that simulates rolling a die and reports the number of consecutive times you obtain an even number. When an odd number is found, the program should quit.

| NOTE: |
| :---- |
| The `quit()` function does not work on notebook cells, so the exercise must be implemented as a standalone script. |

The solution is implemented in [dice_quit/main.py](exercises/section_19-os/dice_quit/main.py)

### Exiting a program by raising a `SystemExit`

Besides `quit()`, it is possible to raise a `SystemExit` exception to terminate a running program.

This will cause the program to stop even on notebook cells, so it's more portable than `quit()`.

Implement the previous exercise on a notebook cell using `SystemExit`.

In [2]:
from random import randint

consecutive_throws_count: int = 0
while True:
    dice_throw: int = randint(1, 6)
    print(f"You obtained {dice_throw}")
    if dice_throw % 2 == 0:
        consecutive_throws_count += 1
    else:
        print(f"You reached {consecutive_throws_count} consecutive throws")
        raise SystemExit


You obtained 4
You obtained 1
You reached 1 consecutive throws


SystemExit: 

## Section 20 &mdash; Date and Time

### Parsing a string into a timezone-aware datetime object

Python has support for parsing strings into `datetime` objects using the `datetime.strptime()` function.

Use this function to parse:
1. 1974-02-05T14:05:18
2. 17/05/2008 23:15:47

| HINT: |
| :---- |
| You will need to provide the format to `strptime` (see https://docs.python.org/3/library/datetime.html#datetime.datetime.isoformat for examples). |

In [7]:
from datetime import datetime

dt1 = datetime.strptime("1974-02-05T14:05:18", "%Y-%m-%dT%H:%M:%S")
print(dt1)

dt2 = datetime.strptime("17/05/2008 23:15:47", "%d/%m/%Y %H:%M:%S")
print(dt2)

1974-02-05 14:05:18
2008-05-17 23:15:47
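
Both results above are naive datetimes, because the input strings carry no offset. If the string does include a UTC offset, the `%z` directive yields a timezone-aware object. A small sketch, where the `+0200` suffix is an assumed offset added for illustration:

```python
from datetime import datetime, timedelta

# The trailing +0200 is an assumed UTC offset, not part of the exercise input
dt3 = datetime.strptime("2008-05-17T23:15:47+0200", "%Y-%m-%dT%H:%M:%S%z")
print(dt3)
print(dt3.utcoffset())
```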


### Basic datetime objects

The `datetime` module contains three primary types of objects:
+ `date`
+ `time`
+ `datetime`

Arithmetic operations for these objects are only supported within the same data type, but it is easy to convert from one to the other.

1. Create a variable that holds today's date
2. Create a variable that holds the first day of 2024.
3. Create a variable that holds noon's time
4. Create a variable that holds current datetime
5. Create a variable that holds the datetime 1974-02-05T14:05:48
6. Try to subtract noon from today's date. What exception do you get?
7. Convert date to a datetime using `datetime`
8. Combine a date and a time into a datetime using `datetime.combine`.

In [16]:
from datetime import datetime, date, time

# 1: today's date
todays_date = date.today()
print(todays_date)

# 2: first day of 2024
new_years_day = date(2024, 1, 1)
print(new_years_day)

# 3: noon's time
noon_time = time(12, 0, 0)
print(noon_time)

# 4: current's datetime
now = datetime.now()
print(now)

# 5: variable for dt
dt = datetime(1974, 2, 5, 14, 5, 48)
print(dt)

# 6: subtracting noon from today's date (it fails)
try:
    print(todays_date - noon_time)
except Exception as ex:
    print(f"exception caught: {ex}")

# 7: converting a date to a datetime
todays_datetime = datetime(todays_date.year, todays_date.month, todays_date.day)
print(todays_datetime)

# 8: combining a date and time into a datetime
todays_noon_datetime = datetime.combine(todays_date, noon_time)
print(todays_noon_datetime)

2023-05-24
2024-01-01
12:00:00
2023-05-24 09:21:32.290283
1974-02-05 14:05:48
exception caught: unsupported operand type(s) for -: 'datetime.date' and 'datetime.time'
2023-05-24 00:00:00
2023-05-24 12:00:00


### Constructing timezone-aware datetime objects

A `datetime` object is considered *naive* if it is unaware of the timezone information.

To make it timezone aware, you have to provide the UTC offset and timezone abbreviation as a function of date and time.

Build a timezone-aware datetime object by:
1. Defining a `datetime` object and passing the `tzinfo` information, which you will need to have previously defined as an object using `timezone`.
2. Repeating the same exercise giving a name to the `timezone`, and using `dt.tzname()` to retrieve it.

In [20]:
from datetime import datetime, timezone, timedelta

# 1
cest_tz = timezone(timedelta(hours=2))  # CEST is UTC+2
dt = datetime(2023, 5, 24, 9, 23, 35, tzinfo=cest_tz)
print(dt) # time
print(dt.tzname())

# 2
dt = datetime(2023, 5, 24, 9, 23, 35, tzinfo=timezone(timedelta(hours=2), "CEST"))
print(dt) # time
print(dt.tzname())

2023-05-24 09:23:35+02:00
UTC+02:00
2023-05-24 09:23:35+02:00
CEST
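
Once a datetime is timezone-aware, it can be converted to another timezone with `astimezone()`. A minimal sketch, assuming a fixed UTC+2 offset for the local time:

```python
from datetime import datetime, timezone, timedelta

local_dt = datetime(2023, 5, 24, 9, 23, 35, tzinfo=timezone(timedelta(hours=2), "CEST"))
utc_dt = local_dt.astimezone(timezone.utc)

print(utc_dt)
print(local_dt == utc_dt)  # aware datetimes compare by the instant they represent
```

Both objects describe the same instant, so only the wall-clock representation changes.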


### Computing time differences

Time differences are represented by the `timedelta` class included in the `datetime` module.

Compute:
1. The difference between now and "1974-02-05" (no time).
2. The number of days between those dates.
3. The number of seconds between those dates.
4. Define a function `get_date_n_days_after_today` that returns the date resulting from adding n days to today's date.
5. Define a function `get_date_n_days_before_today` that returns the date resulting from subtracting n days from today's date.

In [33]:
from datetime import datetime, timedelta

# 1: difference between a date and now
now = datetime.now()
print(now - datetime(1974, 2, 5))

# 2: difference in days
diff = now - datetime(1974, 2, 5)
print(f"I've lived for {diff.days} days")

# 3: difference in seconds
diff_seconds = diff.days * 24 * 60 * 60 + diff.seconds
print(f"I've lived for {diff.total_seconds()} seconds")
print(f"I've lived for {diff_seconds} seconds")


# 4: get_date_n_days_after_today
def get_date_n_days_after_today(num_days, date_format="%d %B %Y"):
    then = datetime.now() + timedelta(days=num_days)
    return then.strftime(date_format)

print(get_date_n_days_after_today(2))

# 5: get_date_n_days_before_today
def get_date_n_days_before_today(num_days, date_format="%d %B %Y"):
    then = datetime.now() - timedelta(days=num_days)
    return then.strftime(date_format)

print(get_date_n_days_before_today(2))


18005 days, 12:21:05.729326
I've lived for 18005 days
I've lived for 1555676465.729326 seconds
I've lived for 1555676465 seconds
26 May 2023
22 May 2023


## Section 21 &mdash; Python Style Guide

The official style guide for Python is described in [PEP 8 &mdash; Style Guide for Python Code](https://www.python.org/dev/peps/pep-0008/).

The principle for any style guide is to give a series of simple rules that should be used to improve the readability of the code and make it consistent across libraries and projects.

Also, any style guide should be taken as a pragmatic document: it is OK to break the rules for the sake of simplicity, or even readability.

For example, Google publishes its own style guide in [Google Python Style Guide](https://google.github.io/styleguide/pyguide.html).

They provide a `pylintrc` file that you can configure in your projects to ensure you're following the style guide.

## Section 22 &mdash; More on Operators

Python features the expected arithmetic operators found in other programming languages, along with less common ones such as `**` for exponentiation and `//` for floor division (integer division).

### Exponentiation and Floor Division
Use the exponentiation and floor division operators to compute:

+ $ 2^4 $
+ floor division of 4.5 by 2 (should be 2.0)

In [5]:
print(2 ** 4)
print(4.5 // 2)

16
2.0


### Overloaded operators

As Python supports operator overloading, you will find situations in which you can use certain mathematical or logical operators on built-in types or classes.

For example, `+` can be used to concatenate strings.

Use the `+` operator to concatenate strings:

In [7]:
s1 = "Hello to"
s2 = "Jason Isaacs!"

print(s1 + " " + s2)

Hello to Jason Isaacs!


### Boolean operators

Python provides `not`, `and`, and `or` operators for logical operations.

Define two conditions:
+ condition 1: 1 > 2
+ condition 2: 2 > 1

print:
+ the negation of condition 1
+ the and of condition 1 and 2
+ the or of condition 1 and 2

In [8]:
c1 = 1 > 2
c2 = 2 > 1

print(not c1)
print(c1 and c2)
print(c1 or c2)

True
False
True


### Expressions with falsy values and short-circuiting

You can use `and` and `or` as expressions in Python.

`or` can be used in an expression to return the value of the first operand if it is not falsy, and return the second if it is.

This can be used in an expression to obtain a default value.

`and` in an expression only evaluates the second argument if the first one is truthy. As a result, you can use `x and y` as a shortcut for `x if not x else y`.

Validate with an example.

In [11]:
# default value
user_input = None
val = user_input or "default"
print(val)

# assigned value
user_input = "hello"
val = user_input or "default"
print(val)

default
hello


In [19]:
x = 0
y = 1

print(x and y)
print(x if not x else y) # ternary op

0
0
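
To see the short-circuiting itself, the sketch below records whether the right-hand operand was evaluated. `expensive_check` and `calls` are hypothetical names introduced for this example:

```python
def expensive_check():
    # Hypothetical helper: records that it was called
    calls.append("called")
    return True


calls = []

result = False and expensive_check()  # right-hand side is skipped
print(result, calls)

result = True or expensive_check()    # right-hand side is skipped again
print(result, calls)

result = True and expensive_check()   # now the right-hand side runs
print(result, calls)
```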


### Bitwise operators

The following bitwise operators are available in Python:

+ `&`: binary AND
+ `|`: binary OR
+ `^`: binary XOR
+ `~`: binary NOT
+ `<<`: shift left
+ `>>`: shift right
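
A short sketch of these operators applied to two small integers, written as binary literals so that the bit patterns are visible:

```python
a = 0b1100  # 12
b = 0b1010  # 10

print(bin(a & b))  # bits set in both operands
print(bin(a | b))  # bits set in either operand
print(bin(a ^ b))  # bits set in exactly one operand
print(~a)          # equals -(a + 1) for Python integers
print(a << 2)      # same as multiplying by 2**2
print(a >> 2)      # same as floor-dividing by 2**2
```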

### `is` and `in` operators

`is` is the identity operator: it returns `True` if and only if two names refer to the same object.

`in` is the membership operator: it returns `True` if a value is contained in a container such as a list, tuple, set, dictionary (keys), or string.

Create some snippets to test both.

In [26]:
# strings
s1 = "Hello"
s2 = "Hello"
print(s1 == s2)
print(s1 is s2)
print()

# numbers
i1 = 5
i2 = 5
print(i1 == i2)
print(i1 is i2)
print()

# booleans
b1 = True
print(b1)
print(b1 is True)
print(b1 == True)
print()

# class
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        if self.name == other.name and self.age == other.age:
            return True
        else:
            return False

p1 = Person("Jason", 53)
p2 = Person("Jason", 53)
print(p1 == p2)
print(p1 is p2)


True
True

True
True

True
True
True

True
False
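
The cell above exercises `is` only. A complementary sketch for `in` follows; the `fruits` and `stock` values are made up for illustration:

```python
fruits = ["apple", "pear", "plum"]
stock = {"apple": 3, "pear": 0}

print("pear" in fruits)         # list membership
print("kiwi" not in fruits)     # negated form
print("apple" in stock)         # dictionaries test their keys...
print(3 in stock)               # ...not their values
print(3 in stock.values())      # values must be checked explicitly
```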


### The ternary operator

The ternary operator in Python is defined as:

```python
result_if_true if condition else result_if_false
```

Use the ternary operator to create an expression that returns `True` when passed an age of 18 or over, and wrap it in a function called `is_adult`:

In [28]:
def is_adult(age: int) -> bool:
    return True if age >= 18 else False

print(is_adult(15))
print(is_adult(18))
print(is_adult(21))


False
True
True


## Section 23 &mdash; More on strings

Strings in Python can be enclosed in single or double quotes, and you can use `+` to concatenate strings and `+=` to append to a string.

Strings have a number of built-in methods:

+ `isalpha()` &mdash; returns true if the string contains only letters and is not empty

+ `isalnum()` &mdash; true if the string contains only letters or digits and is not empty

+ `isdecimal()` &mdash; true if the string contains only decimal digits and is not empty

+ `lower()` &mdash; returns a lowercase version of a string

+ `islower()` &mdash; true if a string is lowercase

+ `upper()` &mdash; returns an uppercase version of a string

+ `isupper()` &mdash; true if a string is uppercase

+ `title()` &mdash; gets a capitalized version of a string

+ `startswith()` &mdash; checks if a string starts with a specific substring

+ `endswith()` &mdash; checks if a string ends with a specific string

+ `replace()` &mdash; replaces a part of a string

+ `split()` &mdash; splits a string on a specific separator (whitespace by default)

+ `strip()` &mdash; trims whitespace characters from a string

+ `join()` &mdash; joins strings

+ `find()` &mdash; finds the position of a substring

None of these methods modifies the original string, because strings are immutable: the methods that transform a string return a new one, while the others return a `bool`, a `list`, or an `int`.

Use some of these methods in an example:

In [1]:
text = "the world in which we live in"  # avoid naming a variable `str`: it shadows the built-in
print(text.title())

The World In Which We Live In
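
A slightly longer sketch that chains several of the methods listed above. The sample sentence, with its extra whitespace and comma, is an assumption made for this example:

```python
raw = "  the world, in which we live in  "

clean = raw.strip()                     # trim surrounding whitespace
words = clean.replace(",", "").split()  # drop the comma, then split on whitespace
print(words)
print("-".join(words))                  # join the words with a separator
print(clean.find("world"))              # index of the first match, -1 if absent
print(clean.startswith("the"), clean.endswith("in"))
print(repr(raw))                        # the original string is unchanged
```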


### The `len` built-in function

The `len()` built-in function returns the length of a string.

Use it to calculate the length of a string:

In [3]:
name = "Idris Elba"
print(len(name))

10


### The `in` operator with strings

The `in` operator can be used to check if a string contains a substring.

Use the `in` operator to check whether 'son' appears in 'Jason Isaacs':

In [4]:
print("son" in "Jason Isaacs")

True


### Escaping characters within a string

You can use `\` to escape characters within a string.

Use it to quote a string.

In [6]:
name = "Margot Robbie"

print(f"\"{name}\" is an actor.")

"Margot Robbie" is an actor.


### Indexing and slicing

Individual characters or ranges of characters within a string can be accessed like elements in a list.

Predict the result of the following operations:

```python
name = "foobar"
name[0]       # f
name[1]       # o
name[-1]      # r
name[-2]      # a
name[0:2]     # fo
name[3:]      # bar
name[:2]      # fo
name[1:-1]    # ooba
```

In [7]:
name = "foobar"
print(name[0])       # f
print(name[1])       # o
print(name[-1])      # r
print(name[-2])      # a
print(name[0:2])     # fo
print(name[3:])      # bar
print(name[:2])      # fo
print(name[1:-1])    # ooba

f
o
r
a
fo
bar
fo
ooba


### String formatting

The modern way to format strings is to use *f-string formatting*, as in:

```python
f"Hi, this is {name}!"
```

An alternative is the `str.format()` method:

```python
"Hi, this is {}!".format(name)
```

There is an even older way, which uses the `%` operator and format specifiers.

Use the three approaches to print a string with two embedded variables, `name` and `age`.

In [10]:
name = "sergio"
age = 49

print(f"My name is {name} and I'm {age} years old.")
print("My name is {} and I'm {} years old.".format(name, age))
print("My name is %s and I'm %d years old." % (name, age))

My name is sergio and I'm 49 years old.
My name is sergio and I'm 49 years old.
My name is sergio and I'm 49 years old.


### Using `isalnum()` to check whether strings represent alphanumeric values

In [43]:
username = "123@!"
assert username.isalnum() == False

username = "123avs"
assert username.isalnum() == True


### Using `isalpha()` to check whether a string contains letters only

In [44]:
assert "Homework".isalpha() == True
assert "CS101".isalpha() == False

### Using `isnumeric()` to check whether a string contains numbers only



In [47]:
from datetime import datetime

done = False
while not done:
    age = input("Type your age:")
    if age.isnumeric():
        print(f"You were born in {datetime.now().year - int(age)}")
        done = True
    else:
        print("ERROR: age must be a number")

You were born in 2011


Please note that strings that represent negative integers won't pass the `isnumeric` check.

Also, empty strings return False.

In [49]:
assert "-2".isnumeric() == False
assert "".isnumeric() == False

### Casting strings to numbers

You can use `float("string")` and `int("string")` to cast a string into either an integer or a floating-point number.

When the casting fails, you'll get a `ValueError`:

In [58]:
def cast_to_float(number_str):
    try:
        casted_number = float(number_str)
    except ValueError:
        print(f"ERROR: '{number_str}' cannot be casted into a number")
    else:
        print(f"Casted '{number_str}' to {casted_number}")

cast_to_float("2.345")
cast_to_float("2.d345")

Casted '2.345' to 2.345
ERROR: '2.d345' cannot be casted into a number


### Concatenating f-strings in a single string

You might need to build a single string from several f-strings. In those cases, place the f-strings next to each other and continue each line with `\`; Python joins adjacent string literals automatically:

In [62]:
settings = {
    "font_size": "large",
    "font": "Arial",
    "color": "Black",
    "align": "center"
}

styles = f"font-size={settings['font_size']}, " \
         f"font={settings['font']}, " \
         f"color={settings['color']}, " \
         f"align={settings['align']}"

print(styles)

font-size=large, font=Arial, color=Black, align=center
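
A common variation is to wrap the adjacent f-strings in parentheses, which removes the need for backslashes. A minimal sketch with a shortened, hypothetical `settings` dictionary:

```python
settings = {"font_size": "large", "font": "Arial"}

# Adjacent string literals inside parentheses are joined into one string
styles = (
    f"font-size={settings['font_size']}, "
    f"font={settings['font']}"
)

print(styles)
```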


### Quoting f-strings

You can use the `!r` conversion to apply `repr()` to a value within an f-string, which for strings adds quotes, as seen below:

In [4]:
name = "Jason"

print(f"Hello, your name is {name!r}")

Hello, your name is 'Jason'


## Section 24 &mdash; More on Booleans

The `bool` type can have the values `True` and `False`.



There are a few truthiness rules for non-boolean types that you should be aware of:

+ numbers are truthy except for zero (`0`, `0.0`).
+ strings are falsy only when empty.
+ lists, tuples, sets, and dictionaries are falsy only when empty.

You can check if a given variable is a `bool` by using `isinstance(var, bool)`.

Create a program that validates the truthiness rules explained above:

In [4]:
n = 0
if n:
    print("Unexpected")
else:
    print("expected")

s1 = ""
s2 = "hi!"


if s1:
    pass
else:
    print("empty string")

if s2:
    print("non-empty string")

l = [1, 2, 3]
d = {}

if l:
    print("non-empty list")

if d:
    pass
else:
    print("empty dict")

expected
empty string
non-empty string
non-empty list
empty dict


### `any` and `all`
Python defines the following built-in functions that receive an iterable:
+ `any()` returns `True` if any of the elements of the iterable is `True`
+ `all()` returns `True` if all of the elements of the iterable are `True`

Create a simple program that illustrates the usage and behavior of `any` and `all`.

In [10]:
# print(any(False, False, True)) # ERROR: any takes one argument
print(any([False, False, True]))
print(any([False, False, False]))

print(all([True, True, True]))
print(all([False, False, False]))

True
False
True
False
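
In practice, `any` and `all` are often combined with generator expressions, and their behavior on empty iterables is worth knowing. A short sketch with made-up numbers:

```python
numbers = [3, 8, 12]

print(any(n > 10 for n in numbers))  # at least one element is over 10
print(all(n > 10 for n in numbers))  # not every element is over 10

# Edge cases for empty iterables
print(any([]))  # False: no truthy element exists
print(all([]))  # True: no falsy element exists
```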


## Section 25 &mdash; Enums

Enums are readable names bound to constant values.

The following snippet defines an enum `State` with values `DISABLED` and `ENABLED`.

```python
from enum import Enum

class State(Enum):
  DISABLED = 0
  ENABLED = 1
```

Python will allow you to refer to enums using:

```Python
State.ENABLED
State(1)
State["DISABLED"]
```

You can get the value of an enum using:

```Python
State.ENABLED.value
```

And you can list and count the different values of an enum using:

```Python
list(State)
len(State)
```
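
The fragments above can be combined into one runnable sketch that also shows the `name` attribute of each member:

```python
from enum import Enum


class State(Enum):
    DISABLED = 0
    ENABLED = 1


print(State(1) is State.ENABLED)            # lookup by value
print(State["DISABLED"] is State.DISABLED)  # lookup by name
print(State.ENABLED.name, State.ENABLED.value)
print([member.name for member in State])
print(len(State))
```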

## Section 26 &mdash; Emulating constants

Python has no language construct to enforce that a variable should be a constant, but you can emulate them using an `Enum`:

```Python
from enum import Enum

class Constants(Enum):
  WIDTH = 1024
  HEIGHT = 768
```

This serves the purpose, because an assignment like the following fails:

```Python
Constants.WIDTH = 2048 # ERROR: cannot reassign
```

However, the developer experience is far from perfect, because you need `.value` to get the actual value:

```python
print(Constants.WIDTH.value) # 1024
print(Constants.WIDTH)       # Constants.WIDTH

```

In [11]:
from enum import Enum

class Constants(Enum):
  WIDTH = 1024
  HEIGHT = 768

print(Constants.WIDTH)

Constants.WIDTH


The alternative is relying on naming conventions:

```python
WIDTH = 1024
```
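
The naming convention can be complemented with the `Final` annotation from `typing` (available since Python 3.8). This is only a hint for static type checkers, so treat the sketch below as an optional addition rather than an enforcement mechanism:

```python
from typing import Final

WIDTH: Final = 1024
HEIGHT: Final[int] = 768

# A type checker is expected to flag a later reassignment such as `WIDTH = 2048`,
# but the interpreter itself would still allow it at runtime.
print(WIDTH * HEIGHT)
```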

## Section 27 &mdash; Reading user input

You can read information the user types in the standard input using the `input()` built-in function.

In [12]:
print("About to delete resource")
print("Are you sure you want to continue?")
user_input = input()
print(f"The user typed: {user_input}")

About to delete resource
Are you sure you want to continue?


## Section 28 &mdash; More on `if`

Python supports the `elif` keyword to implement *else-if* branching:

In [None]:
n = 3
if n == 1:
    print("the number was one")
elif n == 2:
    print("the number was two")
else:
    print("the number was not one or two")

## Section 29 &mdash; More on functions

### Arguments are passed by reference

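Arguments in Python are always passed by reference (more precisely, the function receives a reference to the same object the caller holds), and all types in Python are objects.

However, as some Python objects are immutable, it might seem that a certain argument is received as a value instead. In the example below, `val *= 2` creates a new `int` and rebinds the local name `val`, so the caller's `num` is unchanged: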

In [1]:
def change_int(val):
    val *= 2
    print("In change_int: val=", val)

num = 5
change_int(num)
print(num)

In change_int: val= 10
5
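
With a mutable argument the difference becomes visible. The sketch below contrasts rebinding the parameter with mutating the object it refers to; `rebind_list` and `mutate_list` are hypothetical names:

```python
def rebind_list(values):
    values = values + [99]  # creates a new list; the caller's list is untouched
    return values


def mutate_list(values):
    values.append(99)       # modifies the very object the caller passed in


nums = [1, 2, 3]
rebind_list(nums)
print(nums)

mutate_list(nums)
print(nums)
```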


### Returning multiple values

A function in Python can return multiple values using the syntax explained below. The consumer code will receive the results in a tuple.

In [2]:
def foo(name):
    return name, "bar", 55

print(foo("Idris"))

('Idris', 'bar', 55)


### Nested functions

Python supports nested functions, that is, functions defined within functions.

You will always be able to read a variable defined in the enclosing function (see below), but when modifying it, you will need to use the keyword `nonlocal`.

In [4]:
def count():
    count = 0

    def increment():
        nonlocal count
        count += 1
        print("within increment: count:", count)

    return increment

inc = count()
inc()
inc()


within increment: count: 1
within increment: count: 2


### Setting and calling default arguments in functions

Python allows you to define functions with default arguments, so that if the client code does not provide it, it takes that default value.

This approach is taken by many built-in and stdlib functions:

In [7]:
numbers = [4, 5, 7, 2]

numbers.sort()  # using default value
assert numbers == [2, 4, 5, 7]

numbers.sort(reverse=True)  # using non default value
assert numbers == [7, 5, 4, 2]

Actually, the signature for the `sort` function is even more complicated:

```python
sort(*, key=None, reverse=False)
```

The asterisk `*` in the `sort` method dictates that all the arguments following the asterisk should be set with their parameter names, that is, those are keyword arguments instead of positional arguments.

As a result, you cannot invoke the function doing `sort(True)`.

Defining functions with default arguments is easy.

Consider the following class that models a Task entity. A function is also provided to complete a task with an optional completion note:

In [14]:
class Task:
    def __init__(self, title, description, urgency):
        self.title = title
        self.description = description
        self.urgency = urgency


def complete_task(task, note=""):
    task.status = "completed"
    task.note = note
    print(f"{task.title}'s has been completed; note={task.note}")


task = Task("Homework", "Physics + Math", 5)
complete_task(task, "took me 2 hours!")

print(f"task's note = {task.note}")

Homework's has been completed; note=took me 2 hours!
task's note = took me 2 hours!


### Arguments vs. Parameters

When we define functions, we refer to the variables specified in the function head as *parameters*. When we call functions, we refer to the values we pass as *arguments*.

> Parameters are the variables used in a function definition; arguments are the variables used in a function's invocation.

### Setting default arguments for mutable fields

Consider the following situation in which instead of using a default value for a string (which we know is immutable) we use a mutable one such as a list or a dictionary.

In particular, we have a function that lets us put the completed task into a particular *queue* of tasks. The default value is set as `[]` with the intention of instantiating an empty list if the caller doesn't provide one.

In [21]:
class Task:
    def __init__(self, title, description, urgency):
        self.title = title
        self.description = description
        self.urgency = urgency


def complete_task(task, group=[]):
    task.status = "completed"
    group.append(task.title)
    print(f"{task.title}'s has been completed")
    print(f"   >> id(group)=", id(group))
    return group


homework = Task("Homework", "Physics + Math", 5)
videogames = Task("Videogames", "Play videogame", 2)

boring_tasks = []
complete_task(homework, boring_tasks)
print(f"completed boring tasks: ", boring_tasks)

grouped_tasks = complete_task(videogames)
print(f"unclassified tasks: ", grouped_tasks)
grouped_tasks = complete_task(homework)
print(f"unclassified tasks: ", grouped_tasks)

print(id(grouped_tasks))

Homework's has been completed
   >> id(group)= 140098145907456
completed boring tasks:  ['Homework']
Videogames's has been completed
   >> id(group)= 140098167892288
unclassified tasks:  ['Videogames']
Homework's has been completed
   >> id(group)= 140098167892288
unclassified tasks:  ['Videogames', 'Homework']
140098167892288


Note that as counterintuitive as it might be, when invoking `complete_task` with no value for the second parameter, the title gets added to the previous state of the list.

The reason is the following:

> Python evaluates the default argument values when the function is defined, not when it's called. As a result, any mutable default arguments are created during this evaluation and become part of the function.

Therefore, the `group` argument gets bound to the function and initialized only once.

You can check those default arguments using `__defaults__`.
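
A minimal sketch of that inspection, using a deliberately flawed hypothetical function `append_title`:

```python
def append_title(title, group=[]):
    # Deliberately uses a mutable default to expose the pitfall
    group.append(title)
    return group


print(append_title.__defaults__)  # the default list starts empty
append_title("Homework")
append_title("Videogames")
print(append_title.__defaults__)  # the same list now holds both titles
```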

It's common practice to sort out this issue by setting the default value to `None` and using additional code to create the corresponding list (or mutable object):

In [22]:
class Task:
    def __init__(self, title, description, urgency):
        self.title = title
        self.description = description
        self.urgency = urgency


def complete_task(task, group=None):
    task.status = "completed"
    if group is None:
        group = []
    group.append(task.title)
    print(f"{task.title}'s has been completed")
    return group


homework = Task("Homework", "Physics + Math", 5)
videogames = Task("Videogames", "Play videogame", 2)

boring_tasks = []
complete_task(homework, boring_tasks)
print(f"completed boring tasks: ", boring_tasks)

grouped_tasks = complete_task(videogames)
print(f"unclassified tasks: ", grouped_tasks)
grouped_tasks = complete_task(homework)
print(f"unclassified tasks: ", grouped_tasks)



Homework's has been completed
completed boring tasks:  ['Homework']
Videogames's has been completed
unclassified tasks:  ['Videogames']
Homework's has been completed
unclassified tasks:  ['Homework']


### Functions returning multiple values

Python allows you to return multiple values. Those values will effectively be wrapped into a `tuple`.

In [28]:
from statistics import mean, stdev

def generate_stats(measures):
    mean_value = mean(measures)
    std_dev = stdev(measures)
    return mean_value, std_dev

results = generate_stats([1, 2, 3, 4, 5])
print(f"results={results}; {type(results)}")


results=(3, 1.5811388300841898); <class 'tuple'>


Those can easily be unpacked into individual variables, as you'd do in other programming languages supporting multiple return values (Go, I'm looking at you!)

In [30]:
from statistics import mean, stdev

def generate_stats(measures):
    mean_value = mean(measures)
    std_dev = stdev(measures)
    return mean_value, std_dev

mean_val, std_dev = generate_stats([1, 2, 3, 4, 5])
print(f"mean_val={mean_val}, std_dev={std_dev}")

mean_val=3, std_dev=1.5811388300841898


### Partial functions

We often define multiple parameters in a function so that it can handle different forms of input to derive the needed result for different scenarios.

Consider the following function that runs a stats model:

```python
def run_stats_model(dataset, model, output_path):
  # process the dataset
  # apply the model
  # save the stats to the output_path
  calculated_stats = ...
  return calculated_stats
```

This function is quite generic, so it can be used in multiple scenarios:

```python
run_stats_model(dataset_a1, "model_a", "project_a/stats/")
run_stats_model(dataset_a2, "model_a", "project_a/stats/")
run_stats_model(dataset_b1, "model_b", "project_b/stats/")
run_stats_model(dataset_b2, "model_b", "project_b/stats/")
```

While you could define a dedicated wrapper function for each scenario:

```python
def run_stats_model_a(dataset):
  results = run_stats_model(dataset, "model_a", "project_a/stats/")
  return results
```

Python provides a more Pythonic way to do so using the `partial` function from the `functools` module:

```python
from functools import partial

run_stats_model_a = partial(run_stats_model, model="model_a", output_path="project_a/stats/")

run_stats_model_a("dataset_a")
```
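
The snippet above can be turned into a runnable sketch. Here `run_stats_model` is a toy stand-in that only computes a mean and returns a dictionary; its body and return value are assumptions made for illustration, not the real model:

```python
from functools import partial
from statistics import mean


def run_stats_model(dataset, model, output_path):
    # Hypothetical stand-in: a real implementation would apply `model`
    # and save the results under `output_path`.
    return {"model": model, "output_path": output_path, "mean": mean(dataset)}


# Pre-fill the arguments that never change for project A.
run_stats_model_a = partial(
    run_stats_model, model="model_a", output_path="project_a/stats/"
)

stats = run_stats_model_a([1, 2, 3, 4, 5])
print(stats)

# The partial object remembers the wrapped function and the frozen keywords.
print(run_stats_model_a.func.__name__, run_stats_model_a.keywords)
```

Because the frozen arguments are passed as keywords, the remaining positional argument (`dataset`) can still be supplied normally at call time.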

## Section 30 &mdash; More on loops

`for` loops in Python are closer to the *for-each* construct of other programming languages: they iterate directly over the elements of an iterable:

In [4]:
items = [1, 2, 3, 4]
total = 0  # avoid naming it `sum`, which would shadow the built-in function
for item in items:
    total += item

print("sum:", total)

sum: 10


Often, you will use the `range` function to emulate the counter-based `for` loops of other languages.

In [5]:
for item in range(4):
    print(item)

0
1
2
3


`range` also accepts `start`, `stop`, and `step` arguments, which let you fine-tune the values it produces:

In [6]:
total = 0
for n in range(1, 100, 10):
    print("n:", n)
    total += n

print("sum:", total)

n: 1
n: 11
n: 21
n: 31
n: 41
n: 51
n: 61
n: 71
n: 81
n: 91
sum: 460


### `enumerate` in loops

You can use the built-in function `enumerate()` to obtain an iterable of tuples, each containing the index of an element and its value:

In [7]:
items = range(50, 55)
for index, item in enumerate(items):
    print(f"{index}: {item}")

0: 50
1: 51
2: 52
3: 53
4: 54
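
`enumerate()` also accepts an optional `start` argument, which is handy for human-friendly numbering. A short sketch (the task titles are made up for illustration):

```python
tasks = ["Homework", "Laundry", "Museum"]

# Start counting at 1 instead of the default 0.
numbered = [f"{position}. {title}" for position, title in enumerate(tasks, start=1)]
print(numbered)
```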


### `break` and `continue`

The statements `break` and `continue` work as in other programming languages:
+ `break` &mdash; step out of the current loop
+ `continue` &mdash; stop current iteration and go to the next one

In [8]:
items = [1, 2, 3, 4, 5]

for item in items:
    if item % 3 == 0:
        continue
    elif item == 4:
        break
    else:
        print("item:", item)

item: 1
item: 2


## Section 31 &mdash; The Python Standard Library

Python provides a large collection of utilities through its standard library (https://docs.python.org/3/library/index.html):

+ `math` &mdash; Math utils
+ `re` &mdash; regular expressions
+ `json` &mdash; JSON utils
+ `datetime` &mdash; date/time related utilities
+ `sqlite3` &mdash; SQLite utils
+ `os` &mdash; Operating System utils
+ `random` &mdash; random number generation
+ `statistics` &mdash; statistics utils
+ `urllib.request` &mdash; HTTP request utils (the popular `requests` package is third-party and must be installed separately)
+ `http` &mdash; HTTP server utils
+ `urllib` &mdash; URL management utils

## Section 32 &mdash; The PEP8 Python Style Guide

Python defines its conventions in the PEP8 style guide. PEP stands for *Python Enhancement Proposal*, and the whole document can be found at https://www.python.org/dev/peps/pep-0008/

A quick summary of the important topics addressed by PEP8:
+ Indent using spaces, not tabs.
+ Indent using 4 spaces.
+ Encode Python source code files using UTF-8.
+ Limit lines to 79 characters.
+ Write each statement on its own line.
+ Use *snake_case* for functions, variable names and file names.
+ Use CamelCase for class names.
+ Use lowercase for package names, do not separate individual words with underscores.
+ Write constants in UPPERCASE.
+ Add spaces around operators.
+ Do not use unnecessary whitespace.
+ Surround top-level function and class definitions with two blank lines.
+ Add a blank line between methods in a class.
+ Use blank lines sparingly within functions and methods to separate logical blocks of code.


## Section 33 &mdash; Variable scope rules in Python

When you declare a variable outside of any function in Python, the variable will be visible to any code after the declaration.

That is called a global variable.

A global variable can be read inside a function without any additional keyword, but assigning to that name inside the function creates a new local variable instead of modifying the global one.

In [18]:
name = "Margot"

def say_hello():
    print(f"Hello, {name}")  # using global

say_hello()
print(name)

Hello, Margot
Margot


When you define a variable with the same name within the function, the new local variable hides the global variable:

In [19]:
name = "Margot"

def say_hello():
    name = "Florence"
    print(f"Hello, {name}")  # using local

say_hello()
print(name)

Hello, Florence
Margot


### The `global` keyword

If you want to modify the value of a global variable within the scope of a function, you need to use the `global` keyword:

In [20]:
name = "Margot"

def say_hello():
    global name
    name = "Emma"
    print(f"Hello, {name}")  # using global

say_hello()
print(name)

Hello, Emma
Emma


### The `nonlocal` keyword

Similarly, you need to use the `nonlocal` keyword to modify a variable that belongs to an enclosing function's scope from within a nested function:

In [22]:
def say_hello():
    name = "Charlize"

    def prepare_message():
        nonlocal name
        name += " Theron"
        return name

    prepare_message()
    print(name)

say_hello()

Charlize Theron
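
A common use of `nonlocal` is keeping state in a closure. The sketch below uses a hypothetical `make_counter` function to show that each closure gets its own copy of the outer variable:

```python
def make_counter():
    count = 0  # lives in the enclosing (outer) function scope

    def increment():
        nonlocal count  # rebind the outer variable instead of creating a local
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()
print(first, second)
```

Without the `nonlocal` line, `count += 1` would raise an `UnboundLocalError`, because the assignment would make `count` local to `increment`.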


## Section 34 &mdash; Decorators

Decorators are a way to change, enhance, and alter the way a function or a method works.

They are defined with the symbol `@`, followed by the decorator name right before the function/method definition:

```python
@logtime
def greet_me():
  print("Hello to you!")
```

Behind the scenes, a decorator is nothing more than a function that takes a function as a parameter, wraps it in an inner function that performs the job associated with the decorator, and then returns that inner function:

```python
def logtime(func):
  def wrapper():
    # ...decorator logic here...
    val = func()
    return val
  return wrapper
```
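
To make the mechanics concrete, the sketch below applies a decorator by hand, without the `@` syntax. The `shout` decorator and the `greet` function are hypothetical names used only for illustration:

```python
def shout(func):
    def wrapper():
        # Decorator logic: post-process the wrapped function's result.
        return func().upper()
    return wrapper


def greet():
    return "hello to you!"


# This assignment is what `@shout` placed above `def greet()` would do.
greet = shout(greet)

print(greet())
```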

### Creating a logging decorator for functions

Create a decorator `@announce` that logs some information in the console whenever the decorated function is called.

In [4]:
def announce(fn):
    def wrapper():
        print(">>> about to call the function", fn.__name__)
        val = fn()
        print(f">>> the function {fn.__name__} has been called")
        return val
    return wrapper

@announce
def greet_me():
    print("Hello to you!")

greet_me()

>>> about to call the function greet_me
Hello to you!
>>> the function greet_me has been called


### Checking a function's performance with decorators

Finding the time it takes for a function to execute is a common concern during development.
This can be done with `time` as illustrated below:

In [11]:
import time
import random

def example():
    print("--- example execution starts")
    start_t = time.time()
    # simulate workload
    random_delay = random.randint(1, 5) * 0.1
    time.sleep(random_delay)
    end_t = time.time()
    print(f"*** example execution took: {end_t - start_t:.2f} sec")

example()
example()

--- example execution starts
*** example execution took: 0.40 sec
--- example execution starts
*** example execution took: 0.30 sec


This approach works well for a single function, but won't really scale in a large project with hundreds of functions to observe.

Additionally, the code introduced to observe the function's execution time is intrusive and draws the reader's focus away from what the function actually does.

#### Decorators to the rescue!

Decorators are functions that provide additional functionalities to the decorated functions. In this particular case, we'll use decorators to track the function's execution time.

A decorator typically follows this closure pattern:

```python
def decorator_name(func):
  def decorator_impl(*args, **kwargs):
    # before actions
    # ...
    result = func(*args, **kwargs)
    # after actions
    # ...
    return result
  return decorator_impl
```

Once defined, decorators can be applied to functions using:

```python
@decorator_name
def example(arg):
  ...
```

| NOTE: |
| :---- |
| The rationale for using `*args` and `**kwargs` is to make the wrapper compatible with all functions, regardless of their calling signatures. |

In [13]:
import random
import time


def log_time(func):
    def logger(*args, **kwargs):
        print(f"--- {func.__name__} execution starts")
        start_t = time.time()
        result = func(*args, **kwargs)
        end_t = time.time()
        print(f"*** {func.__name__} execution took: {end_t - start_t:.2f} sec")
        return result

    return logger

@log_time
def example():
    # simulate workload
    random_delay = random.randint(1, 5) * 0.1
    time.sleep(random_delay)

example()
example()

--- example execution starts
*** example execution took: 0.40 sec
--- example execution starts
*** example execution took: 0.20 sec


### Create a function monitor with decorators

Create a decorator `@monitor` which makes the function announce itself when invoked.

In [14]:
def monitor(func):
    def announce(*args, **kwargs):
        print(f">>> {func.__name__} invoked")
        result = func(*args, **kwargs)
        print(f"<<< {func.__name__} complete")
        return result

    return announce

@monitor
def greet_me(name):
    print(f"Hello, {name}!!!")

greet_me("Jason")

>>> greet_me invoked
Hello, Jason!!!
<<< greet_me complete


### Create a function monitor with custom arguments using decorators

Create a decorator `@monitor(msg)` which makes the function announce itself when invoked with the given message.

| HINT: |
| :---- |
| Creating a decorator that accepts parameters requires wrapping a decorator into another function that accepts the parameter. |

In [22]:
def monitor(msg):
    def monitor_decorator(func):
        def announce(*args, **kwargs):
            print(f"{msg} invoked")
            result = func(*args, **kwargs)
            print(f"{msg} completed")
            return result

        return announce

    return monitor_decorator

@monitor(">>> greet_me")
def greet_me(name):
    print(f"Hello, {name}!!!")

greet_me("Jason")

>>> greet_me invoked
Hello, Jason!!!
>>> greet_me completed


### Wrapping to carry over the decorated function's metadata

When decorating a function, its docstring and other metadata will be lost if you're not careful:

In [20]:
def monitor(func):
    def announce(*args, **kwargs):
        print(f">>> {func.__name__} invoked")
        result = func(*args, **kwargs)
        print(f"<<< {func.__name__} completed")
        return result

    return announce

def say_hi(person):
    """Greet someone"""
    print(f"Hello to {person}")

print(f"doc={say_hi.__doc__}, name={say_hi.__name__}")

@monitor
def say_hello(person):
    """Greet someone"""
    print(f"Hello to {person}")

print(f"doc={say_hello.__doc__}, name={say_hello.__name__}")

doc=Greet someone, name=say_hi
doc=None, name=announce


Note how the `__doc__` and `__name__` are lost when applying a decorator to a function.

The way to solve it is to decorate the closure within our custom decorator with `@functools.wraps(func)`:

In [21]:
import functools

def monitor(func):
    @functools.wraps(func)
    def announce(*args, **kwargs):
        print(f">>> {func.__name__} invoked")
        result = func(*args, **kwargs)
        print(f"<<< {func.__name__} completed")
        return result

    return announce

@monitor
def say_hello(person):
    """Greet someone"""
    print(f"Hello to {person}")

print(f"doc={say_hello.__doc__}, name={say_hello.__name__}")


doc=Greet someone, name=say_hello


### Wrapping to carry over the decorated function's metadata when a decorator accepts arguments

Create a decorator `@monitor(msg)` that uses `functools.wraps` so that the decorated function's metadata is not lost.

In [26]:
import functools

def monitor(msg):
    def monitor_decorator(func):
        @functools.wraps(func)
        def announce(*args, **kwargs):
            print(f"{msg} invoked")
            result = func(*args, **kwargs)
            print(f"{msg} completed")
            return result

        return announce

    return monitor_decorator

@monitor(">>> greet_me")
def greet_me(name):
    """Greets using the given name"""
    print(f"Hello, {name}!!!")

greet_me("Jason")

print(f"doc={greet_me.__doc__}; name={greet_me.__name__}")

>>> greet_me invoked
Hello, Jason!!!
>>> greet_me completed
doc=Greets using the given name; name=greet_me


## Section 35 &mdash; Introspection/Reflection

Python provides a set of functions you can use to obtain information about objects (functions, classes, and so on) at runtime.

First of all, you can use `type()` to get the type of an object:

In [5]:
def greet_me():
    print("Hello")

class Person:
    def __init__(self, name):
        self.name = name

num = 5
word = "foobar"

print(type(greet_me))
print(type(Person))
print(type(num))
print(type(word))

<class 'function'>
<class 'type'>
<class 'int'>
<class 'str'>


The `dir()` built-in function lists the names of all the methods and attributes of an object.

In [6]:
word = "foobar"

print(dir(word))

['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']


The [`inspect`](https://docs.python.org/3/library/inspect.html) module of the standard library provides more utility functions for introspection.
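
As a brief illustration, the sketch below uses a few `inspect` functions on a hypothetical `greet` function defined only for this example:

```python
import inspect


def greet(name, greeting="Hello"):
    """Return a greeting for the given name."""
    return f"{greeting}, {name}!"


signature = inspect.signature(greet)
print(signature)                    # parameters and their defaults
print(list(signature.parameters))   # parameter names, in order
print(inspect.getdoc(greet))        # cleaned-up docstring
print(inspect.isfunction(greet))    # whether it is a plain Python function
```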

### Using `type` to get the type of a variable

The following snippet shows the result of using `type` on the input collected from a user. Note that `input()` always returns a `str`, even when the user types a number:

In [40]:
age = input("Please type your age: ")
print(type(age))

<class 'str'>
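
Since the value collected from the user is a `str`, numeric input has to be converted explicitly. A minimal sketch, where `parse_age` is a hypothetical helper and the literal strings stand in for what a user might type:

```python
def parse_age(raw_text):
    """Convert text typed by a user into an int, or return None if invalid."""
    try:
        return int(raw_text.strip())
    except ValueError:
        return None


print(parse_age("42"), type(parse_age("42")))
print(parse_age("forty-two"))
```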


## Section 36 &mdash; More on Exceptions

In its simplest form, a try-except block looks like:

```python
try:
  # ... block of code that might raise exceptions ..
except <ExceptionType>:
  # ... handling ExceptionType errors ...
```

But it is common to find more complicated forms that include several exception types, an `else` block to do something when no exception is raised, and a `finally` block for code that must run whether an exception was raised or not.

```python
try:
  # ... block of code that might raise exceptions ..
except <ExceptionType_1>[as <err1>]:
  # ... handling ExceptionType_1 errors and assigning them to err1...
except <ExceptionType_2>[as <err2>]:
  # ... handling ExceptionType_2 errors and assigning them to err2...
...
except <ExceptionType_N>[as <errN>]:
  # ... handling ExceptionType_N errors and assigning them to errN...
else:
  # ... no exception was raised in the try block ...
finally:
  # ... cleaning up resources here ...
```
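
The order in which those clauses run can be checked with a small sketch. The `safe_divide` function and its `steps` list are hypothetical and exist only to record which clauses were executed:

```python
def safe_divide(dividend, divisor):
    steps = []  # records which clauses ran, for illustration only
    result = None
    try:
        result = dividend / divisor
    except ZeroDivisionError:
        steps.append("except")
    else:
        steps.append("else")
    finally:
        steps.append("finally")
    return result, steps


print(safe_divide(6, 3))
print(safe_divide(6, 0))
```

The `finally` clause runs in both cases, while `else` and `except` are mutually exclusive.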


In [8]:
def divide(dividend, divisor):
    try:
        result = dividend / divisor
    except TypeError:
        print("Arguments must be numbers!")
    except Exception as err:
        print("Could not perform division:", err)
    else:
        return result

divide(2, 0)

Could not perform division: division by zero


### Throwing/Raising Exceptions

Exceptions are thrown using the `raise` statement:

In [9]:
try:
    raise Exception("A general exception has been raised")
except Exception as ex:
    print(ex)

try:
    raise TypeError("It was the wrong type")
except Exception as ex:
    print(ex)

A general exception has been raised
It was the wrong type


### Creating your own Exceptions

You can easily create your own exception classes by extending the `Exception` class:

In [10]:
class MyCustomException(Exception):
    pass

try:
    raise MyCustomException("I own this exception")
except Exception as err:
    print(err)

I own this exception


### Handling multiple exceptions in a single except clause

It is possible to manage multiple exceptions in a single except clause:

In [None]:
pending_task = None

def process_task(text):
    title, urgency_str = text.split(",")
    try:
        urgency = int(urgency_str)
        pending_task.urgency = urgency
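    except (ValueError, AttributeError):
        # ValueError: urgency_str is not a number
        # AttributeError: pending_task is None, so it has no `urgency` attribute
        print("Oops!")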


### Using `else` in exception handling to minimize the try clause

It is a good practice to minimize the length of the `try` clause by including only the code that can raise an exception.

The code to run after the `try` body when no exception has been raised can be safely included in the `else` block:

In [7]:
from collections import namedtuple
import logging

logger = logging.getLogger(__name__)

Task = namedtuple("Task", "title, urgency")

def process_task_str(text):
    title, urgency_str = text.split(",")
    try:
        urgency = int(urgency_str)
    except ValueError as ex:
        logger.exception("Couldn't convert %s to int", urgency_str)
        return None
    else:
        task = Task(title, urgency)
        return task


assert process_task_str("Laundry,3") == Task("Laundry", 3)

assert process_task_str("Laundry,#3") is None


ERROR:__main__:Couldn't convert #3 to int
Traceback (most recent call last):
  File "/tmp/ipykernel_4500/324115465.py", line 11, in process_task_str
    urgency = int(urgency_str)
ValueError: invalid literal for int() with base 10: '#3'


### Supplying custom messages in Exceptions

Exceptions allow you to pass custom messages, so that the consumer of your code is better prepared to deal with the problem:



In [8]:
from collections import namedtuple

Task = namedtuple("Task", "title, urgency")

def process_task_str(text):
    title, urgency_str = text.split(",")
    try:
        urgency = int(urgency_str)
    except ValueError as ex:
        raise ValueError(f"Incorrect value for urgency: {urgency_str!r}") from ex
    else:
        task = Task(title, urgency)
        return task


try:
    process_task_str("Laundry,#3")
except ValueError as e:
    print(f"Oops: {e}")


Oops: Incorrect value for urgency: '#3'


### Exceptions Hierarchy and custom Exception classes

The following diagram details the most common exceptions in Python:

![Exception hierarchy](pics/exceptions-hierarchy.png)

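As a rule of thumb, custom exceptions shouldn't inherit directly from `BaseException`: that level is reserved for system-exiting exceptions such as `SystemExit` or `KeyboardInterrupt`, which a generic `except Exception` handler deliberately does not catch.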

Instead, when creating our own custom exceptions, we should inherit from `Exception`.

It is also recommended to use the existing class hierarchy instead of creating our custom classes, as the former will be familiar to Python developers. If necessary, you can supply your own custom message for clarity.

Consider the following `Task` definition.

In [9]:
class Task:
    def __init__(self, title):
        if isinstance(title, str):
            self.title = title
        else:
            raise TypeError("Please instantiate Task providing a string as its title")

try:
    task = Task(100)
except Exception as e:
    print(f"Ooops: {e}")

Ooops: Please instantiate Task providing a string as its title


Instead of reinventing the wheel, we reuse the existing `TypeError` exception but we supply a custom message so that the client of our code knows how to deal with the problem.

A custom exception hierarchy can start with a very simple base class:

In [None]:
class MyCustomError(Exception):
    pass



Then, we can create our custom hierarchy from it:

In [10]:
class MyCustomError(Exception):
    pass

class MyFileExtError(MyCustomError):
    def __init__(self, filepath):
        super().__init__()
        self.filepath = filepath

    def __str__(self):
        return f"The file {self.filepath!r} is not a valid CSV file"

try:
    raise MyFileExtError("log.txt")
except Exception as e:
    print(f"Ooops: {e} (type: {type(e)})")

Ooops: The file 'log.txt' is not a valid CSV file (type: <class '__main__.MyFileExtError'>)


Note that the custom exception class can take additional arguments for instantiation.

Note also that we override the `__str__` method so that the consumer of our code gets a good description of our custom exception.
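
One benefit of a shared base class is that callers can handle every error from your code with a single `except` clause. A minimal sketch, in which `AppError`, `ConfigError`, `DataError`, and `describe` are hypothetical names:

```python
class AppError(Exception):
    """Base class for all errors raised by this (hypothetical) application."""


class ConfigError(AppError):
    pass


class DataError(AppError):
    pass


def describe(error_cls):
    try:
        raise error_cls("something went wrong")
    except AppError as err:  # catches any subclass of AppError
        return f"{type(err).__name__}: {err}"


print(describe(ConfigError))
print(describe(DataError))
```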

## Section 37 &mdash; More on operator overloading

Python supports operator overloading:


In [11]:
class Amount:
    def __init__(self, amt, currency):
        self.amt = amt
        self.currency = currency

    def __gt__(self, other):
        return self.currency == other.currency and self.amt > other.amt

fifteen_bucks = Amount(15, "USD")
fifty_bucks = Amount(50, "USD")

if fifty_bucks > fifteen_bucks:
    print("I'm rich, biyatch")
else:
    print("not rich")

I'm rich, biyatch


To overload an operator, implement the corresponding special method in your class:

+ `__eq__()` &mdash; `==`
+ `__ne__()` &mdash; `!=`
+ `__lt__()` &mdash; `<`
+ `__le__()` &mdash; `<=`
+ `__gt__()` &mdash; `>`
+ `__ge__()` &mdash; `>=`
+ `__add__()` &mdash; `+`
+ `__sub__()` &mdash; `-`
+ `__mul__()` &mdash; `*`
+ `__truediv__()` &mdash; `/`
+ `__floordiv__()` &mdash; `//`
+ `__mod__()` &mdash; `%`
+ `__rshift__()` &mdash; `>>`
+ `__lshift__()` &mdash; `<<`
+ `__and__()` &mdash; `&`
+ `__or__()` &mdash; `|`
+ `__xor__()` &mdash; `^`
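
The sketch below overloads two of these operators, `+` and `==`, on a simplified `Amount` class. Returning `NotImplemented` for unsupported operands is one common convention; the rule that currencies must match is an assumption made for this example:

```python
class Amount:
    def __init__(self, amt, currency):
        self.amt = amt
        self.currency = currency

    def __eq__(self, other):
        if not isinstance(other, Amount):
            return NotImplemented  # let Python try the other operand
        return (self.amt, self.currency) == (other.amt, other.currency)

    def __add__(self, other):
        # Only amounts in the same currency can be added in this sketch.
        if not isinstance(other, Amount) or self.currency != other.currency:
            return NotImplemented
        return Amount(self.amt + other.amt, self.currency)

    def __repr__(self):
        return f"Amount({self.amt!r}, {self.currency!r})"


total = Amount(15, "USD") + Amount(50, "USD")
print(total)
print(total == Amount(65, "USD"))
```

Adding amounts in different currencies raises a `TypeError`, because neither operand knows how to handle the operation.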

## Section 38 &mdash; Collection functions

Python provides a group of useful built-in functions for collections:

+ `sum` &mdash; returns the sum of the elements of a collection
+ `max` &mdash; returns the max element of a collection
+ `min` &mdash; returns the min element of a collection
+ `sorted` &mdash; returns a new sorted list from the given collection, without mutating the original one.
+ `reversed` &mdash; returns an iterator over the elements in reverse order, without mutating the original collection.

In [13]:
import random

nums = [random.randint(0, 10) for _ in range(0, 10)]
print(nums)
print("sum(nums):", sum(nums))
print("max(nums):", max(nums))
print("min(nums):", min(nums))
print("sorted(nums):", sorted(nums))
print("reversed(nums):", list(reversed(nums)))

[6, 1, 4, 2, 10, 6, 8, 3, 9, 2]
sum(nums): 51
max(nums): 10
min(nums): 1
sorted(nums): [1, 2, 2, 3, 4, 6, 6, 8, 9, 10]
reversed(nums): [2, 9, 3, 8, 6, 10, 2, 4, 1, 6]
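
`max`, `min`, and `sorted` also accept a `key` function, which matters when the elements are not plain numbers. A small sketch with made-up task data:

```python
tasks = [("Homework", 5), ("Videogames", 2), ("Laundry", 3)]

# Compare tuples by their second element (the urgency).
most_urgent = max(tasks, key=lambda task: task[1])
by_urgency = sorted(tasks, key=lambda task: task[1], reverse=True)
total_urgency = sum(urgency for _, urgency in tasks)

print(most_urgent)
print(by_urgency)
print(total_urgency)
```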


## Section 39 &mdash; CLI/Terminal Arguments basics in Python

You can pass additional arguments to a Python program by simply typing:

```bash
python file-name.py arg1 arg2 arg3
```

The most straightforward way to retrieve those arguments in your program relies on the `sys` module:

In [14]:
import sys

print(len(sys.argv))
print(sys.argv)

11
['/home/ubuntu/miniconda3/lib/python3.10/site-packages/ipykernel_launcher.py', '--ip=127.0.0.1', '--stdin=9003', '--control=9001', '--hb=9000', '--Session.signature_scheme="hmac-sha256"', '--Session.key=b"3a1e019e-cffa-40cf-b88b-d120cf09e633"', '--shell=9002', '--transport="tcp"', '--iopub=9004', '--f=/home/ubuntu/.local/share/jupyter/runtime/kernel-v2-96499KV4Mk2a68wtR.json']


For more advanced capabilities, you can rely on the [argparse](https://docs.python.org/3/library/argparse.html) module.


| EXAMPLE: |
| :------- |
| See [greeter.py](exercises/section_38-cli/greeter.py) for an example on how to use `argparse` to build a CLI tool. |
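
As an indication of what such a tool might look like, here is a minimal `argparse` sketch. It is not the contents of `greeter.py`: the `name` argument and the `--shout` flag are invented for illustration, and the argument list is passed explicitly so that the snippet also runs inside a notebook:

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Greet someone.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("--shout", action="store_true", help="use upper case")
    return parser


def greet(argv):
    args = build_parser().parse_args(argv)
    message = f"Hello, {args.name}!"
    return message.upper() if args.shout else message


# In a script you would call greet(sys.argv[1:]) instead.
print(greet(["Jason"]))
print(greet(["Jason", "--shout"]))
```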

## Section 40 &mdash; Regular Expressions (regex/regexp)

Python's `str` class has useful methods such as `find` and `rfind`, for searching substrings, but many scenarios require going beyond these basic methods, especially when you're dealing with complex pattern matching.

For example, this snippet uses a regex to split a messy line that mixes different types of separators:

In [5]:
import re

messy_data = "field1,field2;field3;field4_field5"

regex = re.compile(r"[,_;]")
fields = regex.split(messy_data)

print(fields)

['field1', 'field2', 'field3', 'field4', 'field5']


Python's `re` module provides all the features related to regular expressions.

There are two ways to use this module: using OOP, and using the functional approach.

When using the OOP approach, you first create a `Pattern` object by compiling a string pattern, and then use the `Pattern` object to search for the occurrences that match the pattern.

In [11]:
import re

regex = re.compile("do")
print(regex.pattern)

print(regex.search("do homework"))
print(regex.findall("don't do that"))


do
<re.Match object; span=(0, 2), match='do'>
['do', 'do']


The functional style calls the functions in the module directly:

In [12]:
import re

print(re.search("do", "do homework"))
print(re.findall("do", "don't do that"))

<re.Match object; span=(0, 2), match='do'>
['do', 'do']


The difference between the two approaches is that the OOP approach lets you keep and reuse the compiled `Pattern` object. Thus, if your use case involves applying the same regex many times, the OOP approach will most probably work better for you.

### Using *raw* strings to create a regex pattern

To create regex patterns, we often need to use raw strings, as in `r"pattern"`.

Regex syntax relies heavily on backslashes: for example, `\d` matches any digit and `\w` matches any word character. This clashes with Python string literals, which also use backslashes to denote special characters such as `\t` for tab, `\n` for newline, or `\\` for a backslash.

Without raw strings, regex patterns would become even harder to read:

In [13]:
task_pattern = re.compile("\\\\task")  # regex \\task: a literal backslash followed by "task"
texts = ["\task", "\\task", "\\\task", "\\\\task"]

for text in texts:
    print(f"Match {text!r}: {task_pattern.match(text)}")

Match '\task': None
Match '\\task': <re.Match object; span=(0, 5), match='\\task'>
Match '\\\task': None
Match '\\\\task': None


This becomes much easier with raw strings, as we don't need to escape the backslash character, making it easier to read.

In [14]:
task_pattern = re.compile(r"\\task")
texts = ["\task", "\\task", "\\\task", "\\\\task"]

for text in texts:
    print(f"Match {text!r}: {task_pattern.match(text)}")

Match '\task': None
Match '\\task': <re.Match object; span=(0, 5), match='\\task'>
Match '\\\task': None
Match '\\\\task': None


### Crash course on Python regex syntax

Regular expressions constitute a separate language with its own unique syntax. We'll deal with each aspect in its own subsection:

#### Boundary anchors

Boundary anchors let you specify whether a string begins or ends with a particular pattern.

| Regex | Description |
| :---- | :---------- |
| ^hi   | starts with hi |
| task$ | ends with task |
| ^hi task$ | starts and ends with "hi task" |

In [18]:
import re

print(re.search(r"^hi", "hi, Python!"))
print(re.search(r"task$", "do the task"))
print(re.search(r"^hi task$", "hi task"))
print(re.search(r"^hi task$", "hi Python task"))

<re.Match object; span=(0, 2), match='hi'>
<re.Match object; span=(7, 11), match='task'>
<re.Match object; span=(0, 7), match='hi task'>
None


#### Quantifiers

Quantifiers are used when we need to search for a pattern appearing a certain number of times:

| Regex | Description |
| :---- | :---------- |
| hi?   | h followed by zero or one i |
| hi*   | h followed by zero or more i |
| hi+   | h followed by one or more i |
| hi{3} | h followed by iii |
| hi{1,3} | h followed by i, ii, or iii |
| hi{2,} | h followed by 2 or more i |

Quantifiers are greedy by default, meaning that the regex engine will try to match the longest sequence possible. You can switch to non-greedy (lazy) matching by adding the suffix `?` to the quantifier (as in `hi+?`).

In [23]:
import re

test_string = "h hi hii hiii hiiii"
test_patterns = [r"hi?", r"hi*", r"hi+", r"hi{3}", r"hi{2,3}", r"hi{2,}",
                 r"hi??", r"hi*?", r"hi+?", r"hi{2,}?"]

for pattern in test_patterns:
    print(f"{pattern:<9}==> {re.findall(pattern, test_string)}")


hi?      ==> ['h', 'hi', 'hi', 'hi', 'hi']
hi*      ==> ['h', 'hi', 'hii', 'hiii', 'hiiii']
hi+      ==> ['hi', 'hii', 'hiii', 'hiiii']
hi{3}    ==> ['hiii', 'hiii']
hi{2,3}  ==> ['hii', 'hiii', 'hiii']
hi{2,}   ==> ['hii', 'hiii', 'hiiii']
hi??     ==> ['h', 'h', 'h', 'h', 'h']
hi*?     ==> ['h', 'h', 'h', 'h', 'h']
hi+?     ==> ['hi', 'hi', 'hi', 'hi']
hi{2,}?  ==> ['hii', 'hii', 'hii']
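
The difference between greedy and lazy quantifiers becomes more visible with delimiters. A short sketch, using a made-up HTML-like string:

```python
import re

html = "<b>bold</b> and <i>italic</i>"

greedy = re.findall(r"<.+>", html)   # grabs as much as possible
lazy = re.findall(r"<.+?>", html)    # stops at the first closing bracket

print(greedy)
print(lazy)
```

The greedy pattern returns the whole string as a single match, while the lazy one returns each tag separately.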


#### Character classes and sets

The following table lists the most common character sets supported in Python (the table is not exhaustive):

| regex for the character set | Description |
| :-------------------------- | :---------- |
| \d | any decimal digit |
| \D | any character that is not a decimal digit |
| \s | any whitespace character including space, \t, \n, \r, \f, \v |
| \S | any character that isn't a whitespace |
| \w | any word character (alphanumeric plus underscores) |
| \W | any character that is not a word character |
| .  | any character except a newline |
| [] | a set of defined characters (as in [abc] for a, b, or c) |

When using `[]` you can include:
+ individual characters, as in `[abcxyz]` to match any of these six characters
+ ranges of characters, as in `[a-z]` or `[A-Z]`
+ combinations of different ranges, as in `[a-zA-Z0-9]` or `[a-dw-z]`

In [24]:
test_text = "#1$wm_ M\t"

patterns = [r"\d", r"\D", r"\s", r"\S", r"\w", r"\W", r".", r"[lmn]"]
for pattern in patterns:
    print(f"{pattern:<9}==> {re.findall(pattern, test_text)}")

\d       ==> ['1']
\D       ==> ['#', '$', 'w', 'm', '_', ' ', 'M', '\t']
\s       ==> [' ', '\t']
\S       ==> ['#', '1', '$', 'w', 'm', '_', 'M']
\w       ==> ['1', 'w', 'm', '_', 'M']
\W       ==> ['#', '$', ' ', '\t']
.        ==> ['#', '1', '$', 'w', 'm', '_', ' ', 'M', '\t']
[lmn]    ==> ['m']


#### Logical operators

Regular expressions also support logical operations.

| Regex | Description |
| :---- | :---------- |
| a\|b  | a or b |
| (abc) | abc as a group |
| [^a]  | any character other than a |

In [27]:
import re

print(re.findall(r"a|b", "a c d d b ab"))
print(re.findall(r"a|b", "c d d b"))

print(re.findall(r"(abc)", "ab bc abc ac"))

print(re.findall(r"[^a]", "abcde"))

['a', 'b', 'a', 'b']
['b']
['abc']
['b', 'c', 'd', 'e']


#### Creating Match objects

The `match` and `search` methods are often used for pattern searching:

+ `match` only looks for the pattern at the beginning of the string
+ `search` scans the string until it finds a match (if any)

In [29]:
import re

match = re.search(r"(\w\d)+", "xyza2b1c3dd")
print(match)

print("matched:", match.group())
print("span:", match.span())
print(f"start: {match.start()} & end: {match.end()}")

<re.Match object; span=(3, 9), match='a2b1c3'>
matched: a2b1c3
span: (3, 9)
start: 3 & end: 9


`search` and `match` return `None` when nothing matches, and a `Match` object is always truthy, so the result can be tested directly in a condition:

```python
import re

match = re.match("pattern", "string to match")
if match:
    # ...do something if found...
    print("match found:", match.group())
else:
    print("no matches found")
```

A match can have multiple groups:

In [33]:
import re

match = re.match(r"(\w+), (\w+)", "Homework, urgent; today")
print(match)

print(match.groups())

print(match.group(0))
print(match.group(1))
print(match.group(2))

<re.Match object; span=(0, 16), match='Homework, urgent'>
('Homework', 'urgent')
Homework, urgent
Homework
urgent


Note that by default, group 0 is the entire match.

The same can be applied to spans:

In [36]:
import re

match = re.match(r"(\w+), (\w+)", "Homework, urgent; today")
print(match)

print(match.span(0))
print(match.span(1))
print(match.span(2))

<re.Match object; span=(0, 16), match='Homework, urgent'>
(0, 16)
(0, 8)
(10, 16)


#### Common methods

##### `search`

`search` returns a `Match` if a match is found anywhere in the string.

In [2]:
import re

print(re.search(r"\d+", "ab12xy"))  # 12
print(re.search(r"\d+", "abxy"))    # None

<re.Match object; span=(2, 4), match='12'>
None


##### `match`

`match` returns a `Match` only if the pattern is found at the beginning of the string.

In [3]:
import re

print(re.match(r"\d+", "ab12xy"))   # None
print(re.match(r"\d+", "12abxy"))   # 12

None
<re.Match object; span=(0, 2), match='12'>


##### `findall`

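`findall` returns a list with all the non-overlapping matches of the pattern. When the pattern has no groups, each item is the matched string; with a single group, each item is that group's text; with multiple groups, each item is a tuple with one string per group.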

In [5]:
import re

print(re.findall(r"h[ie]\w", "hi hey hello"))  # hey hel
print(re.findall(r"(h|H)(i|e)", "Hey hello"))  # [('H', 'e'), ('h', 'e')]

['hey', 'hel']
[('H', 'e'), ('h', 'e')]


##### `finditer`

`finditer` returns an iterator that yields `Match` objects.

In [6]:
import re

print(re.finditer(r"(h|H)(i|e)", "hi Hey Hello"))

<callable_iterator object at 0x7f1d47937940>
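
Printing the iterator itself is rarely useful, because the `Match` objects only become available once the iterator is consumed. The following is a minimal sketch that reuses the pattern and string from the cell above (the variable name `matches` is only illustrative):

```python
import re

# Each item yielded by finditer is a Match object,
# so group(), groups() and span() are available on it
matches = list(re.finditer(r"(h|H)(i|e)", "hi Hey Hello"))

for m in matches:
    print(m.group(), m.groups(), m.span())
```

Compared with `findall`, this approach keeps the position of each match in addition to its text.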


##### `split`

`split` splits the string by the pattern.

In [8]:
import re

re.split(r"\d+", "a1b2c3d4e")  # ["a", "b", "c", "d", "e"]

['a', 'b', 'c', 'd', 'e']

##### `sub`

`sub` creates a new string by replacing every match of the pattern with the replacement string.

In [9]:
import re

re.sub(r"\D", "-", "123,456_789")   # 123-456-789

'123-456-789'

### Extracting data using regex (I)

Extract the text data from the text string `"abc_,abc__,abc,,__abc_,_abc"`, where abc stands for the needed data values.

In [10]:
import re

text_line = "abc_,abc__,abc,,__abc_,_abc"

data = re.split(r"[_,]+", text_line)
print(f"data: {data}")

data: ['abc', 'abc', 'abc', 'abc', 'abc']


### Extracting data using regex (II)

Suppose that we have the following text, in which valid records are mixed with random, invalid lines (mimicking what you'd get out of a database log after a crash).

```
101, Homework; Complete physics and math
some random nonsense
102, Laundry; Wash all the clothes today
54, random; record
103, Museum; All about Egypt
1234, random; record
Another random record
```

Valid records match the following pattern, that is, a numeric ID, a comma, a one-word title, a semicolon, and a free-text description:

`^(\d+), (\w+); (.+)$`

In [2]:
import re

text_data = """101, Homework; Complete physics and math
some random nonsense
102, Laundry; Wash all the clothes today
54, random; record
103, Museum; All about Egypt
1234, random; record
Another random record"""

regex_pattern = r"^(\d+), (\w+); (.+)$"
regex = re.compile(regex_pattern)
for line in text_data.split("\n"):
    match = regex.match(line)
    if match:
        print(f"{'Matched:':<12}{match.group()}")
    else:
        print(f"{'No match':<12}{line}")


Matched:    101, Homework; Complete physics and math
No match    some random nonsense
Matched:    102, Laundry; Wash all the clothes today
Matched:    54, random; record
Matched:    103, Museum; All about Egypt
Matched:    1234, random; record
No match    Another random record


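Every line with the `number, word; text` shape is matched (including the two `random; record` lines, since `\d+` accepts any number of digits), so the next step is to extract the individual fields from each matching line (task_id, task_title, task_description):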

In [4]:
import re

text_data = """101, Homework; Complete physics and math
some random nonsense
102, Laundry; Wash all the clothes today
54, random; record
103, Museum; All about Egypt
1234, random; record
Another random record"""

regex_pattern = r"^(\d+), (\w+); (.+)$"
regex = re.compile(regex_pattern)
tasks = []
for line in text_data.split("\n"):
    match = regex.match(line)
    if match:
        task_id = match.group(1)
        task_title = match.group(2)
        task_desc = match.group(3)
        print(f"task_id={task_id}; task_title={task_title}, task_description={task_desc}")
        tasks.append((task_id, task_title, task_desc))

print(tasks)

task_id=101; task_title=Homework, task_description=Complete physics and math
task_id=102; task_title=Laundry, task_description=Wash all the clothes today
task_id=54; task_title=random, task_description=record
task_id=103; task_title=Museum, task_description=All about Egypt
task_id=1234; task_title=random, task_description=record
[('101', 'Homework', 'Complete physics and math'), ('102', 'Laundry', 'Wash all the clothes today'), ('54', 'random', 'record'), ('103', 'Museum', 'All about Egypt'), ('1234', 'random', 'record')]
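
The pattern used above also accepts the `54, random; record` and `1234, random; record` lines, because `\d+` matches any number of digits. If we assume that a valid task ID always has exactly three digits (an assumption suggested by the sample data, not something the data format guarantees), a stricter pattern could look like the sketch below. The names `strict_regex` and `valid_ids` are hypothetical.

```python
import re

text_data = """101, Homework; Complete physics and math
some random nonsense
102, Laundry; Wash all the clothes today
54, random; record
103, Museum; All about Egypt
1234, random; record
Another random record"""

# Assumption: a valid task ID has exactly three digits
strict_regex = re.compile(r"^(\d{3}), (\w+); (.+)$")

valid_ids = []
for line in text_data.split("\n"):
    match = strict_regex.match(line)
    if match:
        valid_ids.append(match.group(1))

print(valid_ids)
```

With `\d{3}` followed by a literal comma, both the two-digit and the four-digit IDs are rejected.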


### Using named groups in regex

Group names carry more semantic information than numeric group indexes.

Python supports the syntax:

```python
(?P<group_name>pattern)
```

to give a name to a pattern group. Then, you can use `match.group("group_name")` instead of the group's index, which makes the code more readable.

The following snippet illustrates this using the previous scenario:

In [5]:
import re

text_data = """101, Homework; Complete physics and math
some random nonsense
102, Laundry; Wash all the clothes today
54, random; record
103, Museum; All about Egypt
1234, random; record
Another random record"""

regex_pattern = r"^(?P<task_id>\d+), (?P<task_title>\w+); (?P<task_desc>.+)$"
regex = re.compile(regex_pattern)
tasks = []
for line in text_data.split("\n"):
    match = regex.match(line)
    if match:
        tasks.append((match.group("task_id"), match.group("task_title"), match.group("task_desc")))

print(tasks)

[('101', 'Homework', 'Complete physics and math'), ('102', 'Laundry', 'Wash all the clothes today'), ('54', 'random', 'record'), ('103', 'Museum', 'All about Egypt'), ('1234', 'random', 'record')]


### Using `groupdict` with regular expressions with group names

While you can use `group` to retrieve the individual items from the named groups, you can also use the `groupdict` method, which returns a dictionary mapping each group name to its matched text:

In [6]:
import re

text_data = """101, Homework; Complete physics and math
some random nonsense
102, Laundry; Wash all the clothes today
54, random; record
103, Museum; All about Egypt
1234, random; record
Another random record"""

regex_pattern = r"^(?P<task_id>\d+), (?P<task_title>\w+); (?P<task_desc>.+)$"
regex = re.compile(regex_pattern)
tasks = []
for line in text_data.split("\n"):
    match = regex.match(line)
    if match:
        tasks.append(match.groupdict())

print(tasks)

[{'task_id': '101', 'task_title': 'Homework', 'task_desc': 'Complete physics and math'}, {'task_id': '102', 'task_title': 'Laundry', 'task_desc': 'Wash all the clothes today'}, {'task_id': '54', 'task_title': 'random', 'task_desc': 'record'}, {'task_id': '103', 'task_title': 'Museum', 'task_desc': 'All about Egypt'}, {'task_id': '1234', 'task_title': 'random', 'task_desc': 'record'}]


## Section 41 &mdash; More on Data Containers

### When you should choose lists over tuples

Lists are mutable, while tuples are immutable. Thus, when dealing with lists you can append new items, insert items in the middle, change items, and remove items. To support this mutability, lists provide methods such as `insert`, `append`, `extend`, and `remove`.

In [6]:
numbers = [0, 1, 2, 3]

numbers.insert(0, -1)
print(numbers)

numbers.append(4)
print(numbers)

numbers.extend([5, 6, 7])
print(numbers)

# remove by value
numbers.remove(-1)
print(numbers)

try:
    numbers.remove(8)
except ValueError as e:
    print(f"oops: {e}")

# remove by pos
del numbers[7]
print(numbers)



[-1, 0, 1, 2, 3]
[-1, 0, 1, 2, 3, 4]
[-1, 0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7]
oops: list.remove(x): x not in list
[0, 1, 2, 3, 4, 5, 6]


By contrast, tuples are immutable.

In [7]:
numbers = (1, 2, 3)

try:
    numbers[0] = 2
except Exception as e:
    print(f"Oops: tuples are immutable: {e}")

Oops: tuples are immutable: 'tuple' object does not support item assignment


Note that immutability applies only to the tuple itself: when a tuple holds references to mutable objects, those objects can still be changed:

In [8]:
my_tuple = ([1, 2, 3], ["a", "b"])
my_tuple[0].append(4)

print(my_tuple)

([1, 2, 3, 4], ['a', 'b'])


### Sorting lists using a custom function

Lists have a built-in `sort` method designed to sort them in place:

In [14]:
numbers = [12, 4, 1, 3, 7, 5, 9, 8]
numbers.sort()
print(numbers)

names = ["Jennifer", "Idris", "Jason", "Florence", "Kenneth"]
names.sort()
print(names)

names.sort(reverse=True)
print(names)

# mixed list: fails
mixed = [3, 1, 2, "John", ["c", "a"], ["a", "b"]]

try:
    mixed.sort()
except Exception as e:
    print(f"oops: {e}")


[1, 3, 4, 5, 7, 8, 9, 12]
['Florence', 'Idris', 'Jason', 'Jennifer', 'Kenneth']
['Kenneth', 'Jennifer', 'Jason', 'Idris', 'Florence']
oops: '<' not supported between instances of 'str' and 'int'


For lists with mixed types, you can provide a function (here the built-in `str`) as the sorting key, so that the items are compared by their string representation.

In [16]:
mixed = [3, 1, 2, "John", ["c", "a"], ["a", "b"]]
mixed.sort(key=str)

print(mixed)

[1, 2, 3, 'John', ['a', 'b'], ['c', 'a']]


The same approach can be used to sort more complicated items, such as dictionaries, by providing a custom function as the key:

In [17]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

def using_urgency_level(task):
    return task["urgency"]

tasks.sort(key=using_urgency_level)
print(tasks)

[{'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}, {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3}, {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5}]


You can also use a lambda function for this use case:

In [19]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

tasks.sort(key=lambda x: x["urgency"], reverse=True)
print(tasks)

[{'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5}, {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3}, {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}]


Python also provides the built-in function `sorted`, which takes any iterable and returns a new sorted list.

That function is appropriate when you need to preserve the original list, or when you need to sort other containers, such as tuples, which are immutable.
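
As a minimal sketch of the difference with `list.sort` (the sample values are made up for illustration):

```python
numbers = (12, 4, 1, 3)          # a tuple cannot be sorted in place

ascending = sorted(numbers)      # always returns a new list
descending = sorted(numbers, reverse=True)
by_length = sorted(["Jennifer", "Idris", "Jason"], key=len)

print(ascending)
print(descending)
print(by_length)
print(numbers)                   # the original tuple is unchanged
```

`sorted` accepts the same `key` and `reverse` arguments as `list.sort`.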

### Alternative data models in Python

There are several ways to represent the same domain model entity in Python:
+ lists
+ tuples
+ dictionaries
+ classes

Consider the following example representing a ToDo task with a title, a description, and an urgency field.

In [None]:
task_list = ["Laundry", "Wash Clothes", 3]

task_tuple = ("Laundry", "Wash Clothes", 3)

task_dict = {"title": "Laundry", "desc": "Wash Clothes", "urgency": 3}

class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

task_class = Task("Laundry", "Wash Clothes", 3)

All these approaches have their strengths and drawbacks:
+ Lists:
  + mutable, so they might not work well for scenarios in which the data should not be changed.
  + best suited to homogeneous data, while a task mixes strings and numbers.
  + do not contain metadata, so you need unpacking/indexing to access the individual elements.
+ Tuples:
  + immutable
  + do not contain metadata, so you need unpacking/indexing to access the individual elements.
+ Classes:
  + more verbose approach
  + more memory usage

#### Named Tuples (`namedtuple`)

One good solution is to use *named tuples*. A named tuple lets you define tuples whose elements have names associated with them:

In [1]:
from collections import namedtuple

Task = namedtuple('Task', 'title desc urgency')

task_nt = Task("Laundry", "Wash Clothes", 3)

assert task_nt.title == "Laundry"
assert task_nt.desc == "Wash Clothes"
assert task_nt.urgency == 3

The named tuple attributes can be defined as a single string with the attribute names separated by spaces or commas, or as a list of strings:

In [None]:
from collections import namedtuple

# using comma as separator
Task = namedtuple('Task', 'title, desc, urgency')

# using list
Task = namedtuple('Task', ['title', 'desc', 'urgency'])

Now we can load the following data into a list of `Task` named tuples:

```
Laundry,Wash clothes,3
Homework,Physics + Math,5
Museum,Egyptian things,2
```

In [3]:
from collections import namedtuple

Task = namedtuple("Task", ["title", "desc", "urgency"])

task_data = '''Laundry,Wash clothes,3
Homework,Physics + Math,5
Museum,Egyptian things,2'''

tasks = []
for task_record in task_data.split("\n"):
    task_title, task_desc, task_urgency = task_record.split(",")
    tasks.append(Task(task_title, task_desc, task_urgency))

print(tasks)



[Task(title='Laundry', desc='Wash clothes', urgency='3'), Task(title='Homework', desc='Physics + Math', urgency='5'), Task(title='Museum', desc='Egyptian things', urgency='2')]
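
Note that the urgency values above are strings, because `split` always returns strings. The following sketch converts the urgency to an integer and also shows two helper methods that named tuples provide, `_replace` and `_asdict`. The variable names are illustrative only.

```python
from collections import namedtuple

Task = namedtuple("Task", ["title", "desc", "urgency"])

record = "Laundry,Wash clothes,3"
title, desc, urgency = record.split(",")
task = Task(title, desc, int(urgency))   # convert the urgency to a number

# Named tuples are immutable: _replace returns a modified copy
urgent_task = task._replace(urgency=5)

# _asdict converts the named tuple into a dictionary
task_as_dict = task._asdict()

print(task)
print(urgent_task)
print(task_as_dict)
```

The original `task` is left untouched by `_replace`, which is consistent with tuples being immutable.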


#### Using dictionaries

Dictionaries are also a good option to hold the data model, as their keys act as metadata describing each value:

In [5]:
task_data = '''Laundry,Wash clothes,3
Homework,Physics + Math,5
Museum,Egyptian things,2'''

tasks = []
for task_record in task_data.split("\n"):
    task_title, task_desc, task_urgency = task_record.split(",")
    task = {"title": task_title, "desc": task_desc, "urgency": task_urgency}
    tasks.append(task)

print(tasks)


[{'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': '3'}, {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': '5'}, {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': '2'}]


Dictionaries allow you to get *dynamic views* through the `keys`, `values` and `items` methods.

In [7]:
task = {"title": "Laundry", "desc": "Wash clothes", "urgency": 3}
print("keys:", task.keys())
print("values:", task.values())

keys: dict_keys(['title', 'desc', 'urgency'])
values: dict_values(['Laundry', 'Wash clothes', 3])


For example, we can create a dictionary that maps each task title to its urgency out of a list of task dictionaries, and then use `keys`, `values`, and `items` to inspect it:

In [10]:
task_data = '''Laundry,Wash clothes,3
Homework,Physics + Math,5
Museum,Egyptian things,2'''

tasks = []
for task_record in task_data.split("\n"):
    task_title, task_desc, task_urgency = task_record.split(",")
    task = {"title": task_title, "desc": task_desc, "urgency": task_urgency}
    tasks.append(task)

urgencies = dict()
for task in tasks:
    urgencies[task["title"]] = task["urgency"]

print("keys:", urgencies.keys())
print("values:", urgencies.values())
print("items:", urgencies.items())


keys: dict_keys(['Laundry', 'Homework', 'Museum'])
values: dict_values(['3', '5', '2'])
items: dict_items([('Laundry', '3'), ('Homework', '5'), ('Museum', '2')])


Note that `keys`, `values`, and `items` methods provide dynamic views over the dictionary items &mdash; that is, those are automatically updated when the data in the dictionary is updated.
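
A minimal sketch of what "dynamic" means here, comparing a view with a copy taken before the dictionary is updated (the variable names are illustrative):

```python
urgencies = {"Laundry": 3, "Homework": 5}

keys_view = urgencies.keys()          # a view, not a copy
keys_snapshot = list(urgencies)       # a copy taken now

urgencies["Museum"] = 2               # update the dictionary afterwards

print(keys_view)       # reflects the new key
print(keys_snapshot)   # does not
```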

Trying to access a dictionary item that does not exist raises a `KeyError` exception:

In [6]:
urgencies = {"Laundry": 3, "Homework": 5, "Museum": 2}

try:
    urgencies["Gardening"]
except KeyError as e:
    print(f"Oops: {e}")

Oops: 'Gardening'


One way to prevent that is using an `if`/`else` check:

In [7]:
urgencies = {"Laundry": 3, "Homework": 5, "Museum": 2}

if "Gardening" in urgencies:
    print(urgencies["Gardening"])
else:
    print("Gardening is not in the urgencies dictionary")

Gardening is not in the urgencies dictionary


This approach is not considered very Pythonic. Instead, Python provides the `get` method, which doesn't raise an exception and lets you specify a default value for when the key doesn't exist:

In [8]:
urgencies = {"Laundry": 3, "Homework": 5, "Museum": 2}

print(urgencies.get("Homework"))            # existing key
print(urgencies.get("Gardening"))           # non-existing key, without default value
print(urgencies.get("Gardening", "N/A"))    # non-existing key, with default value

5
None
N/A


##### Hello, kwargs!

While we will learn about kwargs in detail later, we introduce it here.

`kwargs` is a naming convention for the parameter (declared as `**kwargs`) that collects a variable number of keyword arguments into a dictionary.

In [None]:
def do_something(arg0, arg1, **kwargs):
    optional_kw_arg_0 = kwargs.get("kwarg0")
    optional_kw_arg_1 = kwargs.get("kwarg1")
    optional_kw_arg_2 = kwargs.get("kwarg2")
    ...


do_something(1, "a")
do_something(1, "a", kwarg0=5)
do_something(1, "a", kwarg0=5, kwarg2="red")
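
To make the link with dictionaries explicit, here is a small sketch showing that `kwargs` is a regular dictionary inside the function, so `get` works on it as seen above. The function `describe_task` and its arguments are hypothetical and exist only for this illustration.

```python
def describe_task(title, **kwargs):
    # kwargs is a plain dict holding every extra keyword argument
    urgency = kwargs.get("urgency", "N/A")
    return f"{title} (urgency={urgency}, extras={sorted(kwargs)})"

print(describe_task("Laundry"))
print(describe_task("Laundry", urgency=3, owner="Jason"))
```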


##### Hello, setdefault!

The `setdefault` method on dictionary objects returns the value associated with `key` when the key already exists in the dictionary. Otherwise, it behaves like `dict[key] = default`: it inserts the key with the given default (`None` if omitted) and returns that value.

The name is confusing and the method mixes the set and get behaviors, so it's better not to rely on it; it has been included here for completeness.

In [13]:
urgencies = {"Laundry": 3, "Homework": 5, "Museum": 2}

# Behavior when key exists and no default value is given
print(urgencies.setdefault("Homework"))  # it is like a get
print(urgencies)

# Behavior when key exists and a default value is given
print(urgencies.setdefault("Homework", "N/A"))  # it is like a get, default is ignored
print(urgencies)

# Behavior when key does not exist and no default value is given
print(urgencies.setdefault("Gardening"))  # it is like a set with no value
print(urgencies)

# Behavior when key does not exist and a default value is given
print(urgencies.setdefault("Running", "N/A"))  # it is like a set, default value is used
print(urgencies)

5
{'Laundry': 3, 'Homework': 5, 'Museum': 2}
5
{'Laundry': 3, 'Homework': 5, 'Museum': 2}
None
{'Laundry': 3, 'Homework': 5, 'Museum': 2, 'Gardening': None}
N/A
{'Laundry': 3, 'Homework': 5, 'Museum': 2, 'Gardening': None, 'Running': 'N/A'}


### Unhashable types for Dictionaries and Sets

Lists and tuples have no restrictions regarding the data types that can be stored in them.

Dictionaries are also useful as they store key-value pairs, but for dictionaries and sets there are some types that you cannot use.

Specifically, unhashable objects cannot be used as dictionary keys or as set elements.

In [4]:
try:
    failed_dict = {[0, 2]: False}
except Exception as e:
    print(f"Oops: {e}")

try:
    failed_set = {{"a": 0}}
except Exception as e:
    print(f"Oops: {e}")

Oops: unhashable type: 'list'
Oops: unhashable type: 'dict'


This happens because both sets and dicts share the same underlying storage mechanism: a hash table.

In the example below, we use `timeit` to check the performance of lookups on sets vs. lists. Note how the lookup time on sets stays roughly constant as the number of elements grows, while the lookup time on lists increases.

In [7]:
from timeit import timeit

for count in [10, 100, 1000, 10000, 100000]:
    setup_str = f"""from random import randint; n = {count};
numbers_set = set(range(n));
numbers_list = list(range(n))"""
    stmt_set = "randint(0, n-1) in numbers_set"
    stmt_list = "randint(0, n-1) in numbers_list"
    t_set = timeit(stmt_set, setup=setup_str, number=10000)
    t_list = timeit(stmt_list, setup=setup_str, number=10000)
    print(f"{count: >6}: {t_set:e} vs. {t_list:e}")

    10: 3.894067e-03 vs. 3.700906e-03
   100: 4.265092e-03 vs. 6.091089e-03
  1000: 3.789745e-03 vs. 2.172317e-02
 10000: 4.160614e-03 vs. 2.073374e-01
100000: 7.591529e-03 vs. 1.756933e+00


Python comes with a built-in `hash` function (the value obtained for a string varies between runs):

In [9]:
print(hash("hello world!"))
print(hash(100))

try:
    hash([1, 2, 3])
except Exception as e:
    print(f"Oops: {e}")

-834751430566916460
100
Oops: unhashable type: 'list'


Lists and dictionaries are not hashable because they're mutable. The hash of an object must stay the same for as long as the object lives, and a hash based on contents that can change would make that requirement impossible to fulfill.

The `Hashable` abstract base class can be used with `isinstance` to check whether a given object is hashable:

In [14]:
from collections.abc import Hashable

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

def check_hashability():
    print(f"{'Data Type': <25}   Hashable")
    items = [
        {"a": 1},
        [1],
        {1},
        1,
        1.2,
        "test",
        (1,2),
        True,
        None,
        Person("Jason", 54)
    ]
    for item in items:
        print(f"{str(type(item)): <25} | {isinstance(item, Hashable)}")


check_hashability()

Data Type                   Hashable
<class 'dict'>            | False
<class 'list'>            | False
<class 'set'>             | False
<class 'int'>             | True
<class 'float'>           | True
<class 'str'>             | True
<class 'tuple'>           | True
<class 'bool'>            | True
<class 'NoneType'>        | True
<class '__main__.Person'> | True


| NOTE: |
| :---- |
| The `collections.abc` module defines a series of Abstract Base Classes (ABCs) for containers. |
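
One caveat worth keeping in mind: the `isinstance` check only looks at the type, so a tuple that holds an unhashable element still reports `True`. The sketch below contrasts it with actually calling `hash`; the helper `can_be_hashed` is a hypothetical name introduced for this example.

```python
from collections.abc import Hashable


def can_be_hashed(obj):
    """Return True only if calling hash() on obj succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


nested = (1, [2, 3])   # a tuple holding a mutable list

print(isinstance(nested, Hashable))  # the type check passes
print(can_be_hashed(nested))         # but hashing the contents fails
print(can_be_hashed((1, 2)))
```

This follows the EAFP style: trying the operation is the most reliable way to know whether it is supported.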

Note that strings are hashable because they are not mutable:

In [16]:
msg = "Hello, world!"

try:
    msg[0] = 'h'
except Exception as e:
    print(f"Oops: {e}")

Oops: 'str' object does not support item assignment


If you need to change a string, use the `replace` method, which returns a new string:

In [17]:
msg = "Hello, world!"
msg_new = msg.replace("H", "h")

print(f"{msg} (id={id(msg)})")
print(f"{msg_new} (id={id(msg_new)})")

Hello, world! (id=140138983659760)
hello, world! (id=140138983661744)


### Checking if all the elements of a list are contained in another list

You can leverage sets to solve specific use cases, such as checking whether all the elements of a list are contained in another list.

Consider a list holding the vetted stocks and a couple of lists, each with a client's selection of stocks.

We're interested in knowing whether all the elements of each client list are contained in the vetted list:

In [19]:
good_stocks = ["AAPL", "GOOG", "AMZN", "NVDA"]
client0 = ["GOOG", "AMZN"]
client1 = ["AMZN", "SNAP"]

good_stocks_set = set(good_stocks)

contained0 = good_stocks_set.issuperset(client0)
print(f"Are all client0 stocks contained in good stock? {contained0}")

contained1 = good_stocks_set.issuperset(client1)
print(f"Are all client1 stocks contained in good stock? {contained1}")

Are all client0 stocks contained in good stock? True
Are all client1 stocks contained in good stock? False


By transforming the original list into a set using:

```python
good_stocks_set = set(good_stocks)
```

and querying the newly created set with `issuperset`, we can answer the question.

### Checking whether a list contains any element of another list

Continuing with the previous example, let's now assume that we need to check whether a client's list of stocks contains any of the recommended stocks.

In [24]:
good_stocks = ["AAPL", "GOOG", "AMZN", "NVDA"]
client0 = ["GOOG", "AMZN"]
client1 = ["AMZN", "SNAP"]

good_stocks_set = set(good_stocks)

contain_any0 = bool(good_stocks_set & set(client0))
contain_any1 = bool(good_stocks_set & set(client1))
print(f"Does client 0 contain any of the recommended stocks? {contain_any0}")
print(f"Does client 1 contain any of the recommended stocks? {contain_any1}")

Does client 0 contain any of the recommended stocks? True
Does client 1 contain any of the recommended stocks? True


Using the intersection operator `&` and converting the resulting set to a Boolean with `bool` (an empty set evaluates to `False`), we get the correct response.

The following diagram illustrates the different operations we can perform on sets and the corresponding shorthand Python operator:

![Operations on sets](./pics/operations_on_sets.png)
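
As a complement to the diagram, the sketch below runs the standard set operators on the stock data. It is only an illustration and may not cover every operation shown in the picture.

```python
good_stocks = {"AAPL", "GOOG", "AMZN", "NVDA"}
client1 = {"AMZN", "SNAP"}

print(sorted(good_stocks | client1))   # union
print(sorted(good_stocks & client1))   # intersection
print(sorted(client1 - good_stocks))   # difference
print(sorted(good_stocks ^ client1))   # symmetric difference

# isdisjoint answers the "any element in common?" question directly
print(not good_stocks.isdisjoint(client1))
```

`isdisjoint` is an alternative to `bool(a & b)` that avoids building the intermediate intersection set.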

### Using deques for FIFO operations

In certain scenarios you will need to deal with queues (FIFO, first-in first-out data structures).

While you can use regular lists to do so, using `deque` (pronounced "deck") from the `collections` module is far more efficient.

A deque is a double-ended queue (it supports insertion and removal from both ends).

The following compares the cost of removing the first element with `pop(0)` on regular lists and with `popleft` on deques:

In [11]:
from collections import deque
from timeit import timeit

def time_fifo_testing(n):
    integer_l = list(range(n))
    integer_d = deque(range(n))
    t_l = timeit(lambda : integer_l.pop(0), number=n)
    t_d = timeit(lambda : integer_d.popleft(), number=n)
    return f"{n: >9} list: {t_l:.6e} | deque: {t_d:.6e}"

numbers = (100, 1000, 10000, 100000)
for number in numbers:
    print(time_fifo_testing(number))

      100 list: 7.147988e-06 | deque: 4.379006e-06
     1000 list: 7.647490e-04 | deque: 4.165999e-05
    10000 list: 9.636527e-02 | deque: 4.328600e-04
   100000 list: 8.621189e+00 | deque: 5.058431e-03
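
Beyond the timing comparison, a short sketch of the basic `deque` operations may help (the task names are made up for illustration):

```python
from collections import deque

queue = deque(["task0", "task1"])

queue.append("task2")          # enqueue at the right end
first = queue.popleft()        # dequeue from the left end (FIFO)

queue.appendleft("urgent")     # both ends are supported
last = queue.pop()

print(first, last, queue)
```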


## Section 42 &mdash; More on Slicing

To retrieve a subsequence of a `list` we can use a technique known as slicing.

In its simplest form, you write `list[start:end]`, which includes the element at the start index and excludes the element at the end index:

In [25]:
fruits = ["apple", "orange", "banana", "strawberry"]

assert fruits[1:3] == ["orange", "banana"]

The most notable variations of the simplest form include:
+ ignoring the start or the end index
+ applying stride to the slicing (to retrieve evenly spaced items)

In [33]:
fruits = ["apple", "orange", "banana", "strawberry"]

assert fruits[:3] == ["apple", "orange", "banana"]
assert fruits[1:] == ["orange", "banana", "strawberry"]
assert fruits[:] == fruits

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
assert numbers[2:5:2] == [3, 5]
assert numbers[1::2] == [2, 4, 6, 8, 10]
assert numbers[::-1] == [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
assert list(reversed(numbers)) == [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

Using negative strides as in `numbers[::-1]` is not recommended as it reduces the readability of the code. It is more readable to use the `reversed` function.

Using the slicing syntax on a list is equivalent to using the `slice` constructor to create a slice object:

In [40]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(numbers[2:5:2])

even_slice = slice(2, 5, 2)
print(numbers[even_slice])


[3, 5]
[3, 5]


Named slices like the ones above are helpful when we need to make sense of complicated data found in lists.

Consider the following example in which we need to create a list of tuples extracting the following information from a text string:

0....5..............20..........................48......
1001 Laundry        Wash all clothes            3
1002 Museum Visit   Go to the Egypt exhibit     4
1003 Do Homework    Physics and math            5
1004 Go to Gym      Work out for 1 hour         2


In [48]:
tasks = """
0....5..............20..........................48......
1001 Laundry        Wash all clothes            3
1002 Museum Visit   Go to the Egypt exhibit     4
1003 Do Homework    Physics and math            5
1004 Go to Gym      Work out for 1 hour         2
"""

tasks_lines = tasks.split("\n")

data_lines_slice = slice(2,-1)
id_field_slice = slice(0, 5)
task_field_slice = slice(5, 20)
desc_field_slice = slice(20, 48)
urgency_field_slice = slice(48, 50)

tasks = []
for data_line in tasks_lines[data_lines_slice]:
    task_id = data_line[id_field_slice].strip()  # avoid shadowing the built-in id
    task = data_line[task_field_slice].strip()
    desc = data_line[desc_field_slice].strip()
    urgency = data_line[urgency_field_slice].strip()
    task_tuple = (task_id, task, desc, urgency)
    tasks.append(task_tuple)

print(tasks)





[('1001', 'Laundry', 'Wash all clothes', '3'), ('1002', 'Museum Visit', 'Go to the Egypt exhibit', '4'), ('1003', 'Do Homework', 'Physics and math', '5'), ('1004', 'Go to Gym', 'Work out for 1 hour', '2')]


### Slice surgery

Slice surgery is the technique in which you assign to a slice of a list in order to replace, extend, shrink, or remove portions of the original list.

This only works with mutable sequences (such as lists).

In [51]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8]


# mutate
numbers[:3] = [10, 11, 12]
assert numbers == [10, 11, 12, 3, 4, 5, 6, 7, 8]

# extend
numbers[3:] = [13, 14, 15, 16, 17, 18, 19, 20]
assert numbers == [10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

# shrink
numbers[:5] = [0, 1]
assert numbers == [0, 1, 15, 16, 17, 18, 19, 20]

Note that the subsequence doesn't have to be contiguous (i.e., you can use stride):

In [53]:
numbers = [0, 1, 15, 16, 17, 18, 19, 20]

# non-contiguous mutation
numbers[::2] = [0, 0, 0, 0]
assert numbers == [0, 1, 0, 16, 0, 18, 0, 20]

To remove elements you can either:
+ use the `del` statement
+ assign an empty list to the slice

In [63]:
numbers = [0, 1, 0, 16, 0, 18, 0, 20]

# removal using `del`
del numbers[:4]
assert numbers == [0, 18, 0, 20]

# removal by assigning empty list
numbers[-2:] = []
assert numbers == [0, 18]
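
Slice assignment can also insert elements without replacing any. The following sketch relies on the fact that an empty slice such as `numbers[2:2]` selects no elements (the values are made up for illustration):

```python
numbers = [0, 1, 4, 5]

# An empty slice selects no elements, so the assignment only inserts
numbers[2:2] = [2, 3]
assert numbers == [0, 1, 2, 3, 4, 5]

# Assigning at the end of the list behaves like extend
numbers[len(numbers):] = [6, 7]
assert numbers == [0, 1, 2, 3, 4, 5, 6, 7]
```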

### Using positive and negative index when slicing

Indexes in slices tend to create some confusion.

In this section we will consider the following list of data representing the monthly revenue of a company, from January to December:

```python
revenue_by_month = [95, 100, 80, 93, 92, 110, 102, 88, 96, 98, 115, 120]
```

Let's see how to obtain:
+ revenue in January
+ revenues in Q2
+ revenue in Nov
+ revenues in Q4

In [70]:
revenue_by_month = [95, 100, 80, 93, 92, 110, 102, 88, 96, 98, 115, 120]

# revenue in Jan
revenue_jan = revenue_by_month[0]
assert revenue_jan == 95

# revenue in Q2
revenues_q2 = revenue_by_month[3:6]
assert revenues_q2 == [93, 92, 110]

# revenue in Nov
revenue_nov = revenue_by_month[-2]
assert revenue_nov == 115

revenue_nov = revenue_by_month[10]
assert revenue_nov == 115

# revenues in Q4
revenues_q4 = revenue_by_month[9:]
assert revenues_q4 == [98, 115, 120]

revenues_q4 = revenue_by_month[-3:]
assert revenues_q4 == [98, 115, 120]


Note that negative indexing starts with `-1` to identify the last element of the list. That is, `revenue_by_month[-1]` is the revenue for December.

This can be applied to individual indices as seen above, or to slices:

```python
revenues_q4 = revenue_by_month[-3:]
```

It's also very common to mix positive and negative indices:

In [71]:
# revenues excluding Jan and Dec
revenue_by_month = [95, 100, 80, 93, 92, 110, 102, 88, 96, 98, 115, 120]

mid_revenues = revenue_by_month[1:-1]
assert mid_revenues == [100, 80, 93, 92, 110, 102, 88, 96, 98, 115]


## Section 43 &mdash; Finding items in a sequence

In Python, to check an item's presence in a sequence you use the `in` operator:

```python
item in sequence # returns True if present, False otherwise
```


In [73]:
assert (8 in [1, 2, 3, 4, 5]) == False

assert ("cool" in "Python is cool!") == True

assert (404 in (404, "Page Not Found")) == True

To locate an item you use the `index` method which accepts the object to be searched:

In [74]:
assert [1, 2, 3, 4, 5].index(4) == 3  # 4 is in index 3

assert (404, "Page Not Found").index(404) == 0

assert "Python is cool!".index("cool") == 10

The `index` method raises a `ValueError` when the item is not found:

In [76]:
try:
    [1, 2, 3, 4, 5].index(8)
except ValueError as e:
    print(f"Oops: {e}")

Oops: 8 is not in list


Python code typically follows the EAFP principle (Easier to Ask for Forgiveness than Permission), rather than the LBYL principle (Look Before You Leap) that is common in other programming languages.
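
The two styles can be sketched side by side for the `index` lookup shown above (the variable names are illustrative):

```python
numbers = [1, 2, 3, 4, 5]

# LBYL: check first, then act (the list is scanned twice)
if 8 in numbers:
    position_lbyl = numbers.index(8)
else:
    position_lbyl = None

# EAFP: just try it and handle the failure
try:
    position_eafp = numbers.index(8)
except ValueError:
    position_eafp = None

print(position_lbyl, position_eafp)
```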

When dealing with substrings, you can use the `find` and `rfind` methods, which return `-1` when the substring is not found:

In [78]:
assert "Python is cool!".index("cool") == 10
assert "Python is cool!".find("cool") == 10
assert "Python is cool!".find("Go") == -1
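
`rfind` works like `find` but searches from the right, which matters when the substring appears more than once. A small sketch with a made-up string:

```python
text = "cool, Python is cool!"

assert text.find("cool") == 0      # first occurrence, searching from the left
assert text.rfind("cool") == 16    # last occurrence, searching from the right
assert text.rfind("Go") == -1      # same -1 convention when nothing is found
```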


### Finding an instance of a custom class in a list

Consider the following `Task` class and a few instances held in a `tasks` list.

```python
class Task:
   def __init__(self, title, urgency):
       self.title = title
       self.urgency = urgency


tasks = [
   Task("Laundry", 3),
   Task("Museum", 4),
   Task("Homework", 5),
   Task("Ticket", 2)
]
```

We need to locate the task index (if any) whose urgency is 5:

In [81]:
class Task:
   def __init__(self, title, urgency):
       self.title = title
       self.urgency = urgency


tasks = [
   Task("Laundry", 3),
   Task("Museum", 4),
   Task("Homework", 5),
   Task("Ticket", 2)
]


needed_urgency = 5
needed_task_index = None

for i, task in enumerate(tasks):
    if task.urgency == needed_urgency:
        needed_task_index = i
        break

# compare against None: index 0 is a valid position but evaluates to False
if needed_task_index is not None:
    print(f"Task found: {needed_task_index}")
else:
    print("No Task found")



Task found: 2


With a list of custom class instances, the `index` method won't find an equivalent instance, because by default instances are compared by identity:

In [3]:
class Task:
   def __init__(self, title, urgency):
       self.title = title
       self.urgency = urgency


tasks = [
   Task("Laundry", 3),
   Task("Museum", 4),
   Task("Homework", 5),
   Task("Ticket", 2)
]

needed_task = Task("Homework", 5)
try:
    tasks.index(needed_task)
except ValueError as e:
    print(f"Oops: {e}")

assert id(tasks[2]) != id(needed_task)

Oops: <__main__.Task object at 0x7f0a380f0b20> is not in list
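
One possible way to make `index` work is to define `__eq__` on the class. The sketch below assumes that two tasks should be considered equal when their title and urgency match; that equality rule is an assumption made for this example. It also shows an alternative that searches by attribute without touching the class.

```python
class Task:
    def __init__(self, title, urgency):
        self.title = title
        self.urgency = urgency

    def __eq__(self, other):
        # Assumption: two tasks are equal when title and urgency match
        if not isinstance(other, Task):
            return NotImplemented
        return (self.title, self.urgency) == (other.title, other.urgency)


tasks = [Task("Laundry", 3), Task("Museum", 4), Task("Homework", 5)]

# index now finds an equal (not identical) instance
position = tasks.index(Task("Homework", 5))

# Alternative without __eq__: search by attribute using enumerate and next
position_by_urgency = next(
    (i for i, task in enumerate(tasks) if task.urgency == 5), None
)

print(position, position_by_urgency)
```

Keep in mind that a class defining `__eq__` without `__hash__` becomes unhashable, so its instances can no longer be used as dictionary keys or set elements unless `__hash__` is defined as well.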


## Section 44 &mdash; More on Iterables and Iterators

*Iterators* are a special data type from which we can retrieve each of their elements through a process called *iteration*.

Under the hood, iteration is performed with two built-in functions, `iter` and `next`:

+ An iterator is created from an iterable using `iter`.
+ Elements are produced using `next`. Calling `next` on the iterator retrieves the next element if available.
+ When all the elements have been produced, the `StopIteration` exception is raised to signal that `next` cannot produce more elements.

### Manually triggering iteration on an iterable with an iterator

The following sequence of steps illustrates how to iterate over the elements of an iterable using `iter` and `next`:

In [9]:
tasks = ["task0", "task1", "task2"]

tasks_iterator = iter(tasks)

elem1 = next(tasks_iterator)
print(f"elem1={elem1}")

elem2 = next(tasks_iterator)
print(f"elem2={elem2}")

elem3 = next(tasks_iterator)
print(f"elem3={elem3}")

try:
    nxt_elem_not_exists = next(tasks_iterator)
except StopIteration as e:
    print(f"No more elements: {e}")

elem1=task0
elem2=task1
elem3=task2
No more elements: 


In practice, you will rarely need to follow that approach, as Python handles iteration automatically with `for` loops:

In [None]:
tasks = ["task0", "task1", "task2"]

for task in tasks:
    print(task)
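
Conceptually, a `for` loop performs the `iter`/`next` steps described above on your behalf. A rough sketch of the equivalent `while` loop, meant as an illustration of the protocol rather than the interpreter's actual implementation:

```python
tasks = ["task0", "task1", "task2"]

seen = []
tasks_iterator = iter(tasks)       # what `for` does first
while True:
    try:
        task = next(tasks_iterator)
    except StopIteration:          # signals that the iterable is exhausted
        break
    seen.append(task)              # the loop body goes here

assert seen == tasks
```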

### Checking if an object is iterable

You can determine if an object is iterable using the EAFP principle (Easier to Ask for Forgiveness than Permission):

In [14]:
def is_iterable(obj):
    try:
        _ = iter(obj)
    except TypeError:
        return False
    else:
        return True

print(is_iterable(5))
print(is_iterable([1, 2, 3]))
print(is_iterable("Hello"))
print(is_iterable((1, 2, "Hello")))
print(is_iterable({1: "one", 2: "two"}))

False
True
True
True
True
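
A LBYL-style alternative is to check against `collections.abc.Iterable`. The sketch below uses a hypothetical `Countdown` class to show the one case where the two checks disagree: objects that implement only `__getitem__` are accepted by `iter` but are not recognized by `isinstance`.

```python
from collections.abc import Iterable

assert isinstance([1, 2, 3], Iterable)
assert isinstance("Hello", Iterable)
assert not isinstance(5, Iterable)


class Countdown:
    """Hypothetical class that supports iteration only through __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return 3 - index


countdown = Countdown()
assert list(countdown) == [3, 2, 1]           # iter() falls back to __getitem__
assert not isinstance(countdown, Iterable)    # but the ABC check says no
```

This is one reason the `try`/`iter` approach is usually the more reliable test.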


### Creating iterables programmatically using `list`, `dict`, `tuple` and `set`

While you can use literals to create lists, dictionaries, and sets, many times you'll need to use the `list`, `dict`, and `set` functions to create them:

In [18]:
integers_list = list(range(10))
assert integers_list == [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

dict_items = [("one", 1), ("two", 2), ("three", 3), ("catorce", 14)]
integers_dict = dict(dict_items)
assert integers_dict == {"one": 1, "two": 2, "three": 3, "catorce": 14}

some_integers = (1, 2, 4, 2, 5, 3, 4, 6, 7, 1, 2, 5)
unique_integers = set(some_integers)
assert unique_integers == {1, 2, 3, 4, 5, 6, 7}
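
The heading also mentions `tuple`, which follows the same pattern. A brief sketch with illustrative variable names:

```python
letters_tuple = tuple("abc")
assert letters_tuple == ("a", "b", "c")

# Any iterable works as input, including a dictionary (iterating yields its keys)
keys_tuple = tuple({"one": 1, "two": 2})
assert keys_tuple == ("one", "two")
```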

### Creating a list of letters from a string

Strings are iterable. Suppose that we have a `str` object `"ABCDE"`. What's the best way to create a list of its characters, as in `["A", "B", "C", "D", "E"]`?

You can just create a list out of the string:

In [19]:
letters = "ABCDE"  # avoid naming the variable `str`, which would shadow the built-in type

assert list(letters) == ["A", "B", "C", "D", "E"]

### Using `map` to convert the elements of a list

Suppose that we have a list of strings representing floating point numbers. Transform them into the equivalent list of numbers.

```python
["1.23", "4.56", "7.89"]
```

In [20]:
nums_str = ["1.23", "4.56", "7.89"]

nums_float = list(map(float, nums_str))
assert nums_float == [1.23, 4.56, 7.89]

### Using `zip` to create a dictionary from two lists

Suppose we have two lists with the id numbers and titles of certain tasks to be performed. Create a dictionary using `zip` in which the keys are the task IDs and the titles are the corresponding values:

In [21]:
id_numbers = [101, 102, 103]
titles = ["Laundry", "Homework", "Soccer"]

tasks = dict(zip(id_numbers, titles))

assert tasks == {101: "Laundry", 102: "Homework", 103: "Soccer"}

## Section 45 &mdash; More on list comprehensions, dictionary comprehensions, and set comprehensions

Comprehensions are a concise way of creating lists, dictionaries, and sets.

We've already learned the basics of list comprehensions:

In [22]:
numbers = [1, 2, 3, 4]
squares = [x * x for x in numbers]

assert squares == [1, 4, 9, 16]

### Creating a list of elements from a named tuple

Consider the following list of named tuples:

```python
from collections import namedtuple

Task = namedtuple("Task", "title, description, urgency")
 
tasks = [
    Task("Homework", "Physics and math", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4)
]
```

Create a list with all the titles of that list using a list comprehension:

In [27]:
from collections import namedtuple

Task = namedtuple("Task", ["title", "description", "urgency"])

tasks = [
    Task("Homework", "Physics and math", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4)
]

task_titles = [task.title for task in tasks]

assert task_titles == ["Homework", "Laundry", "Museum"]

Alternatively, we could have used `map` and a custom function:

In [28]:
from collections import namedtuple

Task = namedtuple("Task", ["title", "description", "urgency"])

tasks = [
    Task("Homework", "Physics and math", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4)
]

def get_title(task):
    return task.title

task_titles = list(map(get_title, tasks))

assert task_titles == ["Homework", "Laundry", "Museum"]

### Creating a dictionary from an iterable using dictionary comprehension

Using the following list as our starting point:
```python
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]
```

Create a dictionary object in which the titles are the keys and the descriptions are the values, using a dictionary comprehension.


In [1]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

title_dict = {task["title"]: task["desc"] for task in tasks}

assert title_dict == {
    "Laundry": "Wash clothes",
    "Homework": "Physics + Math",
    "Museum": "Egyptian things"
    }

### Creating a set from iterables using set comprehension

Using the following list as our starting point:
```python
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]
```

Create a set object with the task titles.

In [3]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

titles_set = {task["title"] for task in tasks}

assert titles_set == {"Laundry", "Homework", "Museum"}

### Applying a filtering condition to an iterable

Using the following list as our starting point:
```python
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]
```

Filter out all the tasks whose urgency is less than or equal to 3.

In [4]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

urgent_tasks = [task for task in tasks if task["urgency"] > 3]

assert urgent_tasks == [{'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5}]

A more functional way to solve this is to use the `filter` higher-order function:

In [35]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

urgent_tasks = list(filter(lambda task: task["urgency"] > 3, tasks))

assert urgent_tasks == [{'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5}]

Note that we had to use `list` to materialize the result into a list, because `filter` (as well as `map`, `chain`, and others) returns a lazy iterator rather than a list.
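
To illustrate why the `list` call matters, here is a small sketch showing that the object returned by `filter` produces items on demand and can be consumed only once. The numbers are illustrative.

```python
numbers = [1, 2, 3, 4, 5, 6]

evens = filter(lambda n: n % 2 == 0, numbers)

# Items are computed lazily, one per call to next
assert next(evens) == 2

# Materializing consumes whatever is left...
assert list(evens) == [4, 6]

# ...and a second pass finds the iterator exhausted
assert list(evens) == []
```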

### Using nested loops in comprehensions

Using the following list as our starting point:
```python
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]
```

Create a flattened list of items, in which all the individual task values are elements in the resulting list.


Let's take a stab at it with the regular nested loop.

In [7]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

for task in tasks:
    for item in task.values():
        print(item)

Laundry
Wash clothes
3
Homework
Physics + Math
5
Museum
Egyptian things
2


Thus, the non-Pythonic way to get that resulting list would be:

In [9]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

flattened_list = []
for task in tasks:
    for item in task.values():
        flattened_list.append(item)

assert flattened_list == [
    "Laundry", "Wash clothes", 3,
    "Homework", "Physics + Math", 5,
    "Museum", "Egyptian things", 2
]

In a more Pythonic way, you can use nested `for` clauses in a comprehension:

In [10]:
tasks = [
   {'title': 'Laundry', 'desc': 'Wash clothes', 'urgency': 3},
   {'title': 'Homework', 'desc': 'Physics + Math', 'urgency': 5},
   {'title': 'Museum', 'desc': 'Egyptian things', 'urgency': 2}
]

flattened_list = [item for task in tasks for item in task.values()]

assert flattened_list == [
    "Laundry", "Wash clothes", 3,
    "Homework", "Physics + Math", 5,
    "Museum", "Egyptian things", 2
]

Note that the `for` clauses are written in the same order in which you'd write them in the non-Pythonic approach.

### When you shouldn't use comprehensions

You shouldn't need to use comprehensions in the following cases.

+ When you are not going to manipulate the items, and instead you will be transforming an iterable into another:

In [None]:
numbers = [1, 2, 4, 2, 4, 5, 1, 2, 3, 4]

nums_set = set(numbers) # no need to use a set comprehension

+ When the expression within the comprehension is too complex.

For example, consider the following lists holding the style, color and size. We need to create all the possible variations:

In [13]:
styles = ['long-sleeve', 'v-neck']
colors = ['white', 'black']
sizes = ['L', 'S']

# Not recommended: hard to read and debug
variations = [" ".join([style, color, size]) for style in styles for color in colors for size in sizes]

print(variations)

['long-sleeve white L', 'long-sleeve white S', 'long-sleeve black L', 'long-sleeve black S', 'v-neck white L', 'v-neck white S', 'v-neck black L', 'v-neck black S']
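
One more readable alternative, offered here as a suggestion rather than the only option, is `itertools.product`, which yields the same combinations in the same order as the nested `for` clauses:

```python
from itertools import product

styles = ['long-sleeve', 'v-neck']
colors = ['white', 'black']
sizes = ['L', 'S']

# product iterates the rightmost iterable fastest, like the innermost loop
variations = [" ".join(combo) for combo in product(styles, colors, sizes)]

assert len(variations) == 8
assert variations[0] == 'long-sleeve white L'
assert variations[-1] == 'v-neck black S'
```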


### Is tuple comprehension a thing?

Consider the following code, in which a junior developer is trying to use the comprehension syntax to create a tuple out of a list of objects:

```python
task1 = ["Laundry", "Wash clothes", 3]
task_tuple = (item for item in task1)
```

Is `task_tuple` a tuple?

No. There's no such thing as a tuple comprehension.

In fact, that syntax creates a generator object:

In [17]:
task1 = ["Laundry", "Wash clothes", 3]

task_tuple = (item for item in task1)

print(type(task_tuple))

<class 'generator'>


If you want to create a tuple out of the task list, pass the list to `tuple`:

In [18]:
task1 = ["Laundry", "Wash clothes", 3]

task_tuple = tuple(task1)

assert task_tuple == ("Laundry", "Wash clothes", 3)
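
When the items do need to be transformed along the way, a generator expression can be passed directly to `tuple`. A short sketch, where the upper-casing rule is only for illustration:

```python
task1 = ["Laundry", "Wash clothes", 3]

# The generator expression is consumed by tuple(); no intermediate list is built
task_tuple = tuple(str(item).upper() for item in task1)

assert task_tuple == ("LAUNDRY", "WASH CLOTHES", "3")
```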

### Enumerating items with `enumerate`

Consider the following list of named tuples with fields "title", "description", and "urgency".

```python
tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]
```

Create a basic report showing:

```
Task 1: Homework   Physics and math   5
Task 2: Laundry    Wash clothes       3
Task 3: Museum     Egypt exhibit      4
```

In [20]:
from collections import namedtuple

Task = namedtuple("Task", ["title", "description", "urgency"])

tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]

for i, task in enumerate(tasks):
    print(f"Task {i + 1}: {task.title: <12} {task.description: <20} {task.urgency:^3}")


Task 1: Homework     Physics and math      5 
Task 2: Laundry      Wash clothes          3 
Task 3: Museum       Egypt exhibit         4 
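
`enumerate` also accepts a `start` argument, which avoids the `i + 1` arithmetic. A sketch with plain strings to keep it short:

```python
titles = ["Homework", "Laundry", "Museum"]

# start=1 makes the counter begin at 1 instead of 0
numbered = [f"Task {i}: {title}" for i, title in enumerate(titles, start=1)]

assert numbered == ["Task 1: Homework", "Task 2: Laundry", "Task 3: Museum"]
```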


### Reversing items in an iterable with `reversed`


Consider the following list of named tuples with fields "title", "description", and "urgency".

```python
tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]
```

We want to create a report where the tasks show up in reversed order, while keeping the original list untouched.

```
Task: Task(title='Museum', description='Egypt exhibit', urgency=4)
Task: Task(title='Laundry', description='Wash clothes', urgency=3)
Task: Task(title='Homework', description='Physics and math', urgency=5)
```

The solution is to use `reversed`, which returns an iterator that yields the items in reverse order without copying or modifying the original list:

In [21]:
from collections import namedtuple

Task = namedtuple("Task", ["title", "description", "urgency"])

tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]

for task in reversed(tasks):
    print(task)

Task(title='Museum', description='Egypt exhibit', urgency=4)
Task(title='Laundry', description='Wash clothes', urgency=3)
Task(title='Homework', description='Physics and math', urgency=5)
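
To make the behavior concrete, the following sketch compares `reversed`, which hands back an iterator, with the in-place `list.reverse` method. The list of strings is illustrative.

```python
titles = ["Homework", "Laundry", "Museum"]

rev = reversed(titles)
assert not isinstance(rev, list)          # it is an iterator, not a list
assert list(rev) == ["Museum", "Laundry", "Homework"]
assert titles == ["Homework", "Laundry", "Museum"]   # original untouched

# By contrast, list.reverse() mutates the list in place and returns None
assert titles.reverse() is None
assert titles == ["Museum", "Laundry", "Homework"]
```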


### Combining more than two iterables with `zip`

Consider the following iterables consisting of a list of named tuples with fields "title", "description", and "urgency", a list of dates and a list of locations.

```python
tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]
dates = ["May 5, 2022", "May 9, 2022", "May 11, 2022"]
locations = ["School", "Home", "Downtown"]
```

Create the following report: 

```
Homework: by May 5, 2022, at School
Laundry: by May 9, 2022, at Home
Museum: by May 11, 2022, at Downtown
```

In [23]:
from collections import namedtuple

Task = namedtuple("Task", ["title", "description", "urgency"])

tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]
dates = ["May 5, 2022", "May 9, 2022", "May 11, 2022"]
locations = ["School", "Home", "Downtown"]

for task, date, location in zip(tasks, dates, locations):
    print(f"{task.title}: by {date}, at {location}")

Homework: by May 5, 2022, at School
Laundry: by May 9, 2022, at Home
Museum: by May 11, 2022, at Downtown
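
If the lists could differ in length and no row should be lost, `itertools.zip_longest` pads the shorter ones. A sketch with a missing location; the `"TBD"` fill value is an arbitrary choice for illustration:

```python
from itertools import zip_longest

titles = ["Homework", "Laundry", "Museum"]
locations = ["School", "Home"]  # one location missing on purpose

rows = list(zip_longest(titles, locations, fillvalue="TBD"))

assert rows == [("Homework", "School"), ("Laundry", "Home"), ("Museum", "TBD")]
```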


### Chaining multiple iterables with `chain`

Consider the following lists of named tuples with fields "title", "description", and "urgency" that describe the outstanding and completed tasks:

```python
tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]

completed_tasks = [
   Task("Toaster", "Clean the toaster", 2),
   Task("Camera", "Export photos", 4),
   Task("Floor", "Mop the floor", 3)
]
```

Create a report that shows the titles from both lists.

In [30]:
from collections import namedtuple

Task = namedtuple("Task", ["title", "description", "urgency"])

tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]

completed_tasks = [
   Task("Toaster", "Clean the toaster", 2),
   Task("Camera", "Export photos", 4),
   Task("Floor", "Mop the floor", 3)
]

for task in tasks + completed_tasks:
    print(task.title)

Homework
Laundry
Museum
Toaster
Camera
Floor


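A more Pythonic solution involves the use of the `chain` function from the `itertools` module.

This solution is also more efficient in terms of memory, because `chain` iterates over the lists one after another instead of building a new concatenated list as `tasks + completed_tasks` does.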

In [31]:
from collections import namedtuple
from itertools import chain

Task = namedtuple("Task", ["title", "description", "urgency"])

tasks = [
   Task("Homework", "Physics and math", 5),
   Task("Laundry", "Wash clothes", 3),
   Task("Museum", "Egypt exhibit", 4)
]

completed_tasks = [
   Task("Toaster", "Clean the toaster", 2),
   Task("Camera", "Export photos", 4),
   Task("Floor", "Mop the floor", 3)
]

for task in chain(tasks, completed_tasks):
    print(task.title)

Homework
Laundry
Museum
Toaster
Camera
Floor
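
Another practical difference is that `chain` accepts any mix of iterables, whereas `+` requires both operands to be of the same sequence type. A sketch with illustrative data:

```python
from itertools import chain

pending = ["Homework", "Laundry"]          # a list
completed = ("Toaster", "Camera")          # a tuple

# list + tuple raises TypeError
try:
    pending + completed
except TypeError as e:
    print(f"Oops: {e}")

# chain does not care about the container types
assert list(chain(pending, completed)) == ["Homework", "Laundry", "Toaster", "Camera"]

# chain.from_iterable flattens an iterable of iterables
assert list(chain.from_iterable([pending, completed])) == ["Homework", "Laundry", "Toaster", "Camera"]
```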


### Breaking early from loops

Consider the following list of named tuples with fields "title", "description", and "urgency".

```python
tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]
```

Implement a loop that stops after a task categorized with urgency 5 is found.

In [37]:
from collections import namedtuple

Task = namedtuple("Task", "title, description, urgency")

tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]

for i, task in enumerate(tasks):
    print(f"Checking task {i}: {task.title}")
    if task.urgency == 5:
        print(f"Urgent task detected: {task}")
        break

Checking task 0: Toaster
Checking task 1: Camera
Checking task 2: Homework
Urgent task detected: Task(title='Homework', description='Physics and math', urgency=5)


### Short-circuiting to the next iteration with `continue`

Consider the following list of named tuples with fields "title", "description", and "urgency".

```python
tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]
```

Create a report in which all the tasks categorized as 4 or 5 are printed by calling a function, while the others are skipped.

In [1]:
from collections import namedtuple

Task = namedtuple("Task", "title, description, urgency")

tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]


def print_task(task):
    print(f"Important/Urgent task: {task}")

for task in tasks:
    if task.urgency < 4:
        continue  # skip the rest of the body and move on to the next task
    print_task(task)

Important/Urgent task: Task(title='Camera', description='Export photos', urgency=4)
Important/Urgent task: Task(title='Homework', description='Physics and math', urgency=5)
Important/Urgent task: Task(title='Internet', description='Upgrade plan', urgency=5)
Important/Urgent task: Task(title='Museum', description='Egypt exhibit', urgency=4)
Important/Urgent task: Task(title='Utility', description='Pay bills', urgency=5)


### Using `else` in `for` loops

Python allows you to attach an `else` clause to a `for` loop.

```python
for item in iterable:
  # loop body
else:
  # execute once when looping is complete, and break not used
```

That is, when `break` is used in the loop body, the `else` block is skipped; otherwise, it is executed once when the iteration is complete.

Consider the following list of named tuples with fields "title", "description", and "urgency".

```python
tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]
```

We want to locate the first task with the desired urgency level. Define a function `locate_task(urgency_level)` that loops through the tasks until it finds the required urgency level and then prints it. When not found, it should print `None`.

In [3]:
from collections import namedtuple

Task = namedtuple("Task", "title, description, urgency")

tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]

def locate_task(urgency_level):
    for task in tasks:
        if task.urgency == urgency_level:
            found_task = task
            break
    else:
        found_task = None
    print(f"Found Task: {found_task}")

locate_task(1)

locate_task(4)

Found Task: None
Found Task: Task(title='Camera', description='Export photos', urgency=4)


Note that `break` is what skips the `else` part of the loop. Without a `break`, the `else` block always runs once the loop finishes:

In [4]:
for i in range(6):
    print(i)
else:
    print("done!")

0
1
2
3
4
5
done!
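
A common use of this pattern is searching for a counterexample. The sketch below uses a hypothetical `is_prime` helper, in which the `else` block runs only if no divisor triggered `break`:

```python
def is_prime(n: int) -> bool:
    """Hypothetical helper: trial division using for/else."""
    if n < 2:
        return False
    for divisor in range(2, int(n ** 0.5) + 1):
        if n % divisor == 0:
            result = False
            break          # a divisor was found, so else is skipped
    else:
        result = True      # loop finished without break: no divisor exists
    return result


assert [n for n in range(20) if is_prime(n)] == [2, 3, 5, 7, 11, 13, 17, 19]
```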


### Using `else` in `while` loops

As with `for`, Python allows you to use an `else` clause in a `while` loop. It is executed once when the loop condition becomes false, and skipped when `break` is used to stop the iteration:

```python
while condition:
  # loop body
else:
  # executed once, if break not used
```

That is, if `break` is used, the `else` operations will be skipped.

Consider the following list of named tuples with fields "title", "description", and "urgency".

```python
tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]
```

Suppose that we need to take a rest while working through a series of tasks in each session. The resting threshold is compared against the sum of the urgency levels of the tasks completed so far.

Create a function `complete_tasks_with_break(resting_threshold)` that completes tasks until that sum exceeds the threshold, and celebrates if the list runs out of tasks first:

In [6]:
from collections import namedtuple

Task = namedtuple("Task", "title, description, urgency")

tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]

def complete_tasks_with_break(resting_threshold):
    sum_task_urgencies = 0
    while tasks:
        if sum_task_urgencies > resting_threshold:
            print("Coffee break now!")
            break
        next_task = tasks.pop()
        print(f"Completed: {next_task}")
        sum_task_urgencies += next_task.urgency
    else:
        print("Party! Completed all the tasks!")

complete_tasks_with_break(7)

complete_tasks_with_break(25)

Completed: Task(title='Utility', description='Pay bills', urgency=5)
Completed: Task(title='Museum', description='Egypt exhibit', urgency=4)
Coffee break now!
Completed: Task(title='Laundry', description='Wash clothes', urgency=3)
Completed: Task(title='Internet', description='Upgrade plan', urgency=5)
Completed: Task(title='Floor', description='Mop the floor', urgency=3)
Completed: Task(title='Homework', description='Physics and math', urgency=5)
Completed: Task(title='Camera', description='Export photos', urgency=4)
Completed: Task(title='Toaster', description='Clean the toaster', urgency=2)
Party! Completed all the tasks!


## Section 46 &mdash; Type Hinting: A gentle and focused introduction

Python allows you to provide type hints to indicate the type of a variable:

In [None]:
number: int = 1

name: str = "John"

primes: list = [2, 3, 5]

Note that type hinting does not make Python a statically typed language, and it doesn't enforce the type of a variable: those are only hints.
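
A quick sketch of what "only hints" means at runtime. The names are illustrative; a static type checker would be expected to flag the mismatches below, but the interpreter runs them without complaint.

```python
count: int = "not a number"   # no error at runtime despite the mismatch
assert isinstance(count, str)


def double(value: int) -> int:
    return value * 2


# The hint says int, but a str is accepted and simply repeated
assert double("ab") == "abab"

# Hints are stored as metadata that tools can inspect
assert double.__annotations__ == {"value": int, "return": int}
```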

Despite that fact, type hints may become extremely useful as code documentation as seen below:

In [1]:
from statistics import mean, stdev

def generate_stats(measures: list) -> tuple:
    measure_mean = mean(measures)
    measures_std = stdev(measures)
    return measure_mean, measures_std

Type hints are used by IDEs to warn you when you call a function with incorrect argument types. Also, `help` displays them as part of the function signature:

In [2]:
help(generate_stats)

Help on function generate_stats in module __main__:

generate_stats(measures: list) -> tuple



You can also use custom classes and named tuples as type hints, as seen in the parameters of `assign_task`:

In [None]:
from collections import namedtuple

Task = namedtuple("Task", ["title", "description", "urgency"])

class User:
    pass

def assign_task(pending_task: Task, user: User):
    pass

For data containers such as lists and tuples that can contain other objects, specifying only the type of the container will only get you so far:

In [3]:
from collections import namedtuple

Task = namedtuple("Task", "title description urgency")

def complete_tasks(tasks: list):
    for task in tasks:
        pass

# Which one is expected?
complete_tasks(["Homework", "Wash Clothes"])
complete_tasks([Task("Homework", "Physics + Math", 5), Task("Laundry", "Wash Clothes", 3)])

For those cases, Python (3.9 and later) allows you to identify the type of the objects the container holds:

In [None]:
from collections import namedtuple

Task = namedtuple("Task", "title description urgency")

def complete_tasks(tasks: list[Task]):
    for task in tasks:
        pass


The following table summarizes the most common built-in container object annotations:

| Container Type | Example | Description |
| :------------- | :------ | :---------- |
| `list` | `list[str]`<br>`list[int]` | list of str objects<br>list of int objects |
| `tuple` | `tuple[float, int]`<br>`tuple[float, ...]` | tuple of float object and an int object<br>tuple of multiple float objects |
| `dict` | `dict[int, str]`<br>`dict[int, list[int]]` | dict of int keys and str values<br>dict of int keys and list of int object values |
| `set` | `set[int]`<br>`set[str]` | set of int objects<br>set of str objects |
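
The annotations from the table can be applied to variables as well as parameters. A sketch with made-up values, assuming Python 3.9 or later for the subscripted built-ins:

```python
titles: list[str] = ["Homework", "Laundry"]
point: tuple[float, int] = (1.5, 2)             # exactly two items: float, then int
samples: tuple[float, ...] = (0.1, 0.2, 0.3)    # any number of floats
id_to_title: dict[int, str] = {101: "Laundry"}
id_to_scores: dict[int, list[int]] = {101: [3, 4]}
unique_ids: set[int] = {101, 102}
```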

As it's possible for a function to accept different types for a specific parameter, type hinting supports the use of the `|` operator (Python 3.10 and later) to separate the different options:

In [None]:
from statistics import mean, stdev

def generate_stats(measures: list[float] | tuple[float, ...]) -> tuple[float, ...]:
    pass

The vertical bar can also be used to indicate the types of the objects held by a data container:

In [None]:
from statistics import mean, stdev

def generate_stats(measures: list[float | int]) -> tuple[float, ...]:
    pass

### Advanced type hinting using the `typing` module

The `typing` module includes higher-level typing information that is not available out-of-the-box.

For example, instead of using `list[float] | tuple[float, ...]`, we could use the `Sequence` type, which captures any sequence data type:

In [None]:
from statistics import mean, stdev
from typing import Sequence

def generate_stats(measures: Sequence[float]) -> tuple[float, ...]:
    pass
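
Filling in the body shows the benefit: one signature serves both lists and tuples. This sketch reuses `mean` and `stdev` as in the earlier cell; the sample numbers are illustrative, and narrowing the return type to a two-item tuple is an assumption about the intended result.

```python
from statistics import mean, stdev
from typing import Sequence


def generate_stats(measures: Sequence[float]) -> tuple[float, float]:
    # stdev needs at least two data points
    return mean(measures), stdev(measures)


from_list = generate_stats([1.0, 2.0, 3.0])
from_tuple = generate_stats((1.0, 2.0, 3.0))

assert from_list == from_tuple
assert from_list[0] == 2.0
```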

## Section 47 &mdash; Increasing function flexibility with `*args`, `**kwargs` and `/`

Consider the signature of the built-in `print` function:

```python
print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)
```

With that succinct signature, `print` is able to support a variable number of arguments.

This happens thanks to the `*objects` definition (note the "*"), which means a variable number (zero or more) of positional (that is, unnamed) arguments.

Closely related to `*args` is `**kwargs` &mdash; a way to specify a variable number of keyword arguments.

Certain functions, such as the `sort` method of lists, are defined as follows:

```python
list.sort(*, key=None, reverse=False)
```

The bare `*` dictates that all the parameters after it must be passed as keyword arguments (no positional arguments allowed).

Similarly, you can use `/` to dictate that all the parameters before it are positional-only.

```python
sum(iterable, /, start=0)
```

The previous function accepts `iterable` as a positional-only argument. The `start` argument comes after the `/`, so it can be passed either by position or by keyword.
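
These rules can be checked directly against the two built-ins mentioned above. A sketch with illustrative data; passing `start` by keyword assumes Python 3.8 or later.

```python
numbers = [1, 2, 3]

# start can be given by position or by keyword
assert sum(numbers, 10) == 16
assert sum(numbers, start=10) == 16

# iterable is positional-only, so naming it fails
try:
    sum(iterable=numbers)
except TypeError as e:
    print(f"Oops: {e}")

# key and reverse are keyword-only in list.sort
words = ["pear", "fig", "banana"]
words.sort(key=len, reverse=True)
assert words == ["banana", "pear", "fig"]

try:
    words.sort(len)
except TypeError as e:
    print(f"Oops: {e}")
```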

### Accepting a variable number of positional arguments

Define a function `stringify` that takes a variable number of arguments and returns a list in which each element is the argument converted into a string:

```python
def stringify(*items) -> list[str]
```

In [1]:
from typing import Any

def stringify(*items: Any) -> list[str]:
    print(f"got {items} of {type(items)}")
    result = [str(item) for item in items]
    return result

stringify(1, "two", "None")

got (1, 'two', 'None') of <class 'tuple'>


['1', 'two', 'None']

As you can see, `*items` is actually received as a tuple of three elements.
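
The star also works in the other direction: at the call site, it unpacks an existing iterable into separate positional arguments. A sketch using a trimmed-down `stringify` without the debug print:

```python
def stringify(*items) -> list[str]:
    return [str(item) for item in items]


values = [1, "two", None]

# Without the star, the whole list arrives as a single argument
assert stringify(values) == ["[1, 'two', None]"]

# With the star, each element becomes its own positional argument
assert stringify(*values) == ["1", "two", "None"]
```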

### Defining a function that accepts a single positional argument and a variable number of positional arguments

When we need to define a function that takes one argument and a variable number of positional arguments, the latter should be placed at the end as an `*args`:

In [3]:
def stringify_a(item0, *items):
    print(item0, items)

stringify_a(0)
stringify_a(0, 1)
stringify_a(0, 1, 2)

0 ()
0 (1,)
0 (1, 2)


If we place the variable number of arguments before the single argument, every positional value is collected by `*items`, and `item` becomes a keyword-only argument that cannot be filled by position:

In [7]:
def stringify_b(*items, item):
    print(items, item)

try:
    stringify_b(0)
except Exception as e:
    print(f"Oops: {e}")

try:
    stringify_b(0, 1)
except Exception as e:
    print(f"Oops: {e}")

try:
    stringify_b(0, 1, 2)
except Exception as e:
    print(f"Oops: {e}")

Oops: stringify_b() missing 1 required keyword-only argument: 'item'
Oops: stringify_b() missing 1 required keyword-only argument: 'item'
Oops: stringify_b() missing 1 required keyword-only argument: 'item'


However, the above function can still be called by passing `item` as a keyword argument:

In [8]:
def stringify_b(*items, item):
    print(items, item)

stringify_b(1, item=None)
stringify_b(1, item=0)
stringify_b(1, 2, item=0)

(1,) None
(1,) 0
(1, 2) 0


Still, there are more idiomatic ways of defining that function, so the previous definition should generally be avoided, and the variable number of arguments `*args` should be placed last among the positional parameters.

### Accepting a variable number of keyword arguments with `**kwargs`

Consider a function used to create the grade report for a student. As such, the function API should look like:

```python
create_report("John", math=100, phys=98, bio=95)
```

Implement that function so that it displays the report as:

```
***** Report Begin for John *****
### math: 100
### phys: 98
### bio: 95
***** Report End for John *****
```


In [12]:
def create_report(student_name: str, **grades: int) -> None:  # the hint describes each value, not the dict
    print(f">>>DEBUG: got {grades} as {type(grades)}")
    print(f"***** Report Begin for {student_name} *****")
    for subject, grade in grades.items():
        print(f"### {subject}: {grade}")
    print(f"***** Report End for {student_name} *****")

create_report("John", math=100, phys=98, bio=95)

>>>DEBUG: got {'math': 100, 'phys': 98, 'bio': 95} as <class 'dict'>
***** Report Begin for John *****
### math: 100
### phys: 98
### bio: 95
***** Report End for John *****


Thus, effectively, `**kwargs` is received as a dictionary, and it accepts a variable number of keyword arguments.

When you use `**kwargs`, it should be placed after all the other parameters.

The general order for a function definition will be:

```python
def example(arg0, arg1, *args, kwarg0, kwarg1, **kwargs2):
  ...
```
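
As a minimal sketch of how Python distributes the arguments of a call over such a signature (the function `describe_call`, its parameter names, and the default value are hypothetical, chosen only for illustration):

```python
def describe_call(arg0, arg1, *args, kwarg0, kwarg1="default", **kwargs):
    # Return the buckets instead of printing them so they are easy to inspect
    return {
        "positional": (arg0, arg1),
        "extra_positional": args,          # a tuple
        "keyword_only": (kwarg0, kwarg1),
        "extra_keyword": kwargs,           # a dict
    }

# 1 and 2 fill arg0 and arg1; 3 and 4 are collected by *args;
# kwarg0 must be passed by name; color is collected by **kwargs
result = describe_call(1, 2, 3, 4, kwarg0="a", color="red")
print(result)
```

Everything after `*args` can only be passed by keyword, which is why `kwarg0` has to be named in the call.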

In short:

| Left side | Divider | Right side |
| :-------- | :------ | :--------- |
| Positional-only | / | Positional or keyword |
| Positional or keyword | * | keyword-only |

Thus, the slash `/` ensures that the parameters preceding it are passed by position, while the asterisk `*` ensures that the parameters following it are passed by keyword.

The following exercises put both dividers to work.

### Writing a function that accepts only keyword arguments

Write a function that accepts only keyword arguments.

In [15]:
def example(*, item1, item2, item3):
    print(f"{item1}; {item2}; {item3}")

example(item1="hello", item2="to", item3="Jason")

try:
    example("hello", "to", "Jason")
except Exception as e:
    print(f"Oops: {e}")

hello; to; Jason
Oops: example() takes 0 positional arguments but 3 were given


### Writing a function that accepts only positional arguments

Write a function that accepts only positional arguments.

In [18]:
def example(item1, item2, item3, /):
    print(f"{item1}; {item2}; {item3}")

example("hello", "to", "Jason")

try:
    example(item1="hello", item2="to", item3="Jason")
except Exception as e:
    print(f"Oops: {e}")

hello; to; Jason
Oops: example() got some positional-only arguments passed as keyword arguments: 'item1, item2, item3'


### Writing a function that accepts a positional argument, a variable number of positional arguments, and a named keyword argument

Write a function that accepts a positional argument, a variable number of positional arguments, and a named keyword argument.

In [26]:
def example(pos0, *pos, kw_last):
    print(f"pos0={pos0}; pos={pos}; kw_last={kw_last}")

example("positional_0", kw_last="last_kw")
example(pos0="positional_0", kw_last="last_kw")
example("positional_0", "positional_1", "positional_2", kw_last="last_kw")

# SyntaxError (does not compile!): a positional argument cannot follow a keyword argument
# example(pos0="positional_0", "positional_1", "positional_2", kw_last="last_kw")


pos0=positional_0; pos=(); kw_last=last_kw
pos0=positional_0; pos=(); kw_last=last_kw
pos0=positional_0; pos=('positional_1', 'positional_2'); kw_last=last_kw


### Using slash and asterisk

Write a function that requires:
+ two arguments to be passed by position only
+ followed by an argument that can be passed as either keyword or position
+ followed by an argument that can only be passed as keyword

In [30]:
def example(pos_0, pos_1, /, kw_or_pos_2, *, kw_3):
    print(f"pos_0={pos_0}; pos_1={pos_1}; kw_or_pos_2={kw_or_pos_2}; kw_3={kw_3}")

example("item0", "item1", kw_or_pos_2="item2", kw_3="item3")
example("item0", "item1", "item2", kw_3="item3")

# This won't work: pos_0 and pos_1 are positional-only
try:
    example(pos_0="item0", pos_1="item1", kw_or_pos_2="item2", kw_3="item3")
except Exception as e:
    print(f"Oops: {e}")

pos_0=item0; pos_1=item1; kw_or_pos_2=item2; kw_3=item3
pos_0=item0; pos_1=item1; kw_or_pos_2=item2; kw_3=item3
Oops: example() got some positional-only arguments passed as keyword arguments: 'pos_0, pos_1'


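Note that you cannot put the `*` before the `/`: parameters after `*` are keyword-only, whereas parameters before `/` are positional-only, so anything placed between the two symbols would have to be both at once. Python rejects such a definition with a `SyntaxError`.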
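
A minimal sketch of that restriction follows. The invalid definition is kept in a string and handed to the built-in `compile()` so that the error can be caught at run time; the function name `broken` is hypothetical:

```python
# An invalid signature: the bare * appears before the /
source = "def broken(pos, *, kw, /):\n    pass\n"

try:
    compile(source, "<example>", "exec")
    outcome = "compiled"
except SyntaxError:
    outcome = "SyntaxError"

print(outcome)
```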

## Section 48 &mdash; More on OOP

This section deals with additional details and examples on OOP. 

### `self` is not a keyword

By convention, we use `self` to refer to the instantiated object when dealing with OOP. However, `self` is not a reserved keyword:

In [1]:
class Task:
    def __init__(this, title, description, urgency):
        this.title = title
        this.description = description
        this.urgency = urgency

task = Task("Homework", "Physics + Math", 4)
print(f"task.title={task.title}")


task.title=Homework


Even though it works, that doesn't mean we should use it.

> Use `self` to reference the instance object.

### `self` is being implicitly populated by Python

When you create a class, the `__init__` method will require `self` as the first argument.

This argument is set implicitly by Python.

Behind the scenes, the instantiation of a class consists of two steps, as seen below:

In [4]:
class Task:
    def __init__(self):
        print(f"__init__ has been called: creating object: {id(self)}")

    def __new__(cls):
        new_task = object.__new__(cls)
        print(f"__new__ has been called, creating object:  {id(new_task)}")
        return new_task

task = Task()

__new__ has been called, creating object:  140585690831200
__init__ has been called: creating object: 140585690831200


See how the overridden `__new__` method gets invoked automatically. Within it, we call `object.__new__` to create the instance and then return it. That instance is the `self` that Python automatically passes as the first argument to `__init__`.

Note that you will even be able to demonstrate this process by invoking those methods:

In [5]:
# This is what Python does behind the scenes

task = Task.__new__(Task)
Task.__init__(task)

__new__ has been called, creating object:  140585690831584
__init__ has been called: creating object: 140585690831584


### Checking an instance attribute with `__dict__`

The special instance attribute `__dict__` lets you see the attributes assigned to an instance:

In [7]:
class Task:
    def __init__(self, title, description, urgency):
        self.title = title
        self.description = description
        self.urgency = urgency

task = Task("Homework", "Physics + Math", 3)

print(f"task.__dict__={task.__dict__!r}")

task.__dict__={'title': 'Homework', 'description': 'Physics + Math', 'urgency': 3}


### Guidelines when designing the `__init__` method

1. Identify the required arguments.
2. Prioritize key arguments, placing the more important ones before the less important ones.
3. Use key arguments as positional. You want users to be able to set important arguments as positional arguments (as it is cleaner).
4. Limit the number of positional arguments. Use no more than four positional arguments and make the rest keyword-only.
5. Use sensible default values to foster better DX.
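
The guidelines above can be sketched as follows. This is only one possible reading of them: the choice of `title` and `urgency` as the key arguments, the default values, and the `tags` parameter are assumptions made for the example:

```python
class Task:
    # title and urgency are treated as the key arguments, so they can be
    # passed by position; everything after the bare * is keyword-only
    def __init__(self, title, urgency=3, *, description="", tags=None):
        self.title = title
        self.urgency = urgency
        self.description = description
        # Avoid a mutable default argument: build a new list per instance
        self.tags = list(tags) if tags is not None else []

task = Task("Homework", 4, description="Physics + Math")
quick_task = Task("Laundry")  # relies on the defaults

try:
    # description is keyword-only, so a third positional argument is rejected
    Task("Homework", 4, "Physics + Math")
except TypeError as e:
    print(f"Oops: {e}")
```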

### Specifying all the attributes in the constructor

While not strictly required, it is recommended to initialize all the instance attributes in `__init__`; otherwise, users of the class cannot tell which attributes an instance may have:

In [9]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.description = desc
        self.urgency = urgency

    def complete(self):
        self.status = "completed"

    def add_tag(self, tag):
        if not self.tags:
            self.tags = []
        self.tags.append(tag)

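The previous code gives a terrible user experience: `status` does not exist until `complete()` is called, and `add_tag()` fails for the same reason, because it reads `self.tags` before the attribute has ever been assigned: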

In [10]:
task = Task("Homework", "Physics + Math", 3)

try:
    print(task.status)
except Exception as e:
    print(f"Oops: {e}")

task.complete()
print(task.status)

Oops: 'Task' object has no attribute 'status'
completed


As a result, it is recommended to specify all of the object attributes in `__init__`:


In [11]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.description = desc
        self.urgency = urgency
        self.status = "created"
        self.tags = []

    def complete(self):
        self.status = "completed"

    def add_tag(self, tag):
        self.tags.append(tag)


task = Task("Homework", "Physics + Math", 3)

print(task.status)

task.complete()
print(task.status)

created
completed


### Defining class attributes

The examples above defined *per-instance* attributes. However, some attributes are meant to be shared by all the instances of a class.

Those are called *class attributes*.

Consider that all instances of the `Task` class share the same attribute `user` that identifies the logged in user:

In [None]:
class Task:
    user = "the logged in user"

    def __init__(self, title, desc, urgency):
        ...

Class attributes should be placed right below the `class` line and before the `__init__` method definition.
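
As a short sketch of how a class attribute behaves (the trimmed-down `Task` and the user names are only illustrative), note the difference between rebinding the attribute on the class and assigning it through an instance:

```python
class Task:
    user = "the logged in user"   # class attribute, shared by all instances

    def __init__(self, title):
        self.title = title        # instance attribute

task_a = Task("Homework")
task_b = Task("Laundry")

# Rebinding the attribute on the class is visible through every instance
Task.user = "Emma"
print(task_a.user, task_b.user)

# Assigning through an instance creates an instance attribute that shadows
# the class attribute for that instance only
task_a.user = "Jason"
print(task_a.user, task_b.user, Task.user)
```

For that reason, class attributes are usually updated through the class (or a class method) rather than through an instance.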

### Instance, static, and class methods

The methods we've seen so far are known as instance methods.
Instance methods are intended to be called on an instance object. As a result, those are defined with `self` as the first parameter of the method.

In addition to those, Python allows you to define static and class methods for other use cases:
+ static methods are used for utility functions that are not specific to any instance.
+ class methods are used for accessing class-level attributes.


Unlike instance methods, static methods do not use `self` as their first parameter, and need to be decorated with `@staticmethod`.

A static method is invoked using the class name.

For example, the following snippet defines a static method `get_timestamp` that returns the current date and time:

In [1]:
from datetime import datetime

class Task:
    @staticmethod
    def get_timestamp():
        now = datetime.now()
        timestamp = now.strftime("%b %d %Y, %H:%M")
        return timestamp

    def __init__(self, title, desc, urgency):
        ...

print(Task.get_timestamp())

Sep 06 2023, 08:07


Class methods may need to access the attributes of the class. When defining such a method, you use `cls` as its first parameter, which refers to the class, and you decorate it with `@classmethod`.

| NOTE: |
| :---- |
| As with `self`, `cls` is not a keyword; it is a convention that should be respected. |

Consider the following example, in which we have a Task modeled as a dictionary, and we want to create a class instance from it.

A class method is a good solution for this:

In [None]:
task_dict = {"title": "Laundry", "desc": "Wash Clothes", "urgency": 3}

class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    @classmethod
    def task_from_dict(cls, task_dict):
        # Read the fields from the dictionary received as argument
        title = task_dict["title"]
        desc = task_dict["desc"]
        urgency = task_dict["urgency"]
        # cls refers to the class the method was called on
        task_obj = cls(title, desc, urgency)
        return task_obj


task = Task.task_from_dict(task_dict)

Note that static methods, unlike class methods, receive neither the class nor an instance as an implicit first argument. Oftentimes, static methods can be defined outside of the class as plain functions, as they tend to implement generic functionality that has nothing to do with the class (such as `get_timestamp`).

### Invoking internal methods within a class

When dealing with classes, oftentimes you will find that some methods are intended to be internal methods used only by the methods of the class.

Consider the following scenario in which a `complete` instance method relies on a `format_note` method to get a formatted representation of the task:

In [3]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def complete(self, note = ""):
        self.status = "completed"
        self.note = self.format_note(note)

    def format_note(self, note):
        formatted_note = note.title() # Returns a title-cased version of the string
        return formatted_note


task = Task("Laundry", "Wash clothes", 3)
task.complete()

note = "Hello"
print(note.title())

Hello


The user can call `format_note` directly, which isn't the desired behavior.

However, Python doesn't have any formal mechanism to restrict access to any attribute or method. Instead, the convention is to use `_` as the prefix for protected methods (the ones that should be available to the current class and subclasses), and `__` for private methods.

In [6]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def complete(self, note = ""):
        self.status = "completed"
        self.note = self._format_note(note)

    def _format_note(self, note):
        formatted_note = note.title() # Returns a title-cased version of the string
        return formatted_note


task = Task("Laundry", "Wash clothes", 3)
task.complete()

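If you want to make it *private*, so that it is not available to subclasses, you should use `__`. Python then mangles the name to `_Task__format_note`, which makes accidental access from outside the class harder, although not impossible: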

In [7]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def complete(self, note = ""):
        self.status = "completed"
        self.note = self.__format_note(note)

    def __format_note(self, note):
        formatted_note = note.title() # Returns a title-cased version of the string
        return formatted_note


task = Task("Laundry", "Wash clothes", 3)
task.complete()
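
As a minimal sketch of what the `__` prefix does in practice: Python rewrites (mangles) such a name to `_ClassName__name`, so the method is hidden under its original name but remains reachable under the mangled one. The trimmed-down `Task` below is a simplified stand-in for the class above:

```python
class Task:
    def __init__(self, title):
        self.title = title

    def __format_note(self, note):
        return note.title()

task = Task("Laundry")

try:
    # Outside the class, the original name cannot be found
    task.__format_note("all done")
except AttributeError as e:
    print(f"Oops: {e}")

# The method still exists under its mangled name (shown for illustration only;
# relying on it defeats the purpose of making the method private)
mangled = task._Task__format_note("all done")
print(mangled)
```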

### Creating read-only attributes with the property decorator

Consider the scenario of the `Task` class which exposes a `status` property that we want to make read-only, so that it can only be read directly and can only be updated through the `complete` instance method.

You can do that using the `@property` decorator:

In [11]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency
        self._status = "created"

    @property
    def status(self):
        return self._status

    def complete(self):
        self._status = "completed"


task = Task("Laundry", "Wash clothes", 3)
print(f"task.status={task.status}")

task.complete()
print(f"task.status={task.status}")

try:
    task.status = "reopened"
except Exception as e:
    print(f"Oops: {e}")


task.status=created
task.status=completed
Oops: can't set attribute 'status'


### Using property setters

We've seen that `@property` creates a read-only property for the Task class.

Sometimes, we might need to expose a *managed* mechanism to control how the value of an attribute is set.

Consider the example of a `Task` class which exposes a `status` property that can only be set to one of the values "created", "started", "completed", or "suspended".

In [13]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency
        self._status = "created"

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, value):
        allowed_values = ["created", "started", "completed", "suspended"]
        if value in allowed_values:
            self._status = value
        else:
            print(f"Invalid status: {value}; must be one of {allowed_values}")


task = Task("Laundry", "Wash clothes", 3)
task.status = "suspended"
print(f"task.status = {task.status}")

task.status = "undefined"
print(f"task.status = {task.status}")

task.status = suspended
Invalid status: undefined; must be one of ['created', 'started', 'completed', 'suspended']
task.status = suspended


### Using a property setter to control the values of a property

Enhance the `Task` class so that urgency is an integer between 1 and 5.

In [21]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency  # assign through the setter so the initial value is validated too

    @property
    def urgency(self) -> int:
        return self._urgency

    @urgency.setter
    def urgency(self, value: int):
        if type(value) != int:
            raise TypeError("value must be an int")
        elif value < 1 or value > 5:
            raise ValueError("value must be between 1 and 5")
        else:
            self._urgency = value


task = Task("Laundry", "Wash clothes", 3)
print(f"task urgency: {task.urgency}")

task.urgency = 5
assert task.urgency == 5

try:
    task.urgency = "Highest"
except Exception as e:
    print(f"Ooops: {e}")

try:
    task.urgency = -1
except Exception as e:
    print(f"Ooops: {e}")

try:
    task.urgency = 6
except Exception as e:
    print(f"Ooops: {e}")

task urgency: 3
Ooops: value must be an int
Ooops: value must be between 1 and 5
Ooops: value must be between 1 and 5


### Customizing the string representation for a class instance with `__str__`

You can use the special method `__str__` to provide the string representation of the instance.

Create a `Task` implementation whose `__str__` method returns:

```
Laundry: Wash clothes, urgency level 3
```


In [24]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def __str__(self):
        return f"{self.title}: {self.desc}, urgency level {self.urgency}"


task = Task("Laundry", "Wash clothes", 3)
print(task)

planned_task = Task("Homework", "Physics + Math", 5)
print(f"next task: {planned_task}")

Laundry: Wash clothes, urgency level 3
next task: Homework: Physics + Math, urgency level 5


The built-in `str()` function can be used to get the string representation of an instance, and it invokes the object's `__str__` method behind the scenes:

In [25]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def __str__(self):
        return f"{self.title}: {self.desc}, urgency level {self.urgency}"


task = Task("Laundry", "Wash clothes", 3)
str(task) == "Laundry: Wash clothes, urgency level 3"

True

### Using `__repr__` for the string representation of an instance in the interactive console

When using the interactive console, the special method that is invoked to get the string representation of an instance is `__repr__` instead of `__str__`.

As such, it is customary to define both.

First let's see how `__str__` is not being called in certain circumstances:

In [26]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def __str__(self):
        return f"{self.title}: {self.desc}, urgency level {self.urgency}"

task = Task("Laundry", "Wash clothes", 3)
task

<__main__.Task at 0x7f8e4015b520>

This can be solved by implementing the `__repr__` method:

In [27]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def __str__(self):
        return f"{self.title}: {self.desc}, urgency level {self.urgency}"

    def __repr__(self):
        return f"Task({self.title!r}, {self.desc!r}, {self.urgency})"

task = Task("Laundry", "Wash clothes", 3)
task

Task('Laundry', 'Wash clothes', 3)

| NOTE: |
| :---- |
| `!r` is known as a *conversion flag*. It applies `repr()` to the value before formatting it, which for strings adds the surrounding quotes. |

To call `__repr__` on an instance, use the built-in `repr()` function:

In [None]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def __str__(self):
        return f"{self.title}: {self.desc}, urgency level {self.urgency}"

    def __repr__(self):
        return f"Task({self.title!r}, {self.desc!r}, {self.urgency})"

task = Task("Laundry", "Wash clothes", 3)
assert repr(task) == "Task('Laundry', 'Wash clothes', 3)"

In summary:
+ `__repr__` is intended for debugging and development.
+ `__str__` is intended for showing descriptive information to the users of the code.

| NOTE: |
| :---- |
| Python uses `__repr__` when `__str__` is not implemented, but it is a good practice to implement both. |
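
The fallback described in the note can be sketched with two tiny classes (`OnlyRepr` and `OnlyStr` are hypothetical names used only for this example). The fallback works in one direction only: `str()` falls back to `__repr__`, but `repr()` never falls back to `__str__`:

```python
class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"

class OnlyStr:
    def __str__(self):
        return "a friendly description"

only_repr = OnlyRepr()
only_str = OnlyStr()

# str() and print() fall back to __repr__ when __str__ is not defined
print(str(only_repr))

# repr() ignores __str__ and uses the default representation
# inherited from object
print(repr(only_str))
```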

### Using `__class__` and `__name__` attributes in `__repr__`

The implementation of `__repr__` seen above hardcodes the name of the class.

This can be prevented using `__class__` and `__name__` as seen below:

In [28]:
class Task:
    def __init__(self, title, desc, urgency):
        self.title = title
        self.desc = desc
        self.urgency = urgency

    def __str__(self):
        return f"{self.title}: {self.desc}, urgency level {self.urgency}"

    def __repr__(self):
        return f"{self.__class__.__name__}({self.title!r}, {self.desc!r}, {self.urgency})"

task = Task("Laundry", "Wash clothes", 3)
assert repr(task) == "Task('Laundry', 'Wash clothes', 3)"

### Using inheritance: design considerations

Inheritance comes with additional baggage (such as tight-coupling) so we shouldn't jump into inheritance right away.

Instead, we should spend some time first analyzing the scenario at hand and deciding if using inheritance will pay off.

You should start by studying the similarities and differences in the attributes and methods.

Consider the following scenario in which we need to model two kinds of users: supervisors and subordinates:

![Inheritance](./pics/inheritance.png)

This scenario clearly calls for using inheritance with an `Employee` base class and two subclasses `Supervisor` and `Subordinate`.

As seen in the picture below, the subclasses will feature the attributes and methods from the base class, fostering the DRY principle.

![Inheritance design](./pics/inheritance-design-non-uml.png)

This design process is more important and complex than coding the corresponding classes:

In [29]:
class Employee:
    def __init__(self, name, employee_id):
        self.name = name
        self.employee_id = employee_id

    def login(self):
        print(f"The employee {self.name} has just logged in.")

    def logout(self):
        print(f"The employee {self.name} has just logged out.")

class Supervisor(Employee):
    pass

supervisor = Supervisor("John", "1001")
supervisor.login()
supervisor.logout()

The employee John has just logged in.
The employee John has just logged out.


#### Overriding a subclass method completely and `mro`

Python allows you to override an inherited method in its entirety by simply defining a method with the same name in the subclass:

In [31]:
class Employee:
    def __init__(self, name, employee_id):
        self.name = name
        self.employee_id = employee_id

    def login(self):
        print(f"The employee {self.name} has just logged in.")

    def logout(self):
        print(f"The employee {self.name} has just logged out.")

class Supervisor(Employee):
    def login(self):
        print(f"The supervisor {self.name} has just logged in.")

supervisor = Supervisor("John", "1001")
supervisor.login()
supervisor.logout() # not overridden


The supervisor John has just logged in.
The employee John has just logged out.


*MRO* (Method Resolution Order) dictates the order in which Python searches the class hierarchy for a method: when you call a method on an instance, the first implementation found, starting from the instance's own class, is the one that runs.

You can inspect the MRO by calling the `mro` method on a class:

In [32]:
class Employee:
    def __init__(self, name, employee_id):
        self.name = name
        self.employee_id = employee_id

    def login(self):
        print(f"The employee {self.name} has just logged in.")

    def logout(self):
        print(f"The employee {self.name} has just logged out.")

class Supervisor(Employee):
    def login(self):
        print(f"The supervisor {self.name} has just logged in.")

Supervisor.mro()

[__main__.Supervisor, __main__.Employee, object]

### Overriding a method partially using `super()`

Oftentimes, you will want to override the inherited implementation of a method to enhance it, rather than to change it completely.

In those cases, you'll need to use the built-in `super()` function:

In [33]:
class Employee:
    def __init__(self, name, employee_id):
        self.name = name
        self.employee_id = employee_id

    def login(self):
        print(f"The employee {self.name} has just logged in.")

    def logout(self):
        print(f"The employee {self.name} has just logged out.")

class Supervisor(Employee):
    def login(self):
        print(f"The supervisor {self.name} has just logged in.")

    def logout(self):
        super().logout()
        print("Applying additional logout actions for a supervisor.")

supervisor = Supervisor("John", "1001")
supervisor.logout()     # partially overridden

The employee John has just logged out.
Applying additional logout actions for a supervisor.
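
The same technique is commonly applied to `__init__` when a subclass needs extra attributes. The following is a minimal sketch, in which the `team` attribute is a hypothetical addition that is not part of the design discussed above:

```python
class Employee:
    def __init__(self, name, employee_id):
        self.name = name
        self.employee_id = employee_id

class Supervisor(Employee):
    def __init__(self, name, employee_id, team):
        # Let the base class set the attributes it already knows about
        super().__init__(name, employee_id)
        # team is a hypothetical attribute, specific to supervisors
        self.team = team

supervisor = Supervisor("John", "1001", ["Jane", "Jason"])
print(supervisor.__dict__)
```

Delegating to `super().__init__` avoids duplicating the base class initialization in every subclass.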


### Enumerations using `Enum` classes

While you may be tempted to use regular classes for enumerations, doing so has a couple of drawbacks:

1. The type of the members is not `Direction`, but `int`.
2. You can't iterate over the members.

In [1]:
class Direction:
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3

def move_to(direction: Direction, distance: float):
    ...

assert type(Direction.NORTH) == int

A better solution involves the use of an enumeration class, which does not have any of the aforementioned drawbacks:

In [3]:
from enum import Enum

class Direction(Enum):
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3

def move_to(direction: Direction, distance: float):
    ...

assert type(Direction.NORTH) == Direction

for direction in Direction:
    print(direction)

Direction.NORTH
Direction.EAST
Direction.SOUTH
Direction.WEST


In addition, we're not constrained to use ints, we can use other types such as strings:

In [None]:
from enum import Enum

class Direction(Enum):
    NORTH = "N"
    EAST = "E"
    SOUTH = "S"
    WEST = "W"

We can use `type` and `isinstance` in a consistent manner when using Enum classes:

In [5]:
from enum import Enum

class Direction(Enum):
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3

north = Direction.NORTH

assert type(north) == Direction
assert isinstance(north, Direction)

You can access an enumeration member's name and value:

In [7]:
from enum import Enum

class Direction(Enum):
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3

north = Direction.NORTH

assert north.name == "NORTH"
assert north.value == 0

#### Instantiating an enumerated member from its value

Consider the `Direction` enum used so far, which defines the four possible directions: `NORTH = 0`, `EAST = 1`, `SOUTH = 2`, and `WEST = 3`.

If you need to obtain one of its members (e.g., `SOUTH`) from its value, you only need to call the enumeration class with that value:

In [8]:
from enum import Enum

class Direction(Enum):
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3

direction = Direction(2)
assert direction == Direction.SOUTH

Trying to instantiate from a value that is not part of the enumeration will fail:

In [10]:
from enum import Enum

class Direction(Enum):
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3

try:
    direction = Direction(5)
except Exception as e:
    print(f"Oops: {e}")

Oops: 5 is not a valid Direction
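
A member can also be looked up by its name, using square brackets instead of a call. A minimal sketch, reusing the same `Direction` enum (the name `"UP"` is just an example of a name that is not defined):

```python
from enum import Enum

class Direction(Enum):
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3

# Lookup by name uses item access; lookup by value uses a call
south = Direction["SOUTH"]
print(south)

name_found = True
try:
    Direction["UP"]
except KeyError as e:
    # An unknown name raises KeyError (an unknown value raises ValueError)
    name_found = False
    print(f"Oops: unknown member name {e}")
```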


#### Iterating over all enumeration members

By design, any subclass of `Enum` is iterable:


In [12]:
all_directions = list(Direction)

print(all_directions)

for direction in Direction:
    print(direction)

[<Direction.NORTH: 0>, <Direction.EAST: 1>, <Direction.SOUTH: 2>, <Direction.WEST: 3>]
Direction.NORTH
Direction.EAST
Direction.SOUTH
Direction.WEST


#### Defining methods for an enumeration class

An enumeration class is still a Python class, so you can define applicable methods if required, such as `__str__`:

In [18]:
from enum import Enum

class Direction(Enum):
    NORTH = 0
    EAST = 1
    SOUTH = 2
    WEST = 3

    def __str__(self):
        return self.name.lower()

    def __repr__(self):
        return f"Direction({self.name}: {self.value})"

north = Direction.NORTH
print(repr(north))
print(north)

def move_to(direction: Direction, distance: float) -> None:
    if direction in Direction:
        print(f"Move {direction} for {distance} miles")
    else:
        print(f"Wrong input for direction: {direction}")

move_to(north, 2.5)

Direction(NORTH: 0)
north
Move north for 2.5 miles


### Using `@dataclass` to eliminate boilerplate code

The `@dataclass` decorator available in the `dataclasses` module lets you eliminate the boilerplate associated with the creation of classes that hold values.

Compared with named tuples, which are a lightweight data model, data classes support mutability, can be enriched with custom methods, and support inheritance.

Consider the following example that models a data class for a restaurant bill:

In [20]:
from dataclasses import dataclass

@dataclass
class Bill:
    table_number: int
    meal_amount: float
    served_by: str
    tip_amount: float


bill0 = Bill(5, 60.5, "Jason", 10)
print(f"Today's bill: {bill0}")

bill1 = Bill(7, 15.23, "Jane", 3.5)
print(f"Today's bill: {bill1}")
print(f"Today's bill: {bill0}")

Today's bill: Bill(table_number=5, meal_amount=60.5, served_by='Jason', tip_amount=10)
Today's bill: Bill(table_number=7, meal_amount=15.23, served_by='Jane', tip_amount=3.5)
Today's bill: Bill(table_number=5, meal_amount=60.5, served_by='Jason', tip_amount=10)


Note how `__init__` and `__repr__` have been implemented for us, and that while the field definitions look like class attributes, they're actually instance attributes.

Dataclasses support specifying default values:

In [21]:
from dataclasses import dataclass

@dataclass
class Bill:
    table_number: int
    meal_amount: float
    served_by: str
    tip_amount: float = 0

bill0 = Bill(5, 60.5, "Jason")
print(f"Today's bill: {bill0}")

Today's bill: Bill(table_number=5, meal_amount=60.5, served_by='Jason', tip_amount=0)


Dataclasses look like named tuples, but they're mutable:

In [23]:
from dataclasses import dataclass

@dataclass
class Bill:
    table_number: int
    meal_amount: float
    served_by: str
    tip_amount: float = 0

bill0 = Bill(5, 60.5, "Jason")
bill0.served_by = "Jane"
print(f"Today's bill: {bill0}")

Today's bill: Bill(table_number=5, meal_amount=60.5, served_by='Jane', tip_amount=0)


#### Creating immutable dataclasses

You can create immutable dataclasses by passing an additional argument to the `@dataclass` decorator:

In [22]:
from dataclasses import dataclass

@dataclass(frozen=True)
class Bill:
    table_number: int
    meal_amount: float
    served_by: str
    tip_amount: float = 0

immutable_bill = Bill(5, 60.5, "Jason")
try:
    immutable_bill.served_by = "Jane"
except Exception as e:
    print(f"Oops: {e}")

Oops: cannot assign to field 'served_by'
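
Since a frozen instance cannot be modified, a common way to "change" it is to create a modified copy. A minimal sketch using `replace()` from the `dataclasses` module (the new field values are only illustrative):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Bill:
    table_number: int
    meal_amount: float
    served_by: str
    tip_amount: float = 0

original = Bill(5, 60.5, "Jason")

# replace() builds a new instance, overriding only the given fields
updated = replace(original, served_by="Jane", tip_amount=10)

print(original)
print(updated)
```

The original instance is left untouched, which is the point of freezing it.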


#### Creating hierarchies of dataclasses

At its core, a data class has the same extensibility features as other regular custom classes. However, several aspects of the `@dataclass` decorator must be taken into account.

In [24]:
from dataclasses import dataclass

@dataclass
class BaseBill:
    meal_amount: float

@dataclass
class TippedBill(BaseBill):
    tip_amount: float

bill0 = TippedBill(30, 4.5)
print(bill0)

TippedBill(meal_amount=30, tip_amount=4.5)


Attributes from the base class are inherited by the subclass.

However, you might find problems when the superclass defines default values:

In [28]:
from dataclasses import dataclass

@dataclass
class BaseBill:
    meal_amount: float = 25.95

@dataclass
class TippedBill(BaseBill):
    tip_amount: float

# The TypeError below is raised when the TippedBill class is defined,
# so the following line is never reached
bill0 = TippedBill(30, 4.5)

TypeError: non-default argument 'tip_amount' follows default argument

Thus, in most cases, you may want to avoid setting default values for the superclass, so that you'll have more flexibility to implement your subclasses.

Otherwise, you'll need to provide default values for all the fields of the subclasses too:

In [29]:
from dataclasses import dataclass

@dataclass
class BaseBill:
    meal_amount: float = 25.95

@dataclass
class TippedBill(BaseBill):
    tip_amount: float = 0.99

bill0 = TippedBill(30, 4.5)

### Creating lazy attributes (lazy evaluation)

Lazy evaluation is an implementation paradigm that defers the evaluation of expensive operations until their result is strictly required.

Generators are an application of lazy evaluation: the retrieval of an item is deferred until it is required, instead of materializing a huge, potentially memory-hungry list.
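
As a quick reminder of that behavior, here is a minimal sketch with a hypothetical `squares` generator. Creating the generator computes nothing; each item is produced only when it is requested:

```python
def squares(limit):
    for n in range(limit):
        print(f"computing the square of {n}")
        yield n * n

# Nothing is computed here, even though the range is huge
lazy_squares = squares(1_000_000)

# Each call to next() computes exactly one item
first = next(lazy_squares)
second = next(lazy_squares)
print(first, second)
```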

Let's consider the following scenario involving a social media app in which a user can follow other users. The app provides the following capabilities:
+ Viewing a user's followers.
+ Getting a user's detailed profile by tapping on the user's thumbnail.

The *stubbed* backend implementation for such capabilities would be like:

In [74]:
import time

class User:
    def __init__(self, username) -> None:
        self.username = username
        self.profile_data = self._get_profile_data()
        print(f"### {username} created")

    def _get_profile_data(self):
        print("*** retrieving data from the server and loading it in memory")
        time.sleep(1)
        fetched_data = "(expensive to retrieve data here)"
        return fetched_data

def get_followers(username):
    print(f"*** retrieving followers from server for user {username}")
    usernames_fetched = ["Jason", "Florence", "Margot"]
    followers = [User(username) for username in usernames_fetched]
    return followers

print(">>> Retrieving followers for Emma")
start_time = time.time()
followers = get_followers("Emma")
print(f">>> Followers retrieved in {time.time() - start_time} sec")


>>> Retrieving followers for Emma
*** retrieving followers from server for user Emma
*** retrieving data from the server and loading it in memory
### Jason created
*** retrieving data from the server and loading it in memory
### Florence created
*** retrieving data from the server and loading it in memory
### Margot created
>>> Followers retrieved in 3.0050675868988037 sec


A simple `get_followers` operation takes a long time (about 3 seconds) because we're loading the profile data for each and every follower up front. That is wasteful, as the user might never ask for the followers' details.

We'll see two different techniques to solve it:
+ [overriding `__getattr__` special method to implement lazy attributes](#overriding-__getattr__-to-implement-lazy-attributes)
+ [implementing a property as a lazy attribute](#implementing-a-property-as-a-lazy-attribute)

#### Overriding `__getattr__` to implement lazy attributes

There is a set of special methods (like `__str__` and `__repr__`) that we can override to provide custom behavior.

One of those methods is `__getattr__`, which lets you customize access to instance attributes and can be used to implement lazy evaluation.

For custom classes, the instance attributes are stored in a dictionary that can be accessed through `__dict__`. This dictionary uses the attribute names as keys and the attribute values as the corresponding values.

If the attribute is not found there (nor through the normal lookup on the class and its base classes), the special method `__getattr__` gets called as a fallback mechanism. In its implementation you can either return a value for the requested attribute or raise an `AttributeError`. Note that if the method simply falls through without returning anything, the attribute evaluates to `None` instead of raising an error.

As a result:

In [79]:
import time

class User:
    def __init__(self, username) -> None:
        self.username = username
        print(f"### {username} created")

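    def __getattr__(self, item):
        # only called when the normal attribute lookup fails
        print(f"*** __getattr__ called for {item}")
        if item == "profile_data":
            profile_data = self._get_profile_data()
            setattr(self, item, profile_data)  # cache it in the instance __dict__
            return profile_data
        # any other missing attribute is a genuine error
        raise AttributeError(f"{type(self).__name__!r} object has no attribute {item!r}")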


    def _get_profile_data(self):
        print("*** retrieving data from the server and loading it in memory")
        time.sleep(1)
        fetched_data = "(expensive to retrieve data here)"
        return fetched_data

def get_followers(username):
    print(f"*** retrieving followers from server for user {username}")
    usernames_fetched = ["Jason", "Florence", "Margot"]
    followers = [User(username) for username in usernames_fetched]
    return followers

print(">>> Retrieving followers for Emma")
start_time = time.time()
followers = get_followers("Emma")
print(f">>> Followers retrieved in {time.time() - start_time} sec")


emma = User("Emma")
print("\n>>> Retrieving Emma's profile data")
start_time = time.time()
profile_data = emma.profile_data
print(profile_data)
print(f">>> Emma's profile data retrieved in {time.time() - start_time:.2f} sec")


print("\n>>> Retrieving Emma's profile data (again)")
start_time = time.time()
profile_data = emma.profile_data
print(profile_data)
print(f">>> Emma's profile data retrieved in {time.time() - start_time:.2f} sec")


>>> Retrieving followers for Emma
*** retrieving followers from server for user Emma
### Jason created
### Florence created
### Margot created
>>> Followers retrieved in 0.00011515617370605469 sec
### Emma created

>>> Retrieving Emma's profile data
*** __getattr__ called for profile_data
*** retrieving data from the server and loading it in memory
(expensive to retrieve data here)
>>> Emma's profile data retrieved in 1.00 sec

>>> Retrieving Emma's profile data (again)
(expensive to retrieve data here)
>>> Emma's profile data retrieved in 0.00 sec


Now getting the followers takes only a fraction of a second because we no longer perform the expensive operation of retrieving the user details for each of the followers.

Note that:
+ in the `__init__` method we no longer set `profile_data`. It is handled through `__getattr__`.
+ in the `__getattr__` method we check whether `"profile_data"` is being requested. If so, we run the expensive operation and use `setattr` to store the result on the instance, so that later accesses find the attribute through the normal lookup and never reach `__getattr__` again (it's cached!).

#### Implementing a property as a lazy attribute

In this section we will leverage the `@property` decorator, which we already know can be used to implement getters and setters. In a way, it lets us intercept access to an attribute.

In [82]:
import time

class User:
    def __init__(self, username) -> None:
        self.username = username
        self._profile_data = None
        print(f"### {username} created")

    @property
    def profile_data(self):
        # fetch on first access only, then reuse the cached value
        if self._profile_data is None:
            self._profile_data = self._get_profile_data()
        return self._profile_data

    def _get_profile_data(self):
        print("*** retrieving data from the server and loading it in memory")
        time.sleep(1)
        fetched_data = "(expensive to retrieve data here)"
        return fetched_data

def get_followers(username):
    print(f"*** retrieving followers from server for user {username}")
    usernames_fetched = ["Jason", "Florence", "Margot"]
    followers = [User(username) for username in usernames_fetched]
    return followers

print(">>> Retrieving followers for Emma")
start_time = time.time()
followers = get_followers("Emma")
print(f">>> Followers retrieved in {time.time() - start_time} sec")

emma = User("Emma")
print("\n>>> Retrieving Emma's profile data")
start_time = time.time()
profile_data = emma.profile_data
print(profile_data)
print(f">>> Emma's profile data retrieved in {time.time() - start_time:.2f} sec")


print("\n>>> Retrieving Emma's profile data (again)")
start_time = time.time()
profile_data = emma.profile_data
print(profile_data)
print(f">>> Emma's profile data retrieved in {time.time() - start_time:.2f} sec")


>>> Retrieving followers for Emma
*** retrieving followers from server for user Emma
### Jason created
### Florence created
### Margot created
>>> Followers retrieved in 0.0001533031463623047 sec
### Emma created

>>> Retrieving Emma's profile data
*** retrieving data from the server and loading it in memory
(expensive to retrieve data here)
>>> Emma's profile data retrieved in 1.00 sec

>>> Retrieving Emma's profile data (again)
(expensive to retrieve data here)
>>> Emma's profile data retrieved in 0.00 sec


Again, the followers are now retrieved in a fraction of a second, when it used to take 3 seconds.

The caching mechanism is also in place: the second access returns the stored value without contacting the server.
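
The standard library also offers `functools.cached_property` (Python 3.8+), which packages the same "compute on first access, then cache" idea. The sketch below is one possible variant of the `User` class, not a replacement for the exercise above; the shortened `time.sleep(0.1)` and the returned string are placeholders for the expensive call:

```python
import time
from functools import cached_property

class User:
    def __init__(self, username):
        self.username = username

    @cached_property
    def profile_data(self):
        # runs once per instance; the result is stored in the instance __dict__
        time.sleep(0.1)  # stand-in for the expensive server call
        return f"(profile data for {self.username})"

emma = User("Emma")
print("profile_data" in vars(emma))   # False: not computed yet
print(emma.profile_data)              # computed now
print("profile_data" in vars(emma))   # True: later accesses reuse the stored value
```

Unlike the `@property` version, no `_profile_data` placeholder attribute is needed, and a legitimately computed `None` would also be cached.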

### Using `type` introspection to create flexible methods and functions

Consider the following list of task objects:

```python
tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]
```

Create a function `filter_tasks(tasks, by_urgency)` that filters the tasks based on the given argument `by_urgency`. That argument can be either a single value such as `3` or a list such as `[3, 4, 5]`.

| HINT: |
| :---- |
| Use `type` to interrogate the type. |

In [5]:
from dataclasses import dataclass

@dataclass
class Task:
    title: str
    desc: str
    urgency: int

def filter_tasks(tasks, by_urgency):
    if type(by_urgency) == int:
        result = [t for t in tasks if t.urgency == by_urgency]
    elif type(by_urgency) == list:
        result = [t for t in tasks if t.urgency in by_urgency]
    else:
        raise TypeError(f"by_urgency requires int or list but got {type(by_urgency)}")
    return result


tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]

print(filter_tasks(tasks, 2))
print(filter_tasks(tasks, [2, 3]))

try:
    filter_tasks(tasks, "Highest")
except Exception as e:
    print(f"Oops: {e}")


[Task(title='Toaster', desc='Clean the toaster', urgency=2)]
[Task(title='Toaster', desc='Clean the toaster', urgency=2), Task(title='Floor', desc='Mop the floor', urgency=3), Task(title='Laundry', desc='Wash clothes', urgency=3)]
Oops: by_urgency requires int or list but got <class 'str'>


Note that with `type` you can use the `is` operator as in:

```python
if type(by_urgency) is int:
    ...
```

### Using `isinstance` introspection to create flexible methods and functions

Consider the following list of task objects:

```python
tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]
```

Create a function `filter_tasks(tasks, by_urgency)` that filters the tasks based on the given argument `by_urgency`. That argument can be a single value such as `3`, a list such as `[3, 4, 5]`, or a tuple such as `(3, 4, 5)`.

| HINT: |
| :---- |
| Use `isinstance` to interrogate the type. |

`isinstance` is similar to `type`, but it's the preferred approach for checking an object's type because of its flexibility.

```python
assert isinstance(4, int)
assert isinstance([4, 5], list)
assert isinstance([4, 5], (int, list))
```

The first argument is the object to be checked and the second is either a type or a tuple of types. In the latter case, the check succeeds if the object is an instance of *any* of the listed types (an *or* comparison).

Additionally, `type` does not consider the class hierarchy, while `isinstance` does (see next exercise).

As a result:

In [6]:
from dataclasses import dataclass

@dataclass
class Task:
    title: str
    desc: str
    urgency: int

def filter_tasks(tasks, by_urgency):
    if isinstance(by_urgency, int):
        result = [t for t in tasks if t.urgency == by_urgency]
    elif isinstance(by_urgency, (list, tuple)):
        result = [t for t in tasks if t.urgency in by_urgency]
    else:
        raise TypeError(f"by_urgency requires int, list or tuple but got {type(by_urgency)}")
    return result


tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]

print(filter_tasks(tasks, 2))
print(filter_tasks(tasks, [2, 3]))
print(filter_tasks(tasks, (2, 3)))

try:
    filter_tasks(tasks, "Highest")
except Exception as e:
    print(f"Oops: {e}")


[Task(title='Toaster', desc='Clean the toaster', urgency=2)]
[Task(title='Toaster', desc='Clean the toaster', urgency=2), Task(title='Floor', desc='Mop the floor', urgency=3), Task(title='Laundry', desc='Wash clothes', urgency=3)]
[Task(title='Toaster', desc='Clean the toaster', urgency=2), Task(title='Floor', desc='Mop the floor', urgency=3), Task(title='Laundry', desc='Wash clothes', urgency=3)]
Oops: by_urgency requires int, list or tuple but got <class 'str'>


### Using `isinstance` with class hierarchies

Consider the following class hierarchy consisting of a `User` base class and a `Supervisor` subclass.

Create an instance of the subclass named `supervisor`.

Perform the following comparisons:
+ `type(supervisor) is User`
+ `type(supervisor) is Supervisor`
+ `isinstance(supervisor, User)`
+ `isinstance(supervisor, Supervisor)`

In [7]:
class User:
    pass

class Supervisor(User):
    pass

supervisor = Supervisor()

comparisons = [
    type(supervisor) is User,
    type(supervisor) is Supervisor,
    isinstance(supervisor, User),
    isinstance(supervisor, Supervisor)
]

assert comparisons == [False, True, True, True]
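
The same rule applies to built-in types. In particular, `bool` is a subclass of `int`, so an `isinstance(value, int)` check also accepts `True` and `False`. The sketch below shows the effect; the helper `is_plain_int` is a hypothetical name, shown only as one way to reject booleans when that matters:

```python
def is_plain_int(value):
    """Hypothetical helper: accept ints but reject True/False."""
    return isinstance(value, int) and not isinstance(value, bool)

print(issubclass(bool, int))   # True
print(isinstance(True, int))   # True: isinstance follows the class hierarchy
print(type(True) is int)       # False: type() compares the exact class
print(is_plain_int(3))         # True
print(is_plain_int(True))      # False
```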

### Using generic classes for introspection checks

In the standard library, the `collections.abc` module defines several abstract base classes which can be used to test whether a class provides a particular set of methods (a sort of interface).

> In OOP, an interface represents the defined attributes, functions, methods, classes, and other applicable components of an entity (such as a class or a package) that developers can use.

Consider the following list of task objects:

```python
tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]
```

Create a function `filter_tasks(tasks, by_urgency)` that filters the tasks based on the given argument `by_urgency`. That argument can be either a single value such as `3` or a collection (list, tuple, set, etc.).


The `Collection` abstract class is a sort of interface that defines three special methods:
+ `__contains__` to check whether an item exists in the collection, so that you can do `item in collection`.
+ `__iter__` so that you can do `iter(obj)` to obtain an iterator of the collection.
+ `__len__` so that you can do `len(obj)` and get the number of items in the collection.

Many classes satisfy this interface (list, tuple, set, Pandas' Series, etc.) so it seems to be the best fit for our flexible implementation:

In [8]:
from collections.abc import Collection
from dataclasses import dataclass

@dataclass
class Task:
    title: str
    desc: str
    urgency: int

def filter_tasks(tasks, by_urgency):
    if isinstance(by_urgency, int):
        result = [t for t in tasks if t.urgency == by_urgency]
    elif isinstance(by_urgency, Collection):
        result = [t for t in tasks if t.urgency in by_urgency]
    else:
        raise TypeError(f"by_urgency requires int or collection but got {type(by_urgency)}")
    return result


tasks = [
    Task("Toaster", "Clean the toaster", 2),
    Task("Camera", "Export photos", 4),
    Task("Homework", "Physics and math", 5),
    Task("Floor", "Mop the floor", 3),
    Task("Internet", "Upgrade plan", 5),
    Task("Laundry", "Wash clothes", 3),
    Task("Museum", "Egypt exhibit", 4),
    Task("Utility", "Pay bills", 5)
]

print(filter_tasks(tasks, 2))
print(filter_tasks(tasks, [2, 3]))
print(filter_tasks(tasks, (2, 3)))
print(filter_tasks(tasks, {2, 3}))

try:
    filter_tasks(tasks, "Highest")
except Exception as e:
    print(f"Oops: {e}")


[Task(title='Toaster', desc='Clean the toaster', urgency=2)]
[Task(title='Toaster', desc='Clean the toaster', urgency=2), Task(title='Floor', desc='Mop the floor', urgency=3), Task(title='Laundry', desc='Wash clothes', urgency=3)]
[Task(title='Toaster', desc='Clean the toaster', urgency=2), Task(title='Floor', desc='Mop the floor', urgency=3), Task(title='Laundry', desc='Wash clothes', urgency=3)]
[Task(title='Toaster', desc='Clean the toaster', urgency=2), Task(title='Floor', desc='Mop the floor', urgency=3), Task(title='Laundry', desc='Wash clothes', urgency=3)]
Oops: 'in <string>' requires string as left operand, not int


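Note that we've used `isinstance(by_urgency, Collection)`. Built-in types such as `list` don't inherit from `Collection` directly. Instead, the abstract base classes in `collections.abc` recognize any class that implements the required methods (or that has been registered with them) as a *virtual subclass*, so the `isinstance` check succeeds.

Note also that `str` is itself a `Collection`. That is why `filter_tasks(tasks, "Highest")` did not reach our own `TypeError`: the string took the `Collection` branch and the error was raised by the `in` test instead.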
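
Both points can be checked with a few lines. The sketch below also shows one possible way to keep strings out of the `Collection` branch; the helper `normalize_urgency` is a hypothetical name, not part of the exercise:

```python
from collections.abc import Collection

print(issubclass(list, Collection))        # True: recognized as a virtual subclass
print(Collection in list.__mro__)          # False: list does not actually inherit from it
print(isinstance("Highest", Collection))   # True: strings are collections too

def normalize_urgency(by_urgency):
    """Hypothetical helper: return the accepted urgency values as a set."""
    if isinstance(by_urgency, int):
        return {by_urgency}
    # explicitly exclude text and bytes, which also satisfy Collection
    if isinstance(by_urgency, Collection) and not isinstance(by_urgency, (str, bytes)):
        return set(by_urgency)
    raise TypeError(f"by_urgency requires int or collection but got {type(by_urgency)}")

print(normalize_urgency(3))
print(normalize_urgency((2, 3)))

try:
    normalize_urgency("Highest")
except TypeError as e:
    print(f"Oops: {e}")
```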

### Checking if an object is iterable using abc (abstract base classes)

In a previous exercise we did:

```python
def is_iterable(obj):
    try:
        _ = iter(obj)
    except TypeError:
        return False
    else:
        return True
    
print(is_iterable(5))
print(is_iterable([1, 2, 3]))
print(is_iterable("Hello"))
print(is_iterable((1, 2, "Hello")))
print(is_iterable({1: "one", 2: "two"}))
```

Improve the implementation using abstract base classes:

In [10]:
from collections.abc import Iterable

def is_iterable(obj):
    return isinstance(obj, Iterable)

comparisons = [
    is_iterable(5),
    is_iterable([1, 2, 3]),
    is_iterable("Hello"),
    is_iterable((1, 2, "Hello")),
    is_iterable({1: "one", 2: "two"})
]

assert comparisons == [False, True, True, True, True]
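
One caveat, noted in the `collections.abc` documentation: `isinstance(obj, Iterable)` detects classes that define `__iter__` (or are registered as `Iterable`), but not classes that are iterable only through `__getitem__`. For those, calling `iter(obj)` as in the original implementation remains the reliable test. A small sketch, using a hypothetical `Countdown` class:

```python
from collections.abc import Iterable

class Countdown:
    """Hypothetical class that supports iteration only through __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)  # signals the end of the sequence
        return 3 - index

countdown = Countdown()
print(list(countdown))                   # [3, 2, 1]
print(isinstance(countdown, Iterable))   # False
```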

### Understanding `__new__` and `__del__`

The special methods `__new__` and `__del__` can be overridden in your classes to provide specific logic when an instance is created and when it is about to be destroyed.

In [8]:
class Task:
    def __new__(cls, *args):
        print(f">>> in __new__")
        obj = object.__new__(cls)
        print(f">>> in __new__: instance 0x{id(obj):x} allocated")
        return obj

    def __init__(self, title):
        print(f">>> in __init__ for instance 0x{id(self):x}")
        self.title = title

    def __del__(self):
        print(f">> in __del__ for instance 0x{id(self):x}")


task = Task("homework")
del task  # drop the only reference, which triggers __del__



>>> in __new__
>>> in __new__: instance 0x7f865d4faf80 allocated
>>> in __init__ for instance 0x7f865d4faf80
>> in __del__ for instance 0x7f865d4fa860
>> in __del__ for instance 0x7f865d4faf80


Please note that Jupyter might show additional log statements for `__del__`. The output of a regular Python program should look like:

```python
>>> in __new__
>>> in __new__: instance 0x7f0d7ac43b80 allocated
>>> in __init__ for instance 0x7f0d7ac43b80
>> in __del__ for instance 0x7f0d7ac43b80
```

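The call to the destructor will happen automatically when the number of references to that object reaches zero.

The reference count can be obtained through `sys.getrefcount(obj)`. The value is generally one higher than you might expect, because it includes the temporary reference held by the function's own argument: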

In [9]:
import sys

class Task:
    def __new__(cls, *args):
        print(f">>> in __new__")
        obj = object.__new__(cls)
        print(f">>> in __new__: instance 0x{id(obj):x} allocated")
        return obj

    def __init__(self, title):
        print(f">>> in __init__ for instance 0x{id(self):x}")
        self.title = title

    def __del__(self):
        print(f">> in __del__ for instance 0x{id(self):x}")


task = Task("homework")
print(sys.getrefcount(task))



>>> in __new__
>>> in __new__: instance 0x7f865d4fb850 allocated
>>> in __init__ for instance 0x7f865d4fb850
2
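
To see that `__del__` depends on the last reference rather than on the `del` statement itself, consider the following sketch. It assumes CPython, where objects are finalized as soon as their reference count drops to zero; the `Tracked` class and the `events` list are hypothetical names used for illustration:

```python
events = []  # records finalizer calls

class Tracked:
    """Hypothetical class that records when its finalizer runs."""

    def __del__(self):
        events.append("finalized")

first = Tracked()
second = first    # a second reference to the same object

del first         # one reference is still alive, so nothing happens
print(events)     # []

del second        # the last reference is gone
print(events)     # ['finalized']
```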


You can also use the `globals()` function to check whether your variable is defined in the global namespace:

In [10]:
import sys

class Task:
    def __new__(cls, *args):
        print(f">>> in __new__")
        obj = object.__new__(cls)
        print(f">>> in __new__: instance 0x{id(obj):x} allocated")
        return obj

    def __init__(self, title):
        print(f">>> in __init__ for instance 0x{id(self):x}")
        self.title = title

    def __del__(self):
        print(f">> in __del__ for instance 0x{id(self):x}")


task = Task("homework")
print(sys.getrefcount(task))

assert "task" in globals()

>>> in __new__
>>> in __new__: instance 0x7f865d4fa3b0 allocated
>>> in __init__ for instance 0x7f865d4fa3b0
>> in __del__ for instance 0x7f865d4fb850
2


### Creating a shallow copy of an object

To create a shallow (as opposed to deep) copy of an object, you can use the `copy` module, which provides copy-related functionality.

> a shallow copy only copies the *outermost* data container; the objects it refers to are shared rather than duplicated.

In [15]:
from copy import copy

class Task:
    def __init__(self, title, desc):
        self.title = title
        self.desc = desc

    def __repr__(self):
        return f"Task(title={self.title!r}, desc={self.desc!r})"

    def save_data(self):
        """Update the database"""
        pass


task = Task("Homework", "Physics + Math")
print(repr(task))

task_bkp = copy(task)
print(repr(task_bkp))

# Now we modify the original
task.title = task.title + "(mod)"
print()
print(repr(task))
print(repr(task_bkp))


Task(title='Homework', desc='Physics + Math')
Task(title='Homework', desc='Physics + Math')

Task(title='Homework(mod)', desc='Physics + Math')
Task(title='Homework', desc='Physics + Math')


Note, however, that this is a shallow copy (as opposed to a deep one), meaning that for attributes holding references to mutable objects you are copying the reference rather than the data:

In [22]:
from copy import copy

class Task:
    def __init__(self, title, desc, tags=None):
        self.title = title
        self.desc = desc
        self.tags = [] if tags is None else tags

    def __repr__(self):
        return f"Task(title={self.title!r}, desc={self.desc!r}, tags={self.tags!r})"

    def save_data(self):
        """Update the database"""
        pass


task = Task("Homework", "Physics + Math", ["boring staff", "school"])
print(repr(task))

task_bkp = copy(task)
print(repr(task_bkp))

# Now we modify the original
task.title = task.title + "(mod)"
task.tags.append("wasn't that bad after all")
print()
print(repr(task))
print(repr(task_bkp))

print()
print(id(task.tags))
print(id(task_bkp.tags))




Task(title='Homework', desc='Physics + Math', tags=['boring staff', 'school'])
Task(title='Homework', desc='Physics + Math', tags=['boring staff', 'school'])

Task(title='Homework(mod)', desc='Physics + Math', tags=['boring staff', 'school', "wasn't that bad after all"])
Task(title='Homework', desc='Physics + Math', tags=['boring staff', 'school', "wasn't that bad after all"])

140215073297344
140215073297344


The last two lines show that the `tags` attributes of the original and the copied task refer to the same list object. Therefore, whenever we modify that list through the original, the copy also sees the change, and the other way around.

### Checking equality with `is` or `==`

`is` compares whether two objects are the same object (identity test). By contrast, `==` compares whether two objects have the same value.

For example, when checking an object against `None` you should use `is`, because `None` is a singleton object and you want to check whether your object and `None` are the very same object.

> `is` should be used when you need to check whether two names refer to the same object (the same memory address). In particular, any comparison with `None` should use `is`.

> `==` should be used when you need to check that the values of two objects are the same, even if they live at different memory addresses.

In [26]:
class Task:
    def __init__(self, title, desc, tags=None):
        self.title = title
        self.desc = desc
        self.tags = [] if tags is None else tags  # needed by __repr__

    def __repr__(self):
        return f"Task(title={self.title!r}, desc={self.desc!r}, tags={self.tags!r})"

task1 = Task("Homework", "Physics + Math")
task2 = Task("Homework", "Physics + Math")

assert task1 is not task2
assert id(task1) != id(task2)
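
The cell above only exercises the identity test. To contrast it with `==`, the sketch below uses a dataclass, which generates an `__eq__` that compares field values, next to a plain class that keeps the default behavior. `PlainTask` is a hypothetical name used for illustration:

```python
from dataclasses import dataclass

@dataclass
class Task:
    title: str
    desc: str

task1 = Task("Homework", "Physics + Math")
task2 = Task("Homework", "Physics + Math")
alias = task1

print(task1 == task2)   # True: same field values
print(task1 is task2)   # False: two distinct objects
print(alias is task1)   # True: two names for the same object

class PlainTask:
    def __init__(self, title):
        self.title = title

# Without a custom __eq__, == falls back to an identity comparison
print(PlainTask("Homework") == PlainTask("Homework"))   # False
```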


### Creating a deep copy of an object

In a deep copy, we copy not only the *outermost* data container, but also perform recursive copies of the interior objects.

This can also be performed using the `copy` module:

In [27]:
from copy import deepcopy

class Task:
    def __init__(self, title, desc, tags=None):
        self.title = title
        self.desc = desc
        self.tags = [] if tags is None else tags

    def __repr__(self):
        return f"Task(title={self.title!r}, desc={self.desc!r}, tags={self.tags!r})"

    def save_data(self):
        """Update the database"""
        pass


task = Task("Homework", "Physics + Math", ["boring staff", "school"])
print(repr(task))

task_bkp = deepcopy(task)
print(repr(task_bkp))

# Now we modify the original
task.title = task.title + "(mod)"
task.tags.append("wasn't that bad after all")
print()
print(repr(task))
print(repr(task_bkp))

print()
print(id(task.tags))
print(id(task_bkp.tags))

Task(title='Homework', desc='Physics + Math', tags=['boring staff', 'school'])
Task(title='Homework', desc='Physics + Math', tags=['boring staff', 'school'])

Task(title='Homework(mod)', desc='Physics + Math', tags=['boring staff', 'school', "wasn't that bad after all"])
Task(title='Homework', desc='Physics + Math', tags=['boring staff', 'school'])

140214917981312
140214917986176


Note how in this case the `tags` lists of the task and its copy have different addresses, so appending to one no longer affects the other.

### Changing a variable in a different scope

Consider the following piece of code:

In [28]:
db_filename = "N/A"

def set_database(db_name):
    db_filename = db_name

set_database("tasks.db")
print(db_filename)

N/A


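This might not be the behavior you'd expect coming from other programming languages: the assignment inside `set_database` creates a new local variable instead of changing the global one. Some IDEs even hint at this by flagging the local `db_filename` as shadowing a name from the outer scope.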

#### Namespaces and Scopes

The mechanism for looking up variables in Python involves *namespaces*, which track the variables that have been defined. Namespaces can help locate the variable's information.

You can think of namespaces as dictionaries in which the names of the active variables are the keys and the objects they refer to are the values.

Scopes form the boundaries for namespaces, while namespaces constitute the contents.

#### LEGB rule

When looking up a variable, Python examines the namespace that is associated with a given scope.

There are different levels of scopes for the lookup order, known as the LEGB rule.

> LEGB rule dictates the order of resolving a variable in Python, from Local (L), to enclosing (E), global (G), and built-in (B).

A module forms a global scope. Above the global scope, the built-in scope holds the namespace for all the built-in functions and classes. Within the module you can define a class or a function, each of which forms a local scope.

For functions defined within functions, the local scope of the outer function is known as the enclosing scope.

The LEGB rule applies in the sequential order for variable resolution. Python first searches in its local scope. If the name is resolved, the corresponding value is used. If not, Python continues searching the enclosing scope. If the name is resolved, the value is used &mdash; and so on for the global and built-in scopes sequentially.

If a name can't be resolved after Python checks all these scopes, a `NameError` is raised.

![LEGB](pics/legb.png)

The following picture illustrates the different scopes:

![scopes](pics/scopes.png)

In [None]:

num = int("5")

def outer_fn(info):
    print(info)
    x = 100
    def inner_fn():
        number_str = f"number: {num}"
        x_str = f"x: {x}"
        print(number_str, x_str)
    return inner_fn

inner = outer_fn("test")
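
The cell above builds `inner_fn` but never calls it. As a complementary sketch, the following snippet resolves one name at each LEGB level and then shows the `NameError` raised when a name is found in none of them. All the names are hypothetical and chosen only to label the scope they live in:

```python
label = "global"                  # G: module-level name

def outer():
    label_outer = "enclosing"     # E: local to outer, enclosing for inner

    def inner():
        label_inner = "local"     # L: local to inner
        # len is found last, in the built-in scope (B)
        return [label_inner, label_outer, label, len("abc")]

    return inner

resolved = outer()()
print(resolved)                   # ['local', 'enclosing', 'global', 3]

try:
    undefined_name                # not defined in any of the four scopes
except NameError as e:
    print(f"Oops: {e}")
```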

### Accessing the namespaces `globals()`, `locals()`, ...

You can use the `globals()` and `locals()` functions to view the namespaces:

| NOTE: |
| :---- |
| We use `list()` to show only the variable names. |

In [34]:
db_filename = "N/A"

def set_database(db_name):
    print(f">> in set_database: globals: ", list(globals()))
    print(f">> in set_database: locals (before): ", list(locals()))
    db_filename = db_name
    print(f">> in set_database: locals (after) : ", list(locals()))

set_database("tasks.db")
print(db_filename)



>> in set_database: globals:  ['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__builtin__', '__builtins__', '_ih', '_oh', '_dh', 'In', 'Out', 'get_ipython', 'exit', 'quit', 'open', '_', '__', '___', '__vsc_ipynb_file__', '_i', '_ii', '_iii', '_i1', 'Task', '_i2', '_i3', '_i4', '_i5', '_i6', '_i7', '_i8', '_i9', 'sys', 'task', '_i10', '_i11', '_i12', '_i13', '_i14', 'copy', 'task_bkp', '_i15', '_i16', '_i17', '_i18', '_i19', '_i20', '_i21', '_i22', '_i23', 'num1', 'num2', '_i24', '_i25', 'task1', 'task2', '_i26', '_i27', 'deepcopy', '_i28', 'db_filename', 'set_database', '_i29', '_i30', '_i31', '_i32', '_i33', '_i34']
>> in set_database: locals (before):  ['db_name']
>> in set_database: locals (after) :  ['db_name', 'db_filename']
N/A


Note that `db_filename` is listed in the globals. Note also that after the statement `db_filename = db_name` is executed, a new local variable `db_filename` is added, thus shadowing the global one.
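
A related pitfall: Python decides at compile time that a name assigned anywhere in a function body is local to the whole function. Reading the name before the assignment therefore fails, even though a global with the same name exists. The function `show_then_set` below is a hypothetical variation of `set_database` that illustrates it:

```python
db_filename = "N/A"

def show_then_set(db_name):
    # The assignment two lines below makes db_filename local for the WHOLE body,
    # so this read fails instead of falling back to the global variable.
    print(db_filename)
    db_filename = db_name

try:
    show_then_set("tasks.db")
    outcome = "no error"
except UnboundLocalError as e:
    outcome = "UnboundLocalError"
    print(f"Oops: {e}")

print(db_filename)   # N/A: the global variable was never touched
```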


#### Changing a global variable with `global`

In order to fix it, you just need to use the `global` keyword:

In [35]:
db_filename = "N/A"

def set_database(db_name):
    global db_filename
    print(f">> in set_database: globals: ", list(globals()))
    print(f">> in set_database: locals (before): ", list(locals()))
    db_filename = db_name
    print(f">> in set_database: locals (after) : ", list(locals()))

set_database("tasks.db")
print(db_filename)

>> in set_database: globals:  ['__name__', '__doc__', '__package__', '__loader__', '__spec__', '__builtin__', '__builtins__', '_ih', '_oh', '_dh', 'In', 'Out', 'get_ipython', 'exit', 'quit', 'open', '_', '__', '___', '__vsc_ipynb_file__', '_i', '_ii', '_iii', '_i1', 'Task', '_i2', '_i3', '_i4', '_i5', '_i6', '_i7', '_i8', '_i9', 'sys', 'task', '_i10', '_i11', '_i12', '_i13', '_i14', 'copy', 'task_bkp', '_i15', '_i16', '_i17', '_i18', '_i19', '_i20', '_i21', '_i22', '_i23', 'num1', 'num2', '_i24', '_i25', 'task1', 'task2', '_i26', '_i27', 'deepcopy', '_i28', 'db_filename', 'set_database', '_i29', '_i30', '_i31', '_i32', '_i33', '_i34', '_i35']
>> in set_database: locals (before):  ['db_name']
>> in set_database: locals (after) :  ['db_name']
tasks.db


#### Changing an enclosing variable with `nonlocal`

The `nonlocal` keyword lets a nested function rebind a variable that lives in an enclosing function's scope.

| NOTE: |
| :---- |
| `nonlocal` is used less often than `global`, because enclosing scopes only exist when using inner functions, whereas every module has a global scope. |

Consider the following code in which the technique is explained:

In [1]:
def change_text(using_nonlocal: bool):
    text = "N/A"
    def inner_fun0():
        text = "No nonlocal"

    def inner_fun1():
        nonlocal text
        text = "Using nonlocal"

    inner_fun1() if using_nonlocal else inner_fun0()
    return text

print(change_text(using_nonlocal=False))    # N/A
print(change_text(using_nonlocal=True))     # Using nonlocal



N/A
Using nonlocal


The function `inner_fun0` defines a new variable named `text` that hides the `text` variable defined in the outer function. To change the outer variable's value instead, as `inner_fun1` does, you need the `nonlocal` keyword.
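
A common practical use of `nonlocal` is keeping state between calls of a closure. The sketch below, with the hypothetical names `make_counter` and `increment`, shows the idea:

```python
def make_counter():
    count = 0  # lives in the enclosing scope of increment

    def increment():
        nonlocal count   # rebind the enclosing variable instead of creating a local
        count += 1
        return count

    return increment

counter = make_counter()
print(counter(), counter(), counter())   # 1 2 3

other = make_counter()   # each call to make_counter gets its own count
print(other())           # 1
```

Without the `nonlocal` declaration, `count += 1` would raise an `UnboundLocalError`, because the assignment would make `count` local to `increment`.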

### Callability and the `callable` built-in function

We say that an object is *callable* if it can be used with the call operator `()`.

Python has a built-in function `callable`, that can check an object's callability.

As you may expect, a function is `callable`:

In [2]:
def doubler(x):
    return x * 2

assert callable(doubler)

Callable means an object can be called. When a function expects a callable, such as in `sorted(key: callable)`, you can pass a function or a class.

If you have a custom class that implements `__call__`, then instances of the class are callable.
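
As a minimal sketch of that last point, the hypothetical `Multiplier` class below implements `__call__`, so its instances can be used wherever a function is expected:

```python
class Multiplier:
    """Hypothetical class whose instances behave like functions."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor

tripler = Multiplier(3)
print(callable(tripler))            # True
print(tripler(5))                   # 15
print(list(map(tripler, [1, 2])))   # [3, 6]
print(callable(object()))           # False: plain objects define no __call__
```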

There are nuances between functions and objects that are callable.

One of the easiest ways to inspect an object is to pass it to the `print` function:

In [4]:
def do_something():
    pass

print(do_something) # a function

print(sum) # something different... a built-in that is callable
print(map) # wait... what? a class?

<function do_something at 0x7f193fa46b00>
<built-in function sum>
<class 'map'>


Note that `map` is actually a class, and classes are callable. As with `list`, `dict`, etc., calling `map` returns a `map` instance.

Because classes are callable, you can pass a class wherever a callable is expected, which comes in handy in some scenarios.

Consider the following list of Poker cards we need to sort:

In [6]:
cards = [10, 2, "J", "A"]

try:
    print(sorted(cards))
except Exception as e:
    print(f"Oops: {e}")

print(sorted(cards, key=str))

Oops: '<' not supported between instances of 'str' and 'int'
[10, 2, 'A', 'J']


Passing `str` avoids the error, but it compares the cards as text, which gives us the wrong order. So we need to do some additional work:

In [8]:
class PokerOrder(int):
    def __new__(cls, x):
        numbers_mapping = {"J": 11, "Q": 12, "K": 13, "A": 14}
        casted_number = numbers_mapping.get(x, x)
        return super().__new__(cls, casted_number)

cards = [10, 2, "J", "A"]

assert callable(PokerOrder) == True

print(sorted(cards, key=PokerOrder))

[2, 10, 'J', 'A']


As the class constructor is callable we can pass it as the argument to `key`, which results in the cards being sorted correctly.

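`PokerOrder` is a class that inherits from `int`. In its `__new__` we map face cards to their numeric rank and create an `int` with that value. `sorted` compares those integers, but the list it returns still holds the original card values.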

### Creating decorators as classes

Recall that so far we have created decorators using functions, as seen below:

In [10]:
import functools
import time

def logging_time(func):
    @functools.wraps(func)
    def logger(*args, **kwargs):
        start_t = time.time()
        result = func(*args, **kwargs)
        print(f"Invocation of {func.__name__!r} took {time.time() - start_t:.5f} sec")
        return result

    return logger

@logging_time
def say_hello(person):
    """Greet someone"""
    print(f"Hello to {person}")

say_hello("Jason Isaacs!")

Hello to Jason Isaacs!
Invocation of 'say_hello' took 0.00008 sec


Because classes are also callable, we can create decorators in the form of a custom class:

In [12]:
import functools
import time

class TimeLogger:
    def __init__(self, func):
        @functools.wraps(func)
        def logger(*args, **kwargs):
            start_t = time.time()
            result = func(*args, **kwargs)
            print(f"Invocation of {func.__name__!r} took {time.time() - start_t:.5f} sec")
            return result
        self._logger = logger

    def __call__(self, *args, **kwargs):
        return self._logger(*args, **kwargs)

@TimeLogger
def calculate_sum(n):
    return sum(range(n))

result = calculate_sum(100_000)

Invocation of 'calculate_sum' took 0.00085 sec


Note that by implementing `__call__` we make the instances of `TimeLogger` callable. The decorator syntax replaces `calculate_sum` with a `TimeLogger` instance, and calling that instance runs the `logger` function, which in turn calls the original function.
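
One reason to prefer a class over a function for a decorator is that the instance can carry state. The sketch below is an illustration under that assumption: `CallCounter` and `add` are hypothetical names, and `functools.update_wrapper` is used so that the instance exposes the metadata of the wrapped function.

```python
import functools


class CallCounter:
    """Hypothetical decorator class that counts calls to the wrapped function."""

    def __init__(self, func):
        # Copy __name__, __doc__, etc. from the function onto this instance
        functools.update_wrapper(self, func)
        self._func = func
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self._func(*args, **kwargs)


@CallCounter
def add(a, b):
    """Add two numbers"""
    return a + b


total = add(1, 2) + add(3, 4)
print(total, add.calls, add.__name__)  # 10 2 add
```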

## Section 49 &mdash; Processing JSON

When your Python applications interact with other systems via JSON, you need to know how to convert data between JSON and Python.

JSON data types have corresponding native Python data structures. Most of the conversions are straightforward, except for numbers: JSON doesn't differentiate integers from floats, but Python does:


| JSON type | Python type |
| :-------- | :---------- |
| String: "one" | str: "one" |
| Number: 123<br>Number: 123.45 | int: 123<br>float: 123.45 |
| Boolean: true<br>Boolean: false | bool: True<br>bool: False |
| Array: [1, 2] | list: [1, 2] |
| Object: {"one": 1} | dict: {"one": 1} |
| Null: null | NoneType: None |
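
The mapping in the table can be checked with a quick round trip. This is a minimal sketch; the `original` dictionary is made up for illustration:

```python
import json

original = {
    "name": "one",
    "count": 123,
    "ratio": 123.45,
    "ok": True,
    "tags": ("a", "b"),
    "extra": None,
}

encoded = json.dumps(original)
decoded = json.loads(encoded)

# ints and floats keep their Python types through the round trip
assert type(decoded["count"]) is int
assert type(decoded["ratio"]) is float

# JSON has no tuple type: tuples are written as arrays and come back as lists
assert decoded["tags"] == ["a", "b"]

print(encoded)
```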


### Deserializing/Unmarshalling JSON string

Deserializing JSON means reading JSON data into Python.

Consider the following example, in which a JSON string representing an array of tasks is transformed into the corresponding Python structure.

The `json.loads` function, available in the `json` module, deserializes a JSON string into the corresponding Python object: here, a list of dictionaries.

In [32]:
import json

tasks_json = """
[
    {
        "title": "Laundry",
        "desc": "Wash clothes",
        "urgency": 3
    },
    {
        "title": "Homework",
        "desc": "Physics + Math",
        "urgency": 5
    }
]
"""

tasks = json.loads(tasks_json)
assert tasks == [
    {
        "title": "Laundry", "desc": "Wash clothes", "urgency": 3
    },
    {
        "title": "Homework", "desc": "Physics + Math", "urgency": 5
    }
]



The `json.loads` function can also parse JSON data types other than objects and arrays:

In [41]:
def print_value_and_type(x):
    if type(x) == str:
        print(f"value: {x!r}, type: {type(x)}")
    else:
        print(f"value: {x}, type: {type(x)}")

print_value_and_type(json.loads("2.2"))
print_value_and_type(json.loads("2"))
print_value_and_type(json.loads('"2"'))
print_value_and_type(json.loads("true"))
print_value_and_type(json.loads("null"))

value: 2.2, type: <class 'float'>
value: 2, type: <class 'int'>
value: '2', type: <class 'str'>
value: True, type: <class 'bool'>
value: None, type: <class 'NoneType'>


Note, however, that the input must be valid JSON. For example, JSON booleans are lowercase, so `True` is rejected:

In [47]:
def print_value_and_type(x):
    if type(x) == str:
        print(f"value: {x!r}, type: {type(x)}")
    else:
        print(f"value: {x}, type: {type(x)}")

try:
    json.loads("True")
except Exception as e:
    print(f"Oops: {e}")

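# Type hints are not enforced at runtime: both calls succeed,
# even though the second one returns a list rather than a float
result: int = json.loads("1")
result: float = json.loads('[{"num": 1}, {"num": 1.24}]')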

Oops: Expecting value: line 1 column 1 (char 0)
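
When the input comes from an external system, it may help to catch the specific exception that `json.loads` raises, `json.JSONDecodeError`. The sketch below assumes a hypothetical helper named `parse_or_none`:

```python
import json


def parse_or_none(text):
    """Return the parsed JSON value, or None if `text` is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        print(f"Invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}")
        return None


assert parse_or_none("true") is True
assert parse_or_none("True") is None       # JSON booleans are lowercase
assert parse_or_none("{'a': 1}") is None   # JSON strings need double quotes
```

Keep in mind that this helper also returns `None` for the valid JSON text `null`, so it is only suitable when that ambiguity does not matter.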


### Converting a dict object obtained from JSON deserialization into an instance of a dataclass

You can take advantage of dataclasses to convert a Python dict into the corresponding custom class as seen below:

In [33]:
import json
from dataclasses import dataclass

tasks_json = """
[
    {
        "title": "Laundry",
        "desc": "Wash clothes",
        "urgency": 3
    },
    {
        "title": "Homework",
        "desc": "Physics + Math",
        "urgency": 5
    }
]
"""

tasks_dict = json.loads(tasks_json)

@dataclass
class Task:
    title: str
    desc: str
    urgency: int

    @classmethod
    def from_dict(cls, task_dict):
        return cls(**task_dict)


tasks = [Task.from_dict(task_dict) for task_dict in tasks_dict]
print(tasks)

[Task(title='Laundry', desc='Wash clothes', urgency=3), Task(title='Homework', desc='Physics + Math', urgency=5)]


Note how we used `**` in the `from_dict` method:

> Recall that `**kwargs` collects a variable number of keyword arguments into a dictionary. Conversely, the `**` operator unpacks a `dict` object into keyword arguments that you can use to invoke a function.
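
The effect of `**` can be seen in isolation with a plain function. In this sketch, `describe` is a hypothetical function used only for illustration:

```python
def describe(title, desc, urgency):
    return f"{title} ({desc}), urgency {urgency}"


task_dict = {"title": "Laundry", "desc": "Wash clothes", "urgency": 3}

# These two calls are equivalent
unpacked = describe(**task_dict)
explicit = describe(title="Laundry", desc="Wash clothes", urgency=3)
assert unpacked == explicit

# An unexpected key raises TypeError, just like an unexpected keyword argument
try:
    describe(**task_dict, owner="Ann")
except TypeError as e:
    print(f"Oops: {e}")
```

This also means that `from_dict` fails if the JSON object contains a key that is not a field of the dataclass.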

### Serializing Python data to JSON

Serialization is the opposite of deserialization: you start from Python objects and via the serialization mechanism you end up with a JSON string.

The `json` module provides the `dumps` function for serializing Python objects into JSON. Note that non-string dictionary keys, such as `1` below, become strings:

In [49]:
python_data = ['text', False, {"0": None, 1: [1.0, 2.0]}]

json_data = json.dumps(python_data)
print(json_data)
print(repr(json_data))

["text", false, {"0": null, "1": [1.0, 2.0]}]
'["text", false, {"0": null, "1": [1.0, 2.0]}]'


However, by default you cannot serialize an instance of a custom class:

In [51]:
from dataclasses import dataclass
import json

@dataclass
class Task:
    title: str
    desc: str
    urgency: int


homework = Task("Homework", "Physics + Math", 5)

try:
    json_data = json.dumps(homework)
except Exception as e:
    print(f"Oops: {e}")


Oops: Object of type Task is not JSON serializable


One quick solution, especially for dataclasses, is to pass the `default` argument, which lets you configure custom encoding behavior for objects the encoder does not know how to serialize, and have it return the instance's `__dict__`:

In [56]:
from dataclasses import dataclass
import json

@dataclass
class Task:
    title: str
    desc: str
    urgency: int


homework = Task("Homework", "Physics + Math", 5)

json_data = json.dumps(homework, default=lambda t: t.__dict__)
print(repr(json_data))


'{"title": "Homework", "desc": "Physics + Math", "urgency": 5}'
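
For dataclasses specifically, an alternative worth knowing is `dataclasses.asdict`, which converts the instance to a dictionary before serializing (and recurses into nested dataclasses). A minimal sketch reusing the same `Task` class:

```python
from dataclasses import asdict, dataclass
import json


@dataclass
class Task:
    title: str
    desc: str
    urgency: int


homework = Task("Homework", "Physics + Math", 5)

# Convert to a plain dict first, then serialize as usual
json_data = json.dumps(asdict(homework))
print(json_data)
```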


### Prettifying `json.dumps` using indentation

The function `json.dumps` allows you to prettify the string generated using the `indent` argument:

In [58]:
from dataclasses import dataclass
import json

@dataclass
class Task:
    title: str
    desc: str
    urgency: int


homework = Task("Homework", "Physics + Math", 5)

json_data = json.dumps(homework, default=lambda t: t.__dict__, indent=2)
print(json_data)

{
  "title": "Homework",
  "desc": "Physics + Math",
  "urgency": 5
}


### Using `sort_keys` to get the JSON keys sorted alphabetically

You can use `sort_keys` to sort the keys within the JSON string:

In [60]:
from dataclasses import dataclass
import json

@dataclass
class Task:
    title: str
    desc: str
    urgency: int


homework = Task("Homework", "Physics + Math", 5)

json_data = json.dumps(
    homework,
    default=lambda t: t.__dict__,
    indent=2,
    sort_keys=True)
print(json_data)

{
  "desc": "Physics + Math",
  "title": "Homework",
  "urgency": 5
}


### Serializing named tuples (`namedtuple`s)

Named tuples are serialized as arrays of their elements as seen below.


In [61]:
from collections import namedtuple
import json

Task = namedtuple("Task", "title, desc, urgency")

homework = Task("Homework", "Physics + Math", 5)

result = json.dumps(homework)
print(result)
print(repr(result))

["Homework", "Physics + Math", 5]
'["Homework", "Physics + Math", 5]'


A custom serializer cannot change that default behavior: a named tuple is a `tuple` subclass, which `json.dumps` already knows how to serialize, so the `default` function is never called.

In [65]:
from collections import namedtuple
import json

Task = namedtuple("Task", "title, desc, urgency")

homework = Task("Homework", "Physics + Math", 5)

def custom_encoder(t: Task):
    result = {
        "title": t.title,
        "desc": t.desc,
        "urgency": t.urgency
    }
    return result

result = json.dumps(
    homework,
    default=custom_encoder
)

print(result)
print(repr(result))

["Homework", "Physics + Math", 5]
'["Homework", "Physics + Math", 5]'


The workaround involves the creation of a wrapper class to force `json.dumps` to use the custom encoder:

In [68]:
from collections import namedtuple
import json

Task = namedtuple("Task", "title, desc, urgency")

homework = Task("Homework", "Physics + Math", 5)

class Wrapper:
    def __init__(self, nt):
        self.nt = nt


def custom_encoder(w):
    result = {
        "title": w.nt.title,
        "desc": w.nt.desc,
        "urgency": w.nt.urgency
    }
    return result



result = json.dumps(
    Wrapper(homework),
    default=custom_encoder
)

print(result)
print(repr(result))

{"title": "Homework", "desc": "Physics + Math", "urgency": 5}
'{"title": "Homework", "desc": "Physics + Math", "urgency": 5}'
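
A lighter alternative to the wrapper class, when you control the call site, is to convert the named tuple to a dictionary with its `_asdict` method before serializing. A minimal sketch:

```python
from collections import namedtuple
import json

Task = namedtuple("Task", "title, desc, urgency")
homework = Task("Homework", "Physics + Math", 5)

# _asdict() maps field names to values, so the result serializes as a JSON object
result = json.dumps(homework._asdict())
print(result)
```

This only helps for a named tuple at the top level; named tuples nested inside other containers are still serialized as arrays.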


## Section 50 &mdash; More on files

This section explains how to read and write files. Text files are used throughout the examples to illustrate the techniques, but what is explained also applies to other file formats.

The following diagram summarizes the reading and writing operations:

![File Operations](pics/file-operations.png)


### Opening and closing files (without the Context manager)




Consider the `tasks.csv` file, whose contents are:

```
1001,Homework,5
1002,Laundry,3
1003,Grocery,4
```

The file can be opened using the `open` built-in function. This function returns an `_io.TextIOWrapper` object that represents a buffered text stream providing higher-level access to the underlying data in the file.

We typically refer to this object as a *stream* or *file object*.

The *stream* object also features certain attributes:
+ `name` &mdash; name of the file
+ `mode` &mdash; indicates how the file was opened (`r` for read, etc.). When a file is opened in `r` mode, write operations fail.
+ `encoding` &mdash; indicates how the file data was encoded. Most text data is encoded with UTF-8.

In [9]:
text_file = open("./exercises/section_50-files/file-1/tasks.csv")

print(text_file)

<_io.TextIOWrapper name='./exercises/section_50-files/file-1/tasks.csv' mode='r' encoding='UTF-8'>


From a *stream* or *file object*, we can read the data using `read`. This will return a string representation of the file contents:

In [10]:
text_file = open("./exercises/section_50-files/file-1/tasks.csv")

text_data = text_file.read()

assert type(text_data) == str
print(f"file contents:\n{text_data!r}")

print()
print(text_data)

file contents:
'1001,Homework,5\n1002,Laundry,3\n1003,Grocery,4\n'

1001,Homework,5
1002,Laundry,3
1003,Grocery,4



Once we're done with file processing, we must close the file by using the `close` method. 

After having closed the file, the `closed` attribute of the file will be True.

In [11]:
text_file = open("./exercises/section_50-files/file-1/tasks.csv")

# ...process the file data...

text_file.close()

assert text_file.closed == True

### Opening and closing files with the Context Manager (`with` statement)

You should always close files when you're done with them. To prevent us from losing data due to forgetting to close a file, we can use the *context management* technique.

This involves using the `with` statement, which is the Pythonic way to read files:

In [12]:
with open("./exercises/section_50-files/file-1/tasks.csv") as file:
    text_data = file.read()
    print(f"file contents:\n{text_data!r}")



file contents:
'1001,Homework,5\n1002,Laundry,3\n1003,Grocery,4\n'


When using this technique, you no longer need to close the file explicitly. A context manager establishes a connection to the applicable resource object in the `with` statement, and when the body of the `with` is completed, the context manager will automatically close the connection to the resource.

Note that `with` can be used with more resources than files. Any resource needing to be closed may support this technique.

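For example, the `sqlite3` module lets you use a connection in a `with` statement. Note that in this case the context manager commits or rolls back the transaction; it does not close the connection:

```python
import sqlite3

with sqlite3.connect("database.sqlite") as con:
    ...  # the transaction is committed on success, rolled back on error
```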
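
If you also want a `sqlite3` connection to be closed when the block ends, one option is `contextlib.closing` from the standard library. The sketch below uses an in-memory database and a hypothetical `tasks` table so that it runs without creating any file:

```python
import sqlite3
from contextlib import closing

# ":memory:" creates a temporary in-memory database, so no file is left behind
with closing(sqlite3.connect(":memory:")) as con:
    with con:  # commits on success, rolls back if an exception is raised
        con.execute("CREATE TABLE tasks (title TEXT, urgency INTEGER)")
        con.execute("INSERT INTO tasks VALUES (?, ?)", ("Laundry", 3))
    rows = con.execute("SELECT title, urgency FROM tasks").fetchall()

print(rows)  # [('Laundry', 3)]

# The connection is closed now, so further use raises sqlite3.ProgrammingError
try:
    con.execute("SELECT 1")
except sqlite3.ProgrammingError as e:
    print(f"Oops: {e}")
```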

### Reading data from a file using a generator (i.e., a `for` loop)

The `read` method obtains the entire file contents and materializes it in a string. In certain circumstances, your computer might not have enough memory to hold that data.

One technique for processing large files is lazy iteration: a file object behaves like a generator that *yields* one line at a time, so only a small piece of the file is held in memory at once.

Consider the following example in which we read the information from a file that contains tasks ([tasks.csv](./exercises/section_50-files/file-1/tasks.csv)) and create a named tuple out of each line:

In [16]:
from collections import namedtuple

Task = namedtuple("Task", "task_id, title, urgency")

with open("./exercises/section_50-files/file-1/tasks.csv") as file:
    for line in file:
        stripped_line = line.strip()
        task_id, title, urgency = stripped_line.split(",")
        task = Task(task_id, title, urgency)
        print(f"{stripped_line}: {task}")

1001,Homework,5: Task(task_id='1001', title='Homework', urgency='5')
1002,Laundry,3: Task(task_id='1002', title='Laundry', urgency='3')
1003,Grocery,4: Task(task_id='1003', title='Grocery', urgency='4')


Note that:
+ we can iterate over the lines of the file simply by using a `for` loop
+ we need to use `strip` to get rid of the `\n` found at the end of each line
+ the approach handles the trailing `\n` at the end of the file gracefully: no extra empty line is produced
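
The loop above can also be wrapped in a generator function of your own, so that callers receive parsed tasks one at a time. This is a sketch under stated assumptions: `iter_tasks` is a hypothetical helper, `io.StringIO` stands in for an open file, and the urgency is converted to `int` for convenience.

```python
import io
from collections import namedtuple

Task = namedtuple("Task", "task_id, title, urgency")


def iter_tasks(lines):
    """Yield one Task per non-empty line; `lines` is any iterable of strings."""
    for line in lines:
        stripped_line = line.strip()
        if not stripped_line:
            continue  # skip blank lines
        task_id, title, urgency = stripped_line.split(",")
        yield Task(task_id, title, int(urgency))


# io.StringIO stands in for an open file so the sketch runs anywhere
fake_file = io.StringIO("1001,Homework,5\n1002,Laundry,3\n\n1003,Grocery,4\n")

urgent = [task.title for task in iter_tasks(fake_file) if task.urgency >= 4]
print(urgent)  # ['Homework', 'Grocery']
```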

### Reading data from file to form a list using `readlines`

If the file is not too big, you can read all of its lines into a list using `readlines`:

In [19]:
with open("./exercises/section_50-files/file-1/tasks.csv") as file:
    lines = file.readlines()
    numbered_lines = [f"row #{row}: {line.strip()}" for row, line in enumerate(lines, start=1)]

assert numbered_lines == [
    "row #1: 1001,Homework,5",
    "row #2: 1002,Laundry,3",
    "row #3: 1003,Grocery,4",
]

| HINT: |
| :---- |
| See how we use `enumerate(lines, start=1)` to make the row count start from 1 rather than zero. |

### Reading a single line from a file using `readline`

In some cases we might be interested in reading only the header row of a file (e.g., a CSV with a header line). The method `readline` lets us read a single line.

`readline` can be called repeatedly to read the contents of a file line by line, as with the `for` loop.

Optionally, you can pass `readline` a `size` argument that limits how many characters are read from the current line (e.g., `file.readline(5)` reads at most 5 characters of that line):

In [26]:
with open("./exercises/section_50-files/file-1/tasks.csv") as file:
    print(file.readline())  # whole first line
    print(file.readline())  # whole second line
    print(file.readline(5)) # first 5 characters of 3rd line
    print(file.readline(8)) # subsequent 8 characters of 3rd line
    print(file.readline())  # remaining contents of 3rd line



1001,Homework,5

1002,Laundry,3

1003,
Grocery,
4





| NOTE: |
| :---- |
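| Like `readline`, both `read` and `readlines` accept an optional size argument. `read(size)` reads at most `size` characters, while `readlines(hint)` stops reading further lines once the total size of the lines read so far exceeds `hint`. |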
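
The size argument of `read` makes it possible to process a file in fixed-size pieces. The following is a minimal sketch; `read_in_chunks` is a hypothetical helper and `io.StringIO` stands in for an open text file:

```python
import io


def read_in_chunks(stream, chunk_size):
    """Yield successive pieces of at most `chunk_size` characters."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # read returns an empty string at end of file
            break
        yield chunk


text = "1001,Homework,5\n1002,Laundry,3\n"
chunks = list(read_in_chunks(io.StringIO(text), 10))
print(chunks)

# Nothing is lost: the chunks add up to the original text
assert "".join(chunks) == text
```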

### Writing data to a new file using `write`

To write data to a new file, we can create a file object using `with`, passing the `"w"` mode to the `open` function.

Then, we can call the `write` method. This method returns the number of characters written.

Use this approach to create a new file with the following contents:

```
1001,Homework,5
1002,Laundry,3
1003,Grocery,4
```



In [29]:
data = """1001,Homework,5
1002,Laundry,3
1003,Grocery,4"""

with open("./exercises/section_50-files/out-sandbox/tasks.csv", "w") as file:
    print(f"File opened for writing: {file}")
    result = file.write(data)
    print(f"result: {result} characters written")


File opened for writing: <_io.TextIOWrapper name='./exercises/section_50-files/out-sandbox/tasks.csv' mode='w' encoding='UTF-8'>
result: 45 characters written


If you try to write to a file that was not opened in a writable mode (here, the default `"r"`), you'll get an `io.UnsupportedOperation` exception:

In [31]:
data = """1001,Homework,5
1002,Laundry,3
1003,Grocery,4"""

try:
    # Note that the file must exist
    with open("./exercises/section_50-files/out-sandbox/tasks.csv") as file:
        print(f"File opened for writing: {file}")
        result = file.write(data)
        print(f"result: {result} characters written")
except Exception as e:
    print(f"Oops: {e}")

File opened for writing: <_io.TextIOWrapper name='./exercises/section_50-files/out-sandbox/tasks.csv' mode='r' encoding='UTF-8'>
Oops: not writable


### Writing a list of lines to a new file with `writelines`

You can write a list of lines to a new file using `writelines`:


In [33]:
list_data = [
    "1001,Homework,5",
    "1002,Laundry,3",
    "1003,Grocery,4"
]

with open("./exercises/section_50-files/out-sandbox/tasks.txt", "w") as file:
    file.writelines(list_data)

with open("./exercises/section_50-files/out-sandbox/tasks.txt") as file:
    contents = file.read()  # avoid shadowing the built-in `str`

assert contents == "1001,Homework,51002,Laundry,31003,Grocery,4"


Note that `writelines` does not append a `\n` after each list item, so the contents of the file will be a single line.

If you need to add the newline, you'll need to do it yourself:

In [34]:
list_data = [
    "1001,Homework,5",
    "1002,Laundry,3",
    "1003,Grocery,4"
]

with open("./exercises/section_50-files/out-sandbox/tasks.csv", "w") as file:
    file.writelines([f"{line}\n" for line in list_data])
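
Two other common ways of adding the newlines are `print` with its `file` argument and `str.join`. The sketch below uses `io.StringIO` as a stand-in for a file opened in `"w"` mode:

```python
import io

list_data = ["1001,Homework,5", "1002,Laundry,3", "1003,Grocery,4"]

# io.StringIO stands in for a file opened in "w" mode
buffer = io.StringIO()
for line in list_data:
    print(line, file=buffer)  # print appends "\n" after each line

assert buffer.getvalue() == "1001,Homework,5\n1002,Laundry,3\n1003,Grocery,4\n"

# Joining produces the same text, minus the trailing newline
joined = "\n".join(list_data)
assert joined + "\n" == buffer.getvalue()
```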

### Appending string data to an existing file

If you want to write data to the end of a file, instead of opening it in write mode (with mode `"w"`), you should use mode `"a"`.

Use this approach to append the line:

```
1004,Museum,3
```

In [36]:
new_task = "1004,Museum,3"

with open("./exercises/section_50-files/file-2/tasks.csv", "a") as file:
    file.write(f"{new_task}\n")

Note that newline handling differs between Windows and Linux: in text mode, Python translates each `\n` you write into the platform's line separator. On Linux, it is customary to end a text file with a trailing `\n`.

### File modes and positions of the cursor

| Mode | read | write | create | truncate | Cursor position |
| :--- | :--- | :---- | :----- | :------- | :-------------- |
| r | * | | | | Start |
| w |   | * | * | * | Start |
| a |   | * | * |   | End |
| r+ | *  | * |   |   | Start |
| w+ | *  | * | * | * | Start |
| a+ | *  | * | * |  | End |
| x |  | * | * |  | Start |
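
Some of the rows in the table can be verified with a short experiment. This sketch works in a temporary directory (the file name `modes.txt` is made up) so that it leaves nothing behind:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes.txt")

    with open(path, "x") as file:      # "x": create, fail if the file exists
        file.write("first\n")

    try:
        open(path, "x")
    except FileExistsError:
        print("'x' refuses to overwrite an existing file")

    with open(path, "a") as file:      # "a": keep contents, write at the end
        file.write("second\n")

    with open(path, "r+") as file:     # "r+": read and write, cursor at the start
        start_position = file.tell()
        contents = file.read()

    with open(path, "w") as file:      # "w": truncate on open
        pass

    size_after_w = os.path.getsize(path)

print(repr(contents), start_position, size_after_w)
```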

### Reading a CSV file line-by-line using csv reader

The Python standard library provides a built-in solution for dealing with CSV files: the `csv` module, which allows us to read the data directly with `csv.reader`:

In [39]:
import csv

with open("./exercises/section_50-files/file-1/tasks.csv", newline="") as file:
    csv_reader = csv.reader(file)
    for row in csv_reader:
        print(row)

['1001', 'Homework', '5']
['1002', 'Laundry', '3']
['1003', 'Grocery', '4']


Note the `newline=""` argument. The `csv` documentation recommends opening files with `newline=""` so that the `csv` module handles line endings itself, which keeps the behavior consistent across platforms.

In the code, after opening the file for reading, we wrap the opened *stream*/*file object* in a `csv.reader`.

This returns an iterator, so we can loop over the rows with a `for` loop. Each row is a list of strings holding the values extracted from the line, which were separated by commas.
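
One reason to prefer `csv.reader` over splitting each line on commas is that it understands quoted fields. The sketch below uses made-up data in which a title contains a comma, with `io.StringIO` standing in for an open file:

```python
import csv
import io

raw = '1001,"Homework, Physics",5\n1002,Laundry,3\n'

# Naive approach: split the first line on every comma
naive = raw.splitlines()[0].split(",")

# csv.reader keeps the quoted field together and drops the quotes
parsed = next(csv.reader(io.StringIO(raw)))

print(naive)   # ['1001', '"Homework', ' Physics"', '5']
print(parsed)  # ['1001', 'Homework, Physics', '5']
```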

### Reading a CSV file in one shot using csv reader

If the CSV file is not very large, you can materialize the whole contents of the CSV in a list of lists using the `list` constructor:

In [40]:
import csv

with open("./exercises/section_50-files/file-1/tasks.csv", newline="") as file:
    csv_reader = csv.reader(file)
    tasks = list(csv_reader)

print(tasks)

[['1001', 'Homework', '5'], ['1002', 'Laundry', '3'], ['1003', 'Grocery', '4']]


### Reading a CSV file that has a header into dicts using a manual approach

When you have a CSV file with a header, the best approach is to read each row as a `dict`, with the header's field names becoming the keys.

In [42]:
import csv

with open("./exercises/section_50-files/file-3/tasks.csv", newline="") as file:
    csv_reader = csv.reader(file)
    fields = next(csv_reader)
    print(f"Fields extracted from header: {fields}")
    for row in csv_reader:
        task_dict = dict(zip(fields, row))
        print(task_dict)


Fields extracted from header: ['task_id', 'title', 'urgency']
{'task_id': '1001', 'title': 'Homework', 'urgency': '5'}
{'task_id': '1002', 'title': 'Laundry', 'urgency': '3'}
{'task_id': '1003', 'title': 'Grocery', 'urgency': '4'}


Because `csv_reader` is an iterator, you can call `next` to obtain the first row, which here is the header.

Then, you can proceed with the usual row-by-row approach, but instead of keeping the items in a list, we create a dict. To do it in a Pythonic way, we use `zip` to pair the field names with the field values.

### Reading a CSV file with a header using a DictReader

To simplify the manual approach described in the previous section, the `csv` module provides a `DictReader` object:

In [43]:
import csv

with open("./exercises/section_50-files/file-3/tasks.csv", newline="") as file:
    csv_reader = csv.DictReader(file)
    for row in csv_reader:
        print(row)

{'task_id': '1001', 'title': 'Homework', 'urgency': '5'}
{'task_id': '1002', 'title': 'Laundry', 'urgency': '3'}
{'task_id': '1003', 'title': 'Grocery', 'urgency': '4'}
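
Note that every value produced by `DictReader` is a string. If you need typed data, you can convert each row as you read it. The following sketch assumes a hypothetical `Task` dataclass and uses `io.StringIO` in place of the CSV file:

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int
    title: str
    urgency: int


csv_text = "task_id,title,urgency\n1001,Homework,5\n1002,Laundry,3\n"

# Convert the numeric columns explicitly while building each Task
tasks = [
    Task(int(row["task_id"]), row["title"], int(row["urgency"]))
    for row in csv.DictReader(io.StringIO(csv_text))
]
print(tasks)
```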


### Writing data to a CSV file using a writer

`reader` has its counterpart, `writer`:

In [44]:
import csv

new_task = "1004,Museum,3"

with open("./exercises/section_50-files/file-4/tasks.csv", "a", newline="") as file:
    csv_writer = csv.writer(file)
    csv_writer.writerow(new_task.split(","))


Note that we didn't have to add a `\n`: `writerow` appends the line terminator for us. If the existing file does not end in `\n`, you can add one first with `file.write("\n")`, as in the following example:

In [47]:
import csv

new_task = "1004,Museum,3"

with open("./exercises/section_50-files/file-5/tasks.csv", "a", newline="") as file:
    csv_writer = csv.writer(file)
    file.write("\n")
    csv_writer.writerow(new_task.split(","))

### Writing data to a CSV file using a DictWriter

`DictReader` has its counterpart, `DictWriter`.

Suppose that we want to save the following data to a new CSV file:

```python
tasks = [
   {'task_id': '1001', 'title': 'Homework', 'urgency': '5'},
   {'task_id': '1002', 'title': 'Laundry', 'urgency': '3'},
   {'task_id': '1003', 'title': 'Grocery', 'urgency': '4'}
]
```

In [50]:
import csv

tasks = [
   {'task_id': '1001', 'title': 'Homework', 'urgency': '5'},
   {'task_id': '1002', 'title': 'Laundry', 'urgency': '3'},
   {'task_id': '1003', 'title': 'Grocery', 'urgency': '4'}
]

fields = [ "task_id", "title", "urgency"]
with open("./exercises/section_50-files/file-6/tasks.csv", "w", newline="") as file:
    csv_writer = csv.DictWriter(file, fieldnames=fields)
    csv_writer.writeheader()
    csv_writer.writerows(tasks)
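
To see exactly what `DictWriter` produces, you can write to an in-memory buffer instead of a file. This is a sketch with made-up data; note the default `\r\n` line terminator and the automatic quoting of the field that contains a comma:

```python
import csv
import io

tasks = [
    {"task_id": "1001", "title": "Homework", "urgency": "5"},
    {"task_id": "1002", "title": "Laundry, whites", "urgency": "3"},
]

# io.StringIO stands in for a file opened with newline=""
buffer = io.StringIO(newline="")
csv_writer = csv.DictWriter(buffer, fieldnames=["task_id", "title", "urgency"])
csv_writer.writeheader()
csv_writer.writerows(tasks)

print(repr(buffer.getvalue()))
```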

### Pickling objects for data preservation

Pickling is a technique that allows us to preserve various forms of Python data.

The term comes from the preservation of food using vinegar or similar solutions.

In Python, *pickling* refers to the process of converting objects to a binary format for data preservation. That way, you can store them in binary format, and then conveniently retrieve them later.

Almost any object can be pickled:

In [51]:
import pickle

task_tuple = (1001, "Homework", 5)
task_dict = {"task_id": "1002", "title": "Laundry", "urgency": 3}

with open("./exercises/section_50-files/file-7/task_tuple.pickle", "wb") as file:
    pickle.dump(task_tuple, file)

with open("./exercises/section_50-files/file-7/task_dict.pickle", "wb") as file:
    pickle.dump(task_dict, file)

Note that the `dump` function saves the data to a file. Note also that the mode we've used is `"wb"` to inform the runtime that we're dealing with binary files.

In order to restore the pickled objects, we need to unpickle them:

In [52]:
import pickle

with open("./exercises/section_50-files/file-7/task_tuple.pickle", "rb") as file:
    task_tuple_loaded = pickle.load(file)

with open("./exercises/section_50-files/file-7/task_dict.pickle", "rb") as file:
    task_dict_loaded = pickle.load(file)

task_tuple = (1001, "Homework", 5)
task_dict = {"task_id": "1002", "title": "Laundry", "urgency": 3}
assert task_tuple == task_tuple_loaded
assert task_dict == task_dict_loaded

### Pickling and unpickling custom classes

Custom classes can be pickled too:

In [3]:
import pickle

class Task:
    def __init__(self, title, urgency):
        self.title = title
        self.urgency = urgency

task = Task("Laundry", 3)

with open("./exercises/section_50-files/file-7/task_class.pickle", "wb") as file:
    pickle.dump(task, file)

with open("./exercises/section_50-files/file-7/task_class.pickle", "rb") as file:
    task_loaded = pickle.load(file)

assert task_loaded.__dict__ == task.__dict__    # Equality must be True
assert task_loaded is not task                  # Identity must be False

Note that this works seamlessly because `Task` is known at the time of unpickling. If that were not the case, the unpickling would fail.

In [4]:
del Task

try:
    with open("./exercises/section_50-files/file-7/task_class.pickle", "rb") as file:
        task_loaded = pickle.load(file)
except Exception as e:
    print(f"Oops: {e}")


Oops: Can't get attribute 'Task' on <module '__main__'>


### Pickling and unpickling from/to strings with `pickle.dumps` and `pickle.loads`

While JSON is a great data exchange format, it doesn't work with custom classes unless you provide specific serialization instructions through the `default` argument of `json.dumps`.

Additionally, JSON cannot serialize certain types of objects, such as functions. By contrast, pickling is compatible with many more kinds of objects out of the box. Note that a function is pickled by reference (its module and name, as the output below shows), not by value, so the function must still be defined when the pickle is loaded.

In [6]:
def doubler(x):
    return x * 2

doubler_pickle = pickle.dumps(doubler)
print(doubler_pickle)

doubler_loaded = pickle.loads(doubler_pickle)
assert doubler_loaded(5) == doubler(5)

b'\x80\x04\x95\x18\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x07doubler\x94\x93\x94.'


Note however that not everything can be pickled. For example, you can't pickle a module:

In [7]:
import os

try:
    os_dumped = pickle.dumps(os)
except Exception as e:
    print(f"Oops: {e}")

Oops: cannot pickle 'module' object


### Security considerations while pickling

Loading untrusted pickles is a serious threat vector, as those might come with malicious behavior:

./exercises/section_50-files/file-8/malicious_task_class.pickle

In [25]:
import os

class MaliciousTask:
    def __init__(self, title, urgency):
        self.title = title
        self.urgency = urgency

    def __reduce__(self):
        print("__reduce__ is called")
        return os.system, ('touch ./exercises/section_50-files/file-8/hacking.txt',)

malicious_task = MaliciousTask("Set fire", 5)

with open("./exercises/section_50-files/file-8/test_malicious.pickle", "wb") as file:
   pickle.dump(malicious_task, file)


__reduce__ is called


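The `__reduce__` method is invoked while pickling and tells `pickle` how to rebuild the object: here, by calling `os.system` with the given command. That call is made when the pickle is loaded, which results in the creation of an unwanted file.

| NOTE: |
| :---- |
| No file is created by the cell above, and it does not fail either: `pickle.dump` only records the callable and its arguments. The command runs when the pickle is loaded with `pickle.load`. |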

In [26]:
import pickle

class MaliciousTaskTask:
    def __init__(self, title, urgency):
        self.title = title
        self.urgency = urgency

    def __reduce__(self):
        print(f"__reduce__ is called")
        return os.system, ('touch exercises/section_50-files/file-8/u.ve.been.owned.txt',)


with open("./exercises/section_50-files/file-8/malicious_task_class.pickle", "rb") as file:
    pickle.load(file)
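
The safest option is not to unpickle data you do not trust. When that cannot be avoided, the `pickle` documentation describes restricting which globals may be loaded by overriding `Unpickler.find_class`. The sketch below follows that pattern; `SAFE_BUILTINS`, `RestrictedUnpickler`, `restricted_loads` and `Malicious` are names chosen for this illustration, and the whitelist is an assumption you would adapt to your data.

```python
import builtins
import io
import os
import pickle

SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only allow a small whitelist of harmless builtins
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads, but refuses to load arbitrary globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Malicious:
    def __reduce__(self):
        return os.system, ("echo this should never run",)


# Plain data still loads
plain = [1, "two", {3: range(4)}]
assert restricted_loads(pickle.dumps(plain)) == plain

# The malicious payload is rejected before os.system can be called
try:
    restricted_loads(pickle.dumps(Malicious()))
except pickle.UnpicklingError as e:
    print(f"Oops: {e}")
```

This reduces the attack surface but is not a guarantee of safety; formats such as JSON remain preferable for data coming from untrusted sources.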

### Creating a directory using `pathlib`

The `pathlib` module is the preferred way to handle paths and to create directories.


In [29]:
from pathlib import Path

data_folder = Path("./exercises/section_50-files/file-9/data")
data_folder.mkdir()

assert data_folder.exists()
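
A plain `mkdir()` raises an error if the directory already exists or if a parent directory is missing. The sketch below shows how the `parents` and `exist_ok` arguments change that; it works inside a temporary directory so it can be run repeatedly:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    data_folder = Path(tmp_dir) / "file-9" / "data"

    # parents=True creates the missing "file-9" level as well
    data_folder.mkdir(parents=True)

    # A second plain mkdir() fails because the directory already exists
    try:
        data_folder.mkdir()
    except FileExistsError:
        print("Directory already exists")

    # exist_ok=True turns the repeated call into a no-op
    data_folder.mkdir(parents=True, exist_ok=True)
    created = data_folder.is_dir()

print(created)
```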

### Creating a bunch of files programmatically

Consider the creation of the following files in a given directory:

```
subject_123.config
subject_123.dat
subject_123.txt
subject_124.config
subject_124.dat
subject_124.txt
subject_125.config
subject_125.dat
subject_125.txt
```

With the information we've seen so far, we should be able to create this without much trouble:

In [3]:
from pathlib import Path

ids = [123, 124, 125]
extensions = ["config", "dat", "txt"]

data_dir = Path("./exercises/section_50-files/file-9/data")

for subject_id in ids:  # avoid shadowing the built-in `id`
    for extension in extensions:
        filename = f"subject_{subject_id}.{extension}"
        filepath = data_dir / filename
        with open(filepath, "w") as file:
            file.write(f"This is {filename}")

Note how the path has been constructed using:

```python
directory_path / filename
```

When using this approach, the operation will be OS agnostic.
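
Path objects also expose the individual components of a path. The sketch below uses `PurePosixPath`, which performs no file system access and always uses `/` as the separator, so the results are the same on any operating system:

```python
from pathlib import PurePosixPath

data_dir = PurePosixPath("exercises") / "section_50-files" / "file-9" / "data"
filepath = data_dir / "subject_123.dat"

print(filepath)          # exercises/section_50-files/file-9/data/subject_123.dat
print(filepath.name)     # subject_123.dat
print(filepath.stem)     # subject_123
print(filepath.suffix)   # .dat
print(filepath.parent)   # exercises/section_50-files/file-9/data
print(filepath.with_suffix(".txt").name)  # subject_123.txt
```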

### Retrieving the list of files of a specific kind

Consider a directory holding the files:

```
subject_123.config
subject_123.dat
subject_123.txt
subject_124.config
subject_124.dat
subject_124.txt
subject_125.config
subject_125.dat
subject_125.txt
```

We need to get all the ".dat" files.

| HINT: |
| :---- |
| Use the `glob` method. |

In [33]:
from pathlib import Path

data_dir = Path("./exercises/section_50-files/file-9/data")

data_files = data_dir.glob("*.dat")

for data_file in data_files:
    print(f"Processing {data_file}")

Processing exercises/section_50-files/file-9/data/subject_123.dat
Processing exercises/section_50-files/file-9/data/subject_125.dat
Processing exercises/section_50-files/file-9/data/subject_124.dat
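
As the output above shows, `glob` yields the files in no particular order. If the order matters, you can sort the results. The following sketch creates a few empty files in a temporary directory to illustrate it:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    data_dir = Path(tmp_dir)
    for name in ["subject_125.dat", "subject_123.dat", "subject_124.txt", "subject_124.dat"]:
        (data_dir / name).touch()  # create an empty file

    # glob returns a lazy iterator in arbitrary order; sorted() makes it deterministic
    dat_names = [path.name for path in sorted(data_dir.glob("*.dat"))]

print(dat_names)  # ['subject_123.dat', 'subject_124.dat', 'subject_125.dat']
```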


### Moving files to a different folder

You can move files by just renaming their file path. For example, if you rename `data/subject_123.dat` to `subjects/subject_123/subject_123.dat`, the file would be effectively moved from `data/` to `subjects/subject_123`.

Consider the following directory `data/` containing:

```
subject_123.config
subject_123.dat
subject_123.txt
subject_124.config
subject_124.dat
subject_124.txt
subject_125.config
subject_125.dat
subject_125.txt
```

We want each of the files to be moved to a corresponding directory `subjects/subject_{id}/{filename}` as in the example mentioned above.

| HINT: |
| :---- |
| You will need to use the `mkdir` method. To create a multilevel directory even when some intermediate levels don't exist, use `parents=True`. Also, use `exist_ok=True` so that no error is raised when the directory already exists. |

In [2]:
from pathlib import Path

subject_ids = [123, 124, 125]
data_folder = Path("./exercises/section_50-files/file-9/data")
target_base_folder = Path("./exercises/section_50-files/out-sandbox")

for subject_id in subject_ids:
    target_subject_folder = Path(f"subjects/subject_{subject_id}")
    target_folder = target_base_folder / target_subject_folder
    target_folder.mkdir(parents=True, exist_ok=True)

    for subject_file in data_folder.glob(f"*{subject_id}*"):
        filename = subject_file.name
        target_path = target_folder / filename
        _ = subject_file.rename(target_path)
        print(f"Moved {filename} to {target_path}")

Moved subject_123.dat to exercises/section_50-files/out-sandbox/subjects/subject_123/subject_123.dat
Moved subject_123.config to exercises/section_50-files/out-sandbox/subjects/subject_123/subject_123.config
Moved subject_123.txt to exercises/section_50-files/out-sandbox/subjects/subject_123/subject_123.txt
Moved subject_124.txt to exercises/section_50-files/out-sandbox/subjects/subject_124/subject_124.txt
Moved subject_124.config to exercises/section_50-files/out-sandbox/subjects/subject_124/subject_124.config
Moved subject_124.dat to exercises/section_50-files/out-sandbox/subjects/subject_124/subject_124.dat
Moved subject_125.txt to exercises/section_50-files/out-sandbox/subjects/subject_125/subject_125.txt
Moved subject_125.config to exercises/section_50-files/out-sandbox/subjects/subject_125/subject_125.config
Moved subject_125.dat to exercises/section_50-files/out-sandbox/subjects/subject_125/subject_125.dat


### Copying files to a different folder

The `shutil` module provides a high-level API for manipulating files.

This module features the `copy(src, dst)` function to perform the copy.

It also exposes an `rmtree` function that lets you remove a directory and its contents.

In [4]:
import shutil
from pathlib import Path

shutil.rmtree("./exercises/section_50-files/out-sandbox/subjects")

subject_ids = [123, 124, 125]
data_folder = Path("./exercises/section_50-files/file-9/data")
target_base_folder = Path("./exercises/section_50-files/out-sandbox")

for subject_id in subject_ids:
    target_subject_folder = Path(f"subjects/subject_{subject_id}")
    target_folder = target_base_folder / target_subject_folder
    target_folder.mkdir(parents=True, exist_ok=True)

    for subject_file in data_folder.glob(f"*{subject_id}*"):
        filename = subject_file.name
        target_path = target_folder / filename
        _ = shutil.copy(subject_file, target_path)
        print(f"Copied {filename} to {target_path}")


Copied subject_123.dat to exercises/section_50-files/out-sandbox/subjects/subject_123/subject_123.dat
Copied subject_123.config to exercises/section_50-files/out-sandbox/subjects/subject_123/subject_123.config
Copied subject_123.txt to exercises/section_50-files/out-sandbox/subjects/subject_123/subject_123.txt
Copied subject_124.txt to exercises/section_50-files/out-sandbox/subjects/subject_124/subject_124.txt
Copied subject_124.config to exercises/section_50-files/out-sandbox/subjects/subject_124/subject_124.config
Copied subject_124.dat to exercises/section_50-files/out-sandbox/subjects/subject_124/subject_124.dat
Copied subject_125.txt to exercises/section_50-files/out-sandbox/subjects/subject_125/subject_125.txt
Copied subject_125.config to exercises/section_50-files/out-sandbox/subjects/subject_125/subject_125.config
Copied subject_125.dat to exercises/section_50-files/out-sandbox/subjects/subject_125/subject_125.dat


Note that `Path` exposes the method `rmdir`, but it raises an `OSError` if the directory is not empty. By contrast, `rmtree` in `shutil` can remove a directory and its contents.

In [6]:
from pathlib import Path

try:
    Path("./exercises/section_50-files/out-sandbox/subjects").rmdir()
except Exception as e:
    print(f"Oops: {e}")

Oops: [Errno 39] Directory not empty: 'exercises/section_50-files/out-sandbox/subjects'


### Deleting specific files

Consider the following scenario in which we want to remove the `.txt` files from a specific folder.

The `Path` class provides the `unlink` method to delete a file.

In [7]:
from pathlib import Path

data_folder = Path("./exercises/section_50-files/file-9/data")

for file in data_folder.glob("*.txt"):
    before = file.exists()
    file.unlink()
    after = file.exists()
    print(f"Deleting {file}, existing: {before} -> {after}")


Deleting exercises/section_50-files/file-9/data/subject_124.txt, existing: True -> False
Deleting exercises/section_50-files/file-9/data/subject_125.txt, existing: True -> False
Deleting exercises/section_50-files/file-9/data/subject_123.txt, existing: True -> False


### Retrieving file name and file extension related metadata

The following snippet illustrates how to obtain file name and extension.

In it, we access the `subjects/` directory tree looking for `*.dat` files in all the subdirectories of `subjects/`. When found, we get the directory name using `parent` and the filename (without extension) using `stem` and then build the path of the corresponding `{filename}.config` file.

In [9]:
from pathlib import Path

subjects_folder = Path("./exercises/section_50-files/out-sandbox/subjects")

for dat_path in subjects_folder.glob("**/*.dat"):
    subject_dir = dat_path.parent
    filename = dat_path.stem
    config_path = subject_dir / f"{filename}.config"
    print(f"{subject_dir} & {filename} -> {config_path}")

    dat_exists = dat_path.exists()
    config_exists = config_path.exists()

    with open(dat_path) as dat_file, open(config_path) as config_file:
        print(f"Processing {filename}: dat? {dat_exists}, config? {config_exists}\n")
        # additional processing of files


exercises/section_50-files/out-sandbox/subjects/subject_125 & subject_125 -> exercises/section_50-files/out-sandbox/subjects/subject_125/subject_125.config
Processing subject_125: dat? True, config? True

exercises/section_50-files/out-sandbox/subjects/subject_123 & subject_123 -> exercises/section_50-files/out-sandbox/subjects/subject_123/subject_123.config
Processing subject_123: dat? True, config? True

exercises/section_50-files/out-sandbox/subjects/subject_124 & subject_124 -> exercises/section_50-files/out-sandbox/subjects/subject_124/subject_124.config
Processing subject_124: dat? True, config? True



Note that `with` can be used with multiple resources in one statement.

In summary, you can use:
+ `parent` &mdash; retrieves the directory that contains the file.
+ `name` &mdash; retrieves the entire filename, including the extension.
+ `stem` &mdash; retrieves the filename, without the extension.
+ `suffix` &mdash; retrieves the file extension.

In [11]:
from pathlib import Path

dat_path = Path("./exercises/section_50-files/file-9/data/subject_123.dat")

print(f"parent: {dat_path.parent}")
print(f"name  : {dat_path.name}")
print(f"stem  : {dat_path.stem}")
print(f"suffix: {dat_path.suffix}")


parent: exercises/section_50-files/file-9/data
name  : subject_123.dat
stem  : subject_123
suffix: .dat


### Retrieving file's size and time metadata

The following snippet illustrates how you can get the size of a file using `stat().st_size`.

In the example, we get all the `*.dat` files, retrieve their size and check that it is within certain limits.

In [13]:
from pathlib import Path

def process_data_using_size_cutoff(min_size, max_size):
    data_folder = Path("./exercises/section_50-files/file-9/data/")
    for dat_path in data_folder.glob("*.dat"):
        filename = dat_path.name
        size = dat_path.stat().st_size
        if min_size < size < max_size:
            print(f"{filename}: within size limits: size={size} (min={min_size}, max={max_size})")
        else:
            print(f"{filename}: outside size limits: size={size} (min={min_size}, max={max_size})")


process_data_using_size_cutoff(20, 40)

print()
process_data_using_size_cutoff(40, 60)

subject_123.dat: within size limits: size=23 (min=20, max=40)
subject_125.dat: within size limits: size=23 (min=20, max=40)
subject_124.dat: within size limits: size=23 (min=20, max=40)

subject_123.dat: outside size limits: size=23 (min=40, max=60)
subject_125.dat: outside size limits: size=23 (min=40, max=60)
subject_124.dat: outside size limits: size=23 (min=40, max=60)


We can get the time related metadata (such as modification time) using `stat().st_mtime`.

Note that, to make the timestamp more readable, we also import the `time` module and use `ctime`.

In [15]:
import time
from pathlib import Path

base_path = Path("./exercises/section_50-files/file-9")
subject_dat_path = Path("data/subject_123.dat")

dat_path = base_path / subject_dat_path

modified_time = dat_path.stat().st_mtime
readable_modified_time = time.ctime(modified_time)

print(f"Modification time: {modified_time} -> {readable_modified_time}")

Modification time: 1694502454.1044314 -> Tue Sep 12 09:07:34 2023
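If you need to do arithmetic or custom formatting with the timestamp, an alternative to `time.ctime` is `datetime.fromtimestamp`, which turns the value of `st_mtime` into a `datetime` object. The following sketch uses a temporary file as a stand-in for `subject_123.dat`.

```python
import tempfile
from datetime import datetime
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    dat_path = Path(tmp) / "subject_123.dat"
    dat_path.write_text("This is subject_123.dat")

    # A single stat() call gives access to both size and time metadata
    stats = dat_path.stat()
    size = stats.st_size
    modified = datetime.fromtimestamp(stats.st_mtime)

print(f"size={size} bytes, modified={modified:%Y-%m-%d %H:%M:%S}")
```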


## Section 51 &mdash; Logging

This section details best practices for logging in Python.

### Logger instantiation best practices

The `logging` module from the standard Python library provides all the logging functionalities. This module has the `Logger` class, whose constructor takes a name to create an instance:

In [None]:
import logging

# bad practice: DO NOT USE
logger_bad = logging.Logger("task_app")

While the previous approach works, it should be avoided. Instead, use the `getLogger` factory function:

In [None]:
import logging

logger = logging.getLogger("task_app")

The reason is that `getLogger` returns a shared `Logger` instance for a given name, whereas invoking the `Logger` constructor creates a brand new instance each time.

In [16]:
import logging

logger0_bad = logging.Logger("task_app")
logger1_bad = logging.Logger("task_app")

logger0_good = logging.getLogger("task_app")
logger1_good = logging.getLogger("task_app")

assert logger0_bad is not logger1_bad   # identity
assert logger0_good is logger1_good     # identity

As a best practice, you should create the logger by running `logging.getLogger(__name__)`, where `__name__` evaluates to the module name (e.g., for `server.py`, `__name__` will be `server` when the module is imported, and `__main__` when the file is run directly).
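One benefit of naming loggers after modules is that dotted names form a hierarchy, so a package-level logger becomes the parent of its submodule loggers. A short sketch, using the hypothetical module names `server` and `server.handlers`:

```python
import logging

# Names mimic what __name__ would be in server/__init__.py and server/handlers.py
package_logger = logging.getLogger("server")
module_logger = logging.getLogger("server.handlers")

print(module_logger.name)                      # server.handlers
print(module_logger.parent is package_logger)  # True
```

This is what lets you configure handlers and levels once on the package logger and have the submodule loggers inherit them.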

### Sending logger messages to file

Logging to a file is less common in production nowadays, as many cloud platforms expect applications to log to stdout, but this section shows how to do it.

In [17]:
import logging

logger = logging.getLogger(__name__)

file_handler = logging.FileHandler("./exercises/section_51-logging/logfile-0/app.log")

logger.addHandler(file_handler)

# Now logger invocations will end up in file
task_title = "Laundry"
logger.warning(f"removed the task {task_title} from the database")

### Configuring multiple handlers to a logger

Besides file handlers, you can add a stream handler which can be used to log records in an interactive console.

In [20]:
import logging

logger = logging.getLogger(__name__)

stream_handler = logging.StreamHandler()

logger.addHandler(stream_handler)
logger.warning("a sample warning event")



The method `hasHandlers` can be used to check whether a logger (or any of its ancestors in the logger hierarchy) already has handlers.

In [22]:
import logging

logger = logging.getLogger(__name__)

if logger.hasHandlers():
    print(f"It has handlers: {logger.handlers}")

It has handlers: [<FileHandler /home/ubuntu/Development/git-repos/side_projects/python-workbench/part_1-python-fundamentals/00_basic-python-workout/exercises/section_51-logging/logfile-0/app.log (NOTSET)>, <StreamHandler stderr (NOTSET)>, <StreamHandler stderr (NOTSET)>, <StreamHandler stderr (NOTSET)>]
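The repeated `StreamHandler` entries in the output above come from running the cell that adds the handler several times. One way to guard against this, sketched below with a hypothetical helper, is to check the logger's own `handlers` list before adding a new one (`hasHandlers` would also return `True` when only an ancestor logger has handlers).

```python
import logging

def add_stream_handler_once(target_logger):
    """Attach a StreamHandler only if the logger has no handlers of its own."""
    if not target_logger.handlers:
        target_logger.addHandler(logging.StreamHandler())
    return target_logger

demo_logger = logging.getLogger("duplicate_guard_demo")

# Simulate re-running the same notebook cell three times
for _ in range(3):
    add_stream_handler_once(demo_logger)

print(len(demo_logger.handlers))  # 1
```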


### Logger levels

Python's logging module gives you access to five levels `DEBUG`, `INFO`, `WARNING`, `ERROR`, and `CRITICAL`, plus a base level `NOTSET` which has a numeric value of `0` and isn't typically used.

| Severity Value | Logging Level | Description |
| :------------- | :------------ | :---------- |
| 50 | `CRITICAL` | Severe error in core functionalities |
| 40 | `ERROR` | Errors in certain functionalities |
| 30 | `WARNING` | Unexpected behavior that can lead to errors |
| 20 | `INFO` | Information about expected behaviors |
| 10 | `DEBUG` | Diagnosis information to facilitate debugging |

In order to benefit from the logging levels, you first need to set the level of a logger. When you set a specific level, all logging records at that level and above will be captured by the logger:

In [24]:
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.WARNING)

print(f"level={logger.level}, levelName={logging.getLevelName(logger.level)}")  # public API instead of the private _levelToName

def logging_messages_all_levels():
    logger.critical("critical message")
    logger.error("error message")
    logger.warning("warning message")
    logger.info("info message")
    logger.debug("debug message")

logging_messages_all_levels()  # Only WARNING and above will be displayed


critical message
critical message
critical message
error message
error message
error message
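To check how a given level would be treated without emitting any record, you can ask the logger directly with `isEnabledFor`. A small sketch with a dedicated logger name, chosen here only for the demo:

```python
import logging

level_logger = logging.getLogger("level_demo")
level_logger.setLevel(logging.WARNING)

# Map each level name to whether the logger would process a record at that level
enabled_by_level = {
    logging.getLevelName(level): level_logger.isEnabledFor(level)
    for level in (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL)
}

for level_name, enabled in enabled_by_level.items():
    print(f"{level_name:<8} enabled={enabled}")
```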




### Setting level to individual logging handlers

It is possible to set logging level per handler:

In [27]:
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

handler_warning = logging.FileHandler("./exercises/section_51-logging/logfile-0/app-warning.log")
handler_warning.setLevel(logging.WARNING)
logger.addHandler(handler_warning)

handler_critical = logging.FileHandler("./exercises/section_51-logging/logfile-0/app-critical.log")
handler_critical.setLevel(logging.CRITICAL)
logger.addHandler(handler_critical)


def logging_messages_all_levels():
    logger.critical("critical message")
    logger.error("error message")
    logger.warning("warning message")
    logger.info("info message")
    logger.debug("debug message")

logging_messages_all_levels()  # The logger lets every level through; each file handler keeps only its own level and above


critical message
critical message
critical message
error message
error message
error message
info message
info message
info message
debug message
debug message
debug message
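To see exactly which records each handler keeps, the same setup can be reproduced with in-memory streams instead of files. This is only a sketch: the logger name and the `io.StringIO` targets are assumptions made to keep the example self-contained.

```python
import io
import logging

demo_logger = logging.getLogger("per_handler_level_demo")
demo_logger.setLevel(logging.DEBUG)   # the logger itself lets everything through
demo_logger.propagate = False         # keep the demo records away from the root logger

warning_stream = io.StringIO()
critical_stream = io.StringIO()

handler_warning = logging.StreamHandler(warning_stream)
handler_warning.setLevel(logging.WARNING)
demo_logger.addHandler(handler_warning)

handler_critical = logging.StreamHandler(critical_stream)
handler_critical.setLevel(logging.CRITICAL)
demo_logger.addHandler(handler_critical)

demo_logger.critical("critical message")
demo_logger.error("error message")
demo_logger.warning("warning message")
demo_logger.info("info message")

warning_lines = warning_stream.getvalue().splitlines()
critical_lines = critical_stream.getvalue().splitlines()

print(warning_lines)   # ['critical message', 'error message', 'warning message']
print(critical_lines)  # ['critical message']
```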


### Cleaning the logging handlers

You can reset the existing handlers of a logger by doing:

In [28]:
logger.handlers = []
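Note that assigning an empty list drops the handlers without closing them, which may leave files open when `FileHandler` instances are involved. A more careful variant, sketched here with a hypothetical helper, removes and closes each handler:

```python
import io
import logging

def reset_handlers(target_logger):
    """Detach every handler from the logger and release its resources."""
    for handler in list(target_logger.handlers):  # copy: the list is mutated in the loop
        target_logger.removeHandler(handler)
        handler.close()

cleanup_logger = logging.getLogger("cleanup_demo")
cleanup_logger.addHandler(logging.StreamHandler(io.StringIO()))
cleanup_logger.addHandler(logging.NullHandler())

reset_handlers(cleanup_logger)
print(cleanup_logger.handlers)  # []
```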

### Configuring the logger format

The following snippet configures the logger format in a somewhat standard way:

In [29]:
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

logger.handlers = []  # Make sure we start anew

formatter = logging.Formatter("%(asctime)s [%(levelname)s] - %(name)s - %(message)s")
stream_handler = logging.StreamHandler()
stream_handler.setLevel(logging.DEBUG)
stream_handler.setFormatter(formatter)
logger.addHandler(stream_handler)

def logging_messages_all_levels():
    logger.critical("critical message")
    logger.error("error message")
    logger.warning("warning message")
    logger.info("info message")
    logger.debug("debug message")


logging_messages_all_levels()


2023-09-12 14:25:59,194 [CRITICAL] - __main__ - critical message
2023-09-12 14:25:59,196 [ERROR] - __main__ - error message
2023-09-12 14:25:59,197 [INFO] - __main__ - info message
2023-09-12 14:25:59,198 [DEBUG] - __main__ - debug message


### Logging exceptions

When a logger is invoked within an exception handler, it is recommended to use the `exception` method, which logs at `ERROR` level and appends the traceback:

In [1]:
import logging

logger = logging.getLogger(__name__)

try:
    raise Exception("Fabricated exception")
except Exception:
    logger.exception("Oops")

Oops
Traceback (most recent call last):
  File "/tmp/ipykernel_4500/1673740087.py", line 6, in <module>
    raise Exception("Fabricated exception")
Exception: Fabricated exception
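`logger.exception(msg)` behaves like `logger.error(msg, exc_info=True)`: both log at `ERROR` level and append the traceback. The sketch below captures the output in memory to show it; the logger name is an assumption for the demo.

```python
import io
import logging

stream = io.StringIO()
exc_logger = logging.getLogger("exception_demo")
exc_logger.propagate = False
exc_logger.addHandler(logging.StreamHandler(stream))

try:
    raise ValueError("Fabricated exception")
except ValueError:
    exc_logger.exception("Oops (exception)")
    exc_logger.error("Oops (error + exc_info)", exc_info=True)

captured = stream.getvalue()
traceback_count = captured.count("Traceback (most recent call last):")
print(traceback_count)  # 2
```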


### Avoiding preformatting log messages

While we're used to using f-strings to format messages, doing so in log calls bypasses one of logging's features: string formatting is delayed until it is actually needed.

As a result, it is recommended to use old-style format specifiers in log messages:



In [2]:
import logging

logger = logging.getLogger(__name__)

name = "Jason Isaacs"
logging.error("Couldn't say hello to %s", name)

ERROR:root:Couldn't say hello to Jason Isaacs


The following table describes the most common format specifiers:

| Format specifier | Description | Example |
| :--------------- | :---------- | :------ |
| `%s` | string formatter<br>can be used for any object with a string representation (lists, tuples, etc.) | `logger.error("Hello, %s", name)` |
| `%d` | integer formatter | `logger.error("It failed %d times", num)` |
| `%f` | floating point formatter | `logger.error("Expected %f", num)` |
| `%.nf` | floating point formatter with a fixed number of digits to the right of the decimal point | `logger.error("Num %.5f unexpected", num)` |
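The effect of deferred formatting can be observed by counting how often an object is converted to a string. In the sketch below, the `Expensive` class and the logger name are made up for the illustration: with `%s` the conversion never happens because `DEBUG` is disabled, while the f-string is formatted before the logger is even called.

```python
import logging

class Expensive:
    """Counts how many times it is converted to a string."""
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive value"

lazy_logger = logging.getLogger("lazy_format_demo")
lazy_logger.setLevel(logging.WARNING)      # DEBUG records are discarded
lazy_logger.propagate = False
lazy_logger.addHandler(logging.NullHandler())

value = Expensive()

lazy_logger.debug("value is %s", value)    # no formatting: the record is never created
calls_after_lazy = Expensive.calls

lazy_logger.debug(f"value is {value}")     # formatted eagerly, then thrown away
calls_after_fstring = Expensive.calls

print(calls_after_lazy, calls_after_fstring)  # 0 1
```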

## Section 52 &mdash; Unit Testing

This section deals with the basics of unit testing.

### Understanding the basis for testing functions

Consider the following snippet.

The first goal of a unit test is to check that, for a given input, the function returns the expected output.

That can be implemented with assertions:


In [12]:
class Task:
    def __init__(self, title, urgency):
        self.title = title
        self.urgency = urgency

def create_task(text):
    title, urgency_text = text.split(",")
    urgency = int(urgency_text)
    task = Task(title, urgency)
    return task

# Poor man's unit test framework
assert Task("Task Title", 34).__dict__ == create_task("Task Title,34").__dict__

### Creating your first TestCase class for unit testing

You can write unit tests in Python using the `unittest` module. This requires you to create a class that inherits from `TestCase`:

```python
import unittest
from task_func import Task, create_task

class TestTaskCreation(unittest.TestCase):
    def test_create_task(self):
        task_text = "Laundry,3"
        created_task = create_task(task_text)
        self.assertEqual(created_task.__dict__, Task("Laundry", 3).__dict__)

if __name__ == "__main__":
    unittest.main()
```

Note that:
+ The `TestTaskCreation` class is created by inheriting from `TestCase`
+ It's a convention to name test classes as `Test*`
+ The test methods should be named `test_{func_name_to_test}`
+ We invoke `unittest.main()` to run the test class

From the command line, you can do:

```bash
python test_task_func.py
```

This assumes that the name of the file containing the test class is `test_task_func.py`.

Additional tests can be added to the same test class.

For example, if we have a function:

```python
def create_task_from_dict(task_data):
    title = task_data["title"]
    urgency = task_data["urgency"]
    task = Task(title, urgency)
    return task
```

We can enhance our test class as follows:

```python
import unittest
from task_func import Task, create_task, create_task_from_dict

class TestTaskCreation(unittest.TestCase):
    def test_create_task(self):
        task_text = "Laundry,3"
        created_task = create_task(task_text)
        self.assertEqual(created_task.__dict__, Task("Laundry", 3).__dict__)

    def test_create_task_from_dict(self):
        task_data = {"title": "Laundry", "urgency": 3}
        created_task = create_task_from_dict(task_data)
        self.assertEqual(created_task.__dict__, Task("Laundry", 3).__dict__)


if __name__ == "__main__":
    unittest.main()
```
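Besides the happy path, it is usually worth testing how a function reacts to malformed input. The sketch below is not part of the `task_func` module: it redefines `Task` and `create_task` inline so that it can run on its own, and uses `assertRaises` to check that bad text raises `ValueError`. It also runs the tests through a `TextTestRunner` instead of `unittest.main()`, which is convenient inside a notebook.

```python
import unittest

class Task:
    def __init__(self, title, urgency):
        self.title = title
        self.urgency = urgency

def create_task(text):
    title, urgency_text = text.split(",")
    urgency = int(urgency_text)
    return Task(title, urgency)

class TestTaskCreationErrors(unittest.TestCase):
    def test_missing_urgency(self):
        # "Laundry" has no comma, so unpacking the split result fails
        with self.assertRaises(ValueError):
            create_task("Laundry")

    def test_non_numeric_urgency(self):
        # int("high") cannot be parsed
        with self.assertRaises(ValueError):
            create_task("Laundry,high")

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTaskCreationErrors)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(f"tests run: {result.testsRun}, successful: {result.wasSuccessful()}")
```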

### Setting up the test environment

`TestCase` has a `setUp` method you can override whenever you need some action to be performed before each test runs.

In the example above, we could use it to factor out the creation of the expected task:

```python
import unittest
from task_func import Task, create_task, create_task_from_dict

class TestTaskCreation(unittest.TestCase):
    def setUp(self):
        task_to_compare = Task("Laundry", 3)
        self.task_dict = task_to_compare.__dict__

    def test_create_task(self):
        task_text = "Laundry,3"
        created_task = create_task(task_text)
        self.assertEqual(created_task.__dict__, self.task_dict)

    def test_create_task_from_dict(self):
        task_data = {"title": "Laundry", "urgency": 3}
        created_task = create_task_from_dict(task_data)
        self.assertEqual(created_task.__dict__, self.task_dict)


if __name__ == "__main__":
    unittest.main()
```

### Testing `@classmethod`s

Consider the following class that exposes two methods decorated with `@classmethod` as they don't need any instance of the class:

```python
class Task:
    def __init__(self, title, urgency):
        self.title = title
        self.urgency = urgency

    @classmethod
    def task_from_text(cls, text_data):
        title, urgency_text = text_data.split(",")
        urgency = int(urgency_text)
        task = cls(title, urgency)
        return task

    @classmethod
    def task_from_dict(cls, task_data):
        title = task_data["title"]
        urgency = task_data["urgency"]
        task = cls(title, urgency)
        return task
```

The class can be tested with the following code:

```python
import unittest
from task_func import Task

class TestTaskCreation(unittest.TestCase):
    def setUp(self):
        task_to_compare = Task("Laundry", 3)
        self.task_dict = task_to_compare.__dict__

    def test_create_task_from_text(self):
        task_text = "Laundry,3"
        created_task = Task.task_from_text(task_text)
        self.assertEqual(created_task.__dict__, self.task_dict)

    def test_create_task_from_dict(self):
        task_data = {"title": "Laundry", "urgency": 3}
        created_task = Task.task_from_dict(task_data)
        self.assertEqual(created_task.__dict__, self.task_dict)


if __name__ == "__main__":
    unittest.main()
```

Note that the `setUp` method will be executed before each and every test.

### Mocking

A mock object substitutes and imitates a real object within a testing environment.

This becomes really handy if your code is difficult to test in certain areas, including calls to external systems, interactions with the file system, etc.

Additionally, mock objects tend to expose methods that let you understand:
+ If a method has been called
+ The arguments passed when invoking a method
+ How many times you invoked a method (if more than one)

The standard library includes `unittest.mock` for your mocking needs.

It provides a class called `Mock` that can be used to imitate real objects. The library also provides a function `patch`, which replaces real objects in your code with `Mock` instances.

`patch` can be used as a decorator or as a context manager. If using the latter, once the designated scope ends, the mock object will be replaced by the real one, which is useful when you only require mocking for a certain portion of your test function.

### The `Mock` object

A `Mock` must simulate the object it replaces. For example, when mocking the `json` library, the mock object must contain the function `dumps` so that the existing code written for the library works also with the mock version.

In [1]:
from unittest.mock import Mock

# Instantiate a Mock instance
mock = Mock()

# Patch the `json` library
json = mock

# invoke the `dumps()` function on the Mock
json.dumps()

<Mock name='mock.dumps()' id='139650868565072'>

As you can see from the example above, a `Mock` object creates arbitrary attributes on the fly: we didn't tell `Mock` to create a `dumps` function, and yet the mock created it on first access.

As a result, it will be suitable to replace any object:

In [2]:
from unittest.mock import Mock

# Instantiate a Mock instance
mock = Mock()

mock.some_attribute

<Mock name='mock.some_attribute' id='139650868565744'>

In [3]:
from unittest.mock import Mock

# Instantiate a Mock instance
mock = Mock()

mock.do_something()

<Mock name='mock.do_something()' id='139650867364912'>

Note also that the attributes and methods accessed on a mock are themselves `Mock` objects, as are their return values. This allows you to chain calls and to configure nested mocks for complex scenarios.

For example:

In [4]:
from unittest.mock import Mock

# Mock the json library
json = Mock()

# Calling json.loads(...).get() just works
json.loads('{"key": "value"}').get("key")

<Mock name='mock.loads().get()' id='139650867367504'>

#### Assertions and Inspection with Mock

As discussed in the introduction, Mock objects allow you to understand if a method has been called and how:

In [6]:
from unittest.mock import Mock

# Mock the json library
json = Mock()

json.loads('{"key": "value"}')

json.loads.assert_called()
json.loads.assert_called_once()
json.loads.assert_called_with('{"key": "value"}')
json.loads.assert_called_once_with('{"key": "value"}')


If an assertion fails, an `AssertionError` will be raised:

In [7]:
from unittest.mock import Mock

# Mock the json library
json = Mock()

json.loads('{"key": "val"}')

try:
    json.loads.assert_called_with('{"key": "value"}')
except AssertionError as ex:
    print(f"Ooops: {ex}")


Ooops: expected call not found.
Expected: loads('{"key": "value"}')
Actual: loads('{"key": "val"}')


In particular, the `assert_called_with` and `assert_called_once_with` are defined as:

```python
assert_called_with(*args, **kwargs)
assert_called_once_with(*args, **kwargs)
```

This forces the assertion to use exactly the same arguments, passed in the same way (positional or keyword), as the call made on the Mock:

In [8]:
from unittest.mock import Mock

# Mock the json library
json = Mock()

json.loads(s='{"key": "val"}')

try:
    json.loads.assert_called_with('{"key": "val"}')
except AssertionError as ex:
    print(f"Ooops: {ex}")

Ooops: expected call not found.
Expected: loads('{"key": "val"}')
Actual: loads(s='{"key": "val"}')


The Mock objects provide a wide range of features to spy on how you're interacting with the mocked object:

In [15]:
from unittest.mock import Mock

def print_sep():
    print()
    print("#" * 70)

# Mock the json library
json = Mock()

json.loads(s='{"key": "val"}')

print(f"json.loads.call_count: {json.loads.call_count}")
print(f"json.loads.call_args: {json.loads.call_args}")
print(f"json.loads.call_args_list: {json.loads.call_args_list}")

print_sep()
json.loads(s='{"key": "val2"}')
print(f"json.loads.call_count: {json.loads.call_count}")
print(f"json.loads.call_args: {json.loads.call_args}")    # last call
print(f"json.loads.call_args_list: {json.loads.call_args_list}") # list of arguments across calls

print_sep()
print(json.method_calls)


json.loads.call_count: 1
json.loads.call_args: call(s='{"key": "val"}')
json.loads.call_args_list: [call(s='{"key": "val"}')]

######################################################################
json.loads.call_count: 2
json.loads.call_args: call(s='{"key": "val2"}')
json.loads.call_args_list: [call(s='{"key": "val"}'), call(s='{"key": "val2"}')]

######################################################################
[call.loads(s='{"key": "val"}'), call.loads(s='{"key": "val2"}')]


#### Customizing a `Mock`'s return value

`Mock` objects also allow you to control the return value.

Consider the following snippet which returns whether today is a weekday or not:

In [16]:
from datetime import datetime

def is_weekday() -> bool:
    today = datetime.today()
    return (0 <= today.weekday() < 5)

In order to properly test that function, you need to use mocking so that the test result does not depend on whether you run the test on a weekday or on a weekend.

The following code illustrates how to do that:

In [19]:
from unittest.mock import Mock
from datetime import datetime

# Define a couple of days
sunday = datetime(year=2023, month=9, day=24)
tuesday = datetime(year=2023, month=9, day=26)

# Mock datetime
datetime = Mock()

# Function under test
def is_weekday() -> bool:
    today = datetime.today()
    return (0 <= today.weekday() < 5)


# Testing with mocks and dates defined above
## Test tuesday is weekday
datetime.today.return_value = tuesday
assert is_weekday()
## Test sunday is not a weekday
datetime.today.return_value = sunday
assert not is_weekday()



#### Customizing a `Mock`'s behavior with `side_effect`

Setting `return_value` works for basic scenarios, but won't be sufficient for more complex ones in which you need to control the mocked function's behavior.

Consider the following function, which uses an endpoint to get the list of holidays:


In [None]:
import requests

def get_holidays():
    r = requests.get("http://localhost/api/holidays")
    if r.status_code == 200:
        return r.json()
    return None

That API is supposed to return a dictionary with the holidays if the server responds, but we also want to verify the behavior when we can't connect to the server because of a timeout.

That complex scenario can be modeled with `side_effect`:

In [21]:
from requests.exceptions import Timeout
from unittest.mock import Mock

# Mock requests lib
requests = Mock()

# Function under test
def get_holidays():
    r = requests.get("http://localhost/api/holidays")
    if r.status_code == 200:
        return r.json()
    return None

# Configure the mock to raise an exception when called
requests.get.side_effect = Timeout

try:
    get_holidays()
except Exception as ex:
    print(f"Oops: {type(ex)}")

Oops: <class 'requests.exceptions.Timeout'>


Of course, this can be extended to provide more elaborate behavior that not only returns a specific value, but also performs some side effects.

In [27]:
from unittest.mock import Mock

# Mock requests lib
requests = Mock()

# Function under test
def get_holidays():
    r = requests.get("http://localhost/api/holidays")
    if r.status_code == 200:
        return r.json()
    return None

# Create a function that we will wire as the behavior for the Mock
def custom_request_behavior(url):
    print(f"Simulating a request to {url}")
    response_mock = Mock()
    response_mock.status_code = 200
    response_mock.json.return_value = {
        "12/25": "Christmas",
        "7/4": "Independence Day"
    }
    return response_mock

# Configure requests.get with our custom behavior, which sets the status code
# and returns a sample response
requests.get.side_effect = custom_request_behavior

assert get_holidays()["12/25"] == "Christmas"
assert get_holidays()["7/4"] == "Independence Day"


Simulating a request to http://localhost/api/holidays
Simulating a request to http://localhost/api/holidays


When using `unittest`, the previous snippet should be refactored as follows:

```python
import requests
import unittest
from unittest.mock import Mock

# Mock requests lib
requests = Mock()

# Function under test
def get_holidays():
    r = requests.get("http://localhost/api/holidays")
    if r.status_code == 200:
        return r.json()
    return None

class TestGetHolidays(unittest.TestCase):
    def log_request(self, url):
        print(f"Simulating a request to {url}")
        response_mock = Mock()
        response_mock.status_code = 200
        response_mock.json.return_value = {
            "12/25": "Christmas",
            "7/4": "Independence Day"
        }
        return response_mock

    def test_get_holidays(self):
        requests.get.side_effect = self.log_request

        self.assertEqual(get_holidays()["12/25"], "Christmas")
        self.assertEqual(get_holidays()["7/4"], "Independence Day")


if __name__ == "__main__":
    unittest.main()
```

`side_effect` can also be set to an iterable, in which case each call to the mock produces the next item of the iterable.
The iterable must consist of return values, exceptions, or a mixture of both.

As an example, the snippet below is configured to raise an Exception first, and then return a value. Note that the second element of the iterable is the `response_mock` object itself and not a function:

In [36]:
from unittest.mock import Mock
from requests.exceptions import Timeout

# Mock requests lib
requests = Mock()

# Function under test
def get_holidays():
    r = requests.get("http://localhost/api/holidays")
    if r.status_code == 200:
        return r.json()
    return None

# Create a Mock() to imitate a Response from the endpoint
response_mock = Mock()
response_mock.status_code = 200
response_mock.json.return_value = {
    "12/25": "Christmas",
    "7/4": "Independence Day"
}

requests.get.side_effect = [Timeout, response_mock]

try:
    get_holidays()
except Exception as ex:
    assert isinstance(ex, Timeout)

assert get_holidays()["12/25"] == "Christmas"
assert requests.get.call_count == 2

#### Configuring your `Mock`

You can configure a `Mock` by specifying certain attributes when you initialize a `Mock` object:

In [42]:
from unittest.mock import Mock

mock = Mock(name="my mock")
print(mock)

mock = Mock(name="my other mock", return_value=True)
print(f"{mock}: {mock()}")

def my_side_effect(name="stranger"):
    print(f"Hello to {name}")

mock = Mock(name="yet another mock", side_effect=my_side_effect)
print(mock())

<Mock name='my mock' id='139650459600304'>
<Mock name='my other mock' id='139650459601840'>: True
Hello to stranger
None


We know that `return_value` and `side_effect` can also be set on the Mock instance, but other things such as the name can only be set while constructing the object or by way of invoking `configure_mock()`.

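Both the constructor and `configure_mock()` accept dotted names as keyword arguments when they are passed by unpacking a dictionary, which reduces the verbosity of configuring nested attributes: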

In [47]:
from unittest.mock import Mock

# Mock a response whose json() method returns the sample holidays
holidays = {"12/25": "Christmas", "7/4": "Independence Day"}
response_mock = Mock(**{"json.return_value": holidays})

print(response_mock.json())


{'12/25': 'Christmas', '7/4': 'Independence Day'}
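The same dotted keys can be passed to `configure_mock()` on a mock that already exists. The following is a minimal sketch that reuses the sample `holidays` data; the `status_code` value is only illustrative:

```python
from unittest.mock import Mock

holidays = {"12/25": "Christmas", "7/4": "Independence Day"}

# Create a bare mock first, then configure it in a single call
response_mock = Mock()
response_mock.configure_mock(**{"status_code": 200, "json.return_value": holidays})

print(response_mock.status_code)
print(response_mock.json())
```

This is handy when the mock is created elsewhere (for example by `patch()`) and only needs to be configured in the test.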


#### Using `patch()`

The `patch` function looks up an object in a given module and replaces it with a `Mock`. 

##### `@patch()`

If you want to mock an object for the duration of your entire test function, you can use `@patch()`.

Consider the following snippet for a module `my_calendar` that exposes a couple of functions:

```python
from datetime import datetime
import requests


def is_weekday():
    """Returns True if today is a weekday, False if today is a weekend day"""
    today = datetime.today()
    return 0 <= today.weekday() < 5


def get_holidays():
    """Access an *invented* API to get a dictionary of holidays as in
    `{"12/25": "Christmas", "7/4": "Independence Day"}`
    """
    r = requests.get("http://localhost/api/holidays")
    if r.status_code == 200:
        return r.json()
    return None
```

See [my_calendar.py](exercises/section_52-mocking/01_patch-as-decorator/my_calendar.py)

Now we will use a technique known as *monkey patching*, in which one object is replaced with another at runtime. Up until now we have used `Mock` to replace objects in the same file in which they were defined.

To do so, create a test file in which `@patch()` replaces the objects that [my_calendar.py](exercises/section_52-mocking/01_patch-as-decorator/my_calendar.py) depends on:

```python
import unittest
from unittest.mock import patch

from requests.exceptions import Timeout

from my_calendar import get_holidays


class TestCalendar(unittest.TestCase):
    @patch("my_calendar.requests")
    def test_get_holidays_timeout(self, mock_requests):
        mock_requests.get.side_effect = Timeout
        with self.assertRaises(Timeout):
            get_holidays()
        # Checked outside the `with` block: code placed after the raising call would never run
        mock_requests.get.assert_called_once()


if __name__ == "__main__":
    unittest.main()
```

See [tests.py](exercises/section_52-mocking/01_patch-as-decorator/tests.py)

See how we have *patched* `my_calendar.requests` using the `@patch` decorator. Within the test function we have used the `Mock` functionality we were familiar with.

Note also how the test function defines an extra parameter `mock_requests` which we use to receive the mocked object in the test method.

##### `patch` as a Context Manager

In certain scenarios it is recommended to use `patch()` as a context manager rather than as a decorator:
+ you only want to mock an object for a part of the test scope
+ you're already using a lot of decorators on the test function and don't want to compromise readability

In that case you can use `with`:

```python
import unittest
from unittest.mock import patch

from requests.exceptions import Timeout

from my_calendar import get_holidays


class TestCalendar(unittest.TestCase):
    def test_get_holidays_timeout(self):
        with patch("my_calendar.requests") as mock_requests:
            mock_requests.get.side_effect = Timeout
            with self.assertRaises(Timeout):
                get_holidays()
            # Checked outside assertRaises: code placed after the raising call would never run
            mock_requests.get.assert_called_once()


if __name__ == "__main__":
    unittest.main()
```

See [tests.py](exercises/section_52-mocking/02_patch-context-manager/tests.py)

When execution leaves the `with` block, `patch()` restores the original object in place of the mock.
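The restoring behavior can be checked with a small, self-contained sketch. It patches `os.getcwd` from the standard library purely as a stand-in for a real dependency, and `/fake/dir` is an arbitrary value:

```python
import os
from unittest.mock import patch

original_getcwd = os.getcwd

with patch("os.getcwd", return_value="/fake/dir") as mock_getcwd:
    # Inside the block, os.getcwd is the mock
    inside = os.getcwd()
    mock_getcwd.assert_called_once()

# Outside the block, the real function is back in place
restored = os.getcwd is original_getcwd

print(inside)
print(restored)
```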

##### Patching an object's attributes with `patch.object`

Sometimes you might want to mock one method of an object, instead of the entire object.

That is possible with `patch.object`.

For example, if you have a closer look at the test function `test_get_holidays_timeout` you realize that you only need to mock the `get` method, not the entire `requests` object:

```python
import unittest
from unittest.mock import patch

from my_calendar import requests, get_holidays


class TestCalendar(unittest.TestCase):
    @patch.object(requests, "get", side_effect=requests.exceptions.Timeout)
    def test_get_holidays_timeout(self, mock_get):
        with self.assertRaises(requests.exceptions.Timeout):
            get_holidays()
        # The injected mock replaces `requests.get` itself, so we assert on it directly
        mock_get.assert_called_once()


if __name__ == "__main__":
    unittest.main()
```

See [tests.py](exercises/section_52-mocking/03_patch-object/tests.py)

In the previous example, we've only mocked the `get` method instead of the entire `requests` object.

Note that `patch.object` takes the same configuration arguments as `patch`, but you need to provide the target object as the first parameter, and the attribute as the second parameter:

```python
@patch.object(requests, "get", side_effect=requests.exceptions.Timeout)
```

| NOTE: |
| :---- |
| Besides objects and attributes, you can also patch dictionaries using `patch.dict` (See https://docs.python.org/3/library/unittest.mock.html#unittest.mock.patch.dict). |
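As a rough illustration of `patch.dict`, the sketch below temporarily adds an entry to a plain dictionary and shows that the original contents are restored afterwards. The `holidays` dictionary is just sample data:

```python
from unittest.mock import patch

holidays = {"12/25": "Christmas"}

with patch.dict(holidays, {"7/4": "Independence Day"}):
    # The extra entry is only visible inside the block
    inside = dict(holidays)

# On exit, the dictionary is back to its original contents
print(inside)
print(holidays)
```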

##### Identifying target object's path when patching

Learning how to use `patch` effectively is critical to succeed when mocking objects in other modules.

However, many times it is not obvious what the target object path is.

Consider the following snippet in which we mock a library function:

```python
import unittest
from unittest.mock import patch

import my_calendar


class TestCalendar(unittest.TestCase):
    def test_weekday_mock(self):
        with patch("my_calendar.is_weekday"):
            my_calendar.is_weekday()


if __name__ == "__main__":
    unittest.main()
```

When using this approach we will see that `my_calendar.is_weekday()` is effectively mocked.

However, if we change it slightly:

```python
import unittest
from unittest.mock import patch

from my_calendar import is_weekday


class TestCalendar(unittest.TestCase):
    def test_weekday_mock(self):
        with patch("my_calendar.is_weekday"):
            is_weekday()


if __name__ == "__main__":
    unittest.main()
```

We will see that `is_weekday` has not been mocked.

| NOTE: |
| :---- |
| The `from ... import` binds the function to a new name in the test module's namespace, while the patch replaces the `is_weekday` attribute of the `my_calendar` module, so the local name still points to the original function. |

The rule of thumb is to patch the object where it is looked up, that is, in the namespace of the module that uses the name.

Specifically, to patch a bare `is_weekday` imported into a test file that is run as a script, use `__main__.is_weekday` as the path:

```python
import unittest
from unittest.mock import patch

from my_calendar import is_weekday


class TestCalendar(unittest.TestCase):

    def test_weekday_mock(self):
        with patch("__main__.is_weekday"):
            is_weekday()
            print(f"is_weekday(): {is_weekday()}")  # a mock!!!


if __name__ == "__main__":
    unittest.main()
```

See [example](exercises/section_52-mocking/04_patch-object-path/)
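The lookup rule can also be observed in a single file. The sketch below builds two throwaway modules in memory (`demo_lib` and `demo_app` are hypothetical names, created with `types.ModuleType` only so that the example is self-contained) and patches `is_weekday` first where it is defined and then where it is looked up:

```python
import sys
import types
from unittest.mock import patch

# Hypothetical module `demo_lib`, standing in for my_calendar
demo_lib = types.ModuleType("demo_lib")
exec("def is_weekday():\n    return True\n", demo_lib.__dict__)
sys.modules["demo_lib"] = demo_lib

# Hypothetical module `demo_app`, which imports the bare name
demo_app = types.ModuleType("demo_app")
sys.modules["demo_app"] = demo_app
exec(
    "from demo_lib import is_weekday\n"
    "def describe_today():\n"
    "    return 'weekday' if is_weekday() else 'weekend'\n",
    demo_app.__dict__,
)

# Patching the defining module does not affect the name already bound in demo_app
with patch("demo_lib.is_weekday", return_value=False):
    patched_at_definition = demo_app.describe_today()

# Patching the name in the module that looks it up does
with patch("demo_app.is_weekday", return_value=False):
    patched_at_lookup = demo_app.describe_today()

print(patched_at_definition)  # weekday
print(patched_at_lookup)      # weekend
```

In a real project the two modules would be regular files, and the patch target would be the dotted path of the module under test.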

#### Using Mock specifications

As `Mock` objects create attributes and methods *on-the-fly* when you access them, problems can go unnoticed when the interface of the mocked class or function changes, or when a name is misspelled in the test.

This can be mitigated by creating a *spec* for your mock.



In [3]:
from unittest.mock import Mock

calendar = Mock(spec=["is_weekday", "get_holidays"])

try:
    calendar.is_weekday() # OK
    calendar.create_event() # AttributeError
except Exception as ex:
    print(f"Oops: {type(ex)}: {ex}")

Oops: <class 'AttributeError'>: Mock object has no attribute 'create_event'


Specs work in the same way if you pass an existing object, such as a module, instead of a list of names:

```python
import my_calendar
from unittest.mock import Mock

calendar = Mock(spec=my_calendar)

try:
    calendar.is_weekday() # OK
    calendar.create_event() # AttributeError
except Exception as ex:
    print(f"Oops: {type(ex)}: {ex}")
```

It is also possible to use `create_autospec` to automatically create the mock specification:

```python
import my_calendar
from unittest.mock import create_autospec

calendar = create_autospec(my_calendar)

try:
    calendar.is_weekday() # OK
    calendar.create_event() # AttributeError
except Exception as ex:
    print(f"Oops: {type(ex)}: {ex}")
```
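Besides restricting attribute names, an autospecced mock also checks that calls match the signature of the original. A minimal sketch, assuming a hypothetical `get_holidays(country, year=2024)` function that is not part of `my_calendar`:

```python
from unittest.mock import create_autospec


def get_holidays(country, year=2024):
    """Hypothetical function, used only to illustrate signature checking."""
    raise NotImplementedError


mock_get_holidays = create_autospec(get_holidays, return_value={"12/25": "Christmas"})

# This call matches the signature, so the configured value is returned
print(mock_get_holidays("US"))

try:
    mock_get_holidays()  # `country` is missing
except TypeError as ex:
    print(f"Oops: {type(ex)}: {ex}")
```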

If you're using `patch`, you can pass `autospec=True` to achieve the same result:

```python
import my_calendar
from unittest.mock import patch

...
with patch("__main__.my_calendar", autospec=True) as calendar:
    try:
        calendar.is_weekday() # OK
        calendar.create_event() # AttributeError
    except Exception as ex:
        print(f"Oops: {type(ex)}: {ex}")
```

## Section 53 &mdash; The *walrus* `:=` operator

The *walrus* operator (`:=`), introduced in Python 3.8, lets you assign a value to a variable as part of an expression, for example within the condition of an `if` statement.

Consider the following code, in which we read the user's input, give it a friendlier name and then act on it:

In [1]:
def get_user_input():
    return "Y"

user_input = get_user_input()

should_show_value = user_input
if should_show_value == "Y":
    print("Value should be displayed")

Value should be displayed


Python does not allow a regular assignment (`=`) within the `if` condition:

```python
def get_user_input():
    return "Y"

# SyntaxError: a regular assignment is a statement and cannot be used inside an expression
if (should_show_value = get_user_input()) == "Y":
    print("Value should be displayed")
```

But it will let us if we use the *walrus* operator:

In [4]:
def get_user_input():
    return "Y"

if (should_show_value := get_user_input()) == "Y":
    print("Value should be displayed")

Value should be displayed


That is, the *walrus* operator lets you perform an assignment in places where a regular assignment statement is not allowed.

Note that the `if` statement becomes even more succinct when the assigned value can be used directly as the condition:

In [5]:
def get_user_input():
    return True

if should_show_value := get_user_input():
    print("Value should be displayed")

Value should be displayed


Without the *walrus* operator, this would need to be expanded to:

In [6]:
def get_user_input():
    return True

should_show_value = get_user_input()
if should_show_value:
    print("Value should be displayed")

Value should be displayed

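The operator is not limited to `if` statements. As a hedged sketch with made-up sample data, the snippet below uses it in a comprehension, to reuse a regular expression match, and in a `while` condition:

```python
import re

# Hypothetical sample data
lines = ["12/25 Christmas", "not a holiday", "7/4 Independence Day"]
pattern = re.compile(r"(\d+/\d+) (.+)")

# Comprehension: keep the match object so it is computed only once per line
holidays = {m.group(1): m.group(2) for line in lines if (m := pattern.match(line))}
print(holidays)

# Loop: consume answers until something other than "Y" shows up
answers = iter(["Y", "Y", "N", "Y"])
accepted = []
while (answer := next(answers, None)) == "Y":
    accepted.append(answer)
print(accepted)
```

In both cases the assignment and the test happen in the same expression, which avoids computing the value twice or assigning it on a separate line.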