## Zip up

### How zip works

In a simple `for` loop we traverse a single iterable, `it`:

In [45]:
it = range(5)
for element in it:
    print(element)

0
1
2
3
4


Strictly speaking, something that can be traversed linearly, like a list, a string, or a `range`, is called an "iterable".

Sometimes you will have two iterables with related information and you need to loop over both at the same time to do something. Check this example:

In [46]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

for i in range(len(firsts)):
    print(f"{firsts[i]} {lasts[i]}")

John Doe
Jane Smith
Jack Johnson


This is what `zip` is for: you use it to pair up iterables that you want to traverse at the same time.

In [47]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

for first, last in zip(firsts, lasts):
    print(f"{first} {last}")

John Doe
Jane Smith
Jack Johnson


The loop uses an unpacking assignment (`first, last`) because `zip` actually produces tuples.

In [48]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

for z in zip(firsts, lasts):
    print(z)

('John', 'Doe')
('Jane', 'Smith')
('Jack', 'Johnson')


### Zip is lazy

`zip` does not create the tuples immediately. `zip` is lazy, meaning it generates the tuples on the fly as you iterate over it, for example in a `for` loop or when you convert it to a list.

In [49]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson", "Davis"]
z = zip(firsts, lasts)
print(z)
print(list(z))

<zip object at 0x10a5e47c0>
[('John', 'Doe'), ('Jane', 'Smith'), ('Jack', 'Johnson')]


Because `zip` is lazy, a zip object by itself is not that similar to a list. For example, you cannot ask for the length of a zip object.

In [50]:
len(z)

TypeError: object of type 'zip' has no len()
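
Another consequence of laziness is that a zip object can only be traversed once. The sketch below illustrates this with the same example lists; the names `first_pass` and `second_pass` are just illustrative.

```python
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

z = zip(firsts, lasts)
first_pass = list(z)   # consumes every tuple the zip object can produce
second_pass = list(z)  # the zip object is now exhausted

print(first_pass)   # [('John', 'Doe'), ('Jane', 'Smith'), ('Jack', 'Johnson')]
print(second_pass)  # []
```

If you need the pairs more than once, store them in a list first or call `zip` again.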

### Three is a crowd

`zip` can take three or more iterables. Each tuple it produces has one element per iterable, and the number of tuples equals the length of the shortest iterable.

In [11]:
firsts = ["John", "Jane", "Jack"]
middles = ["Z.", "A.", "C."]
lasts = ["Doe", "Smith", "Johnson"]

for z in zip(firsts, middles, lasts):
    print(z)

('John', 'Z.', 'Doe')
('Jane', 'A.', 'Smith')
('Jack', 'C.', 'Johnson')


### Mismatched lengths

If `zip`'s arguments have different lengths, it will stop as soon as it hits the end of the shortest iterable.

In [None]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith"]

for z in zip(firsts, lasts):
    print(z)
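
The cell above is shown without output, so here is a small sketch of what the truncation looks like. It also uses `itertools.zip_longest` from the standard library, which the text above does not mention, as one possible alternative when you would rather pad the shorter iterable; the `"?"` fill value is an arbitrary choice.

```python
from itertools import zip_longest

firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith"]

truncated = list(zip(firsts, lasts))                      # "Jack" is silently dropped
padded = list(zip_longest(firsts, lasts, fillvalue="?"))  # missing values are filled in

print(truncated)  # [('John', 'Doe'), ('Jane', 'Smith')]
print(padded)     # [('John', 'Doe'), ('Jane', 'Smith'), ('Jack', '?')]
```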

Starting with Python 3.10, `zip` accepts a keyword argument `strict` that makes it raise an error if the iterables have different lengths.

In [8]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson", "Davis"]

for z in zip(firsts, lasts, strict=True):
    print(z)

('John', 'Doe')
('Jane', 'Smith')
('Jack', 'Johnson')


ValueError: zip() argument 2 is longer than argument 1

`zip` only raises the error at the point where it finds the length mismatch, not before it starts iterating, which is why the three complete tuples are printed first. This is because `zip` consumes its arguments lazily and cannot know their lengths in advance.

In general, `zip` is used with iterables that are expected to have the same length. If that is the case, it is a good idea to always set `strict=True` to catch bugs in your code.

### Create a dictionary with zip

You can create dictionaries by feeding key-value pairs to the `dict` function, which means `zip` can be used to create a dictionary from two lists: one with the keys and one with the values.

In [15]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

dictionary = dict(zip(firsts, lasts))
print(dictionary)

{'John': 'Doe', 'Jane': 'Smith', 'Jack': 'Johnson'}


## Enumerate me

### How enumerate works

Python newcomers are usually exposed to this type of `for` loop very early.

In [16]:
for i in range(3):
    print(i)

0
1
2


This leads them to "learning" this anti-pattern of `for` loops to go over a list:

In [18]:
words = ["apple", "banana", "cherry"]
for i in range(len(words)):
    print(f"'{words[i]}' has {len(words[i])} characters.")

'apple' has 5 characters.
'banana' has 6 characters.
'cherry' has 6 characters.


The pythonic way of writing such a loop is iterating directly over the list:

In [19]:
words = ["apple", "banana", "cherry"]
for word in words:
    print(f"'{word}' has {len(word)} characters.")

'apple' has 5 characters.
'banana' has 6 characters.
'cherry' has 6 characters.


The final step in this indices vs. elements story comes when you need the index of each element as well as the element itself. For this, `enumerate` is your friend.

In [21]:
words = ["apple", "banana", "cherry"]
for i, word in enumerate(words):
    print(f"Word #{i}: '{word}' has {len(word)} characters.")

Word #0: 'apple' has 5 characters.
Word #1: 'banana' has 6 characters.
Word #2: 'cherry' has 6 characters.


### Optional `start` argument

The `enumerate` function can also accept an optional `start` argument. This argument specifies the starting index for the enumeration.

In [22]:
words = ["apple", "banana", "cherry"]
for i, word in enumerate(words, start=1):
    print(f"Word #{i}: '{word}' has {len(word)} characters.")

Word #1: 'apple' has 5 characters.
Word #2: 'banana' has 6 characters.
Word #3: 'cherry' has 6 characters.


This optional `start` argument is useful when you want to start the enumeration from a specific index.

By the way, the argument has to be an integer but can be negative.

In [23]:
for i, v in enumerate("abc", start=-4000):
    print(f"Index: {i}, Value: {v}")

Index: -4000, Value: a
Index: -3999, Value: b
Index: -3998, Value: c


### Unpacking when iterating

The `enumerate` function returns a lazy iterator, which means the items you iterate only become available as you need them. This can be useful when you want to process large amounts of data without consuming too much memory.

The items that `enumerate` returns are tuples, where the first element is the index and the second element is the value.

In [24]:
for pair in enumerate("abc"):  # avoid naming a variable `tuple`, which shadows the built-in
    print(pair)

(0, 'a')
(1, 'b')
(2, 'c')
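
To see the laziness in action, you can pull items out of an `enumerate` object one at a time with the built-in `next`. This is only a sketch; the names `letters` and `remaining` are illustrative.

```python
letters = enumerate("abc")

first = next(letters)   # the tuples are produced only when requested
second = next(letters)
remaining = list(letters)  # only what has not been consumed yet

print(first)      # (0, 'a')
print(second)     # (1, 'b')
print(remaining)  # [(2, 'c')]
```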


### Deep unpacking

Things can get more interesting when you use `enumerate`, for example, on a `zip`:

In [None]:
pages = [5, 17, 32, 50]
for i, (start, end) in enumerate(zip(pages, pages[1:]), start=1):
    print(f"Chapter {i}: {end - start} pages long. {start} {end}") 


Chapter 1: 12 pages long. 5 17
Chapter 2: 15 pages long. 17 32
Chapter 3: 18 pages long. 32 50


This snippet takes a list of the pages where the chapters of a book start and prints the length of each chapter. Notice how `enumerate` returns tuples with indices and values, but those values are extracted from a `zip`, which itself returns tuples.

We use deep unpacking to access all the values directly.

## Chaining comparison operators

### Chaining of comparison operators

One excellent feature of Python is its ability to chain comparison operators. This can make your code more readable and easier to understand. Check this snippet, which reads naturally:

In [26]:
a = 1
b = 2
c = 3
if a < b < c:
    print("The numbers are in ascending order.")

The numbers are in ascending order.


When Python sees two comparisons in a row, like `a < b < c`, it behaves as if you wrote `a < b and b < c`, except that `b` is only evaluated once (which is relevant if `b` is an expression like a function call).

Another example usage is for when you want to make sure that three values are the same:

In [29]:
a = b = 1
c = 2

if a == b == c:
    print("The numbers are the same.")
else:
    print("The numbers are different.")

The numbers are different.


You can chain an arbitrary number of comparison operators. For example, `a < b < c < d` checks that `a < b`, `b < c`, and `c < d`.
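
Two typical uses of longer chains are sketched below. Both helper functions are hypothetical and exist only for illustration.

```python
def is_strictly_increasing(a, b, c, d):
    # Hypothetical helper: the chain compares each adjacent pair.
    return a < b < c < d


def is_valid_index(index, sequence):
    # Hypothetical helper: True for non-negative positions inside the sequence.
    return 0 <= index < len(sequence)


print(is_strictly_increasing(1, 2, 3, 4))  # True
print(is_strictly_increasing(1, 3, 2, 4))  # False
print(is_valid_index(2, "abc"))            # True
print(is_valid_index(3, "abc"))            # False
```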

### Pitfalls

#### Non-transitive operators

We can use `a == b == c` to check if three variables are equal, but the same trick won't work for non-transitive operators like `!=`.

In [30]:
a = c = 1
b = 2

if a != b != c:
    print("a, b, and c all different:", a, b, c)

a, b, and c all different: 1 2 1


The problem is that the check is `a != b and b != c`, which checks that `b` is different from both `a` and `c`, but says nothing about whether `a` is different from `c`.

This is because `!=` is a non-transitive operator, i.e., knowing how `a` relates to `b` and knowing how `b` relates to `c` doesn't tell you anything about how `a` relates to `c`.
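
If what you actually want is to check that all the values are pairwise different, one option is to compare the number of values with the number of unique values. This is a sketch, not the only way to do it; `all_different` is a hypothetical helper and it assumes every value is hashable.

```python
def all_different(*values):
    # Hypothetical helper; assumes every value is hashable so it can go in a set.
    return len(set(values)) == len(values)


a = c = 1
b = 2

print(a != b != c)             # True, even though a == c
print(all_different(a, b, c))  # False
print(all_different(1, 2, 3))  # True
```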

#### Non-constant expressions or side-effects

Chaining comparisons like `a < b < c` evaluates `b` only once.

If `b` is an expression with side effects, or one whose value can change between evaluations, then the chained form and the unfolded form are not equivalent. Check this example, in which the element in the middle gets evaluated only once:

In [10]:
def f():
    print("hey")
    return 3

if 1 < f() < 5:
    print("done")

hey
done


Just to corroborate, the unfolded version evaluates `f()` twice:

In [11]:
if 1 < f() and f() < 5:
    print("done")

hey
hey
done


This snippet shows that an expression like `1 < f() < 0` can actually evaluate to `True` when it's unfolded:

In [16]:
l = [-2, 2]

def f():
    global l
    l = l[::-1]  # Reverse the list
    return l[0]

# evaluated once f() = 2
# if 1 < f() < 0:
#     print("ehh")  # Never gets printed

# evaluated twice: first time f() = 2, second time f() = -2
if 1 < f() and f() < 0:
    print("ehh 2")  # gets printed

ehh 2


#### Ugly chains

This feature looks neat, but chains where the operators are not aligned look very ugly. These chains look good:

In [None]:
a == b == c
a < b <= c
a <= b < c

but these chains look really ugly:

In [None]:
a < b > c   # it's better to use b > max(a, c), it's more readable and easier to understand
a <= b > c
a < b >= c

Now there are some other chains that are just confusing:

In [None]:
lst = []
a < b is True
a == b in lst
a in lst is True

In Python, `is`, `is not`, `in` and `not in` are comparison operators, so you can also chain them. But this creates weird situations like

In [None]:
a = 3
lst = [3, 5]
if a in lst == True:
    print("is True")
else:
    print("is False")

is False


Here is a breakdown of what this does:
- `a in lst == True` is equivalent to `a in lst and lst == True`
- `a in lst` is `True`, but
- `lst == True` is `False`, so
- `a in lst == True` unfolds to `True and False`, which is `False`
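
The sketch below contrasts the accidental chain with two ways of avoiding it. The variable names are illustrative only.

```python
a = 3
lst = [3, 5]

chained = a in lst == True    # a chain: (a in lst) and (lst == True)
grouped = (a in lst) == True  # parentheses break the chain
idiomatic = a in lst          # comparing with True is unnecessary anyway

print(chained)    # False
print(grouped)    # True
print(idiomatic)  # True
```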

## Truthy, Falsy, and bool

### "Truthy" and "Falsy"

Any object can be tested for truth value, for use in an `if` or `while` condition or as an operand of the Boolean operations `and`, `or`, and `not`.

In [5]:
5 > 3

True

The next step is using an object that is not a Boolean value, e.g.:

In [6]:
lst = [1, 2, 3]
if lst:
    print(lst)

[1, 2, 3]


How can we know if an object is truthy or falsy? The answer is by using the built-in `bool` function.

In [7]:
bool(lst)

True

A value of a given type is Falsy when it is "empty" or "without any useful value". Examples of Falsy values are: empty list, empty string, empty tuple, empty set, empty dictionary, the number 0, the boolean value `False`, and `None`. The general rules are:
- by default, any object is Truthy
- an object is Falsy if calling `len` on it returns `0`
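
As a quick check of the rules above, the sketch below runs `bool` over a handful of values. The two lists are illustrative and not exhaustive.

```python
falsy_values = [[], "", (), set(), {}, 0, 0.0, False, None]
truthy_values = [[0], " ", (None,), {0}, {"a": 0}, -1, 0.1, True, object()]

any_falsy_is_true = any(bool(value) for value in falsy_values)
all_truthy_are_true = all(bool(value) for value in truthy_values)

print(any_falsy_is_true)    # False
print(all_truthy_are_true)  # True
```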

### The `__bool__` dunder method

An object has a Falsy value if it defines a `__bool__` method that returns `False`.

`__bool__` is a dunder method that you implement in your class to tell Python whether its instances are considered "truthy" or "falsy".

In [None]:
class A:
    def __bool__(self):
        return False
    
a = A()
if not a:  # `a` is Falsy because `__bool__` returns False
    print("Go Away!")

Go Away!


When given an arbitrary Python object that needs to be tested for a truth value, Python first tries to call `bool` on it, in an attempt to use its `__bool__` method. If the object does not implement a `__bool__` method, then Python tries to call `len` on it. Finally, if that also fails, Python defaults to giving a Truthy value to the object.
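
The fallback order can be sketched with two small classes. Both `Sized` and `Plain` are hypothetical names made up for this illustration.

```python
class Sized:
    # Hypothetical class that defines __len__ but not __bool__.
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)


class Plain:
    # Hypothetical class with neither __bool__ nor __len__.
    pass


print(bool(Sized([])))      # False, because len() returns 0
print(bool(Sized([1, 2])))  # True, because len() returns 2
print(bool(Plain()))        # True, the default
```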

### Remarks

#### A note about containers with falsy objects

Things like a list that only contains zeroes or a dictionary composed of zeroes and empty lists are not Falsy, because the containers themselves are no longer empty:

In [22]:
# These are false
print(bool([]))
print(bool({}))
print(bool(0))

# These are true
print(bool([0]))
print(bool({0: []}))

False
False
False
True
True


#### A note about checking for `None`

Imagine someone implemented the following function to return the integer square root of a number, returning `None` for negative inputs (because negative numbers don't have square roots). 

When you use this function, you know it returns `None` if the computation fails, so you might be tempted to use it like this:

In [None]:
import math

def int_square_root(n):
    if n < 0:
        return None
    return math.floor(math.sqrt(n))

n = int(input("Enter a number: "))
int_sqrt = int_square_root(n)

print("debug int_sqrt value: ", int_sqrt)

if not int_sqrt:
    print("Negative numbers do not have an integer square root.")
else:
    print(int_sqrt)

debug int_sqrt value:  0
0


The problem is that `int_square_root` returned a meaningful value, `0`, but that value is still Falsy.

So when you want to check if a function returned `None`, do not rely on the Truthy/Falsy value. Instead, check explicitly whether the returned value is `None`.

In [None]:
returned = ""

# Use
if returned is None:
    pass

if returned is not None:
    pass

# Avoid
if not returned:
    pass

if returned:
    pass
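
Applied to the square root example, the explicit check could look like the sketch below. The `describe` helper is hypothetical and replaces the `input()` call so the snippet runs unattended.

```python
import math


def int_square_root(n):
    if n < 0:
        return None
    return math.floor(math.sqrt(n))


def describe(n):
    # Hypothetical helper: checks for None explicitly instead of relying on truthiness.
    result = int_square_root(n)
    if result is None:
        return "Negative numbers do not have an integer square root."
    return str(result)


print(describe(0))   # 0
print(describe(17))  # 4
print(describe(-1))  # Negative numbers do not have an integer square root.
```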

## Boolean short-circuiting

### Return values of the `and` and `or` operators

`x or y` returns `x` if `x` is truthy, otherwise it returns `y`. In other words, `x or y` gives the same result as the conditional expression `y if not x else x`, i.e. `(x or y) == (y if not x else x)`.

In [19]:
if 3 or 5:
    print("Yeah")
else:
    print("Nope")

Yeah


Now look at the program below and see what it prints:

In [20]:
print(3 or 5)

3


A similar thing happens with `and`. `x and y` returns `x` if `x` is falsy, otherwise it returns `y`. In other words, `(x and y) == (x if not x else y)`.

In [24]:
print(False and True)
print(True and 0)

False
0
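
The four combinations can be put side by side in a short sketch. The operand values are arbitrary examples.

```python
results = [
    [] or "fallback",    # left is falsy, so the right operand is returned
    "left" or "right",   # left is truthy, so the left operand is returned
    [] and "right",      # left is falsy, so the left operand is returned
    "left" and "right",  # left is truthy, so the right operand is returned
]

print(results)  # ['fallback', 'left', [], 'right']
```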


### Short-circuiting

This is what short-circuiting is: not evaluating the whole expression (stopping short of evaluating it) if we already have enough information to determine the result.

#### or

##### False or y

`or` evaluates to a truthy value if any of its operands is truthy. If the left operand to `or` is `False`, the `or` operator has to look at its right operand in order to determine the result.

In [29]:
y = 5  # truthy value
if False or y:
    print("Got in!, y = ", y)
else:
    print("Didn't get in...")

Got in!, y =  5


In [28]:
y = []  # falsy value
if False or y:
    print("Got in!, y = ", y)
else:
    print("Didn't get in second...")

Didn't get in second...


##### True or y

On the other hand, if the left operand to `or` is `True`, we do not need to take a look at `y` because the result will be `True`.

Let's create a simple function that returns its argument unchanged but produces the side effect of printing something on the screen. We can then use it to see what Python evaluates when trying to determine the value of `x or y`:

In [4]:
def print_and_return(x):
    print(f"Inside `print_and_return` with x = {x}")
    return x

print(print_and_return(False) or print_and_return(3))
print(print_and_return(True) or print_and_return(3))

Inside `print_and_return` with x = False
Inside `print_and_return` with x = 3
3
Inside `print_and_return` with x = True
True


Notice that, in the second example, `print_and_return` only printed once because Python never reached `print_and_return(3)`.

##### Short-circuiting of `or` expressions 

Now we tie everything together. If the left operand to `or` is `False` or falsy, we know that `or` has to look at its right operand and will, therefore, return the value of its right operand after evaluating it. On the other hand, if the left operand is `True` or truthy, `or` will return the value of the left operand without even evaluating the right operand.

#### and

##### False and y

`and` gives `True` if both operands are `True`. Therefore, if we have an expression like

In [None]:
y = []  # falsy value
val = False and y
print(val)

False


Do we need to know what `y` is in order to figure out what `val` is? No, we do not, because regardless of whether `y` is `True` or `False`, `val` is always `False`:

In [11]:
print(False and True)
print(False and False)

False
False


If we take the `False and y` expression from this example and compare it with the `if` expression we wrote earlier, which was

    `(x and y) == (x if not x else y)`

we see that, in this case, `x` was substituted by `False`, and, therefore, we have
    
    `(False and y) == (False if not False else y)`

Now, the condition inside that `if` expression reads

    `not False`

which we know evaluates to `True`, meaning that the `if` expression never returns `y`:

In [35]:
def print_and_return(x):
    print(f"Inside `print_and_return` with x = {x}")
    return x

print(print_and_return([]) and print_and_return(True))  # [] is falsy
print(print_and_return(0) and print_and_return(True))  # 0 is falsy
print(print_and_return({}) and print_and_return(True))  # {} is falsy
print(print_and_return(0) and print_and_return(0))  # both are falsy, but only the left matters

Inside `print_and_return` with x = []
[]
Inside `print_and_return` with x = 0
0
Inside `print_and_return` with x = {}
{}
Inside `print_and_return` with x = 0
0


##### True and y

If we evaluate `True and y`, we figure out that the result of such an expression is always the value of `y`, because the left operand being `True`, or any other truthy value, doesn't give `and` enough information.
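
A small sketch of this case, using a hypothetical `record` helper that remembers which operands were evaluated:

```python
evaluated = []


def record(x):
    # Hypothetical helper: remembers each operand that gets evaluated.
    evaluated.append(x)
    return x


result = record(True) and record([])

print(result)     # []
print(evaluated)  # [True, []]
```

Both operands were evaluated, and the value of the expression is the value of the right operand.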

##### Short-circuiting of `and` expressions 

To tie everything together: if the left operand to `and` is `False` or falsy, we know the expression returns the value of the left operand regardless of the right operand, and therefore we do not even evaluate the right operand. On the other hand, if the left operand to `and` is `True` or truthy, then `and` will evaluate the right operand and return its value.

### Short-circuiting in plain English

Instead of memorising rules about what sides get evaluated when, just remember that both `and` and `or` will evaluate as many operands as needed to determine the overall Boolean result, and will then return the value of the last side that they evaluated.

As an immediate conclusion, the left operand is always evaluated, as you might imagine.


### `all` and `any`

The built-in functions `all` and `any` also short-circuit, as they are simple extensions of the behaviours provided by the `and` and `or` operators.

`all` wants to make sure that all the values of its argument are truthy, so as soon as it finds a falsy value, it knows it's game over. The docs say `all` is equivalent to this:

In [22]:
def custom_all(iterable):
    for element in iterable:
        if not element:
            return False
    return True

my_list = [True, True, True, True]
my_list_2 = [True, True, False, True]

print(all(my_list))  # True
print(custom_all(my_list))  # True

print(all(my_list_2))  # False
print(custom_all(my_list_2))  # False

True
True
False
False


Similarly, `any` is going to do its best to look for some value that is truthy. Therefore, as soon as it finds one, `any` knows it has achieved its goal. It behaves similarly to this:

In [23]:
def custom_any(iterable):
    for element in iterable:
        if element:
            return True
    return False

my_list = [False, False, False, False]
my_list_2 = [False, False, True, False]

print(any(my_list))  # False
print(custom_any(my_list))  # False

print(any(my_list_2))  # True
print(custom_any(my_list_2))  # True

False
False
True
True
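
The examples above show the results but not the short-circuiting itself. One way to observe it, sketched here with a hypothetical `tracked` generator, is to record how many values `any` and `all` actually consume.

```python
consumed = []


def tracked(values):
    # Hypothetical generator that records every value it hands out.
    for value in values:
        consumed.append(value)
        yield value


any_result = any(tracked([0, 0, 7, 0, 0]))
consumed_by_any = list(consumed)
print(any_result, consumed_by_any)  # True [0, 0, 7]

consumed.clear()
all_result = all(tracked([1, 0, 1, 1]))
consumed_by_all = list(consumed)
print(all_result, consumed_by_all)  # False [1, 0]
```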


### Short-circuiting in chained comparisons

Comparison operators can be chained arbitrarily, and a chain is almost equivalent to a series of comparisons joined with `and`, except that each subexpression is evaluated at most once, to prevent wasting resources. Therefore, because there is an `and` in the background, chained comparisons also short-circuit.

In [36]:
def print_and_return(x):
    print(f"Inside `print_and_return` with x = {x}")
    return x

# 1 > 2 is False, so there is no need to evaluate the right side.
print_and_return(1) > print_and_return(2) > print_and_return(3)

Inside `print_and_return` with x = 1
Inside `print_and_return` with x = 2


False

### Examples in code

#### Short-circuit to save time

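The snippet below does not run as written: `timeit` executes the statement in its own namespace, so names defined in the notebook, such as `validate`, are not visible unless they are created in `setup` or passed through the `globals` argument. The point it is meant to illustrate still stands: put the cheaper operand first, so that the expensive one is skipped whenever possible.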

In [38]:
import timeit


setup = ""
import re
s = b"a"*1000 + b"*"

validate = False
print(timeit.timeit("validate and not re.fullmatch(b'[A-Za-z0-9+/]*={0,2}', s)", setup))

NameError: name 'validate' is not defined
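
One way to make the measurement run is to hand the needed names to `timeit` through its `globals` argument. The sketch below does that and compares both operand orders. It assumes the intended quantifier in the pattern is `{0,2}`; the names `namespace`, `pattern`, `cheap_first`, and `expensive_first`, and the value of `number`, are choices made for this illustration. Timings depend on the machine, so no output is shown.

```python
import re
import timeit

s = b"a" * 1000 + b"*"
pattern = rb"[A-Za-z0-9+/]*={0,2}"  # assumed pattern, see the note above
validate = False

# Names that the timed statements are allowed to see.
namespace = {"re": re, "s": s, "pattern": pattern, "validate": validate}

cheap_first = timeit.timeit(
    "validate and not re.fullmatch(pattern, s)",  # the regex is never run
    globals=namespace,
    number=1_000,
)
expensive_first = timeit.timeit(
    "not re.fullmatch(pattern, s) and validate",  # the regex runs every time
    globals=namespace,
    number=1_000,
)

print(f"cheap operand first:     {cheap_first:.6f}s")
print(f"expensive operand first: {expensive_first:.6f}s")
```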

#### Short-circuit to flatten `if` statements

In [39]:
if validate:
    print("Validating...")
    if re.fullmatch(b'[A-Za-z0-9+/]*={0,2}', s):
        print("Valid!")
    else:
        print("Not valid!")

If you only care about the innermost branch, you can flatten nested `if` statements into a single `if` statement by joining the conditions with `and`:

In [40]:
if validate and re.fullmatch(b'[A-Za-z0-9+/]*={0,2}', s):
    print("Valid!")

##### Checking preconditions before expression

In [None]:
def set_terminator(self, term):
    if isinstance(term, str) and self.use_encoding:
        term = bytes(term, self.encoding)
    elif isinstance(term, int) and term < 0:
        raise ValueError("Terminator must be a non-negative integer")
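
The same pattern in a self-contained form: the left operand checks a precondition that makes the right operand safe to evaluate. `first_is_positive` is a hypothetical helper written for this sketch.

```python
def first_is_positive(numbers):
    # Hypothetical helper: the emptiness check guards the indexing on the right.
    return bool(numbers) and numbers[0] > 0


print(first_is_positive([3, -1]))  # True
print(first_is_positive([-3, 1]))  # False
print(first_is_positive([]))       # False, and numbers[0] is never evaluated
```

Without short-circuiting, the last call would raise an `IndexError`.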


#### Define default values

In [42]:
greet = input("Type your name >> ") or "Guest"
print(f"Hello, {greet}!")

Hello, Guest!
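
A word of caution about this idiom: `or` replaces any falsy value, not just missing ones. The sketch below uses hypothetical helpers to show both the idiom and a case where an explicit `None` check is the safer choice; the default of `3` is arbitrary.

```python
def greeting(name):
    # Hypothetical helper standing in for the input() call above.
    return f"Hello, {name or 'Guest'}!"


def retries_or_default(retries):
    # Caveat: 0 is falsy, so a legitimate 0 is replaced by the default.
    return retries or 3


def retries_or_default_safe(retries=None):
    # Checking for None explicitly keeps a legitimate 0.
    return 3 if retries is None else retries


print(greeting(""))                # Hello, Guest!
print(greeting("Ana"))             # Hello, Ana!
print(retries_or_default(0))       # 3
print(retries_or_default_safe(0))  # 0
```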


#### Find witnesses in a sequence of items

In [44]:
items = [14, 16, 18, 20, 35, 41, 100]
any_found = False

for item in items:
    any_found = item % 2
    if any_found:
        print(f"Found an odd number: {item}")
        break

Found an odd number: 35


Look at this neat simplified version:

In [33]:
items = [14, 16, 18, 20, 35, 41, 100]
is_odd = lambda x: x % 2

if any(is_odd(witness := item) for item in items):
    print(f"Found an odd number: {witness}")

Found an odd number: 35


## set and frozenset

### (Mathematical) sets

A set is simply a collection of unique items where order does not matter. Think of a set as a shopping cart.

### No ordering

If you go shopping, the order of the items you put in your shopping cart does not matter. The only thing that matters is the items that are in the cart.

You could say that the groceries that you bought form a set.

Both in maths and in Python, we use `{}` to denote a set. 

In [9]:
groceries = {"milk", "bread", "cheese", "milk"}
print(groceries)
print(type(groceries))
print(type(groceries).__name__)

{'bread', 'milk', 'cheese'}
<class 'set'>
set


To make sure that the order really does not matter in sets, we can compare this set with other sets containing the same elements but in a different order.

In [45]:
groceries = {"milk", "bread", "cheese", "milk"}
print(groceries == {"cheese", "bread", "milk"})
print(groceries == {"bread", "milk", "cheese"})

True
True


### Uniqueness

Another key property of (mathematical) sets is that there are no duplicate elements.

Suppose someone told you to buy cheese, and when you get back home, they ask you: "Did you buy cheese?" This is a yes/no question: either you bought cheese or you didn't.

For sets, the same thing happens: the element is either in the set or it's not. We don't care about element count. We don't even consider it.

In [12]:
groceries = {"apple", "banana", "apple", "milk", "milk", "milk"}
print(groceries)

{'milk', 'apple', 'banana'}


### (Common) Operations on sets

#### Creation

There are three main ways to create a set.

##### Explicit {} notation

Using the `{}` notation, you write out the elements inside the set in a comma-separated list.

In [46]:
numbers = {1, 2, 3}
letters = {"a", "b", "c"}
print(numbers)
print(letters)

{1, 2, 3}
{'b', 'c', 'a'}


By the way, you cannot use `{}` to create an empty set! `{}` by itself will create an empty dictionary. To create an empty set, you need the next method.

##### Calling set on an iterable

You can call the built-in function `set()` on any iterable, such as a range, a string, or a list, to create a set out of the elements of that iterable.

In [1]:
print(set(range(3)))
print(set([73, "water", 42]))

{0, 1, 2}
{73, 'water', 42}


Calling `set` on a string produces a set with the characters of the string, not a set containing the whole string.

In [47]:
place = "mississippi"
print(place)
print(set(place))

mississippi
{'m', 's', 'i', 'p'}


Calling `set` by itself produces an empty set.

In [24]:
my_set = set()
print(my_set)

set()
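
A quick sketch confirming the difference between empty braces and `set()`; the variable names are illustrative.

```python
empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(len(empty_set))               # 0
```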


##### Set comprehensions

Using `{}`, one can also write what's called a set comprehension. Very similar to list comprehensions, but for sets.

In [49]:
veggies = ["broccoli", "carrot", "spinach", "lettuce", "pepper", "tomato", "carrot"]
veggies_set = {veggie for veggie in veggies if "c" in veggie}
print(veggies_set)

{'lettuce', 'carrot', 'spinach', 'broccoli'}


Here is a second example: a set comprehension with two nested `for` loops.

In [26]:
veggies = ["broccoli", "carrot", "spinach", "lettuce", "pepper", "tomato"]
print({char for veggie in veggies for char in veggie})

{'p', 't', 'o', 'r', 'n', 'c', 'a', 'i', 'm', 'b', 'e', 's', 'l', 'h', 'u'}


#### Operations on a single set

Many common operations are done on a single set, namely:

- membership testing

In [1]:
groceries = {"broccoli", "carrot", "spinach", "lettuce", "pepper", "tomato"}
print("broccoli" in groceries)  # True
print("cucumber" in groceries)  # False

True
False


- computing the size of a set

In [58]:
groceries = {"broccoli", "carrot", "spinach", "lettuce", "pepper", "tomato"}
len(groceries)  # 6

6

- popping an arbitrary element from a set

In [55]:
groceries = {"broccoli", "carrot", "spinach", "lettuce", "pepper", "tomato"}
print(groceries.pop())
print(groceries)

tomato
{'broccoli', 'lettuce', 'carrot', 'spinach', 'pepper'}


- adding an element to a set

In [7]:
groceries = {"broccoli", "carrot", "spinach", "lettuce", "pepper", "tomato"}
groceries.add("zucchini")
print(groceries)

{'pepper', 'lettuce', 'carrot', 'broccoli', 'zucchini', 'spinach', 'tomato'}


#### Iteration

Sets are similar to lists with unique elements, but lists are ordered: a list can be traversed from the beginning to the end, and a list can be indexed.

Sets can also be iterated over (in an order you can't rely on):

In [59]:
groceries = {"broccoli", "carrot", "spinach", "lettuce", "pepper", "tomato"}
for item in groceries:
    print(item)

tomato
broccoli
lettuce
carrot
spinach
pepper


Sets cannot be indexed directly:

In [60]:
groceries = {"broccoli", "carrot", "spinach", "lettuce", "pepper", "tomato"}
print(groceries[0])  # This will raise an error

TypeError: 'set' object is not subscriptable

#### Computation with multiple sets

When you have multiple sets, you may need other sorts of operations. Here are some common ones:

- check for overlap between two sets

In [61]:
groceries = {"milk", "bread", "cheese"}
treats = {"cake", "ice cream", "cookies", "cheese"}

print(groceries & treats)

{'cheese'}


- join the two sets. Here the pipe is similar to the usage of | to merge dictionaries.

In [62]:
groceries = {"milk", "bread", "cheese"}
treats = {"cake", "ice cream", "cookies", "cheese"}

print(groceries | treats)

{'cookies', 'bread', 'cake', 'ice cream', 'cheese', 'milk'}


- find differences between two sets (what's on the left set but not on the right set)

In [63]:
groceries = {"milk", "bread", "cheese"}
treats = {"cake", "ice cream", "cookies", "cheese"}

print(groceries - treats)

{'milk', 'bread'}


- check for containment using <, <=, >= and >

In [65]:
groceries = {"milk", "bread", "cheese"}
treats = {"cake", "ice cream", "cookies", "cheese"}

print({"cheese", "milk"} < groceries)
print(groceries < groceries)

print({"cheese", "milk"} <= groceries)
print(groceries <= groceries)

print(treats > {"cake"})
print(treats >= {"cake", "cheese"})
print(treats > {"anything"})

True
False
True
True
True
True
False
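
Each of these operators also has a named method counterpart on sets. The sketch below lists the correspondences; `"eggs"` is an extra item made up for the last line.

```python
groceries = {"milk", "bread", "cheese"}
treats = {"cake", "ice cream", "cookies", "cheese"}

print(groceries.intersection(treats) == groceries & treats)  # True
print(groceries.union(treats) == groceries | treats)         # True
print(groceries.difference(treats) == groceries - treats)    # True
print({"cheese", "milk"}.issubset(groceries))                # True, same as <=
print(treats.issuperset({"cake", "cheese"}))                 # True, same as >=

# Unlike the operators, the methods accept any iterable, not just sets.
with_eggs = groceries.union(["eggs"])
print(with_eggs == {"milk", "bread", "cheese", "eggs"})      # True
```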


### Differences between `set` and `frozenset`

#### Creation

While you can create a set with the built-in `set()` function or through the `{}` notation, `frozenset` can only be created with the built-in `frozenset()` function.

`frozenset` can be created out of other sets or out of any iterables.

When printed, a `frozenset` indicates that it is frozen.

In [29]:
groceries = {"milk", "bread", "cheese"}
print(frozenset(groceries))
print(frozenset([73, "water", 42]))

frozenset({'bread', 'cheese', 'milk'})
frozenset({'water', 73, 42})


#### Mutability

Sets are mutable. They can be changed after they are created: you can add or remove elements.

If you need to create an object that behaves like a set but is immutable, you can use `frozenset`.

A `frozenset` behaves like a `set`, except that it cannot be changed after it is created.

In [66]:
groceries_ = frozenset({"milk", "bread", "cheese"})
# groceries_.add("zucchini")  # This will raise an error
groceries_.pop()

AttributeError: 'frozenset' object has no attribute 'pop'

There's a very similar pair of built-in types that have this same dichotomy: lists and tuples.

Lists are mutable, and tuples are immutable.

In [67]:
my_list = ["apple", "banana", "apple", "milk", "milk", "milk"]
print(my_list[0])
print(my_list.pop())
my_list.append("zucchini")
print(my_list)

my_tuple = ("apple", "banana", "apple", "milk", "milk", "milk")
my_tuple.pop()  # This will raise an error

apple
milk
['apple', 'banana', 'apple', 'milk', 'milk', 'zucchini']


AttributeError: 'tuple' object has no attribute 'pop'

### To be (hashable) or not to be

An object that is hashable is an object for which a hash can be computed.

A hash is an integer that the built-in function `hash()` computes to help with fast operations on dictionaries, e.g. key lookups.

The built-in function `hash` dictates what can and cannot be a dictionary key: if an object is hashable, it can be a key; if not, it can't.

Lists are mutable and unhashable, while tuples are immutable and hashable.

In [None]:
dictionary = {}
# dictionary[[1, 2, 3]] = 73  # This will raise TypeError: unhashable type: 'list'
dictionary[(1, 2, 3)] = 73
print(dictionary)

{(1, 2, 3): 73}

Something similar occurs with sets and frozensets.

In [6]:
dictionary = {}
# dictionary[{1, 2, 3}] = 73  # This will raise an error
dictionary[frozenset({1, 2, 3})] = 73
print(dictionary)

{frozenset({1, 2, 3}): 73}
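
The same restriction applies to the elements of a set, so a set of sets has to be built out of frozensets. A minimal sketch, with made-up names as data:

```python
teams = set()
teams.add(frozenset({"Ana", "Bo"}))
teams.add(frozenset({"Bo", "Ana"}))  # same elements, so no new entry is added
teams.add(frozenset({"Cy"}))
print(len(teams))  # 2

try:
    teams.add({"Dee"})  # a plain set is unhashable
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```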


### What are sets used for?

Sets are useful when the problem at hand benefits from the properties of mathematical sets:
- membership testing
- uniqueness

Both operations are much faster with sets than with lists.
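
To get a feeling for the difference, you could time a membership test on a list and on a set with the same contents. This is a rough sketch: the sizes and repetition counts are arbitrary, and the actual timings depend on your machine, so no output is shown.

```python
import timeit

n = 10_000
as_list = list(range(n))
as_set = set(as_list)
missing = -1  # worst case for the list: every element gets compared

list_time = timeit.timeit(lambda: missing in as_list, number=200)
set_time = timeit.timeit(lambda: missing in as_set, number=200)

print(f"list: {list_time:.6f}s")
print(f"set:  {set_time:.6f}s")
```

The list has to be scanned element by element, while the set can use hashing to answer directly.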

### Examples

Never do this; it is an anti-pattern:

In [51]:
seen_actions = set()
action = "test"

if action not in seen_actions:
    seen_actions.add(action)    

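Checking whether an element is in a set and adding it unconditionally cost almost the same, so checking first doubles your work! Adding an element that is already present leaves the set unchanged, so the check is unnecessary.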
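
The simpler alternative is sketched below; the list of actions is made up for the example.

```python
seen_actions = set()

for action in ["test", "build", "test"]:
    seen_actions.add(action)  # no membership check needed

print(sorted(seen_actions))  # ['build', 'test']
```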

### Conclusion

- Use a `set` or a `frozenset` when you are dealing with collections where what matters is fast membership checking.

## List comprehensions 101

### What is a list comprehension?

A list comprehension is a Python expression that builds a list.



### A loop that builds a list

Consider the next loop. It builds a list called `squares` which contains the first ten square numbers:

In [53]:
squares = []
for num in range(10):
    squares.append(num**2)
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


The key idea behind list comprehensions is that many lists can be built out of other, simpler iterables (lists, tuples, strings) by transforming the data that we get from those iterables. In such cases, we want to focus on the data transformation that we are doing.

In [54]:
squares_comprehension = [num**2 for num in range(10)]
print(squares_comprehension)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


In this case we are dropping:
- the initialisation of the list (`squares = []`)
- the call to append (`squares.append(...)`)

### Exercises: practice rewriting `for` loops as list comprehensions

1. Compute the first ten square numbers:

In [10]:
squares = []
for n in range(10):
    squares.append(n ** 2)
print(squares)

squares_comp = [n**2 for n in range(10)]
print(squares_comp)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


2. Uppercasing a series of words:

In [11]:
fruits = "banana pear peach strawberry tomato".split()
upper_words = []
for fruit in fruits:
    upper_words.append(fruit.upper())
print(upper_words)

upper_words_comp = [fruit.upper() for fruit in fruits]
print(upper_words_comp)

['BANANA', 'PEAR', 'PEACH', 'STRAWBERRY', 'TOMATO']
['BANANA', 'PEAR', 'PEACH', 'STRAWBERRY', 'TOMATO']


3. Find the length of each word in a sentence:

In [12]:
words = "the quick brown fox jumps over the lazy dog".split()
length_words = [len(word) for word in words]
print(length_words)

[3, 5, 5, 3, 5, 4, 3, 4, 3]


### Filtering data in a list comprehension

List comprehensions allow you to filter data so that the new list only transforms some of the data that comes from the source iterable.

In [8]:
square_list = []
for number in range(1, 10):
    if (number % 3 == 0) or (number % 5 == 0):
        square_list.append(number**2)
print(square_list)

square_list_comprehension = [
    number ** 2 
    for number in range(1, 10) 
    if (number % 3 == 0) or (number % 5 == 0)
]
print(square_list_comprehension)

[9, 25, 36, 81]
[9, 25, 36, 81]


### More exercises

1. Squaring

In [13]:
fizz_buzz_squares = []
for n in range(10):
    if (n % 3 == 0) or (n % 5 == 0):
        fizz_buzz_squares.append(n ** 2)
print(fizz_buzz_squares)

# List comprehension
fizz_buzz_comp = [
    n ** 2
    for n in range(10)
    if (n % 3 == 0) or (n % 5 == 0)
]
print(fizz_buzz_comp)

[0, 9, 25, 36, 81]
[0, 9, 25, 36, 81]


2. Uppercasing words

In [69]:
fruits = "Banana pear PEACH strawberry tomato".split()
upper_cased = []
for fruit in fruits:
    if (fruit.islower()):
        upper_cased.append(fruit.upper())
print(upper_cased)

upper_cased_comp = [
    fruit.upper()
    for fruit in fruits
    if fruit.islower()
]
print(upper_cased_comp)

['PEAR', 'STRAWBERRY', 'TOMATO']
['PEAR', 'STRAWBERRY', 'TOMATO']


3. Finding length of words

In [70]:
words = "the quick brown fox jumps over the lazy dog".split()
lengths = []
for word in words:
    if "o" in word:
        lengths.append(len(word))
print(lengths)

lengths_comp = [
    len(word)
    for word in words
    if "o" in word
]
print(lengths_comp)


[5, 3, 4, 3]
[5, 3, 4, 3]


### Full anatomy of a list comprehension

The anatomy of a list comprehension is dictated by 3 components enclosed in square brackets:
1. a data transformation
2. a data source
3. a data filter (optional)

In [None]:
ns = []
my_list = [
    n ** 2       # data transformation
    for n in ns  # data source
    if n == 0    # data filter
]

There's no restriction on the number of data sources or data filters in a list comprehension. The following (deliberately extreme) nested loop is a valid candidate:

In [None]:
my_list = []
for it2 in it1:
    for _ in it3:
        if p1(it2):
            for it4 in it2:
                for v1 in it4:
                    if p2(v1):
                        if p3(it4):
                            for it6 in it5:
                                for v2, it7 in it6:
                                    for v3, it8, it9 in it7:
                                        if p2(v1):
                                            for v4, v5 in zip(it8, it9):
                                                my_list.append(func(v1, v2, v3, v4, v5))

This is the equivalent list comprehension:

In [None]:
my_list = [
    func(v1, v2, v3, v4, v5)
    for it2 in it1
    for _ in it3
    if p1(it2)
    for it4 in it2
    for v1 in it4
    if p2(v1)
    if p3(it4)
    for it6 in it5
    for v2, it7 in it6
    for v3, it8, it9 in it7
    if p2(v1)
    for v4, v5 in zip(it8, it9)
]

List comprehensions should be kept simple. For most people, that means just one data source and one data filter (one `for` and one `if`), or two data sources and no data filters (two `for` and zero `if`).
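As a sketch of the "two data sources, no filter" case, here is a comprehension that pairs every rank with every suit. The `suits` and `ranks` lists are made up for this example. The `for` clauses appear in the same order as they would in the equivalent nested loops, so the first `for` is the outer loop:

```python
suits = ["hearts", "spades"]
ranks = ["A", "K", "Q"]

# Outer loop over suits, inner loop over ranks.
cards = [f"{rank} of {suit}" for suit in suits for rank in ranks]
print(cards)
```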

### Advantages of list comprehensions

The main advantages over the equivalent nested loops are:
- speed
- conciseness
- purity
- readability

Keep in mind that **the main advantage is readability**.

### Bad use cases

#### Initialising another list

In [73]:
squares = []
print([squares.append(num ** 2) for num in range(10)])
print(squares)

[None, None, None, None, None, None, None, None, None, None]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]


Here we are creating an empty list and appending to it from inside another list, which totally misses the point.

We also created a list that is not assigned to any variable, so we are creating a list and wasting it.

In [18]:
squares = []
some_list = [squares.append(num ** 2) for num in range(10)]
print(some_list)

[None, None, None, None, None, None, None, None, None, None]


We get a bunch of `None` values because that is the return value of the method `append`.

#### Side effects

In [74]:
numbers = range(10)
[print(value) for value in numbers]

0
1
2
3
4
5
6
7
8
9


[None, None, None, None, None, None, None, None, None, None]

Again, the comprehension runs its side effects even if you never assign the resulting list to a variable, and that list is wasted. A normal `for` loop is better for this:

In [21]:
numbers = range(10)
for value in numbers:
    print(value)

0
1
2
3
4
5
6
7
8
9


#### Replacing built-ins

In [23]:
numbers = range(10)
print(numbers)
my_list = [value for value in numbers]
print(my_list)

range(0, 10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


This looks like a perfectly good list comprehension, but the code is equivalent to calling `list`:

In [24]:
my_other_list = list(numbers)
print(my_other_list)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


Another built-in you may end up reinventing is `reversed`.
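A sketch of what that reinvention might look like (the index arithmetic is just one of several ways people end up writing it), next to the built-in:

```python
numbers = range(10)

# Reinventing the wheel: walk the negative indices -1, -2, ..., -len(numbers).
reinvented = [numbers[-index] for index in range(1, len(numbers) + 1)]

# The built-in says what it does.
with_builtin = list(reversed(numbers))

print(reinvented)
print(with_builtin)
```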

## Sequence indexing

### Introduction

Sequence indexing means using integers to access the elements of linear sequences.

A very simple example is a string.

To index a specific character we use square brackets. Python is 0-indexed, so the first character is at index 0.

In [3]:
s = "Indexing is easy!"
print(s[0])
print(s[1])

I
n


### Maximum legal index and index errors

Because indices start at 0, the maximum legal index is the length of the sequence minus 1.

In [6]:
s = "Indexing is easy!"
print(len(s))
print(s[16])
print(s[17])  # This will raise an IndexError

17
!


IndexError: string index out of range

### Negative indices

If the last legal index is the length minus 1, then there is an obvious way to access the last element.

In [9]:
s = "Indexing is easy!"
print(s[len(s)-1])

!


However, Python also lets you use negative indices to count from the end of the sequence. Think about writing the sequence to the left of itself:
 
|e |a |s |y |! |I|n|d|e|x|i|n|g| |i|
|- |- |- |- |- |-|-|-|-|-|-|-|-|-|-|
|-5|-4|-3|-2|-1|0|1|2|3|4|5|6|7|8|9|

From this figure we can see that the index -1 refers to the last element, -2 to the second-to-last, and so on.

In [10]:
s = "Indexing is easy!"
print(s[-1])
print(s[-2])

!
y


Another way to look at negative indices is to pretend there's a `len(s)` to their left.

In [13]:
s = "Indexing is easy!"

print(s[len(s)-1])
print(s[-1])

print(s[len(s)-5])
print(s[-5])

!
!
e
e


### Indexing idioms

Now that you have seen the basic syntax for indexing, there are a few indices that you should learn to read immediately for what they are, without having to think about them:

In [None]:
s = "Indexing is easy!"
s[0]  # First element of s
s[1]  # Second element of s
s[-1]  # Last element of s
s[-2]  # Second-to-last element of s

### To index or not to index?

Strings, lists and tuples are indexable with integers. Sets and dictionaries are not.

Be careful with things that you think are like a list, but aren't. These include `enumerate`, `zip`, `map` and other objects. None of these are indexable, and none of them support `len`.

In [None]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

e = enumerate(numbers)
# print(e[3])

z = zip(numbers)
# print(z[3])

m = map(str, numbers)
# print(m[3])
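
A quick way to check that claim without crashing the program is to catch the `TypeError` that indexing raises. This is only a sketch; the `errors` and `pairs` names are made up for the example:

```python
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
lazy_objects = [enumerate(numbers), zip(numbers), map(str, numbers)]

errors = []
for obj in lazy_objects:
    try:
        obj[3]  # None of these objects support indexing...
    except TypeError as error:
        errors.append(type(error).__name__)
print(errors)

# ...but converting to a list first makes indexing (and len) available.
pairs = list(enumerate(numbers))
print(pairs[3], len(pairs))
```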


### Best practices in code

#### A looping pattern with `range`

Because of the way both `range` and `len` work, one can understand that `range(len(s))` will generate all the legal indices for `s`.

In [19]:
s = "Indexing is easy!"
print(list(range(len(s))))
print(s[0])
print(s[16])

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
I
!


This can lead to an anti-pattern for beginners. To exemplify this, suppose we want to write a program to find the unique characters in a string. This is the anti-pattern:

In [23]:
s = "Indexing is easy!"
uniques = []
for index in range(len(s)):
    if s[index] not in uniques:
        uniques.append(s[index])

print(uniques)

['I', 'n', 'd', 'e', 'x', 'i', 'g', ' ', 's', 'a', 'y', '!']


A better solution is to use a set, which is more efficient (although it does not preserve the order of the characters):

In [22]:
s = "Indexing is easy!"
print(set(s))

{'!', 'd', 'x', 'y', 'e', 'a', 'n', 'I', 'i', ' ', 'g', 's'}


If you want to keep the loop, you do not need `range` at all, because you can iterate directly over the characters of the string:

In [24]:
s = "Indexing is easy!"
uniques = []
for letter in s:
    if letter not in uniques:
        uniques.append(letter)

print(uniques)

['I', 'n', 'd', 'e', 'x', 'i', 'g', ' ', 's', 'a', 'y', '!']


If you only care about the indices, then use `range(len(s))`:

In [27]:
s = "Indexing is easy!"
print(list(range(len(s))))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]


When you need to work with both the indices and the values, use `enumerate`:

In [28]:
s = "Indexing is easy!"
print(list(enumerate(s)))

[(0, 'I'), (1, 'n'), (2, 'd'), (3, 'e'), (4, 'x'), (5, 'i'), (6, 'n'), (7, 'g'), (8, ' '), (9, 'i'), (10, 's'), (11, ' '), (12, 'e'), (13, 'a'), (14, 's'), (15, 'y'), (16, '!')]
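
In a loop, each pair produced by `enumerate` is usually unpacked straight away. As a sketch, this collects the positions of a given letter (the `positions` name is just for illustration):

```python
s = "Indexing is easy!"

# Unpack each (index, letter) pair and keep the indices where the letter is "n".
positions = [index for index, letter in enumerate(s) if letter == "n"]
print(positions)
```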


#### Large expressions as indices

When you are dealing with sequences and indices for those sequences, you may end up needing to perform some calculations to compute new indices. For example, you want the middle element of a string and you don't know about `//` yet:

In [35]:
s = "Indexing is so easy!"
print(s[len(s)/2])

TypeError: string indices must be integers, not 'float'

In [34]:
import math
s = "Indexing is so easy!"
print(s[math.floor(len(s)/2)])
print(s[len(s)//2])  # Pro-tip: the operation // is ideal here

s
s


Another alternative is to create a well-named variable to hold the result of the computation:

In [36]:
import math
s = "Indexing is so easy!"
mid_char_idx = math.floor(len(s)/2)
print(s[mid_char_idx])

s


If you have large expressions to compute indices, consider using an intermediate variable with a descriptive name.

#### Unpacking with indexing

You will often find yourself working with small groups of data, such as pairs of things that you keep together in a small list. For example:

In [39]:
names = ["Mary", "Doe"]

def greet(names, formal):
    if formal:
        print(f"Hello Miss {names[1]}")
    else:
        print(f"Hey there {names[0]}")

greet(names, True)
greet(names, False)

Hello Miss Doe
Hey there Mary


You might consider unpacking `names` before reaching the `if` statement:

In [40]:
names = ["Mary", "Doe"]

def greet(names, formal):
    first, last = names
    if formal:
        print(f"Hello Miss {last}")
    else:
        print(f"Hey there {first}")

greet(names, True)
greet(names, False)

Hello Miss Doe
Hey there Mary


This makes the intent of the code much more obvious: just from looking at the first line of the function, we know `names` is supposed to be a pair of names. It also forces your `greet` function to expect exactly a pair of names, so anything else fails loudly:

In [83]:
names = ["Mary", "Doe", "Jane"]

def greet(names, formal):
    first, last = names
    if formal:
        print(f"Hello Miss {last}")
    else:
        print(f"Hey there {first}")

# greet(names, True)
greet("Mary", False)

ValueError: too many values to unpack (expected 2)

## Idiomatic sequence slicing

### Introduction

Slicing is a "more advanced" way of accessing portions of sequences.

### Slicing syntax

Slicing in Python is the act of accessing a sequence of elements that are extracted from successive positions of a larger sequence.

Suppose we are working with a string similar to the one from before and we want to extract "icing" from it:

In [None]:
s = "Slicing is so easy!"
substring = ""

for i in range(2, 7):
    substring += s[i]

print(substring)

icing


When you want to slice a sequence, you need to use brackets `[]` and a colon `:` to separate the start and end points. The key is to figure out what the start and end points are.

In [45]:
s = "Slicing is so easy!"
print(s[2:7])

icing


The start point is the index of the first element that is included in the slice, whereas the end point is the index of the first element that is not included in the slice.
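
A small sketch that checks this rule on the running example. One handy consequence is that the length of the slice is `stop - start` (the `start`, `stop` and `piece` names are made up for the example):

```python
s = "Slicing is so easy!"
start, stop = 2, 7
piece = s[start:stop]

print(piece)
print(s[start], s[stop - 1])       # First and last characters that are included.
print(repr(s[stop]))               # First character that is NOT included.
print(len(piece) == stop - start)  # The slice has stop - start elements.
```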

### What to slice?

Things that you can slice:
- strings
- lists
- tuples
- ranges

In [1]:
print("Hello"[2:4])  # strings

print([1, 2, 3, 4][1:3])  # lists

print((1, 2, 3, 4)[1:3])  # tuples

rg = range(5)  # ranges
sliced = rg[1:3]  # avoid the name `slice`, which would shadow the built-in
print(sliced)  # outputs: range(1, 3)
print(list(sliced))  # outputs: [1, 2]

ll
[2, 3]
(2, 3)
range(1, 3)
[1, 2]


### Slicing from the beginning

Now assume we want to extract the word "Slicing" from the string.

Notice that when we use `range` with one argument, it assumes that the start is `0`.

When slicing we can do a similar thing:

In [52]:
s = "Slicing is so easy!"

print(s[0:7])
print(s[:7])

Slicing
Slicing


It's as if you never told Python where the slice starts, so the slice ends up covering the whole beginning of the string, stopping at the position you indicate.

### Slicing until the end

Similar to omitting the start point, you can omit the end point of the slice.

Therefore, if we do not specify the end point, we extract all the elements from the point specified to the end.

In [57]:
s = "Slicing is so easy!"
print(s[8:len(s)])
print(s[8:])

is so easy!
is so easy!


### Slicing with negative indices

Slicing can also use negative indices, with the same logic as indexing.

In [58]:
s = "Slicing is so easy!"
print(s[-17:-12])

icing


In fact, `-17` and `-12` would also work in the `range` solution, but the slice is preferred.

In [59]:
s = "Slicing is so easy!"
substring = ""
for i in range(-17, -12):
    substring += s[i]
print(substring)

icing


### Slicing and `range`

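If you are looking at a slice `s[start:stop]` and you have no clue what items are going to be picked up by it, try thinking about the slice in terms of `range`: as long as `start` and `stop` are legal indices, the slice picks up the elements at the indices in `range(start, stop)`.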
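
To see which indices a slice really touches, including the cases with negative or omitted end points, the built-in `slice` type has an `indices` method that normalises the end points for a given length. The helper `slice_indices` below is a hypothetical name, written only for this sketch:

```python
s = "Slicing is so easy!"

def slice_indices(seq, start, stop):
    """List the indices that seq[start:stop] would pick up."""
    # slice.indices returns a (start, stop, step) triple adjusted to len(seq).
    return list(range(*slice(start, stop).indices(len(seq))))

print(slice_indices(s, 2, 7))
print(slice_indices(s, -17, -12))  # Same indices, written with negative numbers.
print(slice_indices(s, None, 3))   # Omitted start, as in s[:3].
print(slice_indices(s, 10, 3))     # Start to the right of stop: nothing is picked.
```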

### Idiomatic slicing patterns

Suppose you have a variable `n` that is a positive integer (think of a small integer like `1` or `2`) and suppose `s` is some sequence that supports slicing. Here are the four idiomatic slicing patterns:
- s[n:]
- s[-n:]
- s[:n]
- s[:-n]

Why are these "idiomatic" slicing patterns? Because with a little practice, you stop looking at them as "slice `s` starting at position blah and ending at position blah" and start reading them directly for what they mean.

In [64]:
s = "Slicing is so easy!"
n = 2

print(s[n:])
print(s[-n:])
print(s[:n])
print(s[:-n])

icing is so easy!
y!
Sl
Slicing is so eas


#### s[n:]

If `n` is not negative (so 0 or more), then `s[n:]` means "skip the first `n` elements of `s`"

In [67]:
s = "Slicing is so easy!"

print(s[2:])
print(s[3:])
print(s[4:])

icing is so easy!
cing is so easy!
ing is so easy!


#### s[-n:]

If `n` is positive (so 1 or more), then `s[-n:]` means "the last `n` elements of `s`"

In [68]:
s = "Slicing is so easy!"

print(s[-2:])
print(s[-3:])
print(s[-4:])

y!
sy!
asy!


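Be careful with `n = 0`: because `-0` is the same as `0`, `s[-0:]` is really `s[0:]`, which falls under the first pattern and gives you the whole sequence instead of the last 0 elements.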
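
A short sketch of that edge case. When `n` may be zero, computing the start point explicitly from `len(s)` is one way to get the expected empty result:

```python
s = "Slicing is so easy!"
n = 0

print(repr(s[-n:]))          # -0 == 0, so this is s[0:], the whole string.
print(repr(s[len(s) - n:]))  # The last 0 elements: an empty string.
```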

#### s[:n]

If `n` is not negative (so 0 or more), then `s[:n]` can be read as "the first `n` elements of `s`"

In [69]:
s = "Slicing is so easy!"

print(s[:2])
print(s[:3])
print(s[:4])

Sl
Sli
Slic


#### s[:-n]

If `n` is positive (so 1 or more), then `s[:-n]` means "drop the last `n` elements of `s`"

In [76]:
s = "Slicing is so easy!"

print(s[:-2])
print(s[:-3])
print(s[:-4])

Slicing is so eas
Slicing is so ea
Slicing is so e
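
The "1 or more" restriction matters here as well. With `n = 0` the slice `s[:-n]` becomes `s[:0]`, which drops everything instead of nothing. A sketch of the pitfall and of one possible way around it:

```python
s = "Slicing is so easy!"
n = 0

print(repr(s[:-n]))          # s[:0]: an empty string, not the whole string.
print(repr(s[:len(s) - n]))  # Dropping the last 0 elements keeps everything.
```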



### Empty slices

If you get your start and end points mixed up, you will end up with empty slices, because your start point is to the right of the end point. And because of negative indices, it is not enough to check whether the start point is less than the end point.

In [None]:
s = "Slicing is so easy!"

print(s[10:3])
print(s[-6:-10])
print(s[-9:3])
print(s[10:-10])

### More empty slices

If you use numbers that are too high or too low, slicing still works and does not raise `IndexError`.

If `s[50:]` means "skip the first 50 elements of `s`" and `s` only has 16 elements, then `s[50:]` will be an empty slice.
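
A sketch that contrasts slicing with indexing for out-of-range numbers, using the running example string (which has 19 characters):

```python
s = "Slicing is so easy!"

print(repr(s[50:]))    # Skipping 50 elements leaves nothing.
print(repr(s[:50]))    # "The first 50 elements" is just the whole string.
print(repr(s[-50:7]))  # A start that is too low is treated as the beginning.

# Indexing, on the other hand, does raise an error.
try:
    s[50]
except IndexError as error:
    message = str(error)
print(message)
```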

### Examples in code

#### Ensuring at most `n` elements

In [83]:
ordered = ["apple", "banana", "cherry", "date", "elderberry"]

# Return max 3 elements
print(ordered[:3])

['apple', 'banana', 'cherry']


#### Start/end of a string

In [87]:
my_string = "Hello, World!"

print(my_string[:5] == "Hello")
print(my_string.startswith("Hello"))
print(my_string.endswith("World!"))

True
True
True


#### Removing prefixes and suffixes

In [89]:
my_string = "Hello, World!"
print(my_string.removeprefix("Hello, "))
print(my_string.removesuffix("!"))

World!
Hello, World
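
`str.removeprefix` and `str.removesuffix` were added in Python 3.9. On older versions, roughly the same behaviour can be had with the idiomatic slices from above. The helpers `strip_prefix` and `strip_suffix` are hypothetical names for this sketch, not standard functions:

```python
def strip_prefix(text, prefix):
    """Return `text` without `prefix`, if `text` starts with it."""
    if text.startswith(prefix):
        return text[len(prefix):]  # s[n:] pattern: skip the first n characters.
    return text

def strip_suffix(text, suffix):
    """Return `text` without `suffix`, if `text` ends with it."""
    # The check on `suffix` matters: text[:-0] would be the empty string.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]  # s[:-n] pattern: drop the last n characters.
    return text

my_string = "Hello, World!"
print(strip_prefix(my_string, "Hello, "))
print(strip_suffix(my_string, "!"))
```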


## str and repr

### Explanation

Python has two built-in mechanisms that allow you to convert an object to a string:
- the `str` class
- the `repr` built-in function

When to use them:
- The `str` class is used when you want to convert something to the string type.
- The `repr` function is used to create an unambiguous representation of its argument.

The `print` function calls `str` on its argument and then displays the result. That is why both values below are printed the same way, and you have no way to tell whether the original object is an integer or a string.

In [2]:
print(3)
print("3")

3
3


The REPL, on the other hand, shows an unambiguous representation of the object: you can tell integers and strings apart, because the REPL uses `repr`.

In [9]:
3

3

In [10]:
"3"

'3'

`repr` is also used when your object is inside a container like a list or a dictionary.

In [12]:
print([3, '3'])

[3, '3']
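
The difference is easiest to see with a string that contains an invisible character. In this sketch (the variable names are made up), `str` gives back the text itself, while `repr` shows the quotes and the escaped newline, both on its own and inside a list:

```python
text = "3\n"

as_str = str(text)    # The string itself, newline included.
as_repr = repr(text)  # Quotes and escape sequence made visible.
in_list = str([3, text])  # Containers use repr for their items.

print(as_repr)
print(in_list)
```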


### The `__str__` and `__repr__` dunder methods

If you want to display your objects properly, you will want to implement the `__str__` and `__repr__` dunder methods in your class, and the implementation should follow the use case of `str` and `repr` above:
- the implementation of `__str__` should provide a nice, readable representation of the object.
- the implementation of `__repr__` should represent the object unambiguously.

When implementing custom classes, you should start by implementing `__repr__`, as `__str__` will default to using `__repr__` if no custom implementation is given.

In [13]:
class A:
    def __str__(self):
        return "A"

a = A()
a

<__main__.A at 0x10c19bd70>

In [14]:
class B:
    def __repr__(self):
        return "B"
b = B()
b

B
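
The two cells above only show the REPL (that is, `repr`) side. A sketch that checks the fallback in both directions, using two throwaway classes named `OnlyStr` and `OnlyRepr` for illustration:

```python
class OnlyStr:
    def __str__(self):
        return "only str"

class OnlyRepr:
    def __repr__(self):
        return "OnlyRepr()"

# str falls back to __repr__ when __str__ is missing...
print(str(OnlyRepr()))

# ...but repr never falls back to __str__: we get the default representation.
default_repr = repr(OnlyStr())
print(default_repr.startswith("<"), "OnlyStr" in default_repr)
```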

### Examples in code

#### datetime

Notice that from its `str` we can't even tell that we were dealing with a `datetime.datetime` object.

In [18]:
import datetime
date = datetime.datetime(2024, 1, 2)

print(date)
print(repr(date))
print(str(date))

2024-01-02 00:00:00
datetime.datetime(2024, 1, 2, 0, 0)
2024-01-02 00:00:00


#### 2D point

This is a simple 2D point class with a custom usage of `__str__` and `__repr__`.

In [19]:
class Point2D:
    """A class to represent points in a 2D space."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __str__(self):
        """Provide a good looking representation of the object"""
        return f"({self.x}, {self.y})"
    
    def __repr__(self):
        """Provide an unambiguous representation of the object"""
        return f"Point2D({repr(self.x)}, {repr(self.y)})"
    
p = Point2D(3, 4)
print(f"To build the point {p} in your code, try writing {repr(p)}")

To build the point (3, 4) in your code, try writing Point2D(3, 4)


## String formatting comparison

### Three string formatting methods

1. `language_info_cstyle` uses old-style string formatting, borrowed from the similar C syntax that does the same thing.
2. `language_info_format` uses the string method `.format`, introduced in PEP 3101.
3. `language_info_fstring` uses the f-string syntax introduced in PEP 498.

#### C-style formatting

C-style formatting is characterised by a series of percent signs ("%") that show up in the template.

These signs indicate the places where the bits of information should go, and the character that comes next (`%s`, `%d`) determines how that information is treated.

We apply the formatting through the binary operator `%`: on the left you put the template string, and on the right you put the arguments.

In [7]:
phrase = "%s rocks! Did you know that %s has around %d users?!" % ("Python", "Python", 100)
print(phrase)

Python rocks! Did you know that Python has around 100 users?!


#### String method `.format`

`.format` is a method of the string type. You typically have a format string and, when you get access to the missing pieces of information, you just call the `.format` method on that string.

Strings that use `.format` are characterised by the occurrence of a series of curly braces "{}".

In [27]:
phrase = "{} rocks! Did you know that {} has around {} users?!".format("Python", "Python", 100)
print(phrase)

Python rocks! Did you know that Python has around 100 users?!
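
The example above passes "Python" twice. With `.format`, the curly braces can hold the position of the argument, so a value can be reused without repeating it. A minimal sketch of that variation:

```python
# {0} refers to the first argument and {1} to the second.
phrase = "{0} rocks! Did you know that {0} has around {1} users?!".format("Python", 100)
print(phrase)
```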


#### Literal string interpolation, or f-strings

Literal string interpolation is the process through which you interpolate values into strings.

It is characterised by the `f` prefix on the string literal, and also by the curly braces "{}" inside the string.

Using a letter as a prefix to a string literal is not an idea introduced by string interpolation: bytes literals and raw strings use a prefix as well.

In [28]:
# bytes
b_object = b"This is a bytes object!"
print(b_object)
print(type(b_object))

# raw
r_object = r"This is a raw string object!\nThis string contains newline characters."
print(r_object)
print(type(r_object))

# fstring
f_object = f"This is a f-string object!\nThis string contains newline characters."
print(f_object)
print(type(f_object))

b'This is a bytes object!'
<class 'bytes'>
This is a raw string object!\nThis string contains newline characters.
<class 'str'>
This is a f-string object!
This string contains newline characters.
<class 'str'>


### Value conversion

When doing string formatting, the objects that we want to format into the template string need to be converted to strings.

This is done by calling `str`. However, sometimes it's beneficial to have the object be represented with the result from calling `repr`.

There are special ways to determine which type of string conversion happens.

In [35]:
class Data:
    def __str__(self):
        return "str"
    
    def __repr__(self):
        return "repr"

data = Data()

# C-style formatting
my_str = "%s %r" % (data, data)
print(my_str)

# String method `.format`
my_str = "{!s} {!r}".format(data, data)
print(my_str)

# Literal string interpolation, or f-strings
my_str = f"{data!s} {data!r}"
print(my_str)


str repr
str repr
str repr


### Alignment

When we need to format many values across many lines, like a table, we might want to align all values and pad them accordingly.

In [12]:
lang = "Python"

# C-style formatting
my_str = "'%-10s'" % lang
print(my_str)

# String method `.format`
my_str = "'{:<10}'".format(lang)
print(my_str)

# Literal string interpolation, or f-strings
my_str = f"'{lang:<10}'"
print(my_str)

'Python    '
'Python    '
'Python    '


By default, C-style formatting aligns to the right, whereas `.format` and f-strings align strings to the left. Hence, we could have left out the `<`:

In [10]:
# String method `.format`
my_str = "'{:10}'".format(lang)
print(my_str)

# Literal string interpolation, or f-strings
my_str = f"'{lang:10}'"
print(my_str)

'Python    '
'Python    '
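
One detail worth checking for yourself: with `.format` and f-strings, the default alignment depends on the type of the value. Strings are left-aligned and numbers are right-aligned, while C-style formatting right-aligns both. A sketch with made-up variable names:

```python
lang, users = "Python", 100

default_str = f"'{lang:10}'"   # Strings default to the left.
default_int = f"'{users:10}'"  # Numbers default to the right.
cstyle_str = "'%10s'" % lang   # C-style defaults to the right.

print(default_str)
print(default_int)
print(cstyle_str)
```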


The two modern methods can also align to the center, which C-style formatting cannot do.

In [44]:
# String method `.format`
my_str = "'{:^10}'".format(lang)
print(my_str)

# Literal string interpolation, or f-strings
my_str = f"'{lang:^10}'"
print(my_str)

'  Python  '
'  Python  '


All three methods can align to the right:

In [46]:
# C-style formatting
my_str = "'%10s'" % lang
print(my_str)

# String method `.format`
my_str = "'{:>10}'".format(lang)
print(my_str)

# Literal string interpolation, or f-strings
my_str = f"'{lang:>10}'"
print(my_str)

'    Python'
'    Python'
'    Python'


### Named placeholders

For longer strings, it may be helpful to use named placeholders. With f-strings this happens automatically, but it can also be done with `.format` and C-style formatting.

In [50]:
name, age = "John Doe", 30

# C-style formatting
my_str = "%(name)s is %(age)d years old" % {"name": name, "age": age}
print(my_str)

# String method `.format`
my_str = "{name} is {age} years old".format(name=name, age=age)
print(my_str)

# Literal string interpolation, or f-strings
my_str = f"{name} is {age} years old"
print(my_str)

John Doe is 30 years old
John Doe is 30 years old
John Doe is 30 years old


### Accessing nested data structures

Let's imagine that the name and age were actually stored in a dictionary.

In [53]:
data = {"name": "John Doe", "age": 30}

# C-style formatting
my_str = "%(name)s is %(age)d years old" % data
print(my_str)

# String method `.format`
my_str = "{name} is {age} years old".format(**data)
print(my_str)

# Literal string interpolation, or f-strings
my_str = f"{data['name']} is {data['age']} years old"
print(my_str)

John Doe is 30 years old
John Doe is 30 years old
John Doe is 30 years old


In [57]:
class ConvolutedExample:
    values = [{"name": "John Doe", "age": 30}, {"name": "Jane Doe", "age": 25}]

ce = ConvolutedExample()

my_str = "Name is: {ce.values[0][name]}".format(ce=ce)
print(my_str)

my_str = f"Name is: {ce.values[0]['name']}"
print(my_str)

Name is: John Doe
Name is: John Doe


### Parametrised formatting

Sometimes, you want to do some string formatting, but the exact formatting is dynamic.

Say you have a list of companies and their countries of origin, and you want that to be aligned.

In [1]:
data = [("Toyota", "Japan"), ("Ford", "USA")]

for brand, country in data:
    print(f"{brand:>7}, {country:>9}")

 Toyota,     Japan
   Ford,       USA


The thing is, what if we now include a company with a longer name?

In [2]:
data = [("Toyota", "Japan"), ("Ford", "USA"), ("Lamborghini", "Italy")]


for brand, country in data:
    print(f"{brand:>7}, {country:>9}")

 Toyota,     Japan
   Ford,       USA
Lamborghini,     Italy


The output is no longer aligned. We need to dynamically compute the maximum lengths and use them to create the correct format specifications. This is where parametrised formatting comes in handy.

In [4]:
data = [("Toyota", "Japan"), ("Ford", "USA"), ("Lamborghini", "Italy")]

# Compute the brand width and country width needed for formatting
bw = 1 + max(len(brand) for brand, _ in data)
cw = 1 + max(len(country) for _, country in data)

for brand, country in data:
    print(f"{brand:>{bw}}, {country:>{cw}}")

      Toyota,  Japan
        Ford,    USA
 Lamborghini,  Italy


Old-style formatting only allows parametrisation of the width of the field and the precision used. For the string method `.format` and f-strings, parametrisation can be used with all the format specifier options.

In [7]:
month = "November"
prec = 3
value = 2.7256

# C-style formatting
my_str = "%.*s = %.*f" % (prec, month, prec, value)
print(my_str)

# String method `.format`
my_str = "{:.{prec}} = {:.{prec}f}".format(month, value, prec=prec)
print(my_str)

# Literal string interpolation, or f-strings
my_str = f"{month:.{prec}} = {value:.{prec}f}"
print(my_str)

Nov = 2.726
Nov = 2.726
Nov = 2.726
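
As an illustration of parametrising something other than width and precision, this sketch makes the alignment character itself a parameter (the `results` dictionary and the `same` variable are just for this example):

```python
lang = "Python"
width = 10

# The alignment option (<, ^ or >) is filled in from a variable.
results = {}
for align in "<^>":
    results[align] = f"'{lang:{align}{width}}'"
    print(results[align])

# The same idea with the string method `.format`.
same = "'{:{align}{width}}'".format(lang, align="^", width=width)
print(same)
```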


### Custom formatting

The string method `.format` and f-strings allow you to define how your own custom objects should be formatted through the dunder method `__format__`.

`__format__` accepts a string (format specification) and it returns the corresponding string.

In [15]:
class Person:
    def __format__(self, format_spec):
        return "N" if "n" in format_spec else "Y"
    
person = Person()

print("{:aaabbbccc}".format(person))
print(f"{person:nope}")

Y
N


### Examples

It is better to use f-strings or `.format` than old-style formatting, except when maintaining legacy code.

#### Plain formatting

F-strings are short, have good locality, and are fast.

In [19]:
name, age, width = "John Doe", 30, 10

# Prefer
print(f"{name!s} {name!r}")
print(f"{name:<10}")
print(f"{name} is {age} years old")
print(f"{name:^{width}}")

# Over
print("{!s} {!r}".format(name, name))
print("{:<10}".format(name))
print("{name} is {age} years old".format(name=name, age=age))
print("{:^{w}}".format(name, w=width))

John Doe 'John Doe'
John Doe  
John Doe is 30 years old
 John Doe 
John Doe 'John Doe'
John Doe  
John Doe is 30 years old
 John Doe 


#### Data in dictionary

If all your formatting data is already in a dictionary, then using `.format` might be the best way to go, because f-strings will be more verbose than the usage of `**` in `.format`.

In [21]:
data = {"name": "John Doe", "age": 30}

# Nice
print("{name} is {age} years old".format(**data))

# This is cumbersome
print(f"{data['name']} is {data['age']} years old")

John Doe is 30 years old
John Doe is 30 years old


#### Deferred formatting

If you need to create your formatting string first, and only format it later, then you cannot use f-strings.

In [16]:
def get_greeting(language):
    if language == "pt":
        return "Olá, {}"
    else:
        return "Hello, {}"
    
lang = input(" [en/pt] >> ")
name = input(" your name >> ")

get_greeting(lang).format(name)

'Hello, test'
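
The same idea without the interactive `input` calls, as a sketch: the templates live in a dictionary and are only formatted once the name is known. The `GREETINGS` dictionary and the `greet` function are hypothetical names for this example:

```python
# Templates are plain strings; nothing is formatted yet.
GREETINGS = {"pt": "Olá, {}", "en": "Hello, {}"}

def greet(language, name):
    # Fall back to English for languages we do not know.
    template = GREETINGS.get(language, GREETINGS["en"])
    return template.format(name)

print(greet("pt", "Ana"))
print(greet("fr", "Ana"))
```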

## Assignment expressions and the walrus operator `:=`

### Walrus operator and assignment expressions

The walrus operator is written as `:=` and was introduced in Python 3.8. The walrus operator is used in assignment expressions, which means assignments can now be used as part of an expression, whereas before assignments were only possible as statements.

An assignment statement assigns a value to a variable name. With an assignment expression, that value can then be immediately reused.

In [17]:
a = 3
print(a)

3


In [18]:
print(b = 3)

TypeError: 'b' is an invalid keyword argument for print()

In [19]:
print(b := 3)

3


A good usage of assignment expressions can help write better code: code that is clearer and runs faster.

Assignment expressions should be avoided when they make the code too convoluted. It's better to write readable code.

In [22]:
import sys

if (i := input())[0] == "q" or i == "exit":
    sys.exit()

SystemExit:


A better alternative would have been:

In [23]:
import sys

i = input()
if i[0] == "q" or i == "exit":
    sys.exit()

SystemExit: 

The second alternative is much easier to read than the first one.

However, good uses of assignment expressions can:
- make your code faster
- make it more readable/expressive
- make your code shorter

### Examples in code

#### Controlling a while loop with initialisation

Consider the following `while` loop:

In [26]:
inp = input()
while inp:
    eval(inp)
    inp = input()

This code can be used to create a very basic Python REPL inside your Python program, and the REPL stops when given an empty input. First, you have to initialise `inp`, because you want to use it in the `while` condition, but then you also have to update `inp` inside the loop.

With assignment expressions, the code can be written as:

In [None]:
while inp := input():
    eval(inp)
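
The same pattern shows up whenever a loop should stop on an empty result. A non-interactive sketch, assuming an in-memory stream built with `io.StringIO` as a stand-in for a file, reads fixed-size chunks until `read` returns an empty string:

```python
import io

stream = io.StringIO("abcdefghij")  # Stand-in for a real file.
chunks = []

# read(4) returns "" at the end of the stream, which stops the loop.
while chunk := stream.read(4):
    chunks.append(chunk)

print(chunks)
```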

#### Reducing visual noise

Say you want to count the number of trailing zeros in an integer. An easy way to do so would be to convert the integer to a string, find its length, and then subtract the length of that same string with all its trailing zeros removed.

In [None]:
def count_trailing_zeros(n):
    s = str(n)
    return len(s) - len(s.rstrip("0"))

For such a simple function, it kind of looks sad to have such a short `s = str(n)` line. With assignment expressions, you can make the code more concise:

In [None]:
def count_trailing_zeros(n):
    return len(s := str(n)) - len(s.rstrip("0"))
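
A few sample calls, as a sketch, to confirm that the one-line version behaves as expected. Note that, with this string-based approach, `0` counts as having one trailing zero:

```python
from math import factorial

def count_trailing_zeros(n):
    return len(s := str(n)) - len(s.rstrip("0"))

print(count_trailing_zeros(1200))
print(count_trailing_zeros(7))
print(count_trailing_zeros(factorial(10)))  # 10! is 3628800.
print(count_trailing_zeros(0))
```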

#### Reuse computations in list comprehensions

Suppose you are writing a list comprehension with an `if` filter, but the filter test in the comprehension uses a value that you also want to use in the list itself.

For example, you have a list of integers, and want to keep the factorials of the numbers for which the factorial has more than 50 trailing zeros.

In [None]:
from math import factorial

def count_trailing_zeros(n):
    return len(s := str(n)) - len(s.rstrip("0"))

l = [2, 17, 89, 15, 58, 193]
facts = [factorial(num) for num in l if count_trailing_zeros(factorial(num)) > 50]
print(facts)

[481199779677977486016699009358137978183480804067261380813085594116305751890010955912922305852067338518684640096193435851940520911246181662702714818813933314316279628102998441493337890446893955104871678797693253036994704678292343992633265456528607486050757463669283236066454922775411200834380867273693778876760002114053184802443542074196048641769699505814352221988511945689840957059455495890545683217923389191494429859199577347929594024990968456430204018693811756039644243332221141259743748178042426333097698042939528700346193541250142100456476640632401620075601086652905686461283425571473509853587241546232533718674707651204220738679639357752586921097530417620943435690504974703535317644815031747509118582309069983610660847877583161105857360133653774318607385722613257382336568352719473526951808655730438340279555390127654893726450425044065977523574819315328723566354112245783340405222947464028295854584787087783463794318623688248190091770914440348859413943193439102231686558697617996690750595276085

The problem is that the code above computes the factorial for each number twice, and if numbers get big, this can become really slow. Using assignment expressions, you can compute the factorial once and reuse it:

In [36]:
from math import factorial

def count_trailing_zeros(n):
    return len(s := str(n)) - len(s.rstrip("0"))

l = [2, 17, 89, 15, 58, 193]
facts = [f for num in l if count_trailing_zeros(f := factorial(num)) > 50]
print(facts)

[]


Using `:=` lets you reuse the expensive computation of the factorial of `num`.
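
To see the saving concretely, here is a small sketch that counts how many factorials get computed with and without `:=`. The helper `counted_factorial` and the list `call_log` are hypothetical names introduced only for this illustration, and the threshold is lowered to 10 trailing zeroes (an assumption) so that some numbers pass the filter.

```python
from math import factorial

call_log = []  # hypothetical log: one entry per factorial computed

def counted_factorial(n):
    call_log.append(n)
    return factorial(n)

def count_trailing_zeros(n):
    s = str(n)
    return len(s) - len(s.rstrip("0"))

nums = [2, 17, 89, 15, 58, 193]

# Without :=, the factorial is computed once for the filter
# and once more for every number that passes it.
without_walrus = [
    counted_factorial(n) for n in nums
    if count_trailing_zeros(counted_factorial(n)) > 10
]
calls_without_walrus = len(call_log)

call_log.clear()

# With :=, each factorial is computed exactly once.
with_walrus = [
    f for n in nums
    if count_trailing_zeros(f := counted_factorial(n)) > 10
]
calls_with_walrus = len(call_log)

print(calls_without_walrus, calls_with_walrus)  # 9 6
```

Both comprehensions build the same list; only the amount of work differs.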

Two other alternatives, without assignment expressions, would be:

In [None]:
from math import factorial

l = [2, 17, 89, 15, 58, 193]

# Alternative 1
facts = [factorial(num) for num in l]
facts = [num for num in facts if count_trailing_zeros(num) > 50]

# Alternative 2
facts = [num for num in map(factorial, l) if count_trailing_zeros(num) > 50]

The second one can be more efficient because `map` is lazy: each factorial is only computed when the comprehension asks for it, so no intermediate list of factorials is built.

#### Flattening related logic

Imagine you reach a point in your code where you need to pick an operation to do to your data, and you have a series of things you would like to try.

As a very simple example, suppose you have a string that may contain an email or a phone number.

In [None]:
import re

string = input("Your contact info: >> ")
email = re.search(r"\b(\w+@\w+\.com)\b", string)

if email:
    print(f"Email found: {email.group(1)}")
else:
    phone = re.search(r"\d{9}", string)
    if phone:
        print(f"Phone number found: {phone.group(0)}")
    else:
        print("No email or phone number found.")

Notice the code above is nested but the logic is flat: look for successive things and stop as soon as we find something.

In [None]:
import re

string = input("Your contact info: >> ")

if email := re.search(r"\b(\w+@\w+\.com)\b", string):
    print(f"Email found: {email.group(1)}")
elif phone := re.search(r"\d{9}", string):
    print(f"Phone number found: {phone.group(0)}")
else:
    print("No email or phone number found.")

## String translate and maketrans methods

### `str.translate`

The `str.translate(table)` method returns a copy of the string in which each character has been mapped through the given translation table.

The translation table mentioned here is the only argument that `str.translate` accepts.

In its simplest form, `str.translate` is similar to the method `str.replace`:

In [13]:
my_string = "Hello, World!"
print(my_string.replace("l", "L"))


HeLLo, WorLd!


#### Character code points

Unicode is a standard that maps characters to numbers.

Python provides two useful built-in functions to do these conversions:
- `chr`
- `ord`

In [18]:
print(ord("A"))
print(ord("a"))
print(ord(" "))

print(chr(65))
print(chr(97))
print(chr(32))
print(chr(128013))

65
97
32
A
a
 
🐍


`chr` takes an integer and returns the character that the integer represents.

`ord` takes a character and returns the integer corresponding to its Unicode code point.

The "code point" of a character is the integer that corresponds to it in the standard being used, which is the Unicode standard in the case of Python.

#### Translation dictionaries

The translation dictionary that is passed as an argument to `str.translate` specifies the substitutions that are going to take place in the target string.

The dictionary needs to map Unicode code points to other Unicode code points, to strings, or to `None`. A code point mapped to `None` is removed from the result.

In [37]:
print(ord("a"), ord("b"), ord("c"))
print(ord("A"))

translation = "aaa bbb ccc".translate({97:65, 98:"BBB", 99:None})
print(translation)

other_trans = "Hey, aaa bbb ccc, how are you?".translate({97:65, 98:"BBB", 99:None})
print(other_trans)

97 98 99
65
AAA BBBBBBBBB 
Hey, AAA BBBBBBBBB , how Are you?


### Non-equivalence to `str.replace`

We can try rewriting the previous example using `str.replace`:

In [23]:
my_str = "Hey, aaa bbb ccc, how are you?"
from_ = "abc"
to_ = ["A", "BBB", ""]

for f,t in zip(from_, to_):
    my_str = my_str.replace(f, t)

print(my_str)

Hey, AAA BBBBBBBBB , how Are you?


This does more work than `str.translate`, because on every loop iteration `str.replace` has to go over the whole string looking for the characters to replace.

What if we wanted to take a string of zeroes and ones and replace all zeroes with ones, and vice versa? Here's an attempt with `str.replace`.

In [24]:
s = "001011010101001"
from_ = "01"
to_ = "10"

for f,t in zip(from_, to_):
    s = s.replace(f, t)
print(s)

000000000000000


This did not work because the first iteration converts all zeroes into ones, and the second iteration has no way of knowing which ones are original and which used to be zeroes, so it turns all of them into zeroes.

In order to achieve the correct effect, we need `str.translate`.

In [25]:
s = "001011010101001".translate({ord("0"):"1", ord("1"):"0"})
print(s)

110100101010110


### Generic translation tables

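`str.translate` accepts a "translation table", but that table does not need to be a dictionary. It can be any object that supports indexing with square brackets. People usually use mappings or sequences, but you can use your own custom objects. In the example below the table is a list with 91 entries: characters whose code points fall outside the list, like the lowercase letters, are left unchanged.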

In [4]:
translation_table = [i for i in range(91)]
print(translation_table)

for letter in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
    translation_table[ord(letter)] = 2 * letter.lower()

print(translation_table)

translation = "Hey, what's UP?".translate(translation_table)
print(translation)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 'aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii', 'jj', 'kk', 'll', 'mm', 'nn', 'oo', 'pp', 'qq', 'rr', 'ss', 'tt', 'uu', 'vv', 'ww', 'xx', 'yy', 'zz']
hhey, what's uupp?
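
A custom object works as well, as long as it implements `__getitem__`. The sketch below uses a hypothetical class `VowelMasker` (not part of the standard library) that maps vowels to `"*"`. It relies on the documented behaviour that raising a `LookupError` from the table tells `str.translate` to keep the character as it is.

```python
class VowelMasker:
    """Hypothetical translation table: vowels become '*', the rest is kept."""

    def __getitem__(self, code_point):
        # str.translate indexes the table with the code point of each character
        if chr(code_point) in "aeiouAEIOU":
            return "*"
        # A LookupError means "no mapping": the character stays unchanged
        raise LookupError(code_point)

masked = "Hey, what's UP?".translate(VowelMasker())
print(masked)  # H*y, wh*t's *P?
```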


### `str.maketrans`

The method `str.maketrans` is a utility method that provides a convenient way of creating translation tables that can be used with `str.translate`.

`str.maketrans` accepts up to 3 arguments, so let's break them down.

#### Single argument

The version of `str.maketrans` that only accepts one argument has the purpose of making it simpler for us, users, to define dictionaries that can be used with `str.translate`.

When using dictionaries as translation tables we need to make sure that the keys of the dictionary are the code points of the characters we want to replace.

This introduces some boilerplate: we know the characters we want to replace, not their code points, so we need to do the conversion by hand beforehand, or with `ord` when defining the dictionary.

This is ugly:

In [None]:
"001011010101001".translate(
    {ord("0"): "1", ord("1"): "0"}
)

'110100101010110'

It would be better if we could just write the dictionary in its most natural form. For this, we can use `str.maketrans`:

In [1]:
"001011010101001".translate(
    str.maketrans({"0": "1", "1": "0"})
)

'110100101010110'

#### Two arguments

In the previous example we replaced some characters with some other single characters.

This is so common that the method `str.maketrans` can create translation tables of this sort directly. For that, the first argument to `str.maketrans` should be a string with the characters to be replaced, and the second argument should be a string of the same length with the corresponding new characters.

In [13]:
"001011010101001".translate(
    str.maketrans("01", "10")
)

'110100101010110'

In [12]:
"#0F45cd".translate(
    str.maketrans("abcdef", "ABCDEF")
)

'#0F45CD'

#### The third argument

The third argument to `str.maketrans` is simply a string of all the characters that should be mapped to `None` or, in other words, that should be removed altogether from the string.

In [15]:
"# 0F45cd".translate(
    str.maketrans("abcdef", "ABCDEF", "# ")
)

'0F45CD'
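
It can help to look at what `str.maketrans` actually returns: a plain dictionary keyed by code points, exactly the kind of table `str.translate` expects. A short sketch, with made-up characters:

```python
table = str.maketrans("ab", "AB", "#")

# Keys are code points; values are code points or None (for removed characters)
print(table)  # {97: 65, 98: 66, 35: None}

print(table[ord("a")] == ord("A"))  # True
print(table[ord("#")] is None)      # True

cleaned = "#abba#".translate(table)
print(cleaned)  # ABBA
```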

### Examples in code

#### Caesar cipher

A Caesar cipher can be implemented as a function that takes two arguments. The first, a string, specifies some text. The second, an integer, specifies the key. The upper case letters of the argument string are then shifted along the alphabet by the amount specified by the key.

In [2]:
from string import ascii_uppercase

ABC = ascii_uppercase

print(ABC)


def caesar(msg, key):
    return msg.translate(
        str.maketrans(ABC, ABC[key:] + ABC[:key])
    )

caesar("HELLO", 7)

ABCDEFGHIJKLMNOPQRSTUVWXYZ


'OLSSV'
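
With the slicing used above, a key of 26 or more no longer shifts anything, because `ABC[key:]` becomes empty. One possible variation, sketched here as an assumption rather than as part of the original example, reduces the key modulo the alphabet length. This also makes decoding easy: shift by the negative key.

```python
from string import ascii_uppercase

ABC = ascii_uppercase

def caesar(msg, key):
    # Wrap keys such as 33 or -7 into the range 0..25
    key %= len(ABC)
    return msg.translate(str.maketrans(ABC, ABC[key:] + ABC[:key]))

encoded = caesar("HELLO, world", 7)
decoded = caesar(encoded, -7)

print(encoded)  # OLSSV, world
print(decoded)  # HELLO, world
```

Only the upper case letters are in the table, so the lower case letters and the punctuation pass through untouched.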

#### Sanitising file names

In [3]:
def sanitize_windows_name(arcname):
    illegal = ':<>|"?*'
    table = str.maketrans(illegal, '_' * len(illegal))
    return arcname.translate(table)

sanitize_windows_name("My:File")

'My_File'

## Conditional expressions


### What is a conditional expression?

A conditional expression in Python is an expression whose value depends on a condition.

#### Expressions and statements

In [4]:
3 + 4 * 5

23

This code is an expression, and that expression evaluates to 23.

Some pieces of code are not expressions. For example, `pass` is just a statement, it does not "have" or "evaluate to" any result.

An easy way to identify an expression is to try to stick it inside a call to `print`. Expressions can be used inside other expressions, so this works. Statements cannot: `print` needs *something* to print, but a statement such as `pass` gives it nothing.

In [30]:
print(3 + 4 * 5)

23


In [5]:
print(pass)

SyntaxError: invalid syntax (635316705.py, line 1)

#### Conditions

We are very used to using `if` statements to run pieces of code when certain conditions are met. Rewording that, a condition can dictate which piece of code runs.

In conditional expressions, we will use a condition to change the value to which the expression evaluates.

#### Syntax

A conditional expression is composed of 3 sub-expressions and the keywords `if` and `else`. None of these components are optional. All of them have to be present.

In [None]:
expression_if_true if condition else expression_if_false

First, `condition` is evaluated. Then, depending on whether `condition` evaluates to a truthy or a falsy value, the whole expression evaluates to `expression_if_true` or to `expression_if_false`.

`expression_if_true` and `expression_if_false` can be any expressions. This means they can be simple literal values like `42` or more complicated expressions, including other conditional expressions.

#### Examples of conditional expressions

In [32]:
42 if True else 0

42

In [6]:
42 if False else 0

0

In [34]:
"Mathsapp".lower() if pow(3, 27, 10) > 5 else "Oh boy"

'mathsapp'

#### Reading a conditional expression

While the conditional expression presents the operands in an order that may throw you off, it is easy to read it as an English sentence.

In [None]:
value if condition else other_value

This can be read as:

"Evaluate to value if condition is true, otherwise evaluate to other_value"

or 

"Give value if condition is true and other_value otherwise"

#### Does Python have a ternary operator?

Many other languages have a ternary operator that looks like `condition ? expr_if_true : expr_if_false`. Python does not have a ternary operator with that syntax, but conditional expressions play the same role.

### Rationale

Programmers are often faced with a situation where they have to pick one of two values.

#### Examples with `if` statements

Computing the parity of an integer:

In [35]:
def parity(n):
    if n % 2:
        return "odd"
    else:
        return "even"

print(parity(15))
print(parity(42))

odd
even


Computing the absolute value of a number (Python already has the built-in function `abs` for this):

In [37]:
def absolute_number(x):
    if x > 0:
        return x
    else:
        return -x
    
print(absolute_number(-10))
print(absolute_number(42))

10
42


#### Refactored examples

In [38]:
def parity(n):
    return "odd" if n % 2 else "even"

print(parity(15))
print(parity(42))

odd
even


In [39]:
def absolute_number(x):
    return x if x > 0 else -x

print(absolute_number(-10))
print(absolute_number(42))

10
42


### Short-circuiting

Conditional expressions also short-circuit: only the sub-expression selected by the condition gets evaluated. Consider this example:

In [40]:
def ucs(x):
    if isinstance(x, int):
        return chr(x)
    else:
        return ord(x)
    
print(ucs(65))
print(ucs("A"))
print(ucs(102))
print(ucs("f"))

A
65
f
102


We need to consider that `ord` throws an error when called on integers, and `chr` fails when called on characters:

In [9]:
# ord(65)
chr("f")

TypeError: 'str' object cannot be interpreted as an integer

Since conditional expressions short-circuit, we can implement `ucs` like this:

In [44]:
def ucs(x):
    return chr(x) if isinstance(x, int) else ord(x)

print(ucs(65))
print(ucs("A"))
print(ucs(102))
print(ucs("f"))

A
65
f
102


We see that when `x` is an integer, `ord(x)` never runs. On the flip side, when `x` is not an integer, `chr(x)` never runs.
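
The short-circuiting can be made visible with a side effect. In this small sketch, the helper `branch` and the list `evaluated` are hypothetical names used only to record which sub-expression actually ran.

```python
evaluated = []  # records the branches that were actually evaluated

def branch(name, value):
    evaluated.append(name)
    return value

result = branch("if_true", 1) if 2 > 1 else branch("if_false", 0)

print(result)     # 1
print(evaluated)  # ['if_true']
```

The call `branch("if_false", 0)` never happens, because the condition already selected the other sub-expression.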

### Conditional expressions and if statements

#### Equivalence to `if`

There's a close relationship between the conditional expression and the `if` statement: that of equivalence. The two pieces of code below are exactly equivalent.

In [None]:
if condition:
    name = expr_if_true
else:
    name = expr_if_false


name = expr_if_true if condition else expr_if_false

#### Equivalence to `if-elif-else` blocks

Let's try to rewrite the following function to use a conditional expression.

In [45]:
def sign(x):
    if x == 0:
        return 0
    elif x > 0:
        return 1
    else:
        return -1
    
print(sign(0))
print(sign(-73))
print(sign(42))

0
-1
1


Conditional expressions do not allow the usage of the `elif` keyword so, instead, we start by reworking the `if` block itself:

In [None]:
def sign(x):
    if x == 0:
        return 0
    else:
        if x > 0:
            return 1
        else:
            return -1

This isn't a great implementation, but this intermediate representation makes it clearer that the bottom of the `if` block can be replaced with a conditional expression.

In [None]:
def sign(x):
    if x == 0:
        return 0
    else:
        return 1 if x > 0 else -1

Now, if we abstract away from the fact that the second return value is a conditional expression itself, we can rewrite the existing `if` block as a conditional expression.

In [46]:
def sign(x):
    return 0 if x == 0 else (1 if x > 0 else -1)

print(sign(0))
print(sign(-73))
print(sign(42))

0
-1
1


This shows that conditional expressions can be nested. We need to check whether the parentheses are needed or not.

In [47]:
def sign(x):
    return 0 if x == 0 else 1 if x > 0 else -1

print(sign(0))
print(sign(-73))
print(sign(42))

0
-1
1


This can be read as "return 0 if x is 0, otherwise, return 1 if x is positive otherwise return -1".

The repetition of the word "otherwise" becomes cumbersome, a good indicator that it is generally not a good idea to get carried away and chain several conditional expressions.

#### Non-equivalence to function wrapper

Because of the last equivalence, many people may believe that conditional expressions could be implemented as a function enclosing the previous `if... else...` block, like this:

In [48]:
def cond(condition, value_if_true, value_if_false):
    if condition:
        return value_if_true
    else:
        return value_if_false
    
print(cond(pow(3, 27, 10) > 5, "Mathsapp".lower(), "Oh boy"))

mathsapp


This is not possible, because the function call to `cond` only happens after all of its arguments have been evaluated, so there is no short-circuiting. Hence, we can't use `cond` to implement `ucs`:

In [49]:
def ucs(x):
    return cond(isinstance(x, int), chr(x), ord(x))

print(ucs(65))

TypeError: ord() expected string of length 1, but int found

When given `65`, the first argument evaluates to `True` and the second evaluates to `"A"`, but the third one raises an error.
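
If a function-based version is really wanted, one workaround is to pass callables instead of values, so that each branch is only evaluated when it is called. This is a sketch under that assumption; `lazy_cond` is a hypothetical name, not something from the standard library.

```python
def lazy_cond(condition, if_true, if_false):
    """Call only the function that matches the condition."""
    return if_true() if condition else if_false()

def ucs(x):
    # The lambdas delay chr(x) and ord(x) until lazy_cond picks one of them
    return lazy_cond(isinstance(x, int), lambda: chr(x), lambda: ord(x))

print(ucs(65))   # A
print(ucs("A"))  # 65
```

This works, but it is noisier than the conditional expression it tries to imitate.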

### Precedence

Conditional expressions have very low precedence: of all Python expressions, only `lambda` expressions and assignment expressions (`:=`) bind less tightly.

This means that sometimes you may need to parenthesise a conditional expression if you are using it inside another expression.

In [11]:
def foo(n, b):
    if b:
        to_add = 10
    else:
        to_add = -10
    return n + to_add

print(foo(42, True))
print(foo(42, False))

52
32


You might spot the pattern of assigning one of two values, and decide to use a conditional expression:

In [12]:
def foo(n, b):
    to_add = 10 if b else -10
    return n + to_add

print(foo(42, True))
print(foo(42, False))

52
32


But then, you decide there is no need to waste a line here, and you inline the conditional expression (that is, you put the conditional expression inside the arithmetic expression with `n +`):

In [13]:
def foo(n, b):
    return n + 10 if b else -10

print(foo(42, True))
print(foo(42, False))

52
-10


The second result is now wrong: `foo(42, False)` gives `-10` instead of `32`. That is because the expression

`n + 10 if b else -10`

is seen by Python as 

`(n + 10) if b else -10`

while you meant for it to mean

`n + (10 if b else -10)`
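
A corrected inline version therefore needs explicit parentheses around the conditional expression:

```python
def foo(n, b):
    # The parentheses make the conditional expression an operand of `+`
    return n + (10 if b else -10)

print(foo(42, True))   # 52
print(foo(42, False))  # 32
```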

### Conditional expressions that evaluate to Booleans

These are some things you should avoid when using conditional expressions.

Conditional expressions are suboptimal when they evaluate to Boolean values. Here's an example:

In [15]:
def is_huge(n):
    return True if n > pow(10, 10) else False

print(is_huge(3.1415))
print(is_huge(73_324_634_325_242))

False
True


The conditional expression isn't doing anything relevant! The conditional expression just evaluates to the same value as the condition itself!

In [16]:
def is_huge(n):
    return n > pow(10, 10)

print(is_huge(3.1415))
print(is_huge(73_324_634_325_242))

False
True


Take this with you: never use 

`if: ... else: ...`

or conditional expressions to evaluate to/return Boolean values. Often it suffices to work with the condition alone.

A related use case where conditional expressions shouldn't be used is when assigning default values to variables. Some of these default values can be assigned with **Boolean short-circuiting** using the `or` operator.
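
As a minimal sketch of that idea, the hypothetical function `greet` below falls back to a default name with `or` instead of writing `name if name else "stranger"`:

```python
def greet(name=None):
    # `or` evaluates to its left operand if it is truthy, else to the right one
    name = name or "stranger"
    return f"Hello, {name}!"

print(greet("Jane"))  # Hello, Jane!
print(greet())        # Hello, stranger!
print(greet(""))      # Hello, stranger!
```

Keep in mind that `or` replaces every falsy value, not just `None`. If `0` or `""` are legitimate values, an explicit `is None` check is the safer choice.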

### Examples in code

#### The dictionary `.get` method

The module `collections` has a `ChainMap` class, which can be used to chain several dictionaries together. This is useful when you want to juxtapose user configurations with default configurations.

`ChainMap` also defines a `.get` method, much like a dictionary. This method tries to retrieve a key and returns a default value if it does not find it.

In [None]:
from collections import ChainMap

user_config = {"name": "mathsapp"}
default_config = {"name": "<noname>", "fullscreen": True}

config = ChainMap(user_config, default_config)
print(config)

# Access a key directly
print(config["fullscreen"])

# config["darkmode"] fails with a KeyError 
print(config.get("darkmode", False))

# This is the method
def get(self, key, default=None):
    return self[key] if key in self else default

ChainMap({'name': 'mathsapp'}, {'name': '<noname>', 'fullscreen': True})
True
False


Simple! Return the value associated with the `key` if `key` is in the dictionary, otherwise return the default value.

#### Resolving paths

The module `pathlib` is great when dealing with paths. One of the functionalities it provides is the `.resolve` method, which takes a path and makes it absolute, getting rid of symlinks along the way:

In [None]:
import os
from pathlib import Path

print(Path("..").resolve())
print(Path("/Users"))


# This is part of the resolve method
def resolve(self, path, strict=False):
    # ...
    base = '' if path.is_absolute() else os.getcwd()

    return _resolve(base, str(path)) or sep

/Users/kzms664/Documents/tests_folder
/Users


Before calling the auxiliary function `_resolve` and returning, the function figures out if there is a need to add a base to the path.

If the path is relative, like `".."`, then the base is set to be the current working directory (`os.getcwd()`). If the path is absolute, then there is no need for a base, because the path is already complete.

## Dunder methods

### Introduction

Python is a language that has a rich set of built-in functions and operators that work really well with the built-in types. For example, the operator `+` works on numbers, as addition, but it also works on strings, lists, and tuples, as concatenation:

In [32]:
print(1 + 2.3)
print([1, 2, 3] + [4, 5, 6])

3.3
[1, 2, 3, 4, 5, 6]


This happens through **dunder methods**.

### What are dunder methods?

Dunder methods are methods that allow instances of a class to interact with the built-in functions and operators of the language.

The word "dunder" comes from the "double underscore", because the names of the dunder methods start and end with two underscores, for example `__str__` or `__add__`.

Typically, dunder methods are not invoked directly by the programmer, making it look like they are called by magic. That is why they are sometimes referred to as "magic methods".

They are not called magically, though. They are just called implicitly by the language, at specific times that are well-defined and that depend on the dunder method in question.
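
To make the implicit calls a bit more tangible, this short sketch compares a few built-in operations on built-in types with the explicit dunder calls that produce the same results:

```python
# Each operation on the left gives the same result as the explicit dunder call
print(1 + 2 == (1).__add__(2))               # True
print(len("dunder") == "dunder".__len__())   # True
print(str(3.5) == (3.5).__str__())           # True
```

In everyday code you would write the left-hand versions. The explicit calls are shown only to reveal what the operations rely on.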

#### The dunder method everyone knows

If you have defined classes in Python, you are bound to have crossed paths with a dunder method: `__init__`.

`__init__` is responsible for initialising your instance of the class, which is why it is in there that you usually set a bunch of attributes related to arguments the class received.

In [33]:
class Square:
    def __init__(self, side_length):
        """__init__ is the dunder method that INITialises the instance.

        To create a square, we need to know the length of its side,
        so that will be passed as an argument later, e.g. with Square(1).
        To make sure the instance knows its own side length,
        we save it with self.side_length = side_length.
        """
        print("Inside init!")
        self.side_length = side_length

sq = Square(1)

Inside init!


The dunder method `__init__` was called implicitly by the language when you created an instance of a square.

#### Why do dunder methods start and end with two underscores?

The underscores do not have any special significance; they do nothing special by themselves.

The underscores are just there to prevent name collisions with other methods implemented by unsuspecting programmers.

Think of it this way: Python has a built-in function called `sum`. You can define `sum` to be something else, but then you lose access to the built-in that sums things, right?

In [35]:
sum(range(10))

sum = 45
sum(range(10))

TypeError: 'int' object is not callable

That's why Python decided that magic methods would have names that start and end with two underscores, to make it less likely that someone would override one of those methods by accident.

All in all, dunder methods are just like any other method that you have implemented, with the small exception that dunder methods can be called implicitly by the language.

#### Operator overloading in Python and dunder methods

Python operators like `+`, `==`, and `in` rely on dunder methods to implement their behaviour.

For example, when Python encounters the code `value in container`, it turns that into a call to the appropriate dunder method, `__contains__`, which means that Python actually runs the expression `container.__contains__(value)`.

In [39]:
my_list = [2, 4, 6]

print(3 in my_list)
print(my_list.__contains__(3))

print(6 in my_list)
print(my_list.__contains__(6))

False
False
True
True


When you want to overload certain operators to make them work in a custom way with your own objects, you need to implement the respective dunder methods.

If you were to create your own type of container, you could implement the dunder method `__contains__` to make sure that your containers could be on the right-hand side of an expression with the operator `in`.
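
As a minimal sketch, the hypothetical class `EvenNumbers` below behaves like a container of all even integers, without storing any of them:

```python
class EvenNumbers:
    """Hypothetical container that 'contains' every even integer."""

    def __contains__(self, value):
        # Called implicitly by `value in instance`
        return isinstance(value, int) and value % 2 == 0

evens = EvenNumbers()

print(4 in evens)      # True
print(7 in evens)      # False
print("4" in evens)    # False
```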

### List of dunder methods and their interactions

In the book there is a big table explaining all the dunder methods.

### Exploring a dunder method

There are three steps you can follow when you are learning about a new dunder method:
1. try to understand when the dunder method is called;
2. implement a stub for that method and trigger it with code; and
3. use the dunder method in a useful situation.

Let's follow these steps with a practical example: the dunder method `__missing__`.

#### What is the dunder method for?

What is the dunder method `__missing__` for? As the docs say:

"Called by `dict.__getitem__()` to implement `self[key]` for `dict` subclasses when `key` is not in the dictionary."

The dunder method `__missing__` is only relevant for subclasses of `dict`, and is called whenever we cannot find a given key in the dictionary.

#### How to trigger the dunder method

In what situations, that I can recreate, does the dunder method `__missing__` get called?

From what the docs say, it looks like we need a dictionary subclass, and then we need to access a key that does not exist in that dictionary. This should be enough to trigger the dunder method `__missing__`:

In [None]:
class DictSubclass(dict):
    def __missing__(self, key):
        print(f"Missing {key = }")

my_dict = DictSubclass()
my_dict[0] = True

if my_dict[0]:
    print("Key 0 was 'True'")

my_dict[1]

Key 0 was 'True'
Missing key = 1


#### Using the dunder method in a useful situation

We can try implementing `defaultdict` based on `__missing__`.

`defaultdict` is a container from the module `collections`, and it's just like a dictionary, except that it uses a factory to generate default values when keys are missing.

Here's an instance of `defaultdict` that returns the value `0` by default:

In [48]:
from collections import defaultdict

olympic_medals = defaultdict(lambda: 0)  # Produce 0 by default

olympic_medals["Phelps"] = 28
print(olympic_medals["Phelps"])
print(olympic_medals["me"])


28
0


To reimplement `defaultdict`, we need to accept a factory function, we need to save that factory, and we need to use it inside `__missing__`.

Just as a side note, notice that `defaultdict` not only returns the default value, but also assigns it to the key that wasn't there before:

In [None]:
from collections import defaultdict

olympic_medals = defaultdict(lambda: 0)  # Produce 0 by default

print(olympic_medals)  # Notice the underlying dictionary is empty

print(olympic_medals["me"])

print(olympic_medals)  # It's not empty anymore

defaultdict(<function <lambda> at 0x10abb8220>, {})
0
defaultdict(<function <lambda> at 0x10abb8220>, {'me': 0})


Given all of this, here is a possible reimplementation of `defaultdict`:

In [53]:
class my_defaultdict(dict):
    def __init__(self, default_factory, **kwargs):
        super().__init__(**kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        """Populate the missing key and return its value."""
        self[key] = self.default_factory()
        return self[key]
    
olympic_medals = my_defaultdict(lambda: 0)  # Produce 0 by default
olympic_medals["Phelps"] = 28

print(olympic_medals["Phelps"])
print(olympic_medals["me"])

28
0
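
As a usage sketch, the same class can count letters in a word (the word is made up for illustration). The class is repeated so that the snippet runs on its own. The last lines rely on the fact that `dict.get` does not call `__missing__`, so it neither uses the factory nor adds the key.

```python
class my_defaultdict(dict):
    def __init__(self, default_factory, **kwargs):
        super().__init__(**kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        """Populate the missing key and return its value."""
        self[key] = self.default_factory()
        return self[key]

letter_counts = my_defaultdict(int)  # int() produces 0
for letter in "mississippi":
    # The first lookup of each letter triggers __missing__ and stores 0
    letter_counts[letter] += 1

print(dict(letter_counts))  # {'m': 1, 'i': 4, 's': 4, 'p': 2}

# .get bypasses __missing__: no default is created for "z"
print(letter_counts.get("z"))  # None
print("z" in letter_counts)    # False
```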
