## Zip up

### How zip works

In a simple `for` loop we have an iterable `it`:

In [None]:
it = range(5)
for element in it:
    print(element)

An "iterable" is something that can be traversed linearly, like a list, a string, or a `range`.
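
As a rough sketch of what a `for` loop does behind the scenes, the built-ins `iter` and `next` can be called by hand. The variable names are illustrative only.

```python
letters = "abc"          # a string can be traversed linearly

it = iter(letters)       # ask for an object that walks over the string
first = next(it)         # 'a'
second = next(it)        # 'b'
remaining = list(it)     # whatever has not been consumed yet: ['c']

print(first, second, remaining)
```

Once an element has been handed out by `next`, it is not produced again, which is worth keeping in mind for the lazy objects discussed later.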

Sometimes you will have two iterables with related information and you need to loop over both of them to do something. Check this example:

In [None]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

for i in range(len(firsts)):
    print(f"{firsts[i]} {lasts[i]}")

This is what `zip` is for: use it to pair up iterables that you want to traverse at the same time.

In [None]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

for first, last in zip(firsts, lasts):
    print(f"{first} {last}")

We are doing an unpacking assignment because `zip` actually produces tuples.

In [None]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

for z in zip(firsts, lasts):
    print(z)

### Zip is lazy

`zip` does not create tuples immediately. `zip` is lazy, meaning it generates tuples on the fly as you iterate over it, for example in a `for` loop or when you convert it to a list.

In [None]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson", "Davis"]
z = zip(firsts, lasts)
print(z)
print(list(z))

`zip` being lazy means that, by itself, a `zip` object is not that similar to a list. For example, you cannot ask for the length of a `zip` object.

In [None]:
len(z)
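
A minimal sketch of two consequences of laziness: `len` raises a `TypeError`, and a `zip` object can only be traversed once. The variable names are illustrative.

```python
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

z = zip(firsts, lasts)

try:
    len(z)
except TypeError as error:
    # zip objects do not know how many tuples they will produce
    length_error = str(error)

first_pass = list(z)   # consumes the zip object
second_pass = list(z)  # nothing left to produce

print(length_error)
print(first_pass)
print(second_pass)
```

If you need the pairs more than once, or need to know how many there are, convert the `zip` object to a list first.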

### Three is a crowd

`zip` can take three or more iterables. Each tuple it produces has one element per iterable, and the number of tuples equals the length of the shortest iterable.

In [11]:
firsts = ["John", "Jane", "Jack"]
middles = ["Z.", "A.", "C."]
lasts = ["Doe", "Smith", "Johnson"]

for z in zip(firsts, middles, lasts):
    print(z)

('John', 'Z.', 'Doe')
('Jane', 'A.', 'Smith')
('Jack', 'C.', 'Johnson')


### Mismatched lengths

If `zip`'s arguments have different lengths, it will stop as soon as it hits the end of the shortest iterable.

In [None]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith"]

for z in zip(firsts, lasts):
    print(z)

Starting with Python 3.10, `zip` accepts a keyword argument `strict` that makes it raise an error if the iterables have different lengths.

In [None]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson", "Davis"]

for z in zip(firsts, lasts, strict=True):
    print(z)

`zip` only raises the error when it actually runs into the length mismatch, not before it starts iterating; this is because `zip` consumes its arguments lazily and cannot know their lengths in advance.

In general, `zip` is used with iterables that are expected to have the same length. If that is the case, it is a good idea to always set `strict=True` to catch bugs in your code.

### Create a dictionary with zip

You can create dictionaries by feeding key-value pairs to the `dict` function, which means `zip` can be used to create a dictionary from two lists.

In [15]:
firsts = ["John", "Jane", "Jack"]
lasts = ["Doe", "Smith", "Johnson"]

dictionary = dict(zip(firsts, lasts))
print(dictionary)

{'John': 'Doe', 'Jane': 'Smith', 'Jack': 'Johnson'}


## Enumerate me

### How enumerate works

Python newcomers are usually exposed to this type of `for` loop very early.

In [16]:
for i in range(3):
    print(i)

0
1
2


This leads them to "learning" this anti-pattern of `for` loops to go over a list:

In [18]:
words = ["apple", "banana", "cherry"]
for i in range(len(words)):
    print(f"'{words[i]}' has {len(words[i])} characters.")

'apple' has 5 characters.
'banana' has 6 characters.
'cherry' has 6 characters.


The pythonic way of writing such a loop is iterating directly over the list:

In [19]:
words = ["apple", "banana", "cherry"]
for word in words:
    print(f"'{word}' has {len(word)} characters.")

'apple' has 5 characters.
'banana' has 6 characters.
'cherry' has 6 characters.


However, the final step in this indices-versus-elements story comes when you need to know the index of each element as well. For this, `enumerate` is your friend.

In [21]:
words = ["apple", "banana", "cherry"]
for i, word in enumerate(words):
    print(f"Word #{i}: '{word}' has {len(word)} characters.")

Word #0: 'apple' has 5 characters.
Word #1: 'banana' has 6 characters.
Word #2: 'cherry' has 6 characters.


### Optional `start` argument

The `enumerate` function can also accept an optional `start` argument. This argument specifies the starting index for the enumeration.

In [22]:
words = ["apple", "banana", "cherry"]
for i, word in enumerate(words, start=1):
    print(f"Word #{i}: '{word}' has {len(word)} characters.")

Word #1: 'apple' has 5 characters.
Word #2: 'banana' has 6 characters.
Word #3: 'cherry' has 6 characters.


This optional `start` argument is useful when you want to start the enumeration from a specific index.

By the way, the argument has to be an integer but can be negative.

In [23]:
for i, v in enumerate("abc", start=-4000):
    print(f"Index: {i}, Value: {v}")

Index: -4000, Value: a
Index: -3999, Value: b
Index: -3998, Value: c


### Unpacking when iterating

The `enumerate` function returns a lazy iterator, which means the items you iterate only become available as you need them. This can be useful when you want to process large amounts of data without consuming too much memory.

The items that `enumerate` returns are tuples, where the first element is the index and the second element is the value.

In [24]:
for pair in enumerate("abc"):  # `pair` avoids shadowing the built-in `tuple`
    print(pair)

(0, 'a')
(1, 'b')
(2, 'c')
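
A short sketch of the laziness mentioned above: the tuples can be pulled out one at a time with `next`, and only the remaining ones are left afterwards. The variable names are illustrative.

```python
letters = enumerate("abc")

first = next(letters)  # only the first tuple has been produced so far
rest = list(letters)   # the remaining tuples, produced on demand

print(first)
print(rest)
```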


### Deep unpacking

Things can get more interesting when you use `enumerate`, for example, on a `zip`:

In [25]:
pages = [5, 17, 32, 50]
for i, (start, end) in enumerate(zip(pages, pages[1:]), start=1):
    print(f"Chapter {i}: {end - start} pages long.")

Chapter 1: 12 pages long.
Chapter 2: 15 pages long.
Chapter 3: 18 pages long.


This snippet takes a list of pages where chapters of a book start and prints the length of each chapter. Notice how `enumerate` returns tuples with indices and values, but those values are extracted from a `zip`, which itself returns tuples.

We use deep unpacking to access all the values directly.
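
For comparison, this sketch shows the same loop without deep unpacking, so the nested shape of each item is visible. The names `items`, `bounds` and `lengths` are illustrative.

```python
pages = [5, 17, 32, 50]

# Each item is a nested tuple: (index, (start, end))
items = list(enumerate(zip(pages, pages[1:]), start=1))

lengths = []
for item in items:
    i, bounds = item       # first level of unpacking
    start, end = bounds    # second level of unpacking
    lengths.append((i, end - start))

print(items)
print(lengths)
```

Writing `for i, (start, end) in ...` performs both levels of unpacking in the loop header, which is why the parentheses are needed there.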

## Chaining comparison operators

### Chaining of comparison operators

One excellent feature of Python is its ability to chain comparison operators. This can make your code more readable and easier to understand. Check this snippet that looks natural:

In [26]:
a = 1
b = 2
c = 3
if a < b < c:
    print("The numbers are in ascending order.")

The numbers are in ascending order.


When Python sees two comparisons in a row, like `a < b < c`, it behaves as if you wrote `a < b and b < c`, except that `b` is only evaluated once (which is relevant if `b` is an expression like a function call).

Another example usage is for when you want to make sure that three values are the same:

In [29]:
a = b = 1
c = 2

if a == b == c:
    print("The numbers are the same.")
else:
    print("The numbers are different.")

The numbers are different.


You can chain an arbitrary number of comparison operators. For example, `a < b < c < d` checks that `a < b`, `b < c`, and `c < d` all hold.
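
As a quick check of that claim, this sketch compares a four-way chain with its unfolded `and` version over every combination of small integers. The function names `chained` and `unfolded` are made up for the illustration.

```python
from itertools import product

def chained(a, b, c, d):
    return a < b < c < d

def unfolded(a, b, c, d):
    return a < b and b < c and c < d

# Try every combination of four values taken from 0..3
agree = all(
    chained(*values) == unfolded(*values)
    for values in product(range(4), repeat=4)
)

print(agree)
```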

### Pitfalls

#### Non-transitive operators

We can use `a == b == c` to check if three variables are equal, but the same trick does not work for non-transitive operators like `!=`.

In [30]:
a = c = 1
b = 2

if a != b != c:
    print("a, b, and c all different:", a, b, c)

a, b, and c all different: 1 2 1


The problem is that the check unfolds to `a != b and b != c`, which checks that `b` is different from both `a` and `c`, but says nothing about whether `a` is different from `c`.

This is because `!=` is a non-transitive operator, i.e., knowing how `a` relates to `b` and knowing how `b` relates to `c` doesn't tell you anything about how `a` relates to `c`.
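
One possible way to check that several values are pairwise different is to compare the size of a set with the number of values. This is only a sketch: the helper `all_different` is hypothetical and assumes the values are hashable.

```python
def all_different(*values):
    # A set drops duplicates, so the sizes match only if nothing repeats.
    return len(set(values)) == len(values)

print(all_different(1, 2, 1))  # the case the chained != got wrong
print(all_different(1, 2, 3))
```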

#### Non-constant expressions or side-effects

Chaining comparisons like `a < b < c` evaluates `b` only once.

If `b` is an expression with side-effects, or one whose value can change between evaluations, then `a < b < c` and `a < b and b < c` are not equivalent. Check this example, in which the element in the middle gets evaluated only once:

In [31]:
def f():
    print("hey")
    return 3

if 1 < f() < 5:
    print("done")

hey
done


Just to corroborate, this version evaluates `f()` twice:

In [None]:
if 1 < f() and f() < 5:
    print("done")

This snippet shows that an expression like `1 < f() < 0`, which can never be `True` as a chain, can actually evaluate to `True` when it's unfolded:

In [35]:
l = [-2, 2]

def f():
    global l
    l = l[::-1]  # Reverse the list
    return l[0]

# evaluated once f() = 2
# if 1 < f() < 0:
#     print("ehh")  # Never gets printed

# evaluated twice: first time f() = 2, second time f() = -2
if 1 < f() and f() < 0:
    print("ehh 2")  # gets printed

ehh 2


#### Ugly chains

This feature looks neat, but chains where the operators are not aligned look very ugly. These chains look good:

In [None]:
a == b == c
a < b <= c
a <= b < c

but these chains look really ugly:

In [None]:
a < b > c   # it's better to use b > max(a, c), it's more readable and easier to understand
a <= b > c
a < b >= c

Now there are some other chains that are just confusing:

In [None]:
lst = []
a < b is True
a == b in lst
a in lst is True

In Python, `is`, `is not`, `in` and `not in` are comparison operators, so you can also chain them. But this creates weird situations like this one:

In [1]:
a = 3
lst = [3, 5]
if a in lst == True:
    print("is True")
else:
    print("is False")

is False


Here is a breakdown of what this does:
- `a in lst == True` is equivalent to `a in lst and lst == True`
- `a in lst` is `True`, but
- `lst == True` is `False`, so
- `a in lst == True` unfolds to `True and False`, which is `False`
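
The breakdown above can be reproduced step by step. This sketch also shows two spellings that behave as most readers would expect; the variable names are illustrative.

```python
a = 3
lst = [3, 5]

chained = a in lst == True                # chain of `in` and `==`
unfolded = (a in lst) and (lst == True)   # what the chain really means
parenthesised = (a in lst) == True        # parentheses break the chain
plain = a in lst                          # simplest and clearest

print(chained, unfolded, parenthesised, plain)
```

In practice the plain `if a in lst:` is the form to prefer, since comparing against `True` adds nothing.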

## Truthy, Falsy, and bool

### "Truthy" and "Falsy"

Any object can be tested for truth value, for use in an `if` or `while` condition or as an operand of the Boolean operations `and`, `or`, and `not`.

In [5]:
5 > 3

True

The next step is using an object that is not a Boolean value, for example:

In [6]:
lst = [1, 2, 3]
if lst:
    print(lst)

[1, 2, 3]


How can we know if an object is truthy or falsy? The answer is by using the built-in `bool` function.

In [7]:
bool(lst)

True

A value of a given type is Falsy when it is "empty" or "without any useful value". Examples of Falsy values are: the empty list, empty string, empty tuple, empty set, empty dictionary, the number 0, the boolean value `False`, and `None`.
- by default any object is Truthy
- an object is Falsy if calling `len` on it returns `0`
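
The Falsy values listed above can be checked in one go with `bool`. The list name `falsy_values` is illustrative.

```python
falsy_values = [[], "", (), set(), {}, 0, False, None]

# bool() reports the truth value of each object
results = [bool(value) for value in falsy_values]

print(results)
```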

### The `__bool__` dunder method

An object has a Falsy value if it defines a `__bool__` method that returns `False`.

`__bool__` is a dunder method that you can use to tell your objects if they are considered "truthy" or "falsy", by implementing it in your class.

In [8]:
class A:
    def __bool__(self):
        return False
    
a = A()
if a:
    print("Go Away!")

When given an arbitrary Python object that needs to be tested for a truth value, Python first tries to call `bool` on it, in an attempt to use its `__bool__` method. If the object does not implement a `__bool__` method, then Python tries to call `len` on it. Finally, if that also fails, Python defaults to giving a Truthy value to the object.
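
The lookup order described above can be illustrated with three small classes. The class names are made up for this sketch.

```python
class WithBool:
    def __bool__(self):
        return False

    def __len__(self):
        return 10  # ignored for truth testing because __bool__ exists


class WithLen:
    def __init__(self, size):
        self.size = size

    def __len__(self):
        return self.size  # used because there is no __bool__


class Plain:
    pass  # neither method: Truthy by default


print(bool(WithBool()))
print(bool(WithLen(0)), bool(WithLen(3)))
print(bool(Plain()))
```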

### Remarks

#### A note about containers with falsy objects

Things like a list that only contains zeroes or a dictionary composed of zeroes and empty lists are not Falsy, because the containers themselves are no longer empty:

In [13]:
# These are false
print(bool([]))
print(bool({}))
print(bool(0))

# These are true
print(bool([0]))
print(bool({0: []}))

False
False
False
True
True


#### A note about checking for `None`

Imagine someone implemented the following function to return the integer square root of a number, returning `None` for negative inputs (because negative numbers don't have square roots). 

When you use this function, you know it returns `None` if the computation fails, so you might be tempted to use it like this:

In [18]:
import math

def int_square_root(n):
    if n < 0:
        return None
    return math.floor(math.sqrt(n))

n = int(input("Enter a number: "))
int_sqrt = int_square_root(n)

print("debug int_sqrt value: ", int_sqrt)

if not int_sqrt:
    print("Negative numbers do not have an integer square root.")
else:
    print(int_sqrt)

ValueError: invalid literal for int() with base 10: '0.5'

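The `ValueError` shown above only means that `0.5` was typed at the prompt, which `int` cannot parse. The real problem appears when you enter `0`: `int_square_root` returns a meaningful value, `0`, but that value is still Falsy, so the program wrongly reports that the number is negative.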

So when you want to check if a function returned `None`, do not rely on the Truthy/Falsy value. Instead, check explicitly if the returned value is `None`.

In [None]:
returned = ""

# Use
if returned is None:
    pass

if returned is not None:
    pass

# Avoid
if not returned:
    pass

if returned:
    pass
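
Applied to the square root example, an explicit `None` check could look like the sketch below. The helper `describe` is hypothetical and returns a string instead of reading from `input`, so the behaviour is easy to verify.

```python
import math

def int_square_root(n):
    if n < 0:
        return None
    return math.floor(math.sqrt(n))

def describe(n):
    result = int_square_root(n)
    if result is None:  # explicit check: 0 is a valid answer
        return "no integer square root"
    return f"integer square root is {result}"

print(describe(0))
print(describe(-4))
print(describe(17))
```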

## Boolean short-circuiting

### Return values of the `and` and `or` operators

`x or y` returns `x` if `x` is truthy, otherwise it returns `y`. In other words, `(x or y) == (y if not x else x)`.

In [19]:
if 3 or 5:
    print("Yeah")
else:
    print("Nope")

Yeah


Now look at the program below and see what it prints:

In [20]:
print(3 or 5)

3


A similar thing happens with `and`. `x and y` returns `x` if `x` is falsy, otherwise it returns `y`. In other words, `(x and y) == (x if not x else y)`.

In [24]:
print(False and True)
print(True and 0)

False
0
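
As a sanity check of the two `if`-expression equivalences stated above, this sketch tries every pair from a small set of sample values and confirms that both forms return the very same object. The sample list is arbitrary.

```python
samples = [0, 1, "", "text", [], [0], None, True, False]

# `or` returns the left operand if it is truthy, else the right one
or_matches = all(
    (x or y) is (y if not x else x)
    for x in samples
    for y in samples
)

# `and` returns the left operand if it is falsy, else the right one
and_matches = all(
    (x and y) is (x if not x else y)
    for x in samples
    for y in samples
)

print(or_matches, and_matches)
```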


### Short-circuiting

This is what short-circuiting is: not evaluating the whole expression (stopping short of evaluating it) if we already have enough information to determine the result.

#### or

##### `False or y`

`or` evaluates to a truthy value if any of its operands is truthy. If the left operand to `or` is `False`, then the `or` operator has to look at its right operand in order to determine the result.

In [1]:
y = 5  # truthy value
if False or y:
    print("Got in!, y = ", y)
else:
    print("Didn't get in...")

Got in!, y =  5


In [2]:
y = []  # falsy value
if False or y:
    print("Got in!, y = ", y)
else:
    print("Didn't get in second...")

Didn't get in second...


##### `True or y`

On the other hand, if the left operand to `or` is `True`, we do not need to take a look at `y` because the result will be `True`.

Let's create a simple function that returns its argument unchanged but produces the side-effect of printing something on the screen. Then we can use it to take a look at the things that Python evaluates when trying to determine the value of `x or y`:

In [4]:
def print_and_return(x):
    print(f"Inside `print_and_return` with x = {x}")
    return x

print(print_and_return(False) or print_and_return(3))
print(print_and_return(True) or print_and_return(3))

Inside `print_and_return` with x = False
Inside `print_and_return` with x = 3
3
Inside `print_and_return` with x = True
True


Notice that, in the second example, `print_and_return` only printed once because Python never reached `print_and_return(3)`.

##### Short-circuiting of `or` expressions 

Now we tie everything together. If the left operand to `or` is `False` or falsy, we know that `or` has to look at its right operand and will, therefore, return the value of its right operand after evaluating it. On the other hand, if the left operand is `True` or truthy, `or` will return the value of the left operand without even evaluating the right operand.

#### and

##### `False and y`

`and` gives `True` if both operands are `True`. Therefore, if we have an expression like

In [10]:
val = False and y
print(val)

False


do we need to know what `y` is in order to figure out what `val` is? No, we do not, because regardless of whether `y` is `True` or `False`, `val` is always `False`:

In [11]:
print(False and True)
print(False and False)

False
False


If we take the `False and y` expression from this example and compare it with the `if` expression we wrote earlier, which was

    `(x and y) == (x if not x else y)`

we see that, in this case, `x` was substituted by `False`, and, therefore, we have
    
    `(False and y) == (False if not False else y)`

Now, the condition inside that `if` expression reads

    `not False`

which we know evaluates to `True`, meaning that the `if` expression never returns `y`. The same happens with any falsy left operand:

In [20]:
print(print_and_return([]) and print_and_return(True))  # [] is falsy
print(print_and_return(0) and print_and_return(True))  # 0 is falsy
print(print_and_return({}) and print_and_return(True))  # {} is falsy
print(print_and_return(0) and print_and_return(0))  # both are falsy, but only the left matters

Inside `print_and_return` with x = []
[]
Inside `print_and_return` with x = 0
0
Inside `print_and_return` with x = {}
{}
Inside `print_and_return` with x = 0
0


##### `True and y`

If we evaluate `True and y`, we figure out that the result of such an expression is always the value of `y`, because the left operand being `True`, or any other truthy value, doesn't give `and` enough information.

##### Short-circuiting of `and` expressions 

To tie everything together: if the left operand to `and` is `False` or falsy, we know the expression returns the value of the left operand regardless of the right operand, and therefore we do not even evaluate the right operand. On the other hand, if the left operand to `and` is `True` or truthy, then `and` will evaluate the right operand and return its value.
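
A common use of this behaviour is guarding an operation that would fail on an empty container. In this sketch the hypothetical helper `starts_with_positive` only indexes the list when the list is non-empty.

```python
def starts_with_positive(numbers):
    # `numbers[0]` is only evaluated when `numbers` is truthy (non-empty),
    # so an empty list never triggers an IndexError.
    return bool(numbers and numbers[0] > 0)

print(starts_with_positive([]))
print(starts_with_positive([3, -1]))
print(starts_with_positive([-3, 1]))
```

The call to `bool` is there only so the function always returns `True` or `False` rather than the empty list itself.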

### Short-circuiting in plain English

Instead of memorising rules about what sides get evaluated when, just remember that both `and` and `or` will evaluate as many operands as needed to determine the overall Boolean result, and will then return the value of the last side that they evaluated.

As an immediate conclusion, the left operand is always evaluated, as you might imagine.
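
The plain-English rule can be observed by recording which operands get evaluated. The helper `track` and the list `evaluated` are illustrative names.

```python
evaluated = []

def track(value):
    evaluated.append(value)  # remember that this operand was evaluated
    return value

# `or` keeps going until it finds a truthy operand, then stops
result = track(0) or track("") or track("hit") or track("never")

print(result)
print(evaluated)
```

The result is the last operand that was evaluated, and `"never"` does not appear in the record at all.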


### `all` and `any`

The built-in functions `all` and `any` also short-circuit, as they are simple extensions of the behaviours provided by the `and` and `or` operators.

`all` wants to make sure that all the values of its argument are truthy, so as soon as it finds a falsy value, it knows it's game over. The docs say `all` is equivalent to this:

In [22]:
def custom_all(iterable):
    for element in iterable:
        if not element:
            return False
    return True

my_list = [True, True, True, True]
my_list_2 = [True, True, False, True]

print(all(my_list))  # True
print(custom_all(my_list))  # True

print(all(my_list_2))  # False
print(custom_all(my_list_2))  # False

True
True
False
False


Similarly, `any` is going to do its best to look for some value that is truthy. Therefore, as soon as it finds one, `any` knows it has achieved its goal. It behaves like something similar to this:

In [23]:
def custom_any(iterable):
    for element in iterable:
        if element:
            return True
    return False

my_list = [False, False, False, False]
my_list_2 = [False, False, True, False]

print(any(my_list))  # False
print(custom_any(my_list))  # False

print(any(my_list_2))  # True
print(custom_any(my_list_2))  # True

False
False
True
True
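
To see that the built-ins really stop early, this sketch feeds them iterators and then looks at what was left unconsumed. The variable names are illustrative.

```python
numbers = iter([0, 0, 7, 0, 9])
found = any(numbers)            # stops right after reaching 7
left_after_any = list(numbers)  # elements `any` never looked at

values = iter([1, 0, 5])
everything = all(values)        # stops right after reaching 0
left_after_all = list(values)

print(found, left_after_any)
print(everything, left_after_all)
```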


### Short-circuiting in chained comparisons

Comparison operators can be chained arbitrarily, and such chains are almost equivalent to a series of comparisons separated by `and`, except that each subexpression is evaluated at most once, to prevent wasting resources. Therefore, because there is an `and` in the background, chained comparisons also short-circuit.

In [24]:
# 1 > 2 is False, so there is no need to evaluate the right side.
print_and_return(1) > print_and_return(2) > print_and_return(3)

Inside `print_and_return` with x = 1
Inside `print_and_return` with x = 2


False

### Examples in code

#### Short-circuit to save time

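If the cheaper operand goes first and already settles the result, the expensive operand is never evaluated. Here `validate` is `False`, so the regular expression is never run:

In [None]:
import re
import timeit

s = b"a"*1000 + b"*"
validate = False

# Pass the notebook's globals so the timed statement can see `validate`, `re` and `s`.
print(timeit.timeit("validate and not re.fullmatch(b'[A-Za-z0-9+/]*={0,2}', s)", globals=globals()))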

#### Short-circuit to flatten `if` statements

In [None]:
if validate:
    print("Validating...")
    if re.fullmatch(b'[A-Za-z0-9+/]*={0,2}', s):
        print("Valid!")
    else:
        print("Not valid!")

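If you only care about the innermost branch, short-circuiting lets you flatten the nested `if` statements into a single `if` with `and`; the regular expression is still only evaluated when `validate` is truthy.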

In [None]:
if validate and re.fullmatch(b'[A-Za-z0-9+/]*={0,2}', s):
    print("Valid!")

##### Checking preconditions before expression

In [None]:
def set_terminator(self, term):
    if isinstance(term, str) and self.use_encoding:
        term = bytes(term, self.encoding)
    elif isinstance(term, int) and term < 0:
        raise ValueError("Terminator must be a non-negative integer")
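
The order of the operands matters in checks like these. In the sketch below, the hypothetical helper `is_negative_int` tests the type first, so the comparison is only attempted on integers; the helper `unguarded` swaps the operands and fails on a string.

```python
def is_negative_int(term):
    # The type check runs first; `term < 0` is skipped for non-integers.
    return isinstance(term, int) and term < 0

def unguarded(term):
    # Same operands in the wrong order: the comparison runs first.
    return term < 0 and isinstance(term, int)

try:
    unguarded("abc")
except TypeError:
    unguarded_failed = True  # a str cannot be compared with an int
else:
    unguarded_failed = False

print(is_negative_int("abc"), is_negative_int(-1), unguarded_failed)
```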


#### Define default values

In [31]:
greet = input("Type your name >> ") or "Guest"
print(f"Hello, {greet}!")

Hello, Chris!
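
The same idea can be wrapped in a function so it can be tried without typing at a prompt. The helper `display_name` is hypothetical.

```python
def display_name(raw_name):
    # An empty string is falsy, so `or` falls through to the default.
    return raw_name or "Guest"

print(display_name("Chris"))
print(display_name(""))
```

Keep in mind that this replaces every falsy value with the default, which is fine for an empty name but may not be what you want when `0` or an empty list is a legitimate value.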


#### Find witnesses in a sequence of items

In [32]:
items = [14, 16, 18, 20, 35, 41, 100]
any_found = False

for item in items:
    any_found = item % 2
    if any_found:
        print(f"Found an odd number: {item}")
        break

Found an odd number: 35


Look at this neat simplified version:

In [33]:
items = [14, 16, 18, 20, 35, 41, 100]
is_odd = lambda x: x % 2

if any(is_odd(witness := item) for item in items):
    print(f"Found an odd number: {witness}")

Found an odd number: 35
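
An alternative sketch that does not rely on the assignment expression `:=` uses `next` with a generator and a default value. It also stops at the first odd number, and the default `None` covers the case where no witness exists.

```python
items = [14, 16, 18, 20, 35, 41, 100]

# The generator is consumed lazily, so the search stops at 35.
witness = next((item for item in items if item % 2), None)

# With no odd numbers, the default value is returned instead.
no_witness = next((item for item in [2, 4, 6] if item % 2), None)

print(witness, no_witness)
```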
