# Python Deep Dive Part 1 - Functional

## Overview

This part covers the Python language, its built-in types, the standard library, and idiomatic Python.

Read the Zen of Python:

In [1]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!


## Basics review

### The Python type hierarchy

* Numbers
  * Integral
    * Integers
    * Booleans
  * Non-integral
    * Floats
    * Decimals
    * Complex
    * Fractions
* Collections
  * Sequences
    * Mutable
      * Lists
    * Immutable
      * Tuples
      * Strings
  * Sets
    * Mutable
      * Sets
    * Immutable
      * Frozen sets
  * Mappings
    * Dictionaries
* Callables
  * User-defined functions
  * Generators
  * Classes
  * Instance methods
  * Class instances (`__call__()`)
  * Built-in functions (e.g. `len`, `open`)
  * Built-in methods (e.g. `my_list.append`)
* Singletons
  * `None`
  * `NotImplemented`
  * `Ellipsis` (same as the ellipsis literal `...`)

### Multi-line statements and strings

A Python program is nothing more than a text file that contains physical lines of code. The Python compiler parses the code and combines certain physical lines into logical lines of code.

There is a difference between a physical newline character and a logical newline token. Sometimes physical newlines are ignored so that multiple physical lines combine into a single logical line of code, terminated by a logical newline token. This lets us spread code that is technically a single line over multiple lines for readability.

The conversion between physical and logical newlines can be done implicitly or explicitly.

* Implicit line breaks, supports inline comments
  * List literals: []
  * Tuple literals: ()
  * Dictionary literals: {}
  * Set literals: {}
  * Function arguments / parameters

In [2]:
my_list = [
    1, # item 1
    2, # item 2
    3, # item 3
]

def my_func(
    a,
    b, # comment
):
    print(a, b)

print(my_list)
my_func(10, #comment
        20)

[1, 2, 3]
10 20


* Explicit line break, doesn't support inline comments
  * You can break up statements over multiple lines explicitly by using `\`.
  * Multiline statements are not implicitly converted to a single logical line.

In [3]:
a = 1
b = 2
c = 3
if a < b \
    and b < c \
    and a < c:
    print('yes')

yes


* Multiline string literals
  * Multiline string literals can be created using triple delimiters (`'''` or `"""`).
  * Non-visible characters such as newlines, tabs, etc. are actually part of the string.
  * A multiline string is just a regular string.
  * They are not comments, although they can be used as such, especially with *docstrings*.
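The points above can be sketched as follows. The names `text`, `same_text`, and `greet` are illustrative only.

```python
# A triple-quoted literal keeps the newlines and indentation typed inside it.
text = '''line 1
    line 2'''

# The same string written with explicit escape sequences.
same_text = 'line 1\n    line 2'

print(repr(text))
print(text == same_text)


def greet():
    """Return a greeting."""
    return 'hello'


# The string literal that opens the body is stored as the docstring.
print(greet.__doc__)
```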

### Variable names

Identifier names are case-sensitive and must follow certain rules:

* Must start with underscore (`_`) or letter, followed by any number of underscores, letters, or digits.
* Cannot be reserved words.
* Conventions:
  * A single underscore in front of an identifier name indicates this is an internal or private object. Objects named this way will not get imported by a statement such as: `from module import *`
  * Double underscore in front of an identifier name is used to mangle class attributes, useful in inheritance chains.
  * Double underscore in both start and end of an identifier name is used for system-defined names that have a special meaning to the interpreter. Don't invent them.
  * Other naming conventions from [PEP 8 style guide](https://peps.python.org/pep-0008/)
    * Packages: short, all lowercase, preferably no underscores (e.g. utilities)
    * Modules: short, all lowercase, underscores allowed (e.g. db_utils)
    * Classes: upper camel case (e.g. CapWords)
    * Functions: lowercase, words separated by underscores (snake_case)
    * Variables: same as functions
    * Constants: all uppercase, words separated by underscores (e.g. CONST_WORDS)
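A minimal sketch of the single and double underscore conventions. The `Account` class and its attributes are hypothetical and exist only for illustration.

```python
class Account:
    def __init__(self):
        self._balance = 0      # single underscore: internal by convention only
        self.__token = 'abc'   # double underscore: the name is mangled


acct = Account()
print(acct._balance)              # still accessible; the underscore is only a hint
print(acct._Account__token)       # the mangled name that is actually stored
print(hasattr(acct, '__token'))   # False: no attribute exists under the plain name
```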

### Conditionals

An `if` statement can have zero or more `elif` parts, and the `else` part is optional.

Conditional expressions (sometimes called a “ternary operator”) have the lowest priority of all Python operations. The expression `x if C else y` first evaluates the condition. If `C` is true, `x` is evaluated and its value is returned; otherwise, `y` is evaluated and its value is returned.

A `match` statement takes an expression and compares its value to successive patterns given as one or more `case` blocks. This is superficially similar to a switch statement in C, Java or JavaScript (and many other languages), but it’s more similar to pattern matching in languages like Rust or Haskell. Only the first pattern that matches gets executed and it can also extract components (sequence elements or object attributes) from the value into variables. You can combine several literals in a single pattern using `|`.

### Functions

The keyword `def` introduces a function definition. It must be followed by the function name and the parenthesized list of formal parameters. The statements that form the body of the function start at the next line, and must be indented.

The first statement of the function body can optionally be a string literal; this string literal is the function’s documentation string, or *docstring*.

### While loop

The `while` statement is used for repeated execution as long as an expression is true.

This repeatedly tests the expression and, if it is true, executes the suite of the `while` clause; if the expression is false the suite of the `else` clause, if present, is executed and the loop terminates.

A `break` statement executed in the first suite terminates the loop without executing the `else` clause’s suite. A `continue` statement executed in the first suite skips the rest of the suite and goes back to testing the expression.
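The interaction between `break` and the `else` clause can be sketched like this. The helper `find_first_negative` is a hypothetical example, not part of any library.

```python
def find_first_negative(values):
    """Return the index of the first negative number, or None if there is none."""
    i = 0
    while i < len(values):
        if values[i] < 0:
            break          # leaves the loop and skips the else clause
        i += 1
    else:
        return None        # runs only when the loop condition became false
    return i


print(find_first_negative([1, -2, 3]))
print(find_first_negative([1, 2]))
```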

### For loop

The `for` statement is used to iterate over the elements of a *sequence* (such as a string, tuple or list) or other iterable object.

When the iterator is exhausted, the suite in the `else` clause, if present, is executed, and the loop terminates.

A `break` statement executed in the first suite terminates the loop without executing the `else` clause’s suite. A `continue` statement executed in the first suite skips the rest of the suite and continues with the next item, or with the `else` clause if there is no next item.

The for-loop makes assignments to the variables in the target list. This overwrites all previous assignments to those variables, including those made in the suite of the for-loop.

Names in the target list are not deleted when the loop is finished, but if the sequence is empty, they will not have been assigned to at all by the loop.
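A small sketch of how the loop target behaves. The names `seen`, `i`, and `item` are illustrative.

```python
seen = []
for i in range(3):
    seen.append(i)
    i = 10          # overwritten again when the next item is assigned

print(seen)
print(i)            # the name outlives the loop and keeps its last assigned value

for item in []:
    pass

# The sequence was empty, so the loop never assigned `item`.
try:
    item
except NameError:
    print('item was never assigned')
```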

Code that modifies a collection while iterating over that same collection can be tricky to get right. Instead, it is usually more straight-forward to loop over a copy of the collection or to create a new collection.
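Both strategies can be sketched as follows. The sample data is made up.

```python
users = {'ann': 'active', 'bob': 'inactive', 'cy': 'active'}

# Strategy 1: iterate over a copy so the original can be changed safely.
for name, status in users.copy().items():
    if status == 'inactive':
        del users[name]

print(users)

# Strategy 2: build a new collection instead of mutating the old one.
numbers = [1, 2, 3, 4, 5, 6]
evens = [n for n in numbers if n % 2 == 0]
print(evens)
```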

### Classes

Classes provide a means of bundling data and functionality together. Creating a new class creates a new type of object, allowing new instances of that type to be made. Each class instance can have attributes attached to it for maintaining its state. Class instances can also have methods (defined by their class) for modifying that state.

In [18]:
class Rectangle:
    def __init__(self, width, height):
        self.width = width # this actually calls the setter of width
        self.height = height

    @property
    def width(self):
        return self._width
    
    @width.setter
    def width(self, width):
        if width > 0:
            self._width = width
        else:
            raise ValueError('Width should be greater than 0.')
        
    @property
    def height(self):
        return self._height
    
    @height.setter
    def height(self, height):
        if height > 0:
            self._height = height
        else:
            raise ValueError('Height should be greater than 0.')
        
    def perimeter(self):
        return (self._width + self._height) * 2
    
    def area(self):
        return self._width * self._height
    
    def __repr__(self):
        return f'Rectangle({self._width}, {self._height})'
    
    def __str__(self):
        return f'Rectangle: width {self._width}, height {self._height}'
    
    def __eq__(self, other):
        if isinstance(other, type(self)):
        # if issubclass(type(other), type(self)):
            return self._width == other._width and self._height == other._height
        else:
            return False
        
    def __gt__(self, other):
        if issubclass(type(other), type(self)):
            return self.area() > other.area()
        else:
            return NotImplemented
        
    def __ge__(self, other):
        if issubclass(type(other), type(self)):
            return self.area() >= other.area()
        else:
            return NotImplemented
        
rect1 = Rectangle(10, 5)
print(rect1)
print(repr(rect1))
rect1.width = 20
print(f'width: {rect1.width}, height: {rect1.height}')
print(f'Perimeter: {rect1.perimeter()}')
print(f'Area: {rect1.area()}')
# rect1.width = 0

# rect2 = Rectangle(0, 5)
rect2 = Rectangle(20, 5)
print(rect1 == rect2)

rect3 = Rectangle(10, 5)
print(f'{rect2.area()=}, {rect3.area()=}')
print('rect2 == rect3:', rect2 == rect3)
print('rect2 != rect3:', rect2 != rect3)
print('rect2 > rect3:', rect2 > rect3)
print('rect2 >= rect3:', rect2 >= rect3)
print('rect2 < rect3:', rect2 < rect3)
print('rect2 == 3:', rect2 == 3)
# print('rect2 > 3:', rect2 > 3)

Rectangle: width 10, height 5
Rectangle(10, 5)
width: 20, height: 5
Perimeter: 50
Area: 100
True
rect2.area()=100, rect3.area()=50
rect2 == rect3: False
rect2 != rect3: True
rect2 > rect3: True
rect2 >= rect3: True
rect2 < rect3: False
rect2 == 3: False


## Variables and memory

### Variables are memory references

* A Python variable is a symbolic name that is a *reference* or *pointer* to an object stored somewhere in memory.
* A reference can be seen as the connection between a variable name and a value; it contains the memory address where the value is held.
* Once an object is assigned to a variable, you can refer to the object by that name.
* A value is another word for the actual data or object.
* When a variable is passed to a function, the function receives a reference to the same object (not a copy of it, and not a reference to the caller's variable), so both names refer to one object in memory.

In [19]:
my_var = 10
print(my_var)
print(id(my_var))
print(hex(id(my_var)))

10
140710205051976
0x7ff9a5c9d448


### Reference counting

Finding the reference count:

* `sys.getrefcount(var)`, passing `var` to it creates an extra reference
* `ctypes.c_long.from_address(address).value`, passing memory address, does not affect reference count

In [21]:
import sys
import ctypes

my_list = [1, 2, 3]
address_my_list = id(my_list)
my_list2 = my_list
print('id(my_list) == id(my_list2):', address_my_list == id(my_list2))

print(f'{sys.getrefcount(my_list)=}')
print(f'{ctypes.c_long.from_address(address_my_list).value=}')

id(my_list) == id(my_list2): True
sys.getrefcount(my_list)=3
ctypes.c_long.from_address(address_my_list).value=2


### Garbage collection

* Can be controlled programmatically using the `gc` module
* By default it is turned on
* You may turn it off for performance reasons if you are sure your code does not create circular references
* Runs periodically on its own if turned on
* You can call it manually

In [25]:
import ctypes
import gc

def get_ref_count(address):
    return ctypes.c_long.from_address(address).value

def check_obj_by_address(address):
    for obj in gc.get_objects():
        if id(obj) == address:
            return 'Object exists'
    return 'Obj not found'

class A:
    def __init__(self):
        self.b = B(self)
        print(f'A: self={id(self)}, b={id(self.b)}')

class B:
    def __init__(self, a):
        self.a = a
        print(f'B: self={id(self)}, a={id(self.a)}')

gc.disable()

a = A()
address_a = id(a)
address_b = id(a.b)
print('a reference count:', get_ref_count(address_a))
print('b reference count:', get_ref_count(address_b))
print('a:', check_obj_by_address(address_a))
print('b:', check_obj_by_address(address_b))

print('\nRemove variable a')
a = None
print('a reference count:', get_ref_count(address_a))
print('b reference count:', get_ref_count(address_b))
print('a:', check_obj_by_address(address_a))
print('b:', check_obj_by_address(address_b))

print('\nTurn on gc')
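gc.collect()  # runs one collection manually; automatic collection stays disabled
# The objects are freed now, so the counts read below come from released memory and are not meaningful.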
print('a reference count:', get_ref_count(address_a))
print('b reference count:', get_ref_count(address_b))
print('a:', check_obj_by_address(address_a))
print('b:', check_obj_by_address(address_b))

B: self=2025821964944, a=2025821970896
A: self=2025821970896, b=2025821964944
a reference count: 2
b reference count: 1
a: Object exists
b: Object exists

Remove variable a
a reference count: 1
b reference count: 1
a: Object exists
b: Object exists

Turn on gc
a reference count: 0
b reference count: 0
a: Obj not found
b: Obj not found


### Dynamic typing vs static typing

Python employs dynamic typing, allowing variable types to be determined and checked at runtime rather than during compilation. Variables are purely references to objects. They don't have an inherent static type.

Dynamic typing provides versatility and simplicity, allowing developers to work without specifying types explicitly.

In contrast to statically-typed languages, where types are determined at compile-time, Python's dynamic typing offers more flexibility during development.
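A minimal sketch of what dynamic typing means in practice. The variable names are illustrative.

```python
value = 10
first_type = type(value)

value = 'ten'            # the same name now refers to a str object
second_type = type(value)

value = [1, 0]           # and now to a list
third_type = type(value)

print(first_type, second_type, third_type)

# Types are checked when an operation runs, not when the code is compiled.
try:
    'ten' + 10
except TypeError as exc:
    print('TypeError:', exc)
```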

### Variable reassignment

Variable reassignment is a common practice that involves assigning a new value to an existing variable. When you reassign a variable, you are essentially pointing its reference at a different object. Reassignment only affects the variable being reassigned; it does not alter other variables that reference the original value.

In [27]:
a = 10
print(f'{id(a)=}')
a += 1
print(f'{id(a)=}')

id(a)=140710205051976
id(a)=140710205052008


### Object mutability

Changing the data inside the object is called modifying the internal state of the object. An object whose internal state can be changed is called mutable, otherwise immutable.

* Immutable
  * Numbers (int, float, booleans, etc)
  * Strings
  * Tuples
  * Frozen sets
  * User-defined classes (only if written to be immutable)
* Mutable
  * Lists
  * Dictionaries
  * Sets
  * User-defined classes (the default behavior)

Tuples are immutable: elements cannot be deleted, inserted, or replaced, but tuples can contain mutable elements.

In [28]:
t = ([1, 2], [3, 4])
print(f'{t=}, {id(t)=}')
t[0].append(5)
print(f'{t=}, {id(t)=}')

t=([1, 2], [3, 4]), id(t)=2025822634048
t=([1, 2, 5], [3, 4]), id(t)=2025822634048


### Function arguments and mutability

Arguments are passed to Python functions by object reference: the parameter name is bound to the same object the caller passed.

Immutable objects are safe from unintended side-effects. An immutable argument cannot be changed inside a function; any apparent change creates a new object and rebinds the local name. But watch out for immutable collection objects that contain mutable objects.

Mutable objects are not safe from unintended side-effects. Modifications to mutable arguments within a function persist outside the function.
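The difference between mutating an argument and rebinding the parameter name can be sketched as follows. The functions `mutate` and `rebind` are hypothetical helpers.

```python
def mutate(items):
    items.append(4)          # changes the object the caller also references


def rebind(items):
    items = items + [4]      # creates a new list bound to the local name only
    return items


data = [1, 2, 3]
mutate(data)
print(data)                  # the caller sees the appended element

result = rebind(data)
print(data)                  # unchanged by rebind
print(result)
```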

Be cautious when using mutable objects as *default* function arguments. Default values are evaluated only once, when the function is defined, so a mutable default is shared across calls and any modification to it persists.
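A sketch of the mutable-default pitfall and the common `None` sentinel workaround. The function names are made up for this example.

```python
def append_bad(item, bucket=[]):
    """The default list is created once and shared between calls."""
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    """Use None as a sentinel and create a fresh list on each call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


print(append_bad(1))
print(append_bad(2))    # the element from the previous call is still there
print(append_good(1))
print(append_good(2))   # each call starts from an empty list
```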

### Shared references and mutability

The term *shared reference* refers to two variables referencing the same object in memory.

In the following cases, Python's memory manager decides to automatically re-use the memory references:

In [29]:
a = 10
b = 10
id(a) == id(b)

True

In [30]:
s1 = 'hello'
s2 = 'hello'
id(s1) == id(s2)

True

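With mutable objects, the Python memory manager never creates shared references automatically; a shared reference to a mutable object exists only when you assign one variable to another.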

In [32]:
list1 = [1, 2, 3]
list2 = [1, 2, 3]
id(list1) == id(list2)

False

### Variable equality

We can think of variable equality in two fundamental ways:

* Memory address
  * `is`, identity operator, `var1 is var2`
  * `is not`, `var1 is not var2`, or `not(var1 is var2)`
* Object state (data)
  * `==`, equality operator, `var1 == var2`
  * `!=`, `var1 != var2`, or `not(var1 == var2)`

The `None` object can be assigned to variables to indicate that they are not set, i.e., an empty value or null pointer. The Python memory manager always uses a shared reference when `None` is assigned to a variable. So we can test whether a variable is not set by comparing its memory address to the memory address of `None` using the `is` operator.
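A small sketch contrasting identity with equality, plus the usual `is None` test. The function `describe` is a hypothetical helper.

```python
a = [1, 2, 3]
b = [1, 2, 3]
c = a

print(a == b)    # same contents
print(a is b)    # two distinct list objects
print(a is c)    # both names reference the same object


def describe(value=None):
    # Compare with None by identity, not equality.
    if value is None:
        return 'not set'
    return f'set to {value!r}'


print(describe())
print(describe(0))   # 0 is falsy, but it is not None
```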

### Everything is an object

Operators, functions, classes, types are all objects, i.e., instances of classes. They all have a memory address.

As a consequence:

* Any object can be assigned to a variable
* Any object can be passed to a function
* Any object can be returned from a function
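The three consequences above can be sketched with functions, which are objects like any other. All names here are illustrative.

```python
def square(x):
    return x ** 2


def apply(func, value):
    """Call whatever callable was passed in."""
    return func(value)


def make_multiplier(factor):
    """Build and return a new function object."""
    def multiply(x):
        return x * factor
    return multiply


f = square                        # assign a function to a variable
print(type(square), hex(id(square)))
print(apply(f, 4))                # pass a function as an argument
triple = make_multiplier(3)       # receive a function as a return value
print(triple(5))
print(isinstance(int, object), isinstance(square, object))
```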

### Interning in CPython

Python interning is a technique aimed at optimizing memory usage and improving performance. Interning allows Python to store only one copy of an object in memory and reuse it instead of creating duplicates.

* Integer interning
  * Small integers (-5 to 256 inclusive) are preloaded into the interpreter's memory on startup to optimize speed and memory. Every time we try to create an integer object in this range, CPython automatically refers to these cached objects in memory instead of creating new integer objects.
* String interning
  * String interning is a useful mechanism in CPython: two interned strings can be compared by identity (`is`) instead of character by character, which is much faster.
  * As the Python code is compiled, identifiers are interned
    * variable names
    * function names
    * class names
    * etc
  * Some string literals may also be automatically interned
    * string literals that look like identifiers
  * The rules for *implicit* string interning differ between Python versions and implementations, so relying on them can lead to unexpected bugs.
  * If we need to intern a string for some reason, we can do so explicitly with the `sys.intern` function.
  * When should you intern strings explicitly
    * dealing with a large number of strings that could have high repetition
    * lots of string comparisons
    * in general, you do not need to intern strings yourself

In [33]:
a = 10
b = 10
print(f'{id(a)=}, {id(b)=}')
print('a is b:', a is b)

a = 257
b = 257
print(f'{id(a)=}, {id(b)=}')
print('a is b:', a is b)

id(a)=140710205051976, id(b)=140710205051976
a is b: True
id(a)=2025821827632, id(b)=2025821829616
a is b: False


In [37]:
import sys

a = 'hello'
b = 'hello'
print(f'{id(a)=}, {id(b)=}')
print('a is b:', a is b)

a = 'hello world'
b = 'hello world'
print(f'{id(a)=}, {id(b)=}')
print('a is b:', a is b)

a = sys.intern('the quick brown fox')
b = sys.intern('the quick brown fox')
c = 'the quick brown fox'
print(f'{id(a)=}, {id(b)=}, {id(c)=}')
print('a is b:', a is b)
print('a is c:', a is c)

id(a)=2025794550192, id(b)=2025794550192
a is b: True
id(a)=2025821641584, id(b)=2025821634608
a is b: False
id(a)=2025798438624, id(b)=2025798438624, id(c)=2025821711264
a is b: True
a is c: False


In [40]:
from sys import intern
from time import perf_counter

def compare_using_equals(n):
    a = 'a very long string that is not automatically interned' * 200
    b = 'a very long string that is not automatically interned' * 200
    print('a is b:', a is b)
    for _ in range(n):
        if a == b:
            pass

def compare_using_is(n):
    a = intern('a very long string that is not automatically interned' * 200)
    b = intern('a very long string that is not automatically interned' * 200)
    print('a is b:', a is b)
    for _ in range(n):
        if a is b:
            pass

start = perf_counter()
compare_using_equals(10000000)
end = perf_counter()
print('Equals:', end - start)

start = perf_counter()
compare_using_is(10000000)
end = perf_counter()
print('Is:', end - start)

a is b: False
Equals: 4.68861489999108
a is b: True
Is: 0.3186644000088563


### Peephole optimization in CPython

Peephole optimization is a code optimization technique in Python that occurs during compilation.

The Python interpreter includes a peephole optimizer that examines and modifies the generated bytecode for better efficiency. It pre-computes constant expressions and replaces them with the result, improving execution speed.

Peephole optimizations include:

* pre-calculating constant expressions
  * numeric calculations: e.g. `24 * 60` -> `1440`
  * short sequences (older CPython versions only folded results shorter than 20 items; newer versions use larger limits):
    * `(1, 2) * 3` -> `(1, 2, 1, 2, 1, 2)`
    * `'abc' * 2` -> `'abcabc'`
    * `'hello' + 'world'` -> `'helloworld'`
* optimizing membership tests
  * mutables are replaced by immutables
    * lists -> tuples
    * sets -> frozensets

Set membership is much faster than list or tuple membership because sets are hash-based, like dictionaries. So for hashable elements, instead of writing `if e in [1, 2, 3]:` or `if e in (1, 2, 3):`, write `if e in {1, 2, 3}:`.

Python's peephole optimization contributes to faster code execution and reduced memory usage during runtime.

In [46]:
def my_func():
    a = 24 * 60
    b = (1, 2) * 5
    c = 'ab' * 10
    d = 'abc' * 7
    e = ['1', '2'] * 3

my_func.__code__.co_consts

(None,
 1440,
 (1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
 'abababababababababab',
 'abcabcabcabcabcabcabc',
 '1',
 '2',
 3)

In [47]:
def my_func():
    for e in [1, 2, 3]:
        pass

my_func.__code__.co_consts

(None, (1, 2, 3))

In [48]:
import string
import time

char_tuple = tuple(string.ascii_letters)
char_set = set(string.ascii_letters)

def membership_test(n, container):
    for _ in range(n):
        if 'z' in container:
            pass

n = 10000000
start = time.perf_counter()
membership_test(n, char_tuple)
end = time.perf_counter()
print('Tuple:', end - start)

start = time.perf_counter()
membership_test(n, char_set)
end = time.perf_counter()
print('Set:', end - start)

Tuple: 3.668143000002601
Set: 0.34278949999134056


## Numeric types

Main types of numbers:

* Boolean truth values: `bool`
* Integer numbers (ℤ): `int`
* Rational numbers (ℚ): `fractions.Fraction`
* Real numbers (ℝ): `float`, `decimal.Decimal`
* Complex numbers (ℂ): `complex`

### Integers

Integers are represented internally using base-2 (binary) digits, not decimal.

Integers are represented by the `int` type in Python, and their size is limited only by available memory.

The largest integer number that can be represented using 8 bits is 255 (2⁸ - 1).

If we also need to handle negative integers, 1 bit is reserved for the sign, leaving only 7 bits for the magnitude. The largest magnitude representable with 7 bits is 127 (2⁷ - 1), which would give the range [-127, 127]. That scheme has two encodings of zero (+0 and -0), and 0 does not need a sign, so the bit pattern for -0 is used for one extra negative number instead, giving the range [-128, 127].

In a 32-bit architecture, the operating system can theoretically support up to 4GB of RAM due to the limited address space. This is because a 32-bit processor can only represent 2³² different addresses, resulting in a maximum addressable memory: 2³² bytes = 2² * 2¹⁰ * 2¹⁰ * 2¹⁰ bytes = 4 GB
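The ranges above can be reproduced with a few lines of arithmetic. This sketch assumes the two's complement layout for signed values, and the helper names are hypothetical.

```python
def unsigned_range(bits):
    """Smallest and largest values of an unsigned integer with `bits` bits."""
    return 0, 2 ** bits - 1


def signed_range(bits):
    """Smallest and largest values of a two's complement integer."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(unsigned_range(8))
print(signed_range(8))
print(2 ** 32 // 2 ** 30)    # number of 2**30-byte units in 2**32 bytes

# Python's own int is not limited to a fixed width.
big = 2 ** 100
print(big.bit_length())      # bits needed to represent the value
```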

In [51]:
import sys

print(f'{type(1)=}')
print(f'{sys.getsizeof(1)=}', 'bytes')
print(f'{sys.getsizeof(0)=}', 'bytes')

type(1)=<class 'int'>
sys.getsizeof(1)=28 bytes
sys.getsizeof(0)=24 bytes


Integers support the following arithmetic operations:

* addition `+`
* subtraction `-`
* multiplication `*`
* division `/`: always returns a float
* exponentiation `**`
* floor division `//`: returns the largest integer less than or equal to the result of division
* modulo `%`

The following identity always holds: `n == d * (n // d) + n % d`, where `n` is the numerator and `d` is the non-zero denominator.
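The identity can be checked for every sign combination with `divmod`. The helper `check_identity` is illustrative only.

```python
def check_identity(n, d):
    """Verify n == d * (n // d) + n % d for one pair of integers."""
    quotient, remainder = divmod(n, d)    # same as (n // d, n % d)
    return n == d * quotient + remainder


pairs = [(10, 3), (10, -3), (-10, 3), (-10, -3)]
print(all(check_identity(n, d) for n, d in pairs))
print(divmod(-10, 3))
```

Because `//` floors toward negative infinity, the remainder always takes the sign of the denominator.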

In [56]:
print('10 / 2 =', 10 / 2)
print('10 // 3 =', 10 // 3)
print('10 % 3 =', 10 % 3)
print('10 // -3 =', 10 // -3)
print('10 % -3 =', 10 % -3)
print('-10 // 3 =', -10 // 3)
print('-10 % 3 =', -10 % 3)
print('-10 // -3 =', -10 // -3)
print('-10 % -3 =', -10 % -3)

10 / 2 = 5.0
10 // 3 = 3
10 % 3 = 1
10 // -3 = -4
10 % -3 = -2
-10 // 3 = -4
-10 % 3 = 2
-10 // -3 = 3
-10 % -3 = -1


An integer number is an instance of the `int` class.

* Numerical data types are supported in the argument of the `int` constructor.
* Strings that can be parsed to a number are supported too.
  * The constructor takes an optional `base` parameter.
  * If `base` is not specified, the default is base 10.
  * The allowed bases are 0 and 2–36.
  * For base 0, the string is interpreted according to its prefix (`0b`, `0o`, `0x`, or none for decimal), just like an integer literal in code. Base 0 also disallows leading zeros: `int('010', 0)` is not legal, while `int('010')` and `int('010', 8)` are.

In [60]:
from decimal import Decimal

print(f'{int()=}')
print(f'{int(10)=}')
print(f'{int(10.1)=}')
print(f'{int(Decimal("10.1"))=}')
# print(f'{int("10.1")=}')
print(f'{int("10")=}')
print(f'{int("010")=}')
# print(f'{int("010", 0)=}')
print(f'{int("010", 2)=}')
print(f'{int("0b10", 0)=}')

int()=0
int(10)=10
int(10.1)=10
int(Decimal("10.1"))=10
int("10")=10
int("010")=10
int("010", 2)=2
int("0b10", 0)=2


Changing an integer from base 10 to another base:

* `bin`: `bin(10)` -> `'0b1010'`
* `oct`: `oct(10)` -> `'0o12'`
* `hex`: `hex(10)` -> `'0xa'`
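A short sketch of the built-in conversions and the way back with `int()`. The format specifiers shown are standard, and the variable `n` is illustrative.

```python
n = 10

print(bin(n), oct(n), hex(n))           # strings with a base prefix
print(format(n, 'b'), format(n, 'x'))   # same digits without the prefix
print(f'{n:#010b}')                     # prefix kept, zero-padded to 10 characters

# The strings convert back with int(); base 0 reads the prefix.
print(int(bin(n), 0), int('0o12', 0), int('a', 16))
```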

In [68]:
import string

def rebase(number: int, base: int):
    """Rebase base 10 integer number to the designated base"""
    # Validate the base first so that an invalid base is rejected even for 0.
    if base < 2 or base > 36:
        raise ValueError('Invalid base: 2 <= base <= 36')
    if number == 0:
        return '0'
    
    base_letters = string.digits + string.ascii_lowercase

    # if number < 0:
    #     sign = '-'
    #     number = abs(number)
    # else:
    #     sign = ''
    number, sign = (number, '') if number > 0 else (-number, '-')
    
    rebased = ''

    while number > 0:
        rebased = base_letters[number % base] + rebased
        number = number // base
        # number, mod = divmod(number, base)
        # rebased = base_letters[mod] + rebased

    return sign + rebased

print(f'{rebase(0, 2)=}')
print(f'{rebase(10, 2)=}')
print(f'{rebase(-10, 2)=}')
print(f'{rebase(9, 3)=}')
print(f'{rebase(-106, 16)=}')
print(f'{rebase(1485, 16)=}')
print(f'{rebase(100, 36)=}')
# print(f'{rebase(5, 1)=}')

rebase(0, 2)='0'
rebase(10, 2)='1010'
rebase(-10, 2)='-1010'
rebase(9, 3)='100'
rebase(-106, 16)='-6a'
rebase(1485, 16)='5cd'
rebase(100, 36)='2s'


### Rational numbers

In mathematics, a rational number is a number that can be expressed as the quotient or fraction of two integers, a numerator p and a non-zero denominator q.

The real numbers that are rational are those whose decimal expansion either terminates after a finite number of digits (example: 3/4 = 0.75), or eventually begins to repeat the same finite sequence of digits over and over (example: 9/44 = 0.20454545...). This statement is true not only in base 10, but also in every other integer base, such as the binary and hexadecimal ones.

Rational numbers can be represented in Python using the `Fraction` class in the `fractions` module.

In [77]:
from fractions import Fraction

print(f'{Fraction()=}')
print(f'{Fraction(1, 10)=}')
print(f'{Fraction(0.1)=}') # avoid floats: 0.1 has no exact base-2 representation, so its binary approximation is what gets converted
print(f'{Fraction("0.1")=}')
print('Fraction(0.1) == Fraction(1, 10):', Fraction(0.1) == Fraction(1, 10))
print('Fraction("0.1") == Fraction(1, 10):', Fraction('0.1') == Fraction(1, 10))
print(f'{Fraction("22/7")=}')

print(Fraction(1, 2) + Fraction(1, 3))

x = Fraction(7, 22)
print(f'{x.numerator=}')
print(f'{x.denominator=}')
x = Fraction(7, -22)
print(f'{x.numerator=}')
print(f'{x.denominator=}')

Fraction()=Fraction(0, 1)
Fraction(1, 10)=Fraction(1, 10)
Fraction(0.1)=Fraction(3602879701896397, 36028797018963968)
Fraction("0.1")=Fraction(1, 10)
Fraction(0.1) == Fraction(1, 10): False
Fraction("0.1") == Fraction(1, 10): True
Fraction("22/7")=Fraction(22, 7)
5/6
x.numerator=7
x.denominator=22
x.numerator=-7
x.denominator=22


Given a `Fraction` object, we can find an approximate equivalent fraction with a constrained denominator using the `limit_denominator(max_denominator=1000000)` instance method.

In [73]:
import math
from fractions import Fraction

x = Fraction(math.pi)
print(x)
print(x.limit_denominator())
print(x.limit_denominator(10))
print(x.limit_denominator(100))
print(x.limit_denominator(10000))

884279719003555/281474976710656
3126535/995207
22/7
311/99
355/113


### Floats

The `float` class is Python's default implementation for representing real numbers.

A float uses a fixed number of bytes (8 bytes, i.e. 64 bits, plus the object overhead), unlike integers, which take up more memory as they grow.

In [2]:
print('\nFloats may not have exact representations in computer:')
print('0.1:', f'{0.1:.25f}')
print('0.1 + 0.1 + 0.1 == 0.3:', 0.1 + 0.1 + 0.1 == 0.3)

print('\nBecause of the inexact internal representation, the result of round() may also be surprising.')
print(f'{round(1.25, 1)=}')
print(f'{round(1.225, 2)=}')


Floats may not have exact representations in computer:
0.1: 0.1000000000000000055511151
0.1 + 0.1 + 0.1 == 0.3: False

Because of the inexact internal representation, the result of round() may also be surprising.
round(1.25, 1)=1.2
round(1.225, 2)=1.23


As we have seen, some decimal numbers cannot be represented with a finite binary representation. This can lead to surprising results and bugs in our code if we don't know how to deal with it.

Rounding each operand does not necessarily solve the problem, although rounding the whole of both sides of the equality comparison can work.

More generally, use an appropriate range (ε) within which two numbers are deemed equal. The range should be the larger of the absolute and relative tolerances.

`math.isclose(a, b, *, rel_tol=1e-09, abs_tol=0.0)` returns `True` if the values `a` and `b` are close to each other and `False` otherwise. The result is: `abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)`.
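The formula can be written out by hand to see why an absolute tolerance matters near zero. `is_close` is a hypothetical re-implementation for illustration, not the library's source.

```python
import math


def is_close(a, b, rel_tol=1e-09, abs_tol=0.0):
    """Hand-written version of the documented formula."""
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)


x = 0.1 + 0.1 + 0.1
y = 0.3
print(math.isclose(x, y), is_close(x, y))

# Near zero a relative tolerance alone is too strict.
a = 1e-10
b = 0.0
print(math.isclose(a, b))                  # rel_tol * max(|a|, |b|) is tiny
print(math.isclose(a, b, abs_tol=1e-9))    # the absolute tolerance applies
```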

In [79]:
import sys
import math

x = 0.1 + 0.1 + 0.1
y = 0.3
epsilon = sys.float_info.epsilon

print(f'{x=}, {y=}')
print('x == y:', x == y)
print('x is close y:', math.isclose(x, y, rel_tol=epsilon, abs_tol=epsilon))

x=0.30000000000000004, y=0.3
x == y: False
x is close y: True


Coerce a float to an integer:

* Truncation: `math.trunc(float)`, `int(float)`
* Floor: `math.floor(float)`
* Ceiling: `math.ceil(float)`
* Rounding: `round(x, n=None)`
  * the `round` function will round the number `x` to the closest multiple of 10^-n, `n` can be negative
  * banker's rounding, a number is rounded to the nearest value, with ties rounded to the nearest value with an *even* least significant digit
  * `round(x)` -> `int`
  * `round(x, n)` -> same type as `x`

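Why banker's rounding? It is less biased than rounding ties away from zero: over many values, ties are rounded up about as often as they are rounded down, so the errors tend to cancel out.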
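The bias argument can be checked on a handful of ties. The list `ties` is sample data chosen so that every value is exactly representable in binary.

```python
ties = [0.5, 1.5, 2.5, 3.5, 4.5]

print([round(x) for x in ties])          # each tie goes to the even neighbour
print(sum(ties))                         # exact total
print(sum(round(x) for x in ties))       # close to the exact total

# Rounding every tie away from zero drifts upward instead.
print(sum(int(x + 0.5) for x in ties))

# n may be negative, and round(x, n) keeps the type of x.
print(round(1250, -2), round(1350, -2))
print(round(2.675, 0))                   # a float, not an int
```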

In [83]:
import math

def traditional_round(f: float):
    """Round away from zero"""
    return int(f + math.copysign(0.5, f))

f = -10.5
print(f'{f=}')
print(f'{round(f)=}')
print(f'{traditional_round(f)=}')

f = -10.6
print(f'{f=}')
print(f'{round(f)=}')
print(f'{traditional_round(f)=}')

f=-10.5
round(f)=-10
traditional_round(f)=-11
f=-10.6
round(f)=-11
traditional_round(f)=-11


### Decimals

The `decimal` module provides support for fast correctly rounded decimal floating point arithmetic. It offers several advantages over the float datatype:

* Decimal numbers can be represented exactly. In contrast, numbers like `1.1` and `2.2` do not have exact representations in binary floating point. End users typically would not expect `1.1 + 2.2` to display as `3.3000000000000003` as it does with binary floating point.
* The exactness carries over into arithmetic. In decimal floating point, `0.1 + 0.1 + 0.1 - 0.3` is exactly equal to zero. In binary floating point, the result is `5.5511151231257827e-017`. While near to zero, the differences prevent reliable equality testing and differences can *accumulate*. For this reason, decimal is preferred in accounting applications which have strict equality invariants.
* The decimal module incorporates a notion of significant places so that `1.30 + 1.20` is `2.50`. The trailing zero is kept to indicate significance. This is the customary presentation for monetary applications. For multiplication, the "schoolbook" approach uses all the figures in the multiplicands. For instance, `1.3 * 1.2` gives `1.56` while `1.30 * 1.20` gives `1.5600`.
* Unlike hardware based binary floating point, the decimal module has a user alterable precision (defaulting to 28 places) which can be as large as needed for a given problem.
* Both binary and decimal floating point are implemented in terms of published standards. While the built-in float type exposes only a modest portion of its capabilities, the decimal module exposes all required parts of the standard. When needed, the programmer has full control over rounding and signal handling. This includes an option to enforce exact arithmetic by using exceptions to block any inexact operations.

A decimal number is immutable. It has a sign, coefficient digits, and an exponent. To preserve significance, the coefficient digits do not truncate trailing zeros. Decimals also include special values such as `Infinity`, `-Infinity`, and `NaN`. The standard also differentiates `-0` from `+0`.

Decimals have a context that controls certain aspects of working with decimals.

* The context can be global, the default context
  * `decimal.getcontext()`
* The context can be temporary (local)
  * `decimal.localcontext(ctx=None)`
    * creates a new context, copied from `ctx` or from default if `ctx` is not specified
    * returns a context manager (use a `with` statement)

In [6]:
import decimal
from decimal import Decimal

ctx = decimal.getcontext()
print(ctx)
print(f'{ctx.prec=}')
print(f'{ctx.rounding=}')

with decimal.localcontext() as ctx:
    ctx.rounding = decimal.ROUND_HALF_UP
    print('local context:', ctx)
    print('getcontext:', decimal.getcontext()) # getcontext will actually get local context
    print(f'inside: {round(Decimal('1.25'), 1)=}')

print(f'outside: {round(Decimal('1.25'), 1)=}')

Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow])
ctx.prec=28
ctx.rounding='ROUND_HALF_EVEN'
local context: Context(prec=28, rounding=ROUND_HALF_UP, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow])
getcontext: Context(prec=28, rounding=ROUND_HALF_UP, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow])
inside: round(Decimal('1.25'), 1)=Decimal('1.3')
outside: round(Decimal('1.25'), 1)=Decimal('1.2')


In [7]:
from decimal import Decimal

print(f'{Decimal(1)=}')
print(f'{Decimal(3.14)=}')
print(f'{Decimal("3.14")=}')
print(f'{Decimal((1, (3, 1, 4), -2))=}')

Decimal(1)=Decimal('1')
Decimal(3.14)=Decimal('3.140000000000000124344978758017532527446746826171875')
Decimal("3.14")=Decimal('3.14')
Decimal((1, (3, 1, 4), -2))=Decimal('-3.14')


Context precision affects mathematical operations, but it does not affect the constructor.

In [10]:
import decimal
from decimal import Decimal

decimal.getcontext().prec = 2

a = Decimal('0.12345')
b = Decimal('0.12345')
c = a + b
print(f'{a=}, {b=}')
print('a + b =', c)
print(f'{c=}') # the actual value of c has been permanently affected by precision

a=Decimal('0.12345'), b=Decimal('0.12345')
a + b = 0.25
c=Decimal('0.25')


Some arithmetic operators don't behave the same way for Decimals as they do for floats or integers.

* `//`, `%`, and also `divmod()`
  * for integers, `//` performs floor division
  * for Decimals, `//` performs truncated division
  * the equation n = d * (n // d) + n % d still holds
* The `Decimal` class defines various mathematical operations, such as `sqrt`, `exp`, `ln`, `log10`, etc.
* We can still apply functions from the `math` module to Decimals, but the `Decimal` objects are first cast to floats, so we lose the whole precision mechanism that made us use `Decimal` objects in the first place.
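
The difference between floor and truncated division only shows up with negative operands. A minimal sketch, with `n` and `d` chosen arbitrarily for illustration:

```python
from decimal import Decimal

n, d = -7, 2

int_q, int_r = n // d, n % d                                       # floor division
dec_q, dec_r = Decimal(n) // Decimal(d), Decimal(n) % Decimal(d)   # truncated division

print(int_q, int_r)   # -4 1
print(dec_q, dec_r)   # -3 -1

# n = d * (n // d) + n % d holds in both cases
print(n == d * int_q + int_r, Decimal(n) == Decimal(d) * dec_q + dec_r)   # True True
```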

In [15]:
import math
import decimal
from decimal import Decimal

x = 0.01
x_dec = Decimal('0.01')

root = math.sqrt(x)
root_mixed = math.sqrt(x_dec)
root_dec = x_dec.sqrt()

print('root:', f'{root:.28f}')
print('root_mixed:', f'{root_mixed:.28f}')
print('root_dec:', f'{root_dec:.28f}')

print('\nquantize method:')
print(Decimal('7.325').quantize(Decimal('0.1'), rounding=decimal.ROUND_DOWN))
print(Decimal('7.325').quantize(Decimal('0.1'), rounding=decimal.ROUND_UP))
print(Decimal('7.325').quantize(Decimal('1'), rounding=decimal.ROUND_UP))
print(Decimal('7.325').quantize(Decimal('1.'), rounding=decimal.ROUND_UP))
print(Decimal('7.325').quantize(Decimal(1), rounding=decimal.ROUND_UP))


root: 0.1000000000000000055511151231
root_mixed: 0.1000000000000000055511151231
root_dec: 0.1000000000000000000000000000

quantize method:
7.3
7.4
8
8
8


There are some drawbacks to the `Decimal` class compared with the `float` class:

* not as easy to code: construction via strings or tuples
* not all mathematical functions that exist in the `math` module have a `Decimal` counterpart
* more memory overhead
* performance: much slower than floats

### Complex numbers

The `complex` class:

* Constructor: `complex(x, y)`
  * `x`: real part, `y`: imaginary part
  * `x` and `y` are stored as floats
* Literals: `x + yj`, or `x + yJ`
* Some instance properties and methods
  * `.real`: returns the real part
  * `.imag`: returns the imaginary part
  * `.conjugate()`: returns the complex conjugate
* The standard arithmetic operators (`+`, `-`, `*`, `/`, `**`) work as expected with complex numbers
  * real and complex numbers can be mixed
  * `//` and `%` operators are not supported
* `==` and `!=` operators are supported, but we have the same equality-testing problem as with floats, because both parts are stored as floats
* Comparison operators such as `<`, `>`, `>=`, and `<=` are not supported
* Functions in the `math` module will not work; use the `cmath` module instead

Rectangular to polar:

* `cmath.phase(x)`: returns the phase φ of the complex number `x`, φ ∈ [-π, π] measured counter-clockwise from the real axis
* `abs(x)`: returns the magnitude r of `x`
* `cmath.polar(x)`: returns the representation of `x` in polar coordinates. Returns a pair `(r, phi)` where `r` is the modulus of `x` and `phi` is the phase of `x`. `polar(x)` is equivalent to `(abs(x), phase(x))`.

Polar to rectangular:

* `cmath.rect(r, phi)`: returns a complex number (rectangular coordinates) equivalent to the complex number defined by `(r, phi)` in polar coordinates
* the conversion is not exact because of the inexact representation of floats and irrational numbers
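
A short round-trip sketch of the conversions above, using `1 + 1j` as an arbitrary sample value. Because the conversion is inexact, the result is compared with `cmath.isclose` rather than `==`.

```python
import cmath
import math

z = 1 + 1j

r, phi = cmath.polar(z)          # rectangular -> polar
back = cmath.rect(r, phi)        # polar -> rectangular

print(r, phi)
print(math.isclose(r, abs(z)), math.isclose(phi, cmath.phase(z)))
print(cmath.isclose(back, z))    # compare with a tolerance, not with ==
```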

In [25]:
# Euler's identity
# e^(iπ) + 1 = 0

import cmath
import math
import sys

epsilon = sys.float_info.epsilon

print('e^(iπ) + 1 = 0:', cmath.exp(1j * cmath.pi) + 1 == 0)
print('e^(iπ) + 1 is close 0:', cmath.isclose(cmath.exp(1j * cmath.pi) + 1, 0, rel_tol=epsilon, abs_tol=epsilon))

print(f'{cmath.rect(2 ** 0.5, cmath.pi / 4)=}')
print('math.pi is cmath.pi:', math.pi is cmath.pi)
print('math.pi == cmath.pi:', math.pi == cmath.pi)

e^(iπ) + 1 = 0: False
e^(iπ) + 1 is close 0: True
cmath.rect(2 ** 0.5, cmath.pi / 4)=(1.0000000000000002+1.0000000000000002j)
math.pi is cmath.pi: False
math.pi == cmath.pi: True


### Booleans

Python has a concrete `bool` class that is used to represent Boolean values.

* The `bool` class is a subclass of the `int` class.
* Booleans possess all the properties and methods of integers, and override a few of them, such as `__and__` and `__or__` (the `&` and `|` operators), so that they return Booleans.
* Two constants are defined: `True` and `False`.
* They are singleton objects of `bool`, so they always retain the same memory address throughout the lifetime of your application.



In [30]:
print('bool is subclass of int:', issubclass(bool, int))
print('True is instance of bool:', isinstance(True, bool))
print('False is instance of bool:', isinstance(False, bool))
print(f'{int(True)=}')
print(f'{int(False)=}')
print('True == 1:', True == 1)
print('False == 0:', False == 0)
print('True is 1:', True is 1)
print('True + True ==', True + True)
print('True > False:', True > False)
print('(1 == 2) == False:', (1 == 2) == False)
print('(1 == 2) == 0:', (1 == 2) == 0)
print('-True =', -True)

bool is subclass of int: True
True is instance of bool: True
False is instance of bool: True
int(True)=1
int(False)=0
True == 1: True
False == 0: True
True is 1: False
True + True == 2
True > False: True
(1 == 2) == False: True
(1 == 2) == 0: True
-True = -1



The Boolean constructor:

* The Boolean constructor `bool(x)` returns `True` when `x` is truthy, and `False` when `x` is falsy.
* Many classes contain a definition of how to cast instances of themselves to a Boolean.
  * This is called the truth value (truthiness) of an object.
  * The default truthiness of an object is `True`. Every object is truthy, except:
    * `None`
    * `False`
    * `0` in any numeric type (e.g. `0`, `0.0`, `0 + 0j`, ...)
    * empty sequences (e.g. list, tuple, string, ...)
    * empty mappings and sets (e.g. dictionary, set, ...)
    * instances of custom classes that implement a `__bool__` method that returns `False`, or, in the absence of `__bool__`, a `__len__` method that returns `0`
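
A minimal sketch of how custom classes take part in truth testing. The classes `Queue`, `Flag`, and `Plain` are hypothetical and exist only for this illustration.

```python
class Queue:
    # hypothetical container: no __bool__, so bool() falls back to __len__
    def __init__(self, *items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

class Flag:
    # hypothetical class: __bool__ decides the truth value
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on

class Plain:
    # neither method defined: instances are always truthy
    pass

results = [bool(Queue()), bool(Queue(1)), bool(Flag(False)), bool(Plain())]
print(results)   # [False, True, False, True]
```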

In [31]:
a = [1, 2, 3]

if a: # works the same way as the commented-out line below, but more concise
# if a is not None and len(a) > 0:
    print(a[0])
else:
    print('Nothing to see here...')

1


The Boolean operators:

* `not`
* `and`
* `or`

In terms of truth values, they obey the usual laws of Boolean algebra:

* Commutativity
  * A or B == B or A
  * A and B == B and A
* Distributivity
  * A and (B or C) == (A and B) or (A and C)
  * A or (B and C) == (A or B) and (A or C)
* Associativity
  * A or (B or C) == (A or B) or C == A or B or C
  * A and (B and C) == (A and B) and C == A and B and C
* De Morgan's theorem
  * not (A or B) == not A and not B
  * not (A and B) == not A or not B
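
These laws can be verified exhaustively, since each operand has only two possible truth values. A brute-force sketch using `itertools.product`:

```python
from itertools import product

values = (True, False)

de_morgan = all(
    (not (a or b)) == (not a and not b) and (not (a and b)) == (not a or not b)
    for a, b in product(values, repeat=2)
)
distributive = all(
    (a and (b or c)) == ((a and b) or (a and c))
    and (a or (b and c)) == ((a or b) and (a or c))
    for a, b, c in product(values, repeat=3)
)

print(de_morgan, distributive)   # True True
```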

Operator precedence from high to low:

* `()`
* `<`, `>`, `<=`, `>=`, `==`, `!=`, `in`, `is`, `is not`, `not in`
* `not`: `not` has a lower priority than non-Boolean operators, so `not a == b` is interpreted as `not (a == b)`, and `a == not b` is a syntax error
* `and`
* `or`

When in doubt, or to be absolutely sure, use parentheses. Also, use parentheses to make your code more human readable.

Short-circuit evaluation:

* `and`: returns the first falsy value or the last value, only evaluates the second operand if the first one is true
* `or`: returns the first truthy value or the last value, only evaluates the second operand if the first one is false

In [4]:
def get_first_letter(s):
    # if s:
    #     return s[0]
    # else:
    #     return ''
    return s and s[0] or ''

s1 = None
s2 = ''
s3 = 'abc'

print(f'{get_first_letter(s1)=}')
print(f'{get_first_letter(s2)=}')
print(f'{get_first_letter(s3)=}')

get_first_letter(s1)=''
get_first_letter(s2)=''
get_first_letter(s3)='a'


### Comparison operators

Categories of operators

* General features
  * binary operators
  * evaluate to a `bool` value
* Identity operators
  * `is`, `is not`
  * compare memory addresses
* Value comparisons
  * `==`, `!=`
  * compare values
  * different types are OK, but must be compatible
* Ordering comparisons
  * `<`, `>`, `<=`, `>=`
  * don't work for all types
* Membership operators
  * `in`, `not in`
  * used with iterable types

Numeric types

* Value comparisons work with all numeric types.
* Mixed types are supported in value comparisons and, except for complex numbers, in ordering comparisons.
* Value equality operators work between floats and Decimals, but testing floats for equality is unreliable because of their inexact binary representation.

Chained comparisons

* `a == b == c`: `a == b and b == c`
* `a < b < c`: `a < b and b < c`
* `a < b > c`: `a < b and b > c`
* `a < b < c < d`: `a < b and b < c and c < d`
* `'A' < 'a' > 'Z' in string.ascii_letters`: `'A' < 'a' and 'a' > 'Z' and 'Z' in string.ascii_letters`
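
The expanded forms differ from the chained forms in one respect: in a chain, each middle operand is evaluated only once. A small sketch, where `middle` is a hypothetical helper that records every call:

```python
calls = []

def middle():
    # hypothetical helper that records each call
    calls.append('called')
    return 5

chained = 1 < middle() < 10                  # middle() runs once
expanded = 1 < middle() and middle() < 10    # middle() runs twice

print(chained, expanded, len(calls))         # True True 3
```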

In [5]:
from fractions import Fraction
from decimal import Decimal

print('1 == 1.0:', 1 == 1.0)
print('1.0 == Decimal("1.0"):', 1.0 == Decimal("1.0"))
print('1 == Fraction(3, 3):', 1 == Fraction(3, 3))
print('0.1 == Decimal("0.1"):', 0.1 == Decimal("0.1")) # floats representation issue
print('4 == 4 + 0j:', 4 == 4 + 0j)
print('3 < 2 < 4 / 0:', 3 < 2 < 4 / 0) # short-circuited

1 == 1.0: True
1.0 == Decimal("1.0"): True
1 == Fraction(3, 3): True
0.1 == Decimal("0.1"): False
4 == 4 + 0j: True
3 < 2 < 4 / 0: False


## Function parameters

### Arguments vs parameters

* `def my_func(a, b)`
  * in this context, `a` and `b` are called parameters of `my_func`
  * `a` and `b` are variables, local to `my_func`
* `my_func(x, y)`
  * `x` and `y` are called the arguments of `my_func`
  * `x` and `y` are passed by reference, i.e., the memory addresses of the objects that `x` and `y` refer to are passed; the objects themselves are not copied
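
A sketch of what this means in practice: mutating the object through the parameter is visible to the caller, while rebinding the parameter name is not. The function names are hypothetical.

```python
def mutate(lst):
    lst.append(4)        # changes the object the caller also refers to

def rebind(lst):
    lst = [0]            # rebinds the local name only
    return lst

def same_object(param, expected_id):
    return id(param) == expected_id

x = [1, 2, 3]
mutate(x)
rebind(x)

print(x)                        # [1, 2, 3, 4]
print(same_object(x, id(x)))    # True: the parameter refers to the same object
```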

### Positional and keyword arguments

* Positional arguments
  * most common way of assigning arguments to parameters: via the order in which they are passed, i.e., their position
  * a positional argument can be made *optional* by specifying a default value for the corresponding parameter
  * if a positional parameter is defined with a default value, every positional parameter after it must also be given a default value
* Keyword arguments (named arguments)
  * positional arguments can, *optionally*, be specified by using the parameter name whether or not the parameters have default values
  * the order of keyword arguments doesn't matter
  * once you use a named argument, all arguments thereafter must be named too
  * default arguments may still be omitted

In [1]:
def my_func(a, b=20, c=30):
    print(f'{a=}, {b=}, {c=}')

my_func(1, 2, 3)
my_func(1, 2)
my_func(b=20, a=10)
my_func(10, c=40)

a=1, b=2, c=3
a=1, b=2, c=30
a=10, b=20, c=30
a=10, b=20, c=40


### Unpacking iterables

A side note on tuples

* What defines a tuple in Python is not `()` but `,`
* The `()` are used to make the tuple clearer
* To create a tuple with a single element:
  * `(1)` will not work as intended -> `int`
  * `1,` or `(1,)` -> `tuple`
* The only exception is when creating an empty tuple: `()` or `tuple()`

Packed values

* Packed values refers to values that are bundled together in some way
* Tuples and lists are obvious
* A string is considered to be a packed value
* Sets and dictionaries are also packed values
* In fact, any `iterable` can be considered a packed value

Unpacking packed values

* Unpacking is the act of splitting packed values into individual variables contained in a list or tuple
  * `a, b, c = [1, 2, 3]`
  * 3 elements in `[1, 2, 3]` need 3 variables to unpack
  * `a, b, c` is actually a tuple of 3 variables
* The unpacking into individual variables is based on the relative positions of each element
  * This is how positional arguments are assigned to parameters in functions

Swapping values of two variables

* No temporary variable involved
* `a, b = b, a`
* This works because in Python, the entire right hand side is evaluated first and completely, then assignments are made to the left hand side, i.e., `(a, b) = (b, a)`
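
A minimal example of the swap, plus the same idea applied to three variables:

```python
a, b = 1, 2
a, b = b, a            # the tuple (b, a) is built first, then unpacked
print(a, b)            # 2 1

# the same idea rotates three values
x, y, z = 'x', 'y', 'z'
x, y, z = y, z, x
print(x, y, z)         # y z x
```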

In [6]:
# parentheses are not necessary when defining a tuple, except for empty tuples
a = (1, 2)
print(type(a))
a = 1, 2
print(type(a))
a = (1) # int
print(type(a))
a = (1,)
print(type(a))
a = 1,
print(type(a))
a = ()
print(type(a))

a, b, c = 'str'
print(f'{a=}, {b=}, {c=}')

<class 'tuple'>
<class 'tuple'>
<class 'int'>
<class 'tuple'>
<class 'tuple'>
<class 'tuple'>
a='s', b='t', c='r'


Unpacking sets and dictionaries

* When unpacking dictionaries, we are actually unpacking the keys.
* Sets are unordered. They can be iterated, but there is no guarantee that the order of the results will match your expectation.
* Dictionaries used to be unordered too. Starting from Python 3.7, dictionary order is guaranteed to be insertion order. Order is maintained while iterating through the dictionary and while converting a dictionary to some other data type.
* Sets are like dictionaries, except they don't have values, only keys.

In [5]:
s1 = set('python')
s2 = {'p', 'y', 't', 'h', 'o', 'n'}
d = {'p': 1, 'y': 2, 't': 3, 'h': 4, 'o': 5, 'n': 6}

def print_els(container):
    for c in container:
        print(c, end='')
    print()

print_els(s1)
print_els(s2)
print_els(d)

nhotpy
otnhyp
python


### Extended unpacking

The use case for `*`

* We don't always want to unpack every single item in an iterable. We may, for example, want to unpack the first value, and then unpack the remaining values into another variable.
  * we can achieve this using slicing
    * `a = my_list[0]`, `b = my_list[1:]`
  * or, using simple unpacking
    * `a, b = my_list[0], my_list[1:]`
  * we can also use the `*` operator
    * `a, *b = my_list`
    * `b` will be of `list` type
    * apart from cleaner syntax, `*` also works with any iterable, not just sequence types
    * `*` still works with sets and dictionary keys, but unpacking this way may have no guarantee of preserving the order in which the elements were created or added
    * the `*` operator can only be used once on the same level of unpacking in the left hand side of an unpacking assignment
* The `*` operator can also be used to merge iterables into one in the right hand side of an expression.
  * `my_list = [*list1, *list2]`
  * this is useful in a situation where you might want to create a single collection containing all the items of multiple sets, or all the keys of multiple dictionaries

The `**` unpacking operator

* The `**` operator is used for dictionary unpacking, also known as keyword argument unpacking.
* It allows you to use a dictionary to supply keyword arguments when calling a function.
* It can also be used to merge multiple dictionaries into one. If there are common keys, the newer value will overwrite the older one under the same key.
* The `**` operator cannot be used in the left hand side of an assignment.
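
A short sketch of `**` supplying keyword arguments in a call. The function `describe` and the dictionaries are hypothetical names used only for illustration.

```python
def describe(name, *, unit='unit', quantity=1):
    # hypothetical function used only for illustration
    return f'{name}: {quantity} {unit}'

options = {'unit': 'kilo', 'quantity': 2}
line = describe('apple', **options)      # same as describe('apple', unit='kilo', quantity=2)
print(line)                              # apple: 2 kilo

defaults = {'unit': 'unit', 'quantity': 1}
merged = {**defaults, 'quantity': 3}     # later keys overwrite earlier ones
print(merged)                            # {'unit': 'unit', 'quantity': 3}
```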

Nested unpacking

* Python supports nested unpacking as well.
  * `a, b, (c, d) = [1, 2, [3, 4]]`
  * `a, *b, (c, d, e) = [1, 2, 3, 'xyz']`

In [8]:
my_list = [1, 2, 3, (4, 5, 6)]
a, b, *c = my_list
print(f'{a=}, {b=}, {c=}')
a, *b, (c, d, e) = my_list # nested unpacking
print(f'{a=}, {b=}, {c=}, {d=}, {e=}')
a, *b, (c, *d) = my_list # multiple * operators on different levels of unpacking
print(f'{a=}, {b=}, {c=}, {d=}')

my_string = 'uvwxyz'
a, *b, c, d = my_string
print(f'{a=}, {b=}, {c=}, {d=}')

my_set = {1, 2, 3}
a, *b = my_set
print(f'{a=}, {b=}')
*l, = my_set # convert my_set to a list
print(l)

l1 = [1, 2, 3]
l2 = [4, 5, 6]
s = 'abc'
l = [*l1, *l2, *s]
print(l)

d1 = {'a': 1, 'b': 2, 'c': 5}
d2 = {'c': 3, 'd': 4}
l = [*d1, *d2] # get all the keys of both dictionaries
d = {**d1, **d2} # merge dictionaries into one
print(l)
print(d)

a=1, b=2, c=[3, (4, 5, 6)]
a=1, b=[2, 3], c=4, d=5, e=6
a=1, b=[2, 3], c=4, d=[5, 6]
a='u', b=['v', 'w', 'x'], c='y', d='z'
a=1, b=[2, 3]
[1, 2, 3]
[1, 2, 3, 4, 5, 6, 'a', 'b', 'c']
['a', 'b', 'c', 'c', 'd']
{'a': 1, 'b': 2, 'c': 3, 'd': 4}


### *args

* `*args` allows a function to accept a variable number of non-keyword arguments.
* Use it in a function definition to handle an arbitrary number of positional arguments.
* It is customary (but not required) to name it `*args`.
* `args` will end up a tuple, not a list, unlike what we have when doing iterable unpacking.
* `*args` exhausts positional arguments. You cannot add more positional arguments after `*args`.

Unpacking arguments: we can unpack a collection when passing it to a function, just like iterable unpacking.

In [9]:
def average(number, *args):
    return (number + sum(args)) / (len(args) + 1)

def my_func(a, *b):
    print(f'{a=}, {b=}')

print(average(1, 2, 3))

l = [1, 2, 3]
my_func(*l)

2.0
a=1, b=(2, 3)


### Keyword arguments

Mandatory keyword arguments

* We can make keyword arguments mandatory.
* To do so, we create parameters after the positional parameters have been exhausted.
  * `def func(a, b, *args, d)`
  * `args` effectively exhausted all positional arguments and `d` must be passed as a keyword argument
* We can force no positional arguments at all
  * `def func(*, d)`
  * `*` indicates the end of positional arguments
* `def func(a, b=1, *args, d, e=True)`
  * `a`: mandatory positional argument, may be specified using a named argument
  * `b`: optional positional argument, may be specified positionally, as a named argument, or not at all, defaults to `1`
  * `args`: catch-all for any optional additional positional arguments
  * `d`: mandatory keyword argument
  * `e`: optional keyword argument, defaults to `True`
* `def func(a, b=1, *, d, e=True)`
  * `*`: no additional positional arguments allowed
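
The last signature can be exercised as follows. `raises_type_error` is a hypothetical helper that reports whether a call is rejected.

```python
def func(a, b=1, *, d, e=True):
    return a, b, d, e

def raises_type_error(call):
    # hypothetical helper: True if calling `call` raises TypeError
    try:
        call()
    except TypeError:
        return True
    return False

print(func(10, d=4))                                    # (10, 1, 4, True)
print(func(10, 20, d=4, e=False))                       # (10, 20, 4, False)
print(raises_type_error(lambda: func(10, 20, 30, d=4))) # True: extra positional argument
print(raises_type_error(lambda: func(10)))              # True: d is mandatory
```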

### **kwargs

* `*args` is used to scoop up a variable amount of remaining positional arguments.
  * The parameter name `args` is arbitrary.
  * `*` is the real performer here.
* `**kwargs` is used to scoop up a variable amount of remaining keyword arguments.
  * The parameter name `kwargs` is arbitrary.
  * `**` is the real performer here.
* `**kwargs` can be specified even if the positional arguments have not been exhausted, unlike keyword-only arguments.
* No parameter can come after `**kwargs`.
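
A minimal sketch of how the remaining keyword arguments are collected into a dictionary. The function and argument names are arbitrary.

```python
def func(a, *args, kw1='x', **kwargs):
    return a, args, kw1, kwargs

print(func(1, 2, 3, kw1='y', color='red', size=2))
# (1, (2, 3), 'y', {'color': 'red', 'size': 2})

def no_star(a, b=2, **kwargs):
    # no * or *args is needed before **kwargs
    return a, b, kwargs

print(no_star(1, c=3))   # (1, 2, {'c': 3})
```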

### Putting all types of parameters together

* positional arguments
  * may have default value
  * non-defaulted parameters are mandatory args
  * user may specify them using keywords
  * `*args` collects and exhausts remaining positional arguments
  * `*` indicates the end of positional arguments
* keyword-only arguments
  * after positional arguments have been exhausted, i.e., `*` or `*args` must be used
  * may have default value
  * non-defaulted parameters are mandatory args
  * user must specify them using keywords
  * `**kwargs` collects any remaining keyword arguments, does not require the use of `*` or `*args`

Examples:

* `def func(a, b=10)`
* `def func(a, b, *args)`
* `def func(a, b, *args, kw1, kw2=100)`
* `def func(a, b=10, *args, kw1, kw2=100)`: the default value for `b` is not very useful, because if there are additional positional values, you can't assign them to `args` without overwriting the default value of `b`
* `def func(a, b=10, *, kw1, kw2=100)`
* `def func(a, b, *args, kw1, kw2=100, **kwargs)`
* `def func(a, b=10, *, kw1, kw2=100, **kwargs)`
* `def func(*args)`
* `def func(**kwargs)`
* `def func(*args, **kwargs)`: any named arguments passed to the `func` will be collected by `kwargs`

Typical use case

* Python's `print()` function
  * `print(*objects, sep=' ', end='\n', file=None, flush=False)`
* Often, keyword-only arguments are used to modify the default behavior of a function such as the `print()` function we saw.
* Other times, keyword-only arguments might be used to make things clearer.

In [14]:
def calc_hi_lo_avg(*args, log_to_console=False):
    # hi = int(bool(args)) and max(args) # int(bool()) here can be changed to len(), because args will always be a tuple, not None
    hi = len(args) and max(args)
    lo = len(args) and min(args)
    avg = (hi + lo) / 2
    if log_to_console:
        print(f'{hi=}, {lo=}, {avg=}')
    return hi, lo, avg

print(calc_hi_lo_avg())
print(calc_hi_lo_avg(1, 2, 3, 4, 5))
print(calc_hi_lo_avg(1, 2, 3, 4, 5, log_to_console=True))

(0, 0, 0.0)
(5, 1, 3.0)
hi=5, lo=1, avg=3.0
(5, 1, 3.0)


In [16]:
import time

def time_it(fn, *args, rep=1, **kwargs):
    start = time.perf_counter()
    for _ in range(rep):
        fn(*args, **kwargs)
    end = time.perf_counter()
    return (end - start) / rep

def compute_powers_1(n, *, start=1, end):
    result = []
    for i in range(start, end):
        result.append(n ** i)
    return result

def compute_powers_2(n, *, start=1, end):
    # using list comprehension
    return [n ** i for i in range(start, end)]

def compute_powers_3(n, *, start=1, end):
    # using generator expression: returns a lazy generator, so no powers are computed until it is iterated
    return (n ** i for i in range(start, end))

print(f'{time_it(compute_powers_1, 2, rep=5, end=20000)=}')
print(f'{time_it(compute_powers_2, 2, rep=5, end=20000)=}')
print(f'{time_it(compute_powers_3, 2, rep=5, end=20000)=}')

time_it(compute_powers_1, 2, rep=5, end=20000)=0.573390859994106
time_it(compute_powers_2, 2, rep=5, end=20000)=0.5475154799991288
time_it(compute_powers_3, 2, rep=5, end=20000)=2.3799948394298553e-06


### Default values

What happens at run-time

* When a module is loaded, all code is executed immediately
  * `def func(a=10)`
    * the function object is created, and `func` references it
    * the integer object `10` is evaluated/created and is assigned as the default for `a`
  * `func()`
    * the function is executed
    * by the time this happens, the default value for `a` has already been evaluated and assigned -- it is not re-evaluated when the function is called
* In general, always beware of using a *mutable* object (or the result of calling a *callable*) as an argument default, because the default is evaluated only once, when the function is defined.

In [18]:
import datetime

# buggy default value for dt
def log_1(msg, *, dt=datetime.datetime.now(datetime.UTC)):
    print(f'{dt}: {msg}')

def log_2(msg, *, dt=None):
    dt = dt or datetime.datetime.now(datetime.UTC)
    print(f'{dt}: {msg}')

log_1('msg_1')
log_1('msg_2')
print('-'*10)
log_2('msg_1')
log_2('msg_2')

2023-12-30 05:11:41.142625+00:00: msg_1
2023-12-30 05:11:41.142625+00:00: msg_2
----------
2023-12-30 05:11:41.142625+00:00: msg_1
2023-12-30 05:11:41.143587+00:00: msg_2


In [19]:
my_list = [1, 2]
def func(a=my_list):
    print(a)

func()
my_list.append(3)
func() # the default value of a was changed unexpectedly

[1, 2]
[1, 2, 3]


In [21]:
# another example of buggy default value
def add_to_list(name, quantity=1, unit='unit', lst=[]):
    lst.append(f'{name}: {quantity} {unit}')
    return lst

shopping_list_1 = add_to_list('apple', 2, 'kilo')
add_to_list('milk', 1, 'liter', shopping_list_1)
print(shopping_list_1)

shopping_list_2 = add_to_list('chicken', 2, 'kilo')
add_to_list('eggs', 2, 'dozen', shopping_list_2)
print(shopping_list_2)

# the default value for lst was evaluated once, when the function was defined, so every call that omits lst shares the same list object
print('shopping_list_1 is shopping_list_2:', shopping_list_1 is shopping_list_2)

['apple: 2 kilo', 'milk: 1 liter']
['apple: 2 kilo', 'milk: 1 liter', 'chicken: 2 kilo', 'eggs: 2 dozen']
shopping_list_1 is shopping_list_2: True
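
A common remedy for the shared list, sketched here rather than taken from the original notebook, is to default to `None` and create the list inside the function. The check uses `is None` so that an empty list passed by the caller is still used.

```python
def add_to_list(name, quantity=1, unit='unit', lst=None):
    if lst is None:          # create a fresh list on every call that omits lst
        lst = []
    lst.append(f'{name}: {quantity} {unit}')
    return lst

shopping_list_1 = add_to_list('apple', 2, 'kilo')
add_to_list('milk', 1, 'liter', shopping_list_1)

shopping_list_2 = add_to_list('chicken', 2, 'kilo')
add_to_list('eggs', 2, 'dozen', shopping_list_2)

print(shopping_list_1)                      # ['apple: 2 kilo', 'milk: 1 liter']
print(shopping_list_2)                      # ['chicken: 2 kilo', 'eggs: 2 dozen']
print(shopping_list_1 is shopping_list_2)   # False
```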


In [24]:
def factorial(n: int):
    if n < 1:
        return 1
    else:
        print(f'calculating {n}!')
        return n * factorial(n - 1)
    
print(factorial(3))
print(factorial(3))

calculating 3!
calculating 2!
calculating 1!
6
calculating 3!
calculating 2!
calculating 1!
6


In [28]:
# cached version
# the same default dictionary will be used by the function
def factorial_cached(n, cache={}):
    if n < 1:
        return 1
    elif n in cache:
        return cache[n]
    else:
        print(f'calculating {n}!')
        result = n * factorial_cached(n - 1)
        cache[n] = result
        return result
    
print(factorial_cached(3))
print(factorial_cached(3)) # with cache no need to calculate the same number
print(factorial_cached(2))
print(factorial_cached(4))

calculating 3!
calculating 2!
calculating 1!
6
6
2
calculating 4!
24


## First-class functions

First-class objects

* can be passed to a function as an argument
* can be returned from a function
* can be assigned to a variable
* can be stored in a data structure, such as list, tuple, dictionary, etc

Functions are also first-class objects.

Higher-order functions are functions that:

* take a function as an argument
* and/or return a function
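
A small sketch that touches each point above. `make_multiplier` and `apply_twice` are hypothetical names.

```python
def make_multiplier(factor):
    # returns a function, so make_multiplier is a higher-order function
    def multiply(x):
        return x * factor
    return multiply

def apply_twice(func, x):
    # takes a function as an argument
    return func(func(x))

double = make_multiplier(2)                        # assigned to a variable
operations = {'double': double, 'length': len}     # stored in a data structure

print(apply_twice(double, 5))          # 20
print(operations['length']('abc'))     # 3
```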

### Docstrings and annotations

Docstrings

* We can document our functions (and modules, classes, etc) using docstrings
* If the first line of code in the function body is a string (not an assignment, just a string by itself), it will be interpreted as a docstring.
* Multiline docstrings are achieved using multiline strings with triple delimiters.
* The docstrings are stored in the function's `__doc__` property.

Function Annotations

* Function annotations give us an additional way to document our functions.
  * `def my_func(a: <expression>, b: <expression>) -> <expression>:`
  * `def my_func(a: str = 'xyz', *args: 'additional params', b: int = 1, **kwargs: 'additional keyword only params') -> str:`
* Annotations can be any expression.
* Annotations are stored in the `__annotations__` property of the function.
  * the property is a dictionary
  * keys are the parameter names
  * for a return annotation, the key is `'return'`
  * values are the annotations

Docstrings and annotations are mainly used by external tools and modules.

Docstrings and annotations are entirely optional, and do not enforce anything in our Python code.
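
A quick sketch showing that annotations are not checked at run time. The function `add` is a made-up example.

```python
def add(a: int, b: int) -> int:
    """Add two values."""
    return a + b

print(add(1, 2))             # 3
print(add('a', 'b'))         # ab (the int annotations are not checked)
print(add.__annotations__)
```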

In [31]:
def my_func(a: 'annotation for a',
            b: 'annotation for b' = 1) -> 'something annotation for return':
    """documentation for my function"""
    return a * b

help(my_func)
print(my_func.__doc__)
print(my_func.__annotations__)

Help on function my_func in module __main__:

my_func(a: 'annotation for a', b: 'annotation for b' = 1) -> 'something annotation for return'
    documentation for my function

documentation for my function
{'a': 'annotation for a', 'b': 'annotation for b', 'return': 'something annotation for return'}


In [32]:
def my_func(a: str,
            b: 'int > 0' = 1,
            *args: 'additional args',
            k1: 'keyword-only arg 1',
            k2: 'keyword-only arg 2' = 100,
            **kwargs: 'additional keyword-only args') -> 'return something':
    print(a, b, args, k1, k2, kwargs)

print(my_func.__annotations__)
my_func('hello', 1, 2, 3, k1='world', k3='times')

{'a': <class 'str'>, 'b': 'int > 0', 'args': 'additional args', 'k1': 'keyword-only arg 1', 'k2': 'keyword-only arg 2', 'kwargs': 'additional keyword-only args', 'return': 'return something'}
hello 1 (2, 3) world 100 {'k3': 'times'}


### Lambda expressions

Lambda expressions are simply another way to create functions (anonymous functions).

* `lambda [parameter list]: expression`
  * `lambda`: keyword
  * `[parameter list]`: optional
  * `:`: required
  * `expression`: evaluated and returned when the lambda function is called
* The whole lambda expression returns a function object that evaluates and returns the `expression` when it is called.
* It can be assigned to a variable.
* It can be passed as an argument to another function.
* It is a function, just like one created with `def` but without a name.

Limitations of lambda expressions

* The body of a lambda is limited to a single expression.
* No assignments.
* No annotations, but default values still work.
* Single logical line of code. Line continuation still works, but in general you shouldn't have to split a lambda over multiple lines, because lambdas are meant to be simple functions.

In [36]:
f = lambda x, y, *args, z='z', **kwargs: (x, y, args, z, kwargs) # the () of return expression is required here
f('x', 'y', 1, 2, 3, kw1='1', kw2='2')

('x', 'y', (1, 2, 3), 'z', {'kw1': '1', 'kw2': '2'})

In [37]:
def apply_func(x, func):
    return func(x)

print(apply_func(2, lambda x: x ** 2))
print(apply_func(3, lambda x: x ** 3))

4
27


In [38]:
def apply_func(func, *args, **kwargs):
    return func(*args, **kwargs)

print(apply_func(lambda x, y: x + y, 1, 2))
print(apply_func(lambda x, *, y: x + y, 1, y=2))
print(apply_func(lambda *args: sum(args), 1, 2, 3, 4, 5))

3
3
15


### Lambda and sorting

There is a `sorted()` built-in function that builds a new sorted list from an iterable.

* `sorted(iterable, /, *, key=None, reverse=False)`
  * In Python function signatures, the `/` is used to indicate that all the parameters to the left of it must be specified positionally (i.e., without using keyword arguments). This syntax is known as "positional-only" arguments.
  * `key` specifies a function of one argument that is used to extract a comparison key from each element in iterable (for example, `key=str.lower`). The default value is `None` (compare the elements directly).
  * The built-in `sorted()` function is guaranteed to be *stable*. A sort is stable if it guarantees not to change the relative order of elements that compare equal — this is helpful for sorting in multiple passes (for example, sort by department, then by salary grade).
  * The sort algorithm uses only `<` comparisons between items. While defining an `__lt__()` method will suffice for sorting, PEP 8 recommends that *all six rich comparisons* be implemented. This will help avoid bugs when using the same data with other ordering tools such as `max()` that rely on a different underlying method. Implementing all six comparisons also helps avoid confusion for mixed type comparisons, which can call the reflected `__gt__()` method.
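
As a rough illustration of the stability guarantee, the sketch below sorts in two passes: first by the secondary key (salary grade), then by the primary key (department). The `employees` records and their field layout are made up for this example.

```python
# hypothetical (name, department, salary_grade) records, made up for illustration
employees = [
    ('Ann', 'sales', 3),
    ('Bob', 'dev', 2),
    ('Cid', 'sales', 1),
    ('Dee', 'dev', 1),
]

# pass 1: sort by the secondary key (salary grade)
by_grade = sorted(employees, key=lambda e: e[2])

# pass 2: sort by the primary key (department); because the sort is stable,
# records within the same department keep their salary-grade order
by_dept_then_grade = sorted(by_grade, key=lambda e: e[1])

print(by_dept_then_grade)
# [('Dee', 'dev', 1), ('Bob', 'dev', 2), ('Cid', 'sales', 1), ('Ann', 'sales', 3)]
```

The same result can be obtained in a single pass with a tuple key such as `key=lambda e: (e[1], e[2])`. The multi-pass form is mainly useful when the passes need different `reverse` settings.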

In [39]:
complex_list = [3 + 3j, 1 - 1j, 6 + 2j, 0, -1 -4j]
# sort complex_list according to the square of the absolute value of a complex number
print(sorted(complex_list, key=lambda c: c.real ** 2 + c.imag ** 2))

name_list = ['Tony', 'Chaplin', 'John', 'Helen', 'Max']
# sort name_list according to the last letter of a name
# stable sort, i.e., the relative order of elements that compare equal is preserved
print(sorted(name_list, key=lambda n: n[-1]))

[0, (1-1j), (-1-4j), (3+3j), (6+2j)]
['Chaplin', 'John', 'Helen', 'Max', 'Tony']


In [41]:
# randomize an iterable using sorted

import random

l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print(sorted(l, key=lambda e: random.random()))
print(sorted(l, key=lambda e: random.random()))
print(sorted(l, key=lambda e: random.random()))

[10, 5, 9, 2, 1, 7, 4, 6, 8, 3]
[3, 6, 7, 5, 9, 2, 10, 8, 4, 1]
[4, 6, 2, 5, 9, 8, 7, 3, 1, 10]


### Function introspection

Functions are first-class objects.

* They have attributes.
* We can attach our own attributes.
* `dir()` is a built-in function that, given an object as an argument, will return a list of valid *attributes* for that object.

Function attributes

* `__name__`: name of function
* `__defaults__`: tuple containing positional parameter defaults
* `__kwdefaults__`: dictionary containing keyword-only parameter defaults
* `__code__`: code object representing the byte-compiled executable Python code, or bytecode. It has the following special read-only attributes:
  * `co_name`: function name
  * `co_argcount`: total number of positional parameters (including positional-only parameters and parameters with default values) that the function has
  * `co_varnames`: a tuple containing the names of the local variables in the function (starting with the parameter names)
  * etc...
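
The attributes listed above can be inspected on any function object. The following is a minimal sketch; `my_func` and the `category` attribute are throwaway names chosen only for this illustration.

```python
def my_func(a, b=2, *args, k1, k2='x', **kwargs):
    """A throwaway function used only for introspection."""
    i = 10
    return a, b, i

# we can attach our own attributes to a function object
my_func.category = 'demo'

print(my_func.__name__)              # my_func
print(my_func.__defaults__)          # (2,)
print(my_func.__kwdefaults__)        # {'k2': 'x'}
print(my_func.__code__.co_name)      # my_func
print(my_func.__code__.co_argcount)  # 2 -> only a and b; *args and keyword-only parameters are not counted
print(my_func.__code__.co_varnames)  # parameter names first, then the other local names such as i
print('category' in dir(my_func))    # True
```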

The `inspect` module

* The `inspect` module provides several useful functions to help get information about live objects such as modules, classes, methods, functions, tracebacks, frame objects, and code objects.
* There are four main kinds of services provided by this module: type checking, getting source code, inspecting classes and functions, and examining the interpreter stack.
  * `ismodule(obj)`
  * `isclass(obj)`
  * `ismethod(obj)`: return `True` if the object is a bound method written in Python.
  * `isfunction(obj)`: return `True` if the object is a Python function, which includes functions created by a lambda expression.
  * `isroutine(obj)`: return `True` if the object is a user-defined or built-in function or method.
  * `getsource(obj)`: return the text of the source code for an object. The argument may be a module, class, method, function, traceback, frame, or code object. The source code is returned as a single string. An `OSError` is raised if the source code cannot be retrieved. A `TypeError` is raised if the object is a built-in module, class, or function.
  * `getmodule(obj)`: try to guess which module an object was defined in. Return `None` if the module cannot be determined.
  * `getcomments(obj)`: return in a single string any lines of comments immediately preceding the object’s source code (for a class, function, or method), or at the top of the Python source file (if the object is a module). If the object’s source code is unavailable, return `None`.
  * `signature(callable)`: return a `Signature` object for the given callable
* The difference between a function and a method
  * Function: an independent block of code that can be called from anywhere.
  * Method: similar to a function, but is tied to objects or classes and needs an object or class instance to be invoked.
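
A short sketch of a few of the `inspect` helpers listed above. The `Greeter` class is hypothetical and exists only for this example. `getsource()` is left out because it needs the source file to be available.

```python
import inspect

class Greeter:
    def greet(self, name, *, punctuation='!'):
        return f'hello {name}{punctuation}'

g = Greeter()

print(inspect.isclass(Greeter))           # True
print(inspect.isfunction(Greeter.greet))  # True: accessed on the class, it is a plain function
print(inspect.ismethod(g.greet))          # True: accessed on an instance, it is a bound method
print(inspect.isfunction(lambda x: x))    # True: lambdas are functions too
print(inspect.isroutine(len))             # True: built-in function
print(inspect.ismodule(inspect))          # True

# the signature of a bound method does not include self
sig = inspect.signature(g.greet)
print(sig)  # (name, *, punctuation='!')
for name, param in sig.parameters.items():
    print(name, param.kind.name, param.default)
```

This also illustrates the function/method distinction: the same `greet` is a function when looked up on the class and a bound method when looked up on an instance.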

### Callables

* A callable is any object that can be invoked using parentheses, with or without arguments.
* A callable is a function-like object, behaving like a function.
* To verify if an object is callable, we can use the built-in function `callable()`.
  * Return `True` if the object argument appears callable, `False` if not. 
  * If this returns `True`, it is still possible that a call fails, but if it is `False`, calling the object will never succeed. 
  * Note that classes are callable (calling a class returns a new instance); instances are callable if their class has a `__call__()` method.

Different types of callables

* Built-in functions
* Built-in methods
* user-defined functions: created using `def` or `lambda` expressions
* methods: functions bound to an object
* classes: calling a class returns a new instance
  * `MyClass(x, y, z)`
  * -> `__new__(cls, x, y, z)`: creates the new object
  * -> `__init__(self, x, y, z)`: initializes the new object; the object created by `__new__` is passed in as `self`
  * -> return the object (reference)
* class instances: if the class implements `__call__` method
* others
  * generator functions
  * coroutine functions
  * asynchronous generator functions

In [45]:
from decimal import Decimal

print(callable(print))
print(print()) # if something is callable, it always has a return value
print(callable(Decimal)) # classes are also callable

class MyClass:
    def __init__(self):
        print('classes are callable')
    
    def __call__(self):
        print('instances can be callable too')

my_class = MyClass()
my_class() # instances can be callable
print(callable(my_class))

True

None
True
classes are callable
instances can be callable too
True


### Higher order functions

* A function that takes a function as a parameter and/or returns a function as its return value.
* Higher order functions allow you to write more concise, flexible, and abstract code.
* Higher order functions are often used to create new functions from existing functions, abstract common patterns in code, and write more modular and reusable code.
* The `functools` module in Python provides support for higher order functions.
* Functions like `map()`, `filter()`, and `reduce()` are common higher order functions.

The `map` function

* `map(function, iterable, *iterables)`
* Return an *iterator* that applies function to every item of iterable, *yielding* the results.
* If additional iterables arguments are passed, `function` must take that many arguments and is applied to the items from all iterables in parallel.
* With multiple iterables, the iterator stops when the shortest iterable is exhausted.

The `filter` function

* `filter(function, iterable)`
* Construct an *iterator* from those elements of `iterable` for which `function` is true.
* `iterable` may be either a sequence, a container which supports iteration, or an iterator.
* If `function` is `None`, the identity function is assumed, that is, all elements of `iterable` that are false are removed.
* Note that `filter(function, iterable)` is equivalent to the generator expression `(item for item in iterable if function(item))` if function is not `None` and `(item for item in iterable if item)` if function is `None`.

The `zip` function (not a higher order function)

* `zip(*iterables, strict=False)`
* Iterate over several iterables in parallel, producing tuples with an item from each one.
* `zip()` returns an *iterator* of tuples, where the i-th tuple contains the i-th element from each of the argument iterables.
* Another way to think of `zip()` is that it turns rows into columns, and columns into rows. This is similar to transposing a matrix.
* The iterables passed to `zip()` may have different lengths.
  * By default, `zip()` stops when the shortest iterable is exhausted.
  * With the `strict=True` option, it raises a `ValueError` if one iterable is exhausted before the others.
  * Shorter iterables can be padded with a constant value to make all the iterables have the same length. This is done by `itertools.zip_longest()`.

List comprehension

* List comprehensions provide a concise way to create lists.
* Common applications are to make new lists where each element is the result of some operations applied to each member of another sequence or iterable, or to create a subsequence of those elements that satisfy a certain condition.
* A list comprehension consists of brackets containing an expression followed by a `for` clause, then zero or more `for` or `if` clauses. If the expression is a tuple, it must be parenthesized.
  * `[(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]`
  * `[x + y for x, y in zip([1,2,3], [3,1,4])]`
* The initial expression in a list comprehension can be any arbitrary expression, including another list comprehension.

List comprehension alternative to `map`

* `list(map(lambda x: x ** 2, range(5)))`
* `[x ** 2 for x in range(5)]`

List comprehension alternative to `filter`

* `list(filter(lambda x: x % 2 == 0, range(10)))`
* `[x for x in range(10) if x % 2 == 0]`

Combining `map` and `filter`

* `list(filter(lambda y: y < 25, map(lambda x: x ** 2, range(10))))`
* `[y for y in [x ** 2 for x in range(10)] if y < 25]`
* `[x ** 2 for x in range(10) if x ** 2 < 25]`

In [48]:
l1 = [1, 2, 3, 4, 5]
l2 = [7, 8, 9]
l3 = map(lambda x, y: x + y, l1, l2)
print(l3)
print(list(l3))

l4 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
l5 = filter(None, l4)
l6 = filter(lambda x: x % 2 == 0, l4)
print(l5)
print(l6)
print(list(l5))
print(list(l6))

<map object at 0x000001C23BDFB0D0>
[8, 10, 12]
<filter object at 0x000001C23BB99660>
<filter object at 0x000001C23BBF72B0>
[1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 2, 4, 6, 8]


In [50]:
def factorial(n):
    return 1 if n < 2 else n * factorial(n - 1)

result = map(factorial, range(5))

for i in result:
    print(i)

# won't print anything
# a map object is an iterator, and iterators can only be iterated over once
# attempting to iterate over an exhausted iterator will not produce any output
for i in result:
    print(i)

print(list(result)) # empty list too

1
1
2
6
24
[]


### Reducing functions in Python

* These are functions that recombine an iterable cumulatively, ending up with a single return value.
* Also called accumulators, aggregators, or folding functions.
* `functools.reduce(function, iterable[, initializer])`
  * Apply `function` of two arguments cumulatively to the items of `iterable`, from left to right, so as to reduce the iterable to a single value.
  * For example, `reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])` calculates `((((1+2)+3)+4)+5)`. The left argument, `x`, is the accumulated value and the right argument, `y`, is the update value from the iterable. If the optional `initializer` is present, it is placed before the items of the iterable in the calculation, and serves as a default when the `iterable` is empty. If initializer is not given and `iterable` contains only one item, the first item is returned.
* Built-in reducing functions
  * `min`
    * `min(iterable, *, key=None)`
    * `min(iterable, *, default, key=None)`
    * `min(arg1, arg2, *args, key=None)`
  * `max`
  * `sum(iterable, /, start=0)`
    * Sums `start` and the items of an iterable from left to right and returns the total. The iterable’s items are normally numbers, and the start value is *not* allowed to be a string.
    * The preferred, fast way to concatenate a sequence of strings is by calling `''.join(sequence)`. To add floating point values with extended precision, see `math.fsum()`. To concatenate a series of iterables, consider using `itertools.chain()`.
  * `any(iterable)`
    * Return `True` if any element of the iterable is true. If the iterable is empty, return `False`.
  * `all(iterable)`
    * Return `True` if all elements of the iterable are true (or if the iterable is empty).
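
The edge cases described above (initializer, empty iterables, `default`) are easy to check directly. A minimal sketch, where `add` is just a helper name chosen for this example:

```python
from functools import reduce

add = lambda x, y: x + y

print(reduce(add, [1, 2, 3, 4, 5]))      # 15
print(reduce(add, [], 0))                # 0: the initializer is returned for an empty iterable
print(reduce(add, [42]))                 # 42: a single item is returned as is

# without an initializer, an empty iterable raises TypeError
try:
    reduce(add, [])
except TypeError as exc:
    print(f'TypeError: {exc}')

print(min([], default=None))             # None instead of a ValueError
print(min(['bb', 'a', 'ccc'], key=len))  # a
print(sum([1, 2, 3], 10))                # 16: start is added to the total
print(any([]), all([]))                  # False True
```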

In [51]:
def reduce_sequence(func, sequence):
    result = sequence[0]
    for e in sequence[1:]:
        result = func(result, e)
    return result

print(reduce_sequence(lambda x, y: x + y, range(10))) # range is actually an immutable sequence type
print(reduce_sequence(lambda x, y: x if x > y else y, range(10)))

45
9


In [56]:
from functools import reduce

print(reduce(lambda x, y: x + ' ' + y, ('Python', 'is', 'super', 'awesome')))

# using reduce to reproduce any()
print(reduce(lambda x, y: bool(x or y), (0, '', 1, 0.0, 'and', False)))
print(reduce(lambda x, y: bool(x or y), (0, '', None, 0.0, [], (), {}, False)))

# using reduce to reproduce all()
print(reduce(lambda x, y: bool(x and y), (0, '', 1, 0.0, 'and', False)))
print(reduce(lambda x, y: bool(x and y), (1, 'a', 0.1, [1], True)))

# using reduce to calculate factorial
def factorial(n):
    return reduce(lambda x, y: x * y, range(1, n + 1), 1)

print(factorial(1))
print(factorial(3))
print(factorial(10))

Python is super awesome
True
False
False
True
1
6
3628800


### Partial functions

* A partial function is a function created by fixing a certain number of arguments of another function.
* Partial functions allow us to freeze some portion of a function's arguments and generate a new function.
* The standard library module `functools` includes a function `partial` that can be used to create partial functions.
  * `functools.partial(func, /, *args, **keywords)`
  * Return a new partial object which when called will behave like `func` called with the positional arguments `args` and keyword arguments `keywords`. If more arguments are supplied to the call, they are appended to args.
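
A partial object keeps the wrapped function and the frozen arguments in its `func`, `args`, and `keywords` attributes. The sketch below uses a made-up `power` function to show them.

```python
from functools import partial

def power(base, exponent):
    return base ** exponent

# freeze a keyword argument
square = partial(power, exponent=2)
print(square.func is power)   # True
print(square.args)            # ()
print(square.keywords)        # {'exponent': 2}
print(square(5))              # 25
print(square(5, exponent=3))  # 125: keywords given at call time override the frozen ones

# freeze a positional argument
two_to_the = partial(power, 2)
print(two_to_the.args)        # (2,)
print(two_to_the(10))         # 1024: extra positional arguments are appended after the frozen ones
```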

In [65]:
# similar to functools.partial()
def partial_func(func, /, *args, **kwargs):
    def new_func(*new_args, **new_kwargs):
        new_kwargs = {**kwargs, **new_kwargs}
        return func(*args, *new_args, **new_kwargs)
    return new_func

int_base_2 = partial_func(int, base=2)
print(f"{int_base_2('1010')=}")

square = partial_func(pow, exp=2)
print(f'{square(10)=}')
print(f'{square(10, exp=3)}') # you can still override the value of exp predefined in the partial function

modular_multiplicative_inverse = partial_func(pow, exp=-1)
print(f'{modular_multiplicative_inverse(38, mod=97)=}') # here mod must be keyword argument

int_base_2('1010')=10
square(10)=100
1000
modular_multiplicative_inverse(38, mod=97)=23


In [68]:
from functools import partial

int_base_2 = partial(int, base=2)
print(int_base_2('1010'))

def my_func(a, b, *args, k1, k2, **kwargs):
    print(a, b, args, k1, k2, kwargs)

# freeze a and k1 of my_func
my_func_partial = partial(my_func, 'a', k1='k1')
my_func_partial('b', 'args1', 'args2', k2='k2', k3='k3', k4='k4')

# sort list of complex numbers according to their absolute value
sorted_by_abs = partial(sorted, key=lambda x: abs(x))
complexes = [1 + 2j, 3 - 4j, 0, 7 - 8j, 4 + 5j]
print(sorted_by_abs(complexes))

# sort list of points according to their distance to a certain point
origin = (0, 0)
points = [(0, 5), (0, 0), (3, 4), (-5, 1), (-3, -2)]
distance_square = lambda a, b: (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
points_sorted = sorted(points, key=partial(distance_square, origin))
print(points_sorted)

10
a b ('args1', 'args2') k1 k2 {'k3': 'k3', 'k4': 'k4'}
[0, (1+2j), (3-4j), (4+5j), (7-8j)]
[(0, 0), (-3, -2), (0, 5), (3, 4), (-5, 1)]


### The `operator` module

* The `operator` module exports a set of efficient functions corresponding to the intrinsic operators of Python. For example, `operator.add(x, y)` is equivalent to the expression `x+y`.
* These functions are handy in cases where callables must be stored, passed as arguments, or returned as function results.
* Many function names are those used for special methods, without the double underscores. For backward compatibility, many of these have a variant with the double underscores kept. The variants without the double underscores are preferred for clarity.
* The functions fall into categories that perform object comparisons, logical operations, mathematical operations and sequence operations.
  * Object comparison functions are useful for all objects and can return any value, which may or may not be interpretable as a Boolean value
    * `lt(a, b)`, `__lt__(a, b)`: equivalent to `a < b`
    * `le(a, b)`, `__le__(a, b)`: equivalent to `a <= b`
    * `eq(a, b)`, `__eq__(a, b)`: equivalent to `a == b`
    * `ne(a, b)`, `__ne__(a, b)`: equivalent to `a != b`
    * `ge(a, b)`, `__ge__(a, b)`: equivalent to `a >= b`
    * `gt(a, b)`, `__gt__(a, b)`: equivalent to `a > b`
  * Logical operations are generally applicable to all objects, and support truth tests, identity tests, and boolean operations
    * `not_(obj)`, `__not__(obj)`: return the outcome of `not obj`. Object instances have no `__not__()` method; only the interpreter core defines this operation, and the result is affected by the `__bool__()` and `__len__()` methods
    * `truth(obj)`: equivalent to using the `bool` constructor
    * `is_(a, b)`: return `a is b`
    * `is_not(a, b)`: return `a is not b`
  * Mathematical and bitwise operations are the most numerous
    * `abs(obj)`, `__abs__(obj)`: return the absolute value of `obj`
    * `add(a, b)`, `__add__(a, b)`: return `a + b`, for `a` and `b` numbers
    * `and_(a, b)`, `__and__(a, b)`: return the bitwise and of `a` and `b`
    * `floordiv(a, b)`, `__floordiv__(a, b)`: return `a // b`
    * `index(a)`, `__index__(a)`: return `a` converted to an integer, equivalent to `a.__index__()`, the result always has exact type `int`.
    * `inv(obj)`, `invert(obj)`, `__inv__(obj)`, `__invert__(obj)`: return the bitwise inverse of the number `obj`, equivalent to `~obj`
    * `lshift(a, b)`, `__lshift__(a, b)`: return `a` shifted left by `b`
    * `mod(a, b)`, `__mod__(a, b)`: return `a % b`
    * `mul(a, b)`, `__mul__(a, b)`: return `a * b`, for `a` and `b` numbers
    * `matmul(a, b)`, `__matmul__(a, b)`: return `a @ b`
    * `neg(obj)`, `__neg__(obj)`: return `-obj`
    * `or_(a, b)`, `__or__(a, b)`: return the bitwise or of `a` and `b`
    * `pos(obj)`, `__pos__(obj)`: return `+obj`
    * `pow(a, b)`, `__pow__(a, b)`: return `a ** b`, for `a` and `b` numbers
    * `rshift(a, b)`, `__rshift__(a, b)`: return `a` shifted right by `b`
    * `sub(a, b)`, `__sub__(a, b)`: return `a - b`
    * `truediv(a, b)`, `__truediv__(a, b)`: return `a / b`
    * `xor(a, b)`, `__xor__(a, b)`: return bitwise exclusive or of `a` and `b`
  * Operations which work with sequences, some of them with mappings too
    * `concat(a, b)`, `__concat__(a, b)`: return `a + b` for `a` and `b` sequences
    * `contains(a, b)`, `__contains__(a, b)`: return the outcome of the test `b in a`
    * `countOf(a, b)`: return the number of occurrences of `b` in `a`
    * `delitem(a, b)`, `__delitem__(a, b)`: remove the value of `a` at index `b`
    * `getitem(a, b)`, `__getitem__(a, b)`: return the value of `a` at index `b`
    * `indexOf(a, b)`: return the index of the first occurrence of `b` in `a`
    * `setitem(a, b, c)`, `__setitem__(a, b, c)`: set the value of `a` at index `b` to `c`
    * `length_hint(obj, default=0)`: return an estimated length for the object `obj`, first try to return its actual length, then an estimate using `object.__length_hint__()`, and finally return the default value
  * Operations that work with callables (added in Python 3.11)
    * `call(obj, /, *args, **kwargs)`, `__call__(obj, /, *args, **kwargs)`: return `obj(*args, **kwargs)`
  * Tools for generalized attribute and item lookups, useful for making fast field extractors as arguments for `map()`, `sorted()`, or other functions that expect a function argument
    * `attrgetter(attr)`, `attrgetter(*attr)`: return a callable object that fetches `attr` from its operand, if more than one attribute is requested, returns a tuple of attributes, the attribute names can also contain dots
    * `itemgetter(item)`, `itemgetter(*item)`: return a callable object that fetches `item` from its operand using operand's `__getitem__()` method
    * `methodcaller(name, /, *args, **kwargs)`: return a callable object that *calls* the method `name` on its operand, if additional arguments and/or keyword arguments are given, they will be given to the method as well
  * In-place operators
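
The in-place operators are only named in the list above. As a rough sketch: `a = operator.iadd(a, b)` is equivalent to `a += b`, so the result must be assigned back when the operand is immutable. The example also shows a dotted `attrgetter` name and a `methodcaller` with an extra argument; the `order` object is hypothetical.

```python
import operator
from types import SimpleNamespace

# mutable operand: the list is updated in place and the same object is returned
numbers = [1, 2]
same_list = operator.iadd(numbers, [3])
print(same_list is numbers, numbers)  # True [1, 2, 3]

# immutable operand: a new object is returned and the original is unchanged
text = 'ab'
new_text = operator.iadd(text, 'c')
print(new_text, text)                 # abc ab

# attrgetter accepts dotted names for nested attributes
order = SimpleNamespace(customer=SimpleNamespace(name='Ann'))  # hypothetical nested object
print(operator.attrgetter('customer.name')(order))             # Ann

# methodcaller forwards extra arguments to the method
print(operator.methodcaller('split', ',')('a,b,c'))            # ['a', 'b', 'c']
```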

In [73]:
import operator
from functools import reduce

print(reduce(lambda x, y: x * y, [1, 2, 3, 4]))
print(reduce(operator.mul, [1, 2, 3, 4]))

get_item_2 = operator.itemgetter(2)
print(get_item_2([1, 2, 3, 4]))
print(get_item_2('abcde'))

get_item_2_3 = operator.itemgetter(2, 3)
print(get_item_2_3([1, 2, 3, 4]))
print(get_item_2_3('abcde'))

class MyClass:
    def __init__(self):
        self.a = 'a'
        self.b = 'b'

    def test(self):
        print('test method get called')

obj = MyClass()
get_attr_a = operator.attrgetter('a')
print(get_attr_a(obj))

get_attr_test = operator.attrgetter('test')
get_attr_test(obj)()
method_caller_test = operator.methodcaller('test')
method_caller_test(obj)

complexes = [5 - 10j, 3 + 2j, -2 - 1j, 2 + 10j, 8 - 5j]
sorted_by_real = sorted(complexes, key=operator.attrgetter('real'))
print(sorted_by_real)

tuples = [(2, 3, 4), (-2, 3, 1), (0,), (5, 7, 1), (3, 4)]
sorted_by_first = sorted(tuples, key=operator.itemgetter(0))
# sorted_by_first = sorted(tuples, key=lambda x: x[0])
print(sorted_by_first)

24
24
3
c
(3, 4)
('c', 'd')
a
test method get called
test method get called
[(-2-1j), (2+10j), (3+2j), (5-10j), (8-5j)]
[(-2, 3, 1), (0,), (2, 3, 4), (3, 4), (5, 7, 1)]


## Scopes, closures and decorators

### Global and local scopes

Scopes and namespaces

* When an object is assigned to a variable, that variable points to some object, and we say that the variable (name) is bound to that object.
* That object can be accessed using that name in various parts of our code, but not just anywhere.
* That variable name and its binding only exist in specific parts of our code.
* The portion of code where that name/binding is defined, is called the lexical scope of the variable.
* These bindings are stored in namespaces.

The global scope

* The global scope is essentially the module scope.
* It spans a single file only.
* There is no concept of a truly global (across all the modules in our entire app) scope in Python.
* The only exceptions are some of the built-in globally available objects, such as:
  * `True`
  * `None`
  * `dict`
  * `print`
* The built-in and global variables can be used anywhere inside our module, including inside any function.
* Global scopes are nested inside the built-in scope.
* If you reference a variable name inside a scope and Python does not find it in that scope's namespace, it will look for it in an *enclosing* scope's namespace.

The local scope

* When we create functions, we can create variable names inside those functions (using assignment).
* Variables defined inside a function are not created until the function is called.
* Every time the function is called, a new scope is created.
* Variables defined inside the function are assigned to that scope.
* The actual object the variable references could be different each time the function is called, and this is why recursion works.
* When a function finishes running, the scope is gone too, and the reference count of each object that a local variable was bound to is decremented.

Nested scopes

* Scopes are often nested.
* Built-in scope -> module/global scope -> local scope
* When requesting the object bound to a variable name, Python will try to find the object bound to the variable
  * in current local scope first
  * works up the chain of enclosing scopes
* When assigning to the name of a global variable from inside a function
  * Python interprets the name as a local variable (at compile-time)
  * the local variable masks the global variable with the same name
* The `global` keyword
  * we can tell Python that a variable is meant to be scoped in the global scope by using the `global` keyword
* When Python encounters a function definition at compile-time
  * it will scan for any labels (variables) that have values *assigned* to them *anywhere in the function*, if the label has not been specified as `global`, it will be local
  * variables that are *referenced* but not assigned a value *anywhere in the function* will not be local, and Python will, at run-time, look for them in enclosing scopes
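
The compile-time rule above can be observed directly. In this sketch (all names are made up for the illustration), the same module-level `counter` is read, wrongly assigned, and correctly assigned with `global`.

```python
counter = 0

def read_only():
    # counter is only referenced, so it is looked up in the global scope at run-time
    return counter

def broken_increment():
    # counter is assigned below, so the compiler treats it as local everywhere in this function
    counter = counter + 1
    return counter

def working_increment():
    global counter  # explicitly scope the name to the module
    counter = counter + 1
    return counter

print(read_only())                                        # 0
print('counter' in broken_increment.__code__.co_varnames) # True: decided at compile-time

try:
    broken_increment()
except UnboundLocalError as exc:
    print(f'UnboundLocalError: {exc}')

print(working_increment())                                # 1
print(counter)                                            # 1
```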

In [3]:
def my_func():
    global var
    var = 'hello'

# print(var) # name 'var' is not defined
my_func() # because var is global, it is still available after the function finishes
print(var)

a = 10
def my_func():
    print(f'global a? {a=}') # calling my_func would raise UnboundLocalError here, because a is local
    a = 'hello' # this assignment makes a a local variable at compile-time

def print(string):
    return f'hello {string}'

print('world') # doesn't print anything, only returns a string, because the built-in print function is masked by the print function defined in the module scope
del print # remove the module-level name so that the built-in print is visible again
print('world')

hello
world


### Nonlocal scopes

Inner functions

* We can define functions from inside another function.
* Both functions have access to the global and built-in scopes as well as their respective local scope.
* But the inner function also has access to its enclosing scope -- the scope of the outer function.
* This enclosing scope is neither local (to the inner function) nor global, it is called a nonlocal scope.
* The `nonlocal` keyword
  * just as with global variables, we have to explicitly tell Python we are modifying a nonlocal variable
  * we can do that using the `nonlocal` keyword
* Whenever Python is told that a variable is `nonlocal`
  * it will look for it in the enclosing local scopes chain until it first encounters the specific variable name
  * it will look in local scopes, it will *not* look in the global scope
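
A minimal sketch of the difference between shadowing and rebinding an enclosing variable. The function names are chosen only for this illustration.

```python
def outer():
    x = 'outer'

    def shadow():
        x = 'shadow'   # creates a new local x; outer's x is untouched

    def rebind():
        nonlocal x     # x now refers to the variable in outer's scope
        x = 'rebound'

    shadow()
    after_shadow = x
    rebind()
    after_rebind = x
    return after_shadow, after_rebind

print(outer())  # ('outer', 'rebound')
```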

In [7]:
x = 'python'

def outer():
    global x
    x = 'monty'

    def inner():
        # nonlocal x # no binding for nonlocal 'x'
        x = 'hello'
        print(x)

    inner()
    print(x)

outer()

hello
monty


### Closures

Free variables and closures

* Functions defined inside another function can access the outer (nonlocal) variables.
* ```Python
    def outer():
        x = 'Python'
        def inner():
            print(f'{x} rock!')
        return inner
  ```
  * this nonlocal variable `x` is called a free variable
  * the `x` in `inner` is bound to the variable `x` in `outer`
  * this happens when `outer` runs, i.e., when `inner` is created
  * the function `inner` and the free variable `x` together are called a closure
  * when we return `inner`, we are actually returning the closure, not just the `inner` function
  * we can assign that return value to a variable name: `fn = outer()`
  * when we call `fn`, at that time Python will determine the value of `x` in the extended scope
  * but notice that `outer` has finished running before we call `fn`, so its local scope is already gone

Python cells and multi-scoped variables

* The value of `x` is shared between two scopes -- `outer` and closure.
* The label `x` is in two different scopes, but always references the same value.
* Python does this by creating a cell as an intermediary object.
* In effect, both variables `x` (in `outer` and `inner`) point to the same cell.
* When requesting the value of the variable, Python will double-hop to get to the final value.

Closures

* You can think of the closure as a function plus an extended scope that contains the free variables.
* The free variable's value is the object the cell points to.
* Every time the function in the closure is called and the free variable is referenced, Python looks up the cell object, and then whatever the cell is pointing to.
* `fn.__code__.co_freevars`: `('x',)`, only the names that `inner` references from the enclosing scope are free variables
* `fn.__closure__`: `(<cell at 0xA500: str object at 0xFF100>, )`
  * `None` or a tuple of cells that contain bindings for the function’s free variables.
  * A cell object has the attribute `cell_contents`. This can be used to get the value of the cell, as well as set the value.
* `hex(id(x))`: `0xFF100`, the indirect reference, the address of the object that contains the string `'Python'`.
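
The attributes above can be read on a live closure. This sketch reuses the `outer`/`inner` shape from earlier in this section; `plain` is a made-up function for contrast. Assigning to `cell_contents` assumes Python 3.7 or later.

```python
def outer():
    x = 'Python'

    def inner():
        return f'{x} rocks!'

    return inner

fn = outer()
print(fn.__code__.co_freevars)  # ('x',)

cell = fn.__closure__[0]
print(cell.cell_contents)       # Python
print(fn())                     # Python rocks!

# the closure reads x through the cell, so changing the cell changes the result
cell.cell_contents = 'Monty'
print(fn())                     # Monty rocks!

# a function that uses no free variables has no closure
def plain():
    return 1

print(plain.__closure__)        # None
```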

Multiple instances of closures

* Every time we run a function, a new scope is created.
* If that function generates a closure, a new closure is created every time as well.

Shared extended scopes

* ```Python
    def outer():
        count = 0
        def inc1():
            nonlocal count
            count += 1
            return count
        def inc2():
            nonlocal count
            count += 2
            return count
        return inc1, inc2
    f1, f2 = outer()
  ```
  * there are two different closures, but they have a shared free variable `count`, i.e. the same cell
* ```Python
    def create_adders():
        adders = []
        for n in range(1, 4):
            adders.append(lambda x: x + n)
        return adders
  ```
  * `n` is not defined in the lambda function, so it has to be from the outer scope, i.e., bound to the `n` we created in the loop
  * the `for` loop does not create a new scope on each iteration: the `n` in `create_adders` and the `n` in every lambda point to the same cell, and the cell points to whatever value `n` currently has
  * the loop creates a different lambda function and closure at each iteration, but the closures all share the same free variable `n`, i.e., the same cell
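
The practical consequence of the shared cell is that every adder sees the final value of `n`. A short sketch that makes this visible:

```python
def create_adders():
    adders = []
    for n in range(1, 4):
        # n is a free variable; it is looked up when the lambda is called, not when it is created
        adders.append(lambda x: x + n)
    return adders

adders = create_adders()

# n is 3 once the loop has finished, so every adder adds 3
print([add(10) for add in adders])                           # [13, 13, 13]

# all three closures share a single cell
print(adders[0].__closure__[0] is adders[1].__closure__[0])  # True
print(adders[0].__closure__[0].cell_contents)                # 3
```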

Nested closures

* ```Python
    def incrementer(n):
        def inner(start):
            current = start
            def inc():
                nonlocal current
                current += n
                return current
            return inc
        return inner
    inner_2 = incrementer(2)
    inc_2_100 = inner_2(100)
    inc_2_100() # 102
    inc_2_100() # 104
  ```

In [2]:
def outer():
    count = 0

    def inner1():
        nonlocal count
        count += 1
        return count
    
    def inner2():
        nonlocal count
        count += 1
        return count
    
    return inner1, inner2

f1, f2 = outer()
print(f'{f1.__code__.co_freevars=}')
print(f'{f2.__code__.co_freevars=}')
print(f'{f1.__closure__=}')
print(f'{f2.__closure__=}') # same cell as above
print(f1())
print(f2())


f1.__code__.co_freevars=('count',)
f2.__code__.co_freevars=('count',)
f1.__closure__=(<cell at 0x000001F65DDE0790: int object at 0x00007FF9D2703998>,)
f2.__closure__=(<cell at 0x000001F65DDE0790: int object at 0x00007FF9D2703998>,)
1
2


In [3]:
def create_adders():
    adders = []
    for n in range(1, 4):
        adders.append(lambda x, y=n: x + y) # defaults get evaluated at creation-time
    return adders

adders = create_adders()
print(f'{adders[0].__code__.co_freevars=}') # no free variables
print(f'{adders[0].__closure__=}') # not a closure, so no cells
# at function creation-time, n is evaluated and stored as the default value of y
# the body of the function does not mention n at all

print(f'{adders[0](10)=}')
print(f'{adders[1](10)=}')
print(f'{adders[2](10)=}')

adders[0].__code__.co_freevars=()
adders[0].__closure__=None
adders[0](10)=11
adders[1](10)=12
adders[2](10)=13


### Closure applications

In [4]:
class Averager:
    def __init__(self):
        self.numbers = []

    def add(self, number):
        self.numbers.append(number)
        return sum(self.numbers) / len(self.numbers)

def outer():
    numbers = []
    def add(number):
        numbers.append(number)
        return sum(numbers) / len(numbers)
    return add

averager = Averager()
print(averager.add(1))
print(averager.add(2))
print(averager.add(3))
print(averager.add(4))

calc_ave = outer()
print(calc_ave(1))
print(calc_ave(2))
print(calc_ave(3))
print(calc_ave(4))

1.0
1.5
2.0
2.5
1.0
1.5
2.0
2.5


In [5]:
def outer_better():
    count = 0
    total = 0
    def add(number):
        nonlocal count
        nonlocal total
        count += 1
        total += number
        return total / count
    return add

calc_ave = outer_better()
print(f'{calc_ave.__code__.co_freevars=}')
print(f'{calc_ave.__closure__=}')
print(calc_ave(1))
print(calc_ave(2))
print(calc_ave(3))
print(calc_ave(4))

calc_ave.__code__.co_freevars=('count', 'total')
calc_ave.__closure__=(<cell at 0x000001F65DDDE1D0: int object at 0x00007FF9D2703998>, <cell at 0x000001F65DDDD1B0: int object at 0x00007FF9D2703998>)
1.0
1.5
2.0
2.5


In [8]:
from time import perf_counter

class Timer:
    def __init__(self):
        self.start = perf_counter()

    def __call__(self):
        return perf_counter() - self.start
    
def outer():
    start = perf_counter()
    def poll():
        return perf_counter() - start
    return poll
    
t1 = Timer()
print(t1())

timer = outer()
print(timer())

9.450002107769251e-05
4.470010753720999e-05


In [9]:
def counter(fn):
    # report how many times the wrapped function has been called
    # this works exactly the way a decorator does
    count = 0
    def inner(*args, **kwargs):
        nonlocal count
        count += 1
        print(f'{fn.__name__} has been called {count} times')
        return fn(*args, **kwargs)
    return inner

def add(a, b):
    return a + b

def mul(a, b):
    return a * b

add = counter(add)
print(add(1, 2))
print(add(2, 3))

mul = counter(mul)
print(mul(4, 5))
print(mul(2, 7))

add has been called 1 times
3
add has been called 2 times
5
mul has been called 1 times
20
mul has been called 2 times
14


### Decorators

* We can modify a function by wrapping it inside another function that adds some extra functionality to it.
* This is also called function decoration.
* The outer function is called a decorator function.
* In general a decorator function:
  * takes a function as an argument
  * returns a closure
  * the closure usually accepts any combination of parameters
  * runs some code in the inner function (closure)
  * the closure function calls the original function using the arguments passed to the closure
  * returns whatever is returned by that function call
* Put differently, a decorator is a function returning another function, usually applied as a function transformation using the `@wrapper` syntax.
* Common examples for decorators are `classmethod()` and `staticmethod()`.
* The decorator syntax is merely syntactic sugar.
* A decorator in Python is any callable Python object that is used to modify a function or a class.

Introspecting decorated functions

* After a function has been decorated, the original function's name points to the closure returned by the decorator.
* The docstring and the signature of the original function are lost.
* The function name and docstring can be fixed by overwriting the inner function's `__name__` and `__doc__`.
* The `functools.wraps(wrapped, assigned=WRAPPER_ASSIGNMENTS, updated=WRAPPER_UPDATES)` function can be used to fix the metadata of the inner function in the decorator, and the `wraps` function is itself a decorator.
  * This is a convenience function for invoking `update_wrapper()` as a function decorator when defining a wrapper function. It is equivalent to `partial(update_wrapper, wrapped=wrapped, assigned=assigned, updated=updated)`.
  * `functools.update_wrapper(wrapper, wrapped, assigned=WRAPPER_ASSIGNMENTS, updated=WRAPPER_UPDATES)`: update a wrapper function to look like the wrapped function.
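
The relationship between `wraps` and `update_wrapper` described above can be sketched as follows. This is a minimal illustration, not part of the original notes: `shout` and `greet` are hypothetical names, and `update_wrapper` is called directly instead of going through `@wraps(fn)`.

```python
import inspect
from functools import update_wrapper

def shout(fn):  # hypothetical decorator, for illustration only
    def inner(*args, **kwargs):
        return str(fn(*args, **kwargs)).upper()
    # copies __name__, __doc__, etc. from fn onto inner and sets inner.__wrapped__ = fn;
    # decorating inner with @wraps(fn) would have the same effect
    return update_wrapper(inner, fn)

@shout
def greet(name: str) -> str:
    """Return a greeting."""
    return f'hello {name}'

print(greet.__name__)            # name copied from the original function
print(greet.__doc__)             # docstring copied from the original function
print(inspect.signature(greet))  # signature is resolved through __wrapped__
print(greet.__wrapped__('Ann'))  # the undecorated function is still reachable
print(greet('Ann'))
```

The `__wrapped__` attribute is what lets `inspect.signature` (and therefore `help`) report the original signature rather than `(*args, **kwargs)`.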

In [5]:
def counter(fn):
    count = 0

    def inner(*args, **kwargs):
        nonlocal count
        count += 1
        print(f'Function {fn.__name__} has been called {count} times')
        return fn(*args, **kwargs)
    return inner

def add(a: int, b: int = 0):
    """
    adds two values
    """
    return a + b

help(add)

add = counter(add)
help(add)
print(f'{add(1, 2)=}')

Help on function add in module __main__:

add(a: int, b: int = 0)
    adds two values

Help on function inner in module __main__:

inner(*args, **kwargs)

Function add has been called 1 times
add(1, 2)=3


In [6]:
def counter(fn):
    count = 0

    def inner(*args, **kwargs):
        nonlocal count
        count += 1
        print(f'Function {fn.__name__} has been called {count} times')
        return fn(*args, **kwargs)
    return inner

# use the @wrapper syntax
@counter
def add(a: int, b: int = 0):
    """
    adds two values
    """
    return a + b

help(add)
print(f'{add(1, 2)=}')

Help on function inner in module __main__:

inner(*args, **kwargs)

Function add has been called 1 times
add(1, 2)=3


In [7]:
from functools import wraps

def counter(fn):
    count = 0

    @wraps(fn) # fix the metadata of inner
    def inner(*args, **kwargs):
        nonlocal count
        count += 1
        print(f'Function {fn.__name__} has been called {count} times')
        return fn(*args, **kwargs)
    # the following can only fix function name and docstring
    # inner.__name__ = fn.__name__
    # inner.__doc__ = fn.__doc__

    # works the same as the @wraps decorator
    # inner = wraps(fn)(inner)
    return inner

@counter
def add(a: int, b: int = 0):
    """
    adds two values
    """
    return a + b

help(add)
print(f'{add(1, 2)=}')

Help on function add in module __main__:

add(a: int, b: int = 0)
    adds two values

Function add has been called 1 times
add(1, 2)=3


In [2]:
def timed(fn):
    from time import perf_counter
    from functools import wraps

    @wraps(fn)
    def inner(*args, **kwargs):
        start = perf_counter()
        result = fn(*args, **kwargs)
        elapsed = perf_counter() - start

        args_list = [str(arg) for arg in args]
        kwargs_list = [f'{k}={v}' for k, v in kwargs.items()]
        args_string = ','.join(args_list + kwargs_list)
        print(f'{fn.__name__}({args_string}) took {elapsed}s to run')
        return result
    
    return inner

# recursion
# @timed
def calc_nth_fibonacci_recursion(n):
    """
    Return the nth number of a fibonacci series, n starting from 1
    """
    return 1 if n < 3 else calc_nth_fibonacci_recursion(n - 1) + calc_nth_fibonacci_recursion(n - 2)

# calculate the total time elapsed
@timed
def fib_recursion(n):
    return calc_nth_fibonacci_recursion(n)

# calc_nth_fibonacci_recursion(6)

print(f'{fib_recursion(30)=}')

# loop
@timed
def calc_nth_fibonacci_loop(n):
    if n <= 2:
        return 1
    fib_1 = 1
    fib_2 = 1
    for _ in range(3, n + 1): # loop variable is unused, so avoid shadowing n
        fib_1, fib_2 = fib_2, fib_1 + fib_2
    return fib_2

print(f'{calc_nth_fibonacci_loop(30)=}')

# reduce
@timed
def calc_nth_fibonacci_reduce(n):
    from functools import reduce
    if n <= 1:
        return 1
    fib = reduce(lambda x, y: (x[0] + x[1], x[0]),
           range(n - 1),
           (1, 0))
    return fib[0]

print(f'{calc_nth_fibonacci_reduce(30)=}')

fib_recursion(30) took 0.18590169993694872s to run
fib_recursion(30)=832040
calc_nth_fibonacci_loop(30) took 7.3999399319291115e-06s to run
calc_nth_fibonacci_loop(30)=832040
calc_nth_fibonacci_reduce(30) took 1.2199976481497288e-05s to run
calc_nth_fibonacci_reduce(30)=832040


In [6]:
def logger(fn):
    from functools import wraps
    import datetime

    @wraps(fn)
    def inner(*args, **kwargs):
        run_dt = datetime.datetime.now(datetime.UTC)
        result = fn(*args, **kwargs)
        print(f'{run_dt}: called {fn.__name__}')
        return result
    
    return inner

def timed(fn):
    from functools import wraps
    from time import perf_counter

    @wraps(fn)
    def inner(*args, **kwargs):
        start = perf_counter()
        result = fn(*args, **kwargs)
        elapsed = perf_counter() - start
        print(f'{fn.__name__} ran for {elapsed}s')
        return result
    
    return inner

# stacked decorators
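# decorators are applied bottom-up: timed wraps factorial first, then logger wraps the result
# at call time the outermost wrapper (logger's inner) runs first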
@logger
@timed
def factorial(n):
    from functools import reduce
    from operator import mul

    return reduce(mul, range(1, n + 1))

# equivalent to the stacked decorators above
# factorial = logger(timed(factorial))

print(factorial(10))

factorial ran for 1.6400008462369442e-05s
2024-01-04 09:53:04.115387+00:00: called factorial
3628800
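
The ordering of stacked decorators can be made visible with a small sketch. The names `dec_a`, `dec_b`, `work` and `events` below are hypothetical and only serve the illustration: each decorator records when it is applied and when its wrapper is entered and exited.

```python
events = []  # hypothetical log of what happens, in order

def dec_a(fn):
    events.append('apply a')
    def inner(*args, **kwargs):
        events.append('enter a')
        result = fn(*args, **kwargs)
        events.append('exit a')
        return result
    return inner

def dec_b(fn):
    events.append('apply b')
    def inner(*args, **kwargs):
        events.append('enter b')
        result = fn(*args, **kwargs)
        events.append('exit b')
        return result
    return inner

@dec_a
@dec_b
def work():  # same as work = dec_a(dec_b(work))
    events.append('work')

work()
print(events)
```

The decorator closest to the function (`dec_b`) is applied first, while at call time the outermost wrapper (`dec_a`) is entered first and exited last. This is why, in the cell above, the message from `timed` is printed before the message from `logger`, which prints only after the wrapped call has returned.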


In [10]:
# fibonacci class with cache
class Fib:
    def __init__(self):
        self.cache = {1: 1, 2: 1}

    def __call__(self, n):
        if n in self.cache:
            return self.cache[n]
        print(f'Calculating fib({n})')
        self.cache[n] = self.__call__(n - 1) + self.__call__(n - 2)
        return self.cache[n]
    
fib = Fib()
print(f'{fib(1)=}')
print(f'{fib(2)=}')
print(f'{fib(3)=}')
print(f'{fib(4)=}')
print(f'{fib(5)=}')
print(f'{fib(4)=}')
print(f'{fib(3)=}')
print(f'{fib(2)=}')


fib(1)=1
fib(2)=1
Calculating fib(3)
fib(3)=2
Calculating fib(4)
fib(4)=3
Calculating fib(5)
fib(5)=5
fib(4)=3
fib(3)=2
fib(2)=1


In [13]:
# closure with cache
def outer():
    cache = {1: 1, 2: 1}

    def fib(n):
        if n not in cache:
            print(f'Calculating fib({n})')
            cache[n] = fib(n - 1) + fib(n - 2)
        return cache[n]
    
    return fib

fib = outer()
print(f'{fib(1)=}')
print(f'{fib(2)=}')
print(f'{fib(3)=}')
print(f'{fib(4)=}')
print(f'{fib(5)=}')
print(f'{fib(4)=}')

fib(1)=1
fib(2)=1
Calculating fib(3)
fib(3)=2
Calculating fib(4)
fib(4)=3
Calculating fib(5)
fib(5)=5
fib(4)=3


In [15]:
def memoize(fn):
    cache = {}

    def inner(n):
        if n not in cache:
            cache[n] = fn(n)
        return cache[n]
    
    return inner

# decorator with cache
@memoize
def fib(n):
    print(f'Calculating fib({n})')
    return 1 if n < 3 else fib(n - 1) + fib(n - 2)

print(f'{fib(1)=}')
print(f'{fib(2)=}')
print(f'{fib(3)=}')
print(f'{fib(4)=}')
print(f'{fib(5)=}')
print(f'{fib(4)=}')

Calculating fib(1)
fib(1)=1
Calculating fib(2)
fib(2)=1
Calculating fib(3)
fib(3)=2
Calculating fib(4)
fib(4)=3
Calculating fib(5)
fib(5)=5
fib(4)=3


In [18]:
from functools import lru_cache

# built-in decorator with cache
# functools.cache(user_function) is equivalent to lru_cache(maxsize=None)
# @functools.lru_cache(user_function)
# @functools.lru_cache(maxsize=128, typed=False)
@lru_cache
def fib(n):
    print(f'Calculating fib({n})')
    return 1 if n < 3 else fib(n - 1) + fib(n - 2)

print(f'{fib(1)=}')
print(f'{fib(2)=}')
print(f'{fib(3)=}')
print(f'{fib(4)=}')
print(f'{fib(5)=}')
print(f'{fib(4)=}')
print(f'{fib(129)=}') # default maxsize=128, so the least recently used entry, fib(1), gets evicted
print(f'{fib(1)=}')

Calculating fib(1)
fib(1)=1
Calculating fib(2)
fib(2)=1
Calculating fib(3)
fib(3)=2
Calculating fib(4)
fib(4)=3
Calculating fib(5)
fib(5)=5
fib(4)=3
Calculating fib(129)
Calculating fib(128)
Calculating fib(127)
Calculating fib(126)
Calculating fib(125)
Calculating fib(124)
Calculating fib(123)
Calculating fib(122)
Calculating fib(121)
Calculating fib(120)
Calculating fib(119)
Calculating fib(118)
Calculating fib(117)
Calculating fib(116)
Calculating fib(115)
Calculating fib(114)
Calculating fib(113)
Calculating fib(112)
Calculating fib(111)
Calculating fib(110)
Calculating fib(109)
Calculating fib(108)
Calculating fib(107)
Calculating fib(106)
Calculating fib(105)
Calculating fib(104)
Calculating fib(103)
Calculating fib(102)
Calculating fib(101)
Calculating fib(100)
Calculating fib(99)
Calculating fib(98)
Calculating fib(97)
Calculating fib(96)
Calculating fib(95)
Calculating fib(94)
Calculating fib(93)
Calculating fib(92)
Calculating fib(91)
Calculating fib(90)
Calculating fib(89)
... (output truncated)
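
The eviction behaviour is easier to observe with a tiny cache. The following is a sketch under assumed names (`square`, `computed`), using `maxsize=2` so that eviction happens after only a few calls, and `cache_info()` to inspect hits and misses.

```python
from functools import lru_cache

computed = []  # hypothetical helper: records every real (uncached) call

@lru_cache(maxsize=2)
def square(n):
    computed.append(n)
    return n * n

square(1)   # miss
square(2)   # miss
square(1)   # hit, 1 becomes the most recently used entry
square(3)   # miss, the cache is full so the least recently used entry (2) is evicted
info = square.cache_info()
print(info)

square(2)   # miss again, because 2 was evicted
print(computed)
```

`cache_info()` returns a named tuple with `hits`, `misses`, `maxsize` and `currsize`, and `cache_clear()` empties the cache.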

Decorator factories

* A Python decorator factory is a function that returns a decorator, providing a way to parameterize the decorator.
* Decorator factories are valuable when customization or configuration of decorators is needed.
* They are implemented by defining a function that takes arguments and returns a decorator function.

In [19]:
def timed(fn, reps):
    from time import perf_counter
    from functools import wraps

    @wraps(fn)
    def inner(*args, **kwargs):
        total_elapsed = 0
        for i in range(reps):
            start = perf_counter()
            result = fn(*args, **kwargs)
            total_elapsed += perf_counter() - start
        print(f'Average execution time: {total_elapsed / reps}')
        return result

    return inner

# @timed(10) would not work: timed expects both fn and reps, so it cannot be applied with the @ syntax
def my_func():
    print('my_func executed')

my_func = timed(my_func, 10)
my_func()

my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
Average execution time: 2.576999831944704e-05


In [22]:
# the outer function is not itself a decorator
# instead it returns a decorator, just like wraps does
# any arguments passed to outer can be referenced (as free variables) inside the decorator
# such an outer function is called a decorator factory
def outer(reps):
    def timed(fn):
        from time import perf_counter
        from functools import wraps

        @wraps(fn)
        def inner(*args, **kwargs):
            total_elapsed = 0
            for i in range(reps):
                start = perf_counter()
                result = fn(*args, **kwargs)
                total_elapsed += perf_counter() - start
            print(f'Average execution time: {total_elapsed / reps}')
            return result

        return inner

    return timed

@outer(10) # outer gets called, and returns a parameterized decorator
def my_func():
    print('my_func executed')

# my_func = outer(10)(my_func) # equivalent to the decorator above
my_func()

my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
my_func executed
Average execution time: 2.861000830307603e-05


In [24]:
def dec(fn):
    print('running dec')

    def inner(*args, **kwargs):
        print('running inner')
        return fn(*args, **kwargs)
    
    return inner

@dec
def func():
    print('running func')

# func = dec(func) # @dec equivalent

print('after func got decorated')
func()

running dec
after func got decorated
running inner
running func


Class instance decorators

* Because class instances can be made callable by implementing the `__call__` method, we can decorate a function using a class instance.

In [27]:
class DecoratorClass:
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __call__(self, fn):
        print(f'decorate {fn.__name__}')
        def inner(*args, **kwargs):
            print(f'decorated function called: a={self.a}, b={self.b}')
            return fn(*args, **kwargs)
        return inner

@DecoratorClass(10, 20)
def func():
    print('func called')

print('after func decorated')
func()

decorate func
after func decorated
decorated function called: a=10, b=20
func called


Class decorators

* Decorators can be classes too.
* When we decorate a function with a class, the function is passed to the `__init__` method as the first argument after `self`.
* The function's name is then rebound to an instance of the class, so calling it invokes the instance's `__call__` method.
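
Because the decorated name refers to an instance, the instance can carry state between calls. The sketch below is an illustration with hypothetical names (`CountCalls`, `add`). It also calls `functools.update_wrapper` on the instance, one possible way to keep the original function's metadata.

```python
from functools import update_wrapper

class CountCalls:  # hypothetical class decorator that counts calls
    def __init__(self, fn):
        update_wrapper(self, fn)  # copy __name__, __doc__, etc. onto the instance
        self.fn = fn
        self.calls = 0

    def __call__(self, *args, **kwargs):
        self.calls += 1
        return self.fn(*args, **kwargs)

@CountCalls
def add(a, b):
    """adds two values"""
    return a + b

print(type(add))     # add is now an instance of CountCalls
print(add(1, 2))
print(add(3, 4))
print(add.calls)     # state is kept on the instance
print(add.__name__)  # metadata copied by update_wrapper
```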

In [30]:
class Decorator:
    def __init__(self, fn): # fn is the only argument, so this decorator cannot take parameters
        print(f'decorate {fn.__name__}')
        self.fn = fn

    def __call__(self, *args, **kwargs):
        print(f'decorated {self.fn.__name__} called')
        return self.fn(*args, **kwargs)
    
@Decorator
def func(a, b=1, *, c):
    print(f'{a=}, {b=}, {c=}')

print('after func decorated')
func(0, c=2)

decorate func
after func decorated
decorated func called
a=0, b=1, c=2


Decorating classes

* We can use decorators to decorate classes, just like how we decorate functions.

In [32]:
def speak_decorator(cls):
    cls.speak = lambda self, message: f'{self.__class__.__name__} says {message}'
    return cls # required for the @ syntax (Person is rebound to the return value); optional only when calling speak_decorator(Person) without reassigning

@speak_decorator
class Person:
    pass

# Person = speak_decorator(Person) # decorator equivalent

p = Person()
p.speak('Hello world!')

'Person says Hello world!'

In [41]:
import datetime

def debug_info(cls):
    def info(self):
        results = []
        results.append(f'time: {datetime.datetime.now(datetime.UTC)}')
        results.append(f'class: {self.__class__.__name__}')
        results.append(f'id: {hex(id(self))}')
        for k, v in vars(self).items():
            results.append(f'{k}: {v}')
        return results
    
    cls.debug = info
    return cls

@debug_info
class Person:
    def __init__(self, name, birth_year):
        self.name = name
        self.birth_year = birth_year

    def say_hi(self):
        return 'Hello there!'
    
p = Person('John', 1999)
print(p.say_hi())
print(p.debug())

Hello there!
['time: 2024-01-05 08:51:19.424550+00:00', 'class: Person', 'id: 0x229cefbd9a0', 'name: John', 'birth_year: 1999']


In [45]:
@debug_info
class Auto:
    def __init__(self, make, model, year, top_speed, speed):
        self.make = make
        self.model = model
        self.year = year
        self.top_speed = top_speed
        self.speed = speed

    @property
    def speed(self):
        return self._speed
    
    @speed.setter
    def speed(self, speed):
        if speed > self.top_speed:
            raise ValueError('Speed cannot exceed top_speed')
        self._speed = speed

    def run(self):
        print('The car is running')

auto = Auto('Ford', 'Model T', 1908, 45, 10)
print(auto.speed)
# auto.speed = 100 # raise ValueError here
auto.speed = 20
auto.debug()

10


['time: 2024-01-05 09:06:40.274780+00:00',
 'class: Auto',
 'id: 0x229cedc75c0',
 'make: Ford',
 'model: Model T',
 'year: 1908',
 'top_speed: 45',
 '_speed: 20']

In [50]:
def complete_ordering(cls):
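    # object already provides default __eq__ and __lt__, so "'__lt__' in dir(cls)" is always True;
    # compare against object's versions to detect methods the class actually defines
    if cls.__eq__ is not object.__eq__ and cls.__lt__ is not object.__lt__: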
        cls.__le__ = lambda self, other: self < other or self == other
        cls.__gt__ = lambda self, other: not (self < other or self == other) # raises TypeError if other is of an unsupported type
        cls.__ge__ = lambda self, other: not (self < other)
        cls.__ne__ = lambda self, other: not (self == other)
    return cls

@complete_ordering
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __abs__(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
    
    def __repr__(self):
        return f'Point({self.x}, {self.y})'
    
    def __eq__(self, other):
        if isinstance(other, Point):
            return self.x == other.x and self.y == other.y
        else:
            return False
        
    def __lt__(self, other):
        if isinstance(other, Point):
            return abs(self) < abs(other)
        else:
            return NotImplemented

p1, p2, p3 = Point(1, 2), Point(1, 2), Point(2, 3)
n = 1
print(f'{(p1 == p2)=}')
print(f'{(p1 != p2)=}')
print(f'{(p1 > p2)=}')
print(f'{(p1 >= p2)=}')
print(f'{(p1 < p2)=}')
print(f'{(p1 <= p2)=}')
print(f'{(p1 < p3)=}')
print(f'{(p1 >= p3)=}')
print(f'{(p1 != 1)=}')
# print(f'{(p1 < 1)=}') # raise TypeError
# print(f'{(p1 > 1)=}')
# print(f'{(p1 >= 1)=}')

(p1 == p2)=True
(p1 != p2)=False
(p1 > p2)=False
(p1 >= p2)=True
(p1 < p2)=False
(p1 <= p2)=True
(p1 < p3)=True
(p1 >= p3)=False
(p1 != 1)=True


In [51]:
from functools import total_ordering

# Given a class defining one or more rich comparison ordering methods, this class decorator supplies the rest.
# The class must define one of __lt__(), __le__(), __gt__(), or __ge__(). In addition, the class should supply an __eq__() method.
@total_ordering
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __abs__(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5
    
    def __repr__(self):
        return f'Point({self.x}, {self.y})'
    
    def __eq__(self, other):
        if isinstance(other, Point):
            return self.x == other.x and self.y == other.y
        else:
            return False
        
    def __lt__(self, other):
        if isinstance(other, Point):
            return abs(self) < abs(other)
        else:
            return NotImplemented
        
p1, p2, p3 = Point(1, 2), Point(1, 2), Point(2, 3)
n = 1
print(f'{(p1 == p2)=}')
print(f'{(p1 != p2)=}')
print(f'{(p1 > p2)=}')
print(f'{(p1 >= p2)=}')
print(f'{(p1 < p2)=}')
print(f'{(p1 <= p2)=}')
print(f'{(p1 < p3)=}')
print(f'{(p1 >= p3)=}')
print(f'{(p1 != 1)=}')
# print(f'{(p1 < 1)=}') # raise TypeError
# print(f'{(p1 > 1)=}')
# print(f'{(p1 >= 1)=}')

(p1 == p2)=True
(p1 != p2)=False
(p1 > p2)=False
(p1 >= p2)=True
(p1 < p2)=False
(p1 <= p2)=True
(p1 < p3)=True
(p1 >= p3)=False
(p1 != 1)=True


Single-dispatch generic functions

* A generic function is composed of multiple functions implementing the same operation for different types.
* Which implementation should be used during a call is determined by the dispatch algorithm.
* When the implementation is chosen based on the type of a single argument, this is known as single dispatch.

In [4]:
from html import escape
from decimal import Decimal

def html_escape(obj):
    return escape(str(obj))

def html_int(i):
    return f'{i} (<i>{hex(i)}</i>)'

def html_real(r):
    return f'{r:.2f}'

def html_str(s):
    return html_escape(s).replace('\n', '<br>\n')

def html_list(l):
    items = (f'<li>{htmlize(item)}</li>' for item in l)
    return '<ul>\n' + '\n'.join(items) + '\n</ul>'

def html_dict(d):
    items = (f'<li>{html_escape(k)}={htmlize(v)}</li>' for k, v in d.items())
    return '<ul>\n' + '\n'.join(items) + '\n</ul>'

def htmlize(arg):
    registry = {
        object: html_escape,
        int: html_int,
        float: html_real,
        Decimal: html_real,
        str: html_str,
        list: html_list,
        tuple: html_list,
        set: html_list,
        dict: html_dict
    }
    return registry.get(type(arg), registry[object])(arg)

print(htmlize(['a', 1, 1.2, 'hello\nworld, 1 > 2 != 3',
               (True, False, None),
               {1: '1', 2: '2', 3: '3'},
               {'x', 'y', 'z'}]))

<ul>
<li>a</li>
<li>1 (<i>0x1</i>)</li>
<li>1.20</li>
<li>hello<br>
world, 1 &gt; 2 != 3</li>
<li><ul>
<li>True</li>
<li>False</li>
<li>None</li>
</ul></li>
<li><ul>
<li>1=1</li>
<li>2=2</li>
<li>3=3</li>
</ul></li>
<li><ul>
<li>x</li>
<li>z</li>
<li>y</li>
</ul></li>
</ul>


In [13]:
from html import escape

def single_dispatch(fn):
    registry = {}
    registry[object] = fn

    def decorated(arg):
        return registry.get(type(arg), registry[object])(arg)

    def register(type_):
        def inner(fn_extension):
            registry[type_] = fn_extension
            return fn_extension # just return back the decorated function as it is
        return inner

    def dispatch(type_):
        return registry.get(type_, registry[object])

    decorated.register = register
    decorated.dispatch = dispatch
    decorated.registry = registry
    return decorated

@single_dispatch
def htmlize(obj):
    return escape(str(obj))

print(f'{htmlize.__name__=}')
print(f'{htmlize.register=}')
print(htmlize('I can fly!\n1 < 100'))

# the register decorator can be stacked because it returns the decorated function unchanged
@htmlize.register(set)
@htmlize.register(tuple)
@htmlize.register(list)
def html_sequence(sequence):
    items = (f'<li>{htmlize(item)}</li>' for item in sequence)
    return '<ul>\n' + '\n'.join(items) + '\n</ul>'

print(htmlize(['a', 'b', 'c']))
print(htmlize(('1', '2', '3')))
print(htmlize.registry)
print(htmlize.dispatch(int))

htmlize.__name__='decorated'
htmlize.register=<function single_dispatch.<locals>.register at 0x0000022310309580>
I can fly!
1 &lt; 100
<ul>
<li>a</li>
<li>b</li>
<li>c</li>
</ul>
<ul>
<li>1</li>
<li>2</li>
<li>3</li>
</ul>
{<class 'object'>: <function htmlize at 0x000002231030B4C0>, <class 'list'>: <function html_sequence at 0x000002231030B1A0>, <class 'tuple'>: <function html_sequence at 0x000002231030B1A0>, <class 'set'>: <function html_sequence at 0x000002231030B1A0>}
<function htmlize at 0x000002231030B4C0>


`@functools.singledispatch`:

* To define a generic function, decorate it with the `@singledispatch` decorator
* The dispatch happens on the type of the first argument
* To add overloaded implementations to the function, use the `register()` attribute of the generic function, which can be used as a decorator
* For functions annotated with types, the decorator will infer the type of the first argument automatically
* `types.UnionType` and `typing.Union` can also be used
* For code which doesn’t use type annotations, the appropriate type argument can be passed explicitly to the decorator itself
* To enable registering lambdas and pre-existing functions, the `register()` attribute can also be used in a functional form
* The `register()` attribute returns the undecorated function
* Where there is no registered implementation for a specific type, its method resolution order is used to find a more generic implementation
* If an implementation is registered to an abstract base class, virtual subclasses of the base class will be dispatched to that implementation
* To check which implementation the generic function will choose for a given type, use the `dispatch()` attribute
* To access all registered implementations, use the read-only `registry` attribute
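
The points above can be sketched by reworking the earlier `htmlize` example with `functools.singledispatch`. This is an illustrative sketch rather than a complete implementation: the helper name `_html_sequence` and the `None` handler are assumptions made for the example.

```python
from functools import singledispatch
from html import escape

@singledispatch
def htmlize(obj):
    # fallback implementation, registered for object
    return escape(str(obj))

@htmlize.register
def _(i: int):  # type inferred from the annotation of the first argument
    return f'{i} (<i>{hex(i)}</i>)'

@htmlize.register(float)  # type passed explicitly to the decorator
def _(r):
    return f'{r:.2f}'

@htmlize.register(list)   # register() returns the undecorated function,
@htmlize.register(tuple)  # so registrations can be stacked
def _html_sequence(seq):
    items = (f'<li>{htmlize(item)}</li>' for item in seq)
    return '<ul>\n' + '\n'.join(items) + '\n</ul>'

# functional form, useful for lambdas and pre-existing functions
htmlize.register(type(None), lambda _: '<i>none</i>')

print(htmlize('1 < 2'))
print(htmlize(1.5))
print(htmlize(True))  # bool has no implementation, so the MRO leads to the int one
print(htmlize([1, 'a', None]))
print(htmlize.dispatch(bool) is htmlize.dispatch(int))
print(htmlize.registry.keys())
```

Unlike the hand-written `single_dispatch` above, which looks up `type(arg)` exactly, `singledispatch` walks the method resolution order, which is why `True` is rendered by the `int` implementation here.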

## Tuples as data structures and named tuples

### Tuples as data structures

* Tuples are immutable sequences, typically used to store collections of heterogeneous data (such as the 2-tuples produced by the `enumerate()` built-in).
* Tuples are also used for cases where an immutable sequence of homogeneous data is needed (such as allowing storage in a `set` or `dict` instance).

Tuples vs lists vs strings

Tuples|Lists|Strings
---|---|---
containers|containers|containers
order matters|order matters|order matters
heterogeneous/homogeneous|homogeneous/heterogeneous|homogeneous
indexable|indexable|indexable
iterable|iterable|iterable
immutable|mutable|immutable
fixed length|variable length|fixed length
fixed order|variable order|fixed order
cannot do in-place sort|can do in-place sort|cannot do in-place sort
cannot do in-place reverse|can do in-place reverse|cannot do in-place reverse

Immutability of tuples

* Elements cannot be added or removed
* The order of elements cannot be changed
* Works well for representing data structures, because the position of each contained object carries meaning
  * `london = ('London', 'UK', 8_780_000)`
  * because tuples, strings and integers are immutable, we are guaranteed that the data and data structure for `london` will never change
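
The guarantee depends on every element being immutable too. The sketch below (the `mixed` tuple is a hypothetical example) shows that a tuple rejects item assignment, yet a mutable object stored inside it can still change, which also makes the tuple unhashable.

```python
london = ('London', 'UK', 8_780_000)

try:
    london[2] = 9_000_000  # tuples do not support item assignment
except TypeError as exc:
    print(f'TypeError: {exc}')

print(isinstance(hash(london), int))  # all elements are immutable, so the tuple is hashable

mixed = ('London', ['UK'])   # hypothetical tuple holding a mutable list
mixed[1].append('England')   # the tuple still refers to the same list, but the list changed
print(mixed)

try:
    hash(mixed)  # a tuple containing a list is not hashable
except TypeError as exc:
    print(f'TypeError: {exc}')
```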

Extracting data from tuples

* Since tuples are sequences just like lists and strings, we can retrieve items by index.
* We can also use unpacking.

Dummy variables

* Sometimes, we are only interested in a subset of the data fields in a tuple, not all of them.
  * `city, _, population = ('London', 'UK', 8_780_000)`
  * `symbol, year, month, day, *_, close = ('DJIA', 2018, 1, 19, 25987.35, 26071.22, 25942.83, 26071.72)`
  * `_` is actually a legal variable name, so there's nothing special about it.
  * By convention, we use the underscore to indicate this is a variable we don't care about.

In [15]:
london = 'London', 'UK', 8780000
new_york = 'New York', 'USA', 8500000
beijing = 'Beijing', 'China', 21000000
cities = [london, new_york, beijing]

total = 0
# for city in cities:
#     total += city[2]
for *_, population in cities:
    total += population
print(total)

total = sum(city[2] for city in cities)
print(total)

38280000
38280000


In [19]:
# random.uniform(a, b)
# Return a random floating point number N such that a <= N <= b for a <= b and b <= N <= a for b < a.
from random import uniform

def random_shot(radius=1):
    random_x = uniform(-radius, radius)
    random_y = uniform(-radius, radius)

    is_in_circle = random_x ** 2 + random_y ** 2 <= radius ** 2
    return random_x, random_y, is_in_circle

def in_circle_rate(repetitions):
    count = 0
    for _ in range(repetitions):
        *_, inside = random_shot()
        count += inside # bool is a subclass of int, so True adds 1 and False adds 0
    return count / repetitions

print(f'Pi = {in_circle_rate(1000) * 4}')
print(f'Pi = {in_circle_rate(10000) * 4}')
print(f'Pi = {in_circle_rate(100000) * 4}')
print(f'Pi = {in_circle_rate(1000000) * 4}')

Pi = 3.168
Pi = 3.1524
Pi = 3.1494
Pi = 3.141376


### Named tuples

* The position of each object in a tuple carries meaning as a data structure, but this is neither very readable nor transparent.
* To make things clearer for the reader, we might want to use a class instead.
* The class approach has some drawbacks: a plain class needs extra code (such as `__repr__` and `__eq__`) to match what tuples provide out of the box, and its instances are mutable by default.
* When your data structure needs to be immutable, hashable, iterable, unpackable, and comparable, then you can use `namedtuple`.
* Named tuples assign meaning to each position in a tuple and allow for more readable, self-documenting code. They can be used wherever regular tuples are used, and they add the ability to access fields by name instead of position index.
* `namedtuple` is a factory function, which generates a new class inheriting from `tuple`.
* `collections.namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)`
  * Returns a new tuple subclass named `typename`. The new *subclass* is used to create tuple-like objects that have fields accessible by attribute lookup as well as being indexable and iterable.
  * The `field_names` are a sequence of strings such as `['x', 'y']`. Alternatively, `field_names` can be a single string with each fieldname separated by whitespace and/or commas, for example `'x y'` or `'x, y'`.
  * Any valid Python identifier may be used for a fieldname except for names starting with an underscore.
  * If `rename` is true, invalid fieldnames are automatically replaced with positional names. For example, `['abc', 'def', 'ghi', 'abc']` is converted to `['abc', '_1', 'ghi', '_3']`, eliminating the keyword `def` and the duplicate fieldname `abc`.
  * The `__new__` method of the generated class uses the field names provided as parameter names.
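
The effect of `rename` can be sketched with the field names from the example above. `Row` and `Bad` are hypothetical type names chosen for the illustration.

```python
from collections import namedtuple

# 'def' is a keyword and the second 'abc' is a duplicate, so both are renamed by position
Row = namedtuple('Row', ['abc', 'def', 'ghi', 'abc'], rename=True)
print(Row._fields)

row = Row(1, 2, 3, 4)
print(row)
print(row._1)  # renamed fields are accessed through their positional names

# without rename=True, an invalid field name raises ValueError
try:
    namedtuple('Bad', ['abc', 'def'])
except ValueError as exc:
    print(f'ValueError: {exc}')
```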

Accessing data in a named tuple

* Since named tuples are also regular tuples, we can still handle them just like any other tuples by:
  * index
  * slice
  * iterate
  * unpack
* In addition, we can also access the data using the field names.
* Since namedtuple generated classes inherit from `tuple`, they are therefore immutable.

Introspection

* We can easily find out the field names in a namedtuple generated class by using the class attribute `_fields`.
* In older Python versions we could see the code for the generated class using the class attribute `_source` (removed in version 3.7).

Extracting named tuple values to a dictionary

* Instance method `_asdict()` creates a dictionary of all the named values in the tuple.

In [29]:
from collections import namedtuple

Point2D = namedtuple('Point2D', 'x y')

def dot_product(a, b):
    return sum(p[0] * p[1] for p in zip(a, b))

p1 = Point2D(x=1, y=2)
# p1 = Point2D(x=1, y=2, y=3) # SyntaxError: keyword argument repeated: y
p2 = Point2D(3, 4)
p3 = Point2D(1, 2)
print(f'{p1=}')
print(f'{type(p2)=}')
print(f'{isinstance(p2, tuple)=}')
print(f'{p1.x=}, {p1.y=}')
print(f'{p2.x=}, {p2.y=}')
print(f'{p3.x=}, {p3.y=}')
print(f'{(p1 == p3)=}')
x, y = p1
print(f'{x=}, {y=}')
print(f'p1 dot p2 = {dot_product(p1, p2)}')
print(p1._asdict())
print(Point2D._fields)
# print(Point2D._source) # removed in version 3.7
p1_sliced = p1[:1]
print(f'{p1_sliced=}')
print(f'{type(p1_sliced)=}')
# print(f'{(p1_sliced._fields)=}') # AttributeError: slicing returns a plain tuple

p1=Point2D(x=1, y=2)
type(p2)=<class '__main__.Point2D'>
isinstance(p2, tuple)=True
p1.x=1, p1.y=2
p2.x=3, p2.y=4
p3.x=1, p3.y=2
(p1 == p3)=True
x=1, y=2
p1 dot p2 = 11
{'x': 1, 'y': 2}
('x', 'y')
p1_sliced=(1,)
type(p1_sliced)=<class 'tuple'>


Modifying and extending named tuples

* Named tuples are immutable.
* Just like with strings, we have to create a new tuple with the modified values.
* The `_replace(**kwargs)` instance method
  * It will copy the named tuple into a new one, replacing any values from keyword arguments with specified field names and new values.
  * The keyword name must match an existing field name.
* Sometimes we want to create a named tuple that extends another named tuple, appending one or more fields.
  * We can create a new named tuple by concatenating the existing `_fields` with new field names, and then generate a new class using `namedtuple`.
  * We can use the class method `_make(iterable)` to make a new instance from an existing sequence or iterable.

In [31]:
from collections import namedtuple

Point2D = namedtuple('Point2D', 'x y')
p1 = Point2D(1, 2)
print(f'{p1=} @ {hex(id(p1))}')
p1 = Point2D(p1.x, y=3)
print(f'{p1=} @ {hex(id(p1))}')
p1 = p1._replace(y=4) # create a new instance by replacing some values
print(f'{p1=} @ {hex(id(p1))}')

Point3D = namedtuple('Point3D', Point2D._fields + ('z',)) # extend named tuple
p2 = Point3D._make(p1 + (5,))
print(f'{p2=} @ {hex(id(p2))}')
p3 = Point3D(*p1, 6)
print(f'{p3=} @ {hex(id(p3))}')

p1=Point2D(x=1, y=2) @ 0x2230f9b0f80
p1=Point2D(x=1, y=3) @ 0x2231016c440
p1=Point2D(x=1, y=4) @ 0x2230f998140
p2=Point3D(x=1, y=4, z=5) @ 0x2230f9a1210
p3=Point3D(x=1, y=4, z=6) @ 0x2230f9a08b0


Docstrings and default values of named tuples

* 