### 03 - Python Fundamentals Part 3

#### Outline

* Functions, Mutability, and Pass-by-Assignment
* Control Flow, Loops, Branches, and Errors
* Defaults
* Iterators


In [None]:
# These are part of the standard library
import os
from pathlib import Path

working_directory = Path(os.path.abspath(''))  # Immediately stuff the string into a Path object
static_dir = working_directory / "static" / "03"

----

### Functions, Lambdas, Recursion, and Splatting

Functions are first-class objects and can be passed around like variables.

Lambdas are functions made on the spot. They are called "anonymous functions" in MATLAB, "closures" in Rust, and "lambdas" or "lambda expressions" in Haskell, C#, and Java. The name "lambda" comes from lambda calculus, a formal model of computation built entirely from defining and applying functions.

Recursion is useful, but also can be confusing and hazardous. Use only when necessary.
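As a small illustration of functions being first-class objects, the sketch below passes both a named function and a lambda as arguments. The names `apply_twice` and `add_three` are made up for this example.

```python
from typing import Callable

def apply_twice(func: Callable[[int], int], x: int) -> int:
    """Call `func` on `x`, then call it again on the result."""
    return func(func(x))

def add_three(x: int) -> int:
    return x + 3

# A named function passed as an argument
result_named = apply_twice(add_three, 1)  # (1 + 3) + 3 = 7

# A lambda passed as an argument
result_lambda = apply_twice(lambda x: x * 2, 5)  # (5 * 2) * 2 = 20

# Builtins like `sorted` accept functions too; here a lambda picks the sort key
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=lambda w: len(w))  # ['fig', 'pear', 'banana']

print(result_named, result_lambda, by_length)
```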

In [None]:
# This is a basic function definition
# NOTE: Show docstring generation with `autodocstring`

def myfunc(a: float, b: int) -> float:
    """
    This is a docstring! Write some notes for future self.
    Pretend your future self is a different person, because they will be.

    Args:
        a: first value
        b: who knows!

    Returns:
        a * b unless b is less than 3, in which case, zero
    """

    if b > 2:
        return a * float(b)
    else:
        return 0.0

# Same function as a lambda using a ternary expression.
# Lambdas can be very compact, but difficult to read!

mylambda = lambda a, b: a * float(b) if b > 2 else 0.0


In [None]:
# Functions can be called with "positional" (ordered) or "keyword" (named) arguments, or a mix of both.
# Keyword arguments must come after any positional arguments,
# because otherwise we wouldn't be able to tell what position the positionals are in.

myfunc(5.0, 6)
myfunc(5.0, b=6)
myfunc(b=6, a=5.0) # Different order!
# myfunc(a=5.0, 6)  # nope

In [None]:
# Functions can take unspecified numbers of inputs.
# Keyword arguments are passed as a dictionary, while positional arguments are passed as a tuple.
# This means that in performance-intensive cases, using positional arguments can be faster.

def flexiblefunc(*args, **kwargs):
    print(f"Positional arguments: {args}")
    print(f"Keyword arguments: {kwargs}")

# These two patterns have the same effect
flexiblefunc(*(1, "asdf", False), **{"a": 9, "dfgkhj": True})
print("\n")
flexiblefunc(1, "asdf", False, a=9, dfgkhj=True)

In [None]:
# Splatting (or unpacking, or destructuring) allows using that kind of syntax elsewhere.
# Keyword splatting (`**`) works in function calls and in dict literals like `{**a, **b}`.
# You can also splat the contents of an iterator, and we'll see more of that later.

# Splatting can unpack tuples into other collections
print(f"[*(1, 2)]: {[*(1, 2)]}")

# Splatting can also be used with functions that were defined with specific arguments
print(f"myfunc(*(5.0, 6)): {myfunc(*(5.0, 6))}")

# Implicit unpacking for defining multiple variables on a single line
a, b, c = (5, 6, 2)
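Unpacking also works in assignment targets and inside list and dict literals. A short sketch, with variable names chosen only for illustration:

```python
# Star-unpacking in an assignment collects the leftovers into a list
first, *rest = (1, 2, 3, 4)
print(first, rest)  # 1 [2, 3, 4]

# Double-star unpacking merges dictionaries inside a dict literal;
# later entries win when keys collide
defaults = {"a": 1, "b": 2}
overrides = {"b": 20, "c": 30}
merged = {**defaults, **overrides}
print(merged)  # {'a': 1, 'b': 20, 'c': 30}

# Single-star unpacking combines sequences inside a list literal
combined = [*(1, 2), *[3, 4]]
print(combined)  # [1, 2, 3, 4]
```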

In [None]:
# Recursion is useful but limited.
#
# Even algorithms with "recursive" in the name don't usually use actual recursion due to these limitations,
# because most recursions can be rewritten as loops that do not have the same limitations.

# A recursive function is one that calls itself.
# This example is an "unbounded recursion" with no stopping condition, so it will fail
def recursive_func(count):
    count[0] += 1
    recursive_func(count)

# This will fail once the interpreter's stack depth limit is reached - you can't recurse forever
count = [0]
try:
    recursive_func(count)
except RecursionError as e:
    print(f"Crashed after {count[0]} recursions: {e}")  # Another use of format strings!
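For contrast with the unbounded example, here is a sketch of a bounded recursion with a base case, next to the same calculation written as a loop. Factorial is used only because it is short; the function names are made up for this example.

```python
def factorial_recursive(n: int) -> int:
    """Bounded recursion: the base case stops the chain of calls."""
    if n <= 1:  # Base case - no further recursion
        return 1
    return n * factorial_recursive(n - 1)

def factorial_loop(n: int) -> int:
    """Same result using a loop, which is not subject to the stack depth limit."""
    result = 1
    for k in range(2, n + 1):
        result *= k
    return result

print(factorial_recursive(5), factorial_loop(5))  # 120 120

# The loop version handles inputs that would exceed the default recursion limit
big = factorial_loop(5000)
print(f"5000! needs {big.bit_length()} bits")
```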

----

#### Mutability and Pass-by-Assignment

Where variables are passed to function calls like `function(my_variable)`, the variable may be passed either _by value_, meaning the actual content of the variable is copied into the function's workspace (stack frame) or it may be passed _by reference_, meaning only its memory address (pointer) is sent to the function, and the function has to look the value up.

Both pass-by-value and pass-by-reference methods are useful and will be found together in almost any code.

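Python does not choose between the two. Every argument is passed as a reference to an object, and the parameter name inside the function is assigned to that same object. Nothing is copied.

This is called "pass-by-assignment" (also known as "call by sharing").

As a rule of thumb, what the caller observes depends on the mutability of the object:
* **Mutable** objects like lists, dictionaries, arrays, and most class instances behave like pass-by-**reference**: changes made in place inside the function are visible to the caller.
* **Immutable** objects like numbers, strings, and tuples behave like pass-by-**value**: they can't be changed in place, so an operation like `x += 1` rebinds the local name to a new object and the caller's variable is unaffected.

Rebinding a parameter with plain assignment (`x = ...`) never affects the caller, even for mutable types. When in doubt, check.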
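One way to do that checking is with the `is` operator, which tests whether two names refer to the same object. The sketch below uses made-up function names to separate rebinding a parameter from mutating the object it refers to.

```python
def rebind(x):
    x = [99]       # Rebinds the local name only; the caller's object is untouched
    return x

def mutate(x):
    x.append(99)   # Changes the object in place; the caller sees it
    return x

def same_object(x):
    return x       # Hand back whatever object was received

original = [1, 2]

# The function receives the very same object, not a copy
print(same_object(original) is original)  # True

rebound = rebind(original)
print(original, rebound is original)  # [1, 2] False

mutated = mutate(original)
print(original, mutated is original)  # [1, 2, 99] True

# The same holds for immutable values: no copy is made on the way in
n = 12345
print(same_object(n) is n)  # True
```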

"Mutability" refers to whether a value can be changed after it is assigned. Some types are `immutable`, meaning they can't be changed.

As a rule of thumb,
* Mutable types are usually collections or classes, like lists, dictionaries, and sets
* Immutable types include numbers, strings, `bytes`, tuples, and `frozenset`
  * "Frozen" dataclasses and pydantic models attempt to create immutability for larger objects
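A minimal sketch of what immutability looks like in practice, using only the standard library. `Settings` is a hypothetical class made up for this example.

```python
from dataclasses import dataclass, FrozenInstanceError

point = (1, 2)
try:
    point[0] = 5  # Tuples do not support item assignment
except TypeError as e:
    print(f"TypeError: {e}")

name = "abc"
upper = name.upper()  # String methods return a new string; the original is unchanged
print(name, upper)  # abc ABC

@dataclass(frozen=True)
class Settings:  # Hypothetical example class
    tolerance: float = 0.02

settings = Settings()
try:
    settings.tolerance = 0.5  # Assigning to a field of a frozen dataclass raises
except FrozenInstanceError as e:
    print(f"FrozenInstanceError: {e}")
```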

In [None]:
def try_to_mutate(x: int):
    x += 1  # ints are immutable, so this rebinds the local name `x` to a new object

val_to_try = 1
try_to_mutate(val_to_try)  # This doesn't change `val_to_try`; only the function's local `x` was rebound
val_to_try # Same as before

In [None]:
def try_to_append(x: list[str]):
    x.append("asdf")

list_to_try = []

try_to_append(list_to_try)  # This works because lists are mutable and the function receives the same list object
print(list_to_try)

try_to_append(list_to_try)
print(list_to_try)

----

### Control Flow

Python has a few types of control flow:
* Loops: `for`, `while`, `break`, `continue`
  * `break` ends the enclosing loop
  * `continue` skips to the next iteration of the loop
* Branches: `if-elif-else`, `match-case`, ternary statements, or-defaults
  * Ternary statements and or-defaults are both shorthand for compressing common types of `if` statements
* Error handling: try-except-finally
  * `finally` block _always_ runs even if there is an error in the `try` block
  * Use this to shut down child processes, close files, delete secrets, etc

#### Loops

In [None]:
# Loops are pretty straightforward, but can be modified with `break` and `continue`

print("A simple for-loop")
for i in range(5):
    print(i)

print("\nA for-loop that skips even numbers")
for i in range(10):
    if i % 2 == 0:  # Modulo operator - calculate remainder of division
        continue
    print(i)

print("\nA simple while-loop")
i = 5
while i > 0:
    i -= 1
    print(i)

print("\nAn infinite loop with independent break condition")
err = 0.1
tol = 0.02
while True:
    err /= 2
    print(err)
    if err < tol:
        break

#### Branches

In [None]:
# `if-elif-else` and `match-case` provide different flavors of the same functionality.
# While they can produce the same logic, some things may be easier to express in one syntax or the other.

# Simple if-statement to map a value to its sign: -1, 0, or 1
a = 0
if a > 0:  # Condition checked first
    a = 1
elif a < 0:  # Condition checked second
    a = -1
else:  # Fall-through
    a = 0

# Match-case to map a value to its sign: -1, 0, or 1
# NOTE: Added in python 3.10! Don't use if you need compatibility with 3.9 or earlier
# NOTE: Does not guarantee exhaustiveness, but linters can help with this if phrased carefully.
# NOTE: Lean on "may not be bound" lint
b = 0
match b:
    case x if x > 0:  # Condition checked first
        b = 1
    case x if x < 0:  # Condition checked second
        b = -1
    case _:  # Fall-through
        b = 0

# Match-case used for structural pattern matching
p = (a, b)
match p:
    case (0, right):
        c = (right + 1) ** 2
    case (left, 0):
        c = (left + 1) ** 2
    case (left, right):  # All cases covered; no need for fall-through
        c = left * right
print(a, b, c)


In [None]:
# Ternary statements provide shorthand for assigning values based on the result of an `if-else` statement.
# They are often used in a similar way to or-defaulting to be more specific about the distinction between
# "truthiness" vs. missing values.

a = None  # NOTE: try replacing this value with `[]`

b = a if a is not None else "b"  # Ternary statement
c = a or "c"                     # or-defaulting

print(a, b, c)

In [None]:
# In both ternaries and or-defaults, the right side evaluates only if the left side fails.
#
# In the case of or-defaults, the right side may run in cases where there is a value
# present on the left but that value is not "truthy."

#   Tracer functions to show whether the right side evaluates
#   hijacking the `None` return from `print` to both print and return in a lambda
make_b = lambda: print("running alternate b") or 5
make_c = lambda: print("running alternate c") or 7

a = True  # NOTE: try replacing this value with `[]` or `None`

b = a if a is not None else make_b()
c = a or make_c()

print(a, b, c)

In [None]:
# The `or` keyword has some interesting and potentially unintuitive behavior.
# It doesn't behave quite like the `||` logical-or operator in other languages.

# `or` doesn't return a bool! It returns one of the values
print(f"\nNone or 5 = {None or 5}")

# If the left side fails, the right side is _always_ returned, even if it also fails
print(f"False or None = {False or None}")

print("\nValues can be truthy or falsey without an obvious boolean conversion")
print(f"[] or True = {[] or True}")
print(f"['asdf'] or [] = {['asdf'] or []}")

#### Error Handling

Python uses exceptions for error handling.

There is no structural method to find what exceptions may be raised by a function.
Best practice is to document what conditions may result in raising an exception from a function.
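One possible way to do that is a `Raises:` section in the docstring, following the same docstring layout used earlier in this notebook. `safe_ratio` is a hypothetical function made up for this sketch.

```python
def safe_ratio(numerator: float, denominator: float) -> float:
    """
    Divide two numbers.

    Args:
        numerator: value to divide
        denominator: value to divide by

    Returns:
        numerator / denominator

    Raises:
        ValueError: if denominator is zero
    """
    if denominator == 0:
        raise ValueError("denominator must be nonzero")
    return numerator / denominator

# Callers can now handle the documented error specifically
try:
    safe_ratio(1.0, 0.0)
except ValueError as e:
    print(f"Caught documented error: {e}")
```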

In [None]:
# Simplest case.
# NOTE: Only use this for active debugging!
# NOTE: Never leave these in final code. If it doesn't matter whether something runs, don't write it!
# NOTE: Uncomment for presentation (otherwise these intentional exceptions fail tests)

# try:
#     print("Hello `try`")
#     raise Exception
# except:
#     print("Hello `except`")

In [None]:
# Simplest case with a `finally` block
# NOTE: Uncomment for presentation (otherwise these intentional exceptions fail tests)

# try:
#     print("Hello `try`")
#     raise Exception
# except:
#     print("Hello `except`")
#     raise  # Raise whatever exception triggered this block
# finally:  # Run whether there is an exception or not
#     # This runs even if there is an error in the `except` block
#     print("Hello `finally`")

In [None]:
# Real error handling is specific

a = []
try:
    print("Getting first value of `a`!")
    first_elem = a[0]
except IndexError:
    print("Oops")
except ValueError:
    print("What")
    raise


In [None]:
# You can extract info about the error for logging or communicating to the user
# NOTE: Uncomment for presentation (otherwise these intentional exceptions fail tests)

# import traceback as tb

# try:
#     raise ValueError("nope")
# except ValueError as e:
#     print(f"Error: {type(e).__name__}: {e}")
#     execution_context = "".join(tb.format_tb(e.__traceback__))
#     print(f"\nError context: \n{execution_context}")

In [None]:
# Using a finally block to delete a password
# Otherwise it would be saved as plaintext in the notebook!
# NOTE: Uncomment for presentation (otherwise these intentional exceptions fail tests)

# from getpass import getpass

# pw = ""
# try:
#     # NOTE: `getpass` opens in the terminal that opened the notebook
#     # pw = getpass("Enter password: ")
#     pw = "secret password"
#     raise Exception("Oops")
# finally:
#     pw = "" # Erase
#     del pw  # Mark for garbage collection (does not delete immediately!)
#     print("Erased pw")

----

### Defaults, Additivity, and Breaking Changes

Python provides several overlapping methods for expressing the idea of default and optional values:

* Function argument defaults
* `or` operator
* `*args, **kwargs` unpacking

Defaults are at the core of maintainable, extensible code because they allow the addition of new features without breaking existing uses. Whenever possible, new features should be _additive_; they should be present _in addition_ to existing features, without interfering.

For example, when adding a new argument to an existing function, if possible, include a sensible default value that works for existing cases.

In [None]:
# Functions can have default values for some or all arguments.
# All arguments with defaults must come after all arguments without defaults.

def myfunc_with_defaults(a: int = 5, b: int = 6) -> float:
    return myfunc(a, b)

myfunc_with_defaults()
myfunc_with_defaults(4)
myfunc_with_defaults(b=7)
# myfunc_with_defaults(a=4, 7)  # nope

In [None]:
# ! LANDMINE !
# Mutable default arguments are shared global state!
# Never do this

def bad_mutable_default(foo: list[str] = []) -> list[str]:
    foo.append("asdf")
    return foo

print(bad_mutable_default())  # Same call, different result every time!
print(bad_mutable_default())
print(bad_mutable_default())
print(bad_mutable_default())

In [None]:
# Handling defaults for mutable types like lists and dictionaries can be done using a None-`or` pattern

def good_mutable_default(foo: list[str] | None = None) -> list[str]:
    foo = foo or []  # Triggers on empty list
    # foo = foo if foo is not None else []  # Does not trigger on empty list
    foo.append("asdf")
    return foo

print("\nAssigning defaults inside the function removes the shared global state")
print(good_mutable_default())
print(good_mutable_default())
print(good_mutable_default())

In [None]:
# None-`or` defaulting can also handle non-empty defaults.
#
# This has the disadvantage that if there are a fixed set of valid options, there's no indication
# in the function signature of what those valid options are, and the default values for each option
# are hidden in the function definition.

def good_default_with_values(bar: dict[str, int] | None = None) -> dict[str, int]:
    required_defaults = {
        "a": 5,
        "b": 9
    }

    # Merge the user's values over the defaults, prioritizing values from the user.
    # Building a new dictionary covers both the case where nothing is defined and the case
    # where only some required values are defined, without mutating the caller's dictionary.
    bar = {**required_defaults, **(bar or {})}

    return bar

print(good_default_with_values())
print(good_default_with_values(bar={"a": 3}))
print(good_default_with_values(bar={"c": 7}))

In [None]:
# Pydantic models provide an excellent way to capture such defaults while preserving fixed options,
# making the default values visible, and including required values.
#
# They can also include validation constraints on individual fields, like enforcing that a number be nonzero.

from typing import Literal  # Enums without the pain
import pydantic

class Opts(pydantic.BaseModel):
    # reqd: bool  # Required; no default value here
    a: int = 5
    b: int = 9
    c: Literal["forward", "backward"] = "backward"  # Only two valid options!
    d: list[float] = pydantic.Field(default_factory=list)  # Mutable defaults constructed on init

    # Optional configuration for pydantic class definition.
    # This allows mutation, but applies run-time type checking,
    # and will produce an error if an extra field is supplied to catch typos
    model_config = pydantic.ConfigDict(validate_assignment=True, frozen=False, extra="forbid")

def better_defaults(reqd: bool, opts: Opts | None = None) -> Opts:
    opts = opts or Opts()
    return opts

# print(better_defaults())
print(better_defaults(False))
print(better_defaults(True, opts=Opts(c="forward")))

In [None]:
# What's a breaking change? Anything that can break existing uses of the code

def func(a, b):
    print(a, b)

func(1, 2)

# Breaking change! Added a required argument
def func(a, b, c):
    print(a, b, c)

# Same use no longer works. Forced to update uses, not just the dep version!
# func(1, 2)  # nope

In [None]:
def func(a, b):
    print(a, b)

func(1, 2)

# Not a breaking change unless it's "functionality-breaking" - new argument is optional
def func(a, b, c=3):
    print(a, b, c)

# Same use still works, as long as this slightly different output is ok
func(1, 2)

#### What to do about breaking changes?
1. Plan ahead - think about how the code you write is likely to change in the future
2. Start with a restrictive interface, then relax over time - don't start general and then clamp down later!
3. Make new features additive: when making new features, add defaults for new fields or make new functions or classes
4. Just make the breaking change! But make sure to always document it properly by incrementing the major version
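As one possible sketch of option 3, the example below adds a new argument with a neutral default and keeps an old function alive as a thin wrapper around the new one. The functions `scale` and `double` are hypothetical, and the use of `DeprecationWarning` from the standard `warnings` module is an assumption about how one might flag the old name, not something prescribed above.

```python
import warnings

def scale(value: float, factor: float = 2.0, offset: float = 0.0) -> float:
    """Newer function; `offset` was added later with a default that changes nothing."""
    return value * factor + offset

def double(value: float) -> float:
    """Older function kept as a thin wrapper so existing callers keep working."""
    warnings.warn("double() is deprecated; use scale()", DeprecationWarning, stacklevel=2)
    return scale(value, factor=2.0)

print(scale(3.0))            # 6.0 - existing calls are unaffected by the new argument
print(scale(3.0, offset=1))  # 7.0 - new feature is opt-in

with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    print(double(3.0))       # 6.0 - old name still works
```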

#### Documenting breaking changes

Versions have meaning! They're not just arbitrary numbers based on feelings.

---
As a rule of thumb:

`12.7.3`

`_____^----` Patch version `3`: functionality-neutral tweaks and bug fixes


`12.7.3`

`___^----` Minor version `7`: additive changes; new features, but no breakage; backward-compatible


`12.7.3`

`^----` Major version `12`: breaking changes; not backward compatible

---

"Zero-versioning" refers to the practice of leaving the major version at zero during early development, when breaking changes are still happening regularly.

`0.7.3`

`____^----` Minor/Patch version `3`: Non-breaking changes


`0.7.3`

`__^----` Major version `7`: Breaking changes

----

Semantic versioning (semver) for a given language can be subtle, because it's not always obvious how breakage can occur. General info about semver can be found at https://semver.org/.
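The rules of thumb above can be sketched as a small check of whether a version change may include breaking changes. The helper names `parse_version` and `may_break` are made up for this example, and the parsing assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build tags.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split 'MAJOR.MINOR.PATCH' into integers. Does not handle pre-release or build tags."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def may_break(old: str, new: str) -> bool:
    """True if moving from `old` to `new` may include breaking changes."""
    old_major, old_minor, _ = parse_version(old)
    new_major, new_minor, _ = parse_version(new)
    if old_major != new_major:
        return True
    if old_major == 0:
        # Zero-versioning: the minor position plays the role of the major version
        return old_minor != new_minor
    return False

print(may_break("12.7.3", "12.8.0"))  # False
print(may_break("12.7.3", "13.0.0"))  # True
print(may_break("0.7.3", "0.7.4"))    # False
print(may_break("0.7.3", "0.8.0"))    # True
```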

### Iterators

Iterators provide a way to visit elements in a collection. They simply return the next value in the collection until they run out.

Iterators are generally of the same form, and have only two distinguishing characteristics: their return type and the size of the underlying collection - "what and how many".
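The "return the next value until they run out" behavior can be seen directly with the builtins `iter` and `next`. A minimal sketch:

```python
things = [10, 20]

it = iter(things)   # Ask the collection for an iterator
print(next(it))     # 10
print(next(it))     # 20

try:
    next(it)        # Nothing left, so this raises StopIteration
except StopIteration:
    print("Iterator is exhausted")

# A `for` loop does the same thing behind the scenes
for value in things:
    print(value)

# Iterators are single-use; a second pass over the same iterator yields nothing
it = iter(things)
first_pass = list(it)
second_pass = list(it)
print(first_pass, second_pass)  # [10, 20] []
```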

In [None]:
# Collections can be accessed by iterators via different methods

stuffmap = {"a": 1, "b": 2}

print(f"dict.items() gets key-value pairs:  {[(k, v) for k, v in stuffmap.items()]}")
print(f"dict.keys() gets just the keys:     {[*stuffmap.keys()]}")  # Splatting iterators!
print(f"dict.values() gets just the values: {[*stuffmap.values()]}")

In [None]:
thingsvec = [1, 2, 3]

# This works for most other ordered types like numpy arrays, too
print(f"lists implicitly generate iterators: {[x for x in thingsvec]}")

In [None]:
# `enumerate` adds an index to an existing iterator
print(f"`enumerate` is handy for keeping track of where you are in an iterator: {[(i, v) for i, v in enumerate(thingsvec)]}")

In [None]:
# `zip` is a builtin function for combining iterators into tuples of values

zipped_items = zip(stuffmap.keys(), stuffmap.values())  # Still an iterator
assert list(zipped_items) == list(stuffmap.items())  # Have to actualize the iterators to compare values

# `zip` can also be used to transpose a series of pairs into a pair of series
keys, vals = zip(*stuffmap.items())

print(keys, vals)

In [None]:
# `itertools` provides numerous methods for doing more interesting things with iterators

from itertools import product, chain, permutations, cycle  # ... and many more

print("Permutations of (1, 3) makes shuffled versions of the same data")
for x in permutations([1, 3]):
    print(x)
print("\n")

print("Cartesian product of (1, 2), (3, 4) makes all the pairs")
for x in product([1, 2], [3, 4]):
    print(x)
print("\n")

print("Chain of (1, 2), (3, 4, 5) flattens ragged data")
for x in chain([1, 2], [3, 4, 5]):
    print(x)
print("\n")

print("Cycle of ('red', 'green', 'blue') loops over the values forever")
color_cycle = cycle(('red', 'green', 'blue'))
for i in range(5):
    print(next(color_cycle))
print("...and so on\n")