# Agenda

0. Q&A
1. `**kwargs` and other parameter types
2. Scoping (LEGB)
3. Enclosing functions
4. Dispatch tables
5. Comprehensions
6. Sorting and `lambda` (and passing functions as arguments to other functions)
7. Type hints/annotations

In [1]:
d = {}
d['a'] = 10
hash('a') % 8

0

In [2]:
hash('b') % 8

2

In [3]:
hash('c') % 8

6

In [4]:
hash('d') % 8

3

In [5]:
for one_letter in 'abcdefghij':
    print(f'{one_letter}: {hash(one_letter) % 8}')

a: 0
b: 2
c: 6
d: 3
e: 0
f: 5
g: 6
h: 0
i: 6
j: 4


In [6]:
d['e'] = 500

In [7]:
d = {'a':10, 'b':20, 'c':30}

new_stuff = {'b':400, 'c':900, 'd':1600}

d.update(new_stuff)   # this modifies d, the "receiving" dict, to take all key-value pairs from new_stuff

d

{'a': 10, 'b': 400, 'c': 900, 'd': 1600}

In [8]:
d = {'a':10, 'b':20, 'c':30}
new_stuff = {'b':400, 'c':900, 'd':1600}

# we can, as of Python 3.9, use the | operator

d | new_stuff   # this returns a new dict based on merging d and new_stuff, but doesn't modify d or new_stuff

{'a': 10, 'b': 400, 'c': 900, 'd': 1600}

In [9]:
d

{'a': 10, 'b': 20, 'c': 30}

In [10]:
# if you do want to modify d, you can use |=

d |= new_stuff    # this behaves like dict.update: it modifies d in place

In [11]:
d

{'a': 10, 'b': 400, 'c': 900, 'd': 1600}

# Parameter types for functions

1. Mandatory parameters (positional or keyword arguments)
2. Optional parameters (positional or keyword arguments), with a default argument value stored in the function's `__defaults__` tuple
3. `*args`, where `args` is a tuple containing all of the positional arguments that no other parameter got

In [12]:
def add(first, second):
    return first + second

In [13]:
t = (10, 2)

add(t)  # will this work?  No... we're passing 1 argument, but add requires 2

TypeError: add() missing 1 required positional argument: 'second'

In [14]:
# we saw that we can "unroll" the elements of t:

add(*t)   # this turns the 2-element tuple into two separate arguments

12

# Keyword arguments

We saw that we can call a function with one or more keyword arguments, each of which looks like `NAME=VALUE`. All keyword arguments must come after all positional arguments.

`**kwargs` is the keyword-argument analog to `*args`, which only works with positional arguments. `kwargs` is a dict in which the keys are strings (the names of the keyword arguments that were passed) and the values are whatever values were associated with them. It contains all of the keyword arguments that no other parameter accepted.

In [15]:
def myfunc(a, b, **kwargs):
    return f'{a=}, {b=}, {kwargs=}'

In [16]:
myfunc(10, 20)  # just passing two positional arguments

'a=10, b=20, kwargs={}'

In [18]:
myfunc(a=10, b=20)  # known parameters get the values of the keyword arguments

'a=10, b=20, kwargs={}'

In [19]:
myfunc(a=10, b=20, c=30)

"a=10, b=20, kwargs={'c': 30}"

In [20]:
myfunc(a=10, b=20, c=30, d=40, e='hello', f={'x':1, 'y':2})

"a=10, b=20, kwargs={'c': 30, 'd': 40, 'e': 'hello', 'f': {'x': 1, 'y': 2}}"

# Why do we need `**kwargs`?

I can think of two reasons:

1. We have a function that takes *so* many arguments that the signature is unreadable. Instead, we can just accept `**kwargs`, and have the user pass whichever of the arguments they find interesting/necessary. We have to look in the dict to see what keys were passed, but the function signature is still more readable.
2. You have a function that knows what to do, but doesn't know what parameter names or values it'll get. This is typically/often for formatting purposes.

`**kwargs` must be the final parameter in the function's signature. If the function also has `*args`, then `**kwargs` comes after it.
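Just as `*t` unrolls a tuple into positional arguments, `**` in a function call unrolls a dict into keyword arguments. Here is a minimal sketch, reusing the `myfunc` definition from above; the dict name `options` is only an illustration:

```python
def myfunc(a, b, **kwargs):
    return f'{a=}, {b=}, {kwargs=}'

options = {'b': 20, 'c': 30}   # hypothetical dict of keyword arguments

# this call is equivalent to myfunc(10, b=20, c=30)
result = myfunc(10, **options)
print(result)
```

The key `'b'` matches a named parameter, so it is assigned there; `'c'` matches nothing, so it lands in `kwargs`.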

In [21]:
# what if I do this:

# parameters:   a    b    kwargs
# arguments     10   20    {'c':30}

myfunc(10, 20, c=30, b=200)

TypeError: myfunc() got multiple values for argument 'b'

# Parameter types for functions

1. Mandatory parameters (positional or keyword arguments)
2. Optional parameters (positional or keyword arguments), with a default argument value stored in the function's `__defaults__` tuple
3. `*args`, where `args` is a tuple containing all of the positional arguments that no other parameter got
4. `**kwargs`, where `kwargs` is a dict containing all of the keyword arguments that no other parameter got

In [22]:
def myfunc(a, b=10, *args, **kwargs):
    return f'{a=}, {b=}, {args=}, {kwargs=}'

In [23]:
myfunc(3)

'a=3, b=10, args=(), kwargs={}'

In [24]:
myfunc(3, x=100, y=200, z=300)

"a=3, b=10, args=(), kwargs={'x': 100, 'y': 200, 'z': 300}"

In [25]:
# how can I pass values to args?
# Just pass more positional arguments...

myfunc(3,4,5,6,7, x=100, y=200)

"a=3, b=4, args=(5, 6, 7), kwargs={'x': 100, 'y': 200}"

In [26]:
# how can I pass values to args
# and *not* overwrite the default value in b?

# you can't.

# but... we'll see an alternative solution soon

# Exercise: XML generator

Just to remind you, XML works based on "tags." Each tag has a name, optional attributes in the open tag, and optional content:

    <tagname></tagname>     # empty
    <name>Reuven</name>     # regular tag with content
    <a><b><c>d</c></b></a>  # nested tags
    <a x="1" y="2">b</a>    # attributes in the opening tag

I want you to write a function, `xml`, that takes 1, 2, or more arguments:

- If we pass a single argument, then that's the name of the tag, and we should return a string with its opening and closing tags, but no content.
- If we pass two arguments, then that's the name of the tag and its content. We should return a string with the tag (opening and closing), and the second argument between them.
- If we pass two arguments plus any keyword arguments, the keyword args are all turned into attributes inside of the opening tag. Note that officially, attributes should have a name, the `=` sign, and then a value inside of double quotes.

Example:

    xml('a')            # '<a></a>'
    xml('a', 'b')       # '<a>b</a>'
    xml('a', 'b', x=1)  # '<a x="1">b</a>'
    

In [34]:
def xml(tagname, content='', **kwargs):
    if not kwargs:
        print(f'\tNo attributes passed; expect none in the output')
    else:
        print(f'\tGot {len(kwargs)} attributes')
    
    attributes = ''
    for key, value in kwargs.items():
        attributes += f' {key}="{value}"'
    
    output = f'<{tagname}{attributes}>{content}</{tagname}>'
    return output


print(xml('a'))
print(xml('a', 'b'))
print(xml('a', 'b', x=1, y=2))

	No attributes passed; expect none in the output
<a></a>
	No attributes passed; expect none in the output
<a>b</a>
	Got 2 attributes
<a x="1" y="2">b</a>


# Parameter types for functions

1. Mandatory parameters (positional or keyword arguments)
2. Optional parameters (positional or keyword arguments), with a default argument value stored in the function's `__defaults__` tuple
3. `*args`, where `args` is a tuple containing all of the positional arguments that no other parameter got
4. `**kwargs`, where `kwargs` is a dict containing all of the keyword arguments that no other parameter got

In [35]:
def myfunc(a, b=10, *args, **kwargs):
    return f'{a=}, {b=}, {args=}, {kwargs=}'

In [36]:
myfunc(2,4,6,8)

'a=2, b=4, args=(6, 8), kwargs={}'

In [37]:
# if we want to leave b with its default, but at the same time be able to
# give positional argument values to args, we can move b=10 to *after* the mention of
# *args. Then it becomes a keyword-only parameter, one that can only get values with
# keyword arguments.

def myfunc(a, *args, b=10, **kwargs):
    return f'{a=}, {b=}, {args=}, {kwargs=}'

In [38]:
myfunc(2,4,6,8)

'a=2, b=10, args=(4, 6, 8), kwargs={}'

In [39]:
# in order to set b's value, we need to explicitly mention it as a keyword argument.

myfunc(2,4,6,8, b=999)

'a=2, b=999, args=(4, 6, 8), kwargs={}'

In [40]:
# what if we put b after *args, but we *don't* give it a default?
# then it becomes a mandatory keyword-only parameter; we must pass a value to it
# as a keyword argument when we invoke the function.

#          pos/kw    pos only      kw only
def myfunc(a,        *args,        b,         **kwargs):   # notice that b no longer has a default
    return f'{a=}, {b=}, {args=}, {kwargs=}'

In [41]:
myfunc(2,4,6,8)

TypeError: myfunc() missing 1 required keyword-only argument: 'b'

In [42]:
myfunc(2,4,6,8, b=999)

'a=2, b=999, args=(4, 6, 8), kwargs={}'

# Parameter types for functions

1. Mandatory parameters (positional or keyword arguments)
2. Optional parameters (positional or keyword arguments), with a default argument value stored in the function's `__defaults__` tuple
3. `*args`, where `args` is a tuple containing all of the positional arguments that no other parameter got, or `*` to indicate the end of the positional parameters
4. Mandatory keyword-only parameters
5. Optional keyword-only parameters (with defaults)
6. `**kwargs`, where `kwargs` is a dict containing all of the keyword arguments that no other parameter got

In [45]:
# what if I want to write a function whose parameters are *all*
# keyword only?

def myfunc(*, a, b):   # if we put * in the list of parameters, that means: after here, only kw args, but no *args
    return a + b

In [46]:
myfunc(a=10, b=20)

30

In [47]:
myfunc(10, 20)

TypeError: myfunc() takes 0 positional arguments but 2 were given

In [48]:
len('abcd')

4

In [49]:
help(len)  # what does this function do, and what is its signature?

Help on built-in function len in module builtins:

len(obj, /)
    Return the number of items in a container.



# / in a function signature

`/` indicates that up to that point, all of the parameters are positional only. You cannot pass them as keyword arguments.



In [50]:
def myfunc(a, b, /):
    return a + b

In [51]:
myfunc(a=10, b=20)

TypeError: myfunc() got some positional-only arguments passed as keyword arguments: 'a, b'

In [52]:
myfunc(10, 20)

30

# Who needs this?

If your function takes both positional arguments *and* `**kwargs`, it's possible that someone will pass a keyword argument that was meant to end up in `kwargs`, but whose name happens to match one of the named parameters. Using `/` allows us to have a parameter, and assign to it positionally, but also to accept keyword arguments with that same name.

In [53]:
def write_config(filename, **kwargs):    # let's write the key-value pairs of kwargs into filename, one line at a time
    with open(filename, 'w') as f:
        for key, value in kwargs.items():
            f.write(f'{key}={value}\n')

In [54]:
write_config('config.txt', a=10, b=20, c=30)

In [55]:
!cat config.txt

a=10
b=20
c=30


In [56]:
# but what happens if I do this:

write_config('config.txt', name='Reuven', filename='myfile.txt')

TypeError: write_config() got multiple values for argument 'filename'

In [57]:
# let's rewrite the function such that filename is positional only
# then we won't have that clash!

def write_config(filename, /, **kwargs): 
    with open(filename, 'w') as f:
        for key, value in kwargs.items():
            f.write(f'{key}={value}\n')

In [58]:
write_config('config.txt', 
             name='Reuven', filename='myfile.txt')

In [59]:
!cat config.txt

name=Reuven
filename=myfile.txt


# Parameter types for functions

1. Positional only, followed by `/`
2. Mandatory parameters (positional or keyword arguments)
3. Optional parameters (positional or keyword arguments), with a default argument value stored in the function's `__defaults__` tuple
4. `*args`, where `args` is a tuple containing all of the positional arguments that no other parameter got, or `*` to indicate the end of the positional parameters
5. Mandatory keyword-only parameters
6. Optional keyword-only parameters (with defaults)
7. `**kwargs`, where `kwargs` is a dict containing all of the keyword arguments that no other parameter got
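To see all of these parameter types side by side, here is a sketch of a single function that uses every one of them. The function name `show` and its parameter names are made up for this illustration:

```python
def show(pos_only, /, normal, optional=10, *args,
         kw_required, kw_optional=99, **kwargs):
    # return everything so we can see which parameter got which argument
    return (pos_only, normal, optional, args,
            kw_required, kw_optional, kwargs)

result = show(1, 2, 3, 4, 5, kw_required=6, extra=7)
print(result)
# (1, 2, 3, (4, 5), 6, 99, {'extra': 7})

print(show.__defaults__)     # (10,) -- defaults for positional-or-keyword parameters
print(show.__kwdefaults__)   # {'kw_optional': 99} -- defaults for keyword-only parameters
```

Note that defaults for keyword-only parameters are stored separately, in the function's `__kwdefaults__` dict.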

# Return values

- If a function fails to use `return`, then it automatically returns `None` when the function exits.
- If a function uses just `return` with no value after it, then it returns `None`
- A function can otherwise return any value it wants, with no limits or exceptions. This means that we can return a tuple of values, effectively returning more than one value from our function.


In [60]:
# example : a function that takes any number of integers and returns the min and max of them

def min_and_max(*numbers):
    return min(numbers), max(numbers)

In [61]:
min_and_max(10, 20, -5, -7, 23, 1)

(-7, 23)

In [62]:
# I can use unpacking to retrieve them into separate variables
smallest, largest = min_and_max(10, 20, -5, -7, 23, 1)

In [63]:
smallest

-7

In [64]:
largest

23

In [67]:
def myfunc():
    print(2+3)  # this function might print something, but it returns None

In [68]:
myfunc()

5


In [69]:
x = myfunc()

5


In [70]:
type(x)

NoneType

In [71]:
# I can use a type annotation here 
# name is a string, and the function returns a string
def hello(name:str) -> str:
    return f'Hello, {name}!'

In [72]:
hello(5)

'Hello, 5!'

# Scoping

The whole idea of scoping is: When do variables exist, and when do they cease to exist?

This is useful if we reuse the same variable name in multiple programs/functions. Also, if we want access to a variable, is it still around?

At a very basic level, Python only really has two scopes:
- Global (everything outside of a function)
- Local (everything inside of a function)

In [74]:
x = 100

for i in range(10):   # no function body, thus no new scope -- thus, x and i are global variables
    x = i ** 3

x

729

In [77]:
# this prints 100 -- but why?
# we're in the global scope, and Python looks for a global variable x
# where does it look? It says 'x' in globals(), finds it, and retrieves globals()['x']

x = 100

print(f'x = {x}')

x = 100


In [76]:
'x' in globals()

True

In [78]:
# let's add a function to the mix

x = 100

def myfunc():
    print(f'In myfunc, x = {x}')   # is x local? No. is x global? yes!
    
print(f'before, x = {x}')
myfunc()
print(f'after, x = {x}')

before, x = 100
In myfunc, x = 100
after, x = 100


# Scoping search path in Python: LEGB

Python has a total of *four* scopes. It searches through each of them when it needs to find a variable. Actually, it searches through all four when we're in a function, and only in the final two when we're outside of a function:

- `L` -- local, for local variables (this is where we start if we're in a function)
- `E` -- enclosing, if our function was defined inside of another function
- `G` -- global, for global variables (this is where we start if we're outside of a function)
- `B` -- builtin, names that are defined by Python itself and are always available
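A small sketch that touches all four scopes in a single expression may help. The function and variable names here are invented for the illustration, and it uses a nested function, which is what gives us an enclosing scope:

```python
x = 'global x'

def outer():
    y = 'enclosing y'

    def inner():
        z = 'local z'
        # z is found in L, y in E, x in G, and len in B
        return z, y, x, len('abcd')

    return inner()

result = outer()
print(result)
# ('local z', 'enclosing y', 'global x', 4)
```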

In [79]:
# what if we assign to x inside of our function?

x = 100

def myfunc():
    x = 200
    print(f'In myfunc, x = {x}') # is x local (i.e., 'x' in __code__.co_varnames)? yes, and its value is 200
    
print(f'before, x = {x}')  # is x global? yes, and its value is 100
myfunc()
print(f'after, x = {x}')   # is x global (still)? yes, 100

before, x = 100
In myfunc, x = 200
after, x = 100


In [80]:
# when did Python notice that x was local inside of myfunc? Not when we ran it, but 
# when we defined it.

myfunc.__code__.co_varnames   # local variable names in myfunc, as recorded at definition/compile time

('x',)

In [81]:
# let's change things a little bit

x = 100

def myfunc():
    print(f'In myfunc, x = {x}') # is x local? yes ... but it has no value
    x = 200   # this, anywhere in the function, forces x to be local
    
print(f'before, x = {x}')  # is x global? yes, 100
myfunc()
print(f'after, x = {x}')  

before, x = 100


UnboundLocalError: cannot access local variable 'x' where it is not associated with a value

In [84]:
def myfunc():
    y += 1    # y was marked as local at compile time; at runtime, we're saying y = y + 1

myfunc()

UnboundLocalError: cannot access local variable 'y' where it is not associated with a value

In [85]:
myfunc.__code__.co_varnames

('y',)

In [86]:
# what if we want, from inside of our function, to modify the global x?

x = 100

def myfunc():
    global x   # this tells Python's compiler: Don't mark x as local, even if/when we assign to it
    x = 200    # here, we are assigning to the global x
    print(f'In myfunc, x = {x}')
    
print(f'before, x = {x}') 
myfunc()
print(f'after, x = {x}')  

before, x = 100
In myfunc, x = 200
after, x = 200


In [87]:
myfunc.__code__.co_varnames

()

In [88]:
# one variant: Don't assign to the variable, and don't use global
# rather, use import __main__

import __main__    # this module puts all global variables into the __main__ namespace as attributes
x = 100

def myfunc():
    __main__.x = 200    # now I'm not assigning to a global; I'm assigning to an attribute on a module
    print(f'In myfunc, x = {x}')
    
print(f'before, x = {x}') 
myfunc()
print(f'after, x = {x}')  

before, x = 100
In myfunc, x = 200
after, x = 200


In [89]:
# another variant: Remember that assigning to a variable triggers Python's local scoping at compile
# time. But mutating/modifying an existing value via a global variable works just fine without 'global'

y = [10, 20, 30]

def myfunc():
    y[0] = '!'  # modify/mutate y -- really, we're invoking y.__setitem__(0, '!')
    print(f'In myfunc, y = {y}')
    
print(f'before, y = {y}') 
myfunc()
print(f'after, y = {y}')  

before, y = [10, 20, 30]
In myfunc, y = ['!', 20, 30]
after, y = ['!', 20, 30]


# Next up

1. Global and builtin scopes
2. Enclosing/inner functions, closures, and the "enclosing" scope

Resume at :35

Scopes: LEGB -- local, enclosing, global, builtin



In [90]:
# we can access a global variable from within our functions

x = 100

def myfunc():
    print(f'In myfunc, x = {x}')   # is x local? no. is x global? yes! Get its value, and print it
    
print(f'before, x = {x}') 
myfunc()
print(f'after, x = {x}')  

before, x = 100
In myfunc, x = 100
after, x = 100


In [91]:
# consider if we define *two* functions, and then call one from within the other

x = 100

def hello():
    print('Hello from the "hello" function!')

def myfunc():
    hello()   # is hello local? No. Is hello global? Yes, get its value and then () mean execute it
    
print(f'before, x = {x}') 
myfunc()
print(f'after, x = {x}')  

before, x = 100
Hello from the "hello" function!
after, x = 100


# Globals and builtins

Python has some keywords, meaning words that we cannot redefine, such as `def`, and `for` and `while` and `if`.

But many other names that we would think are keywords are actually just names, which we can redefine. These are the "builtin" names, and they have their own namespace. That namespace is checked *after* the local and global namespaces.

In the builtin namespace, we have such names as `int`, `str`, `list`, `dict`, `sum`, `len`, etc.

You can accidentally (or purposely!) redefine these names.

In [93]:
# I need a list, so I'll just name it "list"

list = [100, 200, 300]  # this is a new global variable "list"

In [94]:
# what happens now if I want to invoke "list" on a string?

list('abcd') 

TypeError: 'list' object is not callable

In [95]:
list = 2

In [96]:
list('abcd')

TypeError: 'int' object is not callable

In [97]:
# how can we get out of this mess?
# answer: delete the global name that is "shadowing" the builtin name

del(list)  # this looks scary, but it's actually just deleting the global (not the builtin)

In [98]:
list('abcd')

['a', 'b', 'c', 'd']

In [99]:
del(list)

NameError: name 'list' is not defined
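Even while a global name is shadowing a builtin, the original is still reachable through the standard library's `builtins` module. A minimal sketch (the variable names `message` and `letters` are just for this example):

```python
import builtins

list = [100, 200, 300]             # this name now shadows the builtin list

try:
    list('abcd')
except TypeError as e:
    message = str(e)               # the same error we saw above

letters = builtins.list('abcd')    # the builtin itself was never changed
print(message)
print(letters)

del list                           # remove the shadowing name again
```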

In [100]:
dir(__builtins__)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BaseExceptionGroup',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'ExceptionGroup',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeErr

In [102]:
len(dir(__builtins__))

160

# A few things to remember

1. We can, in our function body, execute just about any Python code we want.
2. When we use `def`, we (a) create a new function object and (b) assign it to a variable.
3. When you assign to a variable in a function, that variable is local.
4. You can return any type of value from a function.

In [103]:
# consider this:

def myfunc():
    def inner():
        print('Hello from inner!')
    return inner

In [104]:
f = myfunc()

In [105]:
type(f)

function

In [106]:
f()

Hello from inner!


In [107]:
myfunc.__code__.co_varnames

('inner',)

In [108]:
# let's give myfunc and inner parameters

def myfunc(x):
    def inner(y):
        print(f'Hello from inner, {x=} and {y=}!')
    return inner

In [109]:
f = myfunc(10)

In [111]:
f(20)  # yes, we have access to y, our local variable -- but we also have access to x, the local variable from myfunc!

Hello from inner, x=10 and y=20!


In [112]:
# closure -- a function returned from another function, which has access to the outer function's local variables
# this is E, the enclosing scope. An inner function can read from the outer function's locals

In [114]:
f.__code__.co_freevars  # tuple of variables from the enclosing scope that I need access to

('x',)

In [115]:
myfunc.__code__.co_cellvars  # tuple of variables in the outer function that need to be available to inner functions

('x',)
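The values themselves travel with the inner function, in its `__closure__` attribute: a tuple of cell objects, one per name in `co_freevars`. A sketch, using a variant of the function above that returns its string rather than printing it:

```python
def myfunc(x):
    def inner(y):
        return f'{x=} and {y=}'
    return inner

f = myfunc(10)

# pair each free variable's name with the current contents of its cell
closure_values = {name: cell.cell_contents
                  for name, cell in zip(f.__code__.co_freevars, f.__closure__)}

print(closure_values)
# {'x': 10}
```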

# Exercise: Password maker maker

We're going to write a function to which we pass a string containing the characters from which we want to create passwords. It'll return a function to which we can pass an integer, and get back a randomly generated string of the specified length, drawn from that set of characters.

The idea is that we can (only once) define the set of characters we want in our password, and then invoke the function many times with different lengths.

Example:

    make_digits_password = make_password_maker('123456')
    make_digits_password(5)  # returns a string such as '52154'

    make_symbol_password = make_password_maker('!@#$%')
    make_symbol_password(5)  # returns a string such as '%$#@@'

    

In [117]:
import random

def make_password_maker(s):
    def make_password(n):
        output = []        
        for i in range(n):
            output.append(random.choice(s))
        return ''.join(output)
    return make_password

make_digits_password = make_password_maker('123456')
make_digits_password(5) 

'56244'

In [118]:
make_symbol_password = make_password_maker('!@#$%')
make_symbol_password(5)  # returns a random string, such as '%$#@@'

'@##$@'

In [121]:
# what if we want to go a bit further?

def myfunc():
    def inner(y):
        return f'Hello from inner, where {y=}!'
    return inner

f = myfunc()
print(f(20))

Hello from inner, where y=20!


In [122]:
# I want to have the inner function not only return the string, but also
# the number of times that we have called the function.

def myfunc():
    def inner(y):
        count = 0
        output = f'{count} Hello from inner, where {y=}!'
        count += 1
        return output
    return inner

f = myfunc()
print(f(20))
print(f(30))
print(f(40))

0 Hello from inner, where y=20!
0 Hello from inner, where y=30!
0 Hello from inner, where y=40!


In [123]:
# let's try again, but define count in the *outer* function

def myfunc():
    count = 0
    def inner(y):
        output = f'{count} Hello from inner, where {y=}!'
        count += 1
        return output
    return inner

f = myfunc()
print(f(20))
print(f(30))
print(f(40))

UnboundLocalError: cannot access local variable 'count' where it is not associated with a value

In [125]:
# we can tell Python's compiler to see "count" not as a local
# variable, but as a local variable in the enclosing scope.

# the way to do this is with the "nonlocal" keyword

def myfunc():
    count = 0
    
    def inner(y):
        nonlocal count  # count now refers to the variable in the enclosing function, so assigning to it works
        output = f'{count} Hello from inner, where {y=}!'
        count += 1
        return output
    return inner

f = myfunc()
print(f(20))
print(f(30))
print(f(40))

g = myfunc()
print(g(20))
print(g(30))
print(g(40))


0 Hello from inner, where y=20!
1 Hello from inner, where y=30!
2 Hello from inner, where y=40!
0 Hello from inner, where y=20!
1 Hello from inner, where y=30!
2 Hello from inner, where y=40!


In [127]:
# let's replace "nonlocal" with "global"


count = 0

def myfunc():
    count = 0  # inner ignores this one, because it declares "global count"
    
    def inner(y):
        global count 
        output = f'{count} Hello from inner, where {y=}!'
        count += 1
        return output
    return inner

f = myfunc()
print(f(20))
print(f(30))
print(f(40))

g = myfunc()
print(g(20))
print(g(30))
print(g(40))

0 Hello from inner, where y=20!
1 Hello from inner, where y=30!
2 Hello from inner, where y=40!
3 Hello from inner, where y=20!
4 Hello from inner, where y=30!
5 Hello from inner, where y=40!


In [130]:
def outer():
    count = 0

    def middle():

        def inner():
            nonlocal count
            count += 1
            print(f'{count=}')
        return inner
    return middle

m = outer()
i = m()
i()
i()
i()

count=1
count=2
count=3


In [131]:
def a():
    return 'a'

def b():
    return 'b'

# I want to let the user choose which function to run

while True:
    s = input('Enter function name: ').strip()

    if not s:
        break

    elif s == 'a':
        print(a())
    elif s == 'b':
        print(b())
    else:
        print('Unknown request')

Enter function name:  a


a


Enter function name:  b


b


Enter function name:  c


Unknown request


Enter function name:  


In [134]:
# Dispatch tables

def a():
    return 'a'

def b():
    return 'b'

funcs = {'a': a,    # the keys are strings
         'b': b}    # the values are functions (function objects)

while True:
    s = input('Enter function name: ').strip()

    if not s:
        break

    elif s in funcs:    # do we see the user's choice in the dict?
        print(funcs[s]())
    else:
        print('Unknown request')

Enter function name:  a


a


Enter function name:  b


b


Enter function name:  
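Because the dispatch table is just a dict, we can also fold the "unknown" case into the lookup with `dict.get` and a fallback function. This is a sketch of that variation; the `unknown` and `dispatch` functions are names invented here:

```python
def a():
    return 'a'

def b():
    return 'b'

def unknown():
    return 'Unknown request'

funcs = {'a': a,
         'b': b}

def dispatch(name):
    # .get returns the fallback function when the key is missing;
    # either way we get a function back, and then call it
    return funcs.get(name, unknown)()

print(dispatch('a'))
print(dispatch('c'))
```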


# Exercise: Calculators

Our goal is to let the user enter a math expression (with `+` or `-`) and then invoke the appropriate function for adding or subtracting. The user will type something like `2 + 3`. You'll break that apart and then call the right function.

I want you to use a dispatch table for storing, and then choosing, the function that we invoke.

Example:

    Enter math expression: 2 + 3
    2 + 3 = 5
    Enter math expression: 12 - 5
    12 - 5 = 7
    Enter math expression: [ENTER]
    Bye!

In [135]:
pattern = r'(\d+)\s+([-+])\s+(\d+)'
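The solution below uses `str.split` rather than this pattern. For comparison, here is a sketch of how the pattern could be used to break the input apart with the `re` module; the helper name `parse` is an assumption made for this example:

```python
import re

pattern = r'(\d+)\s+([-+])\s+(\d+)'

def parse(s):
    """Return (first, op, second), or None if s doesn't match the pattern."""
    m = re.fullmatch(pattern, s.strip())
    if m is None:
        return None
    first, op, second = m.groups()
    return int(first), op, int(second)

print(parse('2 + 3'))     # (2, '+', 3)
print(parse('12 - 5'))    # (12, '-', 5)
print(parse('2 * 10'))    # None, because the pattern only allows + and -
```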

In [138]:
def add(x, y):
    return x + y

def sub(x, y):
    return x - y

def mul(x, y):
    return x * y

ops = {'+':add,
       '-':sub,
       '*':mul}

while s := input('Enter math expression: ').strip():
    try:
        first, op, second = s.split()   # raises ValueError unless we get exactly three fields
        first_n = int(first)
        second_n = int(second)
    
        if op in ops:
            result = ops[op](first_n, second_n)
        else:
            result = f'Bad operator {op}'
    
        print(f'{first} {op} {second} = {result}')

    except ValueError as e:
        print(f'Try again: {e}')

Enter math expression:  2 * 10


2 * 10 = 20


Enter math expression:  


In [141]:
# let's take advantage of the "operator" module

import operator

ops = {'+':operator.add,
       '-':operator.sub,
       '*':operator.mul}

while s := input('Enter math expression: ').strip():
    try:
        first, op, second, *ignored_stuff = s.split()   # raises ValueError if we get fewer than three fields
        first_n = int(first)
        second_n = int(second)
    
        if op in ops:
            result = ops[op](first_n, second_n)
        else:
            result = f'Bad operator {op}'
    
        print(f'{first} {op} {second} = {result}')

    except ValueError as e:
        print(f'Try again: {e}')

Enter math expression:  2 + 3 + 4


2 + 3 = 5


Enter math expression:  2 + 1


2 + 1 = 3


Enter math expression:  


# Type hints

In [142]:
def hello(name:str) -> str:
    return f'Hello, {name}!'

In [143]:
hello(5)

'Hello, 5!'

In [144]:
# where are these type hints stored in Python?

hello.__annotations__

{'name': str, 'return': str}

# Exercise: Calculator with typing

Rewrite the calculator exercise such that `mypy --strict` doesn't complain.
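One way the annotations might look is sketched below. The alias `BinaryOp` and the function `calc` are names chosen for this example, the interactive loop is left out, and it assumes Python 3.9 or later for the `dict[...]` syntax. Treat it as a starting point; whether `mypy --strict` accepts it as-is can depend on the mypy version.

```python
import operator
from collections.abc import Callable

# a function that takes two ints and returns an int
BinaryOp = Callable[[int, int], int]

ops: dict[str, BinaryOp] = {'+': operator.add,
                            '-': operator.sub,
                            '*': operator.mul}

def calc(s: str) -> str:
    first, op, second = s.split()
    if op not in ops:
        return f'Bad operator {op}'
    result = ops[op](int(first), int(second))
    return f'{first} {op} {second} = {result}'

print(calc('2 + 3'))
print(calc('12 - 5'))
```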


# Next up

1. Comprehensions (list, set, dict, nested)
2. Passing functions as arguments

Resume at 13:30 Paris Time

# Comprehensions

The point is to create a new data structure based on an existing iterable.



In [145]:
numbers = list(range(10))

# if I want all of these numbers squared, and in a list

# method 1: a for loop
output = []

for one_number in numbers:
    output.append(one_number ** 2)

output

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [146]:
# method 2: use a comprehension

output = [one_number ** 2             # SELECT  / expression
          for one_number in numbers]  # FROM    / iteration

output

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

When should we use a comprehension, vs. a regular `for` loop?

A comprehension is definitely appropriate when:
- You have an existing iterable data structure
- You want a new list (or dict or set) based on that iterable
- For each element, you can describe a Python expression that maps from one to the other
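The same expression-plus-iteration structure works for sets and dicts, too; only the brackets (and, for dicts, the `key: value` expression) change. A short sketch, with variable names made up for the illustration:

```python
words = 'This is a bunch of words for my course'.split()

lengths_list = [len(w) for w in words]       # list comprehension: keeps order and duplicates
lengths_set = {len(w) for w in words}        # set comprehension: unique lengths only
lengths_dict = {w: len(w) for w in words}    # dict comprehension: word -> length

print(lengths_list)   # [4, 2, 1, 5, 2, 5, 3, 2, 6]
print(lengths_set)
print(lengths_dict)
```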

In [149]:
s = 'This is a bunch of words for my course'

# I want to find out the total number of letters in this sentence

sum([len(one_word)
 for one_word in s.split()])

30

In [150]:
def get_squares():  
    return [one_number ** 2     
          for one_number in numbers] 

import dis

    

In [151]:
dis.dis(get_squares)

  1           0 RESUME                   0

  3           2 LOAD_GLOBAL              0 (numbers)

  2          12 GET_ITER
             14 LOAD_FAST_AND_CLEAR      0 (one_number)
             16 SWAP                     2
             18 BUILD_LIST               0
             20 SWAP                     2
        >>   22 FOR_ITER                 7 (to 40)

  3          26 STORE_FAST               0 (one_number)

  2          28 LOAD_FAST                0 (one_number)
             30 LOAD_CONST               1 (2)
             32 BINARY_OP                8 (**)
             36 LIST_APPEND              2
             38 JUMP_BACKWARD            9 (to 22)
        >>   40 END_FOR
             42 SWAP                     2
             44 STORE_FAST               0 (one_number)
             46 RETURN_VALUE
        >>   48 SWAP                     2
             50 POP_TOP
             52 SWAP                     2
             54 STORE_FAST               0 (one_number)
             56 RERA

In [156]:
# we can work with a file, too

[one_line.split(':')[0]                # expression is only evaluated + output if the condition is True
 for one_line in open('/etc/passwd')   # iteration
 if not one_line.startswith('#') ]     # any condition at all -- with any methods/operators/etc.

['nobody',
 'root',
 'daemon',
 '_uucp',
 '_taskgated',
 '_networkd',
 '_installassistant',
 '_lp',
 '_postfix',
 '_scsd',
 '_ces',
 '_appstore',
 '_mcxalr',
 '_appleevents',
 '_geod',
 '_devdocs',
 '_sandbox',
 '_mdnsresponder',
 '_ard',
 '_www',
 '_eppc',
 '_cvs',
 '_svn',
 '_mysql',
 '_sshd',
 '_qtss',
 '_cyrus',
 '_mailman',
 '_appserver',
 '_clamav',
 '_amavisd',
 '_jabber',
 '_appowner',
 '_windowserver',
 '_spotlight',
 '_tokend',
 '_securityagent',
 '_calendar',
 '_teamsserver',
 '_update_sharing',
 '_installer',
 '_atsserver',
 '_ftp',
 '_unknown',
 '_softwareupdate',
 '_coreaudiod',
 '_screensaver',
 '_locationd',
 '_trustevaluationagent',
 '_timezone',
 '_lda',
 '_cvmsroot',
 '_usbmuxd',
 '_dovecot',
 '_dpaudio',
 '_postgres',
 '_krbtgt',
 '_kadmin_admin',
 '_kadmin_changepw',
 '_devicemgr',
 '_webauthserver',
 '_netbios',
 '_warmd',
 '_dovenull',
 '_netstatistics',
 '_avbdeviced',
 '_krb_krbtgt',
 '_krb_kadmin',
 '_krb_changepw',
 '_krb_kerberos',
 '_krb_anonymous',
 '_asse
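The three parts of that comprehension (expression, iteration, condition) map directly onto a `for` loop with an `if` inside it. Here is a small sketch of that equivalence. `sample_lines` is made-up data standing in for the file, so it runs anywhere.

```python
# Hypothetical sample lines standing in for a passwd-style file
sample_lines = [
    '# a comment line\n',
    'root:*:0:0:System Administrator:/var/root:/bin/sh\n',
    'daemon:*:1:1:System Services:/var/root:/usr/bin/false\n',
]

# Comprehension form: expression, iteration, condition
usernames = [one_line.split(':')[0]
             for one_line in sample_lines
             if not one_line.startswith('#')]

# Equivalent loop form
usernames_loop = []
for one_line in sample_lines:
    if not one_line.startswith('#'):
        usernames_loop.append(one_line.split(':')[0])

print(usernames)                     # ['root', 'daemon']
print(usernames == usernames_loop)   # True
```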

In [157]:
!head shoe-data.txt

Adidas	orange	43
Nike	black	41
Adidas	black	39
New Balance	pink	41
Nike	white	44
New Balance	orange	38
Nike	pink	44
Adidas	pink	44
New Balance	orange	39
New Balance	black	43


# Exercise: Shoe info

The file `shoe-data.txt` contains 100 lines, each with three columns: brand, color, and size. The columns are separated with tab characters (`'\t'`).

Use a list comprehension to create a list of dicts where each dict contains three key-value pairs with `brand`, `color`, and `size`.

Get the file from https://files.lerner.co.il/advanced-exercise-files.zip .

In [162]:
filename = 'shoe-data.txt'

def line_to_dict(s):
    brand, color, size = s.strip().split('\t')    
    return {'brand': brand,
            'color': color,
            'size': size}

[line_to_dict(one_line)
 for one_line in open(filename)]

[{'brand': 'Adidas', 'color': 'orange', 'size': '43'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'black', 'size': '39'},
 {'brand': 'New Balance', 'color': 'pink', 'size': '41'},
 {'brand': 'Nike', 'color': 'white', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '38'},
 {'brand': 'Nike', 'color': 'pink', 'size': '44'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '39'},
 {'brand': 'New Balance', 'color': 'black', 'size': '43'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '44'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '37'},
 {'brand': 'Adidas', 'color': 'black', 'size': '38'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '41'},
 {'brand': 'Adidas', 'color': 'white', 'size': '36'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '36'},
 {'brand': 'Nike', 'color': 'pink', 'size': '41'},
 {'brand': '
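Notice that every `size` in that output is a string, because everything read from a text file is a string. If we want to compare or sort by size numerically, one option is to convert it inside the helper function. This is a sketch, not part of the exercise solution. `line_to_dict_int_size` is a hypothetical name, and `sample_lines` copies a few lines from the `head` output above.

```python
# A few lines copied from the head output above, with tabs between the columns
sample_lines = [
    'Adidas\torange\t43\n',
    'Nike\tblack\t41\n',
    'New Balance\tpink\t41\n',
]


def line_to_dict_int_size(s):
    brand, color, size = s.strip().split('\t')
    return {'brand': brand,
            'color': color,
            'size': int(size)}   # convert the size to an integer


shoes = [line_to_dict_int_size(one_line)
         for one_line in sample_lines]

print(shoes[0])   # {'brand': 'Adidas', 'color': 'orange', 'size': 43}
```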

In [165]:
# let's do it using zip!

filename = 'shoe-data.txt'

def line_to_dict(s):
    return dict(zip(['brand', 'color', 'size'],
                     s.strip().split('\t')))
    
[line_to_dict(one_line)
 for one_line in open(filename)]

[{'brand': 'Adidas', 'color': 'orange', 'size': '43'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'black', 'size': '39'},
 {'brand': 'New Balance', 'color': 'pink', 'size': '41'},
 {'brand': 'Nike', 'color': 'white', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '38'},
 {'brand': 'Nike', 'color': 'pink', 'size': '44'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '39'},
 {'brand': 'New Balance', 'color': 'black', 'size': '43'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '44'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '37'},
 {'brand': 'Adidas', 'color': 'black', 'size': '38'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '41'},
 {'brand': 'Adidas', 'color': 'white', 'size': '36'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '36'},
 {'brand': 'Nike', 'color': 'pink', 'size': '41'},
 {'brand': '

In [164]:
dict(zip('abcd', 
         [10, 20, 30, 40]))

{'a': 10, 'b': 20, 'c': 30, 'd': 40}
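It helps to look at what `zip` produces before `dict` gets hold of it: an iterable of 2-element tuples, which `dict` treats as key-value pairs. Also, `zip` stops at the shortest input, so a missing value means a missing key rather than an error. A small sketch, with made-up values:

```python
keys = 'abcd'
values = [10, 20, 30]          # one value short, on purpose

pairs = list(zip(keys, values))
print(pairs)                   # [('a', 10), ('b', 20), ('c', 30)]

d = dict(pairs)
print(d)                       # {'a': 10, 'b': 20, 'c': 30}
print('d' in d)                # False
```

In the shoe example, this means that a line with only two columns would quietly produce a dict without a `size` key.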

In [167]:
# I want to let the user enter numbers, separated by spaces
# and then produce a list of those numbers squared

s = input('Enter numbers: ').strip()

[int(one_item) ** 2
 for one_item in s.split()]

Enter numbers:  10 20 30


[100, 400, 900]

In [168]:
# Let's change things: Make sure that every number appears only once in the result
# if the user enters a number more than once, only the first should remain

# we can use a set!

s = input('Enter numbers: ').strip()

set([int(one_item) ** 2
 for one_item in s.split()])

Enter numbers:  10 20 30 10 20 30


{100, 400, 900}

In [169]:
# instead of producing a list with a list comprehension, and then
# running set on it, we can just use a "set comprehension", which
# looks just like a list comprehension but uses {} instead of []

s = input('Enter numbers: ').strip()

{int(one_item) ** 2
 for one_item in s.split()}

Enter numbers:  10 20 30 10 20 30 -10 -20 -30


{100, 400, 900}
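A set gives us uniqueness, but it makes no promise about order. If "only the first should remain" also means that the results should stay in the order the user typed them, one option is `dict.fromkeys`, because dicts keep their keys in insertion order. This is a sketch with a hard-coded string in place of `input`:

```python
s = '30 10 20 10 -30'   # stand-in for what the user would type

squares = [int(one_item) ** 2
           for one_item in s.split()]
print(squares)                     # [900, 100, 400, 100, 900]

# dict keys are unique and keep insertion order, so the first occurrence wins
unique_in_order = list(dict.fromkeys(squares))
print(unique_in_order)             # [900, 100, 400]
```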

In [176]:
# let's once again get the usernames in /etc/passwd

{one_line.split(':')[0]
 for one_line in open('/etc/passwd')
 if not one_line.startswith('#')}

{('_accessoryupdater',
  '*',
  '278',
  '278',
  'Accessory Update Daemon',
  '/var/db/accessoryupdater',
  '/usr/bin/false\n'),
 ('_amavisd',
  '*',
  '83',
  '83',
  'AMaViS Daemon',
  '/var/virusmails',
  '/usr/bin/false\n'),
 ('_analyticsd',
  '*',
  '263',
  '263',
  'Analytics Daemon',
  '/var/db/analyticsd',
  '/usr/bin/false\n'),
 ('_appinstalld',
  '*',
  '273',
  '273',
  'App Install Daemon',
  '/var/db/appinstalld',
  '/usr/bin/false\n'),
 ('_appleevents',
  '*',
  '55',
  '55',
  'AppleEvents Daemon',
  '/var/empty',
  '/usr/bin/false\n'),
 ('_applepay',
  '*',
  '260',
  '260',
  'applepay Account',
  '/var/db/applepay',
  '/usr/bin/false\n'),
 ('_appowner',
  '*',
  '87',
  '87',
  'Application Owner',
  '/var/empty',
  '/usr/bin/false\n'),
 ('_appserver',
  '*',
  '79',
  '79',
  'Application Server',
  '/var/empty',
  '/usr/bin/false\n'),
 ('_appstore',
  '*',
  '33',
  '33',
  'Mac App Store Service',
  '/var/db/appstore',
  '/usr/bin/false\n'),
 ('_ard',
  '*',
  '6

# Exercise: Shells

1. The file `linux-etc-passwd.txt` is a `passwd` file that contains blank lines and comments, both of which you should ignore.
2. Produce a set of the different shells used by people on that computer. The shell, aka the command interpreter, is always in the final field of each record.

In [177]:
!head linux-etc-passwd.txt

# This is a comment
# You should ignore me
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin


In [181]:
{one_line.strip().split(':')[-1]
 for one_line in open('linux-etc-passwd.txt')
 if one_line.strip() and not one_line.startswith('#')}

{'/bin/bash',
 '/bin/false',
 '/bin/nologin',
 '/bin/sh',
 '/bin/sync',
 '/usr/sbin/nologin'}
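Why do we need the `one_line.strip()` test in the condition? A blank line is just `'\n'`, and splitting it still gives us a final field, namely the empty string. Here is a sketch showing what sneaks into the set without that test. `sample_lines` is made-up data in the same format as the file.

```python
# Hypothetical sample lines in the same format as linux-etc-passwd.txt
sample_lines = [
    '# This is a comment\n',
    '\n',
    'root:x:0:0:root:/root:/bin/bash\n',
    'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n',
]

# Without the blank-line test, the empty string ends up in the set
without_blank_test = {one_line.strip().split(':')[-1]
                      for one_line in sample_lines
                      if not one_line.startswith('#')}

# With it, we get only real shells
shells = {one_line.strip().split(':')[-1]
          for one_line in sample_lines
          if one_line.strip() and not one_line.startswith('#')}

print('' in without_blank_test)   # True
print('' in shells)               # False
```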

In [182]:
# if you liked set comprehensions, you'll love dict comprehensions!

# let's create a dict with words and their lengths

words = 'this is a bunch of words'

{
    # key             # value     -- for each dict pair
    one_word     :    len(one_word)
    for one_word in words.split()
}

{'this': 4, 'is': 2, 'a': 1, 'bunch': 5, 'of': 2, 'words': 5}
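One thing to keep in mind with dict comprehensions: keys are unique, so if the key expression produces the same value twice, the later pair replaces the earlier one. A sketch that flips the example around, using lengths as keys:

```python
words = 'this is a bunch of words'

# length -> word; 'is' and 'of' both have length 2, 'bunch' and 'words' both have length 5
by_length = {
    len(one_word): one_word
    for one_word in words.split()
}

print(by_length)   # {4: 'this', 2: 'of', 1: 'a', 5: 'words'}
```

Six words went in, but only four pairs came out, because the later word of each length overwrote the earlier one.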

In [188]:
%%timeit

# let's create a dict based on linux-etc-passwd.txt, where the keys
# are usernames and the values are their command shells

{
    # username                        # shell
    one_line.split(':')[0]      :     one_line.strip().split(':')[-1]
    for one_line in open('linux-etc-passwd.txt')
    if one_line.strip() and not one_line.startswith('#')
}

756 µs ± 101 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


In [187]:
%%timeit

# we can use the walrus operator to save us! (for some value of "saving")

{
    # username           # shell
    fields[0]      :     fields[-1]
    for one_line in open('linux-etc-passwd.txt')
    if one_line.strip() and not one_line.startswith('#') and (fields := one_line.strip().split(':'))
}

549 µs ± 49.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
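The walrus trick works because `str.split` with a separator always returns a list with at least one element, so the assignment expression is always truthy and never filters anything out. Here is a runnable sketch on made-up lines. `sample_lines` stands in for the file.

```python
# Hypothetical sample lines standing in for linux-etc-passwd.txt
sample_lines = [
    '# This is a comment\n',
    '\n',
    'root:x:0:0:root:/root:/bin/bash\n',
    'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n',
]

shells_by_user = {
    fields[0]: fields[-1]
    for one_line in sample_lines
    if one_line.strip()
    and not one_line.startswith('#')
    and (fields := one_line.strip().split(':'))   # assigns, then evaluates as truthy
}

print(shells_by_user)   # {'root': '/bin/bash', 'daemon': '/usr/sbin/nologin'}
print(fields[0])        # daemon
```

The last line shows a side effect worth knowing about: unlike the loop variable `one_line`, a name assigned with `:=` inside a comprehension is bound in the enclosing scope, so `fields` is still around afterward.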


In [189]:
# Nested comprehensions

# let's say we have a list of lists, and we want to flatten that list
# can we use a list comprehension to do it?

mylist = [[10, 20, 25, 30], [35, 40, 45], [50, 60, 70, 80, 85], [90, 100, 110, 120]]

[one_list
 for one_list in mylist]

[[10, 20, 25, 30], [35, 40, 45], [50, 60, 70, 80, 85], [90, 100, 110, 120]]

In [191]:
# for us to have a resulting list that's longer than the input
# we will need a new tool -- that's the nested comprehension

[one_element
 for one_sublist in mylist
 for one_element in one_sublist]


[10, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 85, 90, 100, 110, 120]
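A nested comprehension reads in the same order as the nested `for` loops you would otherwise write: the first `for` is the outer loop, the second is the inner one. Here is a sketch of that equivalence, plus `itertools.chain.from_iterable` from the standard library as another way to flatten one level:

```python
from itertools import chain

mylist = [[10, 20, 25, 30], [35, 40, 45], [50, 60, 70, 80, 85], [90, 100, 110, 120]]

# Loop form: outer loop first, inner loop second
flat_loop = []
for one_sublist in mylist:
    for one_element in one_sublist:
        flat_loop.append(one_element)

# Comprehension form: the "for" lines appear in the same order
flat_comp = [one_element
             for one_sublist in mylist
             for one_element in one_sublist]

# Standard-library alternative
flat_chain = list(chain.from_iterable(mylist))

print(flat_loop == flat_comp == flat_chain)   # True
```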

In [193]:
# a comprehension can have as many "if" lines as you want, and in a nested
# comprehension each "for" can be followed by its own "if" lines... an element
# is included only if every condition it passes through is true

[one_element
 for one_sublist in mylist
 if len(one_sublist) > 3
 for one_element in one_sublist
 if one_element % 2]


[25, 85]

In [194]:
# breaking up our dict comprehension so that it has three "if" lines rather than one
# the multiple "if" lines have "and" between them

{
    # username           # shell
    fields[0]      :     fields[-1]
    for one_line in open('linux-etc-passwd.txt')
    if one_line.strip() 
    if not one_line.startswith('#') 
    if (fields := one_line.strip().split(':'))
}

{'root': '/bin/bash',
 'daemon': '/usr/sbin/nologin',
 'bin': '/usr/sbin/nologin',
 'sys': '/usr/sbin/nologin',
 'sync': '/bin/sync',
 'games': '/usr/sbin/nologin',
 'man': '/usr/sbin/nologin',
 'lp': '/usr/sbin/nologin',
 'mail': '/usr/sbin/nologin',
 'news': '/usr/sbin/nologin',
 'uucp': '/usr/sbin/nologin',
 'proxy': '/usr/sbin/nologin',
 'www-data': '/usr/sbin/nologin',
 'backup': '/usr/sbin/nologin',
 'list': '/usr/sbin/nologin',
 'irc': '/usr/sbin/nologin',
 'gnats': '/usr/sbin/nologin',
 'nobody': '/usr/sbin/nologin',
 'syslog': '/bin/false',
 'messagebus': '/bin/false',
 'landscape': '/bin/false',
 'jci': '/bin/bash',
 'sshd': '/usr/sbin/nologin',
 'user': '/bin/bash',
 'reuven': '/bin/bash',
 'postfix': '/bin/false',
 'colord': '/bin/false',
 'postgres': '/bin/bash',
 'dovecot': '/bin/false',
 'dovenull': '/bin/false',
 'postgrey': '/bin/false',
 'debian-spamd': '/bin/sh',
 'memcache': '/bin/false',
 'genadi': '/bin/bash',
 'shira': '/bin/bash',
 'atara': '/bin/bash',
 'shikma

In [195]:
# how many times does each brand appear in our shoe data?

filename = 'shoe-data.txt'

def line_to_dict(s):
    return dict(zip(['brand', 'color', 'size'],
                     s.strip().split('\t')))
    
shoe_data = [line_to_dict(one_line)
 for one_line in open(filename)]
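Here is one possible way to answer that question, not necessarily the one the session goes on to use: `collections.Counter` from the standard library, with a dict comprehension doing the same job for comparison. This is a sketch in which `shoe_data` is a hard-coded stand-in holding the first five records from the file.

```python
from collections import Counter

# Stand-in for the real shoe_data: the first five records shown earlier
shoe_data = [
    {'brand': 'Adidas', 'color': 'orange', 'size': '43'},
    {'brand': 'Nike', 'color': 'black', 'size': '41'},
    {'brand': 'Adidas', 'color': 'black', 'size': '39'},
    {'brand': 'New Balance', 'color': 'pink', 'size': '41'},
    {'brand': 'Nike', 'color': 'white', 'size': '44'},
]

# Counter takes any iterable and counts how often each value appears
brand_counts = Counter(one_shoe['brand']
                       for one_shoe in shoe_data)
print(brand_counts['Adidas'])   # 2

# The same counts with a dict comprehension over the set of distinct brands
brand_counts_dict = {
    one_brand: sum(1 for one_shoe in shoe_data
                   if one_shoe['brand'] == one_brand)
    for one_brand in {one_shoe['brand'] for one_shoe in shoe_data}
}
print(brand_counts_dict == brand_counts)   # True
```

The dict comprehension rescans the whole list once per brand, which is fine for 100 lines but is why `Counter` is usually the better choice for bigger data.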

In [None]:
[