# Unimaginable Things in Python

**Quirky things that might trip you up in Python if you are not careful**

## Purpose of this talk
...

## 01 — Evaluation Order of Function Statements

Many Python programmers were told <u>not</u> to do this:

In [1]:
def append_to_list(value, values_list=[]):
    values_list.append(value)
    return values_list

In [2]:
print(append_to_list(40, [20]))
print(append_to_list(55, []))
print(append_to_list(30))
print(append_to_list(75))  # expected [75]

[20, 40]
[55]
[30]
[30, 75]


### Two important things to note:

1. Every time a **“list literal”** is evaluated, <span class="hl">a new list is <u>always</u> constructed</span>.
    ```python
    []; [2, 3, 5]; ['dog', 'cat']; ...  # list literals
    ```

2. But <span class="hl">all default argument expressions are evaluated <u>just once</u></span> when function definitions are interpreted (i.e. evaluated) during program execution.
    ```python
    # '1+2' is immediately evaluated when 'func' is interpreted
    def func(x: int, y: int = 1+2):  
        ...
    ```
    <u>Note:</u> The same goes for **“type annotations”** 
    <span class="cry">(unless `from __future__ import annotations`, available since Python 3.7, is used; see PEP 563)</span>. <br/>
    In contrast, a function body is <u>not</u> executed until the function is called.
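
One way to observe this is to look at the function's `__defaults__` attribute, where Python keeps the already-evaluated default values. A minimal sketch, reusing `append_to_list` from above:

```python
def append_to_list(value, values_list=[]):
    values_list.append(value)
    return values_list

# The default list was built once, when the `def` statement ran.
print(append_to_list.__defaults__)  # ([],)

append_to_list(30)
append_to_list(75)

# Both calls mutated that same stored list.
print(append_to_list.__defaults__)  # ([30, 75],)
```

The list returned by a call without `values_list` is the very object stored in `__defaults__`, which is why it keeps growing across calls.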

### How to fix this problem

One way to do it:

In [3]:
def append_to_list(value, values_list=None):
    values_list = values_list or []  # short-circuiting
    values_list.append(value)
    return values_list

print(append_to_list(30))
print(append_to_list(75))

[30]
[75]
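
One caveat with the `or` idiom above: an empty list passed by the caller is falsy, so it gets silently replaced by a fresh list and the caller's list is never updated. The sketch below compares it with an explicit `is None` check; the names `append_with_or` and `append_with_is_none` are made up for this illustration.

```python
def append_with_or(value, values_list=None):
    values_list = values_list or []  # also replaces a caller's empty list
    values_list.append(value)
    return values_list

def append_with_is_none(value, values_list=None):
    if values_list is None:          # only replaces a missing argument
        values_list = []
    values_list.append(value)
    return values_list

caller_list_a = []
append_with_or(1, caller_list_a)
print(caller_list_a)  # []

caller_list_b = []
append_with_is_none(1, caller_list_b)
print(caller_list_b)  # [1]
```

Whether this matters depends on whether callers rely on their list being mutated in place.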


#### What if `None` is a valid argument in some contexts?

<span class="hl">Make the default argument a module-private sentinel value.</span>

In [4]:
class _MISSING:
    pass

def append_to_list(value, values_list=_MISSING):
    if values_list is _MISSING:
        values_list = []
    values_list.append(value)
    return values_list

print(append_to_list(30))
print(append_to_list(75))

[30]
[75]
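
To illustrate a case where `None` really is a meaningful argument, here is a sketch of a hypothetical `lookup` helper that must tell "no default given" apart from "the default is `None`". It uses a bare `object()` as the sentinel, a common alternative to an empty class; both names are assumptions made for this example.

```python
_MISSING = object()  # unique marker; no other object is identical to it

def lookup(mapping, key, default=_MISSING):
    """Return mapping[key]; fall back to `default` only if one was given."""
    if key in mapping:
        return mapping[key]
    if default is _MISSING:
        raise KeyError(key)  # caller did not supply a default
    return default           # may legitimately be None

data = {'puppy': 150}
print(lookup(data, 'puppy'))        # 150
print(lookup(data, 'otter', None))  # None
```

Calling `lookup(data, 'otter')` without a default raises `KeyError`, which would be impossible to distinguish if `None` were the default value.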


### Other similar problems

Function with **“current time”** as a default argument.

In [5]:
from datetime import datetime
from time import sleep

In [6]:
def print_time(message, tm=datetime.now()):
    print(f"{tm:%H:%M:%S.%f} - {message}")

current_time = datetime.now()
sleep(1); print_time('expected 2nd')
sleep(1); print_time('expected 3rd (most recent)')
print_time('expected 1st (earliest)', current_time)

08:20:03.216946 - expected 2nd
08:20:03.216946 - expected 3rd (most recent)
08:20:03.216980 - expected 1st (earliest)


#### Fixed version:

In [7]:
def print_time(message, tm=None):
    tm = tm or datetime.now()
    print(f"{tm:%H:%M:%S.%f} - {message}")

current_time = datetime.now()
sleep(1); print_time('expected 2nd')
sleep(1); print_time('expected 3rd (most recent)')
print_time('expected 1st (earliest)', current_time)

08:20:06.231261 - expected 2nd
08:20:07.232860 - expected 3rd (most recent)
08:20:05.229940 - expected 1st (earliest)


## # — Comprehension Syntax
Example of **“list comprehension”**:

In [8]:
values = [5, 2, 4, 1, 8, 7, 3, 6]
squared_values = [ x**2 for x in values ]
print(squared_values)

[25, 4, 16, 1, 64, 49, 9, 36]


The comprehension above is roughly equivalent to:

In [9]:
squared_values_alt = list( x**2 for x in values )
print(squared_values_alt)

[25, 4, 16, 1, 64, 49, 9, 36]


### Decomposing the comprehension syntax

Breaking down this line of code:
```python
squared_values_alt = list( x**2 for x in values )
x = 10
```
A step-by-step break-down:

In [10]:
expr_01 = ( x**2 for x in values )  # an iterable
expr_02 = list(expr_01)  # list takes any iterable
print(expr_02)

[25, 4, 16, 1, 64, 49, 9, 36]


### Do we always need a list?
Sometimes a **“comprehension-style generator expression”** suffices.

In [11]:
squared_values_iterable = ( x**2 for x in values )  # generator expression
for i, v in enumerate(squared_values_iterable, start=1):
    print(f'{i:2d}: {v}')  # f-string since python 3.6

 1: 25
 2: 4
 3: 16
 4: 1
 5: 64
 6: 49
 7: 9
 8: 36


<span class="joy">😀 **saves memory** — preventing unnecessary memory allocation for intermediate lists.</span>
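
A rough way to see the difference is `sys.getsizeof`. This is only a sketch: `getsizeof` reports the size of the container itself, not of the integers it refers to, and the exact numbers vary between Python versions.

```python
import sys

n = 100_000
squares_list = [x**2 for x in range(n)]  # all values stored at once
squares_gen = (x**2 for x in range(n))   # values produced on demand

# The list holds n references; the generator only holds its state.
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Both produce the same values when consumed.
print(sum(squares_gen) == sum(squares_list))  # True
```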

In [12]:
values = [5, 2, 4, 1, 8, 7, 3, 6]
sq_values = ( x**2 for x in values )
odd_sq_values = [ x for x in sq_values if x % 2 == 1 ]
even_sq_values = [ x for x in sq_values if x % 2 == 0 ]

print(odd_sq_values)
print(even_sq_values)

[25, 1, 49, 9]
[]


<span class="cry">😢 Iterable `sq_values` is already <u>consumed</u> after its first iteration.</span>

### Quick Fix
Make `sq_values` a list instead of a generator object.

In [13]:
values = [5, 2, 4, 1, 8, 7, 3, 6]
sq_values = [ x**2 for x in values ]
odd_sq_values = [ x for x in sq_values if x % 2 == 1 ]
even_sq_values = [ x for x in sq_values if x % 2 == 0 ]

print(odd_sq_values)
print(even_sq_values)

[25, 1, 49, 9]
[4, 16, 64, 36]
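
If the intermediate list is too large to keep around, another option is to consume the generator exactly once and split the values as they arrive. A minimal sketch of that approach, using the same variable names as above:

```python
values = [5, 2, 4, 1, 8, 7, 3, 6]
sq_values = (x**2 for x in values)  # still a generator

odd_sq_values, even_sq_values = [], []
for x in sq_values:  # consumed exactly once
    if x % 2 == 1:
        odd_sq_values.append(x)
    else:
        even_sq_values.append(x)

print(odd_sq_values)   # [25, 1, 49, 9]
print(even_sq_values)  # [4, 16, 64, 36]
```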


## # — Functions Are Data Too

**TODO:** Maybe regexp.compile is a more compelling use-case. 

We will write a function `extractors_from_keys` with the following specification.
- <u>**input**</u>: A list `keys` of dictionary keys.
- <u>**output**</u>: A list of dict-value extractors (callables). <br/>
    Each extractor receives a dict as input and returns the value stored 
    under its corresponding key in the list `keys`; <br/>
    the extractors appear in the same order as `keys`. If the key does not exist in the dict, the extractor returns `None`.

For example,
```python
data = { 'puppy': 150, 'kitten': 200 }
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    print(extractor(data))  # expected outputs: 150, None, 200
```

### Helper Function
Use the `dict.get` method, which returns `None` when the key is missing.

In [14]:
data = { 'puppy': 150, 'kitten': 200 }
print(data.get('puppy'))
print(data.get('otter'))
print(data.get('kitten'))

150
None
200


In [15]:
def extractors_from_keys(keys):
    return [ (lambda data: data.get(k)) for k in keys ]

In [16]:
data = { 'puppy': 150, 'kitten': 200 }
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    print(extractor(data))

200
200
200


<span class="hl">Every lambda function shares the same variable `k`, which holds the last element of the list `keys` by the time any of them is called,</span> where
```python
keys = ['puppy', 'otter', 'kitten']
########################## -- k --
```
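
The effect can be reproduced without any dictionaries. In this sketch (the helper `make_getters` is made up for the illustration), each lambda simply returns `k`; since `k` is looked up when the lambda is called, not when it is created, all of them see its final value.

```python
def make_getters(keys):
    # Each lambda refers to the variable `k`, not to its value at creation time.
    return [(lambda: k) for k in keys]

getters = make_getters(['puppy', 'otter', 'kitten'])
print([g() for g in getters])  # ['kitten', 'kitten', 'kitten']
```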

### How to fix

#### Method 1

<span class="hl">Give each constructed callable its own copy of the key, inside its own **“closure”**.</span>

For example, make a placeholder for key `k` and supply its value with `functools.partial`.

In [17]:
from functools import partial
def extractors_from_keys(keys):
    return [ partial((lambda k, data: data.get(k)), k) for k in keys ]

In [18]:
data = { 'puppy': 150, 'kitten': 200 }
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    print(extractor(data))

150
None
200


#### Method 2

<span class="hl">Create a callable object (with ``__call__`` method) from a given key `k`.</span>

In [19]:
class Extractor:
    def __init__(self, key):
        self.key = key
    def __call__(self, data):
        return data.get(self.key)

def extractors_from_keys(keys):
    return [ Extractor(k) for k in keys ]

In [20]:
data = { 'puppy': 150, 'kitten': 200 }
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    print(extractor(data))

150
None
200
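
Another common idiom leans on the rule from the first section: default argument expressions are evaluated once, when the function is defined. Writing `k=k` in the lambda's parameter list therefore freezes the current key for each lambda. A sketch, with the caveat that a caller could accidentally override `k` by passing a second positional argument:

```python
def extractors_from_keys(keys):
    # `k=k` is evaluated when each lambda is created, capturing that key.
    return [(lambda data, k=k: data.get(k)) for k in keys]

data = {'puppy': 150, 'kitten': 200}
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    print(extractor(data))
# prints 150, None, 200 (one per line)
```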


## # — Opening Files and Iterators

Suppose that we wrote a function to <u>count the number of words</u> for each string in a sequence. 

In [21]:
import re
word_re = re.compile(r'\w+')

def count_words(sentence):
    """Counts number of words in a given sentence."""
    return len(word_re.findall(sentence))

def wordcounts_from_sentences(sentences):
    """
    Returns a generator object which yields the word
    count for each sentence in the input.
    """
    for sentence in sentences:
        yield count_words(sentence)

This code should work for any iterable of strings. **¯\\\_(ツ)\_/¯**

### Let's try with a list of strings

In [22]:
sentences = [
    "Hello, how are you?",
    "I'm fine thank you. And you?",
    "...",
    "Ellipsis?",
    "Yes! It's a valid expression",
]

for wc in wordcounts_from_sentences(sentences):
    print(wc)

4
7
0
1
6


### What about a file???

<span class="hl">Technically, a file object is an iterable of lines in the file.</span>

In [23]:
def wordcounts_from_file(filename):
    """
    Returns a generator object which yields the word
    count for each line in the file.
    """
    # Good practice to open files with with-stmt!
    with open(filename) as fileobj:
        return wordcounts_from_sentences(fileobj)

This function should work with text files too! **¯\\\_(ツ)\_/¯**

In [24]:
for wc in wordcounts_from_file('input.txt'):
    print(wc)

ValueError: I/O operation on closed file.

<span class="cry">😢 The file object was closed by the `with` block <u>before</u> the first line was even read: `return` hands back a generator that has not started running yet.</span>
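
The same failure can be reproduced without a file on disk. In the sketch below, `io.StringIO` stands in for a real file object, and `lazy_lines` and `broken_reader` are hypothetical names that mirror the two functions above.

```python
import io

def lazy_lines(fileobj):
    for line in fileobj:  # runs only once iteration actually starts
        yield line

def broken_reader(text):
    # io.StringIO plays the role of a file opened with open()
    with io.StringIO(text) as fileobj:
        return lazy_lines(fileobj)  # leaving `with` closes fileobj right here

try:
    for line in broken_reader("first\nsecond\n"):
        print(line)
except ValueError as exc:
    print(f"ValueError: {exc}")  # complains about a closed file
```

Calling `lazy_lines(fileobj)` only creates the generator; its `for` loop does not touch `fileobj` until the caller asks for the first item, and by then the `with` block has already exited.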

### Easiest way to fix
Since `wordcounts_from_sentences` is a generator function, make `wordcounts_from_file` a generator function too, so the `with` block stays open while lines are being consumed.

In [25]:
def wordcounts_from_file(filename):
    """
    Returns a generator object which yields the word
    count for each line in the file.
    """
    # Good practice to open files with with-stmt!
    with open(filename) as fileobj:
        #return wordcounts_from_sentences(fileobj)
        yield from wordcounts_from_sentences(fileobj)

In [26]:
for wc in wordcounts_from_file('input.txt'):
    print(wc)

4
7
0
1
6
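
With this fix the file's lifetime is tied to the generator: the `with` block exits when iteration finishes or when the generator is closed. The sketch below shows the second case, using `io.StringIO` in place of a real file and a hypothetical `lines_from` generator function.

```python
import io

buffer = io.StringIO("one\ntwo\nthree\n")  # stands in for an opened file

def lines_from(fileobj):
    with fileobj:
        for line in fileobj:
            yield line

gen = lines_from(buffer)
print(repr(next(gen)))  # 'one\n'
print(buffer.closed)    # False: the generator is paused inside `with`

gen.close()             # raises GeneratorExit at the paused `yield`
print(buffer.closed)    # True: leaving `with` closed the buffer
```

A consumer that stops early therefore does not leak the file, as long as the generator is eventually closed or garbage-collected.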
