# Unimaginable Things in Python

**Quirky things that might trip you up in Python if you are not careful**

In [1]:
from pprint import pprint

## Purpose of this talk
...

## 01 - Comprehension Syntax
Example of **List Comprehension**:

In [2]:
values = [5, 2, 4, 1, 8, 7, 3, 6]
squared_values = [ x**2 for x in values ]
pprint(squared_values)

[25, 4, 16, 1, 64, 49, 9, 36]


The list comprehension above behaves much like passing a generator expression to `list`:

In [3]:
squared_values_alt = list( x**2 for x in values )
pprint(squared_values_alt)

[25, 4, 16, 1, 64, 49, 9, 36]


### Decomposing the comprehension syntax

Breaking down this line of code:
```python
squared_values_alt = list( x**2 for x in values )
```
A step-by-step break-down:

In [4]:
expr_01 = ( x**2 for x in values )  # an iterable
expr_02 = list(expr_01)  # list takes any iterable
pprint(expr_02)

[25, 4, 16, 1, 64, 49, 9, 36]


### Do we always need a list?
Sometimes a **generator expression** (the comprehension syntax in parentheses) suffices.

In [5]:
squared_values_iterable = ( x**2 for x in values )  # generator expression
for i, v in enumerate(squared_values_iterable, start=1):
    print(f'{i:2d}: {v}')  # f-strings require Python 3.6+

 1: 25
 2: 4
 3: 16
 4: 1
 5: 64
 6: 49
 7: 9
 8: 36


<span class="hl">😀 **Saves memory** by avoiding the allocation of intermediate lists.</span>
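To make the memory claim concrete, here is a minimal sketch that compares the size of a list with the size of the equivalent generator object. The variable names and the input size are illustrative assumptions. `sys.getsizeof` measures only the container itself, and the exact byte counts differ between Python versions and platforms, so the sketch compares the sizes rather than printing them.

```python
import sys

values = list(range(100_000))

squares_list = [x**2 for x in values]  # materializes every element up front
squares_gen = (x**2 for x in values)   # computes elements lazily, one at a time

# getsizeof reports the container's own footprint, not the integers it refers to.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# Aggregations can consume a generator expression directly, with no intermediate list.
total = sum(x**2 for x in values)
print(total == sum(squares_list))  # True
```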

In [6]:
values = [5, 2, 4, 1, 8, 7, 3, 6]
sq_values = ( x**2 for x in values )
odd_sq_values = [ x for x in sq_values if x % 2 == 1 ]
even_sq_values = [ x for x in sq_values if x % 2 == 0 ]

pprint(odd_sq_values)
pprint(even_sq_values)

[25, 1, 49, 9]
[]


<span class="cry">😢 The generator `sq_values` is already **consumed** after its first iteration, so the second comprehension has nothing left to read.</span>
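The exhaustion can be observed directly. The following sketch (variable names such as `first_pass` are illustrative) shows that a generator is its own iterator, so a second pass over it starts where the first one stopped, which is at the end.

```python
values = [5, 2, 4, 1, 8, 7, 3, 6]
sq_values = (x**2 for x in values)

# A generator is its own iterator, so every loop over it shares one cursor.
same_object = iter(sq_values) is sq_values

first_pass = list(sq_values)   # consumes all eight items
second_pass = list(sq_values)  # nothing left to yield

print(same_object)  # True
print(first_pass)   # [25, 4, 16, 1, 64, 49, 9, 36]
print(second_pass)  # []

# By contrast, each call to iter() on a list returns a new, independent iterator.
print(iter(values) is iter(values))  # False
```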

### Quick Fix
Make `sq_values` a list instead of a generator object.

In [7]:
values = [5, 2, 4, 1, 8, 7, 3, 6]
sq_values = [ x**2 for x in values ]
odd_sq_values = [ x for x in sq_values if x % 2 == 1 ]
even_sq_values = [ x for x in sq_values if x % 2 == 0 ]

pprint(odd_sq_values)
pprint(even_sq_values)

[25, 1, 49, 9]
[4, 16, 64, 36]
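If the intermediate list is large, another option is to keep the generator and split its items in a single pass. This is only a sketch of one possible alternative, not the approach taken in the talk:

```python
values = [5, 2, 4, 1, 8, 7, 3, 6]
sq_values = (x**2 for x in values)  # still a generator

odd_sq_values = []
even_sq_values = []
for x in sq_values:
    # Route each square to exactly one list; the generator is consumed only once.
    (odd_sq_values if x % 2 == 1 else even_sq_values).append(x)

print(odd_sq_values)   # [25, 1, 49, 9]
print(even_sq_values)  # [4, 16, 64, 36]
```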


## 02 - Functions Are Data Too

We will write a function `extractors_from_keys` with the following specification.
- **input**: A list `keys` of dictionary keys.
- **output**: A list of callables, one per key and in the same order as `keys`. Each callable receives a dictionary and returns the value stored under its corresponding key. If the key does not exist, it returns `None`.

For example,
```python
data = { 'puppy': 150, 'kitten': 200 }
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    pprint(extractor(data))  # expected outputs: 150, None, 200
```

### Helper Function
Use the `dict.get` method, which returns `None` when the key is missing.

In [8]:
data = { 'puppy': 150, 'kitten': 200 }
pprint(data.get('puppy'))
pprint(data.get('otter'))
pprint(data.get('kitten'))

150
None
200


In [9]:
def extractors_from_keys(keys):
    return [ (lambda data: data.get(k)) for k in keys ]

In [10]:
data = { 'puppy': 150, 'kitten': 200 }
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    pprint(extractor(data))

200
200
200


<span class="hl">Every lambda shares the same variable `k`, which is looked up only when the lambda is called. By then `k` holds the last element of the list `keys`</span>, where
```python
keys = ['puppy', 'otter', 'kitten']
########################## -- k --
```
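The shared variable can be inspected through each lambda's closure cell. In this sketch, `make_getters` is a hypothetical name that mirrors the buggy `extractors_from_keys`, and the `__closure__` attribute is used purely for illustration:

```python
def make_getters(keys):
    # Each lambda looks up `k` when it is called, not when it is created.
    return [(lambda data: data.get(k)) for k in keys]

getters = make_getters(['puppy', 'otter', 'kitten'])

# Every lambda has a single closure cell, the one for `k`; see what it holds now.
captured = [g.__closure__[0].cell_contents for g in getters]
print(captured)  # ['kitten', 'kitten', 'kitten']

# All lambdas reference the very same cell object, not three separate copies.
shared_cell = getters[0].__closure__[0] is getters[2].__closure__[0]
print(shared_cell)  # True
```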

### How to fix

#### Method 1

<span class="hl">Give each constructed callable its own copy of the key, instead of one shared **closure** variable.</span>

For example, make `k` an explicit parameter of the lambda and bind its value with `functools.partial`.

In [11]:
from functools import partial
def extractors_from_keys(keys):
    return [ partial((lambda k, data: data.get(k)), k) for k in keys ]

In [12]:
data = { 'puppy': 150, 'kitten': 200 }
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    pprint(extractor(data))

150
None
200
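The same idea can be written with a plain nested function. In this sketch, `make_extractor` is a hypothetical helper name: each call to it creates a fresh local `key`, so every returned function closes over its own binding.

```python
def make_extractor(key):
    # `key` is local to this call, so each returned function gets its own binding.
    def extractor(data):
        return data.get(key)
    return extractor

def extractors_from_keys(keys):
    return [make_extractor(k) for k in keys]

data = {'puppy': 150, 'kitten': 200}
extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])
results = [extract(data) for extract in extractors]
print(results)  # [150, None, 200]
```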


#### Method 2

<span class="hl">Create a callable object (one with a `__call__` method) that stores the given key `k`.</span>

In [13]:
class Extractor:
    def __init__(self, key):
        self.key = key
    def __call__(self, data):
        return data.get(self.key)

def extractors_from_keys(keys):
    return [ Extractor(k) for k in keys ]

In [14]:
data = { 'puppy': 150, 'kitten': 200 }
animal_extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])

for extractor in animal_extractors:
    pprint(extractor(data))

150
None
200
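A further idiom that is often seen in practice, although it is not one of the two methods above, is to bind the key as a default argument. Default values are evaluated when the lambda is defined, so each lambda keeps its own key. The sketch below also shows the main caveat of this approach.

```python
def extractors_from_keys(keys):
    # `k=k` evaluates `k` at definition time and stores it as the default value.
    return [(lambda data, k=k: data.get(k)) for k in keys]

data = {'puppy': 150, 'kitten': 200}
extractors = extractors_from_keys(['puppy', 'otter', 'kitten'])
results = [extract(data) for extract in extractors]
print(results)  # [150, None, 200]

# Caveat: a caller can override the bound key by passing a second argument.
overridden = extractors[0](data, 'kitten')
print(overridden)  # 200
```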


## 03 - Opening Files and Iterators

Suppose we wrote a function that **counts the number of words\*** in each **string in a sequence**.

In [15]:
import re
word_re = re.compile(r'\w+')

def count_words(sentence):
    """Counts number of words in a given sentence."""
    return len(word_re.findall(sentence))

def wordcounts_from_sentences(sentences):
    """
    Returns a generator object which yields the word
    count for each sentence in the input.
    """
    for sentence in sentences:
        yield count_words(sentence)

This code should work for any iterable of strings. **¯\\\_(ツ)\_/¯**

### Let's try with a list of strings

In [16]:
sentences = [
    "Hello, how are you?",
    "I'm fine thank you. And you?",
    "...",
    "Ellipsis?",
    "Yes! It's a valid expression",
]

for wc in wordcounts_from_sentences(sentences):
    pprint(wc)

4
7
0
1
6


### What about a file???

<span class="hl">Technically, a file object is an iterable of lines in the file.</span>
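As a quick illustration that needs no file on disk, `io.StringIO` can stand in for an open text file (an assumption made here only to keep the sketch self-contained). Iterating it yields one line at a time, and, like a generator, it is consumed by that iteration.

```python
import io

# io.StringIO behaves like an open text file for the purpose of iteration.
fileobj = io.StringIO("Hello, how are you?\nI'm fine thank you. And you?\n")

lines = list(fileobj)  # one item per line, each keeping its trailing newline
print(lines)

# A file object is its own iterator, so a second pass finds nothing left.
is_own_iterator = iter(fileobj) is fileobj
remaining = list(fileobj)
print(is_own_iterator)  # True
print(remaining)        # []
```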

In [17]:
def wordcounts_from_file(filename):
    """
    Returns a generator object which yields the word
    count for each line in the file.
    """
    # Good practice to open files with with-stmt!
    with open(filename) as fileobj:
        return wordcounts_from_sentences(fileobj)

This function should work with text files too! **¯\\\_(ツ)\_/¯**

In [18]:
for wc in wordcounts_from_file('input.txt'):
    pprint(wc)

ValueError: I/O operation on closed file.

<span class="cry">😢 The `return` left the `with` block immediately, so the file object was closed before the first line was even read.</span>
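The order of events can be traced without a real file. In this sketch, `resource`, `lengths`, and `lengths_returning` are hypothetical stand-ins for `open`, `wordcounts_from_sentences`, and the failing `wordcounts_from_file`. Calling a generator function only creates the generator object, so the `with` block is left before any item is read. The stand-in resource is a plain list, which remains readable after "closing"; a real file raises `ValueError` at that point.

```python
from contextlib import contextmanager

events = []

@contextmanager
def resource():
    # Hypothetical stand-in for open(): records when it is opened and closed.
    events.append('open')
    try:
        yield ['line one', 'line two']
    finally:
        events.append('close')

def lengths(lines):
    for line in lines:
        events.append('read')
        yield len(line)

def lengths_returning():
    with resource() as lines:
        return lengths(lines)  # leaves the with-block before any item is read

gen = lengths_returning()
events_before_iteration = list(events)
result = list(gen)

print(events_before_iteration)  # ['open', 'close']
print(events)                   # ['open', 'close', 'read', 'read']
print(result)                   # [8, 8]
```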

### Easiest way to fix
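Make `wordcounts_from_file` a generator function too, just like `wordcounts_from_sentences`. With `yield from`, execution stays inside the `with` block while the lines are being consumed, so the file remains open.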

In [19]:
def wordcounts_from_file(filename):
    """
    Returns a generator object which yields the word
    count for each line in the file.
    """
    # Good practice to open files with with-stmt!
    with open(filename) as fileobj:
        #return wordcounts_from_sentences(fileobj)
        yield from wordcounts_from_sentences(fileobj)

In [20]:
for wc in wordcounts_from_file('input.txt'):
    pprint(wc)

4
7
0
1
6
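One consequence worth knowing: with the generator version, the file stays open for as long as the generator is suspended inside the `with` block, and it is closed once the generator is exhausted or explicitly closed. The sketch below checks this with a temporary file. The extra `opened` parameter is a hypothetical addition, used only to observe the file object from outside.

```python
import os
import re
import tempfile

word_re = re.compile(r'\w+')

def wordcounts_from_sentences(sentences):
    for sentence in sentences:
        yield len(word_re.findall(sentence))

def wordcounts_from_file(filename, opened):
    # `opened` is a hypothetical parameter that exposes the file object for inspection.
    with open(filename) as fileobj:
        opened.append(fileobj)
        yield from wordcounts_from_sentences(fileobj)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'input.txt')
    with open(path, 'w') as f:
        f.write("Hello, how are you?\nEllipsis?\n")

    opened = []
    gen = wordcounts_from_file(path, opened)
    first = next(gen)                        # runs the body up to the first yield
    open_while_suspended = not opened[0].closed
    gen.close()                              # raises GeneratorExit inside; the with-block exits
    closed_after_close = opened[0].closed

print(first)                 # 4
print(open_while_suspended)  # True
print(closed_after_close)    # True
```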
