## Iterators, generators and comprehensions

* These are used for loops, filtering and mapping of data
* Common and very _pythonic_, but tricky

* An **iterable** is anything that can be looped over
    * `list` and `set` for example, or `dict` and `dict.items()`
* An **iterator** is the object that hands out the items of an iterable one at a time
* **Generators** are _lazy_ iterators; the next item is only produced when requested
    * You can create your own generators with the `yield` keyword
    * But for now, we just want to deal with generators created by others
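
To make "only produced when requested" concrete, here is a minimal sketch using the built-in functions `iter` and `next`. The variable names are only illustrative.

```python
letters = ['a', 'b', 'c']

# iter() asks the list for an iterator; next() requests one item at a time
it = iter(letters)
first = next(it)
second = next(it)

# whatever has not been requested yet is still waiting in the iterator
rest = list(it)

print(first, second, rest)
```

A `for` loop does the same thing behind the scenes: it keeps requesting the next item until there are none left.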

In [None]:
## dicts and lists are iterables: you can loop over them directly

x = dict(a='A', b='B')

for key in x:
    print(key)

In [None]:
for key, value in x.items():
    print(key, value)

In [None]:
for i in [0, 1, 2, 3, 4, 5, 6]:
    print(i)

In [None]:
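## WARNING: don't change a list while you're looping over it

l = [1, 2, 3]

for item in l:

    ## rebinding the name `l` creates a new list; the loop still runs over
    ## the original one and prints 1, 2, 3
    l = [1, 2, 3, 4]

    ## without the rebinding above, appending would grow the list being
    ## looped over and the loop would never end
    ## l.append(item)
    print(item)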
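
If you do need to add items while looping, one common way around the problem above is to loop over a copy of the list. This is a small sketch; the name `numbers` and the factor 10 are just for illustration.

```python
numbers = [1, 2, 3]

# numbers[:] is a copy, so appending to the original cannot affect the loop
for item in numbers[:]:
    numbers.append(item * 10)

print(numbers)
```

The loop runs exactly three times because the copy never changes.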

In [None]:
## `range` is lazy as well: the next number is only produced when
## requested

for i in range(5):
    print(i)

In [None]:
range(5)

In [None]:
range(500).__sizeof__()

In [None]:
## you can turn a lazy range into a concrete list by passing it to `list`

list(range(500)).__sizeof__()

In [None]:
d = {3: 4, 5: 6}

d.keys()

In [None]:
d.values()

In [None]:
import os

os.walk('.')

In [None]:
for root, dirs, files in os.walk("."):
    print('ROOT: ## ', root, end='\n\n')
    for name in files:
        print(name)
    print()

**Exercise** 

* Using a `for` loop and the `range` function, create a list of all squares of even numbers less than 1000
* Create a list of all filenames in your desktop folder, including the files in subfolders
    * How many of them end in '.docx'?
    * Count all the letters in these filenames and list the five most common. Use `from collections import Counter`
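
As a hint for the last step, this is roughly how `Counter` behaves on a single string. The word `"banana"` is only an example input, not part of the exercise.

```python
from collections import Counter

# Counter counts how often each item (here: each letter) occurs
counts = Counter("banana")

# most_common(n) returns the n most frequent items with their counts
top_two = counts.most_common(2)

print(top_two)
```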

In [None]:
## solution

l_squares = list()

for i in range(0, 100):
    
    ## add the square to the list if i is even and it's less than 1000
    pass

## Why use generators

* Saves memory
* Sometimes you don't know beforehand how many iterations you need
* Can let several parts of a program take turns, so they appear to run simultaneously
    * For example, in a user interface
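
The first two points can be sketched as follows. The example assumes a generator expression (a comprehension written with parentheses instead of brackets) and `itertools.count` from the standard library. The threshold of 500 is arbitrary.

```python
import sys
from itertools import count

# a list stores every item; a generator expression only stores where it is
as_list = [i * 2 for i in range(10_000)]
as_generator = (i * 2 for i in range(10_000))
print(sys.getsizeof(as_list) > sys.getsizeof(as_generator))

# count(1) never ends on its own, so we stop once the condition is met
for n in count(1):
    if n ** 2 > 500:
        break

print(n)
```

The exact sizes depend on your Python version, which is why the sketch only compares them.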

## Comprehensions

* Comprehensions are quick ways to filter or compose `list`, `set` and `dict`

In [None]:
## list comprehension uses square brackets

[x for x in [1, 2, 3]]

In [None]:
def square(x):
    return x ** 2

[square(i) for i in range(0, 10)]
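
Comprehensions can also filter by adding an `if` at the end, and a `set` comprehension looks the same but uses curly braces. A minimal sketch, with a made-up list of words:

```python
words = ['list', 'set', 'dict', 'tuple', 'str']

# keep only the words with exactly four letters
four_letters = [w for w in words if len(w) == 4]

# a set comprehension uses curly braces and drops duplicates
lengths = {len(w) for w in words}

print(four_letters)
print(lengths)
```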

In [None]:
import string

alphabet = string.ascii_lowercase
alphabet

In [None]:
## The `join` method of strings is useful for composing strings

"".join(['a', 'b', 'c'])

**Exercise**

* Using a list comprehension, create a new alphabet of uppercase letters
* Create a new list of uppercase letters with only letters that appear in the word 'conjunction'
* Create a word by filtering the alphabet for letters that do not appear in the word 'conjunction' and joining them together with underscores.
    * The first part should be 'a_b_d'

In [None]:
## solutions



In [None]:
## dict comprehension

d = {
    i: i ** 2 for i in range(5)
}

d

In [None]:
## switch values and keys

{
    value: key for key, value in d.items()
}

In [None]:
## sorting a dict

scores = {
    "alfred": 4.7,
    "Charlie": 7.8,
    "gijs": 5.6,
    "Bettie": 3.8,
    "frederik": 6.7
}

scores_list = [(key, value) for key, value in scores.items()]

scores_list

In [None]:
## if you want the names, sorted by score, you first have to sort this list of tuples
## then you can extract the names

def score(pair):
    return pair[1]

sorted(scores_list, key=score)
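
The last step mentioned in the comments above, extracting the names, could look like this. The sketch repeats the `scores` dict so that it runs on its own; `ranked` and `names_by_score` are illustrative names.

```python
scores = {
    "alfred": 4.7,
    "Charlie": 7.8,
    "gijs": 5.6,
    "Bettie": 3.8,
    "frederik": 6.7
}

def score(pair):
    return pair[1]

# sort the (name, score) pairs by score, lowest first
ranked = sorted(scores.items(), key=score)

# keep only the name from each pair
names_by_score = [name for name, _ in ranked]

print(names_by_score)
```

Passing `reverse=True` to `sorted` would put the highest score first.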

**Exercise**

* Create a list of the scores, sorted by name
* Using a list comprehension, get a list of the squares of all even numbers less than 1000
* Make a dictionary with the lowercase letters as keys and, as values, the matching uppercase letter repeated 10 times

```
answer = {
    'a': 'AAAAAAAAAA',
    'b': ..
}
```

In [None]:
## solutions

