# Iterations in Python
This section of the talk is inspired by, and borrows heavily from, Ned Batchelder's PyCon 2013 talk titled _"Loop like a native: while, for, iterators, generators"_.

Please see https://nedbatchelder.com/text/iter.html for the original material, which is a lot more extensive.

### Looping over a list

In [1]:
cats = ['Max', 'Chloe', 'Bella', 'Oliver', 'Kitty Purry']
cats

['Max', 'Chloe', 'Bella', 'Oliver', 'Kitty Purry']

##### First attempt

In [2]:
for i in range(len(cats)):
    print(cats[i])

Max
Chloe
Bella
Oliver
Kitty Purry


##### Second attempt

In [3]:
for cat in cats:
    print(cat)

Max
Chloe
Bella
Oliver
Kitty Purry


### Looping over a dictionary

In [4]:
cats = {'Oliver': 5000, 'Max': 20000, 'Chloe': 15000, 'Bella': 10000, 'Kitty Purry': 1}
# These are apparently real cat names, and the first 4 are the most common ones in the USA

In [5]:
for cat_name in cats:
    print(cat_name)

Oliver
Max
Chloe
Bella
Kitty Purry


So this is actually just looping over the keys.
How do we get the counts?

In [6]:
for cat_name in cats:
    print(cats[cat_name])

5000
20000
15000
10000
1


That worked, but it's not very elegant or readable. We can do better.

In [7]:
for cat_count in cats.values():
    print(cat_count)

5000
20000
15000
10000
1


In practice, though, a bare list of counts is rarely useful, because we can't tell which cat each count belongs to.
Note also that dictionaries only guarantee insertion order from Python 3.7 onwards; in older versions the order cannot be relied upon.

In [8]:
for cat_name in cats:
    print(cat_name, cats[cat_name])

Oliver 5000
Max 20000
Chloe 15000
Bella 10000
Kitty Purry 1


That works, but again it's not very expressive. Python gives dictionaries a .items() method.

In [9]:
for cat_tuple in cats.items():
    print(cat_tuple[0], cat_tuple[1])

Oliver 5000
Max 20000
Chloe 15000
Bella 10000
Kitty Purry 1


That's a bit better, as we're not repeating cats, but we can actually make it a lot clearer by using tuple unpacking.

In [10]:
# Example of a tuple
list(cats.items())[0]

('Oliver', 5000)

In [11]:
# Unpacking the tuple
for cat_name, cat_count in cats.items():
    print(cat_name, cat_count)

Oliver 5000
Max 20000
Chloe 15000
Bella 10000
Kitty Purry 1
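
As a minimal sketch of what tuple unpacking does on its own, outside of a dictionary loop: the pairs below are made up for illustration and simply mirror the shape of what .items() yields.

```python
# A standalone pair shaped like the items a dictionary yields.
cat_tuple = ('Oliver', 5000)

# Unpacking assigns each element of the tuple to its own name.
cat_name, cat_count = cat_tuple
print(cat_name, cat_count)

# A for loop performs the same assignment once per iteration.
pairs = [('Oliver', 5000), ('Max', 20000)]
unpacked = []
for cat_name, cat_count in pairs:
    unpacked.append("{}: {}".format(cat_name, cat_count))
print(unpacked)
```

The number of names on the left must match the number of elements in each tuple, otherwise Python raises a ValueError.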


###### .keys() - what is it for?

PEP 20 describes the Python philosophy and is a good guide to what one should strive for when writing Python code. Among other things, it says:

> __There should be one-- and preferably only one --obvious way to do it.__

> Although that way may not be obvious at first unless you're Dutch.

In [12]:
for cat in cats.keys():
    print(cat)

Oliver
Max
Chloe
Bella
Kitty Purry


In [13]:
type(cats.keys())

dict_keys

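Iterating over cats.keys() gives exactly what iterating over cats gives, so why does it exist? The dict_keys object it returns is actually meant to be used with set operations.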

In [14]:
# cats - {'Kitty Purry'}  # raises TypeError: the dict itself does not support set operations

In [15]:
cats.keys() - {'Kitty Purry'}

{'Bella', 'Chloe', 'Max', 'Oliver'}

So .keys() exists to return what is in effect a set, which can be used for normal set operations.
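
Here is a small sketch of a few more set operations on key views. The `dogs` dictionary and its counts are hypothetical and exist only to give us a second set of names to compare against.

```python
cats = {'Oliver': 5000, 'Max': 20000, 'Chloe': 15000}
dogs = {'Max': 30000, 'Bella': 25000, 'Charlie': 12000}  # hypothetical data

# Key views support the usual set operators and return plain sets.
shared_names = cats.keys() & dogs.keys()   # names used for both cats and dogs
all_names = cats.keys() | dogs.keys()      # every name in either dictionary
only_cats = cats.keys() - dogs.keys()      # names used for cats only

# Sets are unordered, so sort them for a stable printout.
print(sorted(shared_names))
print(sorted(all_names))
print(sorted(only_cats))
```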

<center>![Kitty Purry](http://i.perezhilton.com/wp-content/uploads/2012/01/katy-perrys-kitty-purrah__oPt.jpg "Kitty Purry")</center>

## List comprehension
Let's say we want to store the cat names in upper case.

In [16]:
# Let's get a list of cat names from the dictionary keys
cat_names = list(cats.keys())  # list() lets us index cat_names later; for plain iteration cats.keys() would be enough
cat_names

['Oliver', 'Max', 'Chloe', 'Bella', 'Kitty Purry']

In [17]:
cat_names_upper = [cat_name.upper() for cat_name in cat_names]
cat_names_upper

['OLIVER', 'MAX', 'CHLOE', 'BELLA', 'KITTY PURRY']

## Dictionary comprehension
Let's say we want to store the cat names in upper case and increment the number of cats by one.

In [18]:
new_cats = {cat_name.upper(): cat_count + 1 for cat_name, cat_count in cats.items()}
new_cats

{'BELLA': 10001,
 'CHLOE': 15001,
 'KITTY PURRY': 2,
 'MAX': 20001,
 'OLIVER': 5001}

In [19]:
# What if we only want actually common cat names?
common_cats = {cat_name: cat_count for cat_name, cat_count in cats.items() if cat_count > 1000}
common_cats

{'Bella': 10000, 'Chloe': 15000, 'Max': 20000, 'Oliver': 5000}
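
To make clear what the filtered comprehension is doing, here is a sketch of the same logic written as an explicit loop. The name `common_cats_loop` is introduced only for this comparison.

```python
cats = {'Oliver': 5000, 'Max': 20000, 'Chloe': 15000, 'Bella': 10000, 'Kitty Purry': 1}

# Explicit loop: start empty, test each item, add the ones that pass.
common_cats_loop = {}
for cat_name, cat_count in cats.items():
    if cat_count > 1000:
        common_cats_loop[cat_name] = cat_count

# The comprehension expresses the same three steps in a single expression.
common_cats = {cat_name: cat_count for cat_name, cat_count in cats.items() if cat_count > 1000}

print(common_cats_loop == common_cats)
```

Both forms build the same dictionary; the comprehension just removes the bookkeeping around the empty dictionary and the assignment.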

## Enumerate
What if we want to know which loop iteration we're currently on?

##### First attempt

In [20]:
# We could loop over the indices and look each name up
for i in range(len(cat_names)):
    print("Cat name: {}. Order in dictionary: {}".format(cat_names[i], i))

Cat name: Oliver. Order in dictionary: 0
Cat name: Max. Order in dictionary: 1
Cat name: Chloe. Order in dictionary: 2
Cat name: Bella. Order in dictionary: 3
Cat name: Kitty Purry. Order in dictionary: 4


##### Second attempt

In [21]:
i = 0
for cat in cat_names:
    print("Cat name: {}. Order in dictionary: {}".format(cat, i))
    i += 1

Cat name: Oliver. Order in dictionary: 0
Cat name: Max. Order in dictionary: 1
Cat name: Chloe. Order in dictionary: 2
Cat name: Bella. Order in dictionary: 3
Cat name: Kitty Purry. Order in dictionary: 4


##### Third time's the charm

In [22]:
# The second attempt needs i set up before the loop and incremented in its own statement; enumerate() does the counting for us.
for i, cat in enumerate(cats):
    print("Cat name: {}. Order in dictionary: {}".format(cat, i))

Cat name: Oliver. Order in dictionary: 0
Cat name: Max. Order in dictionary: 1
Cat name: Chloe. Order in dictionary: 2
Cat name: Bella. Order in dictionary: 3
Cat name: Kitty Purry. Order in dictionary: 4


In [23]:
list(enumerate(cats))

[(0, 'Oliver'), (1, 'Max'), (2, 'Chloe'), (3, 'Bella'), (4, 'Kitty Purry')]

This works for any kind of iterable, such as the lines of a file, not just for lists of strings or integers.
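
As a sketch of enumerate on file lines, the example below uses `io.StringIO` as a stand-in for an open text file (an assumption made so the example is self-contained). It also uses enumerate's `start` argument so that counting begins at 1.

```python
import io

# io.StringIO stands in for an open text file here; a real file object
# is iterated line by line in the same way.
fake_file = io.StringIO("Max\nChloe\nBella\n")

numbered_lines = []
# start=1 makes the count begin at 1, which suits human-readable line numbers.
for line_number, line in enumerate(fake_file, start=1):
    numbered_lines.append("{}: {}".format(line_number, line.strip()))

print(numbered_lines)
```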

## Zip
What if we have two lists? That's what we have zip for!

In [24]:
# Let's pretend we have two lists, and create them from the dictionary we already have
cats_names = list(cats.keys())
cats_counts = list(cats.values())
print(cats_names)
print(cats_counts)

['Oliver', 'Max', 'Chloe', 'Bella', 'Kitty Purry']
[5000, 20000, 15000, 10000, 1]


In [25]:
for cat_name, cat_count in zip(cats_names, cats_counts):
    print(cat_name, cat_count)

Oliver 5000
Max 20000
Chloe 15000
Bella 10000
Kitty Purry 1
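
One thing to be aware of: zip stops as soon as the shortest input runs out. The sketch below uses deliberately mismatched, made-up lists to show this, alongside `itertools.zip_longest`, which pads the shorter input instead.

```python
import itertools

cats_names = ['Oliver', 'Max', 'Chloe']
cats_counts = [5000, 20000]  # deliberately one item short

# zip silently drops 'Chloe' because cats_counts has run out.
short_pairs = list(zip(cats_names, cats_counts))

# zip_longest keeps going and fills the gaps with fillvalue.
padded_pairs = list(itertools.zip_longest(cats_names, cats_counts, fillvalue=0))

print(short_pairs)
print(padded_pairs)
```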


#### Sorting two lists

In [26]:
sorted_cat_names, sorted_cat_counts = zip(*sorted(zip(cats_names, cats_counts), key=lambda x: x[1]))

In [27]:
for cat_name, cat_count in zip(sorted_cat_names, sorted_cat_counts):
    print(cat_name, cat_count)

Kitty Purry 1
Oliver 5000
Bella 10000
Chloe 15000
Max 20000
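
The one-liner in In [26] packs three steps together. Here is a sketch that spells them out; the intermediate names `pairs` and `sorted_pairs` are introduced only for illustration.

```python
cats_names = ['Oliver', 'Max', 'Chloe', 'Bella', 'Kitty Purry']
cats_counts = [5000, 20000, 15000, 10000, 1]

# Step 1: zip the two lists into (name, count) pairs.
pairs = list(zip(cats_names, cats_counts))

# Step 2: sort the pairs by their second element, the count.
sorted_pairs = sorted(pairs, key=lambda pair: pair[1])

# Step 3: zip(*...) passes each pair as a separate argument to zip,
# which regroups them into one tuple of names and one tuple of counts.
sorted_cat_names, sorted_cat_counts = zip(*sorted_pairs)

print(sorted_cat_names)
print(sorted_cat_counts)
```

Note that the results are tuples rather than lists, since that is what zip yields.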


In [28]:
# What if we want it to be descending rather than ascending?
sorted_cat_names, sorted_cat_counts = zip(*sorted(zip(cats_names, cats_counts), key=lambda x: x[1], reverse=True))
for cat_name, cat_count in zip(sorted_cat_names, sorted_cat_counts):
    print(cat_name, cat_count)

Max 20000
Chloe 15000
Bella 10000
Oliver 5000
Kitty Purry 1


#### i and i+1 at the same time

In [29]:
for i in range(len(sorted_cat_names) - 1):
    print("Current: {}. Next: {}".format(sorted_cat_names[i], sorted_cat_names[i+1]))

Current: Max. Next: Chloe
Current: Chloe. Next: Bella
Current: Bella. Next: Oliver
Current: Oliver. Next: Kitty Purry


In [30]:
for current_cat, next_cat in zip(sorted_cat_names, sorted_cat_names[1:]):
    print("Current: {}. Next: {}".format(current_cat, next_cat))

Current: Max. Next: Chloe
Current: Chloe. Next: Bella
Current: Bella. Next: Oliver
Current: Oliver. Next: Kitty Purry
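
The slicing approach needs a sequence that supports `[1:]`. For iterables that cannot be sliced, such as generators, a sketch along the following lines should work. The helper name `current_and_next` is hypothetical; Python 3.10 and later ship a similar function as `itertools.pairwise`.

```python
import itertools

def current_and_next(iterable):
    """Yield (current, next) pairs from any iterable, without slicing."""
    current_items, next_items = itertools.tee(iterable)
    next(next_items, None)  # advance the second copy by one item
    return zip(current_items, next_items)

# A generator cannot be sliced, but the helper still handles it.
names = (name for name in ['Max', 'Chloe', 'Bella'])
name_pairs = list(current_and_next(names))
print(name_pairs)
```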


#### Finding min/max in an iterable

In [31]:
# Highest count of cats
print("Max:", max(cats.values()))

# Least common cat as a (name, count) pair: compare the items by their count
print("Min:", min(cats.items(), key=lambda x: x[1]))

Max: 20000
Min: ('Kitty Purry', 1)
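
If only the name is needed rather than the whole pair, one option is to iterate over the dictionary itself and use `cats.get` as the key function. This is a sketch of that variation, not something from the original talk.

```python
cats = {'Oliver': 5000, 'Max': 20000, 'Chloe': 15000, 'Bella': 10000, 'Kitty Purry': 1}

# max/min iterate over the keys, but compare them by the count
# that cats.get returns for each key.
most_common_name = max(cats, key=cats.get)
least_common_name = min(cats, key=cats.get)

print(most_common_name, least_common_name)
```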


## Itertools
Itertools is a part of the standard library that has a lot of functions for doing interesting custom iterations. These are just some basic examples; just be aware that it's there.

In [32]:
import itertools
import string
repeat_sequence = itertools.cycle(string.ascii_lowercase)
for letter, _ in zip(repeat_sequence, range(50)):
    print(letter, end=" ")

a b c d e f g h i j k l m n o p q r s t u v w x y z a b c d e f g h i j k l m n o p q r s t u v w x 

In [33]:
cat_sequence = itertools.cycle(cats)
# Cycle through the cat names; range(10) limits the otherwise endless sequence
for cat, _ in zip(cat_sequence, range(10)):
    print(cat, end=" ")

Oliver Max Chloe Bella Kitty Purry Oliver Max Chloe Bella Kitty Purry 

In [34]:
for cat in itertools.repeat(cat_names[0], 3):
    print(cat)

Oliver
Oliver
Oliver
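
Two more itertools functions that may be worth knowing, shown as a brief sketch: `islice` takes a fixed number of items from any iterable, which is an alternative to the zip(..., range(50)) trick used above, and `chain` strings several iterables together.

```python
import itertools
import string

# islice takes the first 30 items from the endless cycle of letters.
endless_letters = itertools.cycle(string.ascii_lowercase)
first_letters = list(itertools.islice(endless_letters, 30))
print(" ".join(first_letters))

# chain walks through one iterable after another as if they were one.
combined = list(itertools.chain(['Max', 'Chloe'], ['Bella']))
print(combined)
```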


## Creating your own stream

In [35]:
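# A generator that flattens a csv reader into a stream of individual cells
import csv
def csv_generator(csv_reader):
    for row in csv_reader:
        for cell in row:
            # yield hands back one cell at a time and resumes here on the next request
            yield cell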

In [36]:
with open('sources/tmdb_5000_movies.csv') as f:
    csv_reader = csv.reader(f)
    for cell, _ in zip(csv_generator(csv_reader), range(20)):
        print(cell, end=" ")

budget genres homepage id keywords original_language original_title overview popularity production_companies production_countries release_date revenue runtime spoken_languages status tagline title vote_average vote_count 

In [37]:
def odd_numbers(stream_of_numbers):
    for number in stream_of_numbers:
        if number % 2 == 1:
            yield number

In [38]:
for number in odd_numbers(range(50)):
    print(number, end=" ")

1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 
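
Because generators produce values only when asked, they can be stacked and fed an endless input. The sketch below repeats `odd_numbers` from above so that it runs on its own, and adds a second generator, `squares`, which is a hypothetical helper made up for this example.

```python
import itertools

def odd_numbers(stream_of_numbers):
    for number in stream_of_numbers:
        if number % 2 == 1:
            yield number

def squares(stream_of_numbers):
    # Hypothetical second stage: square whatever the previous stage yields.
    for number in stream_of_numbers:
        yield number * number

# itertools.count() never ends, but nothing is computed until we ask.
pipeline = squares(odd_numbers(itertools.count()))

# islice pulls just the first five results through the pipeline.
first_five = list(itertools.islice(pipeline, 5))
print(first_five)
```

Each stage only ever holds one value at a time, so the same pattern should work for inputs that are too large to fit in memory.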