## Iterators and Iterables

### Iteration with loops

Loops let us iterate over a sequence of values. Python provides:

    for loop
    while loop

Python has no built-in `do while` loop.
    
### Iterables

Iterables are objects that have an associated `__iter__()` method, which is what the built-in `iter()` function calls.

Examples: lists, strings, dictionaries, file connections

Applying `iter()` to an iterable creates an iterator.

### Iterator

An iterator produces its next value each time `next()` is called on it.



In [74]:
word = 'Data'
it = iter(word)
next(it)

'D'

In [75]:
next(it)

'a'

In [76]:
next(it)

't'
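Continuing the example above, a fourth call returns the last character, and any call after that raises `StopIteration`. This is the signal a `for` loop uses to know when to stop. A minimal sketch (the names `letters` and `exhausted` are only for illustration):

```python
word = 'Data'
it = iter(word)

# Consume all four characters one at a time
letters = [next(it), next(it), next(it), next(it)]
print(letters)  # ['D', 'a', 't', 'a']

# The iterator is now exhausted, so next() raises StopIteration
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
    print('iterator exhausted')

# next() also accepts a default value to return instead of raising
print(next(it, None))  # None
```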

### Iterating at once with *

In [79]:
word = 'Data'
it = iter(word)
print(*it)

D a t a


### Iterating over dictionaries

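To iterate over both the keys and the values of a dict, use the `.items()` method. (Iterating over the dict directly yields only its keys.)

The `.items()` method returns each key together with its value.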

In [82]:
pythonistas = {'hugo': 'bowne-anderson', 'francis': 'castro'}

for key, value in pythonistas.items():
    print(key,":", value)

hugo : bowne-anderson
francis : castro
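For comparison, here is a small sketch of the three ways to iterate over the same dict. The variable names `keys`, `values`, and `pairs` are illustrative only:

```python
pythonistas = {'hugo': 'bowne-anderson', 'francis': 'castro'}

# Iterating over the dict itself yields only the keys
keys = [key for key in pythonistas]
print(keys)  # ['hugo', 'francis']

# .values() yields only the values
values = list(pythonistas.values())
print(values)  # ['bowne-anderson', 'castro']

# .items() yields (key, value) tuples, which a for loop can unpack
pairs = list(pythonistas.items())
print(pairs)  # [('hugo', 'bowne-anderson'), ('francis', 'castro')]
```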


### Iterating over file connections

In [90]:
with open('text.txt') as file:
    # create an iterator over the lines of the file
    it = iter(file)

    # print the next three lines
    print(next(it))
    print(next(it))
    print(next(it))

Abebe beso bela

chala chube chebete

gomen betena
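The blank lines in the output above appear because each line read from the file keeps its trailing newline, and `print()` adds another. As a rough sketch, a `for` loop can also walk the file directly. Here `io.StringIO` is an assumed stand-in for `open('text.txt')`, with contents copied from the output above, and `lines` is an illustrative name:

```python
import io

# io.StringIO stands in for open('text.txt') so the sketch needs no file on disk
fake_file = io.StringIO('Abebe beso bela\nchala chube chebete\ngomen betena\n')

lines = []
with fake_file as file:
    # A file object is its own iterator; the for loop calls next() for us
    for line in file:
        # each line keeps its trailing newline, so strip it before storing
        lines.append(line.rstrip('\n'))

print(lines)  # ['Abebe beso bela', 'chala chube chebete', 'gomen betena']
```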


## Playing with iterators

### Using enumerate()

Takes any iterable object

Returns a special enumerate object

The enumerate object itself is iterable

In [92]:
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
e = enumerate(avengers)
print(type(e))

<class 'enumerate'>


In [94]:
e_list = list(e)
print(e_list)

[(0, 'hawkeye'), (1, 'iron man'), (2, 'thor'), (3, 'quicksilver')]
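One thing to keep in mind: an enumerate object is a one-shot iterator. Once it has been consumed, converting it to a list again gives an empty list. A minimal sketch (`first_pass` and `second_pass` are illustrative names):

```python
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
e = enumerate(avengers)

first_pass = list(e)   # consumes every (index, value) pair
second_pass = list(e)  # nothing is left to consume

print(first_pass)   # [(0, 'hawkeye'), (1, 'iron man'), (2, 'thor'), (3, 'quicksilver')]
print(second_pass)  # []
```

To go over the pairs again, call `enumerate(avengers)` a second time.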


### enumerate() and unpack

We can number elements while unpacking

In [96]:
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']

for index, value in enumerate(avengers):
    print(index, value)

0 hawkeye
1 iron man
2 thor
3 quicksilver


In [99]:
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
for index, value in enumerate(avengers, start=5):
    print(index, value)

5 hawkeye
6 iron man
7 thor
8 quicksilver


### Using zip()

Zipping two objects together creates a zip object.

A zip object is:

    an iterator of tuples
    
    convertible to a list with `list()`
    
A zip object is also iterable.

In [101]:
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']
z = zip(avengers, names)
print(type(z))

<class 'zip'>


In [102]:
print(z)

<zip object at 0x000001ECB1E03388>


In [104]:
# convert the object into a list to see what's in it

z_list = list(z)
print(z_list)

[('hawkeye', 'barton'), ('iron man', 'stark'), ('thor', 'odinson'), ('quicksilver', 'maximoff')]


### zip() and unpack

We can unpack a zip object using a for loop.

In [105]:
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']

for z1, z2 in zip(avengers, names):
    print(z1, z2)

hawkeye barton
iron man stark
thor odinson
quicksilver maximoff


### Print zip with *

In [106]:
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']
z = zip(avengers, names)
print(*z)

('hawkeye', 'barton') ('iron man', 'stark') ('thor', 'odinson') ('quicksilver', 'maximoff')
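Two further behaviours of `zip()` are worth a quick sketch: it stops at the shortest input, and `zip(*...)` reverses the operation. The shortened `names` list below is an assumption made purely for illustration:

```python
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson']  # one name short, for illustration

# zip() stops at the shortest input, so 'quicksilver' is dropped
pairs = list(zip(avengers, names))
print(pairs)  # [('hawkeye', 'barton'), ('iron man', 'stark'), ('thor', 'odinson')]

# zip(*pairs) "unzips" the pairs back into tuples of the original columns
heroes, surnames = zip(*pairs)
print(heroes)    # ('hawkeye', 'iron man', 'thor')
print(surnames)  # ('barton', 'stark', 'odinson')
```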


## Using iterators to load large files into memory

### Loading data in chunks using Iterators

There can be too much data to hold in memory at once.

`Solution:` load the data in chunks!

Pandas function: `read_csv()`

`Specify the chunk size`: `chunksize`

In [110]:
import pandas as pd
# running total of units sold across all chunks
total = 0

for chunk in pd.read_csv('avocados.csv', chunksize=10):
    
    total += sum(chunk['nb_sold'])
    
print(total)

1855050000.0
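To make the idea of chunked reading concrete without needing pandas or a file on disk, here is a rough standard-library analogue. It is not how pandas implements `chunksize`. The helper `read_in_chunks`, the in-memory `data`, and its values are all hypothetical:

```python
import csv
import io
from itertools import islice

# Hypothetical stand-in for a CSV file on disk
data = io.StringIO('nb_sold\n10\n20\n30\n40\n50\n')

def read_in_chunks(reader, size):
    """Yield lists of at most `size` rows until the reader is exhausted."""
    while True:
        chunk = list(islice(reader, size))
        if not chunk:
            return
        yield chunk

reader = csv.DictReader(data)
total = 0.0
chunk_sizes = []
for chunk in read_in_chunks(reader, size=2):
    # only one chunk of rows is held in memory at a time
    chunk_sizes.append(len(chunk))
    total += sum(float(row['nb_sold']) for row in chunk)

print(chunk_sizes)  # [2, 2, 1]
print(total)        # 150.0
```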


## List comprehensions

Collapse for loops for building lists into a single line

Components:
    
    Iterable
    
    Iterator variable (represents members of the iterable)
    
    Output expression

In [115]:
doctor = ['house', 'cuddy', 'chase', 'thirteen', 'wilson']

y=[]

for x in doctor:
    y.append(x[0])

print(y)

['h', 'c', 'c', 't', 'w']


In [118]:
# Using LC

result = [x[0] for x in doctor]
print(result)

['h', 'c', 'c', 't', 'w']


In [119]:
print([x[0] for x in doctor])

['h', 'c', 'c', 't', 'w']


**We want to build the following Multi Dimensional Matrix using loops**

In [120]:
matrix = [[0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4],
          [0, 1, 2, 3, 4]]

In [136]:
x = []

for i in range(0,5):
    x.append(i)
    
print(x)

[0, 1, 2, 3, 4]


### Nested Loops and Nested LC

We can nest LCs too; the main tradeoff is readability.

In [127]:
matrix = []
for i in range(0,5):
    # build a fresh row on each pass so the rows are independent lists
    row = []
    for j in range(0,5):
        row.append(j)
    matrix.append(row)

print(matrix)

[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]


In [134]:
# Create a 5 x 2 matrix (5 rows, 2 columns) using a list of lists: row_lc
row_lc = [[col for col in range(2)] for row in range(5)]

print(row_lc)

[[0, 1], [0, 1], [0, 1], [0, 1], [0, 1]]


In a nested LC, the inner LC (the output expression on the left) builds one row, and the outer `for row in range(5)` repeats it once per row.
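As a sketch of how the two forms line up, the nested loops below build the same list of lists as the nested LC. The names `rows_loop`, `rows_lc`, and `columns` are illustrative:

```python
# Nested for loops: the outer loop runs once per row,
# the inner loop fills in that row's columns
rows_loop = []
for row in range(5):
    columns = []
    for col in range(2):
        columns.append(col)
    rows_loop.append(columns)

# The same result as a nested list comprehension
rows_lc = [[col for col in range(2)] for row in range(5)]

print(rows_loop == rows_lc)  # True
```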

In [135]:
column = [col for col in range(4)]
print(column)

[0, 1, 2, 3]


In [138]:
matrix = [column for row in range(4)]
print(matrix)

[[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]]
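One thing to watch for when building a matrix from an existing list: reusing the same list object for every row, as in `[column for row in range(4)]`, makes all rows aliases of one another. A small sketch of the difference, using the illustrative names `shared` and `independent`:

```python
column = [col for col in range(4)]

shared = [column for row in range(4)]             # four references to one list
independent = [list(column) for row in range(4)]  # four separate copies

shared[0][0] = 99
independent[0][0] = 99

print(shared[1][0])       # 99, because every row is the same object
print(independent[1][0])  # 0, because only the first copy changed
```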


In [142]:
# Create a list of strings: lannister
lannister = ['cersei', 'jaime', 'tywin', 'tyrion', 'joffrey']

# Create a generator object: lengths
lengths = (len(person) for person in lannister)

# Iterate over and print the values in lengths
print(*lengths)

6 5 5 6 7


## Advanced comprehensions

### Conditionals in comprehensions

We can also add conditional statements to our LC's!

We can place the conditional statement for LC's in two different places:

    1. on the iterable 
    
    2. on the output

In [140]:
# Conditionals on the iterable

[num ** 2 for num in range(10) if num % 2 == 0]

[0, 4, 16, 36, 64]

In [141]:
# Conditionals on the output expression

[num ** 2 if num % 2 == 0 else 0 for num in range(10)]

[0, 0, 4, 0, 16, 0, 36, 0, 64, 0]
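The two placements can also be combined in one comprehension. A minimal sketch (the `labels` name and the choice of conditions are just for illustration):

```python
# Filter on the iterable (keep multiples of 3) and
# branch on the output (label each kept number as even or odd)
labels = ['even' if num % 2 == 0 else 'odd' for num in range(10) if num % 3 == 0]
print(labels)  # ['even', 'odd', 'even', 'odd']
```

The trailing `if` decides which numbers are kept at all (0, 3, 6, 9), while the `if ... else` in front decides what value each kept number produces.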

## Dict comprehensions

Unlike list comprehensions, DCs:

    Create dictionaries
    
    Use curly braces {} instead of brackets []

In [143]:
pos_neg = {num: -num for num in range(9)}
print(pos_neg)

{0: 0, 1: -1, 2: -2, 3: -3, 4: -4, 5: -5, 6: -6, 7: -7, 8: -8}
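Dict comprehensions accept the same conditionals as list comprehensions. A short sketch, reusing the `avengers` list from earlier and an illustrative `name_lengths` dict:

```python
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']

# Map each name to its length, keeping only names longer than 4 characters
name_lengths = {name: len(name) for name in avengers if len(name) > 4}
print(name_lengths)  # {'hawkeye': 7, 'iron man': 8, 'quicksilver': 11}
```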


## Introduction to generator expressions

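### List comprehensions vs. generators (Tuple Comprehension TC)

Note: despite the nickname, Python has no real tuple comprehension. Wrapping a comprehension in parentheses creates a generator expression, abbreviated TC in these notes.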

List comprehension(LC):

    returns a list
    
    stores the values
    
Dict comprehension (DC):

    returns a dict
    
    stores the values
    
Generators (TC):

    returns a generator object
    
    doesn't store the values in memory; produces them one at a time
    
All of them (LC, DC, TC) can be iterated over!!
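The storage difference can be checked roughly with `sys.getsizeof`. This is only a sketch: `n` is an arbitrary size, and the exact byte counts depend on the Python version, so none are shown:

```python
import sys

n = 100_000  # arbitrary size chosen for illustration

squares_list = [num ** 2 for num in range(n)]  # builds all values up front
squares_gen = (num ** 2 for num in range(n))   # builds values on demand

# getsizeof reports the container's own size, not the integers it refers to
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# Both can be consumed the same way
print(sum(squares_gen) == sum(squares_list))  # True
```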

### Creating the TC (Generator)

In [145]:
result = (num for num in range(6))

type(result)

generator

In [146]:
print(result)

<generator object <genexpr> at 0x000001ECB46C4EC8>


In [152]:
print(*result)

0 1 2 3 4 5


In [153]:
# also possible to iterate over the Generator
result = (num for num in range(6))

for num in result:
    print(num)

0
1
2
3
4
5


## Why are generators so cool?

Generators don't STORE their values in MEMORY!

They produce each value on the fly, only when it is requested.

Printing a generator doesn't show its contents, so to inspect the values we can take a subset of the generator and convert it to a list.

In [154]:
r = (num for num in range(10**100000))

In [156]:
print(r)

<generator object <genexpr> at 0x000001ECB46C4F48>


### Subsetting our Mega Generator

In [175]:
subset = (num for num in range(10**4) if 999 < num < 1005)

print(subset)

<generator object <genexpr> at 0x000001ECB47128C8>


In [176]:
print(list(subset))

[1000, 1001, 1002, 1003, 1004]
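Filtering with a condition still walks the whole range. For a truly huge generator, one alternative is `itertools.islice`, which takes the first few values and then stops. A sketch under that assumption (`first_five` is an illustrative name):

```python
from itertools import islice

# A generator over a very large range; nothing is computed yet
r = (num for num in range(10**100))

# islice pulls only the first five values and then stops
first_five = list(islice(r, 5))
print(first_five)  # [0, 1, 2, 3, 4]

# The generator remembers its position, so the next value is 5
print(next(r))  # 5
```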


## Generator functions

Produces generator objects when called

Defined like a regular function - `def`

Yields a `sequence of values` instead of returning a single value

Generates a value with `yield` keyword

In [181]:
def num_sequence(n):
    
    """Generate values from 0 up to, but not including, n."""
    
    i = 0
    
    while i < n:
        yield i
        i += 1

In [182]:
result = num_sequence(5)
print(type(result))

<class 'generator'>


In [183]:
print(*result)

0 1 2 3 4
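To see how `yield` pauses and resumes the function, the values can be pulled one at a time with `next()`. A minimal sketch that redefines `num_sequence` so it runs on its own (`gen`, `first`, `second`, and `rest` are illustrative names):

```python
def num_sequence(n):
    """Generate values from 0 up to, but not including, n."""
    i = 0
    while i < n:
        yield i   # pause here and hand i to the caller
        i += 1    # resume from this line on the next request

gen = num_sequence(3)

first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes, increments i, yields again
rest = list(gen)    # drains whatever is left

print(first, second, rest)  # 0 1 [2]
```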


## Re-cap: list comprehensions

### Basic

[output expression for iterator variable in iterable]

### Advanced

[output expression + conditional on output for iterator variable in iterable + conditional on iterable]