
## Looping "Under the Hood"

In Python, any object that can hand back its elements one at a time is *iterable*, which means you can loop over it with `for`.

### Types that are Iterable

Some examples of iterable objects are:

In [None]:
# strings
for ch in "Oscar":
    print(ch," ",end="")
print() 

O  s  c  a  r  


In [None]:
# lists
for ele in ["Charlie", "Brown"]:
    print(ele)

Charlie
Brown


In [None]:
# tuples
for n in (1, 2, 3):
    print(n)

1
2
3


In [None]:
# sets (iteration order is arbitrary, so your output order may differ)
for ele in {'Abraham', 'Isaac', 'Jacob'}:
    print(ele)

Abraham
Isaac
Jacob


In [None]:
# range objects
for i in range(5):
    print(i)

0
1
2
3
4


In [None]:
# dictionaries
d = {'x':1, 'y':2}
for k, v in d.items():
    print(k, v)

x 1
y 2


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
# a file object
with open('/content/drive/MyDrive/Colab Notebooks/llama.txt', 'r') as f:
   for line in f:
       print(line,end="")

The one-l lama,
He's a priest.
The two-l llama,
He's a beast;
And I will bet 
A silk pajama
There isn't any 
Three-l lllama.

### But How Does Looping Actually Work? 

A for loop works like this:

1. it calls `iter()` on the object it's looping over
2. `iter()` gives back an iterator
3. the for loop repeatedly calls `next()` on that iterator, running the loop body once per element
4. when `next()` raises a `StopIteration` exception, the loop ends
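The steps above can be sketched as a hand-written `while` loop. This is only an approximation of what the interpreter does, and the names `artists_demo`, `iterator`, and `seen` are made up for this illustration:

```python
artists_demo = ["Matisse", "Picasso", "O'Keeffe"]
seen = []

iterator = iter(artists_demo)       # steps 1 and 2: ask the list for an iterator
while True:
    try:
        element = next(iterator)    # step 3: fetch the next element
    except StopIteration:           # step 4: no elements left, so stop looping
        break
    seen.append(element)            # this line plays the role of the loop body

print(seen)
```

Writing `for element in artists_demo: seen.append(element)` would produce the same list.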

The looping "interface" consists of a couple of methods:

1. `__iter__` on a container object; it returns an object that implements `__next__`
2. `__next__` on an iterator object; it gives back the next element, or raises `StopIteration` when there are none left

Note: the container and the iterator can be the *same* object, which is why the body of `__iter__` can simply `return self`.

Some definitions:

* an iterable is an object that can give back an iterator (it implements `__iter__`)
* an iterator is an object that implements `__next__` (and, by convention, an `__iter__` that returns itself)

Some built in functions:

* `iter(obj)` - will cause `obj`'s `__iter__` method to be called
* `next(obj)` - will cause `obj`'s `__next__` method to be called
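A quick way to see the difference between the two definitions is to check which methods each object has. This is a small sketch; `sample` and `sample_iterator` are names invented for the example:

```python
sample = ["Matisse", "Picasso"]
sample_iterator = iter(sample)

# A list is iterable, but it is not an iterator.
list_has_iter = hasattr(sample, "__iter__")
list_has_next = hasattr(sample, "__next__")

# The object returned by iter() has both methods.
iterator_has_next = hasattr(sample_iterator, "__next__")

# Calling iter() on an iterator gives back the very same object.
same_object = iter(sample_iterator) is sample_iterator

print(list_has_iter, list_has_next, iterator_has_next, same_object)
```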

### Repeatedly Calling Next on a List

This is _kind_ of how looping works. Let's start off with a list:

In [None]:
artists = ["Matisse","Picasso","O'Keeffe", "Cassatt","Renoir"]

In [None]:
# can we iterate with next?
try:
  next(artists)
except TypeError as e:
  print("TypeError:",e)  

TypeError: 'list' object is not an iterator


A list is iterable, but it is not itself an iterator. To step through it by hand we first need to get an iterator from it with `iter()`.

In [None]:
my_iterator = iter(artists)
print(artists)
print(my_iterator)


['Matisse', 'Picasso', "O'Keeffe", 'Cassatt', 'Renoir']
<list_iterator object at 0x7f5e162605d0>


Then repeatedly call `next()` on the iterator to get each element in turn:

In [None]:
next(my_iterator)

'Matisse'

In [None]:
next(my_iterator)

'Picasso'

In [None]:
next(my_iterator)

"O'Keeffe"

In [None]:
next(my_iterator)

'Cassatt'

In [None]:
next(my_iterator)

'Renoir'

In [None]:
# once every element has been consumed, the next call raises StopIteration
try:
  next(my_iterator)
except StopIteration:
  print("StopIteration")

StopIteration


### Our Own Iterable...

In [None]:
class Countdown:
    def __init__(self, start):
        self.cur = start
    def __iter__(self):
        return self
    def __next__(self):
        ret = self.cur
        if self.cur > 0:
            self.cur -= 1
        else:
            raise StopIteration
        return ret

In [None]:
c = Countdown(5)

In [None]:
for i in c:
    print(i)
try:
  next(c) 
except StopIteration as e:
  print("StopIteration")    

5
4
3
2
1
StopIteration


In [None]:
c=Countdown(5)
print(next(c))
print(next(c))
print(next(c))
print(next(c))
print(next(c))
try:
  next(c) 
except StopIteration as e:
  print("StopIteration")

5
4
3
2
1
StopIteration
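Because `Countdown` returns `self` from `__iter__`, it is both the iterable and the iterator, so it can only be looped over once. One possible alternative, sketched below, is to split the two roles so that each call to `iter()` hands out a fresh iterator. The class names `CountdownIterator` and `ReusableCountdown` are hypothetical and not part of the original notebook:

```python
class CountdownIterator:
    """Iterator: remembers the position of a single pass."""
    def __init__(self, start):
        self.cur = start
    def __iter__(self):
        return self
    def __next__(self):
        if self.cur <= 0:
            raise StopIteration
        ret = self.cur
        self.cur -= 1
        return ret

class ReusableCountdown:
    """Iterable: creates a new iterator every time iter() is called."""
    def __init__(self, start):
        self.start = start
    def __iter__(self):
        return CountdownIterator(self.start)

rc = ReusableCountdown(3)
first_pass = list(rc)
second_pass = list(rc)          # works again, because a new iterator is created

one_shot = CountdownIterator(3)
one_shot_first = list(one_shot)
one_shot_second = list(one_shot)  # already exhausted, so nothing is left

print(first_pass, second_pass)
print(one_shot_first, one_shot_second)
```

Built-in containers such as lists follow the same split, which is why the same list can be looped over many times.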


Ed Exercise: 

Create a "clock iterator" which adds one to the hours of a clock. Use times between 0 and 23, where 0 is midnight and 12 is noon; 23+1 wraps around to 0. You can set the initial time to any integer between 0 and 23. (We are not supporting minutes or seconds.) Since this iterator loops around, it should never raise the `StopIteration` exception.

In [None]:
#@title Clock Iterator { display-mode: "form" }
class Clock:
    def __init__(self, initTime):
        # wrap the starting value into the valid range 0..23
        self.currentTime = initTime % 24
    def __iter__(self):
        return self
    def __next__(self):
        ret = self.currentTime
        # 23 + 1 wraps around to 0, so StopIteration is never raised
        self.currentTime = (self.currentTime + 1) % 24
        return ret

In [None]:
c=Clock(21)
print(next(c))
print(next(c))
print(next(c))
print(next(c))
print(next(c))
print(next(c))


21
22
23
0
1
2


In [None]:
# dangerous to use in a for loop: the iterator never raises StopIteration,
# so the loop is infinite unless we break out of it ourselves
c=Clock(10)
i=0
for hour in c:
  print(hour," ",end="")
  i+=1
  if (i%20==0): print()
  if (i==300): break

10  11  12  13  14  15  16  17  18  19  20  21  22  23  0  1  2  3  4  5  
6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  0  1  
2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  
22  23  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  
18  19  20  21  22  23  0  1  2  3  4  5  6  7  8  9  10  11  12  13  
14  15  16  17  18  19  20  21  22  23  0  1  2  3  4  5  6  7  8  9  
10  11  12  13  14  15  16  17  18  19  20  21  22  23  0  1  2  3  4  5  
6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  0  1  
2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  
22  23  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  
18  19  20  21  22  23  0  1  2  3  4  5  6  7  8  9  10  11  12  13  
14  15  16  17  18  19  20  21  22  23  0  1  2  3  4  5  6  7  8  9  
10  11  12  13  14  15  16  17  18  19  20  21  22  23  0  1  2  3  4  5  
6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  0  1  
2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  
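Instead of counting and breaking by hand, an infinite iterator can be capped with `itertools.islice` from the standard library. The sketch below uses a condensed stand-in class, `MiniClock`, a hypothetical name chosen so the example runs on its own; the `Clock` class above should work the same way:

```python
from itertools import islice

class MiniClock:
    """Condensed stand-in for the Clock iterator (hypothetical name)."""
    def __init__(self, start):
        self.current = start
    def __iter__(self):
        return self
    def __next__(self):
        ret = self.current
        self.current = (self.current + 1) % 24  # wrap around after 23
        return ret

# islice stops after 6 elements, so the loop cannot run forever
first_six = list(islice(MiniClock(21), 6))
print(first_six)
```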

## Using Sorted; a Function as an Argument

In [None]:
sorted(artists)

['Cassatt', 'Matisse', "O'Keeffe", 'Picasso', 'Renoir']

In [None]:
artists

['Matisse', 'Picasso', "O'Keeffe", 'Cassatt', 'Renoir']

In [None]:
# sort by length of word (note that len, a function, is passed in as the key keyword argument!)
sorted(artists, key=len)

['Renoir', 'Matisse', 'Picasso', 'Cassatt', "O'Keeffe"]

In [None]:
# sort by the last character in each name
sorted(artists, key=lambda s: s[-1])


['Matisse', "O'Keeffe", 'Picasso', 'Renoir', 'Cassatt']
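In the results above, names that tie on the key keep their original relative order, because `sorted` is stable. If ties should be broken some other way, one option is to have the key function return a tuple. The sketch below repeats the list under the made-up name `names` so it runs on its own, and also shows the `reverse` keyword:

```python
names = ["Matisse", "Picasso", "O'Keeffe", "Cassatt", "Renoir"]

# sort by length first, then alphabetically among names of equal length
by_len_then_alpha = sorted(names, key=lambda s: (len(s), s))

# longest names first; ties still keep their original order
longest_first = sorted(names, key=len, reverse=True)

print(by_len_then_alpha)
print(longest_first)
```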

## List, Set, Dictionary Comprehensions

In [None]:
# a list comprehension
s_free_artists = [name.upper() for name in artists if 's' not in name]
s_free_artists

["O'KEEFFE", 'RENOIR']
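For comparison, the list comprehension above can be written out as an ordinary loop. This is a sketch with the list repeated under the made-up name `painters` so that it runs on its own:

```python
painters = ["Matisse", "Picasso", "O'Keeffe", "Cassatt", "Renoir"]

s_free_loop = []
for name in painters:
    if 's' not in name:                   # the "if" part of the comprehension
        s_free_loop.append(name.upper())  # the expression at the front

print(s_free_loop)
```

The comprehension packs the loop, the filter, and the `append` into a single expression.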

In [None]:
numbers = [1, 2, 2, 3, 3, 3]

In [None]:
# set comprehension
number_set={n * n for n in numbers}
print(number_set)
print(type(number_set))

{1, 4, 9}
<class 'set'>


In [None]:
print(list(enumerate("banana")))

[(0, 'b'), (1, 'a'), (2, 'n'), (3, 'a'), (4, 'n'), (5, 'a')]


In [None]:
# dictionary comprehension
# note that a repeated key is overwritten, so each letter keeps its last index
d={ch: i for i, ch in enumerate('banana')}
print(type(d))
print(d)

<class 'dict'>
{'b': 0, 'a': 5, 'n': 4}
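If every position of each letter should be kept rather than just the last one, one possible approach is to nest a list comprehension inside the dictionary comprehension. The names `word` and `positions` are invented for this sketch:

```python
word = "banana"

# for each distinct letter, collect every index where it appears
positions = {ch: [i for i, c in enumerate(word) if c == ch] for ch in set(word)}

print(positions)
```

The order of the keys may vary from run to run, because they come from a set.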


Ed exercise: write a list comprehension that returns a list of all the vowels in the string "The rain in Spain is mainly on the plain." (duplicates are OK). Then write a set comprehension for the same thing, but without the duplicates.

In [None]:
[c for c in "The rain in Spain is mainly on the plain" if c.lower() in "aeiou"]

['e', 'a', 'i', 'i', 'a', 'i', 'i', 'a', 'i', 'o', 'e', 'a', 'i']

In [None]:
{c for c in "The rain in Spain is mainly on the plain" if c.lower() in "aeiou"}

{'a', 'e', 'i', 'o'}