# <div style="text-align:center;">Iterables, Iterators, Generators</div>
<div style="text-align:center;">4/14/24</div>

<u>General definitions:</u>
- **Iteration** ==> a general term for taking each item of something, one after another. Any time you use a loop, explicit or implicit, to go over a group of items, that is an iteration
- **Iterator** ==> an object that lets the programmer traverse a sequence of data one item at a time, **without having to store the entire data in memory**
- **Iterable** ==> an object that can be iterated over

<div style="text-align:center;">
    <img src="https://miro.medium.com/v2/resize:fit:786/format:webp/1*9l5bpqRGEa4LNLg3WTISPg.png" style="float:left; width:300px; margin right:10px;"/>
</div>


## **Point to remember:**
- every **iterator** is also an **iterable**
- not all **iterables** are **iterators** 

## Trick 
- every **iterable** has an **iter function** (listed by dir() as `__iter__`)
- every **iterator** has both an **iter function** and a **next function** (listed by dir() as `__iter__` and `__next__`)

In [51]:
#Iterable:
# use for loop or dir() to see if an object is iterable
# or not 
# if dir() shows the iter function, then it is an iterable

a = 2 
L = [1,3,5]
T = (1,3,4)
D = {2:3,5:3}

# print(dir(a))
print(dir(L))

['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getstate__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']


In [49]:
# Iterator:
# use dir(): both __iter__ and __next__ should be listed

# L is not an iterator 
dir(L) # it only has __iter__, not __next__

#iter_L is an iterator 
iter_L = iter(L)
dir(iter_L)

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__length_hint__',
 '__lt__',
 '__ne__',
 '__new__',
 '__next__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']
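Reading through dir() output works, but the same check can be sketched more compactly with the abstract base classes in collections.abc. This is only an illustration; the variable names are made up for this example, and isinstance() against Iterable only detects objects that define `__iter__`.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 3, 5]           # a list: iterable, but not an iterator
numbers_iter = iter(numbers)  # the iterator that iter() returns for the list

# Iterable looks for __iter__; Iterator looks for __iter__ and __next__
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)
iter_is_iterable = isinstance(numbers_iter, Iterable)
iter_is_iterator = isinstance(numbers_iter, Iterator)
int_is_iterable = isinstance(2, Iterable)

print(list_is_iterable, list_is_iterator)  # True False
print(iter_is_iterable, iter_is_iterator)  # True True
print(int_is_iterable)                     # False
```

The output matches the two points to remember: the iterator passes both checks, the list passes only the iterable check.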

## Iterable 
- contains **multiple** data values that are stored and accessed in a specific way
- a data type that can be iterated over (or stepped through one element at a time)
- the built-in iterables listed below support the **in** operator to check for the existence of an element and the **len** function to get the number of elements
- because such an iterable holds multiple data values, it is also called a container

Common built-in iterables:
- list --> mutable
- str --> immutable 
- tuple --> immutable
- set --> mutable 
- dictionary --> mutable 
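A small sketch of what the list above means in practice, using a list and a tuple as stand-ins for the mutable and immutable groups (the variable names are made up for this example):

```python
items_list = [1, 3, 5]
items_tuple = (1, 3, 5)

# Both containers support membership tests and len()
print(3 in items_list, len(items_list))    # True 3
print(4 in items_tuple, len(items_tuple))  # False 3

# A mutable container can be changed in place
items_list[0] = 99
print(items_list)  # [99, 3, 5]

# An immutable container raises TypeError on item assignment
try:
    items_tuple[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False
print(tuple_changed)  # False
```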

## Comprehension 
- should be used when an iterable is being created by adding one data element at a time to the iterable 

### 1. <u>List comprehension:</u>

[expression **for** item **in** iterable if condition]

### 2. <u>Set comprehension:</u>

{expression **for** item **in** iterable if condition}

### 3. <u>Dictionary comprehension:</u>

{ key_expression : value_expression **for** key, value **in** iterable if condition}

{ key_expression : value_expression **for** key **in** iterable if condition}
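The three notations can be tried on a small data set. The word list and the prices dictionary below are made up for illustration only:

```python
words = ["apple", "kiwi", "fig", "apple", "plum"]

# List comprehension: lengths of the words longer than 3 characters
long_lengths = [len(w) for w in words if len(w) > 3]

# Set comprehension: unique first letters
first_letters = {w[0] for w in words}

# Dictionary comprehension: word -> length (duplicates collapse to one key)
word_lengths = {w: len(w) for w in words}

# Dictionary comprehension over key/value pairs: swap keys and values
prices = {"tea": 3, "coffee": 4}
by_price = {price: name for name, price in prices.items()}

print(long_lengths)  # [5, 4, 5, 4]
print(word_lengths)  # {'apple': 5, 'kiwi': 4, 'fig': 3, 'plum': 4}
print(by_price)      # {3: 'tea', 4: 'coffee'}
```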

### 2D comprehension to create a _list_ from a table:

In [None]:
[expression(item) for row in table for item in row]
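As a concrete sketch of this flattening form, here the expression is chosen (arbitrarily, for illustration) to be the square of each item. The loops read left to right: rows first, then the items in each row.

```python
table = [[1, -3, 5, 6], [2, 3, -1, 9], [2, 8, -4, -8]]

# Outer loop first (rows), then inner loop (items): one flat list
flat_squares = [n * n for row in table for n in row]
print(flat_squares)
# [1, 9, 25, 36, 4, 9, 1, 81, 4, 64, 16, 64]
```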

### 2D comprehension to create a _table_ from another table:

In [None]:
[[expression(item) for item in row] for row in table]

In [1]:
table = [[1,-3,5,6],[2,3,-1,9],[2,8,-4,-8]]
pos = [[ n for n in row if n>0] for row in table]
print(pos)

[[1, 5, 6], [2, 3, 9], [2, 8]]


## a) Collections: Default Dictionary 
- with a regular dictionary, if we're not sure that a key exists, we need to use the get method to avoid a KeyError
- a default dict automatically fills in a default value if the key doesn't exist
- part of the collections module, which we need to import

In [None]:
# Notation:
val = myDict.get(aKey,defaultValue)

In [None]:
import collections 
myDefaultDict = collections.defaultdict(default_data_type)

The default value depends on the data type:
1. int --> 0
2. float --> 0.0
3. sequence (such as list or str) --> empty sequence 

In [3]:
# TO COUNT CHARACTERS IN A GIVEN STRING:
import collections
myStr = "what is going on everybodyyy"
letterCount = collections.defaultdict(int)
for char in myStr:
    letterCount[char] += 1 

dict(letterCount)

{'w': 1,
 'h': 1,
 'a': 1,
 't': 1,
 ' ': 4,
 'i': 2,
 's': 1,
 'g': 2,
 'o': 3,
 'n': 2,
 'e': 2,
 'v': 1,
 'r': 1,
 'y': 4,
 'b': 1,
 'd': 1}
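The sequence case works the same way as the int case above. As a sketch (the word list is made up for this example), defaultdict(list) can group words by first letter, and the same result can be built with a regular dict and get():

```python
import collections

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# defaultdict(list): a missing key starts out as an empty list
by_letter = collections.defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# The same grouping with a plain dict and get()
by_letter_plain = {}
for word in words:
    by_letter_plain[word[0]] = by_letter_plain.get(word[0], []) + [word]

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
print(dict(by_letter) == by_letter_plain)  # True
```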

## b) Collections: Counter Class
- provides an easier way to count

In [9]:
# USE COUNTER FOR A FASTER WAY

c = collections.Counter(iterable_object)

# to add another iterable to an existing counter
# use update(), which adds to the existing counts
# (a dict's update() would replace the values instead):
c.update(another_iterable_object)

# to print the count of occurrences:
for key,count in c.items():
    print(key, count)

# to print the count in sorted order, highest to lowest
# we can use the sorted function or:
for key,count in c.most_common():
    print(key,count)

In [11]:
c = collections.Counter("programmer")
print(c.most_common())

[('r', 3), ('m', 2), ('p', 1), ('o', 1), ('g', 1), ('a', 1), ('e', 1)]
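A short sketch of update() and most_common() on concrete data. The vote lists are invented for this example:

```python
import collections

votes = collections.Counter(["red", "blue", "red"])

# update() adds counts from another iterable to the existing totals
votes.update(["blue", "green", "red"])

print(votes["red"])          # 3
print(votes["purple"])       # 0: a missing key counts as zero
print(votes.most_common(2))  # [('red', 3), ('blue', 2)]
```

Passing a number to most_common() limits the result to that many of the highest counts.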


## Unpacking operator

When we use the * operator:
1. in front of a sequence data type (such as a list or tuple)
2. and the sequence is used in RHS context (i.e. it provides the data)
3. and the LHS requires multiple individual data values

Then Python will interpret the * as the <u>unpacking</u> operator. 

In [14]:
L = [8,9,0]

# the * will be interpreted as the unpacking operator
# it unpacks a sequence of data into individual data values
print(L)
print(*L)
print(8,9,0)

[8, 9, 0]
8 9 0
8 9 0
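print() is only one place where this applies. As a sketch, the same operator can feed a list into any function that expects separate arguments; the volume function below is a made-up example, not part of these notes:

```python
def volume(length, width, height):
    """Return the volume of a box."""
    return length * width * height

dims = [2, 3, 4]

# *dims passes the three list elements as three separate arguments
box = volume(*dims)
print(box)  # 24

# * also unpacks inside a list literal, here joining two sequences
combined = [*dims, *(5, 6)]
print(combined)  # [2, 3, 4, 5, 6]
```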


## Packing operator
When we use the * operator:
1. in front of a sequence data type (such as a list or tuple)
2. and the sequence is being used in LHS context (i.e. it receives data)
3. and the RHS is made of multiple individual data values

Then Python will interpret the * as the <u>packing</u> operator

- the packing operator groups together or packs individual data values into a sequence

In [16]:
L = [8,3,0,2,7]

#var1 --> 8
#var2 --> 3
#theRest is [0,2,7]
(var1, var2, *theRest) = L

#Error! (ValueError: too many values to unpack)
# without the packing operator there are
# too few LHS variables
(var1, var2, theRest) = L
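Packing also shows up in function definitions, where a starred parameter receives all positional arguments as a tuple. The sketch below uses a made-up average function, and also shows that the starred name does not have to come last in an assignment:

```python
def average(*values):
    """values is a tuple holding every positional argument."""
    return sum(values) / len(values)

print(average(8, 3, 0, 2, 7))  # 4.0

# The starred name may appear anywhere on the left-hand side
first, *middle, last = [8, 3, 0, 2, 7]
print(first, middle, last)     # 8 [3, 0, 2] 7
```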

<div style="text-align:center;">
    <img src="https://prepinstadotcom.s3.ap-south-1.amazonaws.com/wp-content/uploads/2021/02/Packing-unpacking.png" style="width:400px; height:400px; margin: 0 auto;"/>
</div>


## Iterable Functions
- all()
- any()
- enumerate()
- len()
- max(), min()
- reversed()
- sorted()
- sum()
- zip()

### 1. all() 
- return True if every element evaluates to true or if the iterable is empty
- return False otherwise

In [None]:
# Notation:
all(elem % 2 for elem in iterable)

### 2. any()
- return True if at least one element evaluates to true
- return False otherwise or if the iterable is empty

In [None]:
# Notation:
any(elem > 0 for elem in iterable)

In [18]:
L = [8, 4, 3, -2, 0, 2, 11, -7, 6]
T = tuple()
if any(elem < 0 for elem in L) : # True
    print("Negative value detected in L")
if any(elem > 100 for elem in L) : # False
    print("At least 1 value larger than 100")
if all(elem > 0 for elem in L) : # False
    print("All positive values")
if all(elem != 0 for elem in T) : # True
    print("All non-zero in T")

Negative value detected in L
All non-zero in T


### 3. enumerate()
- used with **for in** to get a **count** and **the element** of an iterable 

In [None]:
# Notation:
for count, elem in enumerate(iterable, start=n):
    # code
    # count starts at n and counts up 

In [20]:
T = ("A", "B", "C")
for num, letter in enumerate(T, 65) :
    print(num, letter)
print()
for num, letter in enumerate(T) :
    print(num, letter)

65 A
66 B
67 C

0 A
1 B
2 C


#### len()
- return the number of elements of the iterable

In [None]:
# Notation:
len(iterable)

### 4. max() & min()
- return the max or the min value of the iterable

In [None]:
# Notation:
max(iterable)
min(iterable)

### 5. reversed()
- return an iterator in reversed order

In [None]:
# Notation:
for elem in reversed(iterable):
    # iterate from the end to the front of iterable 
    # to fetch elem

### 6. sorted() 
- return a new list with the elements of the iterable in sorted order 

In [None]:
# Notation:
sorted(iterable) # ascending 
sorted(iterable, reverse=True) # descending

### 7. sum()
- for an iterable of numbers: return the total of all element values

In [None]:
# Notation:
sum(iterable,n)
# return: sum of all elements + n 
# n defaults to 0 
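The notations for len(), max(), min(), reversed(), sorted(), and sum() can be tried together on one list. The scores list is made up for this sketch:

```python
scores = [72, 95, 60, 88]

print(len(scores))                   # 4
print(max(scores), min(scores))      # 95 60
print(list(reversed(scores)))        # [88, 60, 95, 72]
print(sorted(scores))                # [60, 72, 88, 95]
print(sorted(scores, reverse=True))  # [95, 88, 72, 60]
print(sum(scores), sum(scores, 10))  # 315 325
print(scores)                        # [72, 95, 60, 88], unchanged
```

None of these functions modifies the original list. reversed() is wrapped in list() here because it returns an iterator rather than a list.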

### 8. zip()
- returns an iterator of tuples

In [None]:
# Notation:
for t in zip(seq1, seq2, seq3):
    # t is a tuple of corresponding elements 
    # from seq1, seq2, seq3
    # number of iterations = length of the shortest sequence 

In [23]:
L1 = (1, 3, 5, 7, 9)
L2 = (2, 4, 6, 8)
L3 = (100, 200, 300, 400, 500)
for elem in zip(L1, L3) :
    print(elem)
D = dict(zip(L3, L1))
print(D)
#D = dict(zip(L1, L2, L3)) # RUN THIS TO SEE WHAT HAPPENS
for elem in zip(L1, L2, L3) :
    print(elem)

(1, 100)
(3, 200)
(5, 300)
(7, 400)
(9, 500)
{100: 1, 200: 3, 300: 5, 400: 7, 500: 9}
(1, 2, 100)
(3, 4, 200)
(5, 6, 300)
(7, 8, 400)


## Copying Iterables

In [None]:
L1 = [8,2,5]
L2 = L1 # L2 is another name for the list that L1 refers to
# this does not create a copy of the list L1 

#### <u>Common ways to copy a 1D list:</u>

In [None]:
L2 = list(L1)
L2 = L1.copy()
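The difference between a second name and a real copy becomes visible once the original list changes. A minimal sketch, with variable names chosen for this example:

```python
original = [8, 2, 5]
alias = original         # second name for the same list object
clone = original.copy()  # new list with the same elements

original.append(1)

print(alias)              # [8, 2, 5, 1], the alias sees the change
print(clone)              # [8, 2, 5], the copy does not
print(alias is original)  # True
print(clone is original)  # False
```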

#### <u>Common Ways to copy a 2D or high dimensional iterable:</u>
- deep copy ==> make a new object for the outer iterable and for each internal iterable
- ex: if L3 is a list of 3 internal lists, a deep copy will copy 4 lists:
  - the outer list L3 itself
  - the 3 internal lists that are referenced by L3

In [24]:
import copy 
L3 = [[2,3,4],[4,5,9],[2,4,8]]
L4 = copy.deepcopy(L3)
print(L3)
print(L4)

[[2, 3, 4], [4, 5, 9], [2, 4, 8]]
[[2, 3, 4], [4, 5, 9], [2, 4, 8]]
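The two printed lists look identical, so the benefit of a deep copy only shows after a change to an inner list. The sketch below (variable names are made up) compares a 1D-style copy with copy.deepcopy():

```python
import copy

grid = [[2, 3, 4], [4, 5, 9]]
shallow = list(grid)        # new outer list, same inner lists
deep = copy.deepcopy(grid)  # new outer list and new inner lists

grid[0][0] = 100

print(shallow[0])  # [100, 3, 4], inner list is shared with grid
print(deep[0])     # [2, 3, 4], inner list is independent
```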


## Iterator
- contains **one** data value that is part of a sequence of data
- has a method to generate and return the next data value in the sequence
- an iterator can work with an iterable, as a "front-end" to the iterable
  - in this case the iterator uses the sequence of data from the iterable and returns one data value at a time
- an iterator can also be a stand-alone object
  - in this case it has code to generate one data value at a time from an algorithm
 

iterable listA --> iterator --> one data value

iterator with code to generate data --> one data value

## Iterator that works with an iterable 
- an iterator is actually the mechanism used **"under the hood"** in a **for loop**
- iterators are also used "under the hood" by functions that work with **iterables**: enumerate() and zip() return iterators, and range() returns a lazy iterable that a for loop steps through with an iterator
  - these all allow us to fetch one data value at a time from the iterable


Python runs a for loop such as the one in the first cell below as if it were written like the second cell:

In [None]:
for elem in tupleA:
    print(elem) 

In [None]:
i = iter(tupleA) # create an iterator, which is the "front end" for tupleA
while True:
    try:
        print(next(i)) # use next to get one data value from the iterable
    except StopIteration: # end loop when StopIteration exception occurs
        break
        break

To use an iterator i that works with an iterable, as shown above:
- Call the built-in iter() function and pass in the iterable. The returned iterator becomes the “front end” to the iterable.
- Write a while loop that stops (breaks) when a StopIteration exception is detected. StopIteration occurs when there is no more data in the iterable.
- In each iteration of the while loop, call next() to get the next data in the sequence.

If we want to print the entire data sequence from one of these functions, such as zip(), we cannot simply print the result of zip(). The code below prints the address of the iterator object, not the sequence of tuples:

In [25]:
L1 = [1,3,5]
L2 = [2,4,6]

print(zip(L1,L2)) 

<zip object at 0x107849100>


We can use a for loop so that the next() function will be called:

In [28]:
for elem in zip(L1, L2):
    print(elem, end="")

(1, 2)(3, 4)(5, 6)

Or taking advantage of the fact that an iterator will produce the sequence of data, we can use the unpacking operator:

In [30]:
print(*zip(L1,L2))

(1, 2) (3, 4) (5, 6)
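One consequence worth seeing in code: an iterator such as the one zip() returns can be walked through only once. A small sketch, reusing L1 and L2 with two made-up result names:

```python
L1 = [1, 3, 5]
L2 = [2, 4, 6]

pairs = zip(L1, L2)
first_pass = list(pairs)   # consumes every tuple from the iterator
second_pass = list(pairs)  # nothing left to fetch

print(first_pass)   # [(1, 2), (3, 4), (5, 6)]
print(second_pass)  # []
```

To go over the pairs again, call zip(L1, L2) again or keep the list from the first pass.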


## Stand Alone iterator 
- Iterators generate data one piece at a time, saving memory by not storing the entire sequence
- This is super useful, especially for large or infinite data sets
- ex. instead of making a list of a million prime numbers, an iterator would give you one prime number at a time when you need it, saving memory

## Writing an Iterator 
- ex. We need to generate powers of 2, like 1, 2, 4, 8, 16, etc., based on the user's choice. Since this sequence is endless and we only need a specific number of values, we can use an iterator. The iterator will give us one value at a time, based on what the user wants, without storing all the values at once. We'll create a Powers2 class as the iterator, then use it to get the next power of 2 value whenever needed.

In [32]:
class Powers2:
    def __init__(self):
        self._exp = 0
    def __iter__(self):
        return self
    def __next__(self):
        value, self._exp = self._exp, self._exp + 1
        return 2 ** value 


In [35]:
p2 = Powers2()
print(next(p2))

for i in range(3):
    print(next(p2))

1
2
4
8


Note that only one data value is generated by the iterator. There is no memory used to store multiple powers of 2. 
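Because Powers2 never raises StopIteration, a plain for loop over it would not end. One possible way to take a fixed number of values, sketched here with itertools.islice from the standard library (the class is repeated so the cell runs on its own):

```python
from itertools import islice

class Powers2:
    """Endless iterator over 1, 2, 4, 8, ... (same class as above)."""
    def __init__(self):
        self._exp = 0
    def __iter__(self):
        return self
    def __next__(self):
        value, self._exp = self._exp, self._exp + 1
        return 2 ** value

# islice stops after 5 values, so the loop ends even though
# Powers2 never raises StopIteration
first_five = list(islice(Powers2(), 5))
print(first_five)  # [1, 2, 4, 8, 16]
```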

## Generator 
- a simple way to implement an iterator class
- most of the time when we want an iterator, we write a generator instead

There are two ways to write a generator:
1. a generator *expression* ==> simpler coding, but it is normally used with a finite sequence of data
2. a generator *function* ==> more coding (not as much as writing an iterator directly) but it can work with an infinite sequence


<div style="text-align:center;">
    <img src="https://media.licdn.com/dms/image/D4D12AQF2avI7qeUjMw/article-cover_image-shrink_720_1280/0/1682146643594?e=1718841600&v=beta&t=-4GMYb11TzTsfNCtsavBr005ER9aTV1AM6jO6hVrMsA" style="width:350px; height:300px; margin: 0 auto;"/>
</div>


### 1) Generator Expression:

In [None]:
gen = ( elem for elem in iterable if condition)

# Usage:
val = next(gen)

In [42]:
p2 = (2** num for num in range(1,1001))
print(next(p2))
for i in range(3):
    print(next(p2))

2
4
8
16


### 2) List Comprehension (for comparison):

In [43]:
p2 = [2** num for num in range(1,1001)]

### What is the difference between () and []? 

**List comprehension [ ]:**
- Pro: because the list is stored, we can walk forward or backward to fetch any data as many times as we like
- Con: produces a list with all 1000 data values, which needs to be stored and takes up memory space, and we might not need all 1000 values

**Generator expression:**
- Pro: produces no data value unless requested with next(), so no data storage is needed. This on-demand generation is also called <u>lazy evaluation</u>: the work to generate data is only done when needed
- Con: can only go to the next data (no going back to previous data), and once data is fetched with next(), we cannot fetch it again
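Both trade-offs can be observed directly. The sketch below uses sys.getsizeof, which reports only the size of the list or generator object itself, so the exact numbers depend on the Python version and are not printed here:

```python
import sys

as_list = [2 ** num for num in range(1, 1001)]
as_gen = (2 ** num for num in range(1, 1001))

# getsizeof reports the container itself (not the integers it refers to)
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(list_size > gen_size)  # True in CPython

# A list can be read again; a generator is empty after one pass
print(sum(1 for _ in as_list), sum(1 for _ in as_list))  # 1000 1000
print(sum(1 for _ in as_gen), sum(1 for _ in as_gen))    # 1000 0
```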


## Generator Function
- when we want a generator that works with an infinite sequence
- the generator function must include a yield keyword, which returns the next value in the sequence
- **it is the <u>yield</u> that makes Python interpret the function as a generator function**

In [45]:
def Powers2():
    exp = 0
    while True:
        yield 2 ** exp
        exp += 1

p2 = Powers2()
print(next(p2))
for i in range(3):
    print(next(p2))

1
2
4
8


## Examples:

In [54]:
def gen_demo():
    yield "first"
    yield "second"
    yield "third"

gen = gen_demo()

print(next(gen))
print(next(gen))
print(next(gen))

first
second
third


In [55]:
def my_range(start,end):
    for i in range(start,end):
        yield i 

for i in my_range(15,26):
    print(i)

15
16
17
18
19
20
21
22
23
24
25


In [56]:
gen = (i**2 for i in range(1,30))
for i in gen:
    print(i)

1
4
9
16
25
36
49
64
81
100
121
144
169
196
225
256
289
324
361
400
441
484
529
576
625
676
729
784
841


## Advantages of Iterators and Generators 
- **very efficient** ways to get a subset of consecutive values from a **large data sequence**
- **eliminates the CPU time** to generate extra values that may not be needed if the user only wants a couple of values
- also **saves a lot of memory space** because there's no need to keep a big list
- a generator is a **shortcut** to writing a full iterator
  - either as a function with yield or as a generator expression
- generator only moves forward (move to the next element in the sequence)
  - if we want to be able to "undo" the previous next() call, we can write the iterator and provide a backup() method to go back to the previous element 
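A minimal sketch of that last idea, assuming the powers-of-2 sequence from earlier. The class name and the exact behavior of backup() are assumptions made for illustration; this is one possible design, not a standard Python feature.

```python
class Powers2WithBackup:
    """Powers-of-2 iterator with a hypothetical backup() method."""
    def __init__(self):
        self._exp = 0

    def __iter__(self):
        return self

    def __next__(self):
        value = 2 ** self._exp
        self._exp += 1
        return value

    def backup(self):
        """Undo the most recent next() call, if there was one."""
        if self._exp > 0:
            self._exp -= 1

p2 = Powers2WithBackup()
print(next(p2), next(p2), next(p2))  # 1 2 4
p2.backup()                          # step back one position
print(next(p2))                      # 4 again
```

This works here because each value can be recomputed from the exponent. An iterator that reads from a stream would need to remember past values instead.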