## Scaling with Generators

#### 1. Iteration in Python

In [2]:
numbers = [7,4,11,3]
iter(numbers)
for num in numbers:
    print(num)

7
4
11
3


In [3]:
number_iter = iter(numbers)
id(number_iter)



140492475161472

In [4]:
numbers.__iter__

<method-wrapper '__iter__' of list object at 0x7fc6f4c43580>

In [5]:
numbers.__iter__()

<list_iterator at 0x7fc6f4135c40>

In [6]:
names = ["Tom", "Shelly", "Garth"]
names_it = iter(names)
next(names_it)

'Tom'

In [7]:
next(names_it)


'Shelly'

In [8]:
next(names_it)

'Garth'
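The `for` loop shown earlier performs these same steps behind the scenes: it calls `iter()` once, then calls `next()` repeatedly until the iterator raises `StopIteration`. A minimal sketch of that protocol follows; `manual_loop` is a hypothetical helper name introduced only for illustration.

```python
def manual_loop(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # the same call the for statement makes
    while True:
        try:
            item = next(iterator)
        except StopIteration:      # raised once the iterator is exhausted
            break
        collected.append(item)
    return collected

names = ["Tom", "Shelly", "Garth"]
print(manual_loop(names))  # ['Tom', 'Shelly', 'Garth']

# next() also accepts a default, returned instead of raising StopIteration.
exhausted = iter([])
print(next(exhausted, "no more items"))  # no more items
```

Calling `next(names_it)` a fourth time in the cells above would raise `StopIteration`, which is the signal the `for` statement catches to end the loop.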

In [9]:
def gen_extra_nums():
    n = 0
    while n < 4:
        yield n
        n+=1
    yield 42

In [10]:
for num in gen_extra_nums():
    print(num)

0
1
2
3
42
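It may help to see that calling a generator function does not run its body. The call returns a generator object, and each `next()` runs the body only up to the following `yield`. The sketch below restates `gen_extra_nums()` so it runs on its own and steps through it by hand.

```python
def gen_extra_nums():
    n = 0
    while n < 4:
        yield n
        n += 1
    yield 42

gen = gen_extra_nums()       # no code in the function body has run yet
print(type(gen).__name__)    # generator
print(iter(gen) is gen)      # True: a generator is its own iterator
print(next(gen))             # 0
print(next(gen))             # 1
print(list(gen))             # [2, 3, 42]  (only the remaining values)
```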


In [11]:
def gen_squares(max_root):
    for num in range(max_root):
        yield num**2

In [12]:
max_root = 5  # avoid naming a variable "max", which shadows the built-in max()
for square in gen_squares(max_root):
    print(square)

0
1
4
9
16
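The scaling benefit comes from producing values on demand instead of building the whole sequence first. The rough comparison below is a sketch, not a benchmark: `sys.getsizeof` reports only the size of the container object itself (not the integers it refers to), and the exact byte counts vary by Python version, so none are shown.

```python
import sys

def gen_squares(max_root):
    for num in range(max_root):
        yield num ** 2

max_root = 100_000
squares_list = [num ** 2 for num in range(max_root)]  # every value held at once
squares_gen = gen_squares(max_root)                   # values produced on demand

print(sys.getsizeof(squares_list))  # grows as max_root grows
print(sys.getsizeof(squares_gen))   # small, independent of max_root

# Both produce the same values; the generator just never stores them all.
print(sum(squares_gen) == sum(squares_list))  # True
```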


##### 1.1. Do we need generators?

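* Strictly speaking, no. Anything a generator does can also be written as a class that implements the iterator protocol, as the `SquareIterator` class below shows. Generators simply make these scalable patterns far easier to write.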

In [13]:
class SquareIterator:
    def __init__(self, max_root):
        self.max_root = max_root
        self.current_root_value = 0

    def __iter__(self):
        return self 
    
    def __next__(self):
        if self.current_root_value >= self.max_root:
            raise StopIteration
        square = self.current_root_value ** 2
        self.current_root_value += 1
        return square

##### Code explanation

* A `for` loop obtains each value by calling the iterator's `__next__()` method, and it stops when that method raises `StopIteration`.

In [14]:
for square in SquareIterator(10):
    print(square)

0
1
4
9
16
25
36
49
64
81
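To make the comparison concrete, the sketch below restates both versions side by side and checks that they produce the same values. It also shows a property they share: each object is single-pass, so a second loop over the same object yields nothing.

```python
class SquareIterator:
    def __init__(self, max_root):
        self.max_root = max_root
        self.current_root_value = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.current_root_value >= self.max_root:
            raise StopIteration
        square = self.current_root_value ** 2
        self.current_root_value += 1
        return square

def gen_squares(max_root):
    for num in range(max_root):
        yield num ** 2

print(list(SquareIterator(5)))  # [0, 1, 4, 9, 16]
print(list(gen_squares(5)))     # [0, 1, 4, 9, 16]

# Both are exhausted after one pass; create a new object to iterate again.
squares = gen_squares(3)
print(list(squares))  # [0, 1, 4]
print(list(squares))  # []
```

The generator function carries the same state (the current root and the stopping condition) in three lines, because Python suspends and resumes the function body at each `yield`.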


#### 2. Generator patterns and scalable composability

In [15]:
## Sample generator function
def matching_line_from_files(path, pattern):
    with open(path) as handle:  # opens the file read-only (the default mode); handle is the file object
        for line in handle:
            if pattern in line:
                yield line.rstrip('\n')

#### 3. Text lines to Dicts

In [17]:
for line in matching_line_from_files("logs.txt", "WARNING:"):
    print(line)



In [18]:
def parse_log_records(lines):
    for line in lines:
        level,message = line.split(':', 1)
        yield {'level': level, 'message': message}


In [19]:
log_lines = matching_line_from_files("logs.txt", "WARNING:")
for record in parse_log_records(log_lines):
    print(record)



In [20]:
with open("logs.txt") as handle:
    for record in parse_log_records(handle):
        print(record)

{'level': 'DEBUG', 'message': " User 'tinytik' updated to Pro version\n"}
{'level': 'INFO', 'message': ' Sent email campaign, completed successfully\n'}
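The output above keeps a leading space and a trailing `\n` in each message, because the file handle yields raw lines and `split(':', 1)` leaves the surrounding whitespace in place. One possible variant strips both. This is a sketch: `parse_log_records_stripped` is a hypothetical name, and the sample lines are made up for illustration rather than taken from `logs.txt`.

```python
def parse_log_records_stripped(lines):
    """Like parse_log_records, but removes surrounding whitespace from each field."""
    for line in lines:
        level, message = line.rstrip('\n').split(':', 1)
        yield {'level': level.strip(), 'message': message.strip()}

# Any iterable of strings works as input: a list, a file handle, or another generator.
sample_lines = [
    "WARNING: Disk usage at 91%\n",   # made-up sample data
    "INFO: Backup finished\n",
]
for record in parse_log_records_stripped(sample_lines):
    print(record)
# {'level': 'WARNING', 'message': 'Disk usage at 91%'}
# {'level': 'INFO', 'message': 'Backup finished'}
```

Like the original, this variant assumes every line contains a colon; a line without one would raise `ValueError` when the result of `split` is unpacked.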


#### 4. Composable interfaces

In [21]:
# Split the work of matching_line_from_files() into two generator functions

def lines_from_file(path):
    with open(path) as handle:
        for line in handle:
            yield line.rstrip('\n')

def matching_lines(lines, pattern):
    for line in lines:
        if pattern in line:
            yield line

In [22]:
lines = lines_from_file("logs.txt")
matching = matching_lines(lines, 'WARNING:')
for line in matching:
    print(line)



* **lines_from_file()** is a **source** function: it brings data into the program and starts the stream.
* A real program eventually does something with that stream, consuming it without producing another iterator. The function that does this is called a **sink**.

* **Source Functions**:
    * Definition: Source functions are responsible for creating or retrieving data that will be used by other parts of the program.
    * Examples:
        * Reading data from a file.
        * Receiving data from a network socket.
        * Generating random numbers.
        * Retrieving data from a database.
    * Purpose: To provide the initial data for the program's operations.
* **Sink Functions**:
    * Definition: Sink functions are responsible for receiving and processing data, often as the final step in a data pipeline.
    * Examples:
        * Writing data to a file.
        * Sending data over a network socket.
        * Displaying data on the screen.
        * Storing data in a database.
    * Purpose: To provide a place for the data to end up after being processed by other parts of the program.
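A small end-to-end pipeline can illustrate the source and sink roles described above. In this sketch, `lines_from_text` is a hypothetical in-memory stand-in for `lines_from_file()` so the example runs without a log file, `write_lines` is a hypothetical sink, and the log text is made-up sample data.

```python
import io

def lines_from_text(text):
    """Source: yields lines from a string (stand-in for reading a file)."""
    for line in io.StringIO(text):
        yield line.rstrip('\n')

def matching_lines(lines, pattern):
    """Filter: passes through only the lines that contain pattern."""
    for line in lines:
        if pattern in line:
            yield line

def write_lines(lines, out):
    """Sink: consumes the stream, writes each line to out, returns the count."""
    count = 0
    for line in lines:
        out.write(line + '\n')
        count += 1
    return count

log_text = (
    "DEBUG: cache warmed\n"
    "WARNING: disk almost full\n"
    "INFO: job done\n"
    "WARNING: retrying upload\n"
)
out = io.StringIO()
written = write_lines(matching_lines(lines_from_text(log_text), 'WARNING:'), out)
print(written)               # 2
print(out.getvalue(), end='')
# WARNING: disk almost full
# WARNING: retrying upload
```

Nothing is read from the source until the sink starts pulling, so only one line is in flight at any moment regardless of input size.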

#### 5. Fanning out


In [23]:
def words_in_text(lines):
    for line in lines:
        for word in line.split():
            yield word

In [24]:
poem_line = lines_from_file('poem.txt')
poem_word = words_in_text(poem_line)
for word in poem_word:
    print(word)

all
night
our
room
was
outer-walled
with
rain
drops
fell
and
flattened
on
the
tin
roof
and
rang
like
little
disks
of
metal


* The **words_in_text()** function takes the **fan out** approach: each input record (a line) produces several output records (words). No input records are dropped, so it still belongs to the "mapping" category of generator functions.
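The fan-out effect is easiest to see by counting records on each side. The sketch below writes the same function with `yield from`, which delegates to the inner iterable and replaces the nested loop. The two sample lines reuse words from the poem output above, but the line breaks are an assumption, since the contents of `poem.txt` are not shown.

```python
def words_in_text(lines):
    for line in lines:
        # yield from emits every item of line.split(), one at a time
        yield from line.split()

lines = ["all night our room", "was outer-walled with rain"]  # assumed line breaks
words = list(words_in_text(lines))
print(words)
# ['all', 'night', 'our', 'room', 'was', 'outer-walled', 'with', 'rain']
print(len(lines), "lines ->", len(words), "words")  # 2 lines -> 8 words
```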

#### 6. Fanning in

* With **fan in**, the generator function consumes more than one input record to produce each output record.

In [32]:
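def house_records(lines):
    record = {}
    for line in lines:
        if line == '':
            # A blank line ends the current record; repeated blank lines are skipped
            if record:
                yield record
                record = {}
            continue
        key, value = line.split(': ', 1)
        record[key] = value
    # Emit the last record if the input did not end with a blank line
    if record:
        yield record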


In [33]:
with open('house_sales.txt', 'r') as file:
    # Strip lazily with a generator expression so the whole file is never held in memory
    lines = (line.strip() for line in file)
    for record in house_records(lines):
        print(record)

{'address': '1423 99th Ave', 'square_feet': '1705', 'priced_usd': '340210'}
{'address': '24257 Pueblo Dr', 'square_feet': '2305', 'priced_usd': '170210'}
{'address': '127 Cochran', 'square_feet': '2068', 'priced_usd': '320500'}
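Since `house_sales.txt` itself is not shown, the sketch below uses in-memory lines in the format the output above implies: `key: value` pairs with a blank line between records. That layout is an assumption. `house_records()` is restated here so the block runs on its own, with a guard that skips empty records such as the one a trailing blank line would otherwise produce.

```python
def house_records(lines):
    record = {}
    for line in lines:
        if line == '':
            if record:              # ignore blank lines that close nothing
                yield record
                record = {}
            continue
        key, value = line.split(': ', 1)
        record[key] = value
    if record:                      # final record when there is no trailing blank line
        yield record

sample_lines = [
    "address: 1423 99th Ave",
    "square_feet: 1705",
    "priced_usd: 340210",
    "",
    "address: 24257 Pueblo Dr",
    "square_feet: 2305",
    "priced_usd: 170210",
    "",                             # trailing blank line, as many files end with one
]
records = list(house_records(sample_lines))
for record in records:
    print(record)
# {'address': '1423 99th Ave', 'square_feet': '1705', 'priced_usd': '340210'}
# {'address': '24257 Pueblo Dr', 'square_feet': '2305', 'priced_usd': '170210'}
print(len(sample_lines), "lines ->", len(records), "records")  # 8 lines -> 2 records
```

All values stay strings here, as in the original output; converting `square_feet` and `priced_usd` to integers would be a natural next mapping stage in the pipeline.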
