# Unleashing the Power of Python Generators

Python provides several constructs for creating iterators, and generators are among the most powerful, nifty, and efficient. A generator lets you iterate through a sequence of values, but unlike a regular function, which returns once and forgets its local state, a generator function yields values one at a time and pauses between yields with its state intact. This makes generators super efficient for large datasets or infinite sequences, since they only compute values as you need them.

## Key Concepts
- **`yield` Keyword:** The `yield` statement is used to produce a value and pause the function's execution, maintaining its state for subsequent calls.
- **Lazy Evaluation:** Generators evaluate data lazily, meaning they generate values only as needed.

Let us look at a simple example:

In [6]:
def generator_example(): # Generator Function
    yield 1
    yield 2
    yield 3

gen = generator_example() # Creating instance of generator

print(next(gen)) # Extracting values from gen object one at a time
print(next(gen))
print(next(gen))
print(next(gen)) # Fourth call: the generator is exhausted, so this raises StopIteration

1
2
3


StopIteration: 

Notice how the generator function `generator_example` uses `yield` instead of `return`. Each time execution reaches a `yield`, the value is handed to the caller and the generator's state is saved. When `next()` is called again, the generator resumes right after that `yield`. This goes on until the generator is exhausted: the function has only three `yield` statements, so the 4th `next()` call raises a `StopIteration` exception.
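To make the pause-and-resume behaviour visible, here is a minimal sketch that records the order in which things happen. The names `traced_generator` and `events` are made up for this illustration and are not part of the example above.

```python
def traced_generator(log):
    # `log` is a plain list used only to record the order of events.
    log.append("generator body started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("resumed after second yield")


events = []
gen = traced_generator(events)

# Calling the generator function has not run any of its body yet.
events.append("generator object created")

first = next(gen)   # runs the body up to the first yield
events.append(f"caller received {first}")

second = next(gen)  # resumes right after the first yield
events.append(f"caller received {second}")

for event in events:
    print(event)
```

The printed order shows that the body only starts on the first `next()` call, and that each later call picks up exactly where the previous one paused.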

However, the same can be achieved more conveniently with a `for` loop, which calls `next()` for you and stops cleanly when `StopIteration` is raised, so you don't have to worry about when the end of the generator is reached:

In [7]:
gen = generator_example() # Creating instance of generator

for val in gen: # Looping over generator to fetch its values
    print(val)

1
2
3
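One detail worth knowing when looping over a generator: a generator object can only be consumed once. The short sketch below repeats the `generator_example` function so that it runs on its own, and also shows the optional default argument of `next()`, which avoids the `StopIteration` exception.

```python
def generator_example():
    yield 1
    yield 2
    yield 3


gen = generator_example()

first_pass = list(gen)       # consumes every value
second_pass = list(gen)      # the same object is now exhausted
fallback = next(gen, None)   # the default is returned instead of raising StopIteration

# Calling the generator function again gives a fresh, independent generator.
fresh_pass = list(generator_example())

print(first_pass)   # [1, 2, 3]
print(second_pass)  # []
print(fallback)     # None
print(fresh_pass)   # [1, 2, 3]
```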


Generators in Python are often described as using lazy evaluation because they don't compute values until needed. Instead of producing all the values at once and storing them in memory, generators yield values one at a time and only when requested.

Imagine you have a huge dataset. Using a generator, Python will calculate each item in the sequence only when you iterate over it, rather than calculating and storing the entire sequence in memory upfront. This makes them memory-efficient and perfect for handling large or even infinite datasets.

Lazy but brilliant - right when you need them, they spring into action.

## Generator Comprehension
Generator comprehension (officially called a generator expression) is a powerful, concise way to create generators in Python. Similar to list comprehensions, generator expressions let you define a generator in a single line, using parentheses instead of square brackets. However, while a list comprehension returns a fully built list, a generator expression returns a generator object, which yields items one at a time.

In [8]:
number_generator = (_ for _ in range(1000001))
print(number_generator)
print(type(number_generator))

<generator object <genexpr> at 0x0000014877B75000>
<class 'generator'>
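Generator expressions are most often fed straight into a function that consumes an iterable. As a small sketch (the variable names here are illustrative only), the sum of squares below is computed without ever building a list of one million numbers. When the generator expression is the only argument, the extra pair of parentheses can be dropped.

```python
N = 1_000_000

# The generator expression is the sole argument, so no extra parentheses are needed.
sum_of_squares = sum(n * n for n in range(N + 1))

# Closed-form check: 0^2 + 1^2 + ... + N^2 = N(N + 1)(2N + 1) / 6
expected = N * (N + 1) * (2 * N + 1) // 6

print(sum_of_squares == expected)  # True
```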


## Benefits of Generators
Generators are like that efficient buddy who gets things done with minimal fuss and maximum efficiency:

**1. Memory Efficiency:** Generators produce values one at a time, only when they are requested, rather than generating an entire sequence at once. This means you don't have to load all the data into memory, which is particularly helpful when working with large datasets.
*Example:* If you're processing a file with millions of lines, loading the entire file into memory would be inefficient or even infeasible. A generator reads and processes one line at a time, reducing memory usage.

In [9]:
import sys

number_list = [_ for _ in range(1000001)]
print(f'Size of number_list      is {sys.getsizeof(number_list):>7} bytes')

number_generator = (_ for _ in range(1000001)) # Generator for numbers 0 to 1000000
print(f'Size of number_generator is {sys.getsizeof(number_generator):>7} bytes')

Size of number_list      is 8448728 bytes
Size of number_generator is     192 bytes
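Note that `sys.getsizeof` reports only the size of the container itself, not of the integer objects it refers to. As a complementary sketch, the standard-library `tracemalloc` module can measure peak memory while the values are actually consumed. The helper `peak_bytes` and the size `N` are assumptions made for this illustration, and the exact numbers will differ between Python versions and platforms.

```python
import tracemalloc


def peak_bytes(make_iterable):
    """Sum the iterable and return (result, peak traced memory in bytes)."""
    tracemalloc.start()
    result = sum(make_iterable())
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak


N = 200_000

# The list comprehension builds all N items before sum() sees the first one.
list_total, list_peak = peak_bytes(lambda: [n for n in range(N)])

# The generator expression hands items to sum() one at a time.
gen_total, gen_peak = peak_bytes(lambda: (n for n in range(N)))

print(f'Peak with list:      {list_peak:>10} bytes')
print(f'Peak with generator: {gen_peak:>10} bytes')
```

Both versions compute the same total; only the peak memory needed along the way differs.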


**2. Infinite Sequences:** Generators can model infinite sequences, such as the natural numbers or the Fibonacci series. This isn't possible with a list or any other fully materialized data structure, because it would have to hold every element and would run out of memory. In the example below, we create an infinite Fibonacci generator whose generator object takes only ~200 bytes of memory.

In [10]:
# Define a Generator Function for Infinite Fibonacci Series
def fibonacci_generator():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, (a+b)

# Create the generator object
gen = fibonacci_generator()
print(gen)
print(sys.getsizeof(gen))

<generator object fibonacci_generator at 0x0000014878381A40>
200
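The natural numbers mentioned above can be sketched in the same way. The function name `natural_numbers` is made up for this example; the standard library's `itertools.count(1)` produces the same sequence. Because the sequence never ends, it has to be consumed with something that stops, such as `itertools.takewhile`.

```python
import itertools


def natural_numbers(start=1):
    # Infinite generator: 1, 2, 3, ... (never raises StopIteration)
    n = start
    while True:
        yield n
        n += 1


# Lazily square every natural number; nothing is computed yet.
squares = (n * n for n in natural_numbers())

# takewhile stops pulling values at the first square that is not below 50.
small_squares = list(itertools.takewhile(lambda s: s < 50, squares))

print(small_squares)  # [1, 4, 9, 16, 25, 36, 49]
```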


**3. Lazy Evaluation:** Since generators compute values on the fly, they're great for scenarios where you might not need all the values upfront. This leads to better performance because it avoids unnecessary calculations.

In the above example, we created a generator for the infinite Fibonacci Series. Utilizing the same, in the below example we'll extract and produce the first 10 elements from the series.

In [11]:
# Define a Generator Function for Infinite Fibonacci Series
def fibonacci_generator():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, (a+b)

# Create the generator object
gen = fibonacci_generator()

# Get the first 10 numbers from the generator
for _ in range(10):
    print(next(gen), end=', ')

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 

Or, even better, we can use `islice` from the standard-library `itertools` module to take a slice of a generator without building the whole sequence. In the example below, we extract the 10th to 20th elements of the infinite Fibonacci series (`islice(gen, 9, 20)` uses zero-based indices 9 through 19).

In [12]:
import itertools as itt

gen = fibonacci_generator()
fibo_10_20 = [*itt.islice(gen, 9, 20)] # use islice from itertools to slice 10th to 20th elements and expand them into a list

print(f'Size of fibo_10_20 is {sys.getsizeof(fibo_10_20)} bytes')
print(fibo_10_20)

Size of fibo_10_20 is 184 bytes
[34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
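The claim that lazy evaluation avoids unnecessary calculations can be checked by counting how often a function is actually called. In this sketch, `expensive_square` and the `calls` list are hypothetical stand-ins for real work. The built-in `any()` stops at the first true value, which only saves work when its input is lazy.

```python
calls = []


def expensive_square(n):
    calls.append(n)  # record every time the "work" is done
    return n * n


# Eager: the list comprehension computes all 1000 squares before any() runs.
calls.clear()
eager_hit = any([expensive_square(n) > 50 for n in range(1000)])
eager_calls = len(calls)

# Lazy: the generator expression stops as soon as any() finds a match.
calls.clear()
lazy_hit = any(expensive_square(n) > 50 for n in range(1000))
lazy_calls = len(calls)

print(eager_hit, eager_calls)  # True 1000
print(lazy_hit, lazy_calls)    # True 9
```

The lazy version stops at `n = 8`, the first value whose square exceeds 50, so only nine calls are made.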


**4. Improved Performance:** Generators can improve performance by avoiding the overhead of building a full list or sequence before processing can start. Since they yield items one at a time, the consumer can begin working on the first item immediately instead of waiting for all the data to be generated. This reduces the time to the first result and the peak memory used, especially with large datasets, although for small inputs that fit comfortably in memory a list can be just as fast or faster. So, generators not only reduce memory usage but also make your code more responsive.

*Example:* If you're streaming data from a sensor or external API, a generator allows you to process the data in real time as it comes in, rather than waiting for the entire dataset. Or you could easily read a very large log file one line at a time, without having to load its entire contents into memory.

In [13]:
# Generator function to read log file and yield lines with errors
def read_large_file(file_path):
    with open(file_path, 'r') as file:
        for line in file:
            if "ERROR" in line:
                yield line

log_file = 'large_log_file.txt' # Hypothetical Log File
error_count = sum(1 for _ in read_large_file(log_file)) # Count matches without storing the lines
print(f"Total error lines: {error_count}")

Total error lines: 151


**5. Simplified Code and Iteration:** Generators allow you to write cleaner and more readable code. Instead of managing iteration manually with counters and loops, generators let you focus on what you want to produce, and Python handles the iteration for you.

Generators also provide encapsulation: a generator automatically remembers its position in the iteration, so you don't need to manage state manually between calls. This makes them ideal for tasks that require maintaining internal state across iterations, like reading through a large file or a sequence of items.
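To illustrate how much bookkeeping a generator saves, here is a sketch of the same countdown written twice: once as a hand-written iterator class and once as a generator function. Both `Countdown` and `countdown` are names invented for this comparison.

```python
class Countdown:
    """Hand-written iterator: the position must be stored and updated manually."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signal the end of the iteration explicitly
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator version: the local variable `start` is the saved state."""
    while start > 0:
        yield start
        start -= 1


print(list(Countdown(3)))  # [3, 2, 1]
print(list(countdown(3)))  # [3, 2, 1]
```

Both produce the same values, but the generator version needs no class, no `__next__` method, and no explicit `StopIteration`.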

**6. Pipelining or Chaining Generators:** Generators can be chained together to form a pipeline of operations. This allows you to process data in stages and can be very useful for tasks like data processing, where each stage of the pipeline processes one item at a time.

*Example:* You could have one generator that filters out the even numbers and another that computes the square roots of those even numbers. Each stage processes one item at a time, allowing for the efficient handling of large datasets.

In [15]:
import math

def even_numbers(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

def even_sqrt(numbers):
    for n in numbers:
        yield round(math.sqrt(n), 2)

# Chain them together: even_sqrt consumes even_numbers, and zip pairs each even number with its square root
numbers = range(30)
for num in zip(even_numbers(numbers), even_sqrt(even_numbers(numbers))):
    print(num, end=',  ')

(0, 0.0),  (2, 1.41),  (4, 2.0),  (6, 2.45),  (8, 2.83),  (10, 3.16),  (12, 3.46),  (14, 3.74),  (16, 4.0),  (18, 4.24),  (20, 4.47),  (22, 4.69),  (24, 4.9),  (26, 5.1),  (28, 5.29),  
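The same result can also be produced by a strictly linear pipeline, where each stage wraps the previous one and the source is traversed only once. This is a sketch under assumptions: the stage `with_sqrt` is a hypothetical variant that yields the number together with its square root, and `itertools.count()` stands in for an unbounded data source.

```python
import itertools
import math


def even_numbers(numbers):
    # Stage 1: pass through only the even values.
    for n in numbers:
        if n % 2 == 0:
            yield n


def with_sqrt(numbers):
    # Stage 2: pair each value with its square root, rounded to 2 decimals.
    for n in numbers:
        yield n, round(math.sqrt(n), 2)


# Source -> stage 1 -> stage 2. Nothing runs until values are requested.
pipeline = with_sqrt(even_numbers(itertools.count()))

# Pull just four results from the (infinite) pipeline.
first_four = list(itertools.islice(pipeline, 4))
print(first_four)  # [(0, 0.0), (2, 1.41), (4, 2.0), (6, 2.45)]
```

Because every stage is lazy, the pipeline works even on an infinite source and only does as much work as the consumer asks for.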

## Conclusion

In conclusion, Python generators offer a powerful and memory-efficient way to work with large datasets and complex computations. By yielding values one at a time, generators allow for lazy evaluation and can significantly optimize performance. Whether you're dealing with infinite sequences or simply want to improve the readability and maintainability of your code, generators are an essential tool in any Python programmer's toolkit.