\[<< [Other functions concepts](./05_other_functions_concepts.ipynb) | [Index](./00_index.ipynb) | [Context Managers](./07_context_managers.ipynb) >>\]

# Sequence, Iterator and Generator

**Sequences**

- Sequences are ordered collections of elements in Python.
- Each element in a sequence is indexed by a unique integer.
- Common sequence types in Python include strings, lists, tuples, and range objects.
- Sequences are iterable, meaning you can traverse through their elements one by one.
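As a quick check of the points above, the following sketch tests a few built-in types against `collections.abc.Sequence` and reads their first element and length. The variable names are illustrative only.

```python
from collections.abc import Sequence

# A string, a list, a tuple and a range: all of them are sequences.
samples = ["abc", [1, 2, 3], (1, 2, 3), range(1, 4)]

for sample in samples:
    # Every sequence supports integer indexing and len().
    print(f"{type(sample).__name__:>5}: {isinstance(sample, Sequence)=}, {sample[0]=}, {len(sample)=}")

# A set is iterable but unordered, so it is not a sequence.
print(f"{isinstance({1, 2, 3}, Sequence) = }")
```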

Sequences support the `membership operators` (`in` and `not in`)

In [None]:
numbers = [1, 2, 3, 4, 5]

print(f"{1 in numbers = }")
print(f"{10 not in numbers = }")

Sequences support `concatenation`

In [None]:
numbers1 = [1, 2, 3, 4, 5]
numbers2 = [6, 7, 8, 9, 10]

print(numbers1 + numbers2)

Sequences support `repetition`

In [None]:
numbers1 = [1, 2, 3, 4, 5]

print(3 * numbers1)

Sequences support built-in functions such as `len`, `min` and `max`.

In [None]:
numbers1 = [1, 2, 3, 4, 5]

print(f"{len(numbers1) = }")
print(f"{min(numbers1) = }")
print(f"{max(numbers1) = }")

Sequences are `indexable`: `index()` returns the position of the first matching element

In [None]:
numbers = [1, 2, 1, 3, 5, 7, 2, 9, 3, 7, 3, 4, 5]

print(f"Index of first 1: {numbers.index(1) = }")
print(f"Index of first 1, at or after index 1: {numbers.index(1, 1) = }")
print(f"Index of first 3, at or after index 4 and before 12: {numbers.index(3, 4, 12) = }")

Sequences can be indexed and `sliced`

In [None]:
numbers = [1, 2, 1, 3, 5, 7, 2, 9, 3, 7, 3, 4, 5]

print(f"{numbers[2] = }")
print(f"{numbers[2:10] = }")
print(f"{numbers[2:10:3] = }")
print(f"{numbers[-1] = }")

---
**Iterators**

- An iterator is an object that implements the iterator protocol.
- The iterator protocol involves two methods: `__iter__()` and `__next__()`.
- The `__iter__()` method returns the iterator object itself.
- The `__next__()` method returns the next element in the sequence or raises a `StopIteration` exception if there are no more elements to be iterated.
- Iterators allow efficient traversal through a collection of elements without loading the entire collection into memory.
- Custom iterators can be created by defining classes with `__iter__()` and `__next__()` methods.
- Think of [**lazy evaluation**](https://en.wikipedia.org/wiki/Lazy_evaluation) when you think of `iterators`.
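The difference between an iterable and an iterator can be checked directly. This is a small sketch using the abstract base classes from `collections.abc`; the variable names are chosen for illustration.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
number_iterator = iter(numbers)

# A list is iterable (it can produce an iterator) but is not itself an iterator.
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)

# The object returned by iter() implements both __iter__() and __next__().
iterator_is_iterator = isinstance(number_iterator, Iterator)

# Calling iter() on an iterator returns the very same object.
returns_itself = iter(number_iterator) is number_iterator

print(list_is_iterable, list_is_iterator, iterator_is_iterator, returns_itself)
# True False True True
```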

In [None]:
numbers = [1, 2, 3]

number_iterator = iter(numbers)

In [None]:
print(type(number_iterator))

In [None]:
print(number_iterator)

In [None]:
print(next(number_iterator))
print(next(number_iterator))
print(next(number_iterator))
print(next(number_iterator))  # Raises StopIteration: the iterator is exhausted

In [None]:
numbers = [1, 2, 3]

number_iterator = iter(numbers)

for number in number_iterator:
    print(number)

`for` loop under the hood

In [None]:
number_iterator = iter(numbers)

while True:
    try:
        value = next(number_iterator)
    except StopIteration:
        break
    else:
        print(value)

**Creating custom iterator**

Custom iterators are generally created with a class that defines an `__iter__` method and a `__next__` method.

When defining an iterator, raise `StopIteration` once there is nothing left to produce. This is the signal the `for` loop uses to stop calling `next` on the iterator object.

Example of an iterator that behaves like Python's built-in `range` function:

In [None]:
class MyRangeIterator:
    def __init__(self, start, end, step=1):
        self.current = start
        self.end = end
        self.step = step

    def __iter__(self):
        # An iterator must return itself as an iterator
        return self

    def __next__(self):
        if self.current < self.end:
            value = self.current
            self.current += self.step
            return value
        else:
            raise StopIteration

In [None]:
for item in MyRangeIterator(0, 10, 2):
    print(item)
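An iterator can be consumed only once. If an object should support several passes, a common approach is to split it into an iterable whose `__iter__` returns a fresh iterator each time. The sketch below illustrates this; `EvenNumbers` and `EvenNumbersIterator` are hypothetical names made up for this example.

```python
class EvenNumbersIterator:
    """One-shot iterator over the even numbers below `limit`."""

    def __init__(self, limit):
        self.current = 0
        self.limit = limit

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        value = self.current
        self.current += 2
        return value


class EvenNumbers:
    """Reusable iterable: every call to __iter__ builds a new iterator."""

    def __init__(self, limit):
        self.limit = limit

    def __iter__(self):
        return EvenNumbersIterator(self.limit)


one_shot = EvenNumbersIterator(10)
first_pass = list(one_shot)   # [0, 2, 4, 6, 8]
second_pass = list(one_shot)  # [] because the iterator is already exhausted

reusable = EvenNumbers(10)
pass_one = list(reusable)     # [0, 2, 4, 6, 8]
pass_two = list(reusable)     # [0, 2, 4, 6, 8] again

print(first_pass, second_pass, pass_one, pass_two)
```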

You can also write an infinite iterator, one that never raises `StopIteration`. The caller must then bound the loop to avoid running forever.

Example of an infinite iterator that produces the sequence of squares, starting from 1:

In [None]:
class SquareIterator:
    def __init__(self):
        self._number = 0
        
    def __iter__(self):
        return self
    
    def __next__(self):
        self._number += 1
        return self._number ** 2

In [None]:
square_iterator = SquareIterator()

for number in square_iterator:
    if number > 150:
        break
    print(number)

Later we will look at the `itertools` module and see how functions like `islice` or `takewhile` can take a limited number of items from an iterator.

---

**Generators**

- A generator is a special type of iterator that simplifies iterator creation using functions.
- Instead of implementing `__iter__()` and `__next__()` methods, you can use the `yield` keyword within a function.
- When a generator function is called, it returns a generator object that can be iterated over.
- The `yield` statement produces a value on-the-fly during iteration, and the function's state is saved between yields.
- Generators are memory-efficient as they generate values as needed, making them ideal for handling large datasets or infinite sequences.
- Infinite sequences can be created easily using generators without consuming excessive memory.
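To make the comparison with a class-based iterator concrete, here is a minimal generator sketch of a range-like iterator. The name `my_range` is made up for this example, and only positive steps are handled.

```python
def my_range(start, end, step=1):
    """Range-like generator (positive steps only)."""
    current = start
    while current < end:
        # Execution pauses here; `current` is preserved until the next call.
        yield current
        current += step


evens = my_range(0, 10, 2)
first = next(evens)  # 0: the body runs up to the first yield
rest = list(evens)   # [2, 4, 6, 8]: resumes where it left off

print(first, rest)
```

No `__iter__`, `__next__` or explicit `StopIteration` is needed: the generator object provides all three once the function body finishes.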

In [None]:
def square_generator():
    number = 1
    while True:
        yield number ** 2
        number += 1

In [None]:
print(type(square_generator()))

In [None]:
sq = square_generator()
print(next(sq))
print(next(sq))
print(next(sq))
print(next(sq))

In [None]:
for number in square_generator():
    if number > 150:
        break
    print(number)

---
The `yield from` statement is used in Python to delegate part of the operations of a generator to another generator, iterable, or iterator. It simplifies and enhances the capabilities of nested generators, allowing you to avoid writing complex nested loops or repetitive `for` loops. The `yield from` statement was introduced in Python 3.3 to improve the clarity and efficiency of working with nested generators.

When to use `yield from`:

You should use `yield from` when you have multiple generators, iterables, or iterators that need to be combined or delegated to produce a single stream of data. It simplifies the process of handling nested generators and can significantly improve the readability of your code.

Here are some situations where `yield from` is helpful:

1. Combining data from multiple sources: When you have different data sources represented as generators or iterables, `yield from` can help you combine them into a single generator efficiently.

2. Recursive generators: If you have generators that call other generators recursively, `yield from` simplifies the delegation of responsibility between generators, avoiding the need for explicit loops.

3. Fluent generator chaining: When you want to chain multiple generators in a fluent and expressive manner, `yield from` can make the code more concise and readable.
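The recursive case (item 2) can be sketched with a generator that flattens arbitrarily nested lists. The `flatten` function is an illustrative helper, not a standard-library function.

```python
def flatten(items):
    """Yield the leaf values of arbitrarily nested lists, left to right."""
    for item in items:
        if isinstance(item, list):
            # Delegate to a nested generator for the sub-list.
            yield from flatten(item)
        else:
            yield item


nested = [1, [2, [3, 4]], 5, [[6]]]
flat = list(flatten(nested))
print(flat)  # [1, 2, 3, 4, 5, 6]
```

Without `yield from`, the delegation line would need an explicit `for value in flatten(item): yield value` loop.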

In [None]:
def dataset1():
    for i in range(1, 6):
        yield f"Dataset 1 - Data point {i}"

def dataset2():
    for i in range(6, 11):
        yield f"Dataset 2 - Data point {i}"

def dataset3():
    for i in range(11, 16):
        yield f"Dataset 3 - Data point {i}"

def combined_datasets():
    yield from dataset1()
    yield from dataset2()
    yield from dataset3()


for data_point in combined_datasets():
    print(data_point)

---
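A class-based **iterator** object typically reports a smaller size than a **generator** object. Note that `sys.getsizeof` counts only the object itself, not the objects it refers to, so treat the numbers as a rough comparison.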

In [None]:
import sys

# Get memory usage for generator
gen = square_generator()
print(f"Memory usage for infinite square generator: {sys.getsizeof(gen)} bytes")

# Get memory usage for iterator
it = SquareIterator()
print(f"Memory usage for infinite square iterator: {sys.getsizeof(it)} bytes")
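For context, either lazy object is tiny next to a fully built list. The sketch below compares a list of squares with a generator expression over the same range; exact byte counts depend on the Python version and platform, so none are shown.

```python
import sys

# Materialized: all 100,000 results exist in memory at once.
squares_list = [n ** 2 for n in range(100_000)]

# Lazy: values are produced one at a time, on demand.
squares_gen = (n ** 2 for n in range(100_000))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print(f"List object:      {list_size} bytes")
print(f"Generator object: {gen_size} bytes")
```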

---
**Generators** are usually faster than class-based **iterators**

In [None]:
gen = square_generator()
%timeit -n 100000 -r 5 next(gen)

In [None]:
it = SquareIterator()
%timeit -n 100000 -r 5 next(it)
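`%timeit` is an IPython magic and is not available in a plain Python script. As a rough equivalent, the standard `timeit` module can run the same comparison. The definitions are repeated here so the sketch is self-contained; timings vary by machine, so no expected numbers are given.

```python
import timeit


def square_generator():
    number = 1
    while True:
        yield number ** 2
        number += 1


class SquareIterator:
    def __init__(self):
        self._number = 0

    def __iter__(self):
        return self

    def __next__(self):
        self._number += 1
        return self._number ** 2


gen = square_generator()
it = SquareIterator()

# Total seconds for 100,000 calls to next() on each object.
gen_seconds = timeit.timeit(lambda: next(gen), number=100_000)
it_seconds = timeit.timeit(lambda: next(it), number=100_000)

print(f"Generator: {gen_seconds:.4f} s")
print(f"Iterator:  {it_seconds:.4f} s")
```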

---
Useful `built-in functions`, `itertools functions` and `functools functions` for working with iterators

**Aggregators**

1. `sum()`:
   - Example: Calculating the total sales from a list of daily sales.

In [None]:
daily_sales = [100, 150, 120, 80, 200]
total_sales = sum(daily_sales)
print(f"Total Sales: {total_sales}")  # Output: Total Sales: 650

2. `max()`:
   - Example: Finding the highest temperature recorded in a week.

In [None]:
weekly_temperatures = [24, 27, 25, 28, 26, 22, 30]
highest_temperature = max(weekly_temperatures)
print(f"Highest Temperature: {highest_temperature}")  # Output: Highest Temperature: 30

3. `min()`:
   - Example: Finding the lowest score in a quiz.

In [None]:
quiz_scores = [85, 72, 90, 68, 78]
lowest_score = min(quiz_scores)
print(f"Lowest Score: {lowest_score}")  # Output: Lowest Score: 68

4. `any()`:
   - Example: Checking if any of the employees have overtime hours.

In [None]:
overtime_hours = [0, 0, 0, 3, 1, 0]
has_overtime = any(overtime_hours)
print(f"Has Overtime: {has_overtime}")  # Output: Has Overtime: True

5. `all()`:
   - Example: Verifying if all students passed in a class.

In [None]:
exam_results = [True, True, False, True, True]
all_passed = all(exam_results)
print(f"All Passed: {all_passed}")  # Output: All Passed: False

6. `len()`:
   - Example: Counting the number of students in a class.

In [None]:
students = ["Alice", "Bob", "Charlie", "David", "Eva"]
num_students = len(students)
print(f"Number of Students: {num_students}")  # Output: Number of Students: 5
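The examples above use lists, but most aggregators accept any iterable. Two caveats are worth sketching: an aggregator consumes an iterator, and `len()` needs a sized object, so it rejects iterators.

```python
squares = (n ** 2 for n in range(1, 6))

total = sum(squares)        # 55: consumes the whole generator
total_again = sum(squares)  # 0: nothing is left to add

try:
    len(n for n in range(3))
except TypeError as error:
    # Generators have no length; count the items instead, e.g. sum(1 for _ in it).
    len_error = str(error)

print(total, total_again)
print(len_error)
```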

---
**Slicing Iterables**

1. `slice()` - Works only with sequences:
   - Example: Extracting a specific portion of a string representing a date.

In [None]:
date_str = "2023-07-15"
year_slice = slice(0, 4)
month_slice = slice(5, 7)
day_slice = slice(8, 10)

year = date_str[year_slice]
month = date_str[month_slice]
day = date_str[day_slice]

print(f"Year: {year}, Month: {month}, Day: {day}")

`slice` does not work on iterators. A slice object is not iterable, so the loop in the following cell raises a `TypeError`.

In [None]:
numbers = SquareIterator()
first_5_square = slice(numbers, 5)

for num in first_5_square:
    print(num)

2. `islice()` (itertools) - Works with iterators:
   - Example: Extracting a limited number of elements from an infinite iterator.

In [None]:
from itertools import islice

numbers = SquareIterator()
first_5_square = islice(numbers, 5)  # Get the first 5 squares

for num in first_5_square:
    print(num)

print("*" * 20)

numbers = SquareIterator()
next_5_square = islice(numbers, 5, 10)  # Get the 6th through 10th squares (indices 5 to 9)

for num in next_5_square:
    print(num)

---
**Selection and Filtering**

1. `filter()`:
   - Example: Filtering even numbers from a list of integers.

In [None]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def is_even(num):
    return num % 2 == 0

for even_number in filter(is_even, numbers):
    print(even_number)

2. `itertools.filterfalse()`:
   - Example: Keeping only the strings longer than 5 characters from a list of words.

In [None]:
from itertools import filterfalse

words = ["apple", "banana", "orange", "grape", "watermelon"]

for long_word in filterfalse(lambda word: len(word) <= 5, words):
    print(long_word)

3. `functools.partial()` (for filtering with a custom function):
   - Example: Using `functools.partial()` to keep the scores at or above a certain threshold from a list of exam results.

In [None]:
from functools import partial

def is_above_threshold(threshold, score):
    return score >= threshold

exam_results = [85, 90, 78, 92, 68, 95]

threshold_filter = partial(filter, partial(is_above_threshold, 90))
for score_above_threshold in threshold_filter(exam_results):
    print(score_above_threshold)

4. `itertools.takewhile()`:
   - Example: Taking numbers from a list for as long as a condition holds, stopping at the first element that fails it.

In [None]:
from itertools import takewhile

def is_positive(num):
    return num > 0

numbers = [1, 2, 3, -4, 5, 6]
for positive_number in takewhile(is_positive, numbers):
    print(positive_number)

5. `itertools.dropwhile()`:
   - Example: Skipping leading elements of a list for as long as a condition holds, then yielding everything that follows.

In [None]:
from itertools import dropwhile

def is_negative(num):
    return num < 0

numbers = [-1, -2, -3, 4, 5, 6]
for positive_number in dropwhile(is_negative, numbers):
    print(positive_number)
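`takewhile` and `dropwhile` only look at the leading run of elements, which makes them different from `filter`. A short side-by-side sketch with the same predicate and data:

```python
from itertools import dropwhile, takewhile


def is_positive(num):
    return num > 0


numbers = [1, 2, 3, -4, 5, 6]

taken = list(takewhile(is_positive, numbers))    # stops at the first failure
dropped = list(dropwhile(is_positive, numbers))  # starts at the first failure
filtered = list(filter(is_positive, numbers))    # tests every element

print(taken)     # [1, 2, 3]
print(dropped)   # [-4, 5, 6]
print(filtered)  # [1, 2, 3, 5, 6]
```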

6. `itertools.compress()`:
   - Example: Selecting elements from a list based on a corresponding boolean mask.

In [None]:
from itertools import compress

data = [1, 2, 3, 4, 5]
mask = [True, False, True, False, True]

for selected_data in compress(data, mask):
    print(selected_data)

---
**Infinite Iterators**

1. `itertools.count()`:
   - Example: Generating a unique identifier for each item in a list.

In [None]:
import itertools

items = ['apple', 'banana', 'orange', 'grape']
unique_ids = zip(itertools.count(start=1), items)

for uid, item in unique_ids:
    print(f"Item: {item}, ID: {uid}")

In this example, `itertools.count()` is used to generate an infinite sequence of numbers starting from 1. The `zip()` function combines the count with each item from the list, creating unique identifiers for each item.

2. `itertools.cycle()`:
   - Example: Assigning tasks to a group of workers in a cyclical manner.

In [None]:
import itertools

tasks = ['Task A', 'Task B', 'Task C', 'Task D', 'Task E', 'Task F', 'Task G', 'Task H']
workers = ['John', 'Alice', 'Bob']
task_assignments = zip(tasks, itertools.cycle(workers))

for task, worker in task_assignments:
    print(f"Task: {task}, Assigned to: {worker}")

In this example, `itertools.cycle()` is used to create an infinite iterator that cycles through the list of workers. The `zip()` function pairs each task with the next worker in the cycle, ensuring that tasks are assigned in a cyclical fashion.

3. `itertools.repeat()`:
   - Example: Repeating a specific value a certain number of times.

In [None]:
import itertools

value = 'Hello, World!'
repeated_values = itertools.repeat(value, times=3)

for repeated_value in repeated_values:
    print(repeated_value)

In this example, `itertools.repeat()` is used to generate an iterator that repeats a specific value (`'Hello, World!'`) a specified number of times (`3` in this case). Without the `times` argument, `repeat()` yields the value forever. The loop then prints each repeated value.
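Infinite iterators are safe to use whenever something else bounds the iteration. The sketch below pairs an unbounded `repeat()` with a finite `range` inside `map()`, and uses `islice()` to cut off a `count()` that has a custom step.

```python
from itertools import count, islice, repeat

# map() stops when its shortest input (the range) is exhausted.
squares = list(map(pow, range(5), repeat(2)))

# count(start, step) never ends on its own, so islice() takes just three values.
fives = list(islice(count(10, 5), 3))

print(squares)  # [0, 1, 4, 9, 16]
print(fives)    # [10, 15, 20]
```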

---
**Mapping and Reducing**

In [None]:
# Sales data (product, [sales_for_store_1, sales_for_store_2, ...])
sales_data = [
    ("Apple", [100, 150, 120]),
    ("Banana", [80, 90, 70]),
    ("Orange", [200, 180, 220]),
    ("Grapes", [70, 60, 80]),
]

In [None]:
# Using map() to calculate the total sales for each product
total_sales_map = map(lambda product_sales: (product_sales[0], sum(product_sales[1])), sales_data)
print(list(total_sales_map))

In [None]:
from itertools import starmap
from functools import reduce

# Function to calculate total sales for a product
def calculate_total_sales(product, sales_list):
    total_sales = reduce(lambda x, y: x + y, sales_list)
    return (product, total_sales)

# Calculate total sales for each product
total_sales_per_product = list(starmap(calculate_total_sales, sales_data))

print(total_sales_per_product)

In [None]:
from functools import reduce


# Using functools.reduce() to get the grand total sales for all products
grand_total_sales = reduce(lambda total, product_sales: total + sum(product_sales[1]), sales_data, 0)
print(grand_total_sales)

In [None]:
from itertools import accumulate

numbers = [1, 2, 3, 4, 5]
cumulative_sums = list(accumulate(numbers))

print(cumulative_sums)
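`accumulate()` defaults to addition, but it accepts any two-argument function as its second parameter. A brief sketch with a running product and a running maximum (the `readings` data is made up for illustration):

```python
import operator
from itertools import accumulate

numbers = [1, 2, 3, 4, 5]
running_product = list(accumulate(numbers, operator.mul))

readings = [3, 1, 4, 1, 5]
running_max = list(accumulate(readings, max))

print(running_product)  # [1, 2, 6, 24, 120]
print(running_max)      # [3, 3, 4, 4, 5]
```

Unlike `reduce()`, which returns only the final value, `accumulate()` yields every intermediate result.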

---
**Chaining and Teeing**

1. `itertools.chain(*args)`:
   - Example: Combining multiple lists into a single flattened list.

In [None]:
from itertools import chain

list1 = [1, 2, 3]
list2 = [4, 5, 6]
list3 = [7, 8, 9]

combined_list = list(chain(list1, list2, list3))
print(combined_list)

2. `itertools.chain.from_iterable(it)`:
   - Example: Flattening a list of lists into a single list.

In [None]:
from itertools import chain

nested_lists = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

flattened_list = list(chain.from_iterable(nested_lists))
print(flattened_list)

3. `itertools.tee(iterable, n)`:
   - Example: Creating multiple independent copies of an iterator.

In [None]:
from itertools import tee

numbers = [1, 2, 3, 4, 5]

# Create two independent copies of the iterator
iterator1, iterator2 = tee(numbers, 2)

# Using the first iterator
for num in iterator1:
    print(f"Iterator 1: {num}")

# Using the second iterator
for num in iterator2:
    print(f"Iterator 2: {num}")
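A typical use of `tee()` is to walk the same stream at two different offsets. The following sketch pairs each element with its successor; `pairwise_sketch` is an illustrative name for this example.

```python
from itertools import tee


def pairwise_sketch(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    first, second = tee(iterable)
    # Advance the second copy by one so it runs one step ahead of the first.
    next(second, None)
    return zip(first, second)


pairs = list(pairwise_sketch([1, 2, 3, 4]))
print(pairs)  # [(1, 2), (2, 3), (3, 4)]
```

Once an iterator has been passed to `tee()`, only the returned copies should be used; advancing the original would make the copies skip values.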

---
**Grouping**

1. `itertools.groupby()`:
   - Example: Grouping students based on their grades in a class.

In [None]:
from itertools import groupby

def get_grade(student):
    score = student['score']
    if score >= 90:
        return 'A'
    elif score >= 80:
        return 'B'
    elif score >= 70:
        return 'C'
    elif score >= 60:
        return 'D'
    else:
        return 'F'

students = [
    {'name': 'Alice', 'score': 88},
    {'name': 'Bob', 'score': 95},
    {'name': 'Charlie', 'score': 72},
    {'name': 'David', 'score': 63},
    {'name': 'Eva', 'score': 78},
    {'name': 'Frank', 'score': 92},
]

students.sort(key=get_grade)  # groupby only groups consecutive items, so sort by grade first
grouped_students = {grade: list(group) for grade, group in groupby(students, key=get_grade)}

# Separate loop names avoid rebinding the `students` list
for grade, members in grouped_students.items():
    print(f"Students with grade {grade}: {[student['name'] for student in members]}")
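The sort before `groupby()` matters because `groupby()` starts a new group every time the key changes. A minimal sketch of the difference, using a short string as stand-in data:

```python
from itertools import groupby

letters = "AAABBA"

# Without sorting, the trailing "A" forms its own group.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(letters)]

# After sorting, equal keys are adjacent and end up in one group each.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(letters))]

print(unsorted_groups)  # [('A', 3), ('B', 2), ('A', 1)]
print(sorted_groups)    # [('A', 4), ('B', 2)]
```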

2. `collections.defaultdict()`:
   - Example: Counting the occurrences of each word in a text document.

In [None]:
from collections import defaultdict

text = "Lorem ipsum dolor sit amet consectetur adipiscing elit Lorem ipsum ipsum."
word_counts = defaultdict(int)

for word in text.split():
    # Strip surrounding punctuation so "ipsum." is counted together with "ipsum"
    word_counts[word.strip(".,")] += 1

print(word_counts)
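For plain counting, `collections.Counter` is a common alternative to a `defaultdict(int)`. This sketch uses a made-up sentence rather than the text above:

```python
from collections import Counter

text = "red blue red green blue red"
word_counts = Counter(text.split())

# most_common(n) returns the n highest counts, largest first.
top_two = word_counts.most_common(2)

print(top_two)                # [('red', 3), ('blue', 2)]
print(word_counts["yellow"])  # 0: missing keys count as zero
```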

---
**Combinatorics**

1. `itertools.permutations()`:
   - Example: Finding all possible arrangements of a set of letters to form words.

In [None]:
from itertools import permutations

letters = ['a', 'b', 'c']

for perm in permutations(letters):
    print(''.join(perm))

2. `itertools.combinations()`:
   - Example: Selecting all possible combinations of items from a set.

In [None]:
from itertools import combinations

colors = ['red', 'green', 'blue']

for comb in combinations(colors, 2):
    print(comb)

3. `itertools.product()`:
   - Example: Generating the Cartesian product (every possible pairing) of items from multiple sets.

In [None]:
from itertools import product

numbers = [1, 2]
letters = ['A', 'B']

for prod in product(numbers, letters):
    print(prod)
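The three functions differ in whether order matters and whether an item may repeat. A compact sketch over the same three letters, selecting two at a time:

```python
from itertools import combinations, permutations, product

letters = ["a", "b", "c"]

# Order matters, no repeats: ('a', 'b') and ('b', 'a') are both included.
ordered_pairs = list(permutations(letters, 2))

# Order does not matter, no repeats: only ('a', 'b').
unordered_pairs = list(combinations(letters, 2))

# Order matters, repeats allowed: also ('a', 'a'), ('b', 'b'), ('c', 'c').
all_pairs = list(product(letters, repeat=2))

print(len(ordered_pairs), len(unordered_pairs), len(all_pairs))  # 6 3 9
```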

\[<< [Other functions concepts](./05_other_functions_concepts.ipynb) | [Index](./00_index.ipynb) | [Context Managers](./07_context_managers.ipynb) >>\]