## More Useful Python Data Structures and Algorithms

## bisect
- This module provides support for maintaining a list in sorted order without having to sort the list after each insertion. 
- For long lists of items with expensive comparison operations, this can be an improvement over linear searches or frequent resorting.

- **Maintaining a Real-Time Leaderboard or Ranked List**
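    - When managing a dynamic list where order must be preserved, such as a game leaderboard or a list of stock prices, using insort avoids re-sorting the entire list after every addition. Finding the insertion point is O(log n), although the insertion itself is still O(n) because the list elements after it must be shifted.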
- **Performing Efficient Range Queries**
    - For large datasets, you can quickly find all elements within a specific range without scanning the entire list. bisect_left and bisect_right find the boundary indices in logarithmic time (O(log n)), and slicing extracts the sub-list efficiently. 

- **Implementing Numeric Table Lookups (Grading Systems)**
    - The bisect function (an alias for bisect_right) is effective for mapping numeric values to specific categories or bins, such as assigning letter grades based on scores.
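The leaderboard use case listed above can be sketched as follows. This is a minimal illustration, not a full design: the player names and scores are made up, and storing `(score, name)` tuples is only one possible layout.

```python
import bisect

leaderboard = []  # kept sorted by ascending score at all times

# Hypothetical results arriving one at a time
for name, score in [("ann", 70), ("bob", 95), ("cy", 82), ("dee", 95)]:
    # Tuples compare by score first, then by name, so no key function is needed
    bisect.insort(leaderboard, (score, name))

# Highest scores are at the end of the list, so reverse before taking the top three
top_three = leaderboard[::-1][:3]
print(top_three)
```

The list never needs a full `sort()` call; each new result goes straight to its sorted position.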

In [10]:
import bisect

sorted_numbers = [1, 4, 5, 10, 15, 20, 25]
start_range = 5
end_range = 20

# Find the leftmost index for the start of the range
left_bound = bisect.bisect_left(sorted_numbers, start_range)
# Find the rightmost index for the end of the range
right_bound = bisect.bisect_right(sorted_numbers, end_range)

# Extract the elements within that range
elements_in_range = sorted_numbers[left_bound:right_bound]
print(f"Elements between {start_range} and {end_range}: {elements_in_range}")
# Output: Elements between 5 and 20: [5, 10, 15, 20]

Elements between 5 and 20: [5, 10, 15, 20]


In [11]:
import bisect


def grade(score, breakpoints=[60, 70, 80, 90], grades='FDCBA'):
    # bisect returns the index in 'grades' corresponding to the score
    i = bisect.bisect(breakpoints, score)
    return grades[i]


scores_to_check = [33, 99, 77, 70, 89, 90, 100]
results = [grade(score) for score in scores_to_check]

print(f"Scores: {scores_to_check}")
print(f"Grades: {results}")
# Output: Grades: ['F', 'A', 'C', 'C', 'B', 'A', 'A']

Scores: [33, 99, 77, 70, 89, 90, 100]
Grades: ['F', 'A', 'C', 'C', 'B', 'A', 'A']


In [12]:
import bisect

data = [('black', 0), ('blue', 1), ('red', 5), ('yellow', 8)]
def get_price(r): return r[1]


# 1. Finding an insertion point
search_price = 7
index = bisect.bisect_left(data, search_price, key=get_price)
print(f"Insertion index for price {search_price}: {index}")

# 2. Inserting a new item (using insort)
new_product = ('brown', 7)
bisect.insort(data, new_product, key=get_price)
print(f"List after insort: {data}")

Insertion index for price 7: 3
List after insort: [('black', 0), ('blue', 1), ('red', 5), ('brown', 7), ('yellow', 8)]


In [13]:
# More performant alternative for many searches
data = [('black', 0), ('blue', 1), ('red', 5), ('yellow', 8)]
keys = [r[1] for r in data]  # Precomputed keys

search_price = 7
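i = bisect.bisect_left(keys, search_price)  # insertion point; 7 itself is not in keys
# data[i] is the first item priced at or above search_price (valid only if i < len(data))
print(f"First item with price >= {search_price} is at index {i}: {data[i]}")

First item with price >= 7 is at index 3: ('yellow', 8)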


### Use a binary search

In [14]:
import bisect


def binary_search_with_bisect(nums, target):
    i = bisect.bisect_left(nums, target)
    # Check two conditions to confirm the target is actually in the list:
    # 1. 'i' must be a valid index (not equal to the length of the list)
    # 2. The element at index 'i' must exactly match the target value
    return i if i != len(nums) and nums[i] == target else -1


sorted_list = [1, 4, 15, 23, 30, 45, 55, 60]
target_present = 23
target_absent = 50

print(
    f"Index of {target_present}: {binary_search_with_bisect(sorted_list, target_present)}")
# Output: Index of 23: 3

print(
    f"Index of {target_absent}: {binary_search_with_bisect(sorted_list, target_absent)}")
# Output: Index of 50: -1

Index of 23: 3
Index of 50: -1


### bisect and bisect_right
- bisect(a, x, lo=0, hi=len(a), *, key=None)
- The returned insertion point ip partitions the array a into two slices such that all(elem <= x for elem in a[lo : ip]) is true for the left slice and all(elem > x for elem in a[ip : hi]) is true for the right slice.
- **Returns an index**

In [5]:
import bisect 
sorted_list = [10, 20, 30, 40, 50]

print(bisect.bisect(sorted_list, 24))  # same as bisect_right


2


### bisect_left
- Locate the insertion point for x in a to maintain sorted order. 
- The parameters lo and hi may be used to specify a subset of the list which should be considered; by default the entire list is used. 
- If x is already present in a, the insertion point will be before (to the left of) any existing entries. 
- The return value is suitable for use as the first parameter to list.insert() assuming that a is already sorted.
- **Returns an index**

In [6]:
import bisect
sorted_list = [10, 20, 30, 40, 50]

print(bisect.bisect_left(sorted_list, 24))  # left


2
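Both cells above print 2 because 24 is not in the list. The two functions only differ when the value is already present. A small sketch with made-up data that shows the difference:

```python
import bisect

values = [10, 20, 20, 20, 30]

left = bisect.bisect_left(values, 20)    # index before the run of 20s
right = bisect.bisect_right(values, 20)  # index after the run of 20s
occurrences = right - left               # size of the run

print(left, right, occurrences)
```

Subtracting the two indices is a quick way to count how often a value occurs in a sorted list.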


### insort and insort_right
- Inserts x into a in sorted order, modifying the original list in place (the function returns None).
- This function first runs bisect_right() to locate an insertion point. Next, it runs the insert() method on a to insert x at the appropriate position to maintain sort order.

- To support inserting records in a table, the key function (if any) is applied to x for the search step but not for the insertion step.
- If a already contains entries equal to x, the new item is inserted after (to the right of) them.

In [7]:
import bisect
sorted_list = [10, 20, 30, 40, 50]

print(bisect.insort(sorted_list, 24))  # insort is insort_right; prints None because the list is modified in place
print(sorted_list)

None
[10, 20, 24, 30, 40, 50]


### insort_left
- Insert x in a in sorted order.
- This function first runs bisect_left() to locate an insertion point. Next, it runs the insert() method on a to insert x at the appropriate position to maintain sort order.
- To support inserting records in a table, the key function (if any) is applied to x for the search step but not for the insertion step.
- If a already contains entries equal to x, the new item is inserted before (to the left of) them.

In [8]:
import bisect
sorted_list = [10, 20, 30, 40, 50]

print(bisect.insort_left(sorted_list, 24))  # left
print(sorted_list)

None
[10, 20, 24, 30, 40, 50]


## defaultdict - safe dictionary
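- collections.defaultdict is a dict subclass that calls default_factory to create a value for a missing key instead of raising KeyError.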
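A minimal sketch of why defaultdict is described as a "safe" dictionary: indexing a missing key creates a default value instead of raising KeyError. The word list is made up for illustration.

```python
from collections import defaultdict

words_by_length = defaultdict(list)  # default_factory is list

for word in ["apple", "kiwi", "pear", "fig"]:
    # No need to check whether the key exists first
    words_by_length[len(word)].append(word)

print(dict(words_by_length))

# get() does not trigger default_factory, so it leaves the dictionary unchanged
missing = words_by_length.get(99)
print(missing, 99 in words_by_length)
```

Note that only indexing (`d[key]`) invokes the factory; `get()` and `in` behave as they do on a plain dict.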

### clear

### copy

### default_factory

### fromkeys

### get

### items

### keys

### pop

### popitem

### setdefault

### update

### values

## functools
- The functools module provides higher-order functions and operations on callable objects.
- Use it for caching, partial function application, decorators, and tools that help build function-based utilities.

### cache
- Simple unbounded cache decorator (equivalent to lru_cache(maxsize=None)).
- **decorator**
- Results are cached by the full set of call arguments, so every argument must be hashable.
- If only one or two arguments should form the cache key but the function needs more, wrap the cached logic in an inner function that takes just those arguments.

In [4]:
from functools import cache 

@cache
def fib(n):
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)

print(fib(5))
print(fib(20))
print(fib(30))
print(fib(300))

5
6765
832040
222232244629420445529739893461909967206666939096499764990979600
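The inner-function advice above can be sketched like this. The example is hypothetical: `count_paths` and its grid are invented for illustration. The grid is a list (unhashable, so it could not be a cache key), and only the coordinates are passed to the cached inner function.

```python
from functools import cache


def count_paths(grid):
    """Count right/down paths through open cells (0 = open, 1 = blocked)."""
    rows, cols = len(grid), len(grid[0])

    @cache  # only (r, c) form the cache key; grid is captured by the closure
    def walk(r, c):
        if r >= rows or c >= cols or grid[r][c] == 1:
            return 0
        if (r, c) == (rows - 1, cols - 1):
            return 1
        return walk(r + 1, c) + walk(r, c + 1)

    return walk(0, 0)


result = count_paths([[0, 0, 0],
                      [0, 1, 0],
                      [0, 0, 0]])
print(result)
```

Because `walk` is created on each call to `count_paths`, its cache is also discarded when the call returns, which avoids stale results for a different grid.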


### cached_property
- Descriptor that caches the result of a method as a property.
- Transforms a method of a class into a property whose value is computed once and then cached.
- After the first access, the result is stored as a normal attribute on the instance, and subsequent lookups are significantly faster because they bypass the method entirely. 
- **Thread Safety**: cached_property does not prevent a race in multi-threaded code; the getter can run more than once on the same instance if several threads access it before the first call finishes. Since Python 3.12 it uses no lock at all (earlier versions held an undocumented per-property lock).

In [5]:
import time
from functools import cached_property


class DataAnalyzer:
    def __init__(self, data):
        self.data = data

    @cached_property
    def complex_analysis(self):
        print("Computing expensive analysis...")
        time.sleep(2)  # Simulate a heavy 2-second task
        return sum(self.data) / len(self.data)


# Usage
analyzer = DataAnalyzer([10, 20, 30, 40, 50])

# First access: Runs the method (takes 2 seconds)
print(analyzer.complex_analysis)

# Second access: Returns the cached value instantly
print(analyzer.complex_analysis)

Computing expensive analysis...
30.0
30.0
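Because the cached value is stored as an ordinary instance attribute, it does not update when the underlying data changes. A minimal sketch, using a hypothetical `Circle` class, of how deleting the attribute forces a recomputation:

```python
import math
from functools import cached_property


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.computations = 0  # counts how often area is actually computed

    @cached_property
    def area(self):
        self.computations += 1
        return math.pi * self.radius ** 2


circle = Circle(1)
first = circle.area
second = circle.area   # served from the instance, no recomputation
circle.radius = 2
stale = circle.area    # still the value computed for radius 1
del circle.area        # drop the cached value
fresh = circle.area    # recomputed with radius 2

print(circle.computations, stale == first, fresh)
```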


### cmp_to_key
- Convert an old-style comparison function to a key function.

In [7]:
from functools import cmp_to_key
def compare_len(a, b): return len(a) - len(b)


sorted_list = sorted(["apple", "cat", "banana"],
                     key=cmp_to_key(compare_len))
print(sorted_list)

['cat', 'apple', 'banana']


In [14]:
def compare_len2(x):
    return len(x)

print(sorted(["apple", "cat", "banana"],
       key=compare_len2))

['cat', 'apple', 'banana']


### lru_cache
- Decorator to wrap a function with a Least-Recently-Used cache.
- @functools.lru_cache(maxsize=128): Similar to @cache, but it bounds memory usage by keeping at most maxsize of the most recently used results and discarding the least recently used entry when the cache is full. The limit can be tuned via maxsize.

In [15]:
from functools import lru_cache


@lru_cache(maxsize=128)
def fib(n):
    if n <= 0:
        return 0
    elif n == 1:
        return 1
    else:
        return fib(n - 1) + fib(n - 2)


print(fib(5))
print(fib(20))
print(fib(30))
print(fib(300))

5
6765
832040
222232244629420445529739893461909967206666939096499764990979600
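To make the eviction behaviour visible, here is a small sketch with a deliberately tiny cache. The `square` function is a stand-in for something expensive; `cache_info()` is the statistics method that lru_cache adds to the wrapped function.

```python
from functools import lru_cache


@lru_cache(maxsize=2)
def square(n):
    return n * n


square(1)  # miss
square(2)  # miss
square(1)  # hit; 2 is now the least recently used entry
square(3)  # miss; evicts 2
square(2)  # miss again, because 2 was evicted

info = square.cache_info()
print(info)
```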


### partial
- Create a new function with partial application of the given arguments.

In [17]:
from functools import partial
int2 = partial(int, base=2)  # "Freezes" base at 2
print(int2('101'))  # Outputs: 5

5


### partialmethod
- A version of partial specifically for methods inside a class, allowing you to create new methods with pre-filled arguments.

In [18]:
import functools


class Switch:
    def __init__(self):
        self._state = False  # Initial state is off

    @property
    def alive(self):
        """Returns the current state of the switch."""
        return self._state

    def set_state(self, state):
        """Generic method to set the switch's state."""
        self._state = bool(state)
        print(f"State set to {self._state}")

    # Create specialized methods using partialmethod
    # These automatically bind 'self' and a specific state argument
    turn_on = functools.partialmethod(set_state, True)
    turn_off = functools.partialmethod(set_state, False)


# Demonstration
my_switch = Switch()

print(f"Initial state: {my_switch.alive}")

# Use the partial methods
my_switch.turn_on()
print(f"State after turning on: {my_switch.alive}")

my_switch.turn_off()
print(f"State after turning off: {my_switch.alive}")

Initial state: False
State set to True
State after turning on: True
State set to False
State after turning off: False


### reduce
- Apply a function of two arguments cumulatively to the items of a sequence.
- Repeatedly applies a function to elements in an iterable to reduce them to a single cumulative value.

In [19]:
from functools import reduce
product = reduce(lambda x, y: x * y, [1, 2, 3, 4])  # 1*2*3*4 = 24
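reduce also accepts an optional third argument, an initial value, which is placed before the items and is returned as-is for an empty iterable. A short sketch with illustrative numbers:

```python
from functools import reduce
import operator

total = reduce(operator.add, [1, 2, 3, 4], 100)  # starts from 100
empty_total = reduce(operator.add, [], 0)         # the initial value is returned

# Without an initial value, an empty iterable raises TypeError
try:
    reduce(operator.add, [])
except TypeError:
    raised = True
else:
    raised = False

print(total, empty_total, raised)
```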

### singledispatch
- Decorator for single-dispatch generic functions.
- A tool for ad-hoc polymorphism (a form of function overloading) in Python: a single function name can have multiple implementations, and the one to run is chosen at runtime based on the type of the first argument.
- **function overload**

In [23]:
from functools import singledispatch

# 1. Define the default function and decorate it with @singledispatch


@singledispatch
def process_data(data):
    """Default implementation if the type is not registered."""
    return f"Processing generic data: {data}"

# 2. Register a specific function for the 'int' type


@process_data.register(int)
def _(data):
    """Handles integer data."""
    return f"Integer: {data}"

# 3. Register a specific function for the 'str' type


@process_data.register(str)
def _(data):
    """Handles string data."""
    return f"String: '{data}'"

# 4. Register a specific function for the 'list' type


@process_data.register(list)
def _(data):
    """Handles list data by enumerating elements."""
    output = "List items:\n"
    for i, elem in enumerate(data):
        output += f"  {i}: {elem}\n"
    return output


# Example Usage
print(process_data(25))
print("-" * 20)
print(process_data("Welcome to singledispatch"))
print("-" * 20)
print(process_data([4, 3, 6]))
print("-" * 20)
print(process_data({"key": "value"}))  # Uses the default implementation

Integer: 25
--------------------
String: 'Welcome to singledispatch'
--------------------
List items:
  0: 4
  1: 3
  2: 6

--------------------
Processing generic data: {'key': 'value'}
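Two details are worth a small sketch: register() can also read the target type from an annotation, and dispatch follows inheritance, so a subclass without its own registration uses the implementation of its nearest registered base class. The `describe` function is hypothetical.

```python
from functools import singledispatch


@singledispatch
def describe(value):
    return "object"


@describe.register
def _(value: int):  # type taken from the annotation
    return "int"


@describe.register
def _(value: float):
    return "float"


as_bool = describe(True)   # bool is a subclass of int
as_float = describe(2.5)
as_text = describe("x")    # no registration for str, falls back to the default

print(as_bool, as_float, as_text)
```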


### singledispatchmethod
- Single-dispatch generic method descriptor.
- **decorator**
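- Dispatch happens on the type of the first argument after self (or cls). It can be combined with other decorators such as @classmethod or @abstractmethod (from the abc module), provided singledispatchmethod is the outermost decorator.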

In [24]:
from functools import singledispatchmethod


class DataManager:
    def __init__(self):
        self.history = []

    @singledispatchmethod
    def add_record(self, data):
        """The base implementation (fallback)."""
        raise TypeError(f"Unsupported type: {type(data)}")

    @add_record.register
    def _(self, data: int):
        """Specific logic for integers."""
        print(f"Adding integer record: {data}")
        self.history.append(data)

    @add_record.register
    def _(self, data: list):
        """Specific logic for lists."""
        print(f"Processing batch of {len(data)} items")
        self.history.extend(data)


# Usage
manager = DataManager()
manager.add_record(42)          # Dispatches to integer method
manager.add_record([10, 20])    # Dispatches to list method
# manager.add_record("string")  # Raises TypeError (fallback)

Adding integer record: 42
Processing batch of 2 items


### total_ordering
- Class decorator that fills in missing ordering methods.
- **decorator**

In [25]:
from functools import total_ordering


@total_ordering
class Student:
    def __init__(self, name, grade):
        self.name = name
        self.grade = grade

    def __eq__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return self.grade == other.grade

    def __lt__(self, other):
        if not isinstance(other, Student):
            return NotImplemented
        return self.grade < other.grade


# Testing the ordering
alice = Student("Alice", 90)
bob = Student("Bob", 85)

print(f"Is Alice better than Bob? {alice > bob}")   # True (generated)
print(f"Is Bob worse than Alice? {bob < alice}")   # True (defined)
print(f"Is Alice >= Bob? {alice >= bob}")         # True (generated)
print(f"Is Bob <= Alice? {bob <= alice}")         # True (generated)

Is Alice better than Bob? True
Is Bob worse than Alice? True
Is Alice >= Bob? True
Is Bob <= Alice? True


### update_wrapper
- Update a wrapper function to look like the wrapped function.
- Unlike wraps, it is called as a plain function rather than applied as a decorator. It is mainly used when implementing decorators as classes, where @wraps cannot be applied to the instance directly.

In [26]:
from functools import update_wrapper


class CountCalls:
    def __init__(self, func):
        update_wrapper(self, func)  # Copy metadata from func to 'self'
        self.func = func
        self.num_calls = 0

    def __call__(self, *args, **kwargs):
        self.num_calls += 1
        return self.func(*args, **kwargs)


@CountCalls
def add(a, b):
    """Adds two numbers."""
    return a + b


print(add.__name__)  # Output: 'add'
print(add.__doc__)   # Output: 'Adds two numbers.'

add
Adds two numbers.


### wraps
- A convenience decorator used inside other decorators; it is the most common way to preserve metadata. It applies update_wrapper to the wrapper function for you.

- update_wrapper and wraps
    - Both make a wrapper function "inherit" the metadata (such as the name, docstring, and annotations) of the function it wraps. Without them, a decorated function reports the identity of the wrapper rather than the original.

In [27]:
from functools import wraps


def my_logger(func):
    @wraps(func)  # Ensures 'say_hello' keeps its name and docstring
    def wrapper(*args, **kwargs):
        print(f"Logging: Calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper


@my_logger
def say_hello(name: str):
    """Greets a person by name."""
    print(f"Hello, {name}!")


# Metadata is preserved
print(say_hello.__name__)  # Output: 'say_hello' (not 'wrapper')
print(say_hello.__doc__)   # Output: 'Greets a person by name.'

say_hello
Greets a person by name.
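For contrast, the sketch below decorates one function without wraps and one with it. The decorator and function names are made up. It also shows the `__wrapped__` attribute that update_wrapper and wraps add, which points back at the original function.

```python
from functools import wraps


def plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def preserving(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain
def first():
    """First docstring."""


@preserving
def second():
    """Second docstring."""


print(first.__name__, first.__doc__)    # metadata of the inner wrapper
print(second.__name__, second.__doc__)  # metadata of the original function
print(second.__wrapped__.__name__)      # the undecorated function is still reachable
```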


## itertools
- An iterator, by definition, maintains its state as it is consumed and cannot be "reset" to its starting position after it has been exhausted. Once next() raises StopIteration, the process is complete.
- **BE CAREFUL WHEN USING AN ITERATOR FOR MULTIPLE OPERATIONS**
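A short sketch of the pitfall described above, with two common workarounds: materialise the values in a list, or split the iterator with itertools.tee. The sample data is arbitrary.

```python
from itertools import tee

numbers = iter([1, 2, 3])
first_pass = list(numbers)   # consumes the iterator
second_pass = list(numbers)  # nothing left to yield

# Workaround: tee() gives independent iterators over the same source.
# The original iterator should not be used again after calling tee().
a, b = tee(iter([1, 2, 3]))
from_a = list(a)
from_b = list(b)

print(first_pass, second_pass, from_a, from_b)
```

If the data fits in memory, simply keeping the list (`first_pass` here) is usually the simpler option.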

### accumulate
- accumulate(iterable[, function, *, initial=None])
- Make an iterator that returns accumulated sums or accumulated results from other binary functions.
- The function defaults to addition. The function should accept two arguments, an accumulated total and a value from the iterable.
- If an initial value is provided, the accumulation will start with that value and the output will have one more element than the input iterable.

In [22]:
from itertools import accumulate, repeat
import operator 


data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8]
print(list(accumulate(data, operator.add)))  # running sum

print(list(accumulate(data, operator.mul)))  # running product


def update(balance, payment): return round(balance * 1.05) - payment


print(list(accumulate(repeat(90, 10), update, initial=1_000)))

[3, 7, 13, 15, 16, 25, 25, 32, 37, 45]
[3, 12, 72, 144, 144, 1296, 0, 0, 0, 0]
[1000, 960, 918, 874, 828, 779, 728, 674, 618, 559, 497]


In [47]:
print(list(accumulate([1, 2, 3, 4, 5])))

[1, 3, 6, 10, 15]


### chain
- Make an iterator that returns elements from the first iterable until it is exhausted, then proceeds to the next iterable, until all of the iterables are exhausted. This combines multiple data sources into a single iterator. Roughly equivalent to:
```python
def chain(*iterables):
    # chain('ABC', 'DEF') → A B C D E F
    for iterable in iterables:
        yield from iterable
```

In [24]:
from itertools import chain

list1 = [1, 2, 3]
tuple1 = ('a', 'b', 'c')
string1 = "DEF"

# Chain the iterables together
combined_iterator = chain(list1, tuple1, string1)

# Iterate over the combined sequence
print("Chained sequence:")
for item in combined_iterator:
    print(item, end=' ')
# Output: 1 2 3 a b c D E F

# To get a list from the iterator (consumes the iterator)
combined_list = list(chain(list1, tuple1, string1))
print(f"\n\nCombined list: {combined_list}")
# Output: Combined list: [1, 2, 3, 'a', 'b', 'c', 'D', 'E', 'F']
print("type of chain: ", combined_iterator)

Chained sequence:
1 2 3 a b c D E F 

Combined list: [1, 2, 3, 'a', 'b', 'c', 'D', 'E', 'F']
type of chain:  <itertools.chain object at 0x112657ca0>


In [25]:
from itertools import chain

nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Unpack the nested list into arguments for chain()
flattened_list = list(chain(*nested_list))

print(f"Flattened list: {flattened_list}")
# Output: Flattened list: [1, 2, 3, 4, 5, 6, 7, 8, 9]

Flattened list: [1, 2, 3, 4, 5, 6, 7, 8, 9]
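Unpacking with `*` evaluates the whole outer iterable up front. When the outer iterable is itself lazy, the alternative constructor chain.from_iterable takes it as a single argument instead. A minimal sketch with a made-up generator:

```python
from itertools import chain


def rows():
    # Hypothetical lazy source of rows
    yield [1, 2]
    yield [3]
    yield [4, 5, 6]


flat = list(chain.from_iterable(rows()))
print(flat)
```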


### combinations
- A selection of r elements where order does not matter: (1, 2, 3) and (3, 2, 1) count as the same combination. When order matters, use permutations instead.
- **combinations(iterable, r)**
- Return r length subsequences of elements from the input iterable.

- The output is a subsequence of product() keeping only entries that are subsequences of the iterable. The length of the output is given by math.comb() which computes n! / r! / (n - r)! when 0 ≤ r ≤ n or zero when r > n.

- The combination tuples are emitted in lexicographic order according to the order of the input iterable. If the input iterable is sorted, the output tuples will be produced in sorted order.

- Elements are treated as unique based on their position, not on their value. If the input elements are unique, there will be no repeated values within each combination.

- Roughly equivalent to:
```python
def combinations(iterable, r):
    # combinations('ABCD', 2) → AB AC AD BC BD CD
    # combinations(range(4), 3) → 012 013 023 123

    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))

    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        yield tuple(pool[i] for i in indices)
```

In [45]:
from itertools import combinations

print(list(combinations([1, 2, 3, 4, 5], 3)))

[(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4), (1, 3, 5), (1, 4, 5), (2, 3, 4), (2, 3, 5), (2, 4, 5), (3, 4, 5)]
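To check the counting formula and the "order does not matter" point, this sketch compares combinations with permutations on the same small input and verifies the sizes against math.comb and math.perm.

```python
import math
from itertools import combinations, permutations

items = "ABCD"

combos = list(combinations(items, 2))   # ('A', 'B') appears, ('B', 'A') does not
perms = list(permutations(items, 2))    # both orders appear

print(len(combos), math.comb(4, 2))
print(len(perms), math.perm(4, 2))
```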


### combinations_with_replacement
- **combinations_with_replacement(iterable, r)**
- Return r length subsequences of elements from the input iterable allowing individual elements to be repeated more than once.

- The output is a subsequence of product() that keeps only entries that are subsequences (with possible repeated elements) of the iterable. The number of subsequences returned is (n + r - 1)! / r! / (n - 1)! when n > 0.

- The combination tuples are emitted in lexicographic order according to the order of the input iterable. If the input iterable is sorted, the output tuples will be produced in sorted order.

- Elements are treated as unique based on their position, not on their value. If the input elements are unique, the generated combinations will also be unique.

- Roughly equivalent to:
```python
def combinations_with_replacement(iterable, r):
    # combinations_with_replacement('ABC', 2) → AA AB AC BB BC CC

    pool = tuple(iterable)
    n = len(pool)
    if not n and r:
        return
    indices = [0] * r

    yield tuple(pool[i] for i in indices)
    while True:
        for i in reversed(range(r)):
            if indices[i] != n - 1:
                break
        else:
            return
        indices[i:] = [indices[i] + 1] * (r - i)
        yield tuple(pool[i] for i in indices)
```

In [48]:
from itertools import combinations_with_replacement

print(list(combinations_with_replacement(['a', 'b', 'c', 'd', 'e'], 3)))

[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'a', 'c'), ('a', 'a', 'd'), ('a', 'a', 'e'), ('a', 'b', 'b'), ('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'b', 'e'), ('a', 'c', 'c'), ('a', 'c', 'd'), ('a', 'c', 'e'), ('a', 'd', 'd'), ('a', 'd', 'e'), ('a', 'e', 'e'), ('b', 'b', 'b'), ('b', 'b', 'c'), ('b', 'b', 'd'), ('b', 'b', 'e'), ('b', 'c', 'c'), ('b', 'c', 'd'), ('b', 'c', 'e'), ('b', 'd', 'd'), ('b', 'd', 'e'), ('b', 'e', 'e'), ('c', 'c', 'c'), ('c', 'c', 'd'), ('c', 'c', 'e'), ('c', 'd', 'd'), ('c', 'd', 'e'), ('c', 'e', 'e'), ('d', 'd', 'd'), ('d', 'd', 'e'), ('d', 'e', 'e'), ('e', 'e', 'e')]


### compress
- **compress(data, selectors)**
- Make an iterator that returns elements from data where the corresponding element in selectors is true. Stops when either the data or selectors iterables have been exhausted. Roughly equivalent to:
```python
def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) → A C E F
    return (datum for datum, selector in zip(data, selectors) if selector)
```

In [49]:
from itertools import compress

data = ['A', 'B', 'C', 'D', 'E', 'F']
# Boolean values or truthy/falsy values
selectors = [True, False, True, False, True, True]

# Create the compressed iterator
filtered_data = compress(data, selectors)

# Iterate over the result (or convert to a list to view all at once)
result_list = list(filtered_data)
print(result_list)

['A', 'C', 'E', 'F']


### count
- **itertools.count(start=0, step=1)**
- Make an iterator that returns evenly spaced values beginning with start. Can be used with map() to generate consecutive data points or with zip() to add sequence numbers. Roughly equivalent to:
```python
def count(start=0, step=1):
    # count(10) → 10 11 12 13 14 ...
    # count(2.5, 0.5) → 2.5 3.0 3.5 ...
    n = start
    while True:
        yield n
        n += step
```

In [30]:
from itertools import count 

counter = count(start=0, step = -10)
next_val = next(counter)
results = []
while next_val > -200:
    results.append(next_val)
    next_val = next(counter)
print(results)

[0, -10, -20, -30, -40, -50, -60, -70, -80, -90, -100, -110, -120, -130, -140, -150, -160, -170, -180, -190]


### cycle
- **cycle(iterable)**
- Make an iterator returning elements from the iterable and saving a copy of each. When the iterable is exhausted, return elements from the saved copy. Repeats indefinitely. Roughly equivalent to:
```python
def cycle(iterable):
    # cycle('ABCD') → A B C D A B C D A B C D ...

    saved = []
    for element in iterable:
        yield element
        saved.append(element)

    while saved:
        for element in saved:
            yield element
```


In [33]:
from itertools import cycle 

circular_counter = cycle([11, 22, 33])
results = []
for _ in range(12):
    results.append(next(circular_counter))
print(results)

[11, 22, 33, 11, 22, 33, 11, 22, 33, 11, 22, 33]


### dropwhile
- **dropwhile(predicate, iterable)**
- Make an iterator that drops elements from the iterable while the predicate is true and afterwards returns every element. Roughly equivalent to:
```python
def dropwhile(predicate, iterable):
    # dropwhile(lambda x: x<5, [1,4,6,3,8]) → 6 3 8

    iterator = iter(iterable)
    for x in iterator:
        if not predicate(x):
            yield x
            break

    for x in iterator:
        yield x
```
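- Only leading elements are dropped: once the predicate returns false for an element, that element and everything after it are returned, including later items that would satisfy the predicate.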

In [50]:
from itertools import dropwhile
# Good for dropping leading stuff
text = "123ABC456"

# str.isdigit is used as the predicate
alpha_part = dropwhile(str.isdigit, text)

print(''.join(alpha_part))

ABC456
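The sketch below puts dropwhile next to its counterpart takewhile and a plain filter on the same data, to show that dropwhile stops testing after the first failure (3 is kept even though it is below 5).

```python
from itertools import dropwhile, takewhile

data = [1, 4, 6, 3, 8]


def below_five(x):
    return x < 5


dropped = list(dropwhile(below_five, data))        # skips the leading 1 and 4 only
taken = list(takewhile(below_five, data))          # keeps the leading 1 and 4 only
filtered = [x for x in data if not below_five(x)]  # tests every element

print(dropped, taken, filtered)
```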


### filterfalse
- **filterfalse(predicate, iterable)**
- Make an iterator that filters elements from the iterable returning only those for which the predicate returns a false value. If predicate is None, returns the items that are false. Roughly equivalent to:
```python
def filterfalse(predicate, iterable):
    # filterfalse(lambda x: x<5, [1,4,6,3,8]) → 6 8

    if predicate is None:
        predicate = bool

    for x in iterable:
        if not predicate(x):
            yield x
```

In [52]:
from itertools import filterfalse

print(list(filterfalse(lambda x: x<5, [1,4,6,3,8])))

[6, 8]


### groupby
- **groupby(iterable, key=None)**

- Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.

- The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.

- The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list, for example with list(g).

- Roughly equivalent to:
```python

def groupby(iterable, key=None):
    # [k for k, g in groupby('AAAABBBCCDAABBB')] → A B C D A B
    # [list(g) for k, g in groupby('AAAABBBCCD')] → AAAA BBB CC D

    keyfunc = (lambda x: x) if key is None else key
    iterator = iter(iterable)
    exhausted = False

    def _grouper(target_key):
        nonlocal curr_value, curr_key, exhausted
        yield curr_value
        for curr_value in iterator:
            curr_key = keyfunc(curr_value)
            if curr_key != target_key:
                return
            yield curr_value
        exhausted = True

    try:
        curr_value = next(iterator)
    except StopIteration:
        return
    curr_key = keyfunc(curr_value)

    while not exhausted:
        target_key = curr_key
        curr_group = _grouper(target_key)
        yield curr_key, curr_group
        if curr_key == target_key:
            for _ in curr_group:
                pass
```

In [54]:
from itertools import groupby

print([k for k, g in groupby('AAAABBBCCDAABBB')])

['A', 'B', 'C', 'D', 'A', 'B']
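A sketch of the usual pattern: sort by the same key first, then store each group as a list so it survives after groupby advances. The word list and the `first_letter` helper are illustrative.

```python
from itertools import groupby


def first_letter(word):
    return word[0]


words = ["apple", "banana", "avocado", "cherry", "blueberry", "apricot"]

# Without sorting, the same key shows up in several separate groups
unsorted_keys = [key for key, _ in groupby(words, key=first_letter)]

# Sort on the same key, then materialise each group with list()
ordered = sorted(words, key=first_letter)
grouped = {key: list(group) for key, group in groupby(ordered, key=first_letter)}

print(unsorted_keys)
print(grouped)
```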


### islice
- **islice(iterable, stop)** and **islice(iterable, start, stop[, step])**
- Make an iterator that returns selected elements from the iterable. Works like sequence slicing but does not support negative values for start, stop, or step.

- If start is zero or None, iteration starts at zero. Otherwise, elements from the iterable are skipped until start is reached.

- If stop is None, iteration continues until the input is exhausted, if at all. Otherwise, it stops at the specified position.

- If step is None, the step defaults to one. Elements are returned consecutively unless step is set higher than one which results in items being skipped.

- Roughly equivalent to:
```python
def islice(iterable, *args):
    # islice('ABCDEFG', 2) → A B
    # islice('ABCDEFG', 2, 4) → C D
    # islice('ABCDEFG', 2, None) → C D E F G
    # islice('ABCDEFG', 0, None, 2) → A C E G

    s = slice(*args)
    start = 0 if s.start is None else s.start
    stop = s.stop
    step = 1 if s.step is None else s.step
    if start < 0 or (stop is not None and stop < 0) or step <= 0:
        raise ValueError

    indices = count() if stop is None else range(max(start, stop))
    next_i = start
    for i, element in zip(indices, iterable):
        if i == next_i:
            yield element
            next_i += step
```

In [57]:
from itertools import islice

print(list(islice('ABCDEFG', 2)))

['A', 'B']
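islice is most useful where ordinary slicing is impossible, such as generators and infinite iterators. A minimal sketch; note that each islice call continues from wherever the underlying iterator currently is.

```python
from itertools import count, islice

evens = (n for n in count() if n % 2 == 0)  # infinite generator

first_five = list(islice(evens, 5))
next_three = list(islice(evens, 3))  # continues after the first five

every_other = list(islice("ABCDEFG", 2, None, 2))  # start=2, no stop, step=2

print(first_five, next_three, every_other)
```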


### pairwise
- **pairwise(iterable)**
- Return successive overlapping pairs taken from the input iterable.

- The number of 2-tuples in the output iterator will be one fewer than the number of inputs. It will be empty if the input iterable has fewer than two values.

- Roughly equivalent to:
```python
def pairwise(iterable):
    # pairwise('ABCDEFG') → AB BC CD DE EF FG

    iterator = iter(iterable)
    a = next(iterator, None)

    for b in iterator:
        yield a, b
        a = b
```

In [58]:
from itertools import pairwise 

print(list(pairwise('abcdef')))

[('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'e'), ('e', 'f')]


### permutations
- **permutations(iterable, r=None)**
- Return successive r length permutations of elements from the iterable.

- If r is not specified or is None, then r defaults to the length of the iterable and all possible full-length permutations are generated.

- The output is a subsequence of product() where entries with repeated elements have been filtered out. The length of the output is given by math.perm() which computes n! / (n - r)! when 0 ≤ r ≤ n or zero when r > n.

- The permutation tuples are emitted in lexicographic order according to the order of the input iterable. If the input iterable is sorted, the output tuples will be produced in sorted order.

- Elements are treated as unique based on their position, not on their value. If the input elements are unique, there will be no repeated values within a permutation.

- Roughly equivalent to:
```python
def permutations(iterable, r=None):
    # permutations('ABCD', 2) → AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) → 012 021 102 120 201 210

    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return

    indices = list(range(n))
    cycles = list(range(n, n-r, -1))
    yield tuple(pool[i] for i in indices[:r])

    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return
```

In [59]:
from itertools import permutations 

print(list(permutations([1, 2, 3])))

[(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]
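
The points about r, the output length, and positional uniqueness can be checked with a short sketch. The inputs `letters` and `[1, 1]` are arbitrary examples chosen for illustration:

```python
from itertools import permutations
from math import perm

letters = 'ABC'

# r=2: ordered pairs without reusing a position
pairs = list(permutations(letters, 2))
print(pairs)
# [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]

# The count matches math.perm(n, r) = n! / (n - r)!
print(len(pairs) == perm(len(letters), 2))  # True

# Elements are distinguished by position, not value,
# so equal values produce tuples that look identical
dupes = list(permutations([1, 1], 2))
print(dupes)  # [(1, 1), (1, 1)]

# r larger than the input length yields nothing
too_long = list(permutations(letters, 4))
print(too_long)  # []
```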


### product
- **product(*iterables, repeat=1)**
- Cartesian product of the input iterables.

- Roughly equivalent to nested for-loops in a generator expression. For example, product(A, B) returns the same as ((x,y) for x in A for y in B).

- The nested loops cycle like an odometer with the rightmost element advancing on every iteration. This pattern creates a lexicographic ordering so that if the input’s iterables are sorted, the product tuples are emitted in sorted order.

- To compute the product of an iterable with itself, specify the number of repetitions with the optional repeat keyword argument. For example, product(A, repeat=4) means the same as product(A, A, A, A).

- This function is roughly equivalent to the following code, except that the actual implementation does not build up intermediate results in memory:
```python
def product(*iterables, repeat=1):
    # product('ABCD', 'xy') → Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) → 000 001 010 011 100 101 110 111

    if repeat < 0:
        raise ValueError('repeat argument cannot be negative')
    pools = [tuple(pool) for pool in iterables] * repeat

    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]

    for prod in result:
        yield tuple(prod)
```

In [62]:
from itertools import product 

print(list(product([3, 4], repeat=3)))

[(3, 3, 3), (3, 3, 4), (3, 4, 3), (3, 4, 4), (4, 3, 3), (4, 3, 4), (4, 4, 3), (4, 4, 4)]
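
To illustrate the nested-loop equivalence with two different iterables, here is a minimal sketch. The `sizes` and `colors` lists are hypothetical example data:

```python
from itertools import product

# Hypothetical example inputs
sizes = ['S', 'M']
colors = ['red', 'blue']

combos = list(product(sizes, colors))
print(combos)
# [('S', 'red'), ('S', 'blue'), ('M', 'red'), ('M', 'blue')]

# Same result as a nested comprehension; the rightmost input varies fastest
nested = [(s, c) for s in sizes for c in colors]
print(combos == nested)  # True

# repeat=2 is shorthand for passing the same iterable twice
same = list(product(sizes, repeat=2)) == list(product(sizes, sizes))
print(same)  # True
```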


### repeat
- **repeat(object[, times])**
- Make an iterator that returns object over and over again. Runs indefinitely unless the times argument is specified.
- Roughly equivalent to:
```python
def repeat(object, times=None):
    # repeat(10, 3) → 10 10 10
    if times is None:
        while True:
            yield object
    else:
        for i in range(times):
            yield object
```
- A common use for repeat is to supply a stream of constant values to map or zip:

In [None]:
from itertools import chain, repeat

# repeat() yields the same list object four times
counter = repeat([2, 3], times=4)
# list(counter) would produce a list of lists;
# chain.from_iterable flattens it into a single list
print(list(chain.from_iterable(counter)))
print(list(map(pow, range(10), repeat(2))))

[2, 3, 2, 3, 2, 3, 2, 3]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
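
The cell above covers the map case. For zip, a minimal sketch with a hypothetical `names` list and a constant `'guest'` label (both made up for illustration) could look like this:

```python
from itertools import repeat

# Hypothetical example data
names = ['ann', 'bob', 'cy']

# zip stops at the shortest input, so an endless repeat() is safe here
tagged = list(zip(names, repeat('guest')))
print(tagged)  # [('ann', 'guest'), ('bob', 'guest'), ('cy', 'guest')]
```

Without a times argument, repeat never ends on its own, so it should always be paired with something finite such as zip, map over a finite iterable, or islice.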


### starmap
- **starmap(function, iterable)**
- Make an iterator that computes the function using arguments obtained from the iterable. Used instead of map() when argument parameters have already been “pre-zipped” into tuples.
- The difference between map() and starmap() parallels the distinction between function(a,b) and function(*c). Roughly equivalent to:
```python
def starmap(function, iterable):
    # starmap(pow, [(2,5), (3,2), (10,3)]) → 32 9 1000
    for args in iterable:
        yield function(*args)

```

In [41]:
from itertools import starmap 

print(list(starmap(pow, [(2, 4), (3, 2), (10, 3)])))

[16, 9, 1000]
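
The map versus starmap distinction can be shown side by side. In this sketch the names `bases`, `exps`, and `pairs` are illustrative only:

```python
from itertools import starmap

# Arguments kept in separate, parallel iterables: use map
bases = [2, 3, 10]
exps = [4, 2, 3]
via_map = list(map(pow, bases, exps))

# Arguments already grouped into tuples: use starmap
pairs = [(2, 4), (3, 2), (10, 3)]
via_starmap = list(starmap(pow, pairs))

print(via_map)                 # [16, 9, 1000]
print(via_map == via_starmap)  # True
```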


### takewhile
- **takewhile(predicate, iterable)**
- Make an iterator that returns elements from the iterable as long as the predicate is true. Roughly equivalent to:
```python
def takewhile(predicate, iterable):
    # takewhile(lambda x: x<5, [1,4,6,3,8]) → 1 4
    for x in iterable:
        if not predicate(x):
            break
        yield x
```

- Useful for taking the leading run (prefix) of an iterable, up to but not including the first element that fails the predicate.

In [64]:
from itertools import takewhile 

print(list(takewhile(lambda x: x < 5, [1, 4, 5, 2, 3])))

[1, 4]
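
Note that takewhile stops at the first failing element, whereas filter keeps scanning the whole input. A small sketch of the difference, where the `lines` list is hypothetical data standing in for a file with a comment header:

```python
from itertools import takewhile

values = [1, 4, 5, 2, 3]

# takewhile stops at 5 and never looks at 2 or 3
prefix = list(takewhile(lambda x: x < 5, values))
print(prefix)  # [1, 4]

# filter tests every element, so 2 and 3 are kept
kept = list(filter(lambda x: x < 5, values))
print(kept)  # [1, 4, 2, 3]

# Hypothetical use: read only the leading comment lines
lines = ['# title', '# author', 'data 1', '# not a header']
header = list(takewhile(lambda s: s.startswith('#'), lines))
print(header)  # ['# title', '# author']
```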


### tee
- **tee(iterable, n=2)**
- Return n independent iterators from a single iterable.

- Roughly equivalent to:
```python
def tee(iterable, n=2):
    if n < 0:
        raise ValueError
    if n == 0:
        return ()
    iterator = _tee(iterable)
    result = [iterator]
    for _ in range(n - 1):
        result.append(_tee(iterator))
    return tuple(result)

class _tee:

    def __init__(self, iterable):
        it = iter(iterable)
        if isinstance(it, _tee):
            self.iterator = it.iterator
            self.link = it.link
        else:
            self.iterator = it
            self.link = [None, None]

    def __iter__(self):
        return self

    def __next__(self):
        link = self.link
        if link[1] is None:
            link[0] = next(self.iterator)
            link[1] = [None, None]
        value, self.link = link
        return value

```

In [65]:
from itertools import tee

# An iterable (list works, but generators show the value of tee best)
numbers = [1, 2, 3, 4, 5]

# Create two independent iterators
iter1, iter2 = tee(numbers)

print("Iterator 1:", list(iter1))
print("Iterator 2:", list(iter2))

Iterator 1: [1, 2, 3, 4, 5]
Iterator 2: [1, 2, 3, 4, 5]


In [None]:
from itertools import tee
# Run several operations over ONE iterable
data = [10, 20, 30, 40, 50]

# Create two independent iterators
iter1, iter2 = tee(data)

# Use the first iterator to find the sum
sum_values = sum(iter1)
print(f"Sum of numbers: {sum_values}")

# Use the second iterator to find the maximum value
max_value = max(iter2)
print(f"Maximum value: {max_value}")

Sum of numbers: 150
Maximum value: 50
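
With a list, tee is not strictly needed because a list can be iterated repeatedly. The benefit shows up with one-shot iterators such as generators. A minimal sketch, using a hypothetical `squares` generator defined only for this example:

```python
from itertools import tee

def squares(n):
    # Hypothetical one-shot generator used for illustration
    for i in range(n):
        yield i * i

# A generator can be consumed only once
gen = squares(5)
first = list(gen)
second = list(gen)
print(first)   # [0, 1, 4, 9, 16]
print(second)  # []

# tee lets two consumers each see every item
a, b = tee(squares(5))
total = sum(a)
largest = max(b)
print(total, largest)  # 30 16
```

Two caveats from the itertools documentation: once tee has split an iterator, the original should not be used anywhere else, and when one copy is fully consumed before the other starts (as here), tee must buffer every item, so list() is usually the faster choice.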


### zip_longest
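- **zip_longest(*iterables, fillvalue=None)**
- Works like zip(), but continues until the longest iterable is exhausted instead of stopping at the shortest. Missing values are filled with fillvalue (None by default).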

In [32]:
from itertools import zip_longest
data = [11, 22, 33, 44]
iter_range = range(10)
print(list(zip(iter_range, data)))
print(list(zip_longest(iter_range, data)))

[(0, 11), (1, 22), (2, 33), (3, 44)]
[(0, 11), (1, 22), (2, 33), (3, 44), (4, None), (5, None), (6, None), (7, None), (8, None), (9, None)]
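
The padding value can be changed with the fillvalue keyword. A short sketch that reuses the `data` list from above with a shorter range:

```python
from itertools import zip_longest

data = [11, 22, 33, 44]

# Pad the shorter input with 0 instead of the default None
padded = list(zip_longest(range(6), data, fillvalue=0))
print(padded)
# [(0, 11), (1, 22), (2, 33), (3, 44), (4, 0), (5, 0)]
```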
