# Decorators and Generators

This notebook covers two Python features, decorators and generators, that help you write concise, memory-efficient code, particularly in data science and NLP applications.

## Topics Covered:
- Function decorators
- Class decorators
- Built-in decorators (@property, @staticmethod, @classmethod)
- Generator functions and yield
- Generator expressions
- Iterators vs Generators
- Memory efficiency and performance

## Function Decorators

A decorator is a callable that takes a function as an argument and returns a replacement for it, usually a wrapper that adds behavior before or after calling the original. This lets you extend a function without editing its body.
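
As a minimal sketch of what the `@` syntax does, the example below applies a decorator by hand and then with `@`. The names `shout`, `greeting`, and `decorated_greeting` are made up for this illustration.

```python
def shout(func):
    """Hypothetical decorator: upper-case the string returned by func."""
    def wrapper():
        return func().upper()
    return wrapper

def greeting():
    return "hello"

# Manual application: pass the function in and keep the returned wrapper.
manual = shout(greeting)

# The @ syntax does the same thing at definition time:
# decorated_greeting = shout(decorated_greeting)
@shout
def decorated_greeting():
    return "hello"

print(greeting())            # hello
print(manual())              # HELLO
print(decorated_greeting())  # HELLO
```

The original `greeting` is untouched; only the name bound to the wrapper behaves differently.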

In [None]:
# Basic decorator example
def my_decorator(func):
    def wrapper():
        print("Something is happening before the function is called.")
        func()
        print("Something is happening after the function is called.")
    return wrapper

# Using the decorator
@my_decorator
def say_hello():
    print("Hello!")

# Call the decorated function
say_hello()

In [None]:
# Decorator with arguments and return values
import functools

def debug(func):
    @functools.wraps(func)  # Preserves function metadata
    def wrapper(*args, **kwargs):
        # Print function call info
        args_repr = [repr(a) for a in args]
        kwargs_repr = [f"{k}={v!r}" for k, v in kwargs.items()]
        signature = ", ".join(args_repr + kwargs_repr)
        print(f"Calling {func.__name__}({signature})")
        
        # Call the function
        result = func(*args, **kwargs)
        print(f"{func.__name__!r} returned {result!r}")
        return result
    return wrapper

@debug
def add_numbers(a, b):
    """Add two numbers together."""
    return a + b

# Test the decorated function
result = add_numbers(5, 3)
print(f"Final result: {result}")
print(f"Function name: {add_numbers.__name__}")
print(f"Function docstring: {add_numbers.__doc__}")
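
To see why `functools.wraps` matters, this sketch compares two pass-through decorators, one without it and one with it. The names `plain`, `preserving`, `multiply`, and `divide` are hypothetical.

```python
import functools

def plain(func):
    # No functools.wraps: the wrapper keeps its own name and docstring.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def preserving(func):
    @functools.wraps(func)  # Copies __name__, __doc__, etc. onto the wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@plain
def multiply(a, b):
    """Multiply two numbers."""
    return a * b

@preserving
def divide(a, b):
    """Divide a by b."""
    return a / b

print(multiply.__name__, multiply.__doc__)  # wrapper None
print(divide.__name__, divide.__doc__)      # divide Divide a by b.

# functools.wraps also stores the original function as __wrapped__.
print(divide.__wrapped__(6, 3))             # 2.0
```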

In [None]:
# Practical decorator: timing function execution
import time
import functools

def timer(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
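        # perf_counter() is a monotonic, high-resolution clock meant for measuring intervals
        start_time = time.perf_counter()
        result = func(*args, **kwargs)
        end_time = time.perf_counter()
        print(f"{func.__name__} executed in {end_time - start_time:.4f} seconds")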
        return result
    return wrapper

@timer
def slow_function():
    """A function that takes some time to execute."""
    time.sleep(0.1)
    return sum(range(1000000))

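def fibonacci(n):
    """Calculate the nth Fibonacci number (naive recursion)."""
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

# Decorating the recursive function itself would time and print every
# recursive call, so time a thin non-recursive entry point instead.
@timer
def timed_fibonacci(n):
    """Timed entry point for fibonacci."""
    return fibonacci(n)

# Test the timed functions
result1 = slow_function()
print(f"Slow function result: {result1}")
print()

result2 = timed_fibonacci(10)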
print(f"Fibonacci(10) = {result2}")

In [None]:
# Decorator with parameters
def repeat(times):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = None  # Avoids an UnboundLocalError when times == 0
            for _ in range(times):
                result = func(*args, **kwargs)
            return result  # Value from the last call
        return wrapper
    return decorator

@repeat(times=3)
def greet(name):
    print(f"Hello {name}!")
    return f"Greeting for {name}"

# This will print the greeting 3 times
result = greet("Alice")
print(f"Return value: {result}")
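
Decorators can also be applied to classes: the decorator receives the class object and returns it (or a replacement). A minimal sketch, where `add_repr` and `Token` are hypothetical names chosen for this example:

```python
def add_repr(cls):
    """Hypothetical class decorator: attach a __repr__ built from instance attributes."""
    def __repr__(self):
        fields = ", ".join(f"{k}={v!r}" for k, v in vars(self).items())
        return f"{cls.__name__}({fields})"
    cls.__repr__ = __repr__
    return cls  # Return the same class, now with the extra method

@add_repr
class Token:
    def __init__(self, text, pos):
        self.text = text
        self.pos = pos

print(Token("cat", "NOUN"))  # Token(text='cat', pos='NOUN')
```

The standard library's `@dataclasses.dataclass` works the same way, generating methods such as `__init__` and `__repr__` for the decorated class.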

## Built-in Decorators

In [None]:
# @property decorator
import math

class Circle:
    def __init__(self, radius):
        self.radius = radius  # Goes through the setter, so validation also applies here
    
    @property
    def radius(self):
        return self._radius
    
    @radius.setter
    def radius(self, value):
        if value < 0:
            raise ValueError("Radius cannot be negative")
        self._radius = value
    
    @property
    def area(self):
        return math.pi * self._radius ** 2
    
    @property
    def circumference(self):
        return 2 * math.pi * self._radius

# Usage
circle = Circle(5)
print(f"Radius: {circle.radius}")
print(f"Area: {circle.area:.2f}")
print(f"Circumference: {circle.circumference:.2f}")

# Modify radius
circle.radius = 7
print(f"\nNew radius: {circle.radius}")
print(f"New area: {circle.area:.2f}")

# This will raise an error
try:
    circle.radius = -3
except ValueError as e:
    print(f"Error: {e}")

In [None]:
# @staticmethod and @classmethod
import math

class MathUtils:
    pi = math.pi
    
    @staticmethod
    def add(a, b):
        """Static method - doesn't need class or instance."""
        return a + b
    
    @classmethod
    def circle_area(cls, radius):
        """Class method - has access to class variables."""
        return cls.pi * radius ** 2
    
    @classmethod
    def area_from_diameter(cls, diameter):
        """Class method that reuses another class method through cls.

        It returns a number, not an instance, so it is not an alternative constructor.
        """
        radius = diameter / 2
        return cls.circle_area(radius)

# Usage
print(f"Static method: 5 + 3 = {MathUtils.add(5, 3)}")
print(f"Class method: Circle area (r=4) = {MathUtils.circle_area(4):.2f}")
print(f"Class method calling a class method: Circle area (d=10) = {MathUtils.area_from_diameter(10):.2f}")

# Can also call on instances
math_utils = MathUtils()
print(f"Called on instance: {math_utils.add(2, 7)}")
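
A common use of `@classmethod` is an alternative constructor, meaning a method that returns a new instance built from different inputs. The sketch below assumes two hypothetical classes, `Disk` and `Coin`, to show that `cls(...)` also respects subclasses.

```python
import math

class Disk:
    """Hypothetical class used only for this illustration."""
    def __init__(self, radius):
        self.radius = radius

    @classmethod
    def from_diameter(cls, diameter):
        """Alternative constructor: build an instance from a diameter."""
        return cls(diameter / 2)

    @property
    def area(self):
        return math.pi * self.radius ** 2

class Coin(Disk):
    pass

d = Disk.from_diameter(10)
c = Coin.from_diameter(2)  # cls is Coin here, so a Coin is returned

print(type(d).__name__, d.radius)  # Disk 5.0
print(type(c).__name__, c.radius)  # Coin 1.0
```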

## Generators and yield

Generators are iterators defined with `yield` (or with a generator expression). They compute each value on demand instead of building the whole sequence first, which keeps memory use low for large datasets.

In [None]:
# Basic generator function
def countdown(n):
    """Generator that counts down from n to 1."""
    print(f"Starting countdown from {n}")
    while n > 0:
        yield n
        n -= 1
    print("Countdown finished!")

# Create a generator object
counter = countdown(5)
print(f"Generator object: {counter}")
print(f"Type: {type(counter)}")
print()

# Use the generator
print("Using the generator:")
for number in counter:
    print(f"Count: {number}")
    time.sleep(0.1)  # Simulate some processing
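
To relate generators to plain iterators, the sketch below writes the same countdown twice: once as a hand-written iterator class and once as a generator function. `CountdownIterator` and `countdown_gen` are illustrative names, not part of the cells above.

```python
class CountdownIterator:
    """Hand-written iterator: implements __iter__ and __next__ explicitly."""
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration  # Signals the end of iteration
        value = self.n
        self.n -= 1
        return value

def countdown_gen(n):
    """Generator function: Python supplies __iter__ and __next__ for us."""
    while n > 0:
        yield n
        n -= 1

print(list(CountdownIterator(3)))  # [3, 2, 1]
print(list(countdown_gen(3)))      # [3, 2, 1]

# A generator is its own iterator and raises StopIteration when done.
it = countdown_gen(1)
print(iter(it) is it)  # True
print(next(it))        # 1
try:
    next(it)
except StopIteration:
    print("exhausted")
```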

In [None]:
# Generator vs List: Memory comparison
import sys

def number_generator(n):
    """Generator that yields numbers from 0 to n-1."""
    for i in range(n):
        yield i

def number_list(n):
    """Function that returns a list of numbers from 0 to n-1."""
    return list(range(n))

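# Compare memory usage
# Note: sys.getsizeof(lst) counts only the list's own array of references,
# not the int objects it points to, so the real gap is larger than shown.
# The generator's size stays constant no matter how large n is.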
n = 1000

gen = number_generator(n)
lst = number_list(n)

print(f"Generator size: {sys.getsizeof(gen)} bytes")
print(f"List size: {sys.getsizeof(lst)} bytes")
print(f"List is {sys.getsizeof(lst) / sys.getsizeof(gen):.1f}x larger")

# Generators are lazy - values are created on demand
print("\nFirst 5 values from generator:")
for i, value in enumerate(gen):
    if i >= 5:
        break
    print(value)

In [None]:
# Fibonacci generator
def fibonacci_generator():
    """Infinite Fibonacci sequence generator."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Generate first 15 Fibonacci numbers
fib_gen = fibonacci_generator()
fibonacci_numbers = []

for i in range(15):
    fibonacci_numbers.append(next(fib_gen))

print("First 15 Fibonacci numbers:")
print(fibonacci_numbers)

# We can continue getting more numbers
print("\nNext 5 Fibonacci numbers:")
for i in range(5):
    print(next(fib_gen))
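
An alternative to calling `next()` in a loop is `itertools.islice`, which takes a fixed number of items from any iterator, including an infinite one. A small sketch, using a hypothetical `naturals` generator as a stand-in for any infinite sequence:

```python
from itertools import islice

def naturals():
    """Infinite generator of 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1

gen = naturals()
first_five = list(islice(gen, 5))  # Consumes exactly five items
next_three = list(islice(gen, 3))  # Continues where the previous slice stopped

print(first_five)  # [0, 1, 2, 3, 4]
print(next_three)  # [5, 6, 7]
```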

In [None]:
# Generator expressions (generator comprehensions)
# Similar to list comprehensions but with parentheses

# List comprehension vs Generator expression
squares_list = [x**2 for x in range(10)]
squares_gen = (x**2 for x in range(10))

print(f"List comprehension: {squares_list}")
print(f"Generator expression: {squares_gen}")
print(f"Generator type: {type(squares_gen)}")

# Convert generator to list to see values
print(f"Generator values: {list(squares_gen)}")

# Generator expressions are memory efficient
large_squares_gen = (x**2 for x in range(1000000))
print(f"\nLarge generator size: {sys.getsizeof(large_squares_gen)} bytes")

# Get sum of first 100 squares using generator
squares_gen_100 = (x**2 for x in range(100))
sum_of_squares = sum(squares_gen_100)
print(f"Sum of first 100 squares: {sum_of_squares}")
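
One consequence worth seeing in code is that a generator expression can be consumed only once. The short sketch below also shows that the extra parentheses can be dropped when the expression is the sole argument of a call.

```python
squares = (x**2 for x in range(5))

first_pass = sum(squares)   # Consumes every value
second_pass = sum(squares)  # Nothing left, so the sum of no items is 0

print(first_pass, second_pass)  # 30 0

# As the only argument of a call, the expression needs no extra parentheses.
total = sum(x**2 for x in range(5))
print(total)  # 30
```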

## Practical Examples: Data Processing with Generators

In [None]:
# File processing with generators (memory efficient)
def process_large_file(filename):
    """Generator to process large files line by line."""
    try:
        with open(filename, 'r') as file:
            for line_number, line in enumerate(file, 1):
                # Process each line (strip whitespace, convert to uppercase)
                processed_line = line.strip().upper()
                if processed_line:  # Skip empty lines
                    yield line_number, processed_line
    except FileNotFoundError:
        print(f"File {filename} not found. Creating sample data...")
        # Create sample data if file doesn't exist
        sample_lines = [
            "This is line 1",
            "This is line 2",
            "",  # Empty line
            "This is line 4",
            "Final line"
        ]
        for i, line in enumerate(sample_lines, 1):
            if line.strip():
                yield i, line.strip().upper()

# Use the generator
print("Processing file with generator:")
for line_num, content in process_large_file('nonexistent.txt'):
    print(f"Line {line_num}: {content}")
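
The cell above exercises only the fallback branch, because `nonexistent.txt` does not exist. As a sketch of the file-reading path, the example below writes a small temporary file and streams it with a simplified generator. The helper name `nonempty_upper_lines` and the UTF-8 encoding are assumptions made for this example.

```python
import os
import tempfile

def nonempty_upper_lines(path):
    """Yield (line_number, upper-cased text) for each non-empty line."""
    with open(path, "r", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, 1):
            text = line.strip().upper()
            if text:
                yield line_number, text

# Create a throwaway file with an empty line in the middle.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("first line\n\nthird line\n")
    path = tmp.name

try:
    rows = list(nonempty_upper_lines(path))
finally:
    os.remove(path)  # Clean up the temporary file

print(rows)  # [(1, 'FIRST LINE'), (3, 'THIRD LINE')]
```

Only one line is held in memory at a time, so the same generator works for files far larger than available RAM.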

In [None]:
# Data pipeline using generators
def read_numbers():
    """Generate a stream of numbers."""
    for i in range(1, 21):
        print(f"Reading number: {i}")
        yield i

def filter_even(numbers):
    """Filter even numbers from a stream."""
    for num in numbers:
        if num % 2 == 0:
            print(f"  Even number found: {num}")
            yield num

def square_numbers(numbers):
    """Square numbers in a stream."""
    for num in numbers:
        squared = num ** 2
        print(f"    Squaring {num} = {squared}")
        yield squared

# Create the data processing pipeline
print("Data processing pipeline:")
pipeline = square_numbers(filter_even(read_numbers()))

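# Process only the first 3 results
# islice stops after 3 items; an enumerate/break loop would pull a 4th
# result through the whole pipeline before reaching the break.
from itertools import islice

print("\nGetting first 3 results:")
for i, result in enumerate(islice(pipeline, 3), 1):
    print(f"Final result {i}: {result}")
    print()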

In [None]:
# Generator for batch processing
def batch_generator(data, batch_size):
    """Yield consecutive slices of a sequence (data must support len() and slicing)."""
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        yield batch

# Simulate a large dataset
large_dataset = list(range(1, 101))  # Numbers 1 to 100
batch_size = 15

print(f"Processing dataset of {len(large_dataset)} items in batches of {batch_size}:")

for batch_num, batch in enumerate(batch_generator(large_dataset, batch_size), 1):
    # Process each batch (example: calculate sum)
    batch_sum = sum(batch)
    print(f"Batch {batch_num}: {len(batch)} items, sum = {batch_sum}")
    print(f"  Items: {batch[:5]}{'...' if len(batch) > 5 else ''}")
    
    # Simulate some processing time
    time.sleep(0.1)
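
`batch_generator` relies on `len()` and slicing, so it needs a sequence such as a list. For data that arrives as a stream, one possible variant uses `itertools.islice` instead. The helper name `batched_iter` is hypothetical; Python 3.12 and later also ship `itertools.batched`, which yields tuples.

```python
from itertools import islice

def batched_iter(iterable, batch_size):
    """Yield lists of up to batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:  # Source is exhausted
            return
        yield batch

# Works on a generator, which has no len() and cannot be sliced.
stream = (n * n for n in range(7))
print(list(batched_iter(stream, 3)))  # [[0, 1, 4], [9, 16, 25], [36]]
```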

## Advanced Generator Techniques

In [None]:
# Generator with send() method
def accumulator():
    """Generator that accumulates sent values."""
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value
            print(f"Added {value}, total is now {total}")

# Use the accumulator
acc = accumulator()
next(acc)  # Prime the generator

print("Using generator with send():")
print(f"Current total: {acc.send(10)}")
print(f"Current total: {acc.send(5)}")
print(f"Current total: {acc.send(3)}")
print(f"Current total: {next(acc)}")  # Get current total without adding
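
The priming call `next(acc)` is easy to forget. One way to combine the two topics of this notebook is a decorator that primes the generator automatically. This is only a sketch; `primed` and `running_average` are hypothetical names.

```python
import functools

def primed(gen_func):
    """Hypothetical decorator: advance a new generator to its first yield."""
    @functools.wraps(gen_func)
    def wrapper(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        next(gen)  # Run up to the first yield so send() works immediately
        return gen
    return wrapper

@primed
def running_average():
    """Receive numbers through send() and yield the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average
        total += value
        count += 1
        average = total / count

avg = running_average()
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(30))  # 20.0
```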

In [None]:
# Generator with exception handling
def robust_generator():
    """Generator that handles exceptions gracefully."""
    try:
        for i in range(10):
            print(f"Yielding: {i}")
            yield i
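    except GeneratorExit:
        # Raised at the paused yield when close() is called; do not yield here
        print("Generator is being closed!")
    except Exception as e:
        # Reached only if the loop body raises or the caller uses gen.throw()
        print(f"Exception in generator: {e}")
        yield -1  # Error indicator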
    finally:
        print("Generator cleanup")

# Test the robust generator
gen = robust_generator()

# Get a few values
print("Getting values:")
for i in range(3):
    print(f"Value: {next(gen)}")

# Close the generator early
print("\nClosing generator:")
gen.close()

# Try to get more values (this will raise StopIteration)
try:
    next(gen)
except StopIteration:
    print("Generator is exhausted")
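
The `except Exception` branch above is never triggered by the test code. A caller can raise an exception inside a generator with `throw()`. The sketch below uses a hypothetical `resilient_counter` that resets itself when it receives a `ValueError`.

```python
def resilient_counter():
    """Count upward; restart from 0 when the caller throws ValueError."""
    n = 0
    while True:
        try:
            yield n
            n += 1
        except ValueError:
            n = 0  # Reset and keep running instead of terminating

gen = resilient_counter()
print(next(gen))                       # 0
print(next(gen))                       # 1
print(gen.throw(ValueError("reset")))  # 0
print(next(gen))                       # 1
```

`throw()` raises the exception at the paused `yield` and returns the next value the generator yields, which is why the third line prints `0`.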

## Performance Comparison: List vs Generator

In [None]:
import time
import sys

def performance_test():
    """Compare performance of list vs generator for large datasets."""
    
    n = 100000
    
    # List approach
    print("Testing list approach:")
    start_time = time.perf_counter()
    numbers_list = [x**2 for x in range(n)]
    creation_time = time.perf_counter() - start_time
    # getsizeof counts the list's array of references only, not the int objects
    list_memory = sys.getsizeof(numbers_list)
    
    start_time = time.perf_counter()
    list_sum = sum(numbers_list)
    processing_time = time.perf_counter() - start_time
    
    print(f"  Creation time: {creation_time:.4f} seconds")
    print(f"  Processing time: {processing_time:.4f} seconds")
    print(f"  Memory usage: {list_memory:,} bytes")
    print(f"  Sum: {list_sum}")
    
    # Generator approach
    print("\nTesting generator approach:")
    start_time = time.perf_counter()
    numbers_gen = (x**2 for x in range(n))  # Nothing is computed yet
    creation_time = time.perf_counter() - start_time
    gen_memory = sys.getsizeof(numbers_gen)
    
    start_time = time.perf_counter()
    gen_sum = sum(numbers_gen)  # All the squaring happens here
    processing_time = time.perf_counter() - start_time
    
    print(f"  Creation time: {creation_time:.6f} seconds")
    print(f"  Processing time: {processing_time:.4f} seconds")
    print(f"  Memory usage: {gen_memory:,} bytes")
    print(f"  Sum: {gen_sum}")
    
    print(f"\nContainer size: the list is {list_memory / gen_memory:.1f}x larger than the generator")

performance_test()
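
`sys.getsizeof` reports only the size of the container object. For a rough view of what each approach actually allocates, one option is the standard library's `tracemalloc`. This is a sketch rather than a rigorous benchmark, and `peak_bytes` is a hypothetical helper.

```python
import tracemalloc

def peak_bytes(build_and_consume):
    """Return the peak traced allocation in bytes while running the callable."""
    tracemalloc.start()
    try:
        build_and_consume()
        _, peak = tracemalloc.get_traced_memory()  # (current, peak)
    finally:
        tracemalloc.stop()
    return peak

n = 100_000
list_peak = peak_bytes(lambda: sum([x**2 for x in range(n)]))
gen_peak = peak_bytes(lambda: sum(x**2 for x in range(n)))

print(f"List peak:      {list_peak:,} bytes")
print(f"Generator peak: {gen_peak:,} bytes")
```

Exact numbers depend on the Python version and platform, but the list version has to hold all `n` results at once while the generator version holds one at a time.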

## Key Takeaways

### Decorators:
1. **Function enhancement**: Modify behavior without changing the original function
2. **Reusability**: Same decorator can be applied to multiple functions
3. **Common uses**: Logging, timing, authentication, caching
4. **Built-in decorators**: `@property`, `@staticmethod`, `@classmethod`
5. **Use `@functools.wraps`** to preserve function metadata

### Generators:
1. **Memory efficient**: Generate values on-demand instead of storing all in memory
2. **Lazy evaluation**: Values are computed only when needed
3. **One-time use**: An exhausted generator cannot be restarted; create a new one to iterate again
4. **Perfect for**: Large datasets, streaming data, infinite sequences
5. **Generator expressions**: Syntax similar to list comprehensions but with parentheses

### When to Use:

**Decorators:**
- Cross-cutting concerns (logging, timing, validation)
- Code reuse across multiple functions
- API development (authentication, rate limiting)

**Generators:**
- Processing large files or datasets
- Streaming data processing
- Memory-constrained environments
- Creating infinite sequences

## Practice Exercises

1. Create a `@cache` decorator that memoizes function results, then compare it with the standard library's `functools.lru_cache`
2. Build a `@retry` decorator that retries failed function calls
3. Write a generator that reads CSV files in chunks
4. Create a data pipeline using multiple generators
5. Build a generator that simulates real-time data streaming
6. Implement a `@validate` decorator that checks function argument types
7. Create a generator for traversing directory trees

## Next Steps

These concepts are fundamental for:
- **Data processing pipelines**: Efficient handling of large datasets
- **API development**: Decorators for authentication, logging, etc.
- **Machine learning**: Memory-efficient data loading
- **NLP applications**: Processing large text corpora
- **Web frameworks**: Many use decorators extensively (Flask, Django)