# Performance Optimization in Python

This notebook covers performance optimization techniques for Python code in offline development scenarios, using only the standard library. It shows how to identify bottlenecks, optimize algorithms, and improve code efficiency without external tools.

## What you'll learn:
- Performance measurement techniques
- Algorithm optimization
- Memory optimization
- Profiling and benchmarking
- Common performance pitfalls
- Optimization best practices

## 1. Measuring Performance

### Using time module for basic timing

In [None]:
import time
from typing import List

# Example: Measuring execution time
def measure_time(func, *args, **kwargs):
    """Measure the execution time of a function."""
    start_time = time.perf_counter()  # monotonic, high-resolution clock meant for timing
    result = func(*args, **kwargs)
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    return result, execution_time

# Test function
def sum_squares(n: int) -> int:
    """Calculate sum of squares from 1 to n."""
    return sum(i**2 for i in range(1, n + 1))

# Measure performance
result, exec_time = measure_time(sum_squares, 10000)
print(f"Result: {result}")
print(f"Execution time: {exec_time:.6f} seconds")
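
A single wall-clock measurement is noisy, especially for fast functions. As a minimal sketch of a more stable measurement, the standard-library `timeit` module can repeat the call many times. `sum_squares` is redefined here only so the cell runs on its own; the trial counts are arbitrary choices.

```python
import timeit

def sum_squares(n: int) -> int:
    """Calculate sum of squares from 1 to n."""
    return sum(i**2 for i in range(1, n + 1))

calls_per_trial = 100

# Each entry in `trials` is the total time in seconds for one trial of 100 calls.
trials = timeit.repeat(lambda: sum_squares(10_000), number=calls_per_trial, repeat=5)

# The minimum is usually the trial least disturbed by other activity on the machine.
best_per_call = min(trials) / calls_per_trial
print(f"Best of {len(trials)} trials: {best_per_call:.6f} seconds per call")
```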

### Comparing different implementations

In [None]:
# Different implementations of the same algorithm
def sum_squares_loop(n: int) -> int:
    """Using explicit loop."""
    total = 0
    for i in range(1, n + 1):
        total += i ** 2
    return total

def sum_squares_comprehension(n: int) -> int:
    """Using a generator expression passed to sum()."""
    return sum(i**2 for i in range(1, n + 1))

def sum_squares_math(n: int) -> int:
    """Using mathematical formula: n(n+1)(2n+1)/6"""
    return n * (n + 1) * (2 * n + 1) // 6

# Compare performance
n = 100000

print("Comparing different implementations:")
print("=" * 50)

# Test loop implementation
result1, time1 = measure_time(sum_squares_loop, n)
print(f"Loop:           {time1:.6f}s (result: {result1})")

# Test comprehension implementation
result2, time2 = measure_time(sum_squares_comprehension, n)
print(f"Comprehension:  {time2:.6f}s (result: {result2})")

# Test mathematical implementation
result3, time3 = measure_time(sum_squares_math, n)
print(f"Mathematical:   {time3:.6f}s (result: {result3})")

# Verify results are the same
assert result1 == result2 == result3
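# The formula can finish faster than the clock can resolve, so guard the division
speedup = time1 / time3 if time3 > 0 else float("inf")
print(f"\n✅ All results match! Mathematical approach is {speedup:.1f}x faster.")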

## 2. Algorithm Optimization

### Example: Finding duplicates in a list

In [None]:
# Inefficient approach
def find_duplicates_slow(nums: List[int]) -> List[int]:
    """Find all duplicate elements using nested loops."""
    duplicates = []
    seen = set()
    
    for i in range(len(nums)):
        if nums[i] in seen:
            if nums[i] not in duplicates:
                duplicates.append(nums[i])
        else:
            for j in range(i + 1, len(nums)):
                if nums[i] == nums[j]:
                    duplicates.append(nums[i])
                    break
            seen.add(nums[i])
    
    return duplicates

# Efficient approach
def find_duplicates_fast(nums: List[int]) -> List[int]:
    """Find all duplicate elements using hash set."""
    seen = set()
    duplicates = set()
    
    for num in nums:
        if num in seen:
            duplicates.add(num)
        else:
            seen.add(num)
    
    return list(duplicates)

# Test data
import random
test_data = [random.randint(1, 1000) for _ in range(5000)]
test_data.extend(test_data[:100])  # Add some duplicates

print("Finding duplicates in a list of 5000+ elements:")
print("=" * 50)

# Test slow approach
result1, time1 = measure_time(find_duplicates_slow, test_data)
print(f"Slow approach: {time1:.6f}s (found {len(result1)} duplicates)")

# Test fast approach
result2, time2 = measure_time(find_duplicates_fast, test_data)
print(f"Fast approach: {time2:.6f}s (found {len(result2)} duplicates)")

# Verify results
assert set(result1) == set(result2)
print(f"\n✅ Results match! Fast approach is {time1/time2:.1f}x faster.")
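
As an alternative sketch of the same single-pass idea, `collections.Counter` from the standard library can count every value and then keep those seen more than once. The function name `find_duplicates_counter` is hypothetical and not part of the cells above.

```python
from collections import Counter
from typing import List

def find_duplicates_counter(nums: List[int]) -> List[int]:
    """Return each value that appears more than once, in first-seen order."""
    counts = Counter(nums)  # one pass over the input to count every value
    return [value for value, count in counts.items() if count > 1]

print(find_duplicates_counter([3, 1, 3, 2, 1, 3]))  # [3, 1]
```

Unlike the set-based version, this keeps the order in which values first appear, at the cost of storing a count for every distinct value.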

### Big O Notation Analysis

In [None]:
# Analyzing time complexity
def analyze_complexity():
    """Demonstrate different time complexities."""
    import math
    
    sizes = [100, 1000, 10000]
    
    print("Time Complexity Analysis:")
    print("=" * 40)
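    print(f"{'n':>6} {'O(1)':>5} {'O(log n)':>9} {'O(n)':>6} {'O(n log n)':>11} {'O(n²)':>10}")
    print("-" * 52)
    
    for n in sizes:
        o1 = 1
        ologn = math.log2(n)
        on = n
        onlogn = n * math.log2(n)
        on2 = n * n
        
        # Fixed column widths keep the rows aligned with the header
        print(f"{n:>6} {o1:>5} {ologn:>9.1f} {on:>6} {onlogn:>11.0f} {on2:>10}")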
    
    print("\nKey insights:")
    print("- O(1): Constant time - best performance")
    print("- O(log n): Logarithmic - excellent for large datasets")
    print("- O(n): Linear - good performance")
    print("- O(n log n): Acceptable for most applications")
    print("- O(n²): Quadratic - avoid for large datasets")

analyze_complexity()
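
The growth rates in the table can also be observed directly. The sketch below counts how many elements a linear scan and a binary search inspect on sorted data. Both helper names are hypothetical, and counting steps instead of timing is a deliberate simplification that keeps the result independent of the machine.

```python
from typing import List

def linear_search_steps(items: List[int], target: int) -> int:
    """Return how many elements a linear scan inspects before it stops."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps

def binary_search_steps(items: List[int], target: int) -> int:
    """Return how many midpoints a binary search on sorted items inspects."""
    low, high = 0, len(items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps

for n in (100, 1000, 10000):
    data = list(range(n))
    target = n - 1  # last element: the worst case for the linear scan
    print(f"n={n}: linear={linear_search_steps(data, target)}, "
          f"binary={binary_search_steps(data, target)}")
```

The linear count grows in step with `n`, while the binary count grows by only a few steps each time `n` is multiplied by ten.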

## 3. Memory Optimization

### Efficient data structures

In [None]:
import sys
from typing import Dict, List

# Compare memory usage of different approaches
def memory_comparison():
    """Compare memory usage of different data structures."""
    
    # Create test data
    data = list(range(10000))
    
    # List approach
    list_memory = sys.getsizeof(data)
    
    # Set approach (for membership testing)
    set_data = set(data)
    set_memory = sys.getsizeof(set_data)
    
    # Dictionary approach
    dict_data = {i: i**2 for i in data}
    dict_memory = sys.getsizeof(dict_data)
    
    print("Memory Usage Comparison:")
    print("=" * 30)
    # Note: sys.getsizeof reports the container only, not the objects it references
    print(f"List (10,000 ints): {list_memory} bytes")
    print(f"Set (10,000 ints):  {set_memory} bytes")
    print(f"Dict (10,000 pairs): {dict_memory} bytes")
    
    # Test lookup performance
    import timeit
    
    target = 5000
    repeats = 1000  # a single lookup is too fast to time reliably
    
    # List lookup (O(n))
    list_time = timeit.timeit(lambda: target in data, number=repeats) / repeats
    
    # Set lookup (O(1) on average)
    set_time = timeit.timeit(lambda: target in set_data, number=repeats) / repeats
    
    # Dict lookup (O(1) on average)
    dict_time = timeit.timeit(lambda: dict_data.get(target), number=repeats) / repeats
    
    print(f"\nLookup Performance (target: {target}, mean of {repeats} lookups):")
    print(f"List: {list_time:.8f}s")
    print(f"Set:  {set_time:.8f}s ({list_time/set_time:.0f}x faster)")
    print(f"Dict: {dict_time:.8f}s ({list_time/dict_time:.0f}x faster)")

memory_comparison()
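
`sys.getsizeof` is shallow: for a container it counts the container's own storage but not the objects it refers to. The sketch below adds the element sizes for a flat list of integers. The helper name `shallow_and_total_size` is hypothetical, and the result is only an approximation, since nested containers would need a recursive walk.

```python
import sys
from typing import List, Tuple

def shallow_and_total_size(values: List[int]) -> Tuple[int, int]:
    """Return (container size, container size plus the size of each element)."""
    shallow = sys.getsizeof(values)
    # Key by id() so an object referenced several times is counted once.
    unique = {id(v): v for v in values}
    total = shallow + sum(sys.getsizeof(v) for v in unique.values())
    return shallow, total

shallow, total = shallow_and_total_size(list(range(10000)))
print(f"Container only:       {shallow} bytes")
print(f"Container + elements: {total} bytes")
```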

### Generator vs List Comprehension

In [None]:
# Memory-efficient processing with generators
def process_with_list(n: int) -> List[int]:
    """Process data using list comprehension."""
    return [i**2 for i in range(n) if i % 2 == 0]

def process_with_generator(n: int):
    """Process data using generator expression."""
    return (i**2 for i in range(n) if i % 2 == 0)

# Compare memory usage
n = 100000

print("Memory Comparison: List vs Generator")
print("=" * 40)

# List approach
list_result = process_with_list(n)
list_memory = sys.getsizeof(list_result)
print(f"List memory: {list_memory} bytes")

# Generator approach
gen_result = process_with_generator(n)
gen_memory = sys.getsizeof(gen_result)
print(f"Generator memory: {gen_memory} bytes")

print(f"\nMemory savings: {((list_memory - gen_memory) / list_memory * 100):.1f}%")

# Test processing time
def sum_list():
    return sum(process_with_list(n))

def sum_generator():
    return sum(process_with_generator(n))

_, list_time = measure_time(sum_list)
_, gen_time = measure_time(sum_generator)

print("\nProcessing time:")
print(f"List: {list_time:.4f}s")
print(f"Generator: {gen_time:.4f}s")
# The generator's main benefit is memory; it is not necessarily faster than the list
print(f"List/generator time ratio: {list_time/gen_time:.2f}")
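
Two properties of generators are worth seeing in isolation: they can be consumed only once, and they compute values only on demand. The sketch below illustrates both with `itertools.islice` from the standard library; the variable names are illustrative.

```python
from itertools import islice

squares = (i**2 for i in range(5))

first_pass = list(squares)   # consumes the generator
second_pass = list(squares)  # nothing left to yield

print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []

# Laziness: only the values actually requested are computed, so a very
# large range costs nothing until it is iterated.
evens_squared = (i**2 for i in range(10**9) if i % 2 == 0)
first_three = list(islice(evens_squared, 3))
print(first_three)  # [0, 4, 16]
```

If the results are needed more than once, a list is the safer choice despite its memory cost.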

## 4. Profiling Techniques

### Simple profiling with time measurements

In [None]:
# Profile different parts of a function
def profile_function():
    """A function with multiple operations to profile."""
    
    # Operation 1: Data generation
    start = time.perf_counter()  # perf_counter has higher resolution than time.time
    data = [i for i in range(10000)]
    gen_time = time.perf_counter() - start
    
    # Operation 2: Processing
    start = time.perf_counter()
    processed = [x * 2 + 1 for x in data]
    proc_time = time.perf_counter() - start
    
    # Operation 3: Filtering
    start = time.perf_counter()
    filtered = [x for x in processed if x % 3 == 0]
    filter_time = time.perf_counter() - start
    
    # Operation 4: Aggregation
    start = time.perf_counter()
    result = sum(filtered)
    agg_time = time.perf_counter() - start
    
    return {
        'result': result,
        'times': {
            'generation': gen_time,
            'processing': proc_time,
            'filtering': filter_time,
            'aggregation': agg_time,
            'total': gen_time + proc_time + filter_time + agg_time
        }
    }

# Run profiling
profile_result = profile_function()

print("Function Profiling Results:")
print("=" * 30)
print(f"Final result: {profile_result['result']}")
print("\nExecution times:")

times = profile_result['times']
total_time = times['total']

for operation, op_time in times.items():
    if operation != 'total':
        percentage = (op_time / total_time) * 100
        print(f"  {operation.capitalize()}: {op_time:.6f}s ({percentage:.1f}%)")

print(f"\nTotal time: {total_time:.6f}s")

# Identify bottleneck
operation_times = {name: t for name, t in times.items() if name != 'total'}
bottleneck = max(operation_times.items(), key=lambda item: item[1])
print(f"\n🚨 Bottleneck: {bottleneck[0]} ({bottleneck[1]:.6f}s)")
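
Manual timers work for a handful of sections, but the standard library also ships a function-level profiler, `cProfile`, with `pstats` for formatting its output. The sketch below profiles a small workload; `build_squares` and `workload` are hypothetical functions that exist only to give the profiler something to record.

```python
import cProfile
import io
import pstats

def build_squares(n: int) -> list:
    """Build a list of squares (illustrative workload step)."""
    return [i * i for i in range(n)]

def workload() -> int:
    """Hypothetical workload used only to have something to profile."""
    return sum(build_squares(50_000))

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Write the five entries with the highest cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists call counts and per-function times, which points at the bottleneck without adding timers by hand.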

## 5. Common Performance Pitfalls

### String concatenation in loops

In [None]:
# Inefficient string concatenation
def build_string_slow(n: int) -> str:
    """Inefficient string building using + operator."""
    result = ""
    for i in range(n):
        result += str(i)
    return result

# Efficient string concatenation
def build_string_fast(n: int) -> str:
    """Efficient string building using join."""
    return "".join(str(i) for i in range(n))

# Compare performance
n = 10000

print("String Concatenation Performance:")
print("=" * 35)

_, slow_time = measure_time(build_string_slow, n)
_, fast_time = measure_time(build_string_fast, n)

print(f"Slow method (+): {slow_time:.6f}s")
print(f"Fast method (join): {fast_time:.6f}s")
print(f"\nJoin is {slow_time/fast_time:.1f}x faster!")

# Verify results are the same
assert build_string_slow(100) == build_string_fast(100)
print("✅ Results are identical")

### Unnecessary computations in loops

In [None]:
# Inefficient: Re-evaluating len() on every iteration
def process_list_slow(items: List[int]) -> List[int]:
    """Process list, calling len() in the loop condition."""
    result = []
    i = 0
    while i < len(items):  # len() is called on every iteration
        if items[i] > 0:
            result.append(items[i] * 2)
        i += 1
    return result

# Efficient: Calculate length once
# (note: range(len(items)) would also evaluate len() only once)
def process_list_fast(items: List[int]) -> List[int]:
    """Process list with the length hoisted out of the loop."""
    result = []
    n = len(items)  # Calculate once
    i = 0
    while i < n:
        if items[i] > 0:
            result.append(items[i] * 2)
        i += 1
    return result

# Test data
test_data = list(range(-50000, 50000))

print("Loop Optimization:")
print("=" * 20)

_, slow_time = measure_time(process_list_slow, test_data)
_, fast_time = measure_time(process_list_fast, test_data)

print(f"Slow (len in loop): {slow_time:.6f}s")
print(f"Fast (len once):    {fast_time:.6f}s")
print(f"\nOptimization gives {slow_time/fast_time:.1f}x speedup!")

# Verify results
assert process_list_slow(test_data) == process_list_fast(test_data)
print("✅ Results are identical")
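
In everyday Python the index is rarely needed at all. As a sketch of the more idiomatic form, the same filtering and doubling can iterate over the values directly in a list comprehension. The name `process_list_idiomatic` is hypothetical.

```python
from typing import List

def process_list_idiomatic(items: List[int]) -> List[int]:
    """Double every positive item, iterating over the values directly."""
    return [item * 2 for item in items if item > 0]

print(process_list_idiomatic([-2, 0, 3, 5]))  # [6, 10]
```

This avoids both the repeated `len()` call and the per-item indexing.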

## 6. Caching and Memoization

### Simple caching with dictionary

In [None]:
# Fibonacci without caching (exponential time)
def fib_slow(n: int) -> int:
    """Calculate nth Fibonacci number without caching."""
    if n <= 1:
        return n
    return fib_slow(n - 1) + fib_slow(n - 2)

from typing import Dict, Optional

# Fibonacci with caching (linear time)
def fib_fast(n: int, cache: Optional[Dict[int, int]] = None) -> int:
    """Calculate nth Fibonacci number with caching."""
    if cache is None:
        cache = {}
    
    if n in cache:
        return cache[n]
    
    if n <= 1:
        cache[n] = n
        return n
    
    result = fib_fast(n - 1, cache) + fib_fast(n - 2, cache)
    cache[n] = result
    return result

# Compare performance
n = 30  # small enough that the uncached version still runs, so both are compared

print("Fibonacci Performance Comparison:")
print("=" * 35)

# Test without caching (only for small n)
if n <= 30:
    result1, time1 = measure_time(fib_slow, n)
    print(f"Without caching: {time1:.6f}s (result: {result1})")
else:
    print("Without caching: Too slow for n > 30")
    time1 = float('inf')

# Test with caching
result2, time2 = measure_time(fib_fast, n)
print(f"With caching:    {time2:.6f}s (result: {result2})")

if time1 != float('inf'):
    print(f"\nCaching gives {time1/time2:.0f}x speedup!")
else:
    print("\nCaching makes it feasible for large n!")
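
The standard library offers the same memoization without a hand-written dictionary. The sketch below uses `functools.lru_cache`; the function name `fib_cached` is hypothetical, and `maxsize=None` (an unbounded cache) is a choice made for this example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib_cached(n: int) -> int:
    """Calculate the nth Fibonacci number, memoized by functools.lru_cache."""
    if n <= 1:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)

print(fib_cached(35))           # 9227465
print(fib_cached.cache_info())  # hit/miss counters and current cache size
```

Unlike the dictionary version, the cache here persists between calls, so repeated calls with the same argument return immediately.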

## 7. Performance Best Practices

### Optimization checklist

In [None]:
# Performance optimization checklist
def performance_checklist():
    """Display performance optimization checklist."""
    
    checklist = {
        "Algorithm Selection": [
            "Choose the right algorithm for the job",
            "Consider time/space complexity trade-offs",
            "Use built-in functions when possible",
            "Avoid reinventing the wheel"
        ],
        "Data Structures": [
            "Use sets for membership testing",
            "Use dictionaries for fast lookups",
            "Consider memory vs speed trade-offs",
            "Use generators for large datasets"
        ],
        "Code Optimization": [
            "Move invariant code out of loops",
            "Use list comprehensions wisely",
            "Avoid unnecessary function calls",
            "Use appropriate data types"
        ],
        "Measurement": [
            "Measure before optimizing",
            "Identify bottlenecks first",
            "Use appropriate profiling tools",
            "Test optimizations thoroughly"
        ]
    }
    
    for category, items in checklist.items():
        print(f"\n{category}:")
        for item in items:
            print(f"  ✅ {item}")

performance_checklist()

## 8. Exercise: Optimize a Function

Here's a function that can be optimized. Your task is to:
1. Identify performance bottlenecks
2. Optimize the algorithm
3. Measure the improvements
4. Ensure correctness is maintained

In [None]:
# Function to optimize
def find_common_elements_slow(lists: List[List[int]]) -> List[int]:
    """Find elements common to all lists (inefficient version)."""
    if not lists:
        return []
    
    # Start with first list
    common = lists[0][:]
    
    # Check each element against all other lists
    for element in common[:]:
        for other_list in lists[1:]:
            if element not in other_list:
                common.remove(element)
                break
    
    return common

# Optimized version
def find_common_elements_fast(lists: List[List[int]]) -> List[int]:
    """Find elements common to all lists (optimized version)."""
    if not lists:
        return []
    
    # Convert to sets for O(1) lookup
    sets = [set(lst) for lst in lists]
    
    # Start with smallest set for efficiency
    sets.sort(key=len)
    common = sets[0].copy()
    
    # Intersect with remaining sets
    for other_set in sets[1:]:
        common &= other_set
    
    return sorted(common)

# Test data
import random
test_lists = [
    [random.randint(1, 1000) for _ in range(500)] for _ in range(5)
]

# Add some common elements
common_elements = [42, 73, 99]
for lst in test_lists:
    lst.extend(common_elements)

print("Optimizing Common Elements Search:")
print("=" * 35)

# Time both versions on the same data so the comparison is fair
result1, time1 = measure_time(find_common_elements_slow, test_lists)
print(f"Slow version: {time1:.6f}s (found {len(set(result1))} common elements)")

result2, time2 = measure_time(find_common_elements_fast, test_lists)
print(f"Fast version: {time2:.6f}s (found {len(result2)} common elements)")

# The slow version keeps repeats and input order, so compare distinct values
assert set(result1) == set(result2)

# Also check a small case that can be verified by hand
small_test = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7]]
assert find_common_elements_slow(small_test) == find_common_elements_fast(small_test) == [3, 4, 5]
print("\n✅ Results are correct!")

if time2 > 0:
    print(f"Optimization gives {time1/time2:.1f}x speedup!")
else:
    print("Fast version finished too quickly to measure.")
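
The set-based version returns sorted, distinct values, which loses the order of the first list. Where that order matters, one possible variant (a sketch, with the hypothetical name `find_common_elements_ordered`) keeps set lookups for speed but walks the first list in order.

```python
from typing import List

def find_common_elements_ordered(lists: List[List[int]]) -> List[int]:
    """Return values present in every list, in first-list order, without repeats."""
    if not lists:
        return []

    # Sets give O(1) average membership tests for the remaining lists.
    others = [set(lst) for lst in lists[1:]]

    seen = set()
    result = []
    for value in lists[0]:
        if value not in seen and all(value in other for other in others):
            seen.add(value)
            result.append(value)
    return result

print(find_common_elements_ordered([[5, 3, 3, 1], [1, 3, 5], [3, 5, 9]]))  # [5, 3]
```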

## Summary

This notebook covered:
- ✅ Performance measurement techniques
- ✅ Algorithm optimization examples
- ✅ Memory optimization strategies
- ✅ Profiling and benchmarking
- ✅ Common performance pitfalls
- ✅ Caching and memoization
- ✅ Performance best practices

## Key Takeaways

1. **Measure First**: Always profile before optimizing
2. **Algorithm Matters**: Choose the right algorithm for the job
3. **Data Structures**: Use appropriate data structures for the task
4. **Memory vs Speed**: Consider the trade-offs
5. **Cache Wisely**: Use caching for expensive computations
6. **Avoid Pitfalls**: Watch out for common performance anti-patterns
7. **Test Thoroughly**: Ensure optimizations don't break functionality

## Next Steps

1. **Profile your code** using the techniques learned
2. **Identify bottlenecks** in your applications
3. **Apply optimizations** systematically
4. **Measure improvements** to ensure they're worthwhile
5. **Learn advanced profiling** tools when available

Remember: Premature optimization is the root of all evil. Focus on writing correct, maintainable code first, then optimize the bottlenecks!