# Python Data Structures Deep Dive

Understanding data structures is crucial for writing efficient code and solving algorithmic problems. This notebook provides an in-depth exploration of Python's built-in data structures and their computational complexities.

## Why Data Structures Matter:
- **Efficiency**: Choose the right structure for optimal performance
- **Memory usage**: Understand space complexities
- **Algorithm design**: Foundation for more complex algorithms
- **Interview preparation**: Common in technical interviews
- **Real-world applications**: Used in every software system

## Topics Covered:
- Lists: Implementation, operations, and time complexity
- Tuples: Immutable sequences and their uses
- Dictionaries: Hash tables and their applications
- Sets: Mathematical set operations
- Advanced data structures: deque, Counter, defaultdict, namedtuple
- Performance comparisons and best practices

In [None]:
# Import necessary libraries (standard library only; nothing below needs third-party packages)
import time
import sys
import random
from collections import deque, Counter, defaultdict, namedtuple

# Set random seed for reproducibility
random.seed(42)

print("Libraries imported successfully!")
print(f"Python version: {sys.version}")

## Lists: Dynamic Arrays
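
A dynamic array keeps spare capacity so that most appends do not need to reallocate, which is where the amortized O(1) cost of `append` comes from. The sketch below tries to make that visible by recording every point at which `sys.getsizeof` reports a new size. The helper name `capacity_changes` is made up for this illustration, and the exact byte counts and growth pattern are CPython implementation details that can differ between versions.

```python
import sys

def capacity_changes(n_appends):
    """Return (length, size_in_bytes) pairs, one for each time the list's size changes."""
    lst = []
    last_size = sys.getsizeof(lst)
    changes = [(0, last_size)]
    for i in range(n_appends):
        lst.append(i)
        size = sys.getsizeof(lst)
        if size != last_size:
            # The list object grew: this append triggered a reallocation
            changes.append((len(lst), size))
            last_size = size
    return changes

changes = capacity_changes(1000)
for length, size in changes[:8]:
    print(f"len={length:4d} -> {size} bytes")
print(f"{len(changes) - 1} size changes for 1000 appends")
```

If the number of size changes is far smaller than the number of appends, most appends reused capacity that was already allocated.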

In [None]:
# List operations and time complexity
print("📋 Python Lists - Dynamic Arrays")
print("=" * 40)

# Create a list
numbers = [1, 2, 3, 4, 5]
print(f"Original list: {numbers}")
print(f"Memory size: {sys.getsizeof(numbers)} bytes")
print()

# List operations with time complexity analysis
operations = {
    "Access by index": "O(1)",
    "Search (linear)": "O(n)", 
    "Insert at end (append)": "O(1) amortized",
    "Insert at beginning": "O(n)",
    "Insert at middle": "O(n)",
    "Delete from end (pop)": "O(1)",
    "Delete from beginning": "O(n)",
    "Delete from middle": "O(n)",
    "Slice operation": "O(k) where k is slice length"
}

print("List Operations Time Complexity:")
print("-" * 40)
for operation, complexity in operations.items():
    print(f"{operation:<25}: {complexity}")

print()

# Demonstrate operations
print("Demonstrating operations:")

# Access
print(f"numbers[2] = {numbers[2]}")  # O(1)

# Search
print(f"3 in numbers: {3 in numbers}")  # O(n)

# Append (O(1) amortized)
numbers.append(6)
print(f"After append(6): {numbers}")

# Insert at beginning (O(n))
numbers.insert(0, 0)
print(f"After insert(0, 0): {numbers}")

# Insert at middle (O(n))
numbers.insert(3, 2.5)
print(f"After insert(3, 2.5): {numbers}")

# Remove from end (O(1))
last = numbers.pop()
print(f"After pop(): {numbers}, removed: {last}")

# Remove from beginning (O(n))
first = numbers.pop(0)
print(f"After pop(0): {numbers}, removed: {first}")

# Slice (O(k))
middle_slice = numbers[1:4]
print(f"Slice [1:4]: {middle_slice}")

In [None]:
# Performance analysis: append vs insert at beginning
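def time_operation(operation_func, n_operations, description):
    """Time an operation and return the duration."""
    # perf_counter() is a monotonic, high-resolution clock intended for timing code;
    # time.time() can be too coarse on some platforms and may return a 0.0 duration
    start_time = time.perf_counter()
    operation_func(n_operations)
    end_time = time.perf_counter()
    duration = end_time - start_time
    print(f"{description}: {duration:.4f} seconds for {n_operations:,} operations")
    return duration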

def test_append(n):
    """Test list.append() performance."""
    lst = []
    for i in range(n):
        lst.append(i)

def test_insert_beginning(n):
    """Test list.insert(0, x) performance."""
    lst = []
    for i in range(n):
        lst.insert(0, i)

def test_extend_vs_append(n):
    """Compare extend vs multiple appends."""
    # Method 1: Multiple appends
    def multiple_appends():
        lst = []
        for i in range(n):
            lst.append(i)
        return lst
    
    # Method 2: Single extend
    def single_extend():
        lst = []
        lst.extend(range(n))
        return lst
    
    # Method 3: List comprehension
    def list_comprehension():
        return [i for i in range(n)]
    
    # Time each method (perf_counter() is the high-resolution clock meant for benchmarking)
    start = time.perf_counter()
    multiple_appends()
    append_time = time.perf_counter() - start

    start = time.perf_counter()
    single_extend()
    extend_time = time.perf_counter() - start

    start = time.perf_counter()
    list_comprehension()
    comprehension_time = time.perf_counter() - start
    
    print(f"\nList creation methods for {n:,} elements:")
    print(f"Multiple appends: {append_time:.4f} seconds")
    print(f"Single extend: {extend_time:.4f} seconds")
    print(f"List comprehension: {comprehension_time:.4f} seconds")
    
    return append_time, extend_time, comprehension_time

# Performance comparison
print("\n⚡ Performance Analysis")
print("=" * 30)

n = 50000

# Test append vs insert at beginning
append_time = time_operation(test_append, n, "Append operations")
insert_time = time_operation(test_insert_beginning, n, "Insert at beginning")

print(f"\nInsert at beginning is {insert_time/append_time:.1f}x slower than append!")

# Test different list creation methods
test_extend_vs_append(100000)

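# Memory usage comparison
# Note: sys.getsizeof() is shallow. It measures the list object and its
# array of references, not the integer objects the list refers to.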
print("\n💾 Memory Usage Analysis")
print("=" * 30)

sizes = [10, 100, 1000, 10000]
for size in sizes:
    lst = list(range(size))
    memory = sys.getsizeof(lst)
    per_element = memory / size if size > 0 else 0
    print(f"List of {size:5d} elements: {memory:6d} bytes ({per_element:.2f} bytes per element)")

## Dictionaries: Hash Tables
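
The "O(n) worst case" for dictionary operations comes from hash collisions: when many keys share a hash value, the table has to fall back on equality comparisons to tell them apart. As a small sketch of that behaviour, the class below (`CollidingKey`, a made-up name for this illustration) forces every instance to return the same hash. The dictionary still gives correct answers because `__eq__` resolves the collisions, but lookups degrade toward a linear scan.

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 1  # every instance collides on purpose

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name

# Three distinct keys, one shared hash value
table = {CollidingKey("a"): 1, CollidingKey("b"): 2, CollidingKey("c"): 3}

print(len(table))                  # all three keys are stored
print(table[CollidingKey("b")])    # lookup still finds the right entry
print(CollidingKey("z") in table)  # a missing key is still reported as missing
```

Correctness depends on `__eq__`, while speed depends on `__hash__` spreading keys out, which is why a good hash function matters for custom key types.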

In [None]:
# Dictionary operations and hash table concepts
print("🔑 Python Dictionaries - Hash Tables")
print("=" * 40)

# Create a dictionary
student_grades = {'Alice': 95, 'Bob': 87, 'Charlie': 92, 'Diana': 98}
print(f"Student grades: {student_grades}")
print(f"Memory size: {sys.getsizeof(student_grades)} bytes")
print()

# Dictionary operations time complexity
dict_operations = {
    "Access by key": "O(1) average, O(n) worst case",
    "Insert/Update": "O(1) average, O(n) worst case",
    "Delete by key": "O(1) average, O(n) worst case",
    "Check key existence": "O(1) average, O(n) worst case",
    "Get all keys/values": "O(n)",
    "Iterate through items": "O(n)"
}

print("Dictionary Operations Time Complexity:")
print("-" * 45)
for operation, complexity in dict_operations.items():
    print(f"{operation:<25}: {complexity}")

print()

# Demonstrate operations
print("Demonstrating operations:")

# Access by key (O(1))
print(f"Alice's grade: {student_grades['Alice']}")

# Insert/Update (O(1))
student_grades['Eve'] = 89  # Insert
student_grades['Bob'] = 90  # Update
print(f"After adding Eve and updating Bob: {student_grades}")

# Check key existence (O(1))
print(f"'Alice' in grades: {'Alice' in student_grades}")
print(f"'Frank' in grades: {'Frank' in student_grades}")

# Get method with default (O(1))
print(f"Frank's grade (default 0): {student_grades.get('Frank', 0)}")

# Dictionary comprehension
high_performers = {name: grade for name, grade in student_grades.items() if grade >= 90}
print(f"High performers (>=90): {high_performers}")

# Keys, values, items
print(f"All students: {list(student_grades.keys())}")
print(f"All grades: {list(student_grades.values())}")
print(f"Average grade: {sum(student_grades.values()) / len(student_grades):.1f}")

In [None]:
# Hash values, hashability, and dictionary internals
print("\n🔍 Dictionary Internals: Hashing and Hashability")
print("=" * 50)

# Show hash values for different types
# (str hashes vary between interpreter runs unless PYTHONHASHSEED is fixed)
hash_examples = [
    "hello", "world", "python", 
    42, 3.14, 
    (1, 2, 3),  # Tuples are hashable
    frozenset([1, 2, 3])  # Frozensets are hashable
]

print("Hash values for different objects:")
for obj in hash_examples:
    print(f"hash({obj!r:15}) = {hash(obj):20d}")

print()

# Demonstrate why lists and sets can't be dictionary keys
try:
    bad_dict = {[1, 2, 3]: "list as key"}  # This will fail
except TypeError as e:
    print(f"Error using list as key: {e}")

try:
    bad_dict = {{1, 2, 3}: "set as key"}  # This will also fail
except TypeError as e:
    print(f"Error using set as key: {e}")

print()

# Performance comparison: dictionary vs list for lookups
def compare_lookup_performance():
    """Compare dictionary vs list lookup performance."""
    n = 10000
    
    # Create test data
    data_list = list(range(n))
    data_dict = {i: i for i in range(n)}
    
    # Test lookups
    search_items = random.sample(range(n), 1000)
    
    # List lookup (O(n))
    # perf_counter() avoids a 0.0 duration (and a ZeroDivisionError below) on coarse clocks
    start_time = time.perf_counter()
    for item in search_items:
        item in data_list
    list_time = time.perf_counter() - start_time

    # Dictionary lookup (O(1) average)
    start_time = time.perf_counter()
    for item in search_items:
        item in data_dict
    dict_time = time.perf_counter() - start_time
    
    print(f"Lookup Performance (1000 lookups in {n:,} items):")
    print(f"List lookup time: {list_time:.4f} seconds")
    print(f"Dict lookup time: {dict_time:.4f} seconds")
    print(f"Dictionary is {list_time/dict_time:.1f}x faster!")

compare_lookup_performance()

# Memory usage: dictionary vs list
# (container overhead only: sys.getsizeof() does not include the stored objects)
print("\n💾 Memory Usage: Dictionary vs List")
print("=" * 40)

sizes = [100, 1000, 10000]
for size in sizes:
    # List of integers
    test_list = list(range(size))
    list_memory = sys.getsizeof(test_list)
    
    # Dictionary with integers as keys and values
    test_dict = {i: i for i in range(size)}
    dict_memory = sys.getsizeof(test_dict)
    
    print(f"Size {size:5d}: List = {list_memory:7d} bytes, Dict = {dict_memory:7d} bytes, Ratio = {dict_memory/list_memory:.2f}x")

## Sets: Mathematical Set Operations
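
Sets are unordered, so removing duplicates with `set(...)` does not promise to keep the original order of the items. When order matters, one common alternative is `dict.fromkeys`, which relies on dictionaries preserving insertion order (guaranteed since Python 3.7). The sample data below is invented for the illustration.

```python
items = ["pear", "apple", "pear", "fig", "apple"]

# A set removes duplicates but makes no promise about ordering
unique_any_order = set(items)

# dict keys are unique and keep first-insertion order
unique_in_order = list(dict.fromkeys(items))

print(unique_in_order)
print(unique_any_order == set(unique_in_order))
```

Both approaches need hashable items and run in O(n) on average.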

In [None]:
# Set operations and mathematical set theory
print("🔵 Python Sets - Mathematical Set Operations")
print("=" * 45)

# Create sets
set_a = {1, 2, 3, 4, 5}
set_b = {4, 5, 6, 7, 8}
set_c = {1, 2, 3}

print(f"Set A: {set_a}")
print(f"Set B: {set_b}")
print(f"Set C: {set_c}")
print(f"Memory size of set A: {sys.getsizeof(set_a)} bytes")
print()

# Set operations time complexity
set_operations = {
    "Add element": "O(1) average",
    "Remove element": "O(1) average", 
    "Check membership": "O(1) average",
    "Union (A | B)": "O(len(A) + len(B))",
    "Intersection (A & B)": "O(min(len(A), len(B)))",
    "Difference (A - B)": "O(len(A))",
    "Symmetric difference (A ^ B)": "O(len(A) + len(B))"
}

print("Set Operations Time Complexity:")
print("-" * 35)
for operation, complexity in set_operations.items():
    print(f"{operation:<30}: {complexity}")

print()

# Demonstrate mathematical set operations
print("Mathematical Set Operations:")
print("-" * 30)

# Union: elements in either set
union = set_a | set_b
print(f"A ∪ B (union): {union}")
print(f"Also: A.union(B) = {set_a.union(set_b)}")

# Intersection: elements in both sets
intersection = set_a & set_b
print(f"A ∩ B (intersection): {intersection}")
print(f"Also: A.intersection(B) = {set_a.intersection(set_b)}")

# Difference: elements in A but not in B
difference = set_a - set_b
print(f"A - B (difference): {difference}")
print(f"B - A (reverse): {set_b - set_a}")

# Symmetric difference: elements in A or B but not both
sym_diff = set_a ^ set_b
print(f"A ⊕ B (symmetric difference): {sym_diff}")
print(f"Also: A.symmetric_difference(B) = {set_a.symmetric_difference(set_b)}")

print()

# Set relationships
print("Set Relationships:")
print("-" * 20)
print(f"C ⊆ A (C is subset of A): {set_c.issubset(set_a)}")
print(f"A ⊇ C (A is superset of C): {set_a.issuperset(set_c)}")
print(f"A ∩ B = ∅ (A and B are disjoint): {set_a.isdisjoint(set_b)}")

# Create disjoint sets for demonstration
set_x = {1, 2, 3}
set_y = {4, 5, 6}
print(f"X = {set_x}, Y = {set_y}")
print(f"X ∩ Y = ∅ (X and Y are disjoint): {set_x.isdisjoint(set_y)}")

In [None]:
# Practical applications of sets
print("\n🎯 Practical Applications of Sets")
print("=" * 40)

# Example 1: Remove duplicates from a list
# (a set is unordered, so the original order is not guaranteed to survive)
numbers_with_duplicates = [1, 2, 2, 3, 3, 3, 4, 4, 5]
unique_numbers = list(set(numbers_with_duplicates))
print(f"Original: {numbers_with_duplicates}")
print(f"Unique: {unique_numbers}")
print()

# Example 2: Find common elements between lists
students_math = ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve']
students_physics = ['Bob', 'Diana', 'Frank', 'Grace', 'Alice']

both_subjects = set(students_math) & set(students_physics)
only_math = set(students_math) - set(students_physics)
only_physics = set(students_physics) - set(students_math)

print("Students taking both Math and Physics:", both_subjects)
print("Students taking only Math:", only_math)
print("Students taking only Physics:", only_physics)
print()

# Example 3: Validate data uniqueness
def has_duplicates(items):
    """Check if a list has duplicates using sets."""
    return len(items) != len(set(items))

test_lists = [
    [1, 2, 3, 4, 5],  # No duplicates
    [1, 2, 2, 3, 4],  # Has duplicates
    ['a', 'b', 'c'],  # No duplicates
    ['a', 'b', 'a']   # Has duplicates
]

for test_list in test_lists:
    has_dups = has_duplicates(test_list)
    print(f"{test_list} has duplicates: {has_dups}")

print()

# Performance comparison: set vs list for membership testing
def compare_membership_performance():
    """Compare set vs list for membership testing."""
    n = 10000
    
    # Create test data
    data_list = list(range(n))
    data_set = set(range(n))
    
    # Test membership
    search_items = random.sample(range(n), 1000)
    
    # List membership (O(n))
    # perf_counter() avoids a 0.0 duration (and a ZeroDivisionError below) on coarse clocks
    start_time = time.perf_counter()
    for item in search_items:
        item in data_list
    list_time = time.perf_counter() - start_time

    # Set membership (O(1) average)
    start_time = time.perf_counter()
    for item in search_items:
        item in data_set
    set_time = time.perf_counter() - start_time
    
    print(f"Membership Testing Performance (1000 tests in {n:,} items):")
    print(f"List membership time: {list_time:.4f} seconds")
    print(f"Set membership time: {set_time:.4f} seconds")
    print(f"Set is {list_time/set_time:.1f}x faster!")

compare_membership_performance()

# Set vs list memory usage
print("\n💾 Memory Usage: Set vs List")
print("=" * 35)

sizes = [100, 1000, 10000]
for size in sizes:
    test_list = list(range(size))
    test_set = set(range(size))
    
    list_memory = sys.getsizeof(test_list)
    set_memory = sys.getsizeof(test_set)
    
    print(f"Size {size:5d}: List = {list_memory:7d} bytes, Set = {set_memory:7d} bytes, Ratio = {set_memory/list_memory:.2f}x")

## Advanced Data Structures from Collections

In [None]:
# Advanced collections module data structures
print("🚀 Advanced Data Structures from Collections Module")
print("=" * 55)

# 1. deque (double-ended queue)
print("1. deque - Double-ended Queue")
print("-" * 30)

# Create a deque
dq = deque([1, 2, 3, 4, 5])
print(f"Original deque: {dq}")

# deque operations (all O(1) at both ends)
dq.appendleft(0)  # Add to left end
dq.append(6)      # Add to right end
print(f"After appendleft(0) and append(6): {dq}")

left = dq.popleft()  # Remove from left end
right = dq.pop()     # Remove from right end
print(f"After popleft() and pop(): {dq} (removed {left} and {right})")

# Rotation
dq.rotate(2)  # Rotate right by 2 positions
print(f"After rotate(2): {dq}")
dq.rotate(-2)  # Rotate left by 2 positions
print(f"After rotate(-2): {dq}")

# Limited length deque
limited_dq = deque([1, 2, 3], maxlen=3)
print(f"\nLimited deque (maxlen=3): {limited_dq}")
limited_dq.append(4)  # This will remove the leftmost element
print(f"After append(4): {limited_dq}")
limited_dq.appendleft(0)  # This will remove the rightmost element
print(f"After appendleft(0): {limited_dq}")

print()

# Performance comparison: deque vs list for operations at both ends
def compare_deque_vs_list():
    """Compare deque vs list for operations at both ends."""
    n = 10000
    
    # Time left-end insertion: list.insert(0, x) vs deque.appendleft(x)
    start_time = time.perf_counter()
    test_list = []
    for i in range(n):
        test_list.insert(0, i)  # O(n) for each insertion
    list_time = time.perf_counter() - start_time

    start_time = time.perf_counter()
    test_deque = deque()
    for i in range(n):
        test_deque.appendleft(i)  # O(1) for each insertion
    deque_time = time.perf_counter() - start_time
    
    print(f"Left-end insertions ({n:,} operations):")
    print(f"List time: {list_time:.4f} seconds")
    print(f"Deque time: {deque_time:.4f} seconds")
    print(f"Deque is {list_time/deque_time:.1f}x faster for left-end insertions!")

compare_deque_vs_list()
print()

In [None]:
# 2. Counter - counting hashable objects
print("2. Counter - Counting Hashable Objects")
print("-" * 40)

# Count characters in a string
text = "hello world"
char_count = Counter(text)
print(f"Character counts in '{text}': {char_count}")

# Count words in a sentence
sentence = "the quick brown fox jumps over the lazy dog the fox is quick"
word_count = Counter(sentence.split())
print(f"\nWord counts: {word_count}")
print(f"Most common words: {word_count.most_common(3)}")

# Counter arithmetic
counter1 = Counter(['a', 'b', 'c', 'a', 'b'])
counter2 = Counter(['a', 'b', 'b', 'd'])

print(f"\nCounter1: {counter1}")
print(f"Counter2: {counter2}")
print(f"Addition: {counter1 + counter2}")
print(f"Subtraction: {counter1 - counter2}")
print(f"Intersection: {counter1 & counter2}")
print(f"Union: {counter1 | counter2}")

# Practical example: analyze text statistics
def analyze_text(text):
    """Analyze text and provide statistics."""
    # Clean and split text
    words = text.lower().replace('.', '').replace(',', '').split()
    
    word_counter = Counter(words)
    char_counter = Counter(text.lower().replace(' ', '').replace('.', '').replace(',', ''))
    
    print(f"Text analysis for: '{text[:50]}...'")
    print(f"Total words: {len(words)}")
    print(f"Unique words: {len(word_counter)}")
    print(f"Most common words: {word_counter.most_common(5)}")
    print(f"Most common letters: {char_counter.most_common(5)}")
    
    return word_counter, char_counter

sample_text = "Python is a powerful programming language. Python is easy to learn and Python is versatile."
word_stats, char_stats = analyze_text(sample_text)
print()

# 3. defaultdict - dictionary with default values
print("3. defaultdict - Dictionary with Default Values")
print("-" * 45)

# Group words by their first letter
words = ['apple', 'banana', 'cherry', 'apricot', 'blueberry', 'avocado']

# Using regular dictionary (more verbose)
grouped_regular = {}
for word in words:
    first_letter = word[0]
    if first_letter not in grouped_regular:
        grouped_regular[first_letter] = []
    grouped_regular[first_letter].append(word)

print(f"Regular dict grouping: {dict(grouped_regular)}")

# Using defaultdict (more concise)
grouped_default = defaultdict(list)
for word in words:
    grouped_default[word[0]].append(word)

print(f"defaultdict grouping: {dict(grouped_default)}")

# Different default factories
dd_int = defaultdict(int)  # Default value is 0
dd_list = defaultdict(list)  # Default value is []
dd_set = defaultdict(set)   # Default value is set()

# Count occurrences using defaultdict(int)
numbers = [1, 2, 2, 3, 3, 3, 4, 4, 4, 4]
count_dd = defaultdict(int)
for num in numbers:
    count_dd[num] += 1  # No need to check if key exists

print(f"\nCounting with defaultdict(int): {dict(count_dd)}")
# Note: indexing a missing key also inserts it (here count_dd gains the key 999)
print(f"Access non-existent key: {count_dd[999]} (automatically 0)")
print()

In [None]:
# 4. namedtuple - tuple with named fields
print("4. namedtuple - Tuple with Named Fields")
print("-" * 40)

# Define a namedtuple
Person = namedtuple('Person', ['name', 'age', 'city'])
Point = namedtuple('Point', ['x', 'y'])

# Create instances
person1 = Person('Alice', 30, 'New York')
person2 = Person('Bob', 25, 'Los Angeles')
point1 = Point(10, 20)

print(f"Person 1: {person1}")
print(f"Name: {person1.name}, Age: {person1.age}, City: {person1.city}")
print(f"Access by index: {person1[0]}, {person1[1]}, {person1[2]}")
print()

# namedtuple methods
print("namedtuple methods:")
print(f"person1._asdict(): {person1._asdict()}")
print(f"person1._replace(age=31): {person1._replace(age=31)}")
print(f"Person._fields: {Person._fields}")
print()

# Practical example: representing data records
Student = namedtuple('Student', ['id', 'name', 'grades'])

students = [
    Student(1, 'Alice', [95, 87, 92]),
    Student(2, 'Bob', [78, 85, 90]),
    Student(3, 'Charlie', [88, 92, 86])
]

print("Student records:")
for student in students:
    avg_grade = sum(student.grades) / len(student.grades)
    print(f"{student.name} (ID: {student.id}): Average = {avg_grade:.1f}")

print()

# Memory comparison: namedtuple vs regular tuple vs class
class PersonClass:
    def __init__(self, name, age, city):
        self.name = name
        self.age = age
        self.city = city

# Create instances
regular_tuple = ('Alice', 30, 'New York')
named_tuple = Person('Alice', 30, 'New York')
person_obj = PersonClass('Alice', 30, 'New York')

print("Memory usage comparison:")
print(f"Regular tuple: {sys.getsizeof(regular_tuple)} bytes")
print(f"namedtuple: {sys.getsizeof(named_tuple)} bytes")
print(f"Class object: {sys.getsizeof(person_obj)} bytes")
print(f"Class object + __dict__: {sys.getsizeof(person_obj) + sys.getsizeof(person_obj.__dict__)} bytes")

# Performance comparison: attribute access
def compare_attribute_access():
    """Compare attribute access performance."""
    n = 100000
    
    nt = Person('Alice', 30, 'New York')
    obj = PersonClass('Alice', 30, 'New York')
    
    # namedtuple attribute access
    start_time = time.perf_counter()
    for _ in range(n):
        _ = nt.name
    nt_time = time.perf_counter() - start_time

    # Class object attribute access
    start_time = time.perf_counter()
    for _ in range(n):
        _ = obj.name
    obj_time = time.perf_counter() - start_time

    print(f"\nAttribute access performance ({n:,} operations):")
    print(f"namedtuple: {nt_time:.4f} seconds")
    print(f"Class object: {obj_time:.4f} seconds")
    # Neither is guaranteed to win: the result depends on the Python version,
    # so report the measured ratio instead of assuming namedtuple is faster
    print(f"Class object time / namedtuple time: {obj_time/nt_time:.2f}x")

compare_attribute_access()

## Performance Summary and Best Practices

In [None]:
# Comprehensive performance comparison
def comprehensive_performance_test():
    """Compare performance of different data structures for various operations."""
    
    sizes = [1000, 10000, 50000]
    results = {}
    
    print("🏆 Comprehensive Performance Comparison")
    print("=" * 50)
    
    for size in sizes:
        print(f"\nTesting with {size:,} elements:")
        print("-" * 30)
        
        # Create test data
        data = list(range(size))
        test_list = data.copy()
        test_set = set(data)
        test_dict = {i: i for i in data}
        test_deque = deque(data)
        
        # Test searches
        search_items = random.sample(data, min(1000, size))
        
        # List search
        # perf_counter() avoids 0.0 durations (and ZeroDivisionError in the ratios below)
        start = time.perf_counter()
        for item in search_items:
            item in test_list
        list_search_time = time.perf_counter() - start

        # Set search
        start = time.perf_counter()
        for item in search_items:
            item in test_set
        set_search_time = time.perf_counter() - start

        # Dict search
        start = time.perf_counter()
        for item in search_items:
            item in test_dict
        dict_search_time = time.perf_counter() - start
        
        # Print results
        print(f"Search {len(search_items)} items:")
        print(f"  List: {list_search_time:.4f}s")
        print(f"  Set:  {set_search_time:.4f}s ({list_search_time/set_search_time:.1f}x faster)")
        print(f"  Dict: {dict_search_time:.4f}s ({list_search_time/dict_search_time:.1f}x faster)")
        
        # Test insertions at beginning
        n_insertions = min(1000, size // 10)
        
        # List insert at beginning
        temp_list = []
        start = time.perf_counter()
        for i in range(n_insertions):
            temp_list.insert(0, i)
        list_insert_time = time.perf_counter() - start

        # Deque insert at beginning
        temp_deque = deque()
        start = time.perf_counter()
        for i in range(n_insertions):
            temp_deque.appendleft(i)
        deque_insert_time = time.perf_counter() - start
        
        print(f"\nInsert {n_insertions} items at beginning:")
        print(f"  List: {list_insert_time:.4f}s")
        print(f"  Deque: {deque_insert_time:.4f}s ({list_insert_time/deque_insert_time:.1f}x faster)")

comprehensive_performance_test()

# Best practices summary
print("\n\n🎯 Data Structure Selection Guide")
print("=" * 40)

guidelines = {
    "Use Lists when:": [
        "You need ordered, mutable sequences",
        "You frequently access elements by index",
        "You mostly append to the end",
        "You need to maintain insertion order"
    ],
    "Use Dictionaries when:": [
        "You need fast key-based lookups",
        "You're mapping keys to values", 
        "You need O(1) average access time",
        "You're counting or grouping data"
    ],
    "Use Sets when:": [
        "You need unique elements only",
        "You're doing mathematical set operations",
        "You need fast membership testing",
        "You're removing duplicates"
    ],
    "Use Deque when:": [
        "You need efficient operations at both ends",
        "You're implementing queues or stacks",
        "You need thread-safe append/pop operations",
        "You want bounded-length containers"
    ],
    "Use Counter when:": [
        "You're counting hashable objects",
        "You need frequency analysis",
        "You want multiset operations",
        "You're doing statistical analysis"
    ],
    "Use namedtuple when:": [
        "You want immutable records with named fields",
        "You need memory-efficient objects",
        "You want tuple benefits with readability",
        "You're representing simple data structures"
    ]
}

for category, items in guidelines.items():
    print(f"\n{category}")
    for item in items:
        print(f"  • {item}")

print("\n\n⚡ Performance Quick Reference")
print("=" * 35)

# Index access and search are separate rows: a list is O(1) by index but O(n) to search
perf_table = [
    ["Operation", "List", "Dict", "Set", "Deque"],
    ["-" * 16, "-" * 8, "-" * 8, "-" * 8, "-" * 8],
    ["Index/Key Access", "O(1)", "O(1)", "N/A", "O(n)"],
    ["Search (in)", "O(n)", "O(1)", "O(1)", "O(n)"],
    ["Insert End", "O(1)*", "O(1)", "O(1)", "O(1)"],
    ["Insert Begin", "O(n)", "N/A", "N/A", "O(1)"],
    ["Delete End", "O(1)", "O(1)", "O(1)", "O(1)"],
    ["Delete Begin", "O(n)", "N/A", "N/A", "O(1)"],
    ["Memory Usage", "Low", "High", "High", "Low"]
]

for row in perf_table:
    print(f"{row[0]:17} {row[1]:8} {row[2]:8} {row[3]:8} {row[4]:8}")
print("* amortized; Dict and Set figures are average case")

## Key Takeaways

### Data Structure Complexities:

**Lists (Dynamic Arrays):**
- ✅ **Strengths**: Ordered, indexable, flexible size
- ❌ **Weaknesses**: Slow insertion/deletion at beginning, linear search
- 🎯 **Best for**: Sequential access, frequent appends, maintaining order

**Dictionaries (Hash Tables):**
- ✅ **Strengths**: Average-case O(1) lookup, insertion, and deletion; insertion order is preserved (guaranteed since Python 3.7)
- ❌ **Weaknesses**: Higher memory usage, keys must be hashable, O(n) worst case under heavy hash collisions
- 🎯 **Best for**: Key-value mappings, fast lookups, counting

**Sets (Hash Sets):**
- ✅ **Strengths**: Fast membership testing, unique elements, set operations
- ❌ **Weaknesses**: No indexing, unordered, elements must be hashable
- 🎯 **Best for**: Uniqueness checking, mathematical set operations

**Advanced Collections:**
- **deque**: Efficient operations at both ends
- **Counter**: Counting and frequency analysis
- **defaultdict**: Automatic default value handling
- **namedtuple**: Memory-efficient named fields
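
One consequence of `defaultdict`'s automatic default handling is easy to miss: indexing a missing key does not just return the default, it also stores that key. The short sketch below compares indexing with `.get()` and `in`, which leave the dictionary untouched. The key names are invented for the illustration.

```python
from collections import defaultdict

counts = defaultdict(int)
counts["seen"] += 1
before = len(counts)

_ = counts["missing"]             # indexing calls the factory and stores the key
after_index = len(counts)

_ = counts.get("also_missing")    # .get() returns None without calling the factory
present = "also_missing" in counts
after_get = len(counts)

print(before, after_index, after_get, present)
```

When a lookup should be read-only, `.get()` or a membership test is the safer choice.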

### Critical Performance Insights:

1. **Search Operations**:
   - Lists: O(n) - avoid for frequent searches
   - Dicts/Sets: O(1) on average - excellent for lookups

2. **Memory Usage**:
   - Lists: Most memory efficient
   - Dicts/Sets: Several times more container memory than a list of the same length (the exact ratio varies with size and Python version)
   - Choose based on performance needs vs memory constraints

3. **Insertion Patterns**:
   - End insertions: O(1) for lists (amortized) and deques; dict and set insertions are O(1) on average
   - Beginning insertions: Use deque instead of list
   - Middle insertions: O(n) for both lists and deques
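
Timing results vary by machine, so the search insight above can also be checked by counting work rather than measuring seconds. The sketch below wraps integers in a class (`CountingInt`, a made-up name) that counts calls to `__eq__`, then looks for the last element in a list and in a set. It assumes CPython's behaviour of comparing a list's elements one by one and a set's candidates only when their hashes match.

```python
class CountingInt:
    """Hypothetical wrapper that counts how often __eq__ is called."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return hash(self.value)

    def __eq__(self, other):
        CountingInt.eq_calls += 1
        return isinstance(other, CountingInt) and self.value == other.value

def count_eq_calls(container, target):
    """Return (found, number of __eq__ calls) for one membership test."""
    CountingInt.eq_calls = 0
    found = target in container
    return found, CountingInt.eq_calls

n = 1000
as_list = [CountingInt(i) for i in range(n)]
as_set = set(as_list)
target = CountingInt(n - 1)  # equal to the last element, but a distinct object

list_found, list_calls = count_eq_calls(as_list, target)
set_found, set_calls = count_eq_calls(as_set, target)

print(f"list: found={list_found}, comparisons={list_calls}")
print(f"set:  found={set_found}, comparisons={set_calls}")
```

The list needs a comparison for every element it walks past, while the set jumps to the matching hash and compares only there.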

### Decision Framework:

**Ask yourself:**
1. Do you need fast lookups? → **Dictionary/Set**
2. Do you need ordering and indexing? → **List**
3. Do you insert/remove from both ends? → **Deque**
4. Are you counting things? → **Counter**
5. Do you need unique elements? → **Set**
6. Do you need named fields with immutability? → **namedtuple**
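
The immutability mentioned in the last question can be checked directly. In this minimal sketch, assigning to a field of a namedtuple raises `AttributeError`, and `_replace` returns a new instance instead of changing the original. The `Point` type mirrors the one defined earlier in the notebook.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

try:
    p.x = 10                  # fields cannot be reassigned
    assignment_failed = False
except AttributeError:
    assignment_failed = True

moved = p._replace(x=10)      # builds a new Point; p is unchanged

print(assignment_failed, p, moved)
```

Because instances never change, namedtuples with hashable fields can also be used as dictionary keys or set elements.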

### Real-world Applications:

- **Web Development**: Dictionaries for caches, sets for user permissions
- **Data Analysis**: Lists for sequences, Counter for frequency analysis
- **Algorithms**: Deques for BFS, sets for visited nodes
- **System Programming**: namedtuples for configuration, defaultdict for grouping
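
As one way to see several of these structures working together, here is a minimal breadth-first search sketch: a `defaultdict(list)` holds the adjacency lists, a `deque` serves as the FIFO queue, and a `set` tracks visited nodes. The function names and the sample edges are invented for the illustration.

```python
from collections import defaultdict, deque

def build_graph(edges):
    """Build an undirected adjacency list from (a, b) pairs."""
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
        graph[b].append(a)
    return graph

def bfs_order(graph, start):
    """Return nodes in the order breadth-first search visits them."""
    visited = {start}            # set: O(1) average membership tests
    queue = deque([start])       # deque: O(1) popleft
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        # .get() avoids inserting new keys into the defaultdict
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
graph = build_graph(edges)
print(bfs_order(graph, "A"))
```

Using a list as the queue would also work, but `pop(0)` costs O(n) per step, which is the main reason to reach for a deque here.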

## Practice Exercises:

1. **Implement a cache** using dictionary with size limits
2. **Build a word frequency analyzer** using Counter
3. **Create a graph representation** using defaultdict
4. **Implement BFS/DFS** using deque and sets
5. **Design a student record system** using namedtuples
6. **Build a duplicate detector** using sets
7. **Create a time-series data structure** using deque with maxlen
8. **Implement set operations** from scratch
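
As a possible starting point for the first exercise, the sketch below builds a size-limited cache on top of a plain dictionary. It evicts the oldest inserted key when full, relying on dictionaries preserving insertion order. The class name `BoundedCache` and its methods are invented for this illustration, and first-in-first-out eviction is only one of several policies you might choose (least-recently-used is a common alternative).

```python
class BoundedCache:
    """Hypothetical FIFO cache: evicts the oldest inserted key when full."""

    def __init__(self, max_size):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        # Evict only when adding a new key to a full cache
        if key not in self._data and len(self._data) >= self.max_size:
            oldest_key = next(iter(self._data))  # first key in insertion order
            del self._data[oldest_key]
        self._data[key] = value

    def __len__(self):
        return len(self._data)

cache = BoundedCache(max_size=2)
cache.put("a", 1)
cache.put("b", 2)
cache.put("c", 3)   # cache is full, so "a" is evicted

print(cache.get("a"), cache.get("b"), cache.get("c"), len(cache))
```

Updating an existing key keeps its original position, so eviction order here follows first insertion rather than most recent use.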

Understanding these data structures deeply will make you a more effective programmer and help you write more efficient code!