# Itertools Mastery - Advanced Iteration Patterns
## 🟡 Intermediate Level

**Goal**: Master Python's itertools module for efficient iteration and combinatorial operations

**Time**: ~50 minutes

**Prerequisites**: Complete `01_builtin_functions.ipynb`

**Functions Covered**: `chain()`, `combinations()`, `combinations_with_replacement()`, `permutations()`, `product()`, `groupby()`, `cycle()`, `repeat()`, `count()`, `compress()`, `dropwhile()`, `takewhile()`, `filterfalse()`

---

In [None]:
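# Import the itertools module
import itertools
# The star import keeps the examples short, but it puts names such as `count`
# and `product` into the global namespace, so don't reuse them as variable names
from itertools import *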

## Part 1: Chaining Iterables with chain()

**Concept**: `chain()` flattens multiple iterables into a single iterator

**Syntax**: `chain(*iterables)` or `chain.from_iterable(iterable_of_iterables)`

**Use Cases**: Combining lists, flattening nested structures, processing multiple data sources
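
One property worth seeing in action is that `chain()` is lazy: it pulls items from its inputs only when asked. The sketch below illustrates this with a hypothetical helper, `numbered_source`, which is not part of `itertools` and exists only to record when each value is produced.

```python
from itertools import chain

def numbered_source(label, values, log):
    """Hypothetical helper: yield values and record when each one is produced."""
    for value in values:
        log.append(f"{label}:{value}")
        yield value

log = []
combined = chain(
    numbered_source("a", [1, 2], log),
    numbered_source("b", [3, 4], log),
)

# Nothing has been consumed yet: chain() has only stored its inputs
print(log)  # []

first_two = [next(combined), next(combined)]
print(first_two)  # [1, 2]
print(log)        # ['a:1', 'a:2']  (source "b" has not been touched)

# chain.from_iterable() also accepts a lazy iterable of iterables
nested = ([n, n * 10] for n in range(3))
flat = list(chain.from_iterable(nested))
print(flat)  # [0, 0, 1, 10, 2, 20]
```

List concatenation with `+` builds a new list immediately, so `chain()` is usually the better fit when the inputs are large or are themselves generators.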

In [None]:
# Example: Combining multiple data sources
morning_sales = [150, 200, 175]
afternoon_sales = [300, 250, 400]
evening_sales = [100, 125, 90]

# Traditional approach
all_sales_traditional = morning_sales + afternoon_sales + evening_sales
print(f"Traditional concatenation: {all_sales_traditional}")

# Using chain()
all_sales_chain = list(chain(morning_sales, afternoon_sales, evening_sales))
print(f"Using chain(): {all_sales_chain}")

# Using chain.from_iterable() with a list of lists
sales_data = [morning_sales, afternoon_sales, evening_sales]
all_sales_from_iterable = list(chain.from_iterable(sales_data))
print(f"Using chain.from_iterable(): {all_sales_from_iterable}")

# Calculate total sales
total_sales = sum(chain(morning_sales, afternoon_sales, evening_sales))
print(f"Total sales: ${total_sales}")

### Exercise 1: Multi-Source Data Aggregation

**Scenario**: Combine customer feedback from multiple platforms (email, social media, surveys).

**Tasks**:
1. Flatten feedback from different sources
2. Extract all unique keywords
3. Calculate overall sentiment scores
4. Generate unified reports

In [None]:
# Exercise 1: Multi-Source Data Aggregation

# Feedback data from different sources
email_feedback = [
    {"source": "email", "rating": 4, "keywords": ["fast", "reliable"]},
    {"source": "email", "rating": 5, "keywords": ["excellent", "support"]},
    {"source": "email", "rating": 3, "keywords": ["slow", "expensive"]}
]

social_feedback = [
    {"source": "twitter", "rating": 5, "keywords": ["amazing", "fast"]},
    {"source": "facebook", "rating": 2, "keywords": ["poor", "service"]},
    {"source": "instagram", "rating": 4, "keywords": ["good", "quality"]}
]

survey_feedback = [
    {"source": "survey", "rating": 4, "keywords": ["reliable", "good"]},
    {"source": "survey", "rating": 5, "keywords": ["excellent", "fast"]},
    {"source": "survey", "rating": 3, "keywords": ["average", "okay"]}
]

# TODO: Combine all feedback using chain()
all_feedback = list(chain(email_feedback, social_feedback, survey_feedback))
print(f"Total feedback entries: {len(all_feedback)}")

# TODO: Extract all ratings
all_ratings = list(chain.from_iterable(
    [[feedback["rating"]] for feedback in all_feedback]
))
# Simpler approach:
all_ratings_simple = [feedback["rating"] for feedback in all_feedback]

print(f"All ratings: {all_ratings_simple}")
print(f"Average rating: {sum(all_ratings_simple) / len(all_ratings_simple):.2f}")

# TODO: Extract all keywords using chain.from_iterable()
all_keywords = list(chain.from_iterable(
    feedback["keywords"] for feedback in all_feedback
))
print(f"All keywords: {all_keywords}")

# TODO: Find unique keywords
unique_keywords = sorted(set(all_keywords))  # sorted so the output order is stable
print(f"Unique keywords: {unique_keywords}")

# TODO: Count keyword frequency
keyword_counts = {}
for keyword in all_keywords:
    keyword_counts[keyword] = keyword_counts.get(keyword, 0) + 1

print(f"Keyword frequency: {keyword_counts}")

## Part 2: Combinatorial Functions - combinations() and permutations()

**Concepts**: 
- `combinations(iterable, r)` - r-length tuples, each input position used at most once, order doesn't matter
- `permutations(iterable, r)` - r-length tuples, each input position used at most once, order matters
- `combinations_with_replacement(iterable, r)` - like `combinations()`, but an element may be picked more than once

**Use Cases**: Password generation, team formation, testing scenarios, algorithm optimization
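
As a quick sanity check, the number of results from each function can be compared with the standard counting formulas. This is a minimal sketch that assumes Python 3.8 or later, where `math.comb` and `math.perm` are available; the `items` list is made up for illustration.

```python
from itertools import combinations, combinations_with_replacement, permutations
from math import comb, perm

items = ["A", "B", "C", "D"]
n, r = len(items), 2

# Count results without storing them
n_combinations = sum(1 for _ in combinations(items, r))
n_permutations = sum(1 for _ in permutations(items, r))
n_with_replacement = sum(1 for _ in combinations_with_replacement(items, r))

print(n_combinations, comb(n, r))              # 6 6
print(n_permutations, perm(n, r))              # 12 12
print(n_with_replacement, comb(n + r - 1, r))  # 10 10
```

These counts grow quickly with `n` and `r`, so it is worth estimating them before calling `list()` on a combinatorial iterator.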

In [None]:
# Example: Team formation scenarios
team_members = ["Alice", "Bob", "Charlie", "Diana"]

# All possible 2-person teams (combinations - order doesn't matter)
two_person_teams = list(combinations(team_members, 2))
print(f"2-person teams (combinations): {len(two_person_teams)} teams")
for i, team in enumerate(two_person_teams, 1):
    print(f"  Team {i}: {team}")

# All possible 2-person leadership pairs (permutations - order matters)
leadership_pairs = list(permutations(team_members, 2))
print(f"\nLeadership pairs (permutations): {len(leadership_pairs)} pairs")
for i, pair in enumerate(leadership_pairs[:6], 1):  # Show first 6
    print(f"  Pair {i}: {pair[0]} leads {pair[1]}")
print(f"  ... and {len(leadership_pairs) - 6} more")

# All possible 3-person committees
committees = list(combinations(team_members, 3))
print(f"\n3-person committees: {len(committees)} committees")
for i, committee in enumerate(committees, 1):
    print(f"  Committee {i}: {committee}")

### Exercise 2: Menu Planning System

**Scenario**: A restaurant needs to create meal combinations and test different menu arrangements.

**Tasks**:
1. Generate all possible 3-course meal combinations
2. Create different menu orderings for A/B testing
3. Find optimal ingredient pairings
4. Calculate pricing for all combinations

In [None]:
# Exercise 2: Menu Planning System

# Menu items by category
appetizers = [("Salad", 8), ("Soup", 6), ("Bruschetta", 9)]
mains = [("Pasta", 15), ("Steak", 25), ("Fish", 20), ("Chicken", 18)]
desserts = [("Cake", 7), ("Ice Cream", 5), ("Fruit", 4)]

# TODO: Generate all possible 3-course meals (one from each category)
three_course_meals = list(product(appetizers, mains, desserts))
print(f"Total 3-course meal combinations: {len(three_course_meals)}")

# Show first 5 combinations with prices
print("\nSample meal combinations:")
for i, meal in enumerate(three_course_meals[:5], 1):
    appetizer, main, dessert = meal
    total_price = appetizer[1] + main[1] + dessert[1]
    print(f"  Meal {i}: {appetizer[0]} + {main[0]} + {dessert[0]} = ${total_price}")

# TODO: Find meals within budget ($30)
budget_meals = []
for meal in three_course_meals:
    total_price = sum(item[1] for item in meal)
    if total_price <= 30:
        budget_meals.append((meal, total_price))

print(f"\nMeals at or under $30: {len(budget_meals)} options")
for meal, price in budget_meals[:3]:
    items = [item[0] for item in meal]
    print(f"  {' + '.join(items)} = ${price}")

# TODO: Generate different menu orderings for A/B testing
all_items = [item[0] for item in appetizers + mains + desserts]
menu_orderings = list(permutations(all_items[:4]))  # First 4 items only
print(f"\nDifferent menu orderings (first 4 items): {len(menu_orderings)} arrangements")
print(f"Sample ordering: {menu_orderings[0]}")

## Part 3: Grouping Data with groupby()

**Concept**: `groupby(iterable, key=None)` groups *consecutive* elements that share the same key

**Important**: A new group starts every time the key changes. To get exactly one group per key, sort the data by the same key function first!

**Use Cases**: Data analysis, report generation, categorization, aggregation
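
The effect of sorting is easiest to see side by side. The following sketch uses a made-up list of status strings and compares `groupby()` on unsorted and sorted input.

```python
from itertools import groupby

statuses = ["ok", "ok", "error", "ok", "error", "error"]

# Without sorting, each run of equal neighbours becomes its own group
runs = [(key, len(list(group))) for key, group in groupby(statuses)]
print(runs)  # [('ok', 2), ('error', 1), ('ok', 1), ('error', 2)]

# Sorting first brings equal keys together, giving one group per key
totals = [(key, len(list(group))) for key, group in groupby(sorted(statuses))]
print(totals)  # [('error', 3), ('ok', 3)]
```

The unsorted behaviour is not a bug: it is useful when the runs themselves matter, for example when looking for streaks of consecutive errors.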

In [None]:
# Example: Sales data analysis
sales_data = [
    {"product": "Laptop", "category": "Electronics", "amount": 1200},
    {"product": "Mouse", "category": "Electronics", "amount": 25},
    {"product": "Desk", "category": "Furniture", "amount": 300},
    {"product": "Chair", "category": "Furniture", "amount": 150},
    {"product": "Notebook", "category": "Stationery", "amount": 5},
    {"product": "Pen", "category": "Stationery", "amount": 2}
]

# IMPORTANT: Sort by category first!
sales_data_sorted = sorted(sales_data, key=lambda x: x["category"])

# Group by category
print("Sales by category:")
for category, items in groupby(sales_data_sorted, key=lambda x: x["category"]):
    items_list = list(items)  # Convert iterator to list
    total_amount = sum(item["amount"] for item in items_list)
    product_count = len(items_list)
    
    print(f"\n{category}:")
    print(f"  Products: {product_count}")
    print(f"  Total sales: ${total_amount}")
    print(f"  Items: {[item['product'] for item in items_list]}")

### Exercise 3: Log Analysis with groupby()

**Scenario**: Analyze server logs grouped by status codes and time periods.

**Tasks**:
1. Group log entries by status code
2. Calculate error rates by time period
3. Find patterns in error occurrences
4. Generate summary reports

In [None]:
# Exercise 3: Log Analysis with groupby()

# Server log data (timestamp, status_code, response_time)
log_entries = [
    ("09:00", 200, 0.1), ("09:01", 200, 0.2), ("09:02", 404, 0.05),
    ("09:03", 200, 0.15), ("09:04", 500, 1.2), ("09:05", 200, 0.1),
    ("09:06", 404, 0.05), ("09:07", 200, 0.3), ("09:08", 500, 0.8),
    ("09:09", 200, 0.12), ("09:10", 404, 0.04)
]

# TODO: Sort by status code first
log_sorted = sorted(log_entries, key=lambda x: x[1])

# TODO: Group by status code and analyze
print("Log Analysis by Status Code:")
for status_code, entries in groupby(log_sorted, key=lambda x: x[1]):
    entries_list = list(entries)
    entry_count = len(entries_list)  # not `count`: that would shadow itertools.count
    avg_response_time = sum(entry[2] for entry in entries_list) / entry_count

    status_name = {
        200: "Success",
        404: "Not Found",
        500: "Server Error"
    }.get(status_code, "Unknown")

    print(f"\n{status_code} ({status_name}):")
    print(f"  Count: {entry_count}")
    print(f"  Average response time: {avg_response_time:.3f}s")
    print(f"  Times: {[entry[0] for entry in entries_list]}")

# TODO: Calculate error rate
total_requests = len(log_entries)
error_requests = len([entry for entry in log_entries if entry[1] >= 400])
error_rate = (error_requests / total_requests) * 100
print(f"\nOverall error rate: {error_rate:.1f}% ({error_requests}/{total_requests})")

## Part 4: Infinite Iterators

**Concept**: Infinite iterators generate values indefinitely - use with caution!

**Functions**: `cycle()`, `repeat()`, `count()`

**Use Cases**: Cycling through options, default values, counters, round-robin scheduling
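
One common way to put a bound on an infinite iterator is `itertools.islice()`, which is not otherwise covered in this notebook. The sketch below shows it next to `zip()`, which stops at its shortest input; the sample values are illustrative only.

```python
from itertools import count, cycle, islice, repeat

# islice(iterable, n) yields at most n items and then stops
first_colors = list(islice(cycle(["red", "green", "blue"]), 5))
print(first_colors)  # ['red', 'green', 'blue', 'red', 'green']

multiples_of_five = list(islice(count(5, 5), 4))
print(multiples_of_five)  # [5, 10, 15, 20]

# zip() stops at the shortest input, so it also bounds an infinite iterator
labelled = list(zip("abc", repeat("pending")))
print(labelled)  # [('a', 'pending'), ('b', 'pending'), ('c', 'pending')]
```

Either approach avoids a manual `break`, so the stopping condition cannot be forgotten.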

In [None]:
# Example: cycle() - Repeats elements from an iterable infinitely
colors = ['red', 'green', 'blue']
color_cycle = cycle(colors)

print("First 10 colors from cycle:")
for i, color in enumerate(color_cycle):
    if i >= 10:  # IMPORTANT: Always have a stopping condition!
        break
    print(f"Item {i+1}: {color}")

# Practical use: Round-robin server assignment
servers = ['server1', 'server2', 'server3']
server_cycle = cycle(servers)
requests = ['req1', 'req2', 'req3', 'req4', 'req5', 'req6', 'req7']

print("\nServer assignments:")
for request in requests:
    assigned_server = next(server_cycle)
    print(f"{request} -> {assigned_server}")

In [None]:
# Example: repeat() - Repeats a single value
# repeat(value, times=None) - if times is None, repeats forever

# Finite repeat
default_values = list(repeat(0, 5))
print(f"Default values: {default_values}")

# Using with map() for initialization
names = ['Alice', 'Bob', 'Charlie']
initial_scores = list(map(lambda name, score: (name, score), names, repeat(100)))
print(f"Initial scores: {initial_scores}")

# Practical use: Padding lists to same length
data_lists = [
    [1, 2, 3],
    [4, 5],
    [6, 7, 8, 9]
]

max_length = max(len(lst) for lst in data_lists)
print(f"\nPadding lists to length {max_length}:")

padded_lists = []
for lst in data_lists:
    padding_needed = max_length - len(lst)
    padded = lst + list(repeat(0, padding_needed))
    padded_lists.append(padded)
    print(f"{lst} -> {padded}")

In [None]:
# Example: count() - Arithmetic progression
# count(start=0, step=1)

# Basic counter
counter = count(1)  # Start from 1
print("First 8 numbers:")
for i, num in enumerate(counter):
    if i >= 8:
        break
    print(num, end=' ')
print()

# Counter with step
even_numbers = count(0, 2)  # Start from 0, step by 2
print("\nFirst 6 even numbers:")
for i, num in enumerate(even_numbers):
    if i >= 6:
        break
    print(num, end=' ')
print()

# Practical use: Generating IDs by pairing count() with zip()
products = ['Laptop', 'Mouse', 'Keyboard', 'Monitor']
product_ids = count(1001)  # Start IDs from 1001

print("\nProduct catalog:")
# Use `product_name`, not `product`, so itertools.product is not shadowed
for product_name, product_id in zip(products, product_ids):
    print(f"ID {product_id}: {product_name}")

### Exercise 4: Task Scheduling with Infinite Iterators

**Scenario**: Create a task scheduler that cycles through workers and assigns sequential task IDs.

**Tasks**:
1. Implement round-robin worker assignment
2. Generate sequential task IDs
3. Handle worker availability cycles
4. Create task priority queues

In [None]:
# Exercise 4: Task Scheduling with Infinite Iterators

# Available workers and their specialties
workers = [
    {'name': 'Alice', 'specialty': 'backend'},
    {'name': 'Bob', 'specialty': 'frontend'},
    {'name': 'Charlie', 'specialty': 'database'},
    {'name': 'Diana', 'specialty': 'testing'}
]

# Incoming tasks
tasks = [
    'Fix login bug', 'Update UI design', 'Optimize queries', 
    'Write unit tests', 'Add new feature', 'Database migration',
    'Code review', 'Performance testing'
]

# TODO: Create worker cycle and task ID generator
worker_cycle = cycle(workers)
task_ids = count(1001)  # Start task IDs from 1001

# TODO: Assign tasks to workers
print("Task Assignments:")
assignments = []
for task in tasks:
    worker = next(worker_cycle)
    task_id = next(task_ids)
    assignment = {
        'task_id': task_id,
        'task': task,
        'worker': worker['name'],
        'specialty': worker['specialty']
    }
    assignments.append(assignment)
    print(f"Task {task_id}: '{task}' -> {worker['name']} ({worker['specialty']})")

# TODO: Assign priorities, using repeat() to supply the default
# The first 3 tasks are high priority; every later task gets the default 'medium'.
# zip() below stops when assignments runs out, so the infinite repeat() is safe.
priorities = chain(repeat('high', 3), repeat('medium'))

print("\nTask Priorities:")
for assignment, priority in zip(assignments, priorities):
    assignment['priority'] = priority
    print(f"Task {assignment['task_id']}: {priority} priority")

## Part 5: Filtering Iterators

**Concept**: Filter and select elements based on conditions

**Functions**: `compress()`, `dropwhile()`, `takewhile()`, `filterfalse()`

**Use Cases**: Data cleaning, conditional processing, stream filtering
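
A frequent source of confusion is that `takewhile()` and `dropwhile()` only test the leading elements, while `filter()` and `filterfalse()` test every element. A small sketch with made-up readings shows the difference.

```python
from itertools import dropwhile, filterfalse, takewhile

readings = [2, 4, 9, 3, 8, 1]

def is_small(x):
    return x < 5

# takewhile() stops at the first failure, even though 3 and 1 are also small
taken = list(takewhile(is_small, readings))
print(taken)  # [2, 4]

# filter() checks every element
kept = list(filter(is_small, readings))
print(kept)  # [2, 4, 3, 1]

# dropwhile() skips the leading small values, then yields everything else
dropped = list(dropwhile(is_small, readings))
print(dropped)  # [9, 3, 8, 1]

# filterfalse() keeps every element that fails the test
rejected = list(filterfalse(is_small, readings))
print(rejected)  # [9, 8]
```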

In [None]:
# Example: compress() - Filter based on boolean selectors
data = ['A', 'B', 'C', 'D', 'E', 'F']
selectors = [1, 0, 1, 0, 1, 1]  # 1 = include, 0 = exclude

filtered_data = list(compress(data, selectors))
print(f"Original: {data}")
print(f"Selectors: {selectors}")
print(f"Filtered: {filtered_data}")

# Practical use: Feature selection in data
features = ['age', 'income', 'education', 'location', 'experience', 'skills']
feature_importance = [0.8, 0.9, 0.6, 0.3, 0.7, 0.85]
threshold = 0.7

# Create selector based on importance threshold
important_features_selector = [score >= threshold for score in feature_importance]
important_features = list(compress(features, important_features_selector))

print(f"\nFeature selection (threshold >= {threshold}):")
print(f"All features: {features}")
print(f"Importance scores: {feature_importance}")
print(f"Selected features: {important_features}")

In [None]:
# Example: dropwhile() and takewhile() - Conditional start/stop
numbers = [1, 3, 5, 8, 10, 12, 15, 17, 20]

# dropwhile: Skip elements while condition is true, then take all remaining
drop_odd = list(dropwhile(lambda x: x % 2 == 1, numbers))
print(f"Original: {numbers}")
print(f"Drop while odd: {drop_odd}")

# takewhile: Take elements while condition is true, then stop
take_small = list(takewhile(lambda x: x < 10, numbers))
print(f"Take while < 10: {take_small}")

# Practical use: Processing log entries
log_lines = [
    "INFO: System started",
    "INFO: Loading config", 
    "INFO: Database connected",
    "ERROR: Connection timeout",
    "ERROR: Retry failed",
    "INFO: System recovered",
    "INFO: Processing requests"
]

# Skip INFO messages until first ERROR
errors_and_after = list(dropwhile(lambda line: not line.startswith('ERROR'), log_lines))
print(f"\nFrom first error onwards: {errors_and_after}")

# Take only initial INFO messages
initial_info = list(takewhile(lambda line: line.startswith('INFO'), log_lines))
print(f"Initial INFO messages: {initial_info}")

In [None]:
# Example: filterfalse() - Opposite of filter()
# filter() keeps elements where condition is True
# filterfalse() keeps elements where condition is False

scores = [85, 92, 78, 96, 88, 74, 91, 83]
passing_grade = 80

# Using built-in filter()
passed = list(filter(lambda x: x >= passing_grade, scores))
print(f"Passed (>= {passing_grade}): {passed}")

# Using filterfalse() - keeps elements where condition is False
failed = list(filterfalse(lambda x: x >= passing_grade, scores))
print(f"Failed (< {passing_grade}): {failed}")

# Practical use: Data validation
email_addresses = [
    'user@example.com',
    'invalid-email',
    'test@domain.org',
    'bad@',
    'good@company.net',
    '@invalid.com'
]

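# Simple email validation: exactly one @, a non-empty local part, and a dot in the domain
def is_valid_email(email):
    local, sep, domain = email.partition('@')
    return bool(local) and sep == '@' and '@' not in domain and '.' in domain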

valid_emails = list(filter(is_valid_email, email_addresses))
invalid_emails = list(filterfalse(is_valid_email, email_addresses))

print(f"\nValid emails: {valid_emails}")
print(f"Invalid emails: {invalid_emails}")

### Exercise 5: Data Pipeline with Filtering

**Scenario**: Process sensor data stream with various filtering requirements.

**Tasks**:
1. Filter out invalid readings using compress()
2. Skip initial calibration period with dropwhile()
3. Take readings until maintenance window with takewhile()
4. Separate normal and abnormal readings with filter/filterfalse

In [None]:
# Exercise 5: Data Pipeline with Filtering

# Sensor readings: (timestamp, temperature, humidity, is_valid)
sensor_data = [
    ('08:00', 22.1, 45.2, True),   # Calibration period
    ('08:01', 21.8, 44.8, True),   # Calibration period  
    ('08:02', 22.5, 46.1, False),  # Invalid reading
    ('08:03', 23.2, 47.3, True),   # Normal operation starts
    ('08:04', 24.1, 48.9, True),
    ('08:05', 26.8, 52.1, True),   # High temperature
    ('08:06', 23.9, 47.8, True),
    ('08:07', 28.5, 55.3, True),   # Very high temperature
    ('08:08', 24.2, 48.2, False),  # Invalid reading
    ('08:09', 23.1, 46.9, True),
    ('08:10', 22.8, 45.8, True)    # Maintenance window starts
]

print(f"Total sensor readings: {len(sensor_data)}")

# TODO: Step 1 - Filter out invalid readings using compress()
validity_flags = [reading[3] for reading in sensor_data]
valid_readings = list(compress(sensor_data, validity_flags))
print(f"Valid readings: {len(valid_readings)}")

# TODO: Step 2 - Skip the calibration period (readings before 08:03) using dropwhile()
# Zero-padded HH:MM strings compare correctly as plain strings
calibration_end_time = '08:03'
post_calibration = list(dropwhile(lambda x: x[0] < calibration_end_time, valid_readings))
print(f"Post-calibration readings: {len(post_calibration)}")

# TODO: Step 3 - Take readings until maintenance window using takewhile()
maintenance_start_time = '08:10'
operational_readings = list(takewhile(lambda x: x[0] < maintenance_start_time, post_calibration))
print(f"Operational readings: {len(operational_readings)}")

# TODO: Step 4 - Separate normal and abnormal readings
temp_threshold = 25.0  # Temperature threshold
humidity_threshold = 50.0  # Humidity threshold

is_normal = lambda reading: reading[1] <= temp_threshold and reading[2] <= humidity_threshold

normal_readings = list(filter(is_normal, operational_readings))
abnormal_readings = list(filterfalse(is_normal, operational_readings))

print(f"\nFinal Results:")
print(f"Normal readings: {len(normal_readings)}")
print(f"Abnormal readings: {len(abnormal_readings)}")

print(f"\nAbnormal readings details:")
for time, temp, humidity, _ in abnormal_readings:
    print(f"  {time}: {temp}°C, {humidity}% humidity")

## Part 6: Advanced Combinations and Product

**Concept**: More complex combinatorial operations

**Functions**: `combinations_with_replacement()`, advanced `product()` usage

**Use Cases**: Advanced permutation problems, configuration generation, testing scenarios
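
When choosing between the two, it helps to remember that `combinations_with_replacement()` yields unordered selections, while `product(..., repeat=n)` yields ordered ones. The sketch below compares them for two six-sided dice and also shows that `product()` can be consumed lazily; `islice()` is used here only to take the first few results.

```python
from itertools import combinations_with_replacement, islice, product

faces = range(1, 7)

unordered = list(combinations_with_replacement(faces, 2))
ordered = list(product(faces, repeat=2))
print(len(unordered), len(ordered))  # 21 36

# Each unordered pair of different faces corresponds to two ordered rolls
sevens_unordered = [pair for pair in unordered if sum(pair) == 7]
sevens_ordered = [pair for pair in ordered if sum(pair) == 7]
print(sevens_unordered)     # [(1, 6), (2, 5), (3, 4)]
print(len(sevens_ordered))  # 6

# product() is lazy: take a few results without building the full list
first_pins = ["".join(p) for p in islice(product("0123456789", repeat=4), 3)]
print(first_pins)  # ['0000', '0001', '0002']
```

For anything involving probabilities of dice rolls, the ordered `product()` results are the ones to count, since each ordered roll is equally likely.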

In [None]:
# Example: combinations_with_replacement() - Combinations allowing repetition
colors = ['red', 'blue', 'green']

# Regular combinations (no repetition)
regular_combos = list(combinations(colors, 2))
print(f"Regular combinations (2 from {colors}): {regular_combos}")

# Combinations with replacement (repetition allowed)
replacement_combos = list(combinations_with_replacement(colors, 2))
print(f"With replacement (2 from {colors}): {replacement_combos}")

# Practical use: Dice roll combinations
dice_faces = [1, 2, 3, 4, 5, 6]
two_dice_sums = []

for combo in combinations_with_replacement(dice_faces, 2):
    dice_sum = sum(combo)
    two_dice_sums.append((combo, dice_sum))

print(f"\nTwo dice combinations and sums:")
for combo, total in two_dice_sums[:10]:  # Show first 10
    print(f"  {combo} -> sum = {total}")

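# Find the most common sum
# Count ordered rolls with product(): combinations_with_replacement() treats
# (1, 6) and (6, 1) as a single outcome, which undercounts sums such as 7
sum_counts = {}
for roll in itertools.product(dice_faces, repeat=2):
    total = sum(roll)
    sum_counts[total] = sum_counts.get(total, 0) + 1

most_common_sum = max(sum_counts, key=sum_counts.get)
print(f"\nMost common sum: {most_common_sum} (appears in {sum_counts[most_common_sum]} of 36 rolls)")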

In [None]:
# Example: Advanced product() usage - Configuration testing
# Testing different combinations of system configurations

# System configuration options
operating_systems = ['Windows', 'macOS', 'Linux']
browsers = ['Chrome', 'Firefox', 'Safari']
screen_sizes = ['1920x1080', '1366x768', '2560x1440']
connection_types = ['WiFi', 'Ethernet', '4G']

# Generate all possible test configurations
all_configs = list(product(operating_systems, browsers, screen_sizes, connection_types))
total_configs = len(all_configs)

print(f"Total test configurations: {total_configs}")
print(f"Sample configurations:")
for i, config in enumerate(all_configs[:5], 1):
    os, browser, screen, connection = config
    print(f"  Config {i}: {os} + {browser} + {screen} + {connection}")

# Filter configurations for specific testing scenarios
# Scenario 1: Mobile-like configurations (smaller screens + wireless)
mobile_configs = [
    config for config in all_configs 
    if config[2] == '1366x768' and config[3] in ['WiFi', '4G']
]

print(f"\nMobile-like configurations: {len(mobile_configs)}")

# Scenario 2: High-end configurations (high resolution + fast connection)
high_end_configs = [
    config for config in all_configs
    if config[2] == '2560x1440' and config[3] == 'Ethernet'
]

print(f"High-end configurations: {len(high_end_configs)}")

# Using product with repeat parameter
# Generate password combinations (simplified example)
digits = '0123456789'
four_digit_pins = list(product(digits, repeat=4))
print(f"\nTotal 4-digit PIN combinations: {len(four_digit_pins)}")
print(f"Sample PINs: {''.join(four_digit_pins[0])}, {''.join(four_digit_pins[1])}, {''.join(four_digit_pins[2])}")

## Part 7: Comprehensive Integration Exercise

**Scenario**: E-commerce Analytics Dashboard

**Challenge**: Build a complete analytics system that uses multiple itertools functions to process sales data, generate reports, and create recommendations.

**Skills Tested**: `chain()`, `count()`, `groupby()`, `combinations()`, `product()`, `compress()`, `cycle()`

In [None]:
# Part 7: Comprehensive Integration Exercise - E-commerce Analytics

# Sales data from multiple sources
online_sales = [
    ('2024-01-15', 'Electronics', 'Laptop', 1200, 'Premium'),
    ('2024-01-15', 'Electronics', 'Mouse', 25, 'Standard'),
    ('2024-01-16', 'Clothing', 'Shirt', 45, 'Standard'),
    ('2024-01-16', 'Electronics', 'Keyboard', 80, 'Premium')
]

store_sales = [
    ('2024-01-15', 'Books', 'Novel', 15, 'Standard'),
    ('2024-01-16', 'Clothing', 'Jeans', 60, 'Premium'),
    ('2024-01-16', 'Books', 'Textbook', 120, 'Premium')
]

mobile_sales = [
    ('2024-01-15', 'Electronics', 'Phone Case', 20, 'Standard'),
    ('2024-01-16', 'Clothing', 'Hat', 25, 'Standard')
]

# Task 1: Combine all sales data using chain()
all_sales = list(chain(online_sales, store_sales, mobile_sales))
print(f"Total sales records: {len(all_sales)}")

# Task 2: Add sequential transaction IDs using count()
transaction_ids = count(10001)
sales_with_ids = [(next(transaction_ids), *sale) for sale in all_sales]

print("\nSales with Transaction IDs:")
for sale in sales_with_ids[:3]:
    # `product_name` avoids shadowing itertools.product, which Task 5 calls
    tid, date, category, product_name, price, tier = sale
    print(f"  {tid}: {date} - {product_name} (${price})")

# Task 3: Group sales by category using groupby()
# First, sort by category
sales_by_category = sorted(sales_with_ids, key=lambda x: x[2])

print("\nSales Analysis by Category:")
category_stats = {}
for category, sales_group in groupby(sales_by_category, key=lambda x: x[2]):
    sales_list = list(sales_group)
    total_revenue = sum(sale[4] for sale in sales_list)
    avg_price = total_revenue / len(sales_list)
    category_stats[category] = {
        'count': len(sales_list),
        'revenue': total_revenue,
        'avg_price': avg_price
    }
    print(f"  {category}: {len(sales_list)} items, ${total_revenue} revenue, ${avg_price:.2f} avg")

In [None]:
# Task 4: Generate candidate product pairs using combinations()
# These are all possible pairs, a starting point for a "frequently bought together" analysis
products = sorted(set(sale[3] for sale in sales_with_ids))  # sorted so the output order is stable
print(f"\nAvailable products: {products}")

# Generate all possible product pairs for "frequently bought together"
product_pairs = list(combinations(products, 2))
print(f"Possible product combinations: {len(product_pairs)}")
print(f"Sample pairs: {product_pairs[:3]}")

# Task 5: Create promotional bundles using product()
# Bundle: one item from each category
electronics = [sale[3] for sale in sales_with_ids if sale[2] == 'Electronics']
clothing = [sale[3] for sale in sales_with_ids if sale[2] == 'Clothing']
books = [sale[3] for sale in sales_with_ids if sale[2] == 'Books']

# Remove duplicates
electronics = list(set(electronics))
clothing = list(set(clothing))
books = list(set(books))

if electronics and clothing and books:
    bundle_combinations = list(product(electronics[:2], clothing[:2], books[:1]))
    print(f"\nPromotional bundles (Electronics + Clothing + Books):")
    for i, bundle in enumerate(bundle_combinations, 1):
        print(f"  Bundle {i}: {' + '.join(bundle)}")

# Task 6: Filter premium sales using compress()
is_premium = [sale[5] == 'Premium' for sale in sales_with_ids]
premium_sales = list(compress(sales_with_ids, is_premium))

print(f"\nPremium sales: {len(premium_sales)} out of {len(sales_with_ids)}")
premium_revenue = sum(sale[4] for sale in premium_sales)
print(f"Premium revenue: ${premium_revenue}")

# Task 7: Create customer segments using cycle()
customer_segments = ['Bronze', 'Silver', 'Gold', 'Platinum']
segment_cycle = cycle(customer_segments)

# Assign segments to transactions (simplified)
print(f"\nCustomer segment assignments:")
for sale in sales_with_ids[:6]:
    segment = next(segment_cycle)
    tid = sale[0]  # only the ID is needed; unpacking into `product` would shadow itertools.product
    print(f"  Transaction {tid}: {segment} customer")

In [None]:
# Task 8: Generate final analytics report
print("\n" + "="*50)
print("E-COMMERCE ANALYTICS DASHBOARD")
print("="*50)

# Overall statistics
total_revenue = sum(sale[4] for sale in sales_with_ids)
total_transactions = len(sales_with_ids)
avg_transaction_value = total_revenue / total_transactions

print(f"\n📊 OVERALL PERFORMANCE:")
print(f"   Total Transactions: {total_transactions}")
print(f"   Total Revenue: ${total_revenue:,.2f}")
print(f"   Average Transaction: ${avg_transaction_value:.2f}")

# Category performance
print(f"\n📈 CATEGORY PERFORMANCE:")
sorted_categories = sorted(category_stats.items(), key=lambda x: x[1]['revenue'], reverse=True)
for category, stats in sorted_categories:
    print(f"   {category}: ${stats['revenue']:,.2f} ({stats['count']} items)")

# Premium vs Standard analysis
standard_sales = [sale for sale in sales_with_ids if sale[5] == 'Standard']
standard_revenue = sum(sale[4] for sale in standard_sales)

print(f"\n💎 TIER ANALYSIS:")
print(f"   Premium: ${premium_revenue:,.2f} ({len(premium_sales)} transactions)")
print(f"   Standard: ${standard_revenue:,.2f} ({len(standard_sales)} transactions)")
print(f"   Premium Ratio: {(premium_revenue/total_revenue)*100:.1f}%")

# Recommendations
top_category = sorted_categories[0][0]
print(f"\n🎯 RECOMMENDATIONS:")
print(f"   • Focus marketing on {top_category} (highest revenue)")
print(f"   • Promote premium tier (currently {(len(premium_sales)/total_transactions)*100:.1f}% of transactions)")
print(f"   • Consider cross-category bundles for increased sales")

print("\n" + "="*50)

## Summary and Next Steps

**🎉 Congratulations!** You've mastered Python's itertools module!

### Functions Covered:

**🔗 Chaining & Flattening:**
- `chain()` - Combine multiple iterables
- `chain.from_iterable()` - Flatten nested structures

**🎲 Combinatorics:**
- `combinations()` - Choose r items without repetition
- `combinations_with_replacement()` - Choose r items with repetition
- `permutations()` - All arrangements of r items
- `product()` - Cartesian product of iterables

**📊 Grouping & Analysis:**
- `groupby()` - Group consecutive elements by key
- `enumerate()` - Add indices to iterables (a built-in function, not part of `itertools`)

**♾️ Infinite Iterators:**
- `cycle()` - Repeat elements infinitely
- `repeat()` - Repeat single value
- `count()` - Arithmetic progression

**🔍 Filtering:**
- `compress()` - Filter by boolean selectors
- `dropwhile()` - Skip elements while condition is true
- `takewhile()` - Take elements while condition is true
- `filterfalse()` - Keep elements where condition is false

### Key Takeaways:
- ✅ Itertools functions are memory-efficient (lazy evaluation)
- ✅ Always use stopping conditions with infinite iterators
- ✅ Sort data by the grouping key before using `groupby()` when you want one group per key
- ✅ Convert iterators to lists when you need to reuse results
- ✅ Combine multiple itertools functions for complex data processing
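
The point about reusing results deserves a concrete illustration, since an exhausted iterator fails silently rather than raising an error. A minimal sketch:

```python
from itertools import chain

merged = chain([1, 2], [3])

first_pass = list(merged)
second_pass = list(merged)  # the iterator is already exhausted
print(first_pass)   # [1, 2, 3]
print(second_pass)  # []

# Materialize once when the values are needed more than once
values = list(chain([1, 2], [3]))
print(sum(values), max(values))  # 6 3
```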

### Next Steps:
Ready for more advanced Python? Try:
- **Collections module** - `Counter`, `defaultdict`, `deque`
- **Functools module** - `partial`, `reduce`, `lru_cache`
- **Advanced data structures** - Custom classes and algorithms

---

**🚀 Pro Tip**: Practice combining itertools functions in real projects - they're incredibly powerful for data processing, analysis, and algorithm implementation!