# Collections and Automation

Welcome to the second half of Day One! In the previous lesson, you learned about different data types and how to make decisions. Now you'll discover one of programming's greatest superpowers: **working with collections of data** and **automating repetitive tasks**.

By the end of this lesson, you'll be able to store multiple pieces of information together and process them automatically - no more writing the same code over and over again!

**Learning Objectives**:
- Create and manipulate Python lists
- Access and modify list elements
- Use for loops to process collections automatically
- Combine lists and loops with conditional logic
- Build programs that handle multiple data items efficiently
- Apply automation to solve real-world problems

**Time**: 90 minutes

* * * * *

## The Problem: Managing Multiple Pieces of Data

Imagine you're tracking test scores for your class. With what you know so far, you might try this:

In [None]:
# The tedious way - individual variables for each score
score1 = 85
score2 = 92
score3 = 78
score4 = 96
score5 = 88

print("Test Scores:")
print(f"Test 1: {score1}")
print(f"Test 2: {score2}")
print(f"Test 3: {score3}")
print(f"Test 4: {score4}")
print(f"Test 5: {score5}")

# Calculate average the hard way
average = (score1 + score2 + score3 + score4 + score5) / 5
print(f"\nAverage: {average:.1f}")

# Find the highest score the hard way
highest = score1
if score2 > highest:
    highest = score2
if score3 > highest:
    highest = score3
if score4 > highest:
    highest = score4
if score5 > highest:
    highest = score5
    
print(f"Highest score: {highest}")

**Problems with this approach:**
- What if you have 100 test scores? 1000?
- Adding a new score requires changing multiple places in the code
- The code becomes unmanageable very quickly
- You have to repeat similar operations many times

**Lists and loops solve all these problems!**

In [None]:
# The powerful way - using a list
scores = [85, 92, 78, 96, 88]

print("Test Scores:", scores)
print(f"Number of tests: {len(scores)}")

# Calculate average the easy way
average = sum(scores) / len(scores)
print(f"Average: {average:.1f}")

# Find highest and lowest the easy way
highest = max(scores)
lowest = min(scores)
print(f"Highest score: {highest}")
print(f"Lowest score: {lowest}")

# Add a new score easily
scores.append(91)
print(f"\nAfter adding new score: {scores}")
print(f"New average: {sum(scores) / len(scores):.1f}")
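
One caveat with the list version: `sum(scores) / len(scores)` divides by zero when the list is empty, and `max()` and `min()` raise a `ValueError` on an empty list. A minimal sketch of a guard, assuming a hypothetical class that has no scores yet (the `summary` variable is only for illustration):

```python
# Hypothetical data: a class that has not taken any tests yet
scores = []

if scores:  # an empty list counts as False in a condition
    average = sum(scores) / len(scores)
    summary = f"Average: {average:.1f}"
else:
    summary = "No scores recorded yet"

print(summary)
```

The same `if scores:` check works before calling `max(scores)` or `min(scores)`.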

## Introduction to Lists: Storing Multiple Items

A **list** is a collection of items stored in a specific order. Think of it as a container that can hold multiple values:

In [None]:
# Creating different types of lists
student_names = ["Alice", "Bob", "Charlie", "Diana"]
test_scores = [95, 87, 92, 78, 88]
temperatures = [72.5, 68.0, 75.3, 71.8]
is_present = [True, True, False, True]

print("Student Names:", student_names)
print("Test Scores:", test_scores)
print("Temperatures:", temperatures)
print("Present Today:", is_present)

# Lists can be empty
empty_list = []
print("Empty list:", empty_list)

# Lists can contain mixed types (though usually we keep them consistent)
mixed_data = ["Alice", 16, 3.8, True]
print("Mixed data:", mixed_data)

# Check the type
print(f"\nType of student_names: {type(student_names)}")

## Accessing List Elements: Getting Individual Items

Lists are **indexed** starting from 0. You can access individual items using square brackets:

In [None]:
colors = ["red", "blue", "green", "yellow", "purple"]

print("Colors list:", colors)
print(f"List has {len(colors)} items")

# Positive indexing (from the beginning)
print("\nAccessing by positive index:")
print(f"First color (index 0): {colors[0]}")
print(f"Second color (index 1): {colors[1]}")
print(f"Third color (index 2): {colors[2]}")
print(f"Last color (index 4): {colors[4]}")

# Negative indexing (from the end)
print("\nAccessing by negative index:")
print(f"Last color (index -1): {colors[-1]}")
print(f"Second to last (index -2): {colors[-2]}")
print(f"First color (index -5): {colors[-5]}")

# Using variables as indices
first_index = 0
last_index = len(colors) - 1
print(f"\nUsing variables as indices:")
print(f"First: {colors[first_index]}")
print(f"Last: {colors[last_index]}")

### Index Visualization
Here's how list indexing works:

```
colors = ["red", "blue", "green", "yellow", "purple"]
           0      1       2        3         4      ← positive indices
          -5     -4      -3       -2        -1      ← negative indices
```

### Challenge 1: List Access Practice

In [None]:
# Practice with a list of your favorite subjects
subjects = ["Math", "Science", "English", "History", "Art", "Music", "PE"]

print("School subjects:", subjects)
print(f"Total subjects: {len(subjects)}")

# Your tasks - complete these:
print(f"\nFirst subject: {subjects[0]}")
print(f"Last subject: {subjects[-1]}")
print(f"Middle subject: {subjects[len(subjects)//2]}")

# Find your favorite and least favorite
favorite_index = 1  # Change this to your actual favorite
least_favorite_index = -2  # Change this to your actual least favorite

print(f"\nMy favorite subject: {subjects[favorite_index]}")
print(f"My least favorite subject: {subjects[least_favorite_index]}")

# What happens if you try to access an index that doesn't exist?
# print(subjects[10])  # This would cause an IndexError - don't uncomment unless you want to see the error!
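
If you'd rather avoid the `IndexError` than trigger it, one option is to compare the index against `len()` before using it. This is only a sketch; `requested_index` and `result` are hypothetical names chosen for the example:

```python
subjects = ["Math", "Science", "English", "History", "Art", "Music", "PE"]
requested_index = 10  # hypothetical index, deliberately out of range

# Valid positions run from -len(subjects) up to len(subjects) - 1
if -len(subjects) <= requested_index < len(subjects):
    result = subjects[requested_index]
else:
    result = f"Index {requested_index} is out of range for {len(subjects)} items"

print(result)
```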

## Modifying Lists: Adding, Changing, and Removing Items

Unlike strings, lists are **mutable** - you can change them after creation:

In [None]:
# Start with a shopping list
shopping_list = ["milk", "bread", "eggs"]
print("Original shopping list:", shopping_list)

# Add items to the end
shopping_list.append("butter")
print("After adding butter:", shopping_list)

shopping_list.append("cheese")
print("After adding cheese:", shopping_list)

# Insert item at specific position
shopping_list.insert(1, "yogurt")
print("After inserting yogurt at position 1:", shopping_list)

# Change an existing item
shopping_list[2] = "whole wheat bread"
print("After changing bread:", shopping_list)

# Remove items by value
shopping_list.remove("eggs")
print("After removing eggs:", shopping_list)

# Remove items by index
removed_item = shopping_list.pop(0)
print(f"Removed '{removed_item}' from position 0:", shopping_list)

# Remove the last item
last_item = shopping_list.pop()
print(f"Removed last item '{last_item}':", shopping_list)
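
To see what "mutable" means in practice, it may help to compare a list with a string side by side. A small sketch with made-up data (`word`, `letters`, and `new_word` are illustrative names):

```python
word = "cat"
letters = ["c", "a", "t"]

# Lists can be changed in place
letters[0] = "b"

# Strings cannot: word[0] = "b" would raise a TypeError,
# so string methods return a new string instead
new_word = word.replace("c", "b")

print(letters)   # the list itself changed
print(word)      # the original string is untouched
print(new_word)  # the changed text lives in a new string
```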

### More List Operations

In [None]:
# Working with a list of grades
grades = [85, 92, 78, 92, 88, 95, 78]
print("Original grades:", grades)

# Count how many times a value appears
count_92 = grades.count(92)
count_78 = grades.count(78)
print(f"\nNumber of 92s: {count_92}")
print(f"Number of 78s: {count_78}")

# Find the position of a value
position_of_95 = grades.index(95)
print(f"Position of first 95: {position_of_95}")

# Sort the list
grades.sort()
print(f"Sorted grades: {grades}")

# Reverse the list
grades.reverse()
print(f"Reversed grades: {grades}")

# Copy a list
grades_copy = grades.copy()
print(f"Copy of grades: {grades_copy}")

# Clear all items
grades_copy.clear()
print(f"After clearing copy: {grades_copy}")
print(f"Original still intact: {grades}")
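
Two details of these operations are easy to miss: assigning a list to a second variable does not copy it, and `.sort()` changes the list in place, while the built-in `sorted()` returns a new list. A short sketch with illustrative values (`alias`, `independent`, and `ordered` are names invented for this example):

```python
grades = [85, 92, 78]

alias = grades               # a second name for the SAME list
independent = grades.copy()  # a new list with the same items

grades.append(100)

print(alias)        # shows the appended 100 too
print(independent)  # unchanged

# sorted() builds a new list and leaves the original order alone
ordered = sorted(grades)
print(grades)
print(ordered)
```

Use `.copy()` whenever you want to change one list without affecting the other.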

## Introduction to Loops: Automating Repetitive Tasks

Now comes the real power! Instead of processing each list item individually, we can use **loops** to automate repetitive tasks:

In [None]:
# Without loops - tedious and doesn't scale
students = ["Alice", "Bob", "Charlie", "Diana"]

print("Without loops (the hard way):")
print(f"Welcome, {students[0]}!")
print(f"Welcome, {students[1]}!")
print(f"Welcome, {students[2]}!")
print(f"Welcome, {students[3]}!")

print("\n" + "="*30 + "\n")

# With loops - elegant and scales to any size
print("With loops (the smart way):")
for student in students:
    print(f"Welcome, {student}!")

# Add more students and the loop still works!
students.extend(["Eve", "Frank", "Grace"])
print("\nAfter adding more students:")
for student in students:
    print(f"Welcome, {student}!")

### For Loop Syntax

The basic structure of a for loop is:
```python
for item in collection:
    # do something with item
```

Let's explore different ways to use for loops:

In [None]:
# Loop through numbers
scores = [85, 92, 78, 96, 88]

print("Processing test scores:")
total_points = 0
for score in scores:
    print(f"Score: {score}")
    total_points += score
    
    # Give feedback based on score
    if score >= 90:
        print("  Excellent! 🌟")
    elif score >= 80:
        print("  Good job! 👍")
    else:
        print("  Keep working! 📚")

average = total_points / len(scores)
print(f"\nTotal points: {total_points}")
print(f"Average score: {average:.1f}")

### Loops with Indices: When You Need Position Information

In [None]:
# Sometimes you need both the item and its position
subjects = ["Math", "Science", "English", "History"]
grades = ["A", "B+", "A-", "B"]

print("Grade Report:")
print("=" * 25)

# Method 1: Using range(len(list))
for i in range(len(subjects)):
    subject = subjects[i]
    grade = grades[i]
    print(f"{i+1:2}. {subject:<10}: {grade}")

print("\nOr using enumerate():")
print("=" * 25)

# Method 2: Using enumerate() - more Pythonic
for i, subject in enumerate(subjects):
    grade = grades[i]
    print(f"{i+1:2}. {subject:<10}: {grade}")
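
Both methods above still look up `grades[i]` by hand. As a possible third variant, `enumerate()` accepts a `start` argument, and `zip()` (introduced later in this lesson) pairs the two lists so that no index lookup is needed. The `report_lines` list is an assumption added here only to collect the output:

```python
subjects = ["Math", "Science", "English", "History"]
grades = ["A", "B+", "A-", "B"]

report_lines = []
# start=1 makes the counter begin at 1 instead of 0
for number, (subject, grade) in enumerate(zip(subjects, grades), start=1):
    report_lines.append(f"{number:2}. {subject:<10}: {grade}")

for line in report_lines:
    print(line)
```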

### Challenge 2: Temperature Converter

In [None]:
# Temperature data in Fahrenheit
temps_fahrenheit = [32, 68, 86, 104, 50, 75, 91]
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

print("🌡️  WEEKLY TEMPERATURE REPORT")
print("=" * 40)

# Convert each temperature to Celsius and provide commentary
celsius_temps = []
for i in range(len(temps_fahrenheit)):
    day = days[i]
    temp_f = temps_fahrenheit[i]
    temp_c = (temp_f - 32) * 5/9  # Conversion formula
    celsius_temps.append(temp_c)
    
    # Provide weather commentary
    if temp_f >= 90:
        comment = "🔥 Very hot!"
    elif temp_f >= 75:
        comment = "☀️ Warm"
    elif temp_f >= 60:
        comment = "😊 Pleasant"
    elif temp_f >= 40:
        comment = "🧥 Cool"
    else:
        comment = "❄️ Cold"
    
    print(f"{day:<10}: {temp_f:3}°F ({temp_c:4.1f}°C) {comment}")

# Calculate weekly statistics
avg_f = sum(temps_fahrenheit) / len(temps_fahrenheit)
avg_c = sum(celsius_temps) / len(celsius_temps)
max_temp = max(temps_fahrenheit)
min_temp = min(temps_fahrenheit)

print("\n📊 WEEKLY SUMMARY:")
print(f"Average: {avg_f:.1f}°F ({avg_c:.1f}°C)")
print(f"Highest: {max_temp}°F")
print(f"Lowest: {min_temp}°F")

# Count pleasant days (60-80°F)
pleasant_days = 0
for temp in temps_fahrenheit:
    if 60 <= temp <= 80:
        pleasant_days += 1

print(f"Pleasant days (60-80°F): {pleasant_days} out of {len(temps_fahrenheit)}")

## Combining Lists, Loops, and Logic

The real power comes from combining everything you've learned:

In [None]:
# Student performance analysis system
student_names = ["Alice", "Bob", "Charlie", "Diana", "Eve"]
test_scores = [95, 67, 88, 92, 78]
homework_scores = [98, 75, 85, 94, 82]
attendance_rates = [98, 85, 92, 96, 88]  # Percentage

print("📚 STUDENT PERFORMANCE ANALYSIS")
print("=" * 50)

# Process each student
honor_roll_students = []
at_risk_students = []
class_averages = []

for i in range(len(student_names)):
    name = student_names[i]
    test = test_scores[i]
    homework = homework_scores[i]
    attendance = attendance_rates[i]
    
    # Calculate weighted average (tests 60%, homework 30%, attendance 10%)
    weighted_average = (test * 0.6) + (homework * 0.3) + (attendance * 0.1)
    class_averages.append(weighted_average)
    
    # Determine letter grade
    if weighted_average >= 90:
        letter_grade = "A"
    elif weighted_average >= 80:
        letter_grade = "B"
    elif weighted_average >= 70:
        letter_grade = "C"
    elif weighted_average >= 60:
        letter_grade = "D"
    else:
        letter_grade = "F"
    
    # Check for honors and at-risk status
    if weighted_average >= 85 and attendance >= 95:
        honor_roll_students.append(name)
        status = "🌟 Honor Roll"
    elif weighted_average < 70 or attendance < 85:
        at_risk_students.append(name)
        status = "⚠️ At Risk"
    else:
        status = "✅ Good Standing"
    
    # Print individual report
    print(f"\n👤 {name}:")
    print(f"   Test: {test:2}% | Homework: {homework:2}% | Attendance: {attendance:2}%")
    print(f"   Average: {weighted_average:5.1f}% | Grade: {letter_grade} | {status}")
    
    # Specific recommendations
    recommendations = []
    if test < 75:
        recommendations.append("Focus on test preparation")
    if homework < 80:
        recommendations.append("Complete all homework assignments")
    if attendance < 90:
        recommendations.append("Improve attendance")
    
    if recommendations:
        print(f"   💡 Recommendations: {', '.join(recommendations)}")

# Class summary
print("\n" + "=" * 50)
print("📊 CLASS SUMMARY")

class_average = sum(class_averages) / len(class_averages)
highest_average = max(class_averages)
lowest_average = min(class_averages)

print(f"\nClass average: {class_average:.1f}%")
print(f"Highest average: {highest_average:.1f}%")
print(f"Lowest average: {lowest_average:.1f}%")

print(f"\n🌟 Honor Roll ({len(honor_roll_students)} students):")
if honor_roll_students:
    for student in honor_roll_students:
        print(f"   • {student}")
else:
    print("   None this semester")

print(f"\n⚠️ At Risk ({len(at_risk_students)} students):")
if at_risk_students:
    for student in at_risk_students:
        print(f"   • {student}")
else:
    print("   None - great job!")

# Grade distribution
grade_counts = {"A": 0, "B": 0, "C": 0, "D": 0, "F": 0}
for avg in class_averages:
    if avg >= 90:
        grade_counts["A"] += 1
    elif avg >= 80:
        grade_counts["B"] += 1
    elif avg >= 70:
        grade_counts["C"] += 1
    elif avg >= 60:
        grade_counts["D"] += 1
    else:
        grade_counts["F"] += 1

print(f"\n📈 GRADE DISTRIBUTION:")
for grade, count in grade_counts.items():
    if count > 0:
        percentage = (count / len(student_names)) * 100
        print(f"   {grade}: {count} students ({percentage:.0f}%)")
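
Notice that the letter-grade `if`/`elif` chain appears twice in the program above. One way to avoid the repetition, sketched here with rounded, illustrative averages rather than the exact values computed above, is to store each letter in a list once and then count with `.count()`:

```python
# Illustrative weighted averages (rounded stand-ins for class_averages)
class_averages = [96.2, 71.2, 87.5, 93.0, 80.2]

# Decide each letter grade once and remember it
letter_grades = []
for avg in class_averages:
    if avg >= 90:
        letter = "A"
    elif avg >= 80:
        letter = "B"
    elif avg >= 70:
        letter = "C"
    elif avg >= 60:
        letter = "D"
    else:
        letter = "F"
    letter_grades.append(letter)

# Count each letter with the list's own count() method
for letter in ["A", "B", "C", "D", "F"]:
    count = letter_grades.count(letter)
    if count > 0:
        percentage = count / len(letter_grades) * 100
        print(f"   {letter}: {count} students ({percentage:.0f}%)")
```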

## List Slicing: Working with Portions of Lists

Sometimes you need just part of a list:

In [None]:
# Daily sales data for a week
daily_sales = [120, 95, 180, 165, 210, 245, 198]
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

print("📊 WEEKLY SALES ANALYSIS")
print("=" * 30)

print("Full week sales:", daily_sales)

# Slice different parts of the week
weekday_sales = daily_sales[0:5]  # Monday through Friday
weekend_sales = daily_sales[5:7]  # Saturday and Sunday
first_half = daily_sales[:4]      # First 4 days
second_half = daily_sales[3:]     # Last 4 days

print(f"\nWeekday sales (Mon-Fri): {weekday_sales}")
print(f"Weekend sales (Sat-Sun): {weekend_sales}")
print(f"First half of week: {first_half}")
print(f"Second half of week: {second_half}")

# Calculate averages for different periods
weekday_avg = sum(weekday_sales) / len(weekday_sales)
weekend_avg = sum(weekend_sales) / len(weekend_sales)

print(f"\n📈 ANALYSIS:")
print(f"Weekday average: ${weekday_avg:.2f}")
print(f"Weekend average: ${weekend_avg:.2f}")

if weekend_avg > weekday_avg:
    print("💡 Weekends are more profitable!")
    difference = weekend_avg - weekday_avg
    print(f"   Weekend sales are ${difference:.2f} higher on average")
else:
    print("💡 Weekdays are more profitable!")
    difference = weekday_avg - weekend_avg
    print(f"   Weekday sales are ${difference:.2f} higher on average")

# Find best and worst days
best_day_index = daily_sales.index(max(daily_sales))
worst_day_index = daily_sales.index(min(daily_sales))

print(f"\n🏆 Best day: {days[best_day_index]} (${daily_sales[best_day_index]})")
print(f"📉 Worst day: {days[worst_day_index]} (${daily_sales[worst_day_index]})")
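
Slices also accept negative positions and an optional step, and a slice is always a new list. A brief sketch using the same sales figures (the variable names other than `daily_sales` are made up for this example):

```python
daily_sales = [120, 95, 180, 165, 210, 245, 198]

last_two = daily_sales[-2:]        # the final two items
every_other = daily_sales[::2]     # step of 2: every second item
reversed_week = daily_sales[::-1]  # step of -1: a reversed copy

print(last_two)
print(every_other)
print(reversed_week)

# Changing a slice does not change the original list
weekday_copy = daily_sales[:5]
weekday_copy[0] = 0
print(daily_sales[0])  # still 120
```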

## Practical Applications: Real-World Problems

Let's solve some real-world problems using lists and loops:

### Application 1: Survey Data Processor

In [None]:
# Survey: "What's your favorite social media platform?"
responses = ["Instagram", "TikTok", "Instagram", "Snapchat", "TikTok", "YouTube", 
             "Instagram", "TikTok", "Discord", "Instagram", "YouTube", "TikTok", 
             "Snapchat", "Instagram", "Discord", "YouTube", "TikTok", "Instagram"]

print("📱 SOCIAL MEDIA SURVEY ANALYSIS")
print("=" * 40)
print(f"Total responses: {len(responses)}")

# Find unique platforms
unique_platforms = []
for response in responses:
    if response not in unique_platforms:
        unique_platforms.append(response)

print(f"Platforms mentioned: {unique_platforms}")

# Count votes for each platform
print("\n📊 VOTE BREAKDOWN:")
vote_data = []  # Will store (platform, count, percentage) tuples

for platform in unique_platforms:
    count = responses.count(platform)
    percentage = (count / len(responses)) * 100
    vote_data.append((platform, count, percentage))
    
    # Create a simple bar chart with block characters
    bar = "█" * count
    print(f"{platform:<12}: {count:2} votes ({percentage:4.1f}%) {bar}")

# Find winner
max_votes = 0
winner = ""
for platform, count, percentage in vote_data:
    if count > max_votes:
        max_votes = count
        winner = platform

print(f"\n🏆 WINNER: {winner} with {max_votes} votes!")

# Find platforms with significant support (>=15%)
popular_platforms = []
for platform, count, percentage in vote_data:
    if percentage >= 15:
        popular_platforms.append(platform)

print(f"\n⭐ Platforms with significant support (≥15%): {popular_platforms}")

# Demographics insight (simulated)
teen_favorites = ["TikTok", "Instagram", "Snapchat"]
teen_votes = 0
for response in responses:
    if response in teen_favorites:
        teen_votes += 1

teen_percentage = (teen_votes / len(responses)) * 100
print(f"\n👥 Teen-popular platforms represent {teen_percentage:.1f}% of votes")
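
The unique-then-count approach above is good practice with loops. For comparison, Python's standard library also offers `collections.Counter`, which does the tallying in one step. This sketch is an alternative rather than part of the lesson's required toolkit, and it uses a shortened, hypothetical sample of responses:

```python
from collections import Counter

# Shortened, hypothetical sample of survey responses
responses = ["Instagram", "TikTok", "Instagram", "Snapchat", "TikTok", "Instagram"]

vote_counts = Counter(responses)  # maps each platform to its number of votes

# most_common(1) returns a list with one (platform, count) pair
winner, max_votes = vote_counts.most_common(1)[0]

for platform, count in vote_counts.most_common():
    print(f"{platform:<12}: {count} votes")
print(f"Winner: {winner} with {max_votes} votes")
```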

### Application 2: Fitness Tracker Analysis

In [None]:
# Two weeks of daily step counts
week1_steps = [8500, 12000, 6800, 9200, 11500, 15000, 7200]
week2_steps = [9100, 10800, 7500, 8900, 12200, 14500, 8800]
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Daily goal
daily_goal = 10000

print("👟 FITNESS TRACKER ANALYSIS")
print("=" * 35)
print(f"Daily step goal: {daily_goal:,} steps")

# Analyze each week
weeks_data = [("Week 1", week1_steps), ("Week 2", week2_steps)]
all_weekly_totals = []

for week_name, steps in weeks_data:
    print(f"\n📅 {week_name.upper()}:")
    print("-" * 20)
    
    weekly_total = 0
    goal_days = 0
    
    for i in range(len(days)):
        day = days[i]
        daily_steps = steps[i]
        weekly_total += daily_steps
        
        # Check if goal was met
        if daily_steps >= daily_goal:
            goal_days += 1
            status = "✅"
        else:
            shortage = daily_goal - daily_steps
            status = f"❌ (-{shortage:,})"
        
        print(f"{day}: {daily_steps:6,} steps {status}")
    
    # Weekly statistics
    daily_average = weekly_total / len(days)
    all_weekly_totals.append(weekly_total)
    
    print(f"\n📊 {week_name} Summary:")
    print(f"   Total steps: {weekly_total:,}")
    print(f"   Daily average: {daily_average:,.0f}")
    print(f"   Goal days: {goal_days}/7 ({goal_days/7*100:.1f}%)")
    print(f"   Best day: {max(steps):,} steps")
    print(f"   Worst day: {min(steps):,} steps")

# Two-week comparison
print("\n" + "=" * 35)
print("📈 TWO-WEEK COMPARISON")

total_steps = sum(all_weekly_totals)
overall_average = total_steps / 14
week1_avg = sum(week1_steps) / len(week1_steps)
week2_avg = sum(week2_steps) / len(week2_steps)

print(f"\nTotal steps (2 weeks): {total_steps:,}")
print(f"Overall daily average: {overall_average:,.0f}")

if week2_avg > week1_avg:
    improvement = week2_avg - week1_avg
    print(f"\n📈 Improvement! Week 2 average was {improvement:,.0f} steps higher")
    print("   Keep up the great work! 🎉")
elif week1_avg > week2_avg:
    decline = week1_avg - week2_avg
    print(f"\n📉 Week 2 average was {decline:,.0f} steps lower")
    print("   Let's get back on track! 💪")
else:
    print("\n➡️ Consistent performance between weeks")

# Goal achievement analysis
all_steps = week1_steps + week2_steps
total_goal_days = 0
for daily_steps in all_steps:
    if daily_steps >= daily_goal:
        total_goal_days += 1

goal_percentage = (total_goal_days / 14) * 100
print(f"\n🎯 Goal Achievement: {total_goal_days}/14 days ({goal_percentage:.1f}%)")

if goal_percentage >= 80:
    print("   🌟 Excellent consistency!")
elif goal_percentage >= 60:
    print("   👍 Good progress, room for improvement")
else:
    print("   💪 Let's focus on reaching that daily goal!")

# Find patterns
weekend_steps = []
weekday_steps = []

for i in range(len(days)):
    if days[i] in ["Sat", "Sun"]:
        weekend_steps.extend([week1_steps[i], week2_steps[i]])
    else:
        weekday_steps.extend([week1_steps[i], week2_steps[i]])

weekend_avg = sum(weekend_steps) / len(weekend_steps)
weekday_avg = sum(weekday_steps) / len(weekday_steps)

print(f"\n🔍 PATTERNS:")
print(f"Weekday average: {weekday_avg:,.0f} steps")
print(f"Weekend average: {weekend_avg:,.0f} steps")

if weekend_avg > weekday_avg:
    print("   🌟 More active on weekends!")
else:
    print("   💼 More active on weekdays!")
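
The comparison above works on weekly averages. To see the change for each individual day, the two weeks can be walked in parallel with `zip()` (covered later in this lesson). A small sketch; `improved_days` and `change` are names introduced only for this example:

```python
week1_steps = [8500, 12000, 6800, 9200, 11500, 15000, 7200]
week2_steps = [9100, 10800, 7500, 8900, 12200, 14500, 8800]
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

improved_days = []
for day, first, second in zip(days, week1_steps, week2_steps):
    change = second - first
    if change > 0:
        improved_days.append(day)
    # The "+" in the format always shows the sign; "," adds separators
    print(f"{day}: {change:+,} steps")

print(f"Days with more steps in week 2: {improved_days}")
```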

### Challenge 3: Music Playlist Analyzer

In [None]:
# Playlist data: (song_title, artist, duration_seconds, genre, play_count)
playlist = [
    ("Blinding Lights", "The Weeknd", 200, "Pop", 45),
    ("Watermelon Sugar", "Harry Styles", 174, "Pop", 38),
    ("Good 4 U", "Olivia Rodrigo", 178, "Pop", 52),
    ("Industry Baby", "Lil Nas X", 212, "Hip-Hop", 41),
    ("Heat Waves", "Glass Animals", 238, "Indie", 67),
    ("Stay", "The Kid LAROI", 141, "Pop", 33),
    ("Levitating", "Dua Lipa", 203, "Pop", 29),
    ("Montero", "Lil Nas X", 137, "Hip-Hop", 24),
    ("Peaches", "Justin Bieber", 197, "Pop", 31),
    ("drivers license", "Olivia Rodrigo", 242, "Pop", 56)
]

print("🎵 MUSIC PLAYLIST ANALYZER")
print("=" * 40)
print(f"Playlist contains {len(playlist)} songs")

# Analyze each song
total_duration = 0
total_plays = 0
genres = []
artists = []

print("\n📑 SONG LIST:")
for i, (title, artist, duration, genre, plays) in enumerate(playlist, 1):
    minutes = duration // 60
    seconds = duration % 60
    total_duration += duration
    total_plays += plays
    
    # Collect unique genres and artists
    if genre not in genres:
        genres.append(genre)
    if artist not in artists:
        artists.append(artist)
    
    # Popularity indicator
    if plays >= 50:
        popularity = "🔥 Hot"
    elif plays >= 35:
        popularity = "⭐ Popular"
    else:
        popularity = "📻 Regular"
    
    print(f"{i:2}. {title:<20} - {artist:<15} [{minutes}:{seconds:02d}] {popularity}")

# Calculate totals
total_minutes = total_duration // 60
total_hours = total_minutes // 60
remaining_minutes = total_minutes % 60

print(f"\n⏱️ DURATION ANALYSIS:")
print(f"Total playlist time: {total_hours}h {remaining_minutes}m")
print(f"Average song length: {total_duration/len(playlist):.0f} seconds")
print(f"Total plays across all songs: {total_plays:,}")
print(f"Average plays per song: {total_plays/len(playlist):.1f}")

# Genre analysis
print(f"\n🎼 GENRE BREAKDOWN:")
for genre in genres:
    genre_count = 0
    genre_plays = 0
    
    for title, artist, duration, song_genre, plays in playlist:
        if song_genre == genre:
            genre_count += 1
            genre_plays += plays
    
    percentage = (genre_count / len(playlist)) * 100
    avg_plays = genre_plays / genre_count if genre_count > 0 else 0
    
    print(f"{genre:<10}: {genre_count} songs ({percentage:.1f}%) - Avg plays: {avg_plays:.1f}")

# Find top performers
print(f"\n🏆 TOP PERFORMERS:")

# Most played song
max_plays = 0
most_played_song = ""
most_played_artist = ""

for title, artist, duration, genre, plays in playlist:
    if plays > max_plays:
        max_plays = plays
        most_played_song = title
        most_played_artist = artist

print(f"Most played: '{most_played_song}' by {most_played_artist} ({max_plays} plays)")

# Longest and shortest songs
longest_duration = 0
shortest_duration = float('inf')
longest_song = ""
shortest_song = ""

for title, artist, duration, genre, plays in playlist:
    if duration > longest_duration:
        longest_duration = duration
        longest_song = title
    if duration < shortest_duration:
        shortest_duration = duration
        shortest_song = title

print(f"Longest song: '{longest_song}' ({longest_duration//60}:{longest_duration%60:02d})")
print(f"Shortest song: '{shortest_song}' ({shortest_duration//60}:{shortest_duration%60:02d})")

# Artist analysis
print(f"\n👤 ARTIST ANALYSIS:")
artist_stats = []

for artist in artists:
    song_count = 0
    total_artist_plays = 0
    
    for title, song_artist, duration, genre, plays in playlist:
        if song_artist == artist:
            song_count += 1
            total_artist_plays += plays
    
    artist_stats.append((artist, song_count, total_artist_plays))

# Sort by total plays
for i in range(len(artist_stats)):
    for j in range(i + 1, len(artist_stats)):
        if artist_stats[i][2] < artist_stats[j][2]:
            artist_stats[i], artist_stats[j] = artist_stats[j], artist_stats[i]

print("Top artists by total plays:")
for artist, song_count, total_artist_plays in artist_stats[:5]:
    avg_per_song = total_artist_plays / song_count
    print(f"  {artist:<15}: {song_count} song(s), {total_artist_plays} total plays ({avg_per_song:.1f} avg)")

# Recommendations
print(f"\n💡 RECOMMENDATIONS:")

# Find underperforming songs
underperformed = []
avg_plays = total_plays / len(playlist)

for title, artist, duration, genre, plays in playlist:
    if plays < avg_plays * 0.7:  # Less than 70% of average
        underperformed.append((title, artist, plays))

if underperformed:
    print(f"Songs that might need more attention:")
    for title, artist, plays in underperformed:
        print(f"  • '{title}' by {artist} ({plays} plays)")

# Genre recommendations
most_popular_genre = ""
max_genre_plays = 0

for genre in genres:
    genre_total_plays = 0
    for title, artist, duration, song_genre, plays in playlist:
        if song_genre == genre:
            genre_total_plays += plays
    
    if genre_total_plays > max_genre_plays:
        max_genre_plays = genre_total_plays
        most_popular_genre = genre

print(f"\nConsider adding more {most_popular_genre} songs - they're your most played genre!")
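
The nested loops that sort `artist_stats` show how sorting works under the hood. In everyday code, the built-in `sorted()` with a `key` function usually does the same job. The sketch below uses a subset of the artist totals from the playlist, and the helper `total_plays_of` is a hypothetical name:

```python
# (artist, song_count, total_plays) for a subset of the playlist's artists
artist_stats = [
    ("The Weeknd", 1, 45),
    ("Olivia Rodrigo", 2, 108),
    ("Lil Nas X", 2, 65),
    ("Glass Animals", 1, 67),
]

def total_plays_of(stat):
    """Return the value to sort by: the third item of the tuple."""
    return stat[2]

# reverse=True puts the largest totals first
ranked = sorted(artist_stats, key=total_plays_of, reverse=True)

for artist, song_count, total_artist_plays in ranked:
    print(f"  {artist:<15}: {song_count} song(s), {total_artist_plays} total plays")
```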

## Advanced List Techniques

Let's explore some powerful list manipulation techniques:

In [None]:
# List comprehensions - a concise way to create lists
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print("Original numbers:", numbers)

# Traditional way to create a list of squares
squares_traditional = []
for num in numbers:
    squares_traditional.append(num ** 2)

print("Squares (traditional):", squares_traditional)

# List comprehension way (more advanced - preview of what's possible)
squares_comprehension = [num ** 2 for num in numbers]
print("Squares (comprehension):", squares_comprehension)

# Filtering with conditions
# Traditional way
even_numbers = []
for num in numbers:
    if num % 2 == 0:
        even_numbers.append(num)

print("\nEven numbers (traditional):", even_numbers)

# List comprehension with condition (advanced)
even_comprehension = [num for num in numbers if num % 2 == 0]
print("Even numbers (comprehension):", even_comprehension)

# Working with multiple lists
names = ["Alice", "Bob", "Charlie"]
ages = [16, 17, 15]
grades = ["A", "B+", "A-"]

print("\nStudent information combined:")
for i in range(len(names)):
    print(f"{names[i]} (age {ages[i]}): {grades[i]}")

# Using zip() to combine lists (advanced)
print("\nUsing zip():")
for name, age, grade in zip(names, ages, grades):
    print(f"{name} (age {age}): {grade}")
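
One behavior of `zip()` worth knowing: it stops at the shortest list without raising an error. A minimal sketch, assuming a hypothetical `ages` list with one value missing:

```python
names = ["Alice", "Bob", "Charlie"]
ages = [16, 17]  # hypothetical: Charlie's age was never entered

pairs = list(zip(names, ages))
print(pairs)  # Charlie is silently left out

# A simple length check catches the mismatch early
if len(names) != len(ages):
    print("Warning: the lists have different lengths")
```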

## Common Patterns and Best Practices

Here are important patterns you'll use frequently:

In [None]:
# Pattern 1: Accumulating/Aggregating data
prices = [12.99, 25.50, 8.75, 45.00, 15.25]

total = 0
count = 0
expensive_items = 0

for price in prices:
    total += price           # Accumulate sum
    count += 1              # Count items
    if price > 20:          # Count expensive items
        expensive_items += 1

average = total / count
print(f"Total: ${total:.2f}")
print(f"Average: ${average:.2f}")
print(f"Expensive items (>$20): {expensive_items}")

# Pattern 2: Finding extremes
scores = [85, 92, 78, 96, 88, 74, 91]

highest_score = scores[0]  # Start with first item
lowest_score = scores[0]
highest_index = 0
lowest_index = 0

for i in range(1, len(scores)):  # Start from index 1
    if scores[i] > highest_score:
        highest_score = scores[i]
        highest_index = i
    if scores[i] < lowest_score:
        lowest_score = scores[i]
        lowest_index = i

print(f"\nHighest score: {highest_score} at position {highest_index}")
print(f"Lowest score: {lowest_score} at position {lowest_index}")

# Pattern 3: Building new lists based on conditions
all_grades = ["A", "B", "C", "A", "F", "B", "A", "D"]

passing_grades = []
honor_grades = []

for grade in all_grades:
    if grade != "F":
        passing_grades.append(grade)
    if grade in ["A", "B"]:
        honor_grades.append(grade)

print(f"\nAll grades: {all_grades}")
print(f"Passing grades: {passing_grades}")
print(f"Honor grades: {honor_grades}")

# Pattern 4: Checking conditions across entire list
attendance_rates = [95, 88, 92, 85, 97, 90, 86]
required_rate = 90

all_meet_requirement = True
any_meet_requirement = False
students_meeting = 0

for rate in attendance_rates:
    if rate >= required_rate:
        any_meet_requirement = True
        students_meeting += 1
    else:
        all_meet_requirement = False

print(f"\nAttendance rates: {attendance_rates}")
print(f"All students meet 90% requirement: {all_meet_requirement}")
print(f"Any students meet 90% requirement: {any_meet_requirement}")
print(f"Students meeting requirement: {students_meeting}/{len(attendance_rates)}")
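
Pattern 4 is common enough that Python has built-ins for it: `all()` and `any()`. The sketch below should give the same answers as the loop above; `all_meet` and `any_meet` are shortened names used only here:

```python
attendance_rates = [95, 88, 92, 85, 97, 90, 86]
required_rate = 90

# all() is True only if every rate passes; any() is True if at least one does
all_meet = all(rate >= required_rate for rate in attendance_rates)
any_meet = any(rate >= required_rate for rate in attendance_rates)

# Count the passing rates by adding 1 for each one that qualifies
students_meeting = sum(1 for rate in attendance_rates if rate >= required_rate)

print(f"All meet requirement: {all_meet}")
print(f"Any meet requirement: {any_meet}")
print(f"Students meeting requirement: {students_meeting}/{len(attendance_rates)}")
```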

## Final Challenge: Complete Data Analysis System

Put everything together in this comprehensive challenge:

In [None]:
# E-commerce sales data analysis
# Data format: (product_name, category, price, units_sold, customer_rating)
sales_data = [
    ("iPhone 14", "Electronics", 899.99, 150, 4.5),
    ("Nike Air Max", "Clothing", 129.99, 89, 4.2),
    ("MacBook Pro", "Electronics", 1299.99, 75, 4.7),
    ("Adidas Hoodie", "Clothing", 59.99, 120, 4.0),
    ("Samsung TV", "Electronics", 549.99, 65, 4.3),
    ("Levi's Jeans", "Clothing", 79.99, 95, 4.1),
    ("AirPods Pro", "Electronics", 249.99, 200, 4.6),
    ("Patagonia Jacket", "Clothing", 189.99, 45, 4.8),
    ("iPad Air", "Electronics", 599.99, 110, 4.4),
    ("Converse Shoes", "Clothing", 89.99, 78, 3.9)
]

print("🛒 E-COMMERCE SALES ANALYSIS")
print("=" * 50)
print(f"Analyzing {len(sales_data)} products...")

# Initialize tracking variables
total_revenue = 0
total_units = 0
categories = []
high_rated_products = []  # Rating >= 4.5
bestsellers = []  # Units sold >= 100
premium_products = []  # Price >= 500

# Process each product
print("\n📊 PRODUCT PERFORMANCE:")
print("-" * 70)
print(f"{'Product':<20} {'Category':<12} {'Price':>8} {'Units':>6} {'Rating':>7} {'Revenue':>10}")
print("-" * 70)

for product_name, category, price, units, rating in sales_data:
    revenue = price * units
    total_revenue += revenue
    total_units += units
    
    # Track unique categories
    if category not in categories:
        categories.append(category)
    
    # Categorize products
    if rating >= 4.5:
        high_rated_products.append(product_name)
    if units >= 100:
        bestsellers.append(product_name)
    if price >= 500:
        premium_products.append(product_name)
    
    # Performance indicators
    indicators = []
    if rating >= 4.5:
        indicators.append("⭐")
    if units >= 100:
        indicators.append("🔥")
    if price >= 500:
        indicators.append("💎")
    
    indicator_str = "".join(indicators)
    
    # Column widths match the header row (8, 6, 7, 10) so the table lines up
    print(f"{product_name:<20} {category:<12} ${price:7.2f} {units:6} {rating:7.1f} ${revenue:9.2f} {indicator_str}")

# Overall statistics
average_price = sum(price for _, _, price, _, _ in sales_data) / len(sales_data)
average_rating = sum(rating for _, _, _, _, rating in sales_data) / len(sales_data)
average_units = total_units / len(sales_data)

print("\n" + "=" * 50)
print("📈 OVERALL STATISTICS")
print(f"Total Revenue: ${total_revenue:,.2f}")
print(f"Total Units Sold: {total_units:,}")
print(f"Average Price: ${average_price:.2f}")
print(f"Average Rating: {average_rating:.2f}/5.0")
print(f"Average Units per Product: {average_units:.1f}")

# Category analysis
print("\n🏷️ CATEGORY BREAKDOWN:")
for category in categories:
    category_revenue = 0
    category_units = 0
    category_products = 0
    category_ratings = []
    
    for product_name, prod_category, price, units, rating in sales_data:
        if prod_category == category:
            category_revenue += price * units
            category_units += units
            category_products += 1
            category_ratings.append(rating)
    
    avg_category_rating = sum(category_ratings) / len(category_ratings)
    revenue_percentage = (category_revenue / total_revenue) * 100
    
    print(f"\n{category}:")
    print(f"  Products: {category_products}")
    print(f"  Revenue: ${category_revenue:,.2f} ({revenue_percentage:.1f}% of total)")
    print(f"  Units Sold: {category_units:,}")
    print(f"  Avg Rating: {avg_category_rating:.2f}")

# Find top performers
print("\n🏆 TOP PERFORMERS:")

# Highest revenue product
max_revenue = 0
top_revenue_product = ""
for product_name, category, price, units, rating in sales_data:
    revenue = price * units
    if revenue > max_revenue:
        max_revenue = revenue
        top_revenue_product = product_name

# Highest rated product
max_rating = 0
top_rated_product = ""
for product_name, category, price, units, rating in sales_data:
    if rating > max_rating:
        max_rating = rating
        top_rated_product = product_name

# Best selling product
max_units = 0
bestselling_product = ""
for product_name, category, price, units, rating in sales_data:
    if units > max_units:
        max_units = units
        bestselling_product = product_name

print(f"💰 Highest Revenue: {top_revenue_product} (${max_revenue:,.2f})")
print(f"⭐ Highest Rated: {top_rated_product} ({max_rating}/5.0)")
print(f"🔥 Best Selling: {bestselling_product} ({max_units:,} units)")

# Special categories
print("\n🎯 SPECIAL CATEGORIES:")
print(f"⭐ High-Rated Products (≥4.5): {len(high_rated_products)}")
if high_rated_products:
    for product in high_rated_products:
        print(f"   • {product}")

print(f"\n🔥 Bestsellers (≥100 units): {len(bestsellers)}")
if bestsellers:
    for product in bestsellers:
        print(f"   • {product}")

print(f"\n💎 Premium Products (≥$500): {len(premium_products)}")
if premium_products:
    for product in premium_products:
        print(f"   • {product}")

# Business insights
print("\n💡 BUSINESS INSIGHTS:")

# Revenue distribution
electronics_revenue = 0
clothing_revenue = 0
for product_name, category, price, units, rating in sales_data:
    revenue = price * units
    if category == "Electronics":
        electronics_revenue += revenue
    elif category == "Clothing":  # check explicitly so other categories are not counted as clothing
        clothing_revenue += revenue

if electronics_revenue > clothing_revenue:
    print(f"• Electronics dominates revenue ({electronics_revenue/total_revenue*100:.1f}% vs {clothing_revenue/total_revenue*100:.1f}%)")
else:
    print(f"• Clothing leads in revenue ({clothing_revenue/total_revenue*100:.1f}% vs {electronics_revenue/total_revenue*100:.1f}%)")

# Quality vs. Quantity analysis
high_volume_low_rating = []
low_volume_high_rating = []

for product_name, category, price, units, rating in sales_data:
    if units >= average_units and rating < average_rating:
        high_volume_low_rating.append(product_name)
    elif units < average_units and rating >= 4.5:
        low_volume_high_rating.append(product_name)

if high_volume_low_rating:
    print(f"• Products selling well despite lower ratings: {', '.join(high_volume_low_rating)}")

if low_volume_high_rating:
    print(f"• High-quality products with growth potential: {', '.join(low_volume_high_rating)}")

# Final recommendations
total_products = len(sales_data)
average_revenue = total_revenue / total_products
# Count products whose revenue beats the average (this measures revenue, not profit)
above_average_products = sum(1 for _, _, price, units, _ in sales_data if price * units > average_revenue)

print("\n🎯 RECOMMENDATIONS:")
print(f"• {above_average_products}/{total_products} products are above average revenue")
print("• Focus marketing on high-rated, low-volume products for growth")
print("• Consider quality improvements for high-volume, low-rated products")
if premium_products:
    print(f"• Premium segment ({len(premium_products)} products) shows strong potential")
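
A quick way to gain confidence in hand-written loops like the ones above is to compare their results with Python's built-in `sum()` and `max()`. The sketch below does this on a small made-up data set that uses the same tuple layout. The name `sample_sales` and its three products are hypothetical and not part of the challenge data.

```python
# Hypothetical mini data set: (product_name, category, price, units_sold, customer_rating)
sample_sales = [
    ("Widget", "Tools", 10.00, 5, 4.0),
    ("Gadget", "Tools", 25.00, 2, 4.8),
    ("Gizmo", "Toys", 5.00, 20, 3.5),
]

# Loop version: track the highest revenue seen so far
max_revenue = 0
top_product = ""
for name, category, price, units, rating in sample_sales:
    revenue = price * units
    if revenue > max_revenue:
        max_revenue = revenue
        top_product = name

# Cross-check: collect every revenue, then let the built-ins do the work
revenues = []
for name, category, price, units, rating in sample_sales:
    revenues.append(price * units)

total = sum(revenues)

print(f"Top product: {top_product} (${max_revenue:.2f})")
print(f"Loop result matches max(): {max_revenue == max(revenues)}")
print(f"Total revenue: ${total:.2f}")
```

Because the loop uses a strict `>` comparison, it keeps the first product it encounters when two products tie for the highest revenue.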

## Key Takeaways

🎯 **Lists store multiple related items** - Perfect for collections of data

🔢 **Lists are indexed from 0** - Use positive indices to count from the start and negative indices to count from the end

🔧 **Lists are mutable** - You can add, remove, and change items after creation

🔄 **For loops automate repetition** - Process every item in a collection automatically

🧠 **Combine loops with logic** - Use if statements inside loops for smart processing

📊 **Common patterns solve real problems** - Accumulating, filtering, finding extremes

⚡ **Automation is programming's superpower** - Write once, run on any amount of data

🎨 **Lists + loops + logic = powerful programs** - Handle complex data processing tasks
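
The accumulating, filtering, and finding-extremes patterns named above often fit in a single loop. Here is a minimal sketch using made-up temperature readings. The `temperatures` list and the 75-degree cutoff are illustrative assumptions, not data from the lesson.

```python
temperatures = [68, 74, 81, 59, 77]  # hypothetical sample readings

total = 0                   # accumulating: running sum
warm_days = []              # filtering: keep only readings that pass a test
highest = temperatures[0]   # finding an extreme: start from the first item

for temp in temperatures:
    total += temp
    if temp >= 75:
        warm_days.append(temp)
    if temp > highest:
        highest = temp

average = total / len(temperatures)

print(f"Average: {average:.1f}")
print(f"Warm days: {warm_days}")
print(f"Highest: {highest}")
print(f"First reading: {temperatures[0]}, last reading: {temperatures[-1]}")
```

Starting `highest` from the first list item, rather than from 0, keeps the pattern correct even when every value is negative.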

---

**Congratulations!** You've completed Day One and learned the core superpowers of programming:
- Working with different types of data
- Making programs that can decide what to do
- Storing and processing collections of information
- Automating repetitive tasks

These skills form the foundation of all programming. Tomorrow, you'll learn how to organize your code into reusable functions and tackle even more complex problems! 🚀