# Lesson 9: File I/O & Error Handling

**Session:** Week 3, Tuesday (2 hours)  
**Learning Objectives:**
- Read and write files in Python
- Handle different file formats (text, CSV, JSON)
- Understand and implement error handling with try/except
- Build robust programs that handle unexpected situations
- Apply file operations to real data science scenarios

## 🎉 Welcome to Week 3! 

Congratulations on completing Week 2! You now have solid foundations in:
- **Decision making** with conditionals
- **Repetition** with loops  
- **Code organization** with functions

This week, we'll learn to work with **real data** and build **production-ready** programs!

In [None]:
# Quick Week 2 Skills Check
def skills_assessment():
    """Demonstrate Week 2 mastery in one function"""
    
    # Sample scores to grade
    scores = [85, 92, 78, 95, 88]
    
    # Loops + Conditionals
    results = []
    for score in scores:
        if score >= 90:
            grade = "A"
        elif score >= 80:
            grade = "B"
        else:
            grade = "C"
        results.append(f"Score {score}: Grade {grade}")
    
    return results

# Test our combined skills
assessment_results = skills_assessment()
for result in assessment_results:
    print(result)
    
print("\n✅ Week 2 skills confirmed! Ready for Week 3! 🚀")

## The Problem: Data Lives in Files! 📁

So far, we've worked with data we create in our programs. But in the real world:

- **Data scientists** work with CSV files, JSON APIs, databases
- **Web developers** read configuration files, serve HTML/CSS
- **System administrators** process log files, manage configs
- **Everyone** needs to save and load information

**Today's mission:** Learn to work with files like a pro! 💪

## The File Cabinet Analogy 🗃️

### Think of File Operations Like Office Work

**File System = Office Building**
- **Folders** = Filing cabinets
- **Files** = Documents in folders
- **File paths** = Office addresses ("Building/Floor/Cabinet/Folder")

**File Operations = Office Tasks**
- **Opening** a file = Taking a document from the cabinet
- **Reading** = Looking at the document's content
- **Writing** = Adding or changing content
- **Closing** = Putting the document back properly

**File Modes = Document Permissions**
- **Read ('r')** = "View only" - can look but not change
- **Write ('w')** = "Replace entirely" - throw away old, write new
- **Append ('a')** = "Add to end" - keep existing, add more

Just like in an office, you must **open** before you work and **close** when done!
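
The three modes above can be compared side by side. The following is a minimal sketch, assuming a scratch file named `modes_demo.txt` (a hypothetical name) created inside a temporary directory so nothing in your working folder is touched. The `with` blocks close each file automatically; the lesson covers them shortly.

```python
import os
import tempfile

def demo_modes():
    """Show how 'w', 'a', and 'r' treat the same file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "modes_demo.txt")  # hypothetical scratch file

        # 'w' creates the file, or empties it if it already exists
        with open(path, "w", encoding="utf-8") as f:
            f.write("first\n")

        # 'a' keeps the existing content and adds to the end
        with open(path, "a", encoding="utf-8") as f:
            f.write("second\n")

        # 'r' only reads
        with open(path, "r", encoding="utf-8") as f:
            after_append = f.read()

        # 'w' again: the old content is discarded
        with open(path, "w", encoding="utf-8") as f:
            f.write("replaced\n")

        with open(path, "r", encoding="utf-8") as f:
            after_overwrite = f.read()

    return after_append, after_overwrite

after_append, after_overwrite = demo_modes()
print(repr(after_append))
print(repr(after_overwrite))
```

Opening in `'w'` mode a second time is what makes it "replace entirely": the earlier lines are gone as soon as the file is opened, even before anything is written.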

## Basic File Operations 📖

### The Essential Pattern: Open → Work → Close

In [None]:
# First, let's create a sample file to work with
sample_content = """Welcome to Python File I/O!
This is line 2 of our sample file.
Line 3 contains some data: Python, Java, JavaScript
Final line: Remember to close your files!"""

# Create our sample file
with open('sample.txt', 'w') as file:
    file.write(sample_content)

print("✅ Sample file 'sample.txt' created!")
print("📁 File contains 4 lines of text about Python programming.")

In [None]:
# Method 1: Basic file reading (manual close)
print("=== Method 1: Manual File Handling ===")

# Step 1: Open the file
file = open('sample.txt', 'r')  # 'r' = read mode

# Step 2: Read the content
content = file.read()  # Read entire file as one string

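# Step 3: Close the file (IMPORTANT!)
# If read() had raised an error, this line would never run and the file
# would stay open; the 'with' statement in Method 2 avoids that risk.
file.close()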

print("File content:")
print(content)
print(f"Content type: {type(content)}")
print(f"Content length: {len(content)} characters")

In [None]:
# Method 2: Using 'with' statement (RECOMMENDED!)
print("=== Method 2: 'with' Statement (Best Practice) ===")

# The 'with' statement automatically closes the file
with open('sample.txt', 'r') as file:
    content = file.read()
    print("📖 Reading file with 'with' statement:")
    print(content)
    print("\n🔄 File automatically closes when 'with' block ends!")

# File is automatically closed here - no need to call file.close()
print("\n✅ File handling complete and safe!")

## Different Ways to Read Files 📚

In [None]:
print("=== Different File Reading Methods ===")

# Method 1: Read entire file as one string
print("\n1. file.read() - Entire file as string:")
with open('sample.txt', 'r') as file:
    entire_content = file.read()
    print(repr(entire_content))  # repr() shows \n characters

# Method 2: Read file line by line into a list
print("\n2. file.readlines() - List of lines:")
with open('sample.txt', 'r') as file:
    lines_list = file.readlines()
    print(f"Lines: {lines_list}")
    print(f"Number of lines: {len(lines_list)}")

# Method 3: Read one line at a time
print("\n3. file.readline() - One line at a time:")
with open('sample.txt', 'r') as file:
    first_line = file.readline()
    second_line = file.readline()
    print(f"First line: {repr(first_line)}")
    print(f"Second line: {repr(second_line)}")

# Method 4: Iterate through lines (MOST COMMON)
print("\n4. for line in file - Iterate through lines:")
with open('sample.txt', 'r') as file:
    for line_number, line in enumerate(file, 1):
        # strip() removes the \n at the end of each line
        clean_line = line.strip()
        print(f"Line {line_number}: {clean_line}")

## Writing Files ✏️

### Three Writing Modes: Write, Append, and Create
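
If "Create" here refers to exclusive creation, Python's `open()` supports it as mode `'x'`: the call raises `FileExistsError` when the file already exists instead of overwriting it. The sketch below is an illustration under that assumption; `create_once` and `notes.txt` are hypothetical names.

```python
import os
import tempfile

def create_once(path, text):
    """Create a new file; refuse to overwrite an existing one."""
    try:
        # 'x' = exclusive creation: fails if the file is already there
        with open(path, "x", encoding="utf-8") as f:
            f.write(text)
        return True
    except FileExistsError:
        return False

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")  # hypothetical file name
    first_try = create_once(path, "original\n")
    second_try = create_once(path, "overwritten?\n")
    with open(path, "r", encoding="utf-8") as f:
        final_content = f.read()

print(first_try, second_try, repr(final_content))
```

Mode `'x'` is a reasonable choice when silently replacing an existing file would lose data, which is exactly what `'w'` does.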

In [None]:
# Writing Mode 1: 'w' - Write (overwrites existing file)
print("=== Mode 'w': Write (Replace) ===")

grocery_list = ["apples", "bananas", "bread", "milk", "eggs"]

with open('grocery_list.txt', 'w') as file:
    file.write("My Grocery List\n")
    file.write("===============\n")
    
    for i, item in enumerate(grocery_list, 1):
        file.write(f"{i}. {item.title()}\n")
    
    file.write("\nTotal items: " + str(len(grocery_list)))

print("✅ Grocery list written to 'grocery_list.txt'")

# Read it back to verify
with open('grocery_list.txt', 'r') as file:
    print("\n📖 File contents:")
    print(file.read())

In [None]:
# Writing Mode 2: 'a' - Append (add to existing file)
print("=== Mode 'a': Append ===")

# Add more items to our grocery list
additional_items = ["cheese", "yogurt", "chicken"]

with open('grocery_list.txt', 'a') as file:  # 'a' = append mode
    file.write("\n\nAdditional Items:\n")
    file.write("================\n")
    
    for item in additional_items:
        file.write(f"• {item.title()}\n")

print("✅ Additional items appended to grocery list")

# Read the updated file
with open('grocery_list.txt', 'r') as file:
    print("\n📖 Updated file contents:")
    print(file.read())

In [None]:
# Advanced writing: writelines() method
print("=== Using writelines() for Multiple Lines ===")

# Create a simple log file
from datetime import datetime

log_entries = [
    f"{datetime.now()}: Program started\n",
    f"{datetime.now()}: Processing data\n",
    f"{datetime.now()}: Analysis complete\n",
    f"{datetime.now()}: Results saved\n"
]

with open('program_log.txt', 'w') as file:
    file.write("Program Execution Log\n")
    file.write("====================\n")
    file.writelines(log_entries)  # Write multiple lines at once

print("✅ Log file created with writelines()")

# Read and display the log
with open('program_log.txt', 'r') as file:
    print("\n📖 Log file contents:")
    for line in file:
        print(line.strip())

## 🏗️ Live Coding: Student Grade Manager

Let's build a complete system that saves and loads student data!

In [None]:
# Student Grade Management System
print("=== Student Grade Management System ===")

def save_student_grades(students_data, filename='student_grades.txt'):
    """
    Save student grade data to a text file
    
    Parameters:
    - students_data: dict {student_name: [grades]}
    - filename: str (output file name)
    """
    with open(filename, 'w') as file:
        file.write("Student Grade Report\n")
        file.write("==================\n\n")
        
        for student, grades in students_data.items():
            average = sum(grades) / len(grades)
            
            file.write(f"Student: {student}\n")
            file.write(f"Grades: {', '.join(map(str, grades))}\n")
            file.write(f"Average: {average:.2f}\n")
            
            # Determine letter grade
            if average >= 90:
                letter = 'A'
            elif average >= 80:
                letter = 'B'
            elif average >= 70:
                letter = 'C'
            else:
                letter = 'F'
                
            file.write(f"Letter Grade: {letter}\n")
            file.write("-" * 30 + "\n")
    
    print(f"✅ Student grades saved to '{filename}'")

def load_and_analyze_grades(filename='student_grades.txt'):
    """
    Load and analyze student grades from file
    
    Parameters:
    - filename: str (input file name)
    """
    print(f"\n📖 Loading grades from '{filename}':")
    
    with open(filename, 'r') as file:
        content = file.read()
        print(content)
    
    # Bonus: Extract and analyze data
    averages = []
    with open(filename, 'r') as file:
        for line in file:
            if line.startswith('Average:'):
                # Extract the average value
                avg_str = line.split(':')[1].strip()
                averages.append(float(avg_str))
    
    if averages:
        class_average = sum(averages) / len(averages)
        print("\n📊 Class Statistics:")
        print(f"Total students: {len(averages)}")
        print(f"Class average: {class_average:.2f}")
        print(f"Highest average: {max(averages):.2f}")
        print(f"Lowest average: {min(averages):.2f}")

# Sample student data
students = {
    'Alice Johnson': [95, 88, 92, 90],
    'Bob Smith': [78, 85, 80, 77],
    'Charlie Brown': [88, 91, 87, 89],
    'Diana Prince': [92, 95, 98, 94],
    'Eve Wilson': [85, 87, 83, 86]
}

# Save grades to file
save_student_grades(students)

# Load and display grades
load_and_analyze_grades()

## Error Handling: When Things Go Wrong 🚨

### The Reality Check
In the real world, things go wrong:
- Files don't exist
- Permissions are denied  
- Disk space runs out
- Users enter invalid data
- Network connections fail

**Professional programmers handle errors gracefully!**

## The Exception Handling Analogy 🎭

### Think of Exceptions Like Emergency Procedures

**Normal Code = Daily Routine**
- Wake up, shower, eat breakfast, go to work
- Everything goes according to plan

**Exception = Emergency Situation**
- Fire alarm goes off during work
- Car breaks down on highway
- Restaurant is closed when you arrive

**Exception Handling = Emergency Response Plan**
- **try:** "Attempt your normal routine"
- **except:** "If specific emergency occurs, do this instead"
- **else:** "If no emergency, do this bonus thing"
- **finally:** "No matter what happens, always do this"

Good emergency plans keep you safe and functioning!
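
To see which of the four clauses run in each situation, here is a minimal sketch. `parse_score` is a hypothetical helper written for this illustration; it records the clauses it passes through.

```python
def parse_score(text):
    """Convert text to an int score, recording which clauses ran."""
    steps = ["try"]
    value = None
    try:
        value = int(text)          # may raise ValueError
    except ValueError:
        steps.append("except")     # runs only when int() failed
    else:
        steps.append("else")       # runs only when int() succeeded
    finally:
        steps.append("finally")    # runs in both cases
    return value, steps

good_value, good_steps = parse_score("92")
bad_value, bad_steps = parse_score("ninety")

print(good_value, good_steps)
print(bad_value, bad_steps)
```

A successful call passes through `try`, `else`, `finally`; a failing call passes through `try`, `except`, `finally`. The `else` and `except` branches never run together.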

In [None]:
# Let's see what happens when a file doesn't exist,
# and how try/except keeps the program from crashing
print("=== What Happens When Files Don't Exist ===")

try:
    # This will cause an error - file doesn't exist
    with open('nonexistent_file.txt', 'r') as file:
        content = file.read()
        print(content)
except FileNotFoundError as e:
    print(f"❌ Error occurred: {e}")
    print("🛡️ But our program didn't crash - we handled it!")

print("\n✅ Program continues running normally...")

In [None]:
# Basic try-except structure
print("=== Basic Error Handling Structure ===")

def safe_file_reader(filename):
    """
    Safely read a file with error handling
    """
    try:
        # This is what we WANT to do
        print(f"🔍 Attempting to read '{filename}'...")
        with open(filename, 'r') as file:
            content = file.read()
            print(f"✅ Successfully read {len(content)} characters")
            return content
            
    except FileNotFoundError:
        # Handle the specific case where file doesn't exist
        print(f"❌ File '{filename}' not found!")
        print("💡 Tip: Check the filename and path")
        return None
        
    except PermissionError:
        # Handle the case where we don't have permission
        print(f"🔒 Permission denied for '{filename}'")
        print("💡 Tip: Check file permissions")
        return None
        
    except Exception as e:
        # Handle any other unexpected errors
        print(f"⚠️ Unexpected error: {e}")
        return None

# Test with existing file
print("Test 1: Existing file")
content1 = safe_file_reader('sample.txt')

print("\nTest 2: Non-existent file")
content2 = safe_file_reader('missing_file.txt')

print("\n✅ Both tests completed - program didn't crash!")

In [None]:
# Complete error handling with try-except-else-finally
print("=== Complete Error Handling Pattern ===")

def robust_file_processor(filename, backup_filename=None):
    """
    Process a file with comprehensive error handling
    """
    processed_lines = 0
    
    try:
        print(f"🔄 Processing '{filename}'...")
        with open(filename, 'r') as file:
            lines = file.readlines()
            
            for line in lines:
                # Simulate processing each line
                if line.strip():  # Skip empty lines
                    processed_lines += 1
                    
    except FileNotFoundError:
        print(f"❌ '{filename}' not found!")
        
        if backup_filename:
            print(f"🔄 Trying backup file '{backup_filename}'...")
            return robust_file_processor(backup_filename)  # Recursive call
        else:
            return 0
            
    except PermissionError:
        print(f"🔒 No permission to read '{filename}'")
        return 0
        
    except UnicodeDecodeError:
        print(f"📝 '{filename}' has encoding issues")
        return 0
        
    except Exception as e:
        print(f"⚠️ Unexpected error: {type(e).__name__}: {e}")
        return 0
        
    else:
        # This runs only if NO exception occurred
        print("✅ Processing completed successfully!")
        
    finally:
        # This ALWAYS runs, even when an except branch returns early,
        # so a failed attempt also reports its own (zero) count
        print(f"📊 Final report for '{filename}': {processed_lines} lines processed")
        
    return processed_lines

# Test the robust processor
print("Test 1: Process existing file")
result1 = robust_file_processor('sample.txt')

print("\nTest 2: Process missing file with backup")
result2 = robust_file_processor('missing.txt', 'sample.txt')

print("\nTest 3: Process missing file without backup")
result3 = robust_file_processor('missing.txt')

print(f"\n📈 Results: {result1}, {result2}, {result3}")

## Working with CSV Files 📊

**CSV (Comma-Separated Values)** files are everywhere in data science!
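
One caveat before parsing CSV by hand: a field may itself contain a comma when it is wrapped in quotes, and a plain `str.split(',')` cannot tell the difference. The sketch below uses made-up sample data held in memory (via `io.StringIO`) to compare naive splitting with `csv.reader`.

```python
import csv
import io

# Made-up sample: both fields in the data row contain a comma inside quotes
raw = 'Name,Title\n"Smith, Bob","Manager, Sales"\n'

# Naive approach: split each line on every comma
naive_rows = [line.split(",") for line in raw.splitlines()]

# csv module: treats a quoted field as one value
parsed_rows = list(csv.reader(io.StringIO(raw)))

print(f"Naive split: {len(naive_rows[1])} pieces -> {naive_rows[1]}")
print(f"csv.reader:  {len(parsed_rows[1])} pieces -> {parsed_rows[1]}")
```

The naive split breaks the two-column row into four pieces, while `csv.reader` returns the two intended values with the quotes removed.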

In [None]:
# Method 1: Manual CSV handling (educational purpose)
print("=== Manual CSV Processing ===")

# Create sample sales data
sales_data = [
    ['Date', 'Product', 'Quantity', 'Price', 'Total'],
    ['2024-01-15', 'Laptop', '2', '999.99', '1999.98'],
    ['2024-01-16', 'Mouse', '5', '29.99', '149.95'],
    ['2024-01-17', 'Keyboard', '3', '89.99', '269.97'],
    ['2024-01-18', 'Monitor', '1', '299.99', '299.99'],
    ['2024-01-19', 'Headphones', '4', '149.99', '599.96']
]

# Write CSV manually
with open('sales_data.csv', 'w') as file:
    for row in sales_data:
        # Join items with commas and add newline
        csv_line = ','.join(row) + '\n'
        file.write(csv_line)

print("✅ CSV file created manually")

# Read CSV manually
print("\n📖 Reading CSV file manually:")
with open('sales_data.csv', 'r') as file:
    for line_num, line in enumerate(file, 1):
        # Split by comma and remove whitespace
        columns = [col.strip() for col in line.split(',')]
        print(f"Row {line_num}: {columns}")

In [None]:
# Method 2: Using Python's csv module (RECOMMENDED)
import csv

print("=== Professional CSV Handling with csv Module ===")

# Write CSV with csv module
employee_data = [
    {'Name': 'Alice Johnson', 'Department': 'Engineering', 'Salary': 95000, 'Years': 3},
    {'Name': 'Bob Smith', 'Department': 'Marketing', 'Salary': 72000, 'Years': 2},
    {'Name': 'Charlie Brown', 'Department': 'Engineering', 'Salary': 88000, 'Years': 4},
    {'Name': 'Diana Prince', 'Department': 'HR', 'Salary': 78000, 'Years': 5},
    {'Name': 'Eve Wilson', 'Department': 'Engineering', 'Salary': 102000, 'Years': 6}
]

# Write using DictWriter
with open('employees.csv', 'w', newline='') as file:
    fieldnames = ['Name', 'Department', 'Salary', 'Years']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    
    # Write header row
    writer.writeheader()
    
    # Write data rows
    writer.writerows(employee_data)

print("✅ Employee CSV created with csv module")

# Read using DictReader
print("\n📖 Reading CSV with DictReader:")
try:
    # newline='' lets the csv module handle line endings itself
    with open('employees.csv', 'r', newline='') as file:
        reader = csv.DictReader(file)
        
        print(f"Columns: {reader.fieldnames}")
        print()
        
        for row_num, row in enumerate(reader, 1):
            print(f"Employee {row_num}:")
            print(f"  Name: {row['Name']}")
            print(f"  Department: {row['Department']}")
            print(f"  Salary: ${int(row['Salary']):,}")
            print(f"  Years: {row['Years']}")
            print()
            
except FileNotFoundError:
    print("❌ Employee CSV file not found!")
except Exception as e:
    print(f"⚠️ Error reading CSV: {e}")

In [None]:
# CSV Data Analysis Example
print("=== CSV Data Analysis Example ===")

def analyze_employee_data(filename='employees.csv'):
    """
    Analyze employee data from CSV file
    """
    try:
        employees = []
        
        # Read all employee data
        with open(filename, 'r', newline='') as file:
            reader = csv.DictReader(file)
            
            for row in reader:
                # Convert numeric fields
                row['Salary'] = int(row['Salary'])
                row['Years'] = int(row['Years'])
                employees.append(row)
        
        if not employees:
            print("⚠️ No employee data found!")
            return
        
        # Analysis
        total_employees = len(employees)
        salaries = [emp['Salary'] for emp in employees]
        years_list = [emp['Years'] for emp in employees]
        
        # Department analysis
        departments = {}
        for emp in employees:
            dept = emp['Department']
            if dept not in departments:
                departments[dept] = []
            departments[dept].append(emp)
        
        # Display results
        print("📊 Employee Analysis Results:")
        print(f"Total Employees: {total_employees}")
        print(f"Average Salary: ${sum(salaries) / len(salaries):,.2f}")
        print(f"Salary Range: ${min(salaries):,} - ${max(salaries):,}")
        print(f"Average Years: {sum(years_list) / len(years_list):.1f}")
        
        print("\n🏢 Department Breakdown:")
        for dept, emp_list in departments.items():
            dept_salaries = [emp['Salary'] for emp in emp_list]
            avg_salary = sum(dept_salaries) / len(dept_salaries)
            print(f"  {dept}: {len(emp_list)} employees, avg salary: ${avg_salary:,.2f}")
        
        # Find highest paid employee
        highest_paid = max(employees, key=lambda emp: emp['Salary'])
        print(f"\n💰 Highest Paid: {highest_paid['Name']} (${highest_paid['Salary']:,})")
        
        # Find most experienced
        most_experienced = max(employees, key=lambda emp: emp['Years'])
        print(f"🏆 Most Experienced: {most_experienced['Name']} ({most_experienced['Years']} years)")
        
    except FileNotFoundError:
        print(f"❌ File '{filename}' not found!")
    except csv.Error as e:
        print(f"📄 CSV error: {e}")
    except Exception as e:
        print(f"⚠️ Unexpected error: {e}")

# Run the analysis
analyze_employee_data()

## Working with JSON Files 📄

**JSON (JavaScript Object Notation)** is perfect for structured data and APIs!
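
JSON has fewer types than Python, so a round trip through `json.dumps` and `json.loads` does not always return exactly what went in. Here is a minimal sketch with made-up data showing the conversions to watch for:

```python
import json

original = {
    "scores": (95, 88),   # tuple
    "passed": True,       # Python bool
    "notes": None,        # Python None
    1: "integer key",     # non-string dictionary key
}

text = json.dumps(original)   # Python object -> JSON string
restored = json.loads(text)   # JSON string -> Python object

print(text)
print(restored)
print(f"Same as original? {restored == original}")
```

The tuple comes back as a list and the integer key comes back as the string `"1"`, so the restored dictionary is not equal to the original. `True` and `None` are written as `true` and `null` in the JSON text and convert back cleanly.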

In [None]:
import json
from datetime import datetime

print("=== JSON File Operations ===")

# Create complex Python data structure
course_data = {
    "course_info": {
        "title": "Python Fundamentals for Data Science",
        "instructor": "Dr. Python Expert",
        "duration_weeks": 3,
        "start_date": "2024-01-15",
        "format": "online"
    },
    "students": [
        {
            "id": 1,
            "name": "Alice Johnson",
            "email": "alice@email.com",
            "assignments": {
                "week1": {"completed": True, "score": 95},
                "week2": {"completed": True, "score": 88},
                "week3": {"completed": False, "score": None}
            },
            "attendance": ["2024-01-15", "2024-01-17", "2024-01-22"]
        },
        {
            "id": 2,
            "name": "Bob Smith",
            "email": "bob@email.com",
            "assignments": {
                "week1": {"completed": True, "score": 82},
                "week2": {"completed": True, "score": 79},
                "week3": {"completed": False, "score": None}
            },
            "attendance": ["2024-01-15", "2024-01-22"]
        }
    ],
    "statistics": {
        "total_enrolled": 2,
        "avg_attendance_rate": 0.75,
        "completion_rate": {
            "week1": 1.0,
            "week2": 1.0,
            "week3": 0.0
        }
    }
}

print("📊 Created complex course data structure")
print(f"Course: {course_data['course_info']['title']}")
print(f"Students: {course_data['statistics']['total_enrolled']}")

In [None]:
# Write JSON to file
print("\n=== Writing JSON File ===")

try:
    with open('course_data.json', 'w') as file:
        json.dump(course_data, file, indent=2)  # indent=2 makes it readable
    
    print("✅ Course data saved to 'course_data.json'")
    
    # Read JSON from file
    print("\n📖 Reading JSON File:")
    
    with open('course_data.json', 'r') as file:
        loaded_data = json.load(file)
    
    print("✅ JSON data loaded successfully")
    print(f"Data type: {type(loaded_data)}")
    print(f"Keys: {list(loaded_data.keys())}")
    
    # Access nested data
    course_title = loaded_data['course_info']['title']
    first_student = loaded_data['students'][0]['name']
    week1_score = loaded_data['students'][0]['assignments']['week1']['score']
    
    print(f"\n🎓 Course: {course_title}")
    print(f"👨‍🎓 First student: {first_student}")
    print(f"📝 Their Week 1 score: {week1_score}")
    
except json.JSONDecodeError as e:
    print(f"📄 JSON format error: {e}")
except FileNotFoundError:
    print("❌ JSON file not found!")
except Exception as e:
    print(f"⚠️ Error: {e}")

In [None]:
# JSON Data Analysis and Modification
print("=== JSON Data Analysis and Updates ===")

def analyze_and_update_course_data(filename='course_data.json'):
    """
    Analyze course data and add new information
    """
    try:
        # Load existing data
        with open(filename, 'r') as file:
            data = json.load(file)
        
        print("📊 Course Data Analysis:")
        
        # Analyze student performance
        students = data['students']
        total_students = len(students)
        
        print(f"Total Students: {total_students}")
        
        # Calculate average scores by week
        week_scores = {'week1': [], 'week2': [], 'week3': []}
        
        for student in students:
            assignments = student['assignments']
            for week, assignment in assignments.items():
                # Compare with None so a legitimate score of 0 is still counted
                if assignment['completed'] and assignment['score'] is not None:
                    week_scores[week].append(assignment['score'])
        
        print("\n📈 Weekly Average Scores:")
        for week, scores in week_scores.items():
            if scores:
                avg = sum(scores) / len(scores)
                print(f"  {week.title()}: {avg:.1f} (from {len(scores)} students)")
            else:
                print(f"  {week.title()}: No completed assignments yet")
        
        # Add analysis results to the data
        current_time = datetime.now().isoformat()
        data['last_analysis'] = current_time
        data['analysis_results'] = {
            'weekly_averages': {
                week: sum(scores)/len(scores) if scores else 0 
                for week, scores in week_scores.items()
            },
            'completion_status': {
                week: len(scores) for week, scores in week_scores.items()
            }
        }
        
        # Find student with highest average
        best_student = None
        highest_avg = 0
        
        for student in students:
            completed_scores = [
                assignment['score'] 
                for assignment in student['assignments'].values() 
                if assignment['completed'] and assignment['score'] is not None
            ]
            
            if completed_scores:
                avg = sum(completed_scores) / len(completed_scores)
                if avg > highest_avg:
                    highest_avg = avg
                    best_student = student['name']
        
        if best_student:
            print(f"\n🏆 Top Student: {best_student} (Average: {highest_avg:.1f})")
            data['analysis_results']['top_student'] = {
                'name': best_student,
                'average': round(highest_avg, 1)
            }
        
        # Save updated data
        with open(filename, 'w') as file:
            json.dump(data, file, indent=2)
        
        print(f"\n✅ Analysis complete! Data updated in '{filename}'")
        
    except FileNotFoundError:
        print(f"❌ File '{filename}' not found!")
    except json.JSONDecodeError as e:
        print(f"📄 JSON error: {e}")
    except Exception as e:
        print(f"⚠️ Error: {e}")

# Run the analysis
analyze_and_update_course_data()

# Show the updated file content
print("\n📖 Updated JSON structure:")
try:
    with open('course_data.json', 'r') as file:
        updated_data = json.load(file)
    
    print(f"New keys added: {[k for k in updated_data.keys() if k not in course_data.keys()]}")
    print(f"Analysis timestamp: {updated_data.get('last_analysis', 'Not found')}")
    
except Exception as e:
    print(f"Error reading updated file: {e}")

## 🎯 In-Class Exercise: Data Processing Pipeline (25 minutes)

Build a complete data processing system with error handling!

In [None]:
# Exercise: Build a Personal Finance Tracker
print("💰 Personal Finance Tracker Exercise")
print("Build a system that processes financial transactions!")

# Sample transaction data (provided; negative amounts are expenses)
sample_transactions = [
    {'date': '2024-01-15', 'category': 'Food', 'amount': -45.67, 'description': 'Grocery shopping'},
    {'date': '2024-01-16', 'category': 'Income', 'amount': 3000.00, 'description': 'Salary'},
    {'date': '2024-01-17', 'category': 'Transportation', 'amount': -28.50, 'description': 'Gas'},
    {'date': '2024-01-18', 'category': 'Entertainment', 'amount': -15.99, 'description': 'Netflix'},
    {'date': '2024-01-19', 'category': 'Food', 'amount': -32.40, 'description': 'Restaurant'},
]

def save_transactions_json(transactions, filename='transactions.json'):
    """
    TODO: Save transactions to JSON file with error handling
    
    Requirements:
    1. Use try-except for error handling
    2. Add metadata (timestamp, total transactions)
    3. Format JSON nicely with indentation
    4. Return success/failure status
    """
    # Your code here
    pass

def load_and_analyze_transactions(filename='transactions.json'):
    """
    TODO: Load transactions and perform analysis
    
    Requirements:
    1. Handle file not found gracefully
    2. Calculate total income, expenses, net
    3. Group by category
    4. Find largest expense
    5. Generate summary report
    """
    # Your code here
    pass

def export_summary_csv(transactions, filename='financial_summary.csv'):
    """
    TODO: Export category summaries to CSV
    
    Requirements:
    1. Group transactions by category
    2. Calculate totals and averages per category
    3. Export to CSV with proper headers
    4. Handle write errors
    """
    # Your code here
    pass

# Test your implementation
print("Testing finance tracker...")

# Save sample data
# save_result = save_transactions_json(sample_transactions)

# Load and analyze
# analysis = load_and_analyze_transactions()

# Export summary
# export_result = export_summary_csv(sample_transactions)

print("\n🎯 Complete the functions above to finish the exercise!")

## File I/O Best Practices 🌟

### Professional Guidelines
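
A related guideline: pass `encoding` explicitly when opening text files. In most current Python versions, omitting it makes `open()` fall back to a platform-dependent default, so a file written on one machine can be misread on another. The sketch below uses made-up text and a hypothetical file name, and deliberately reads the file back with the wrong encoding to show the effect.

```python
import os
import tempfile

text = "Café ✅\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "utf8_demo.txt")  # hypothetical file name

    # Write with an explicit encoding
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Reading with the same encoding returns the original text
    with open(path, "r", encoding="utf-8") as f:
        same = f.read()

    # Reading with a different encoding silently produces garbled text
    with open(path, "r", encoding="latin-1") as f:
        garbled = f.read()

print(repr(same))
print(repr(garbled))
```

Note that the mismatched read does not raise an error here; it simply returns the wrong characters, which is why stating the encoding on both the write and the read is the safer habit.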

In [None]:
# Best Practices Demonstration
import os
from pathlib import Path

print("=== File I/O Best Practices ===")

# Practice 1: Always use 'with' statements
def good_file_handling(filename):
    """Good: Automatic file closing"""
    try:
        with open(filename, 'r') as file:
            return file.read()
    except FileNotFoundError:
        return None

# Practice 2: Check if files exist before processing
def safe_file_check(filename):
    """Check file existence safely"""
    if not os.path.exists(filename):
        print(f"❌ File '{filename}' does not exist")
        return False
    
    if not os.path.isfile(filename):
        print(f"❌ '{filename}' is not a regular file")
        return False
        
    return True

# Practice 3: Use pathlib for cross-platform paths
# (Path was already imported at the top of this cell)
def modern_path_handling():
    """Modern path handling with pathlib"""
    # Create paths that work on Windows, Mac, and Linux
    data_dir = Path('data')
    csv_file = data_dir / 'employees.csv'
    json_file = data_dir / 'course_data.json'
    
    print(f"Data directory: {data_dir}")
    print(f"CSV file path: {csv_file}")
    print(f"JSON file path: {json_file}")
    
    # Check if path exists
    if csv_file.exists():
        print(f"✅ {csv_file.name} exists")
    else:
        print(f"❌ {csv_file.name} not found")

# Practice 4: Validate file contents
def validate_csv_structure(filename, expected_columns):
    """
    Validate CSV file structure
    """
    try:
        import csv
        with open(filename, 'r', newline='') as file:
            reader = csv.DictReader(file)
            
            # Check if expected columns exist
            # (fieldnames is None when the file is completely empty)
            missing_columns = set(expected_columns) - set(reader.fieldnames or [])
            if missing_columns:
                print(f"❌ Missing columns: {missing_columns}")
                return False
                
            # Check if file has data
            row_count = sum(1 for row in reader)
            if row_count == 0:
                print("⚠️ CSV file is empty")
                return False
                
            print(f"✅ CSV valid: {row_count} rows, all columns present")
            return True
            
    except Exception as e:
        print(f"❌ CSV validation failed: {e}")
        return False

# Practice 5: Create backup files
def create_backup(filename):
    """
    Create a backup copy before modifying
    """
    if not os.path.exists(filename):
        return False
        
    backup_name = filename + '.backup'
    try:
        import shutil
        shutil.copy2(filename, backup_name)
        print(f"✅ Backup created: {backup_name}")
        return True
    except Exception as e:
        print(f"❌ Backup failed: {e}")
        return False

# Demonstrate best practices
print("\n1. Path handling:")
modern_path_handling()

print("\n2. File validation:")
if safe_file_check('employees.csv'):
    validate_csv_structure('employees.csv', ['Name', 'Department', 'Salary', 'Years'])

print("\n3. Backup creation:")
if os.path.exists('employees.csv'):
    create_backup('employees.csv')

print("\n📋 Best Practices Summary:")
print("✅ Always use 'with' statements")
print("✅ Handle exceptions appropriately")
print("✅ Validate file existence and structure")
print("✅ Use pathlib for cross-platform compatibility")
print("✅ Create backups before modifying files")
print("✅ Close files properly (automatic with 'with')")
print("✅ Use appropriate file modes ('r', 'w', 'a')")

## Real-World Applications 🌍

### How Data Scientists Use File I/O Daily

In [None]:
# Real-world data science scenarios
print("=== Data Science File I/O Scenarios ===")

# Scenario 1: Processing sensor data
def process_sensor_data():
    """
    Simulate processing IoT sensor data
    """
    print("\n🌡️ Scenario 1: IoT Sensor Data Processing")
    
    # Simulate sensor data
    import random
    from datetime import datetime, timedelta
    
    sensor_data = []
    base_time = datetime.now()
    
    for i in range(24):  # 24 hours of data
        timestamp = base_time + timedelta(hours=i)
        temperature = round(random.uniform(18, 28), 1)
        humidity = round(random.uniform(40, 80), 1)
        
        sensor_data.append({
            'timestamp': timestamp.isoformat(),
            'temperature': temperature,
            'humidity': humidity,
            'sensor_id': 'SENSOR_001'
        })
    
    # Save raw data
    try:
        with open('sensor_data.json', 'w') as file:
            json.dump(sensor_data, file, indent=2)
        
        # Process and create summary
        temperatures = [reading['temperature'] for reading in sensor_data]
        humidity_values = [reading['humidity'] for reading in sensor_data]
        
        summary = {
            'period': '24_hours',
            'readings_count': len(sensor_data),
            'temperature': {
                'min': min(temperatures),
                'max': max(temperatures),
                'avg': round(sum(temperatures) / len(temperatures), 1)
            },
            'humidity': {
                'min': min(humidity_values),
                'max': max(humidity_values),
                'avg': round(sum(humidity_values) / len(humidity_values), 1)
            },
            'alerts': []
        }
        
        # Add alerts for high temperatures (above 25°C)
        for reading in sensor_data:
            if reading['temperature'] > 25:
                summary['alerts'].append(f"High temp: {reading['temperature']}°C at {reading['timestamp'][:16]}")
        
        # Save summary
        with open('sensor_summary.json', 'w') as file:
            json.dump(summary, file, indent=2)
            
        print(f"✅ Processed {len(sensor_data)} sensor readings")
        print(f"📊 Temperature range: {summary['temperature']['min']}°C - {summary['temperature']['max']}°C")
        print(f"⚠️ Alerts generated: {len(summary['alerts'])}")
        
    except Exception as e:
        print(f"❌ Error processing sensor data: {e}")

# Scenario 2: Log file analysis
def analyze_web_logs():
    """
    Simulate web server log analysis
    """
    print("\n🌐 Scenario 2: Web Server Log Analysis")
    
    # Create sample log file
    log_entries = [
        "2024-01-15 10:30:15 GET /api/users 200 0.123",
        "2024-01-15 10:30:16 POST /api/login 401 0.056",
        "2024-01-15 10:30:17 GET /api/dashboard 200 0.234",
        "2024-01-15 10:30:18 GET /static/css/main.css 200 0.012",
        "2024-01-15 10:30:19 POST /api/data 500 1.234",
        "2024-01-15 10:30:20 GET /api/users 200 0.087",
    ]
    
    try:
        # Write log file
        with open('server.log', 'w') as file:
            for entry in log_entries:
                file.write(entry + '\n')
        
        # Analyze logs
        status_counts = {}
        response_times = []
        error_logs = []
        
        with open('server.log', 'r') as file:
            for line in file:
                parts = line.strip().split()
                # Each entry has 6 fields: date, time, method, endpoint, status, response time
                if len(parts) >= 6:
                    method = parts[2]
                    endpoint = parts[3]
                    status = parts[4]
                    response_time = float(parts[5])
                    
                    # Count status codes
                    status_counts[status] = status_counts.get(status, 0) + 1
                    
                    # Collect response times
                    response_times.append(response_time)
                    
                    # Log errors
                    if status.startswith('4') or status.startswith('5'):
                        error_logs.append(line.strip())
        
        # Generate report (guard against a log with no valid entries)
        avg_response_time = sum(response_times) / len(response_times) if response_times else 0.0
        
        report = {
            'total_requests': len(response_times),
            'status_code_summary': status_counts,
            'average_response_time': round(avg_response_time, 3),
            'error_count': len(error_logs),
            'errors': error_logs
        }
        
        # Save report
        with open('log_analysis_report.json', 'w') as file:
            json.dump(report, file, indent=2)
        
        print(f"✅ Analyzed {report['total_requests']} requests")
        print(f"📈 Average response time: {report['average_response_time']}s")
        print(f"❌ Errors found: {report['error_count']}")
        
    except Exception as e:
        print(f"❌ Error analyzing logs: {e}")

# Scenario 3: Configuration management
def manage_config_files():
    """
    Demonstrate configuration file management
    """
    print("\n⚙️ Scenario 3: Configuration Management")
    
    # Create application config
    config = {
        "database": {
            "host": "localhost",
            "port": 5432,
            "name": "analytics_db",
            "ssl_enabled": True
        },
        "api": {
            "rate_limit": 1000,
            "timeout": 30,
            "debug_mode": False
        },
        "logging": {
            "level": "INFO",
            "file": "app.log",
            "max_size_mb": 100
        }
    }
    
    try:
        # Save configuration
        with open('app_config.json', 'w') as file:
            json.dump(config, file, indent=2)
        
        # Load and validate configuration
        with open('app_config.json', 'r') as file:
            loaded_config = json.load(file)
        
        # Validate required settings
        required_sections = ['database', 'api', 'logging']
        missing_sections = [section for section in required_sections 
                          if section not in loaded_config]
        
        if missing_sections:
            print(f"❌ Missing config sections: {missing_sections}")
        else:
            print("✅ Configuration loaded and validated")
            print(f"🗄️ Database: {loaded_config['database']['host']}:{loaded_config['database']['port']}")
            print(f"🚀 API rate limit: {loaded_config['api']['rate_limit']} requests/hour")
        
    except Exception as e:
        print(f"❌ Config management error: {e}")

# Run all scenarios
process_sensor_data()
analyze_web_logs()
manage_config_files()

print("\n🎯 These scenarios show how file I/O is essential for:")
print("• Processing sensor/IoT data")
print("• Analyzing application logs")
print("• Managing configuration files")
print("• Creating data processing pipelines")
print("• Building ETL (Extract, Transform, Load) systems")
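
The last two bullets, pipelines and ETL, can be sketched in a few lines. The example below is only an illustration under stated assumptions: the function name `run_mini_etl`, the file names, and the two sample rows are made up for this sketch, and the column names `Name` and `Salary` are borrowed from the `employees.csv` example above.

```python
import csv
import json
import tempfile
from pathlib import Path

def run_mini_etl(csv_path, json_path):
    """Extract rows from a CSV, transform them, and load them into JSON."""
    # Extract: read every row as a dictionary of strings
    with open(csv_path, "r", newline="", encoding="utf-8") as file:
        rows = list(csv.DictReader(file))

    # Transform: convert the salary text to a number, skipping bad rows
    cleaned = []
    skipped = 0
    for row in rows:
        try:
            cleaned.append({"name": row["Name"], "salary": float(row["Salary"])})
        except (KeyError, ValueError, TypeError):
            skipped += 1

    # Load: write the cleaned records to a JSON file
    with open(json_path, "w", encoding="utf-8") as file:
        json.dump(cleaned, file, indent=2)

    return len(cleaned), skipped

# Try it on a tiny made-up file inside a temporary folder
with tempfile.TemporaryDirectory() as folder:
    csv_path = Path(folder) / "staff.csv"    # hypothetical input
    json_path = Path(folder) / "staff.json"  # hypothetical output
    csv_path.write_text("Name,Salary\nAda,5200\nGrace,not available\n", encoding="utf-8")
    loaded, skipped = run_mini_etl(csv_path, json_path)
    print(f"Loaded {loaded} rows, skipped {skipped}")
```

Counting skipped rows instead of stopping at the first bad value is one possible design choice. A stricter pipeline might raise an error so the problem cannot go unnoticed.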

## 📚 Session Summary

🎉 **Outstanding!** You've mastered file operations and error handling, two essential skills for data science!

### ✅ Core Skills Acquired
- **File Operations**: Read, write, append with proper file handling
- **Error Handling**: Graceful exception handling with try/except/finally
- **CSV Processing**: Reading and writing structured data for analysis
- **JSON Operations**: Working with complex data structures and APIs
- **Best Practices**: Professional-grade file handling techniques

### 🔑 Key Patterns Mastered
1. **'with' statement**: Automatic file closing and resource management
2. **Exception handling**: try/except/else/finally for robust programs
3. **Data validation**: Checking file structure and content integrity
4. **Path handling**: Cross-platform file path management
5. **Data transformation**: Converting between formats (CSV ↔ JSON ↔ Python objects)
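
Patterns 1 and 2 often appear together. As a quick refresher, here is a minimal sketch that records which branch of `try/except/else/finally` runs. The function name `load_settings` and the file name `no_such_file.json` are assumptions for this example only.

```python
import json

def load_settings(path):
    """Load a JSON file and record which branches ran along the way."""
    stages = []
    settings = None
    try:
        stages.append("try")
        with open(path, "r", encoding="utf-8") as file:
            settings = json.load(file)
    except FileNotFoundError:
        stages.append("except: file missing")
    except json.JSONDecodeError:
        stages.append("except: invalid JSON")
    else:
        # Runs only when the try block raised nothing
        stages.append("else")
    finally:
        # Runs in every case, success or failure
        stages.append("finally")
    return settings, stages

# Assumes no file with this name exists in the current folder
settings, stages = load_settings("no_such_file.json")
print(stages)
```

For a missing file, the recorded stages are `try`, the file-missing `except`, and `finally`. For a valid JSON file, `else` runs in place of the `except` branch.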

### 🗃️ Remember: The File Cabinet Analogy
- **Files** are documents in your office filing system
- **Opening** files is like taking documents from cabinets
- **Reading/Writing** is like reviewing or updating documents
- **Closing** properly is like returning documents to their place
- **Error handling** is like having procedures for missing or damaged files

### 🏠 Homework Preview
This week's homework will include:
1. Building a complete data processing pipeline
2. Creating robust error handling systems
3. Working with real CSV and JSON datasets
4. Implementing data validation and backup systems

### 🚀 Next Session Preview
On Thursday we'll cover **Working with Data**: pandas basics, data cleaning, and analysis techniques!

### 💡 Pro Tips for Success
- Always use `with` statements for file operations
- Plan for errors - they WILL happen in real projects
- Validate your data before processing
- Keep backups of important files
- Test with edge cases (empty files, missing files, corrupt data)
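
To make the last tip concrete, here is one way you might try a function against the edge cases listed above. This is a sketch, not a full test suite: `count_csv_rows` and all file names are hypothetical, and the "corrupt" file is simulated with bytes that are not valid UTF-8.

```python
import csv
import tempfile
from pathlib import Path

def count_csv_rows(path):
    """Return the number of data rows, or None if the file cannot be read."""
    try:
        with open(path, "r", newline="", encoding="utf-8") as file:
            return sum(1 for _ in csv.DictReader(file))
    except (FileNotFoundError, UnicodeDecodeError):
        return None

with tempfile.TemporaryDirectory() as folder:
    folder = Path(folder)

    empty = folder / "empty.csv"
    empty.write_text("", encoding="utf-8")

    header_only = folder / "header_only.csv"
    header_only.write_text("Name,Salary\n", encoding="utf-8")

    corrupt = folder / "corrupt.csv"
    corrupt.write_bytes(b"\xff\xfe bad bytes")  # not valid UTF-8

    missing = folder / "missing.csv"  # never created on purpose

    for case in (empty, header_only, corrupt, missing):
        print(case.name, "->", count_csv_rows(case))
```

The empty and header-only files both report 0 rows, while the corrupt and missing files return `None`. Deciding whether "no rows" and "unreadable" should be treated differently is exactly the kind of question edge-case testing brings up.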

**You're now ready to handle real-world data like a professional!** 📊✨