<a href="https://colab.research.google.com/github/1235357/Python-Basics-File-Handling-Lecture/blob/main/Updated_Version_Python_File_Handling_Lecture.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python Basics: Complete File Handling Tutorial

**Instructor**: Zhentong Ye (1235357)  
**Duration**: 35-45 minutes  
**Platform**: Jupyter Notebook

---

## Course Outline - Progressive Difficulty Structure

### Level 1: Foundation (12 minutes)
1. **What is File Handling?** - Understanding the basics
2. **Your First File** - Simple open and close
3. **The Magic Word: 'with'** - Safe file handling
4. **Basic Reading** - read() method

### Level 2: Essential Skills (10 minutes)
5. **Line by Line Reading** - readline() and iteration
6. **Reading Multiple Lines** - readlines() method
7. **Your First Write** - write() method
8. **Adding to Files** - append mode

### Level 3: Real-World Skills (8 minutes)
9. **When Things Go Wrong** - Exception handling
10. **File Modes Explained** - r, w, a, x modes
11. **Text Encoding** - UTF-8 and character sets

### Level 4: Professional Applications (10 minutes)
12. **Working with CSV and JSON Files** - Structured data
13. **Text File Processing** - Real applications
14. **Best Practices** - Production-ready code

### Level 5: Hands-On Practice (5 minutes)
15. **Progressive Challenges** - From simple to advanced

---

## Learning Objectives

By the end of this tutorial, you will:
- Master basic file operations (open, read, write, close)
- Handle errors gracefully in file operations
- Work with different file formats (text, CSV, JSON)
- Apply file handling to real-world problems
- Follow Python best practices for file handling

# Setup: Creating Sample Data Files for Learning

**Important**: This cell creates all the sample data files we'll use throughout the tutorial. Run this first!

In [None]:
# Generate the sample data files used in later lessons
import os
import csv
import json
from datetime import datetime

# Create directories
directories = ['practice_files', 'output', 'data']
for directory in directories:
    os.makedirs(directory, exist_ok=True)
    print(f"✓ Created directory: {directory}")

# 1. Create Sales Data (CSV format) - for learning structured data processing
sales_data = [
    ['Date', 'Product', 'Category', 'Quantity', 'Price', 'Total'],
    ['2024-01-15', 'Laptop', 'Electronics', '2', '999.99', '1999.98'],
    ['2024-01-16', 'Coffee Mug', 'Kitchen', '5', '12.50', '62.50'],
    ['2024-01-17', 'Book', 'Education', '3', '25.00', '75.00'],
    ['2024-01-18', 'Smartphone', 'Electronics', '1', '699.99', '699.99'],
    ['2024-01-19', 'Desk Chair', 'Furniture', '1', '199.99', '199.99']
]

with open('data/sales_data.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerows(sales_data)
print("✓ Created sales_data.csv with", len(sales_data)-1, "records")

# 2. Create Log Data (TXT format) - for learning text data processing
log_entries = [
    "[2024-01-15 09:00:00] INFO: Application started successfully",
    "[2024-01-15 09:15:23] INFO: User login: john_doe",
    "[2024-01-15 09:30:45] WARNING: High memory usage detected (85%)",
    "[2024-01-15 09:45:12] ERROR: Database connection timeout",
    "[2024-01-15 10:00:33] INFO: Database connection restored",
    "[2024-01-15 10:15:56] INFO: User logout: john_doe",
    "[2024-01-15 10:30:18] WARNING: Disk space low (15% remaining)",
    "[2024-01-15 10:45:41] ERROR: Failed to save user preferences",
    "[2024-01-15 11:00:04] INFO: System backup completed"
]

with open('data/application.log', 'w', encoding='utf-8') as f:
    for entry in log_entries:
        f.write(entry + '\n')
print("✓ Created application.log with", len(log_entries), "log entries")

# 3. Create Configuration Data (JSON format) - for learning semi-structured data processing
config_data = {
    "application": {
        "name": "File Handler Pro",
        "version": "2.1.0",
        "debug_mode": False
    },
    "database": {
        "host": "localhost",
        "port": 5432,
        "name": "filehandler_db",
        "timeout": 30
    },
    "logging": {
        "level": "INFO",
        "file": "application.log",
        "max_size_mb": 100
    },
    "features": {
        "auto_backup": True,
        "compression": True,
        "encryption": False
    }
}

with open('data/config.json', 'w', encoding='utf-8') as f:
    json.dump(config_data, f, indent=2)
print("✓ Created config.json with application settings")

# 4. Create a simple text file for basic operations
sample_text = """Welcome to Python File Handling!
This is line 2 of our sample file.
Line 3 contains some numbers: 123, 456, 789.
The last line has special characters: @#$%^&*()"""

with open('practice_files/sample.txt', 'w', encoding='utf-8') as f:
    f.write(sample_text)
print("✓ Created sample.txt for basic operations")

print("\n All sample data files created successfully!")
print(" Files created:")
print("   - data/sales_data.csv (structured data)")
print("   - data/application.log (text data)")
print("   - data/config.json (semi-structured data)")
print("   - practice_files/sample.txt (basic text)")

✓ Created directory: practice_files
✓ Created directory: output
✓ Created directory: data
✓ Created sales_data.csv with 5 records
✓ Created application.log with 9 log entries
✓ Created config.json with application settings
✓ Created sample.txt for basic operations

 All sample data files created successfully!
 Files created:
   - data/sales_data.csv (structured data)
   - data/application.log (text data)
   - data/config.json (semi-structured data)
   - practice_files/sample.txt (basic text)


# Level 1: Foundation - Building Your First File Skills

## What is File Handling?

File handling is like having a conversation with your computer's storage system. Just as you might read a book, write notes, or organize documents, Python can:

- **Read files** - Get information from stored files
- **Write files** - Save information to storage  
- **Modify files** - Update existing content
- **Organize files** - Manage file operations

### Why Learn File Handling?

File handling is essential for:
- **Data Science**: Reading datasets, saving analysis results
- **Web Development**: Configuration files, user data
- **Automation**: Processing multiple files, generating reports
- **Everyday Programming**: Logs, settings, data persistence

## Your First File: Opening and Reading

The most basic operation is reading a file. Let's start simple:

In [None]:
# Method 1: Basic file opening (we'll improve this)
file = open('practice_files/sample.txt', 'r')
content = file.read()
print("File content:")
print(content)
file.close()  # Don't forget this!

print("\n What just happened?")
print("1. open() - Opens the file for reading ('r' means read mode)")
print("2. read() - Reads the entire file content")
print("3. close() - Closes the file (very important!)")

File content:
Welcome to Python File Handling!
This is line 2 of our sample file.
Line 3 contains some numbers: 123, 456, 789.
The last line has special characters: @#$%^&*()

 What just happened?
1. open() - Opens the file for reading ('r' means read mode)
2. read() - Reads the entire file content
3. close() - Closes the file (very important!)


**But there's a problem with this approach...**

What if an error occurs before `close()` runs? The file handle stays open until Python happens to clean it up, which wastes system resources.
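
To make the problem concrete, here is a minimal sketch of the manual fix: wrapping the read in `try`/`finally` so that `close()` always runs. The scratch file and the deliberately failing `int()` call are assumptions added for illustration; they are not part of the tutorial's sample data.

```python
import os
import tempfile

# Hypothetical scratch file, so the sketch does not depend on the tutorial's folders
scratch_path = os.path.join(tempfile.mkdtemp(), 'scratch.txt')
setup = open(scratch_path, 'w', encoding='utf-8')
setup.write('not a number')
setup.close()

file = open(scratch_path, 'r', encoding='utf-8')
try:
    number = int(file.read())   # raises ValueError: the text is not a number
except ValueError as error:
    print("Something went wrong:", error)
finally:
    file.close()                # runs whether or not an error occurred

print("Is the file closed?", file.closed)
```

This works, but it is a lot of ceremony for every file you open, which is the motivation for the next section.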

## The Magic Word: 'with'

Python's solution: the `with` statement

In [None]:
# Method 2: Using 'with' (much better!)
with open('practice_files/sample.txt', 'r') as f:
    content = f.read()
    print("Content using 'with':")
    print(content)
# File automatically closes here, even if errors occur!

print("\n Why 'with' is magic:")
print("- Automatically closes files")
print("- Works even if errors occur")
print("- Cleaner, more readable code")
print("- Python best practice")

Content using 'with':
Welcome to Python File Handling!
This is line 2 of our sample file.
Line 3 contains some numbers: 123, 456, 789.
The last line has special characters: @#$%^&*()

 Why 'with' is magic:
- Automatically closes files
- Works even if errors occur
- Cleaner, more readable code
- Python best practice


## 🎯 Your Turn: Practice Basic File Reading

Now it's your turn! Try reading the sales data file we created earlier.

In [None]:
# Your task: Read the sales_data.csv file and print its content
# Hint: Use the 'with' statement and the file is at 'data/sales_data.csv'

# Step 1: Open the file using 'with' statement
# Step 2: Read the content using .read()
# Step 3: Print the content

# Write your code here:


## Understanding File Modes

Before we go further, let's look at the modes a file can be opened in:

In [None]:
# Different ways to open files
print(" File Modes:")
print("'r'  - Read mode (default)")
print("'w'  - Write mode (overwrites existing)")
print("'a'  - Append mode (adds to end)")
print("'x'  - Create mode (fails if exists)")

# Let's see the difference:
# Write mode - creates new file or overwrites existing
with open('practice_files/modes_test.txt', 'w') as f:
    f.write("This is write mode\n")
    f.write("It overwrites everything\n")

print("\n✓ File created with write mode")

# Read what we just wrote
with open('practice_files/modes_test.txt', 'r') as f:
    content = f.read()
    print("Content after write mode:")
    print(content)

 File Modes:
'r'  - Read mode (default)
'w'  - Write mode (overwrites existing)
'a'  - Append mode (adds to end)
'x'  - Create mode (fails if exists)

✓ File created with write mode
Content after write mode:
This is write mode
It overwrites everything
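
The demo above only exercises `'w'`. As a rough sketch of how `'x'` differs, the snippet below creates a file with `'x'`, shows that a second `'x'` on the same path raises `FileExistsError`, and then shows `'w'` replacing the content. The path `modes_demo.txt` in a temporary directory is an assumption chosen so the tutorial's own files stay untouched.

```python
import os
import tempfile

# Hypothetical scratch location in a fresh temporary directory
demo_path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

# 'x' creates the file, but only if it does not exist yet
with open(demo_path, 'x', encoding='utf-8') as f:
    f.write("first version\n")

# A second 'x' on the same path raises FileExistsError instead of overwriting
try:
    with open(demo_path, 'x', encoding='utf-8') as f:
        f.write("never written\n")
    second_x_failed = False
except FileExistsError:
    second_x_failed = True
    print("'x' refused to touch the existing file")

# 'w' discards the old content before writing the new content
with open(demo_path, 'w', encoding='utf-8') as f:
    f.write("second version\n")

with open(demo_path, 'r', encoding='utf-8') as f:
    final_content = f.read()

print(repr(final_content))
```

A reasonable rule of thumb: reach for `'x'` when overwriting would be a mistake, and `'w'` when replacing the file is exactly what you want.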



# Level 2: Essential Skills - Reading and Writing Like a Pro

## Line by Line Reading: readline()

Sometimes you don't want to read the entire file at once:

In [None]:
# Reading one line at a time
with open('practice_files/sample.txt', 'r') as f:
    first_line = f.readline()
    second_line = f.readline()

    print("First line:", repr(first_line))
    print("Second line:", repr(second_line))

print("\n Notice the \\n at the end? That's the newline character!")

# Clean up the lines
with open('practice_files/sample.txt', 'r') as f:
    first_line = f.readline().strip()  # strip() removes leading/trailing whitespace, including \n
    second_line = f.readline().strip()

    print("\nClean first line:", first_line)
    print("Clean second line:", second_line)

First line: 'Welcome to Python File Handling!\n'
Second line: 'This is line 2 of our sample file.\n'

 Notice the \n at the end? That's the newline character!

Clean first line: Welcome to Python File Handling!
Clean second line: This is line 2 of our sample file.
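
One detail the example does not show is how `readline()` signals the end of the file: it returns an empty string. The sketch below uses `io.StringIO` as a stand-in for an open text file (an assumption made so the snippet runs on its own) and loops until that empty string appears.

```python
import io

# io.StringIO behaves like an open text file; the content here is made up
fake_file = io.StringIO("first\n\nthird")

lines_seen = []
while True:
    line = fake_file.readline()
    if line == '':            # empty string: nothing left to read
        break
    lines_seen.append(line)   # a blank line in the file arrives as '\n', not ''

print(lines_seen)
```

Note that a blank line inside the file is still `'\n'`, so it does not end the loop early. Only the true end of the file yields `''`.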


## The Python Way: File Iteration

The most elegant way to read files line by line:

In [None]:
# Iteration - the Pythonic way
print(" Reading line by line:")
with open('practice_files/sample.txt', 'r') as f:
    for line_number, line in enumerate(f, 1):
        clean_line = line.strip()
        print(f"Line {line_number}: {clean_line}")

print("\n This approach is:")
print("- Memory efficient (doesn't load entire file)")
print("- Clean and readable")
print("- Handles large files well")

 Reading line by line:
Line 1: Welcome to Python File Handling!
Line 2: This is line 2 of our sample file.
Line 3: Line 3 contains some numbers: 123, 456, 789.
Line 4: The last line has special characters: @#$%^&*()

 This approach is:
- Memory efficient (doesn't load entire file)
- Clean and readable
- Handles large files well


## 🎯 Your Turn: Practice Line-by-Line Reading

Let's practice reading the log file line by line and counting different log levels.

In [None]:
# Your task: Read the log file and count INFO, WARNING, and ERROR messages
# File location: 'data/application.log'

# Initialize counters
info_count = 0
warning_count = 0
error_count = 0

# Step 1: Open the log file
# Step 2: Read each line
# Step 3: Check if line contains 'INFO', 'WARNING', or 'ERROR'
# Step 4: Increment appropriate counter
# Step 5: Print the results

# Write your code here:


# Print results (uncomment when ready)
# print(f" Log Analysis Results:")
# print(f"INFO messages: {info_count}")
# print(f"WARNING messages: {warning_count}")
# print(f"ERROR messages: {error_count}")

## Reading All Lines: readlines()

When you want all lines in a list:

In [None]:
# readlines() returns a list
with open('practice_files/sample.txt', 'r') as f:
    all_lines = f.readlines()

    print("Type:", type(all_lines))
    print("Number of lines:", len(all_lines))
    print("Raw lines:", all_lines)

# Clean up the lines
with open('practice_files/sample.txt', 'r') as f:
    all_lines = f.readlines()
    clean_lines = [line.strip() for line in all_lines]

    print("\nClean lines:", clean_lines)

Type: <class 'list'>
Number of lines: 4
Raw lines: ['Welcome to Python File Handling!\n', 'This is line 2 of our sample file.\n', 'Line 3 contains some numbers: 123, 456, 789.\n', 'The last line has special characters: @#$%^&*()']

Clean lines: ['Welcome to Python File Handling!', 'This is line 2 of our sample file.', 'Line 3 contains some numbers: 123, 456, 789.', 'The last line has special characters: @#$%^&*()']
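
A possible alternative to `readlines()` plus `strip()` is `read().splitlines()`, which drops the line endings but leaves other whitespace alone. The sketch below compares the two on made-up content held in an `io.StringIO` object (a stand-in for a real file, used here as an assumption so the snippet is self-contained).

```python
import io

# Made-up content with an indented second line
fake_file = io.StringIO("alpha\n  indented\nomega")

with_readlines = fake_file.readlines()           # keeps the '\n' on each line
stripped = [line.strip() for line in with_readlines]

fake_file.seek(0)                                # go back to the start
with_splitlines = fake_file.read().splitlines()  # removes only the line endings

print("readlines: ", with_readlines)
print("strip():   ", stripped)
print("splitlines:", with_splitlines)
```

The difference matters when leading spaces carry meaning: `strip()` removes the indentation, while `splitlines()` preserves it.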


## Your First Write Operation

Now let's create our own content:

In [None]:
# Writing a simple file
my_content = """Welcome to my file!
I am learning Python file handling.
This is going to be awesome!"""

with open('output/my_first_file.txt', 'w') as f:
    f.write(my_content)

print("✓ File written successfully!")

# Verify what we wrote
with open('output/my_first_file.txt', 'r') as f:
    content = f.read()
    print("\nWhat I wrote:")
    print(content)

✓ File written successfully!

What I wrote:
Welcome to my file!
I am learning Python file handling.
This is going to be awesome!


## 🎯 Your Turn: Create Your Own File

Create a personal learning journal entry!

In [None]:
# Your task: Create a learning journal entry
# Include: Date, what you learned today, and your thoughts

from datetime import datetime

# Step 1: Get current date
today = datetime.now().strftime("%Y-%m-%d")

# Step 2: Create your journal entry content
# Include the date, what you learned, and your thoughts

# Step 3: Write to 'output/learning_journal.txt'

# Step 4: Read it back and print to verify

# Write your code here:


## Writing Multiple Lines: writelines()

When you have multiple lines to write:

In [None]:
# Prepare lines to write
shopping_list = [
    "Apples\n",
    "Bananas\n",
    "Carrots\n",
    "Dates\n"
]

with open('output/shopping.txt', 'w') as f:
    f.writelines(shopping_list)

print("✓ Shopping list written!")

# Read it back
with open('output/shopping.txt', 'r') as f:
    content = f.read()
    print("Shopping list:")
    print(content)

print(" Important: writelines() doesn't add newlines automatically!")

# Without newlines
items_no_newline = ["Eggs", "Flour", "Sugar"]

with open('output/no_newlines.txt', 'w') as f:
    f.writelines(items_no_newline)

# Check the result
with open('output/no_newlines.txt', 'r') as f:
    print("\nWithout newlines:", repr(f.read()))

✓ Shopping list written!
Shopping list:
Apples
Bananas
Carrots
Dates

 Important: writelines() doesn't add newlines automatically!

Without newlines: 'EggsFlourSugar'
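
Since `writelines()` writes each string exactly as given, the newlines have to come from you. Here is a small sketch of two common ways to add them; the file name `with_newlines.txt` in a temporary directory is a hypothetical choice for illustration.

```python
import os
import tempfile

items = ["Eggs", "Flour", "Sugar"]
list_path = os.path.join(tempfile.mkdtemp(), 'with_newlines.txt')  # hypothetical scratch file

with open(list_path, 'w', encoding='utf-8') as f:
    # Option 1: append the newline to each item as it is written
    f.writelines(item + "\n" for item in items)

with open(list_path, 'r', encoding='utf-8') as f:
    written = f.read()

# Option 2: build one string first, then pass it to write() in a single call
joined = "\n".join(items) + "\n"

print(repr(written))
print("Same result either way:", written == joined)
```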


## Adding to Files: Append Mode

Sometimes you want to add to existing files without overwriting:

In [None]:
# Start with a base file
with open('output/diary.txt', 'w') as f:
    f.write("My Python Learning Diary\n")
    f.write("Day 1: Started learning file handling\n")

print("✓ Diary created")

# Add to existing file
with open('output/diary.txt', 'a') as f:  # 'a' for append
    f.write("Day 2: Learned about append mode\n")
    f.write("Day 3: Getting more confident!\n")

print("✓ Entries added to diary")

# Read the complete diary
with open('output/diary.txt', 'r') as f:
    diary_content = f.read()
    print("\n Complete diary:")
    print(diary_content)

✓ Diary created
✓ Entries added to diary

 Complete diary:
My Python Learning Diary
Day 1: Started learning file handling
Day 2: Learned about append mode
Day 3: Getting more confident!



# Level 3: Real-World Skills - Handling Problems and Encoding

## When Things Go Wrong: Exception Handling

Real programs must handle errors gracefully:

In [None]:
# What happens with a missing file?
try:
    with open('practice_files/missing.txt', 'r') as f:
        content = f.read()
        print(content)
except FileNotFoundError:
    print(" Oops! File not found. That's okay, we handled it!")

# A robust file reading function
def safe_read_file(filename):
    """Safely read a file with error handling"""
    try:
        with open(filename, 'r', encoding='utf-8') as f:
            return f.read()
    except FileNotFoundError:
        print(f" File not found: {filename}")
        return None
    except PermissionError:
        print(f" Permission denied: {filename}")
        return None
    except Exception as e:
        print(f" Unexpected error: {e}")
        return None

# Test it
result = safe_read_file('practice_files/sample.txt')
if result is not None:
    print(" Success! First 50 characters:", result[:50])

result = safe_read_file('practice_files/missing.txt')
print("Missing file result:", result)

 Oops! File not found. That's okay, we handled it!
 Success! First 50 characters: Welcome to Python File Handling!
This is line 2 of
 File not found: practice_files/missing.txt
Missing file result: None
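
`FileNotFoundError` and `PermissionError` are both subclasses of `OSError`, so a single `except OSError` clause can cover them when you do not need separate messages. The helper below, `describe_open_failure`, is a hypothetical name introduced only for this sketch, and the missing path is made up.

```python
def describe_open_failure(filename):
    """Return a short description of why a file could not be read, or None on success."""
    try:
        with open(filename, 'r', encoding='utf-8') as f:
            f.read()
        return None
    except OSError as error:
        # Covers FileNotFoundError, PermissionError and other OS-level failures.
        # A UnicodeDecodeError is a ValueError, not an OSError, so it is not caught here.
        return f"{type(error).__name__}: {error.filename}"

print(issubclass(FileNotFoundError, OSError))
print(issubclass(PermissionError, OSError))

reason = describe_open_failure('no_such_dir_xyz/none.txt')  # made-up path
print(reason)
```

Catching the specific classes first, as `safe_read_file` does, is still the better choice when each error deserves its own response.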


## 🎯 Your Turn: Practice Error Handling

Create a function that safely processes multiple files.

In [None]:
# Your task: Create a function that processes multiple files safely
# It should try to read each file and report success/failure

def process_multiple_files(file_list):
    """Process multiple files and report results"""
    results = {}

    # Step 1: Loop through each file in file_list
    # Step 2: Try to read each file
    # Step 3: Store success/failure in results dictionary
    # Step 4: Return results

    # Write your code here:

    return results

# Test files (some exist, some don't)
test_files = [
    'practice_files/sample.txt',
    'data/sales_data.csv',
    'data/missing_file.txt',
    'data/application.log'
]

# Test your function (uncomment when ready)
# results = process_multiple_files(test_files)
# print(" Processing Results:")
# for file, status in results.items():
#     print(f"  {file}: {status}")

## Text Encoding: Handling Different Characters

Modern applications need to handle international characters:

In [None]:
# Create a file with special characters
special_text = "Hello World! \nPython is awesome! \nChinese: 你好\nEmoji: "

with open('output/unicode_test.txt', 'w', encoding='utf-8') as f:
    f.write(special_text)

print("✓ Unicode file created")

# Read it back correctly
with open('output/unicode_test.txt', 'r', encoding='utf-8') as f:
    content = f.read()
    print("\n Unicode content:")
    print(content)

print("\n💡 Important: Always specify encoding='utf-8' for consistent behavior!")

✓ Unicode file created

 Unicode content:
Hello World! 
Python is awesome! 
Chinese: 你好
Emoji: 

💡 Important: Always specify encoding='utf-8' for consistent behavior!
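
To see why the encoding matters, the following sketch writes text as UTF-8 and then reads the same bytes back three ways. The file name `encoding_demo.txt` and the choice of `latin-1` and `ascii` as the "wrong" encodings are assumptions made for illustration.

```python
import os
import tempfile

text = "Chinese: 你好"
path = os.path.join(tempfile.mkdtemp(), 'encoding_demo.txt')  # hypothetical scratch file

with open(path, 'w', encoding='utf-8') as f:
    f.write(text)

# Same encoding on both sides: the text survives
with open(path, 'r', encoding='utf-8') as f:
    right = f.read()

# Wrong encoding that accepts any byte: no error, but the characters are garbled
with open(path, 'r', encoding='latin-1') as f:
    wrong = f.read()

# Wrong encoding that rejects non-ASCII bytes: Python raises UnicodeDecodeError
try:
    with open(path, 'r', encoding='ascii') as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

print("utf-8:  ", repr(right))
print("latin-1:", repr(wrong))
print("ascii raised UnicodeDecodeError:", ascii_failed)
```

The silent garbling in the `latin-1` case is the more dangerous outcome, because nothing warns you that the data is wrong.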


# Level 4: Professional Applications - CSV, JSON, and Real Data

## Working with CSV Files

CSV (Comma-Separated Values) files are everywhere in data work:

In [None]:
import csv

# Reading CSV file
print(" Reading CSV file:")
with open('data/sales_data.csv', 'r', encoding='utf-8') as f:
    reader = csv.reader(f)
    header = next(reader)  # Get header row
    print("Header:", header)

    for row_num, row in enumerate(reader, 1):
        print(f"Record {row_num}: {row}")

print("\n" + "="*50)

# CSV with Dictionaries (More Powerful)
print(" Reading CSV as dictionaries:")
with open('data/sales_data.csv', 'r', encoding='utf-8') as f:
    dict_reader = csv.DictReader(f)

    total_sales = 0
    for sale in dict_reader:
        total = float(sale['Total'])
        total_sales += total
        print(f"{sale['Product']}: ${sale['Total']} ({sale['Category']})")

    print(f"\n💰 Total Sales: ${total_sales:.2f}")

 Reading CSV file:
Header: ['Date', 'Product', 'Category', 'Quantity', 'Price', 'Total']
Record 1: ['2024-01-15', 'Laptop', 'Electronics', '2', '999.99', '1999.98']
Record 2: ['2024-01-16', 'Coffee Mug', 'Kitchen', '5', '12.50', '62.50']
Record 3: ['2024-01-17', 'Book', 'Education', '3', '25.00', '75.00']
Record 4: ['2024-01-18', 'Smartphone', 'Electronics', '1', '699.99', '699.99']
Record 5: ['2024-01-19', 'Desk Chair', 'Furniture', '1', '199.99', '199.99']

 Reading CSV as dictionaries:
Laptop: $1999.98 (Electronics)
Coffee Mug: $62.50 (Kitchen)
Book: $75.00 (Education)
Smartphone: $699.99 (Electronics)
Desk Chair: $199.99 (Furniture)

💰 Total Sales: $3037.46
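
The examples above only read CSV data. As a complementary sketch, `csv.DictWriter` writes dictionaries back out, and reading the result shows why `float()` was needed earlier: every value comes back as a string. The rows and the file name `demo.csv` below are invented for illustration.

```python
import csv
import os
import tempfile

# Hypothetical rows; note the comma inside the second product name
rows = [
    {'Product': 'Notebook', 'Quantity': 4, 'Price': 2.50},
    {'Product': 'Pen, blue', 'Quantity': 10, 'Price': 0.80},
]

csv_path = os.path.join(tempfile.mkdtemp(), 'demo.csv')  # hypothetical scratch file

with open(csv_path, 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=['Product', 'Quantity', 'Price'])
    writer.writeheader()       # writes the header row from fieldnames
    writer.writerows(rows)     # quotes 'Pen, blue' so the comma is not a separator

with open(csv_path, 'r', newline='', encoding='utf-8') as f:
    read_back = list(csv.DictReader(f))

for row in read_back:
    print(row['Product'], '|', repr(row['Quantity']), '|', repr(row['Price']))
```

Letting the `csv` module handle quoting is the main reason to prefer it over joining values with commas by hand.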


## 🎯 Your Turn: CSV Analysis Challenge

Analyze the sales data and create a summary report!

In [None]:
# Your task: Analyze sales data and create a summary
# Calculate: total sales, average sale, best-selling category

import csv
from collections import defaultdict

# Initialize variables
total_sales = 0
sale_count = 0
category_sales = defaultdict(float)

# Step 1: Read the CSV file
# Step 2: Calculate total sales and count
# Step 3: Track sales by category
# Step 4: Find the best-selling category
# Step 5: Write summary to a new file

# Write your code here:


# Calculate and display results (uncomment when ready)
# average_sale = total_sales / sale_count if sale_count > 0 else 0
# best_category = max(category_sales, key=category_sales.get)

# print(f" Sales Analysis:")
# print(f"Total Sales: ${total_sales:.2f}")
# print(f"Average Sale: ${average_sale:.2f}")
# print(f"Best Category: {best_category} (${category_sales[best_category]:.2f})")

# Write summary to file
# summary = f"""Sales Analysis Report
# =====================
# Total Sales: ${total_sales:.2f}
# Average Sale: ${average_sale:.2f}
# Best Category: {best_category}

# Category Breakdown:
# """

# for category, amount in category_sales.items():
#     summary += f"{category}: ${amount:.2f}\n"

# with open('output/sales_summary.txt', 'w', encoding='utf-8') as f:
#     f.write(summary)

# print("\n✓ Summary saved to output/sales_summary.txt")

## Working with JSON Files

JSON is perfect for configuration and structured data:

In [None]:
import json

# Reading JSON configuration
with open('data/config.json', 'r', encoding='utf-8') as f:
    config = json.load(f)

print(" Application Configuration:")
print(f"App Name: {config['application']['name']}")
print(f"Version: {config['application']['version']}")
print(f"Debug Mode: {config['application']['debug_mode']}")

print("\n Database Settings:")
for key, value in config['database'].items():
    print(f"  {key}: {value}")

# Modify and save configuration
config['application']['debug_mode'] = True
config['features']['new_feature'] = True

with open('output/updated_config.json', 'w', encoding='utf-8') as f:
    json.dump(config, f, indent=2)

print("\n✓ Updated configuration saved to output/updated_config.json")

 Application Configuration:
App Name: File Handler Pro
Version: 2.1.0
Debug Mode: False

 Database Settings:
  host: localhost
  port: 5432
  name: filehandler_db
  timeout: 30

✓ Updated configuration saved to output/updated_config.json
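
Two things are worth knowing before relying on JSON for configuration: not every Python type survives a round trip unchanged, and malformed input raises `json.JSONDecodeError`. The sketch below illustrates both with a made-up `settings` dictionary; it uses `json.dumps` and `json.loads`, the string counterparts of `json.dump` and `json.load`.

```python
import json

# Made-up settings, including a tuple and None to show type conversion
settings = {"debug_mode": False, "port": 5432, "tags": ("a", "b"), "owner": None}

text = json.dumps(settings, indent=2)   # Python object -> JSON string
restored = json.loads(text)             # JSON string -> Python object

print(text)
print("tags came back as:", type(restored["tags"]).__name__)

# Python-style single quotes and a bare False are not valid JSON
try:
    json.loads("{'debug_mode': False}")
    invalid_rejected = False
except json.JSONDecodeError as error:
    invalid_rejected = True
    print("Invalid JSON:", error.msg)
```

The tuple returns as a list because JSON has only one array type. If that distinction matters, convert it back after loading.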


## Text File Processing: Log Analysis

Let's build a real log analyzer:

In [None]:
import re
from collections import Counter

def analyze_log_file(log_filename):
    """Comprehensive log file analysis"""
    log_levels = Counter()
    timestamps = []
    error_messages = []

    try:
        with open(log_filename, 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue

                # Extract timestamp and log level
                match = re.match(r'\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (\w+): (.+)', line)
                if match:
                    timestamp_str, level, message = match.groups()

                    # Count log levels
                    log_levels[level] += 1

                    # Store timestamps
                    timestamps.append(timestamp_str)

                    # Collect error messages
                    if level == 'ERROR':
                        error_messages.append(message)

        return {
            'log_levels': dict(log_levels),
            'total_entries': sum(log_levels.values()),
            'error_messages': error_messages,
            'time_range': (timestamps[0], timestamps[-1]) if timestamps else None
        }

    except Exception as e:
        print(f"Error analyzing log: {e}")
        return None

# Analyze our log file
analysis = analyze_log_file('data/application.log')

if analysis:
    print(" Log Analysis Results:")
    print(f"Total Entries: {analysis['total_entries']}")

    print("\n Log Level Distribution:")
    for level, count in analysis['log_levels'].items():
        percentage = (count / analysis['total_entries']) * 100
        print(f"  {level}: {count} ({percentage:.1f}%)")

    if analysis['error_messages']:
        print("\n Error Messages:")
        for i, error in enumerate(analysis['error_messages'], 1):
            print(f"  {i}. {error}")

    if analysis['time_range']:
        print(f"\n Time Range: {analysis['time_range'][0]} to {analysis['time_range'][1]}")

 Log Analysis Results:
Total Entries: 9

 Log Level Distribution:
  INFO: 5 (55.6%)
  WARNING: 2 (22.2%)
  ERROR: 2 (22.2%)

 Error Messages:
  1. Database connection timeout
  2. Failed to save user preferences

 Time Range: 2024-01-15 09:00:00 to 2024-01-15 11:00:04
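
The analyzer keeps timestamps as plain strings. If you want to do arithmetic on them, one option is `datetime.strptime`, sketched below with the time range copied from the output above. The format string is an assumption that matches the `[YYYY-MM-DD HH:MM:SS]` layout of the sample log.

```python
from datetime import datetime

# Values copied from the time range reported for data/application.log
time_range = ('2024-01-15 09:00:00', '2024-01-15 11:00:04')

TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S'  # assumed to match the sample log's layout

start = datetime.strptime(time_range[0], TIMESTAMP_FORMAT)
end = datetime.strptime(time_range[1], TIMESTAMP_FORMAT)

duration = end - start                  # subtracting datetimes gives a timedelta
print("Log covers:", duration)
print("In seconds:", duration.total_seconds())
```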


## Best Practices Summary

Here are the key principles for professional file handling:

In [None]:
print("File Handling Best Practices:")
print()
print("1. Always use 'with' statements:")
print("   with open('file.txt', 'r', encoding='utf-8') as f:")
print("       content = f.read()")
print()
print("2. Always specify encoding:")
print("   encoding='utf-8' for consistent behavior")
print()
print("3. Handle exceptions appropriately:")
print("   try/except blocks for FileNotFoundError, PermissionError")
print()
print("4. Use iteration for large files:")
print("   for line in f:  # Memory efficient!")
print()
print("5. Choose the right file mode:")
print("   'r' for reading, 'w' for writing, 'a' for appending")
print()
print("6. Use appropriate libraries:")
print("   csv module for CSV files, json module for JSON files")

File Handling Best Practices:

1. Always use 'with' statements:
   with open('file.txt', 'r', encoding='utf-8') as f:
       content = f.read()

2. Always specify encoding:
   encoding='utf-8' for consistent behavior

3. Handle exceptions appropriately:
   try/except blocks for FileNotFoundError, PermissionError

4. Use iteration for large files:
   for line in f:  # Memory efficient!

5. Choose the right file mode:
   'r' for reading, 'w' for writing, 'a' for appending

6. Use appropriate libraries:
   csv module for CSV files, json module for JSON files


# Level 5: Hands-On Practice - Progressive Challenges

This level is intentionally student-driven. Work through each part in order, documenting your reasoning and referencing the files you generated earlier in the notebook.
- **Part 1: Multiple Choice Checkpoint** – interpret real artifacts to choose the most defensible answer.
- **Part 2: Fill in the Blanks** – support each blank with evidence you can point to.
- **Part 3: Hands-On Practice** – complete the coding challenges from beginner to expert.

> Tip: Resist the temptation to query AI tools. Instead, inspect the files you created, run targeted snippets, and justify every choice in your own words.

### Part 1: Multiple Choice Checkpoint

Select the most defensible answer for each scenario. The options intentionally look similar, so inspect the actual files produced in earlier levels before you decide.

1. **Reviewing `practice_files/sample.txt` without leaving the file open**  
   - A. Open the file with `open(..., 'r')` and trust garbage collection to close it eventually.  
   - B. Wrap the read in `with open('practice_files/sample.txt', 'r', encoding='utf-8') as reader:`.  
   - C. Read the file via `os.path` utilities because they auto-close handles.  
   - D. Use `open(..., 'r+')` so you can read and close in one call.  

2. **Confirming the first line in `practice_files/sample.txt` after running the Level 1 demo**  
   - A. `Python lets you handle files.`  
   - B. `Welcome to Python File Handling!`  
   - C. `This is line 2 of our sample file.`  
   - D. `Line 3 contains some numbers: 123, 456, 789.`  

3. **Counting log levels in `data/application.log` during the Level 2 exercise**  
   - A. Convert the entire file to lowercase once and call `.count('error')`, `.count('warning')`, and `.count('info')`.  
   - B. Iterate over each line inside a `with` block, check `'INFO'`, `'WARNING'`, and `'ERROR'` separately, and increment dedicated counters.  
   - C. Load the log with `csv.reader`, treat each line as a row, and read the second column as the log level.  
   - D. Use `json.load` so you can access level names directly.  

4. **Understanding the effect of `.strip()` in the sample file cleanup**  
   - A. It removes the trailing newline so `Clean first line:` prints without an extra blank line.  
   - B. It alphabetically sorts the characters on each line.  
   - C. It slices away the first three characters of every string.  
   - D. It converts the text to uppercase for display.  

5. **Inspecting `output/no_newlines.txt` after writing the items without newlines**  
   - A. The file contains the single string `EggsFlourSugar`.  
   - B. The file shows three lines separated by `\n`.  
   - C. The file stores the list literal `['Eggs', 'Flour', 'Sugar']`.  
   - D. The file is empty because `writelines` requires a newline argument.  

6. **Verifying the final line in `output/diary.txt` once append mode finishes**  
   - A. `Day 1: Started learning file handling`  
   - B. `Day 2: Learned about append mode`  
   - C. `Day 3: Getting more confident!`  
   - D. `Day 4: Practiced writing CSV files`  

7. **Interpreting the return value of `safe_read_file('practice_files/missing.txt')`**  
   - A. It returns the string `'Missing file'`.  
   - B. It returns `None` after printing an explanatory message.  
   - C. It raises a `FileNotFoundError` back to the caller.  
   - D. It returns an empty dictionary.  

8. **Tracking configuration changes before saving `output/updated_config.json`**  
   - A. `config['application']['debug_mode']` stays `False`.  
   - B. A new flag `config['features']['new_feature'] = True` is added to the configuration.  
   - C. `config['database']['host']` switches to `'remote-server'`.  
   - D. `config['logging']['level']` is downgraded to `'DEBUG'`.  

9. **Summing all sales values in `data/sales_data.csv`**  
   - A. $2,837.47  
   - B. $3,037.46  
   - C. $3,250.00  
   - D. $3,333.33  

10. **Identifying the highest-grossing category from the CSV analysis**  
   - A. Kitchen  
   - B. Electronics  
   - C. Education  
   - D. Furniture  

11. **Reading the `time_range` returned by `analyze_log_file('data/application.log')`**  
   - A. `('2024-01-15 09:00:00', '2024-01-15 11:00:04')`  
   - B. `('2024-01-15 09:15:23', '2024-01-15 10:45:41')`  
   - C. `('2024-01-15 09:30:45', '2024-01-15 10:30:18')`  
   - D. `('2024-01-15 10:00:33', '2024-01-15 10:15:56')`  

12. **Explaining why `encoding='utf-8'` is specified in the Unicode example**  
   - A. It forces Python to drop any emoji characters so the file stays ASCII.  
   - B. It ensures emojis and non-Latin characters such as `你好` survive the write/read cycle.  
   - C. It automatically compresses the file to save disk space.  
   - D. It converts all numbers to floats before saving.  

### Part 2: Fill in the Blanks

Complete each statement by providing the missing phrase. Cite the evidence you used (file name, line number, or snippet) in your notes.

1. The setup cell calls `os.makedirs(directory, ____ )` so rerunning it never raises an error when a folder already exists.
2. The generated `sales_data.csv` header lists the second column as `____`.
3. Line 3 of `practice_files/sample.txt` records the numbers `____`.
4. After applying `.strip()` to `first_line`, the trailing `____` character disappears.
5. The log warning about disk capacity reports only `____` of free space.
6. The counters `info_count`, `warning_count`, and `error_count` each start at `____` before the loop.
7. When `safe_read_file` cannot locate a file, it prints a message and returns `____`.
8. The base diary entry begins with the title `____`.
9. Once append mode finishes, the new final line in `output/diary.txt` reads `____`.
10. Inside the CSV dictionary loop, totals accumulate with `float(sale['____'])`.
11. The database configuration retains a `timeout` value of `____` seconds.
12. `output/unicode_test.txt` includes the greeting `Chinese: ____`.

### Part 3: Hands-On Practice (Coding Challenges)

## Challenge 1: Personal Notes System (Beginner)

Create a simple note-taking system:

In [None]:
# Your task: Complete this function
from datetime import datetime

def create_note(filename, title, content):
    """Create a note file with title and content"""
    # Step 1: Get current timestamp
    # Step 2: Format the note with title, timestamp, and content
    # Step 3: Write to file
    # Step 4: Confirm creation

    # Write your code here:
    pass

def read_note(filename):
    """Read and display a note file"""
    # Step 1: Safely read the file
    # Step 2: Display formatted content

    # Write your code here:
    pass

# Test your functions (uncomment when ready)
# create_note('output/my_note.txt', 'Learning Python', 'Today I learned file handling!')
# read_note('output/my_note.txt')
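
One way to approach Step 2 is to keep the formatting separate from the file I/O. The sketch below is only a starting point, not the reference solution: the helper name `format_note` and the layout of the note text are assumptions made for illustration.

```python
from datetime import datetime

def format_note(title, content, timestamp=None):
    """Return the note text as one string (hypothetical helper, not part of the tutorial)."""
    # Use the current time unless the caller supplies one (handy for testing)
    if timestamp is None:
        timestamp = datetime.now()
    stamp = timestamp.strftime('%Y-%m-%d %H:%M:%S')
    # Assumed layout: title line, timestamp line, blank line, then the content
    return f"Title: {title}\nCreated: {stamp}\n\n{content}\n"

print(format_note('Learning Python', 'Today I learned file handling!'))
```

Inside `create_note`, the returned string can then be written inside a `with open(filename, 'w', encoding='utf-8')` block.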

## Challenge 2: Data Processing Pipeline (Intermediate)

Build a complete data processing pipeline:

In [None]:
# Your task: Create a data processing pipeline
# Read sales data, process it, and generate multiple reports

import csv
import json
from collections import defaultdict

def process_sales_data():
    """Complete sales data processing pipeline"""

    # Step 1: Read sales data from CSV
    # Step 2: Calculate various statistics
    # Step 3: Generate text report
    # Step 4: Generate JSON summary
    # Step 5: Create CSV with processed data

    # Initialize data structures
    sales_data = []
    category_stats = defaultdict(lambda: {'total': 0, 'count': 0, 'items': []})

    # Write your code here:


    print("✅ Data processing pipeline completed!")
    print("📄 Generated files:")
    print("  - output/sales_report.txt")
    print("  - output/sales_summary.json")
    print("  - output/processed_sales.csv")

# Run the pipeline (uncomment when ready)
# process_sales_data()
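
The nested `defaultdict` initialised in the skeleton creates a fresh statistics dictionary the first time a category is seen. The sketch below shows that accumulation on a few made-up in-memory rows. The keys `product`, `category`, and `amount` are placeholders chosen for illustration and may not match the column names in `data/sales_data.csv`.

```python
from collections import defaultdict

# Made-up rows standing in for what csv.DictReader would yield (values arrive as strings)
rows = [
    {'product': 'Widget', 'category': 'Tools', 'amount': '10.50'},
    {'product': 'Gadget', 'category': 'Tools', 'amount': '4.50'},
    {'product': 'Notebook', 'category': 'Office', 'amount': '3.00'},
]

category_stats = defaultdict(lambda: {'total': 0, 'count': 0, 'items': []})

for row in rows:
    stats = category_stats[row['category']]  # entry is created on first access
    stats['total'] += float(row['amount'])   # convert the string before adding
    stats['count'] += 1
    stats['items'].append(row['product'])

for category, stats in category_stats.items():
    average = stats['total'] / stats['count']
    print(f"{category}: total={stats['total']:.2f}, average={average:.2f}")
```

Because each category gets its own dictionary, the same structure can feed the text report, the JSON summary, and the processed CSV without re-reading the source file.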

## Challenge 3: Log Monitoring System (Advanced)

Build a comprehensive log monitoring system:

In [None]:
# Your task: Create a log monitoring and alerting system
# Analyze logs, detect patterns, and generate alerts

import re
from datetime import datetime, timedelta
from collections import Counter, defaultdict

class LogMonitor:
    def __init__(self, log_file):
        self.log_file = log_file
        self.alerts = []
        self.stats = defaultdict(int)

    def analyze_logs(self):
        """Analyze log file and detect issues"""
        # Step 1: Read and parse log entries
        # Step 2: Count different log levels
        # Step 3: Detect error patterns
        # Step 4: Check for time-based anomalies
        # Step 5: Generate alerts

        # Write your code here:
        pass

    def generate_report(self):
        """Generate comprehensive monitoring report"""
        # Step 1: Create summary statistics
        # Step 2: List all alerts
        # Step 3: Provide recommendations
        # Step 4: Save to file

        # Write your code here:
        pass

# Test the log monitor (uncomment when ready)
# monitor = LogMonitor('data/application.log')
# monitor.analyze_logs()
# monitor.generate_report()
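
Step 1 of `analyze_logs` needs each line split into a timestamp, a level, and a message. The sketch below assumes lines shaped like `YYYY-MM-DD HH:MM:SS LEVEL message`. That layout, the regular expression, and the sample lines are assumptions for illustration, so adjust the pattern to the actual contents of `data/application.log`.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout: "2024-01-15 09:00:00 INFO Some message"
LINE_PATTERN = re.compile(r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\w+)\s+(.*)$')

# Made-up sample lines, including one that does not fit the pattern
sample_lines = [
    '2024-01-15 09:00:00 INFO Application started',
    '2024-01-15 09:05:10 ERROR Could not open file',
    'this line does not match the pattern',
    '2024-01-15 09:06:00 ERROR Could not open file',
]

def parse_line(line):
    """Return (timestamp, level, message), or None when the line does not match."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    stamp, level, message = match.groups()
    return datetime.strptime(stamp, '%Y-%m-%d %H:%M:%S'), level, message

# Keep only the lines that parsed, then count entries per log level
entries = [entry for entry in map(parse_line, sample_lines) if entry is not None]
level_counts = Counter(level for _, level, _ in entries)
print(level_counts)
```

Returning `None` for malformed lines lets the caller skip them instead of stopping the whole analysis on the first bad entry.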

## Challenge 4: File Backup System (Expert)

Create an automated backup system with versioning:

In [None]:
# Your task: Create a comprehensive backup system
# Include versioning, compression, and integrity checks

import os
import shutil
import hashlib
from datetime import datetime
import json

class BackupSystem:
    def __init__(self, source_dir, backup_dir):
        self.source_dir = source_dir
        self.backup_dir = backup_dir
        self.backup_log = []

    def create_backup(self):
        """Create timestamped backup with integrity checks"""
        # Step 1: Create timestamped backup directory
        # Step 2: Copy files with verification
        # Step 3: Generate checksums
        # Step 4: Create backup manifest
        # Step 5: Log the backup operation

        # Write your code here:
        pass

    def verify_backup(self, backup_path):
        """Verify backup integrity"""
        # Step 1: Read backup manifest
        # Step 2: Verify file checksums
        # Step 3: Report verification results

        # Write your code here:
        pass

    def list_backups(self):
        """List all available backups"""
        # Step 1: Scan backup directory
        # Step 2: Read backup manifests
        # Step 3: Display backup information

        # Write your code here:
        pass

# Test the backup system (uncomment when ready)
# backup_system = BackupSystem('data', 'backups')
# backup_system.create_backup()
# backup_system.list_backups()
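
Steps 2 and 3 of `create_backup` depend on a checksum for each file. Here is a minimal sketch using `hashlib.sha256`, reading in chunks so large files are not loaded into memory at once. The function name `file_checksum`, the chunk size, and the demo file are illustrative choices, not part of the tutorial.

```python
import hashlib
import os
import tempfile

def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        # iter() with a sentinel keeps calling f.read until it returns b''
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

# Demonstration on a throwaway file in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'demo.txt')
    with open(demo_path, 'wb') as f:
        f.write(b'hello backup')
    checksum = file_checksum(demo_path)
    print(checksum)
```

Comparing the checksum of each source file with the checksum of its copy is one way to implement the check in `verify_backup`.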

## Final Project: Complete File Management System

Combine everything you've learned into a comprehensive file management system:

In [None]:
# Your final challenge: Create a complete file management system
# Features: file operations, data processing, monitoring, and backup

class FileManager:
    """Complete file management system"""

    def __init__(self, workspace_dir):
        self.workspace = workspace_dir
        self.ensure_workspace()

    def ensure_workspace(self):
        """Create workspace directory structure"""
        # Create necessary directories
        pass

    def process_csv_data(self, csv_file):
        """Process CSV data and generate reports"""
        # Implement CSV processing
        pass

    def monitor_logs(self, log_file):
        """Monitor log files and generate alerts"""
        # Implement log monitoring
        pass

    def backup_files(self, source_pattern):
        """Backup files matching pattern"""
        # Implement backup functionality
        pass

    def generate_dashboard(self):
        """Generate HTML dashboard with all information"""
        # Create comprehensive dashboard
        pass

# Your implementation here:
# Create an instance and test all features

print("Final Project: File Management System")
print("Implement all the features you've learned:")
print("- File reading/writing with error handling")
print("- CSV and JSON processing")
print("- Log analysis and monitoring")
print("- Backup and versioning")
print("- Dashboard generation")
print("\nGood luck!")


# Conclusion and Next Steps

## What You've Accomplished

Congratulations! You've completed a comprehensive journey through Python file handling. Here is what you covered:

### Core Skills Mastered
- **Basic File Operations**: open, read, write, close
- **Safe File Handling**: Using `with` statements
- **Error Handling**: Graceful exception management
- **Text Encoding**: UTF-8 and international characters

### Advanced Techniques
- **CSV Processing**: Reading and writing structured data
- **JSON Handling**: Configuration and API data
- **Log Analysis**: Real-world text processing
- **File Modes**: Understanding r, w, a, x modes

### Professional Practices
- **Best Practices**: Production-ready code patterns
- **Error Recovery**: Robust error handling
- **Performance**: Memory-efficient file processing
- **Security**: Safe file operations

## Next Steps in Your Python Journey

### Immediate Applications
- **Data Analysis**: Process real datasets with pandas
- **Web Development**: Handle user uploads and configuration
- **Automation**: Create file processing scripts
- **System Administration**: Log analysis and monitoring

### Advanced Topics to Explore
- **Binary Files**: Images, videos, and binary data
- **Database Integration**: SQLite and file-based databases
- **Network Files**: FTP, HTTP file operations
- **Compression**: ZIP, GZIP file handling
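
As one example from this list, the standard-library `gzip` module provides an `open` function that mirrors the built-in one, so the `with` pattern used throughout this tutorial carries over. This is a minimal sketch on a temporary file. The file name and text are made up for illustration.

```python
import gzip
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'notes.txt.gz')

    # 'wt' and 'rt' select text mode; without the 't', gzip.open works in binary mode
    with gzip.open(path, 'wt', encoding='utf-8') as f:
        f.write('compressed line\n')

    with gzip.open(path, 'rt', encoding='utf-8') as f:
        restored = f.read()

    print(restored)
```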

### Recommended Libraries
- **Pandas**: Advanced data file processing
- **Pathlib**: Modern file path handling
- **Openpyxl**: Excel file manipulation
- **Requests**: Download files from web APIs
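
As a small taste of `pathlib` from the list above, the sketch below builds a path with the `/` operator, writes and reads a text file, and lists matching files. It works in a temporary directory, and the directory and file names are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # The / operator joins path components
    workspace = Path(tmp) / 'notes'
    workspace.mkdir()

    note = workspace / 'today.txt'
    note.write_text('Practiced pathlib\n', encoding='utf-8')

    content = note.read_text(encoding='utf-8')
    names = sorted(p.name for p in workspace.glob('*.txt'))
    suffix = note.suffix
    print(content, names, suffix)
```

`write_text` and `read_text` open and close the file internally, which makes them convenient for small files. For larger files, `Path.open()` can be used in a `with` block just like the built-in `open`.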

## Keep Practicing!

The best way to master file handling is through practice with real data:

1. **Find real datasets** online (Kaggle, government data)
2. **Build practical projects** (log analyzers, data processors)
3. **Contribute to open source** projects using file handling
4. **Create your own tools** for daily file management tasks

---

**Thank you for completing this tutorial!**

You're now equipped with solid file handling skills that will serve you well in your Python programming journey. Remember: practice makes perfect, and real-world projects are the best teachers.

**Happy Coding!**