# Python Workshop: File Handling

## Learning Objectives

By the end of this section, you will be able to:
- Open and read from files in Python
- Write data to files
- Use context managers to ensure files are closed reliably, even when an error occurs
- Understand different file modes and their use cases
- Handle exceptions related to file operations

## 1. Introduction to Files

Python makes it easy to work with files. A file is opened either as text (decoded to `str` using an encoding) or as binary (read and written as raw `bytes`), and files let a program store data that persists after it exits.

### Different Modes for Opening Files

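- **'r'**: Read (default) - Opens an existing file for reading (fails if the file does not exist)
- **'w'**: Write - Opens file for writing (creates the file if needed, truncates it if it exists)
- **'a'**: Append - Opens file for writing at the end (creates the file if needed)
- **'x'**: Exclusive create - Creates a new file (fails with `FileExistsError` if it exists)
- **'b'**: Binary mode - Combine with other modes (e.g., 'rb', 'wb')
- **'t'**: Text mode (default) - Combine with other modes (e.g., 'rt')
- **'+'**: Update - Combine with other modes to allow both reading and writing (e.g., 'r+', 'w+')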

Let's explore these modes with examples.
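
As a quick illustration of how `'w'`, `'a'`, and `'x'` differ, here is a minimal sketch. It works in a temporary directory so it does not touch the files used in the rest of this lesson; the file name `modes_demo.txt` is made up for the example.

```python
import os
import tempfile

# Work in a throwaway directory so nothing in the notebook folder is touched
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes_demo.txt")  # hypothetical file name

    with open(path, "w") as f:   # 'w' creates the file
        f.write("first\n")
    with open(path, "a") as f:   # 'a' keeps existing content and adds to the end
        f.write("second\n")
    with open(path, "r") as f:
        after_append = f.read()

    with open(path, "w") as f:   # 'w' again truncates the existing content
        f.write("replaced\n")
    with open(path, "r") as f:
        after_overwrite = f.read()

    try:
        with open(path, "x") as f:  # 'x' refuses to open an existing file
            f.write("never written\n")
        x_failed = False
    except FileExistsError:
        x_failed = True

print(repr(after_append))
print(repr(after_overwrite))
print(f"'x' mode failed on existing file: {x_failed}")
```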

In [None]:
# Let's create a sample file first to work with
sample_content = """Line 1: Welcome to Python file handling!
Line 2: This is a sample text file.
Line 3: We'll use this for demonstration.
Line 4: Python makes file handling easy.
Line 5: Remember to always close your files!
"""

# Create a sample file
with open('sample.txt', 'w') as file:
    file.write(sample_content)

print("Created sample.txt with content:")
print(sample_content)

## 2. Reading from Files

There are several ways to read from files in Python. Let's explore the most common methods.

### Basic File Reading (Not Recommended)

In [None]:
# Basic file reading - NOT recommended: if read() raises an exception, close() is never called
print("Basic file reading (without 'with' statement):")
file = open('sample.txt', 'r')
content = file.read()  # Reads the whole file
print(content)
file.close()  # Don't forget to close!

print("File closed successfully")

### Reading with Context Manager (Recommended)

In [None]:
# Using 'with' statement - RECOMMENDED approach
print("Reading entire file with 'with' statement:")
with open('sample.txt', 'r') as file:
    content = file.read()
    print(content)
# File is automatically closed when exiting the 'with' block

print("File automatically closed after 'with' block")
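
To see what the `with` statement does for us, the sketch below compares it with a manual `try`/`finally` and checks the file's `closed` attribute after a simulated error. The file name and the `ValueError` are illustrative assumptions, not part of the lesson's files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("hello\n")

    # Manual version: try/finally guarantees close() runs
    manual_file = open(path, "r")
    try:
        manual_content = manual_file.read()
    finally:
        manual_file.close()

    # Context-manager version: the file is closed even though an error occurs
    try:
        with open(path, "r") as managed_file:
            open_inside_block = not managed_file.closed
            raise ValueError("simulated failure while processing")
    except ValueError:
        pass
    closed_after_error = managed_file.closed

print(f"Manual file closed: {manual_file.closed}")
print(f"Open inside 'with' block: {open_inside_block}")
print(f"Closed after error: {closed_after_error}")
```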

### Different Reading Methods

In [None]:
print("1. Reading entire file with read():")
with open('sample.txt', 'r') as file:
    content = file.read()
    print(repr(content))  # repr shows \n characters

print("\n2. Reading all lines into a list with readlines():")
with open('sample.txt', 'r') as file:
    lines = file.readlines()
    print(f"Number of lines: {len(lines)}")
    for i, line in enumerate(lines, 1):
        print(f"Line {i}: {repr(line)}")

print("\n3. Reading one line at a time with readline():")
with open('sample.txt', 'r') as file:
    line_num = 1
    while True:
        line = file.readline()
        if not line:  # End of file
            break
        print(f"Line {line_num}: {line.strip()}")
        line_num += 1

### Reading Line by Line (Memory Efficient)

In [None]:
print("Iterating through file line by line (most efficient):")
with open('sample.txt', 'r') as file:
    for line_num, line in enumerate(file, 1):
        print(f"Line {line_num}: {line.strip()}")

print("\nThis method is memory-efficient for large files!")
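
Because a file object yields one line at a time, it can be passed straight to functions such as `sum()` or `max()` without building a list first. A small sketch, assuming a generated 1000-line file with a made-up name:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "many_lines.txt")  # hypothetical file name
    with open(path, "w") as f:
        for i in range(1000):
            f.write(f"row {i}\n")

    # Count lines lazily: only one line is held in memory at a time
    with open(path, "r") as f:
        line_count = sum(1 for _ in f)

    # Find the longest line (ignoring the trailing newline) the same way
    with open(path, "r") as f:
        longest = max(len(line.rstrip("\n")) for line in f)

print(f"Lines: {line_count}, longest line: {longest} characters")
```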

### Reading Specific Number of Characters

In [None]:
print("Reading specific number of characters:")
with open('sample.txt', 'r') as file:
    # Read first 20 characters
    chunk1 = file.read(20)
    print(f"First 20 characters: {repr(chunk1)}")
    
    # Read next 15 characters
    chunk2 = file.read(15)
    print(f"Next 15 characters: {repr(chunk2)}")
    
    # Read the rest
    rest = file.read()
    print(f"Rest of the file: {repr(rest[:50])}...")  # Show first 50 chars
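
Each `read(n)` call continues from where the previous one stopped. To start over without reopening the file, `seek(0)` moves the position back to the beginning. A minimal sketch with an illustrative alphabet file:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "alphabet.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("abcdefghijklmnopqrstuvwxyz\n")

    with open(path, "r") as f:
        first = f.read(5)    # characters 1-5
        second = f.read(5)   # continues with characters 6-10
        f.seek(0)            # rewind to the start of the file
        again = f.read(5)    # same as the first chunk
        f.read()             # consume the rest
        at_eof = f.read(5)   # nothing left: returns an empty string

print(first, second, again, repr(at_eof))
```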

## 3. Writing to Files

Writing to files is straightforward in Python. Let's explore different writing modes.

### Writing to a New File

In [None]:
# Writing to a new file (or overwriting existing file)
print("Writing to output.txt:")
with open('output.txt', 'w') as file:
    file.write("Hello, World!\n")
    file.write("Python is great for file handling.\n")
    file.write("This file was created using Python.\n")

# Verify the content
print("Content of output.txt:")
with open('output.txt', 'r') as file:
    print(file.read())

### Writing Multiple Lines at Once

In [None]:
# Writing multiple lines using writelines()
lines_to_write = [
    "This is line 1\n",
    "This is line 2\n",
    "This is line 3\n",
    "This is the final line\n"
]

with open('multiple_lines.txt', 'w') as file:
    file.writelines(lines_to_write)

print("Content of multiple_lines.txt:")
with open('multiple_lines.txt', 'r') as file:
    print(file.read())
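
Note that `writelines()` does not add line endings for you, which is why every string in the list above ends with `\n`. The sketch below shows what happens without them and one way to add them on the fly; the list of fruit names is just sample data.

```python
import os
import tempfile

items = ["apples", "bananas", "cherries"]  # sample data without newlines

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "items.txt")  # hypothetical file name

    # Without newlines, everything ends up on one line
    with open(path, "w") as f:
        f.writelines(items)
    with open(path, "r") as f:
        run_together = f.read()

    # Add the newline while writing, using a generator expression
    with open(path, "w") as f:
        f.writelines(f"{item}\n" for item in items)
    with open(path, "r") as f:
        one_per_line = f.read().splitlines()

print(repr(run_together))
print(one_per_line)
```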

### Appending to Files

In [None]:
# Appending to an existing file
print("Before appending - output.txt content:")
with open('output.txt', 'r') as file:
    print(file.read())

print("Appending new content...")
with open('output.txt', 'a') as file:
    file.write("This line is appended to the file.\n")
    file.write("So is this one!\n")

print("After appending - output.txt content:")
with open('output.txt', 'r') as file:
    print(file.read())

### Writing Data Types Other Than Strings

In [None]:
# Writing numbers and other data types (must convert to string)
import datetime

data_to_write = [
    ("Name", "Alice"),
    ("Age", 25),
    ("Score", 95.5),
    ("Date", datetime.date.today()),
    ("Active", True)
]

with open('mixed_data.txt', 'w') as file:
    file.write("Mixed Data File\n")
    file.write("="*20 + "\n")
    for key, value in data_to_write:
        file.write(f"{key}: {value}\n")

print("Content of mixed_data.txt:")
with open('mixed_data.txt', 'r') as file:
    print(file.read())
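
The f-string above does the string conversion implicitly. The following sketch shows what happens if a number is passed to `write()` directly, along with two alternatives: an explicit `str()` call and `print()` with its `file` argument. The file name is an assumption for the demo.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "numbers.txt")  # hypothetical file name

    with open(path, "w") as f:
        try:
            f.write(25)  # text files only accept str
            write_rejected_int = False
        except TypeError:
            write_rejected_int = True

        f.write(str(25) + "\n")    # explicit conversion
        print(95.5, True, file=f)  # print() converts each value and adds "\n"

    with open(path, "r") as f:
        written = f.read()

print(f"write(25) raised TypeError: {write_rejected_int}")
print(repr(written))
```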

## 4. Working with Different File Types

Let's explore working with various file formats commonly used in Python.

### Working with CSV Files

In [None]:
import csv

# Writing CSV data
csv_data = [
    ['Name', 'Age', 'City', 'Salary'],
    ['Alice', '25', 'New York', '75000'],
    ['Bob', '30', 'London', '65000'],
    ['Charlie', '35', 'Tokyo', '80000'],
    ['Diana', '28', 'Paris', '70000']
]

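# Write CSV file (newline='' lets the csv module control line endings;
# without it, extra blank rows can appear on Windows)
with open('employees.csv', 'w', newline='') as file: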
    writer = csv.writer(file)
    writer.writerows(csv_data)

print("Created employees.csv")

# Read CSV file
print("Reading CSV file:")
with open('employees.csv', 'r', newline='') as file:
    reader = csv.reader(file)
    for row_num, row in enumerate(reader, 1):
        print(f"Row {row_num}: {row}")

# Read CSV as dictionary
print("\nReading CSV as dictionary:")
with open('employees.csv', 'r', newline='') as file:
    dict_reader = csv.DictReader(file)
    for row in dict_reader:
        print(f"{row['Name']}: {row['Age']} years old, earns ${row['Salary']}")
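
Two details are easy to miss here: every value read from a CSV file comes back as a string, and `csv.DictWriter` is the writing counterpart of `DictReader`. A minimal sketch, using a small made-up salary table in a temporary directory:

```python
import csv
import os
import tempfile

rows = [
    {"Name": "Alice", "Salary": 75000},
    {"Name": "Bob", "Salary": 65000},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "salaries.csv")  # hypothetical file name

    # Write dictionaries; the header row comes from fieldnames
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["Name", "Salary"])
        writer.writeheader()
        writer.writerows(rows)

    with open(path, "r", newline="") as f:
        loaded = list(csv.DictReader(f))

# Values come back as strings, so convert before doing arithmetic
raw_type = type(loaded[0]["Salary"]).__name__
average = sum(int(row["Salary"]) for row in loaded) / len(loaded)

print(f"Type after reading: {raw_type}")
print(f"Average salary: {average}")
```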

### Working with JSON Files

In [None]:
import json

# Creating sample data
employee_data = {
    "employees": [
        {
            "id": 1,
            "name": "Alice Johnson",
            "position": "Software Developer",
            "skills": ["Python", "JavaScript", "SQL"],
            "salary": 75000,
            "active": True
        },
        {
            "id": 2,
            "name": "Bob Smith",
            "position": "Data Analyst",
            "skills": ["Python", "R", "Excel"],
            "salary": 65000,
            "active": True
        },
        {
            "id": 3,
            "name": "Charlie Brown",
            "position": "Project Manager",
            "skills": ["Leadership", "Agile", "Communication"],
            "salary": 80000,
            "active": False
        }
    ],
    "company": "TechCorp",
    "last_updated": "2025-07-22"
}

# Writing JSON data
print("Writing JSON data to employees.json:")
with open('employees.json', 'w') as json_file:
    json.dump(employee_data, json_file, indent=2)
print("JSON file created successfully!")

# Reading JSON data
print("\nReading JSON data:")
with open('employees.json', 'r') as json_file:
    loaded_data = json.load(json_file)

print(f"Company: {loaded_data['company']}")
print(f"Last updated: {loaded_data['last_updated']}")
print("\nEmployees:")
for employee in loaded_data['employees']:
    status = "Active" if employee['active'] else "Inactive"
    print(f"- {employee['name']}: {employee['position']} (${employee['salary']:,}) - {status}")
    print(f"  Skills: {', '.join(employee['skills'])}")
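
JSON only knows a handful of types, so a round trip can change your data. The sketch below uses `json.dumps()` and `json.loads()` (the string-based counterparts of `dump()` and `load()`) to show that a `date` is rejected by default and that a tuple comes back as a list. Passing `default=str` is one simple workaround, chosen here for illustration; the record itself is sample data.

```python
import json
from datetime import date

record = {"name": "Alice", "joined": date(2025, 7, 22), "scores": (90, 95)}

# A date object is not JSON serializable by default
try:
    json.dumps(record)
    plain_dump_failed = False
except TypeError:
    plain_dump_failed = True

# default=str converts any unsupported object with str()
text = json.dumps(record, default=str)
restored = json.loads(text)

print(f"Plain dumps() failed: {plain_dump_failed}")
print(restored)  # the date is now a string and the tuple is now a list
```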

### Working with Text Files (Advanced)

In [None]:
# Working with different text encodings
text_content = """
This file contains special characters:
• Bullet points
© Copyright symbol
€ Euro symbol
中文 (Chinese characters)
🐍 Python emoji
"""

# Write with UTF-8 encoding
with open('unicode_text.txt', 'w', encoding='utf-8') as file:
    file.write(text_content)

# Read with UTF-8 encoding
print("Reading Unicode text file:")
with open('unicode_text.txt', 'r', encoding='utf-8') as file:
    content = file.read()
    print(content)

# Get file information
import os
file_size = os.path.getsize('unicode_text.txt')
print(f"File size: {file_size} bytes")
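
The file size in bytes is usually larger than the number of characters, because UTF-8 uses more than one byte for non-ASCII characters. A short sketch of that difference, and of what happens when the wrong encoding is used to decode the bytes (the string `"€5"` is just a sample):

```python
text = "€5"  # sample string: one non-ASCII character and one ASCII character

utf8_bytes = text.encode("utf-8")
char_count = len(text)        # number of characters
byte_count = len(utf8_bytes)  # the euro sign takes 3 bytes in UTF-8

# Decoding with the wrong encoding raises UnicodeDecodeError
try:
    utf8_bytes.decode("ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# errors='replace' substitutes U+FFFD for bytes that cannot be decoded
lossy = utf8_bytes.decode("ascii", errors="replace")

print(f"{char_count} characters, {byte_count} bytes")
print(f"ASCII decoding failed: {ascii_failed}")
print(repr(lossy))
```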

### Working with Binary Files

In [None]:
# Create a simple binary file (simulating image data)
binary_data = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])  # PNG signature
binary_data += b"This is simulated binary data for demonstration." * 10

# Write binary data
with open('sample_binary.dat', 'wb') as file:
    file.write(binary_data)

print(f"Created binary file with {len(binary_data)} bytes")

# Read binary data
with open('sample_binary.dat', 'rb') as file:
    read_data = file.read()
    print(f"Read {len(read_data)} bytes from binary file")
    print(f"First 20 bytes: {read_data[:20]}")
    print(f"As hex: {read_data[:20].hex()}")

# Read specific chunks of binary data
print("\nReading binary file in chunks:")
with open('sample_binary.dat', 'rb') as file:
    chunk_size = 32
    chunk_num = 1
    while True:
        chunk = file.read(chunk_size)
        if not chunk:
            break
        print(f"Chunk {chunk_num}: {len(chunk)} bytes")
        chunk_num += 1
        if chunk_num > 3:  # Limit output for demo
            break
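
The chunk loop can be written more compactly with an assignment expression (`:=`, available from Python 3.8). This sketch reads from an in-memory `io.BytesIO` object, which behaves like a file opened in `'rb'` mode, so no file on disk is needed; the data and chunk size are arbitrary choices.

```python
import io

data = bytes(range(256)) * 4   # 1024 bytes of sample data
stream = io.BytesIO(data)      # in-memory stand-in for a binary file

chunk_sizes = []
# read() returns b"" at the end of the stream, which ends the loop
while chunk := stream.read(100):
    chunk_sizes.append(len(chunk))

print(f"Chunks read: {len(chunk_sizes)}")
print(f"Last chunk: {chunk_sizes[-1]} bytes")
print(f"Total: {sum(chunk_sizes)} bytes")
```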

## 5. Exception Handling in File Operations

Proper error handling is crucial when working with files. Let's explore common scenarios and how to handle them.

### Handling File Not Found Errors

In [None]:
# Attempting to read a non-existent file
print("Trying to read a non-existent file:")
try:
    with open('non_existent_file.txt', 'r') as file:
        content = file.read()
        print(content)
except FileNotFoundError:
    print("❌ Error: The file does not exist!")
except OSError:  # IOError is an alias of OSError in Python 3
    print("❌ Error: An I/O error occurred.")

print("Program continues running after handling the error.")
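
A common variation is to fall back to a default value when the file is missing. The helper below, `read_or_default`, is a hypothetical function written for this sketch; it uses the `else` clause of `try` so that only the `open()` call is guarded by the `except`.

```python
import os
import tempfile

def read_or_default(path, default=""):
    """Return the file's text, or `default` if the file does not exist."""
    try:
        f = open(path, "r", encoding="utf-8")
    except FileNotFoundError:
        return default
    else:
        # Runs only if open() succeeded; 'with' still closes the file
        with f:
            return f.read()

with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "settings.txt")  # hypothetical file names
    missing = os.path.join(tmp, "missing.txt")
    with open(existing, "w", encoding="utf-8") as f:
        f.write("theme=dark\n")

    from_existing = read_or_default(existing, default="theme=light\n")
    from_missing = read_or_default(missing, default="theme=light\n")

print(repr(from_existing))
print(repr(from_missing))
```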

### Handling Permission Errors

In [None]:
# Simulating permission errors and other common issues
import os

def safe_file_operation(filename, mode='r'):
    """Safely perform file operations with comprehensive error handling."""
    try:
        with open(filename, mode) as file:
            if 'r' in mode:
                return file.read()
            # Any other mode ('w', 'a', 'x') opens the file for writing
            file.write("Test content")
            return f"Successfully wrote to {filename}"

    except FileNotFoundError:
        return f"❌ Error: File '{filename}' not found"
    except FileExistsError:  # Raised by 'x' mode when the file already exists
        return f"❌ Error: File '{filename}' already exists"
    except PermissionError:
        return f"❌ Error: Permission denied for '{filename}'"
    except IsADirectoryError:
        return f"❌ Error: '{filename}' is a directory, not a file"
    except OSError as e:  # IOError is an alias of OSError in Python 3
        return f"❌ I/O Error: {e}"
    except Exception as e:
        return f"❌ Unexpected error: {e}"

# Test various scenarios
test_files = [
    ('sample.txt', 'r'),  # Should work
    ('missing_file.txt', 'r'),  # File not found
    ('.', 'r'),  # Directory instead of file (reported as PermissionError on Windows)
]

for filename, mode in test_files:
    result = safe_file_operation(filename, mode)
    print(f"Testing '{filename}': {result}")
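
All of the file-related exceptions used above are subclasses of `OSError`, and the exception object carries details that are useful in error messages. A small sketch that inspects them, using a made-up path inside an empty temporary directory so the file is guaranteed to be missing:

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "no_such_file.txt")  # hypothetical file name
    try:
        open(missing, "r")
    except OSError as exc:
        caught = exc  # keep a reference; 'exc' is cleared after the except block

print(f"Exception type: {type(caught).__name__}")
print(f"Is an OSError: {isinstance(caught, OSError)}")
print(f"errno is ENOENT: {caught.errno == errno.ENOENT}")
print(f"filename: {caught.filename}")
print(f"strerror: {caught.strerror}")
```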

### Creating a Robust File Reader Function

In [None]:
import os

def robust_file_reader(filename, encoding='utf-8', max_size_mb=10):
    """
    A robust function to read files with comprehensive error handling.
    
    Args:
        filename (str): Path to the file
        encoding (str): Text encoding to use
        max_size_mb (int): Maximum file size in MB to read
    
    Returns:
        tuple: (success: bool, content: str or error_message: str)
    """
    try:
        # Check if file exists
        if not os.path.exists(filename):
            return False, f"File '{filename}' does not exist"
        
        # Check if it's a file (not a directory)
        if not os.path.isfile(filename):
            return False, f"'{filename}' is not a file"
        
        # Check file size
        file_size_mb = os.path.getsize(filename) / (1024 * 1024)
        if file_size_mb > max_size_mb:
            return False, f"File too large: {file_size_mb:.2f}MB (max: {max_size_mb}MB)"
        
        # Read the file
        with open(filename, 'r', encoding=encoding) as file:
            content = file.read()
            return True, content
            
    except PermissionError:
        return False, f"Permission denied: Cannot read '{filename}'"
    except UnicodeDecodeError:
        return False, f"Encoding error: Cannot decode '{filename}' with {encoding}"
    except Exception as e:
        return False, f"Unexpected error: {str(e)}"

# Test the robust file reader
test_files = ['sample.txt', 'employees.json', 'missing_file.txt']

for filename in test_files:
    success, result = robust_file_reader(filename)
    if success:
        print(f"✅ Successfully read '{filename}': {len(result)} characters")
        print(f"   Preview: {result[:100]}..." if len(result) > 100 else f"   Content: {result}")
    else:
        print(f"❌ Failed to read '{filename}': {result}")
    print()
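
When the encoding of a file is not known in advance, one possible extension is to try a short list of encodings in order. The function `read_with_fallback` below is a hypothetical helper, not part of the reader above. Keep in mind that `latin-1` can decode any byte sequence, so it never fails but may produce the wrong characters; treat this as a sketch rather than a reliable detection method.

```python
import os
import tempfile

def read_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Try each encoding in turn; return (text, encoding_used)."""
    for encoding in encodings:
        try:
            with open(path, "r", encoding=encoding) as f:
                return f.read(), encoding
        except UnicodeDecodeError:
            continue  # try the next encoding in the list
    raise ValueError(f"Could not decode '{path}' with any of {encodings}")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "legacy.txt")  # hypothetical file name
    # b"\xe9" is 'é' in Latin-1 but is not valid UTF-8 on its own
    with open(path, "wb") as f:
        f.write(b"caf\xe9\n")

    text, used = read_with_fallback(path)

print(repr(text), used)
```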

## 6. File System Operations

Let's explore some useful file system operations using the `os` module.

In [None]:
import os
from datetime import datetime

# Get information about files we created
files_to_check = ['sample.txt', 'output.txt', 'employees.csv', 'employees.json']

print("File Information:")
print("=" * 60)

for filename in files_to_check:
    if os.path.exists(filename):
        # Get file stats
        stat = os.stat(filename)
        size = stat.st_size
        modified_time = datetime.fromtimestamp(stat.st_mtime)
        
        print(f"📄 {filename}")
        print(f"   Size: {size} bytes")
        print(f"   Modified: {modified_time.strftime('%Y-%m-%d %H:%M:%S')}")
        print(f"   Is file: {os.path.isfile(filename)}")
        print(f"   Is readable: {os.access(filename, os.R_OK)}")
        print(f"   Is writable: {os.access(filename, os.W_OK)}")
        print()
    else:
        print(f"❌ {filename}: File not found")

# List all files in current directory
print("Files in current directory:")
current_files = [f for f in os.listdir('.') if os.path.isfile(f)]
for i, filename in enumerate(sorted(current_files), 1):
    print(f"{i:2d}. {filename}")
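
For larger directories, `os.scandir()` is an alternative to `os.listdir()` that returns entry objects with `is_file()` and `stat()` methods, so the name does not have to be joined back onto the directory path. A minimal sketch on a temporary directory containing two made-up files and one subdirectory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create two sample files and one subdirectory
    for name, content in [("a.txt", b"abc"), ("b.txt", b"hello")]:
        with open(os.path.join(tmp, name), "wb") as f:
            f.write(content)
    os.mkdir(os.path.join(tmp, "subdir"))

    # Collect the size of every regular file, skipping directories
    with os.scandir(tmp) as entries:
        sizes = {entry.name: entry.stat().st_size
                 for entry in entries if entry.is_file()}

for name in sorted(sizes):
    print(f"{name}: {sizes[name]} bytes")
```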

## 7. Working with File Paths

The `pathlib` module provides modern, cross-platform file path handling.

In [None]:
from pathlib import Path
import os

# Create Path objects
current_dir = Path('.')
sample_file = Path('output.txt')

print("Working with Path objects:")
print(f"Current directory: {current_dir.absolute()}")
print(f"Sample file: {sample_file.absolute()}")
print(f"File exists: {sample_file.exists()}")
print(f"File suffix: {sample_file.suffix}")
print(f"File stem (name without extension): {sample_file.stem}")
print(f"Parent directory: {sample_file.parent}")

# Create directory structure
data_dir = Path('data')
data_dir.mkdir(exist_ok=True)  # Create directory if it doesn't exist

# Back up a file by copying it (copy2 also preserves metadata; the original stays in place)
import shutil

backup_file = data_dir / 'sample_backup.txt'
shutil.copy2(sample_file, backup_file)
print(f"\nCreated backup: {backup_file}")

# List files with specific extension
print("\nText files in current directory:")
text_files = list(Path('.').glob('*.txt'))
for txt_file in text_files:
    print(f"- {txt_file}")
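
`Path` objects can also read, write, and move files directly. The sketch below uses `write_text()`, `read_text()`, and `rename()` in a temporary directory; unlike the copy above, `rename()` really moves the file. The file and directory names are illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    note = base / "notes.txt"  # hypothetical file name

    # Write and read without an explicit open() call
    note.write_text("hello pathlib\n", encoding="utf-8")
    content = note.read_text(encoding="utf-8")

    # Move the file into a subdirectory
    archive = base / "archive"
    archive.mkdir()
    target = archive / note.name
    note.rename(target)

    original_exists = note.exists()
    moved_exists = target.exists()
    moved_relative = target.relative_to(base).as_posix()

print(repr(content))
print(f"Original still exists: {original_exists}")
print(f"Moved to: {moved_relative} (exists: {moved_exists})")
```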

## Exercises

Let's practice what we've learned with some hands-on exercises!

### 1. Reading from a File
- Create a text file called `sample.txt` with several lines of text.
- Write a Python program to read and print each line from the file.

In [None]:
# Your solution here


### 2. Writing to a File
- Write a program that prompts a user for their name and saves it to a file `usernames.txt`.
- Each new name should be appended to the file, not overwrite existing entries.

In [None]:
# Your solution here


### 3. Working with JSON
- Create a dictionary with some personal information.
- Write it to a `info.json` file using the `json` module.
- Read the data back from the `info.json` file and print it.

In [None]:
# Your solution here


### 4. Binary File Handling
- Write a program to read a text file in binary mode and print the size of the read data.
- Compare the number of bytes read in binary mode with the number of characters read in text mode.

In [None]:
# Your solution here


### 5. Exception Handling
- Create a Python program that tries to read from a file that doesn't exist and gracefully handles the error, printing a custom message.
- Also handle cases where you might not have permission to read the file.

In [None]:
# Your solution here


### 6. CSV Data Processing
- Create a CSV file with student data (name, grade, subject).
- Write a program to read the CSV and calculate the average grade.

In [None]:
# Your solution here


## Cleanup

Let's clean up all the files we created during this lesson.

In [None]:
import os
import shutil
from pathlib import Path

# List of files and directories to clean up
files_to_remove = [
    'sample.txt', 'output.txt', 'multiple_lines.txt', 'mixed_data.txt',
    'employees.csv', 'employees.json', 'unicode_text.txt', 'sample_binary.dat'
]

dirs_to_remove = ['data']

print("Cleaning up files and directories:")

# Remove files
for filename in files_to_remove:
    if os.path.exists(filename):
        os.remove(filename)
        print(f"✅ Removed: {filename}")
    else:
        print(f"⚠️  File not found: {filename}")

# Remove directories
for dirname in dirs_to_remove:
    if os.path.exists(dirname):
        shutil.rmtree(dirname)
        print(f"✅ Removed directory: {dirname}")
    else:
        print(f"⚠️  Directory not found: {dirname}")

print("\n🧹 Cleanup complete!")

## Key Takeaways

### File Handling Best Practices:
- **Always use context managers** (`with` statement) to ensure files are properly closed
- **Handle exceptions** appropriately to make your programs robust
- **Choose the right file mode** for your needs (read, write, append, binary)
- **Use appropriate encoding** (usually UTF-8) for text files
- **Check file existence and permissions** when it helps produce a clear message, but still handle exceptions: a file can change between the check and the `open()` call

### Common File Operations:
- **Reading**: `read()`, `readline()`, `readlines()`, or iterate line by line
- **Writing**: `write()`, `writelines()` for text; use binary mode for non-text data
- **Working with structured data**: Use `json` module for JSON, `csv` module for CSV files
- **Path handling**: Use `pathlib` for modern, cross-platform path operations

### Error Handling:
- `FileNotFoundError`: File doesn't exist
- `PermissionError`: Insufficient permissions
- `IsADirectoryError`: Path points to directory, not file
- `UnicodeDecodeError`: Encoding issues with text files

### Memory Considerations:
- For large files, read line by line instead of loading entire file into memory
- Use appropriate chunk sizes for binary file processing
- Consider file size limits in production applications

File handling is a fundamental skill in Python programming. Master these concepts, and you'll be able to work with data persistence, configuration files, logs, and much more!