# Exercise 07: File Operations

**Learning Objectives:**
- Read and write text files in Python
- Understand different file modes ('r', 'w', 'a')
- Use 'with' statements for proper file handling
- Work with file paths safely
- Read and write CSV files
- Process data files with real-world examples

**Estimated Time:** 75-90 minutes

**Prerequisites:** Ex01-Ex06 completed

---

## 📚 Recommended Reading:
- [Python File I/O Documentation](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files)
- [Real Python Working with Files](https://realpython.com/working-with-files-in-python/)

---

## 🎯 Part 1: Reading Text Files

In [None]:
# First, let's create a sample file to work with
sample_content = """Hello, Python!
This is a sample text file.
We will use this to practice file operations.
Line 4: Reading files is important for data processing.
Line 5: Always remember to close files properly."""

# Write the sample file
with open('sample.txt', 'w') as file:
    file.write(sample_content)

print("Sample file created!")

# Reading the entire file
with open('sample.txt', 'r') as file:
    content = file.read()
    print("Full content:")
    print(content)

# Reading line by line
print("\nReading line by line:")
with open('sample.txt', 'r') as file:
    for line_number, line in enumerate(file, 1):
        print(f"Line {line_number}: {line.strip()}")
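
The cell above uses `read()` and iteration over the file object. As a rough comparison, the sketch below puts `read()`, `readline()`, and `readlines()` side by side. The scratch file `reading_demo.txt` and the temporary directory are assumptions made for illustration, so the example does not depend on `sample.txt`.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the exercise files)
demo_path = os.path.join(tempfile.mkdtemp(), "reading_demo.txt")

with open(demo_path, "w", encoding="utf-8") as file:
    file.write("first line\nsecond line\nthird line\n")

# read() returns the whole file as a single string
with open(demo_path, "r", encoding="utf-8") as file:
    whole_text = file.read()

# readline() returns one line per call, with its trailing newline
with open(demo_path, "r", encoding="utf-8") as file:
    first_line = file.readline()
    second_line = file.readline()

# readlines() returns a list of lines, each still ending in "\n"
with open(demo_path, "r", encoding="utf-8") as file:
    all_lines = file.readlines()

print(repr(whole_text))
print(repr(first_line), repr(second_line))
print(all_lines)
```

For large files, iterating over the file object (as in the cell above) is usually the better choice, because it keeps only one line in memory at a time.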

### TODO 1.1: File Reading Practice

In [None]:
# TODO: Read the sample.txt file and count:
# - Number of lines
# - Number of words
# - Number of characters

# TODO: Implement the counting function
# def analyze_file(filename):
#     with open(filename, 'r') as file:
#         lines = file.readlines()
#         
#     line_count = len(lines)
#     word_count = 0
#     char_count = 0
#     
#     for line in lines:
#         words = line.split()
#         word_count += len(words)
#         char_count += len(line)
#     
#     return line_count, word_count, char_count

# TODO: Test your function
# lines, words, chars = analyze_file('sample.txt')
# print("File analysis:")
# print(f"  Lines: {lines}")
# print(f"  Words: {words}")
# print(f"  Characters: {chars}")

print("File reading practice completed!")

## 🎯 Part 2: Writing Text Files

In [None]:
# Writing to a new file
students = ["Alice", "Bob", "Charlie", "Diana"]
grades = [85, 92, 78, 88]

with open('grades.txt', 'w') as file:
    file.write("Student Grade Report\n")
    file.write("=" * 20 + "\n")
    for student, grade in zip(students, grades):
        file.write(f"{student}: {grade}\n")

print("Grades file created!")

# Verify by reading it back
with open('grades.txt', 'r') as file:
    print("Contents of grades.txt:")
    print(file.read())
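
Besides `file.write()`, two other standard ways to write text are `print(..., file=...)` and `file.writelines()`. The sketch below rebuilds a shorter version of the grade report with both. The file name `writing_demo.txt` and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical output file in a temporary directory
report_path = os.path.join(tempfile.mkdtemp(), "writing_demo.txt")

students = ["Alice", "Bob", "Charlie", "Diana"]
grades = [85, 92, 78, 88]

with open(report_path, "w", encoding="utf-8") as file:
    # print() appends the newline for us
    print("Student Grade Report", file=file)
    # writelines() adds no newlines, so each string carries its own "\n"
    file.writelines(
        f"{student}: {grade}\n" for student, grade in zip(students, grades)
    )

# Read the file back and split it into lines without newline characters
with open(report_path, "r", encoding="utf-8") as file:
    report_lines = file.read().splitlines()

print(report_lines)
```

A common mistake is to expect `writelines()` to insert line breaks; it simply writes each string as given.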

### TODO 2.1: Create a Log File

In [None]:
# TODO: Create a function that writes log entries to a file
# Each entry should include timestamp, level, and message

# from datetime import datetime

# def write_log_entry(filename, level, message):
#     """Write a log entry to the specified file."""
#     timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
#     log_entry = f"[{timestamp}] {level}: {message}\n"
#     
#     with open(filename, 'a') as file:  # 'a' for append mode
#         file.write(log_entry)

# TODO: Test your logging function
# write_log_entry('app.log', 'INFO', 'Application started')
# write_log_entry('app.log', 'DEBUG', 'Loading configuration')
# write_log_entry('app.log', 'ERROR', 'Database connection failed')
# write_log_entry('app.log', 'INFO', 'Application ended')

# TODO: Read and display the log file
# print("Log file contents:")
# with open('app.log', 'r') as file:
#     print(file.read())

print("Log file practice completed!")

## 🎯 Part 3: File Modes and Path Handling

In [None]:
import os

# Different file modes
print("File modes demonstration:")

# Write mode ('w') - overwrites existing file
with open('demo.txt', 'w') as file:
    file.write("This is the original content.\n")

# Append mode ('a') - adds to end of file
with open('demo.txt', 'a') as file:
    file.write("This line is appended.\n")
    file.write("Another appended line.\n")

# Read the result
with open('demo.txt', 'r') as file:
    print("Demo file contents:")
    print(file.read())

# Check if file exists
if os.path.exists('demo.txt'):
    print("Demo file exists")
    file_size = os.path.getsize('demo.txt')
    print(f"File size: {file_size} bytes")
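
The cell above checks paths with `os.path`. The standard library's `pathlib` module offers an object-based alternative that many people find easier to read. The sketch below is one possible way to build and inspect paths with it; the `notes` folder and the file names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical working directory so the sketch leaves the exercise files alone
base_dir = Path(tempfile.mkdtemp())

# The "/" operator joins path parts with the correct separator for the OS
demo_file = base_dir / "notes" / "demo.txt"
missing_file = base_dir / "missing.txt"

# Create the parent folder if needed, then write and read in one call each
demo_file.parent.mkdir(parents=True, exist_ok=True)
demo_file.write_text("This is the original content.\n", encoding="utf-8")
content = demo_file.read_text(encoding="utf-8")

print(demo_file.name, demo_file.suffix)
print(demo_file.exists(), missing_file.exists())
print(f"File size: {demo_file.stat().st_size} bytes")
```

Building paths from parts, rather than concatenating strings with `/` or `\`, keeps the code portable between Windows, macOS, and Linux.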

### TODO 3.1: Safe File Operations

In [None]:
# TODO: Create a function that safely reads a file
# Handle the case where the file doesn't exist

# def safe_read_file(filename):
#     """Safely read a file, return None if file doesn't exist."""
#     try:
#         with open(filename, 'r') as file:
#             return file.read()
#     except FileNotFoundError:
#         print(f"File '{filename}' not found.")
#         return None
#     except (OSError, UnicodeDecodeError) as e:  # e.g. permission denied or undecodable bytes
#         print(f"Error reading file: {e}")
#         return None

# TODO: Test with existing and non-existing files
# content1 = safe_read_file('sample.txt')  # Should work
# if content1 is not None:  # an empty file returns '', which is falsy but still a successful read
#     print("Successfully read sample.txt")
#     print(f"First 50 characters: {content1[:50]}...")

# content2 = safe_read_file('nonexistent.txt')  # Should handle gracefully
# if content2 is None:
#     print("Gracefully handled missing file")

print("Safe file operations completed!")
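
Safe reading has a counterpart on the writing side: mode `'w'` silently overwrites an existing file. A minimal sketch of one way to guard against that uses exclusive-creation mode `'x'`, which raises `FileExistsError` if the file is already there. The helper name `safe_create_file` and the file name `exclusive_demo.txt` are hypothetical.

```python
import os
import tempfile

def safe_create_file(filename, text):
    """Create a new file; return False instead of overwriting an existing one."""
    try:
        # 'x' creates the file and fails if it already exists
        with open(filename, "x", encoding="utf-8") as file:
            file.write(text)
        return True
    except FileExistsError:
        print(f"File '{filename}' already exists; left unchanged.")
        return False

# Hypothetical target file in a temporary directory
target = os.path.join(tempfile.mkdtemp(), "exclusive_demo.txt")

first_attempt = safe_create_file(target, "original\n")      # file is created
second_attempt = safe_create_file(target, "replacement\n")  # file is left alone

with open(target, "r", encoding="utf-8") as file:
    final_text = file.read()

print(first_attempt, second_attempt, repr(final_text))
```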

## 🎯 Part 4: Working with CSV Files

In [None]:
import csv

# Create sample CSV data
student_data = [
    ['Name', 'Age', 'Major', 'GPA'],
    ['Alice Johnson', 20, 'Computer Science', 3.8],
    ['Bob Smith', 21, 'Engineering', 3.6],
    ['Charlie Brown', 19, 'Mathematics', 3.9],
    ['Diana Wilson', 22, 'Physics', 3.7]
]

# Write CSV file
with open('students.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(student_data)

print("CSV file created!")

# Read CSV file
print("\nReading CSV file:")
with open('students.csv', 'r', newline='') as file:  # the csv module expects newline='' when reading too
    reader = csv.reader(file)
    for row_num, row in enumerate(reader):
        if row_num == 0:
            print(f"Headers: {row}")
        else:
            print(f"Student {row_num}: {row}")
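
The cell above works with rows as lists. The `csv` module also provides `DictWriter` and `DictReader`, which address columns by header name instead of position. The sketch below is a small illustration with two of the students from above; the file name `dict_demo.csv` and the temporary directory are assumptions.

```python
import csv
import os
import tempfile

# Hypothetical CSV file in a temporary directory
csv_path = os.path.join(tempfile.mkdtemp(), "dict_demo.csv")

fieldnames = ["Name", "Age", "Major", "GPA"]
rows = [
    {"Name": "Alice Johnson", "Age": 20, "Major": "Computer Science", "GPA": 3.8},
    {"Name": "Bob Smith", "Age": 21, "Major": "Engineering", "GPA": 3.6},
]

# DictWriter needs the column order up front; writeheader() emits the header row
with open(csv_path, "w", newline="", encoding="utf-8") as file:
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

# DictReader uses the first row as keys for every following row
with open(csv_path, "r", newline="", encoding="utf-8") as file:
    loaded_rows = list(csv.DictReader(file))

# Every value comes back as a string, so numbers must be converted explicitly
first = loaded_rows[0]
print(first["Name"], type(first["Age"]).__name__, float(first["GPA"]))
```

Note that the numeric columns are read back as text. This is why analysis code usually wraps them in `int()` or `float()`.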

### TODO 4.1: CSV Data Processing

In [None]:
# TODO: Create a function to analyze the student CSV data
# Calculate average GPA, find highest/lowest, count by major

# def analyze_student_data(filename):
#     """Analyze student data from CSV file."""
#     students = []
#     
#     with open(filename, 'r', newline='') as file:
#         reader = csv.DictReader(file)  # Use DictReader for easier access
#         for row in reader:
#             students.append({
#                 'name': row['Name'],
#                 'age': int(row['Age']),
#                 'major': row['Major'],
#                 'gpa': float(row['GPA'])
#             })
#     
#     # Calculate statistics
#     gpas = [student['gpa'] for student in students]
#     avg_gpa = sum(gpas) / len(gpas)
#     max_gpa = max(gpas)
#     min_gpa = min(gpas)
#     
#     # Count by major
#     major_counts = {}
#     for student in students:
#         major = student['major']
#         major_counts[major] = major_counts.get(major, 0) + 1
#     
#     return {
#         'total_students': len(students),
#         'average_gpa': avg_gpa,
#         'highest_gpa': max_gpa,
#         'lowest_gpa': min_gpa,
#         'major_distribution': major_counts,
#         'students': students
#     }

# TODO: Test your analysis function
# stats = analyze_student_data('students.csv')
# 
# print("Student Data Analysis:")
# print(f"Total students: {stats['total_students']}")
# print(f"Average GPA: {stats['average_gpa']:.2f}")
# print(f"Highest GPA: {stats['highest_gpa']}")
# print(f"Lowest GPA: {stats['lowest_gpa']}")
# print("\nMajor distribution:")
# for major, count in stats['major_distribution'].items():
#     print(f"  {major}: {count} students")

print("CSV analysis completed!")

## 🎯 Part 5: Challenge - Process a Data File

In [None]:
# TODO: Create a comprehensive data processing system
# 1. Generate sample sales data
# 2. Write it to a CSV file
# 3. Read and analyze the data
# 4. Generate a summary report

# TODO: Generate sample sales data
# import random
# from datetime import datetime, timedelta
# 
# def generate_sales_data(num_records=50):
#     """Generate random sales data."""
#     products = ['Laptop', 'Mouse', 'Keyboard', 'Monitor', 'Headphones', 'Webcam']
#     sales_reps = ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve']
#     
#     sales_data = [['Date', 'Product', 'Quantity', 'Price', 'Sales_Rep', 'Total']]
#     
#     start_date = datetime(2024, 1, 1)
#     
#     for i in range(num_records):
#         date = start_date + timedelta(days=random.randint(0, 365))
#         product = random.choice(products)
#         quantity = random.randint(1, 5)
#         price = round(random.uniform(25.0, 500.0), 2)
#         sales_rep = random.choice(sales_reps)
#         total = round(quantity * price, 2)
#         
#         sales_data.append([
#             date.strftime('%Y-%m-%d'),
#             product,
#             quantity,
#             price,
#             sales_rep,
#             total
#         ])
#     
#     return sales_data

# TODO: Write sales data to CSV
# sales_data = generate_sales_data(30)
# with open('sales_data.csv', 'w', newline='') as file:
#     writer = csv.writer(file)
#     writer.writerows(sales_data)
# 
# print("Sales data generated and saved!")

# TODO: Analyze sales data
# def analyze_sales_data(filename):
#     """Analyze sales data and generate insights."""
#     sales = []
#     
#     with open(filename, 'r', newline='') as file:
#         reader = csv.DictReader(file)
#         for row in reader:
#             sales.append({
#                 'date': row['Date'],
#                 'product': row['Product'],
#                 'quantity': int(row['Quantity']),
#                 'price': float(row['Price']),
#                 'sales_rep': row['Sales_Rep'],
#                 'total': float(row['Total'])
#             })
#     
#     # Calculate various statistics
#     total_revenue = sum(sale['total'] for sale in sales)
#     total_quantity = sum(sale['quantity'] for sale in sales)
#     
#     # Revenue by product
#     product_revenue = {}
#     for sale in sales:
#         product = sale['product']
#         product_revenue[product] = product_revenue.get(product, 0) + sale['total']
#     
#     # Revenue by sales rep
#     rep_revenue = {}
#     for sale in sales:
#         rep = sale['sales_rep']
#         rep_revenue[rep] = rep_revenue.get(rep, 0) + sale['total']
#     
#     return {
#         'total_sales': len(sales),
#         'total_revenue': total_revenue,
#         'total_quantity': total_quantity,
#         'average_sale': total_revenue / len(sales),
#         'product_revenue': product_revenue,
#         'rep_revenue': rep_revenue
#     }

# TODO: Generate report
# def generate_sales_report(analysis, output_file='sales_report.txt'):
#     """Generate a formatted sales report."""
#     with open(output_file, 'w') as file:
#         file.write("SALES ANALYSIS REPORT\n")
#         file.write("=" * 50 + "\n\n")
#         
#         file.write("OVERVIEW\n")
#         file.write("-" * 20 + "\n")
#         file.write(f"Total Sales: {analysis['total_sales']}\n")
#         file.write(f"Total Revenue: ${analysis['total_revenue']:.2f}\n")
#         file.write(f"Total Quantity: {analysis['total_quantity']} items\n")
#         file.write(f"Average Sale: ${analysis['average_sale']:.2f}\n\n")
#         
#         file.write("REVENUE BY PRODUCT\n")
#         file.write("-" * 20 + "\n")
#         for product, revenue in sorted(analysis['product_revenue'].items(), 
#                                      key=lambda x: x[1], reverse=True):
#             percentage = (revenue / analysis['total_revenue']) * 100
#             file.write(f"{product}: ${revenue:.2f} ({percentage:.1f}%)\n")
#         
#         file.write("\nREVENUE BY SALES REP\n")
#         file.write("-" * 20 + "\n")
#         for rep, revenue in sorted(analysis['rep_revenue'].items(),
#                                  key=lambda x: x[1], reverse=True):
#             percentage = (revenue / analysis['total_revenue']) * 100
#             file.write(f"{rep}: ${revenue:.2f} ({percentage:.1f}%)\n")

# TODO: Run the complete analysis
# analysis = analyze_sales_data('sales_data.csv')
# generate_sales_report(analysis)
# 
# print("Sales analysis completed!")
# print("\nQuick summary:")
# print(f"Total revenue: ${analysis['total_revenue']:.2f}")
# print(f"Average sale: ${analysis['average_sale']:.2f}")
# 
# # Display the report
# print("\nGenerated report:")
# with open('sales_report.txt', 'r') as file:
#     print(file.read())

print("Data processing challenge completed!")

## 🎯 Summary

Congratulations! You've completed Exercise 07. You should now understand:

✅ Reading and writing text files  
✅ Different file modes ('r', 'w', 'a')  
✅ Using 'with' statements for proper file handling  
✅ Safe file operations with error handling  
✅ Working with CSV files for data processing  
✅ Building complete data analysis workflows  

### 🚀 Next Steps:
- Complete Ex08_Error_Handling to learn about robust error management
- Practice combining file operations with data structures
- Explore JSON files, whose objects map directly onto Python dictionaries
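
As a small preview of the JSON step, the sketch below saves a dictionary with the standard `json` module and loads it back. The file name `student.json` and the record's fields are assumptions made for illustration.

```python
import json
import os
import tempfile

# Hypothetical JSON file in a temporary directory
json_path = os.path.join(tempfile.mkdtemp(), "student.json")

student = {
    "name": "Alice Johnson",
    "age": 20,
    "majors": ["Computer Science"],
    "gpa": 3.8,
}

# json.dump() writes the dictionary as text; indent=2 makes it human-readable
with open(json_path, "w", encoding="utf-8") as file:
    json.dump(student, file, indent=2)

# json.load() rebuilds the dictionary, keeping numbers and lists as such
with open(json_path, "r", encoding="utf-8") as file:
    loaded_student = json.load(file)

print(loaded_student)
```

Unlike CSV, JSON preserves the difference between strings, numbers, and lists, so no manual type conversion is needed after loading.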

### 💡 Key Takeaways:
- Use 'with' statements so files are closed automatically, even when an error occurs
- Handle file not found errors gracefully
- CSV files work well for tabular data, but every value is read back as a string
- File operations are essential for data persistence
- Real-world data processing involves multiple steps

**Excellent work! You can now work with external data sources! 🐍**