# Economic Python: Enhanced CS50 Introduction to Programming
## **Lecture 6: File I/O**

Welcome to Lecture 6 of CS50's Introduction to Programming with Python! In this notebook, we'll explore how to work with files in Python, including reading from and writing to files, handling CSV data, and working with images.

### **Why File I/O Matters for Economists**
In economics, you often need to:
- **Analyze Economic Data:** Read data from CSV files containing economic indicators
- **Store Research Results:** Save your analysis results for future reference
- **Process Large Datasets:** Work with files that are too large to fit in memory
- **Automate Data Collection:** Save data from APIs or web scraping to files
- **Create Reports:** Generate formatted reports from your economic analysis

File I/O allows you to persistently store and retrieve data, making it possible to build sophisticated economic analysis tools.

### **Table of Contents**
1.  [Introduction to File I/O](#section-1)
2.  [Reading from Files](#section-2)
3.  [Writing to Files](#section-3)
4.  [Using the `with` Statement](#section-4)
5.  [Working with CSV Files](#section-5)
6.  [Using the `csv` Module](#section-6)
7.  [Working with Images](#section-7)
8.  [Problem Set 1: Lines](#problem-1)
9.  [Problem Set 2: Pizza](#problem-2)
10. [Problem Set 3: Scourgify](#problem-3)
11. [Problem Set 4: Shirt](#problem-4)

<a id='section-1'></a>
## 1. Introduction to File I/O

Up until now, most of our programs have stored information in memory, which is lost when the program exits. **File I/O** (Input/Output) allows us to store data persistently on disk, so it can be accessed later.

Files can be used for:
- Storing user input between program runs
- Reading configuration settings
- Processing data from other applications
- Saving results for later analysis

In Python, we use the `open()` function to work with files.

#### Economic Application
In economics research, you'll frequently work with data files containing:
- Time series data (GDP, inflation, unemployment rates over time)
- Cross-sectional data (economic indicators across countries or regions)
- Panel data (combination of time series and cross-sectional)
- Survey results and experimental data

Being able to read, process, and write these files is essential for economic analysis.

In [None]:
# Example: Economic data that might be stored in a file
economic_data = {
    "country": "Bangladesh",
    "gdp_2021": 416.3,  # Billion USD
    "gdp_2022": 460.2,  # Billion USD
    "growth_rate": 10.5,  # Percentage
    "inflation_2022": 6.2,  # Percentage
    "unemployment_2022": 5.1  # Percentage
}

# This data could be stored in a file and retrieved later
print("Sample economic data for Bangladesh:")
for key, value in economic_data.items():
    print(f"{key.replace('_', ' ').title()}: {value}")
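
One way to make "stored in a file and retrieved later" concrete is sketched below. It assumes JSON as the storage format, which the lecture itself does not prescribe, and uses the standard-library `json` module. The file name `economic_data.json` and the temporary directory are illustrative choices.

```python
import json
import os
import tempfile

economic_data = {
    "country": "Bangladesh",
    "gdp_2021": 416.3,    # Billion USD
    "gdp_2022": 460.2,    # Billion USD
    "growth_rate": 10.5,  # Percentage
}

# Work in a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "economic_data.json")  # hypothetical file name

    # Store the dictionary on disk
    with open(path, "w", encoding="utf-8") as file:
        json.dump(economic_data, file, indent=2)

    # Retrieve it later (here: immediately) as a new dictionary
    with open(path, "r", encoding="utf-8") as file:
        restored = json.load(file)

print(restored == economic_data)
```

JSON keeps the dictionary structure and the numeric types, so no manual parsing is needed when the data is read back.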

<a id='section-2'></a>
## 2. Reading from Files

To read from a file, we first need to open it in read mode. Let's start with a simple example of reading a file line by line.

In [None]:
# First, let's create a sample file to work with
with open("sample.txt", "w") as file:
    file.write("Hello, Siddiqur!\n")
    file.write("This is a sample file.\n")
    file.write("It has multiple lines.\n")

# Now, let's read the file
with open("sample.txt", "r") as file:
    for line in file:
        print(line.rstrip())  # rstrip() removes the trailing newline

We can also read all lines at once using the `readlines()` method:

In [None]:
# Reading all lines at once
with open("sample.txt", "r") as file:
    lines = file.readlines()
    
for line in lines:
    print(line.rstrip())

Or read the entire file as a single string:

In [None]:
# Reading the entire file as a string
with open("sample.txt", "r") as file:
    content = file.read()
    
print(content)
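
The three approaches differ mainly in memory use: `read()` and `readlines()` load the whole file, while iterating over the file object holds one line at a time. Below is a minimal sketch of the streaming style, using a made-up file with one number per line (`values.txt` is an assumption for illustration).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "values.txt")  # hypothetical data file

    # Create a small file with one number per line
    with open(path, "w") as file:
        for value in [2.5, 4.0, 3.5]:
            file.write(f"{value}\n")

    # Stream the file: only one line is held in memory at a time
    total = 0.0
    count = 0
    with open(path, "r") as file:
        for line in file:
            total += float(line)  # float() ignores the trailing newline
            count += 1

average = total / count
print(f"Read {count} values, average {average:.2f}")
```

The same loop works unchanged on a file with millions of lines, which is why iteration is usually preferred for large datasets.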

#### Economic Context
When analyzing economic data, you'll often read data from text files, CSV files, or other structured formats. For example, you might read a file containing daily stock prices, quarterly GDP figures, or monthly unemployment rates.

In [None]:
# First, let's create a sample economic data file to work with
with open("economic_indicators.txt", "w") as file:
    file.write("Bangladesh Economic Indicators\n")
    file.write("GDP (2022): $460.2 billion\n")
    file.write("Growth Rate: 10.5%\n")
    file.write("Inflation: 6.2%\n")
    file.write("Unemployment: 5.1%\n")

# Now, let's read the file line by line
print("Reading economic indicators file:")
with open("economic_indicators.txt", "r") as file:
    for line in file:
        print(line.rstrip())  # rstrip() removes the trailing newline

We can also read all lines at once using the `readlines()` method:

In [None]:
# Reading all lines at once
print("Reading all lines at once:")
with open("economic_indicators.txt", "r") as file:
    lines = file.readlines()
    
for i, line in enumerate(lines):
    print(f"Line {i+1}: {line.rstrip()}")

Or read the entire file as a single string:

In [None]:
# Reading the entire file as a string
print("Reading entire file as a string:")
with open("economic_indicators.txt", "r") as file:
    content = file.read()
    
print(content)

When working with economic data, you often need to parse the content to extract specific values:

In [None]:
# Parse economic data from file
economic_data = {}
with open("economic_indicators.txt", "r") as file:
    # Skip the header line
    next(file)
    
    # Process each data line
    for line in file:
        # Split by colon and clean up
        key, value = line.rstrip().split(": ")
        # Remove special characters and convert to appropriate type
        key = key.replace("(", "").replace(")", "").replace(" ", "_").lower()
        
        # Extract numeric value from string
        if "$" in value:
            # Handle GDP value
            numeric_value = float(value.replace("$", "").replace(" billion", ""))
            economic_data[key] = numeric_value
        elif "%" in value:
            # Handle percentage values
            numeric_value = float(value.replace("%", ""))
            economic_data[key] = numeric_value

print("Parsed economic data:")
for key, value in economic_data.items():
    print(f"{key}: {value}")
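
The parsing loop above assumes every line has exactly one `": "` separator and a `$` or `%` value. A slightly more defensive version is sketched below. `parse_indicator_line` is a hypothetical helper, and returning `None` for lines it cannot parse is an assumed design choice, not part of the lecture.

```python
def parse_indicator_line(line):
    """Return (key, number) for a line like 'Inflation: 6.2%', or None if it does not fit."""
    # partition() never raises, unlike unpacking the result of split()
    label, separator, value = line.strip().partition(": ")
    if not separator:
        return None

    key = label.replace("(", "").replace(")", "").replace(" ", "_").lower()
    # Strip the decorations used in the sample file: '$', '%', and ' billion'
    cleaned = value.replace("$", "").replace("%", "").replace(" billion", "")
    try:
        return key, float(cleaned)
    except ValueError:
        return None


sample_lines = [
    "Bangladesh Economic Indicators",
    "GDP (2022): $460.2 billion",
    "Growth Rate: 10.5%",
    "Notes: provisional",
]

parsed = {}
for line in sample_lines:
    result = parse_indicator_line(line)
    if result is not None:
        key, number = result
        parsed[key] = number

print(parsed)
```

Lines without a separator (the title) or without a numeric value (the note) are skipped instead of crashing the loop.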

<a id='section-3'></a>
## 3. Writing to Files

To write to a file, we need to open it in write mode (`"w"`) or append mode (`"a"`). Write mode creates a new file or overwrites an existing one, while append mode adds to the end of an existing file and creates the file if it does not exist yet.

In [None]:
# Writing to a file (this will overwrite the file if it exists)
with open("output.txt", "w") as file:
    file.write("This is a new file.\n")
    file.write("Created by Siddiqur Rahman.\n")

# Appending to a file
with open("output.txt", "a") as file:
    file.write("This line was appended.\n")

# Reading the file to verify
with open("output.txt", "r") as file:
    print(file.read())
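
Because `"w"` silently overwrites, it is easy to destroy an existing results file. Python's `open()` also accepts an exclusive-creation mode, `"x"`, which raises `FileExistsError` if the file is already there. A small sketch follows (the file name `results.txt` is only an example).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "results.txt")  # hypothetical file name

    # First write succeeds because the file does not exist yet
    with open(path, "x") as file:
        file.write("first version\n")

    # Second attempt refuses to overwrite the existing file
    try:
        with open(path, "x") as file:
            file.write("second version\n")
        overwritten = True
    except FileExistsError:
        overwritten = False

    with open(path, "r") as file:
        content = file.read()

print(overwritten, repr(content))
```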

#### Economic Context
In economic analysis, you might write files to:
- Save processed economic data
- Store results of economic models
- Generate reports for policymakers
- Export data for visualization in other tools

In [None]:
# Writing economic analysis results to a file
with open("economic_analysis.txt", "w") as file:
    file.write("Economic Analysis Report\n")
    file.write("========================\n\n")
    file.write("Country: Bangladesh\n")
    file.write("Analysis Date: 2023\n\n")
    file.write("Key Findings:\n")
    file.write("1. GDP growth of 10.5% indicates strong economic expansion\n")
    file.write("2. Inflation rate of 6.2% is above the target range\n")
    file.write("3. Unemployment at 5.1% shows moderate labor market conditions\n\n")
    file.write("Recommendation: Monitor inflation closely while maintaining growth policies\n")

# Appending additional analysis to the file
with open("economic_analysis.txt", "a") as file:
    file.write("\nAdditional Analysis:\n")
    file.write("- Sector-wise growth analysis shows strong performance in manufacturing\n")
    file.write("- Export growth has contributed significantly to GDP expansion\n")
    file.write("- Foreign direct investment has increased by 15% year-on-year\n")

# Reading the file to verify
print("Economic analysis report:")
with open("economic_analysis.txt", "r") as file:
    print(file.read())

Let's look at how we might save structured economic data:

In [None]:
# Save structured economic data
economic_forecast = {
    "2023": {"gdp_growth": 7.5, "inflation": 5.8, "unemployment": 4.9},
    "2024": {"gdp_growth": 7.2, "inflation": 5.2, "unemployment": 4.7},
    "2025": {"gdp_growth": 7.0, "inflation": 4.8, "unemployment": 4.5}
}

# Write forecast data to a file
with open("economic_forecast.txt", "w") as file:
    file.write("Economic Forecast for Bangladesh\n")
    file.write("=============================\n\n")
    
    for year, indicators in economic_forecast.items():
        file.write(f"{year} Forecast:\n")
        for indicator, value in indicators.items():
            file.write(f"  {indicator.replace('_', ' ').title()}: {value}%\n")
        file.write("\n")

# Reading the file to verify
print("Economic forecast:")
with open("economic_forecast.txt", "r") as file:
    print(file.read())

<a id='section-4'></a>
## 4. Using the `with` Statement

The `with` statement works with a context manager (here, the file object returned by `open()`) and closes the file automatically when the block ends, even if an error occurs. This is the recommended way to work with files in Python.

In [None]:
# Using with statement (recommended)
with open("sample.txt", "r") as file:
    content = file.read()
    print(content)

# The file is automatically closed when we exit the with block

# Traditional way (not recommended)
file = open("sample.txt", "r")
try:
    content = file.read()
    print(content)
finally:
    file.close()  # Must remember to close the file
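
To see this guarantee in action, the file object's `closed` attribute can be inspected after the block. The sketch below raises an error on purpose inside the `with` block; the file name is illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")  # hypothetical file name
    with open(path, "w") as file:
        file.write("some data\n")

    try:
        with open(path, "r") as file:
            first_line = file.readline()
            raise RuntimeError("something went wrong mid-read")
    except RuntimeError:
        pass

    # The with block closed the file on the way out, despite the exception
    closed_after_error = file.closed

print(closed_after_error)
```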

#### Economic Importance
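In economic analysis, data integrity is crucial. The `with` statement helps ensure that:
- Files are properly closed after reading/writing
- Buffered output is flushed to disk when the block exits, even if an error occurs
- System resources such as file handles are released promptly
- A half-finished write is not silently lost in a buffer (though the file may still be incomplete if an error interrupts the block)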

In [None]:
# Using with statement (recommended)
def read_economic_data(filename):
    """
    Read economic data from a file using the with statement.
    
    Args:
        filename (str): Path to the file containing economic data
        
    Returns:
        list: Lines of economic data
    """
    with open(filename, "r") as file:
        data = file.readlines()
    return data

# The file is automatically closed when we exit the with block
data = read_economic_data("economic_indicators.txt")
print("Data read using with statement:")
for line in data:
    print(line.rstrip())

In [None]:
# Traditional way (not recommended)
def read_economic_data_traditional(filename):
    """
    Read economic data using traditional file handling.
    This approach is more error-prone.
    
    Args:
        filename (str): Path to the file containing economic data
        
    Returns:
        list: Lines of economic data
    """
    file = open(filename, "r")
    try:
        data = file.readlines()
        return data
    finally:
        file.close()  # Must remember to close the file

# Even with try/finally, the with statement is cleaner and safer
data = read_economic_data_traditional("economic_indicators.txt")
print("\nData read using traditional approach:")
for line in data[:2]:  # Just show first 2 lines
    print(line.rstrip())
print("...")

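The `with` statement is particularly important when working with valuable economic data, because it guarantees that every file is closed even if an error occurs during processing. In the example below, the error is raised before the output file is opened, so no output is written, but the input file has already been closed cleanly: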

In [None]:
# Example of error handling with with statement
def process_and_save_economic_data(input_file, output_file):
    """
    Process economic data and save results, with error handling.
    
    Args:
        input_file (str): Path to input file
        output_file (str): Path to output file
    """
    try:
        # Read input data
        with open(input_file, "r") as infile:
            data = infile.readlines()
        
        # Process data (simulate an error might occur)
        processed_data = []
        for line in data:
            # Simulate processing
            processed_line = f"PROCESSED: {line.rstrip()}\n"
            processed_data.append(processed_line)
            
            # Simulate an error on the third line
            if "Growth" in line:
                raise ValueError("Simulated processing error")
        
        # Write processed data
        with open(output_file, "w") as outfile:
            outfile.writelines(processed_data)
            
    except ValueError as e:
        print(f"Error processing data: {e}")
        print("Files were properly closed despite the error.")

# This function demonstrates that files are properly closed even if an error occurs
process_and_save_economic_data("economic_indicators.txt", "processed_data.txt")
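
Closing a file does not make a write all-or-nothing: if an error strikes in the middle of a `with open(..., "w")` block, the output file is left half-written. One common remedy, sketched here as an assumption rather than something the lecture covers, is to write to a scratch file and rename it into place with `os.replace()` only after every write succeeded. The helper name `save_atomically` and the `.tmp` suffix are hypothetical.

```python
import os
import tempfile


def save_atomically(path, lines):
    """Write lines to path so that a failure never leaves a half-written file there."""
    temp_path = path + ".tmp"  # assumed naming scheme for the scratch file
    try:
        with open(temp_path, "w") as file:
            for line in lines:
                file.write(line + "\n")
        # Reached only if every write succeeded
        os.replace(temp_path, path)
    except Exception:
        # Discard the partial scratch file and leave the old output untouched
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise


def failing_lines():
    """Yield one line, then fail, to simulate an error during processing."""
    yield "GDP: 460.2"
    raise ValueError("simulated processing error")


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "report.txt")  # hypothetical file name
    save_atomically(target, ["old report"])

    try:
        save_atomically(target, failing_lines())
    except ValueError:
        pass

    with open(target) as file:
        surviving = file.read()
    leftovers = os.listdir(folder)

print(repr(surviving), leftovers)
```

The earlier report survives the failed second save, and no scratch file is left behind.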

<a id='section-5'></a>
## 5. Working with CSV Files

CSV (Comma-Separated Values) files are a common format for storing tabular data. Let's create a sample CSV file and work with it.

In [None]:
# Create a sample CSV file
with open("economic_data.csv", "w") as file:
    file.write("Country,GDP,Inflation\n")
    file.write("USA,21427.7,1.8\n")
    file.write("China,14342.9,2.5\n")
    file.write("Japan,5081.8,0.5\n")
    file.write("Germany,3846.4,1.4\n")

# Read the CSV file manually
with open("economic_data.csv", "r") as file:
    # Read the header row separately from the data
    header = file.readline().strip().split(",")
    print(f"Header: {header}")
    
    # Read the data
    for line in file:
        country, gdp, inflation = line.strip().split(",")
        print(f"Country: {country}, GDP: {gdp} billion, Inflation: {inflation}%")

#### Economic Context
CSV files are widely used in economics for:
- Storing time series data (GDP, inflation, etc.)
- Sharing datasets between different statistical software
- Exporting data from databases for analysis
- Exchanging data with international organizations like World Bank, IMF

In [None]:
# Create a sample CSV file with economic data
with open("south_asia_economics.csv", "w") as file:
    file.write("Country,GDP_Billion,Population_Million,Growth_Percent,Inflation_Percent\n")
    file.write("Bangladesh,460.2,169.4,10.5,6.2\n")
    file.write("India,3385.1,1408.0,7.2,6.7\n")
    file.write("Pakistan,376.5,235.8,4.0,21.3\n")
    file.write("Sri Lanka,89.0,21.9,-7.8,66.0\n")
    file.write("Nepal,40.8,30.5,5.8,7.7\n")

# Read the CSV file manually
print("Reading South Asia economic data:")
with open("south_asia_economics.csv", "r") as file:
    # Read the header row separately from the data
    header = file.readline().strip().split(",")
    print(f"Header: {header}")
    
    # Read the data
    print("\nCountry Data:")
    for line in file:
        country, gdp, population, growth, inflation = line.strip().split(",")
        print(f"{country}: GDP=${gdp}B, Pop={population}M, Growth={growth}%, Inflation={inflation}%")

Let's analyze this economic data:

In [None]:
# Analyze economic data from CSV
countries = []
with open("south_asia_economics.csv", "r") as file:
    # Skip header
    next(file)
    
    # Process each country
    for line in file:
        data = line.strip().split(",")
        countries.append({
            "name": data[0],
            "gdp": float(data[1]),
            "population": float(data[2]),
            "growth": float(data[3]),
            "inflation": float(data[4])
        })

# Calculate some economic statistics
total_gdp = sum(country["gdp"] for country in countries)
total_population = sum(country["population"] for country in countries)
avg_growth = sum(country["growth"] for country in countries) / len(countries)
avg_inflation = sum(country["inflation"] for country in countries) / len(countries)

# Find countries with highest and lowest growth
highest_growth = max(countries, key=lambda x: x["growth"])
lowest_growth = min(countries, key=lambda x: x["growth"])

print("South Asia Economic Summary:")
print(f"Total GDP: ${total_gdp:.1f} billion")
print(f"Total Population: {total_population:.1f} million")
print(f"Average Growth Rate: {avg_growth:.2f}%")
print(f"Average Inflation: {avg_inflation:.2f}%")
print(f"Highest Growth: {highest_growth['name']} ({highest_growth['growth']}%)")
print(f"Lowest Growth: {lowest_growth['name']} ({lowest_growth['growth']}%)")
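
The average growth rate above weights every country equally, so Nepal counts as much as India. A GDP-weighted average (the sum of growth × GDP divided by total GDP) is often closer to what "regional growth" means. The sketch below recomputes both from the same five rows, hard-coded here so the block runs on its own.

```python
# Same figures as south_asia_economics.csv: (name, GDP in billion USD, growth in percent)
rows = [
    ("Bangladesh", 460.2, 10.5),
    ("India", 3385.1, 7.2),
    ("Pakistan", 376.5, 4.0),
    ("Sri Lanka", 89.0, -7.8),
    ("Nepal", 40.8, 5.8),
]

total_gdp = sum(gdp for _, gdp, _ in rows)

# Simple average: every country counts equally
simple_avg = sum(growth for _, _, growth in rows) / len(rows)

# Weighted average: each growth rate counts in proportion to the country's GDP
weighted_avg = sum(gdp * growth for _, gdp, growth in rows) / total_gdp

print(f"Simple average growth:   {simple_avg:.2f}%")
print(f"Weighted average growth: {weighted_avg:.2f}%")
```

The weighted figure is pulled toward India's rate because India accounts for most of the region's GDP in this sample.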

While we can parse CSV files manually, Python's `csv` module provides more robust handling of CSV data, especially for complex cases.

<a id='section-6'></a>
## 6. Using the `csv` Module

The `csv` module provides tools for reading and writing CSV files, handling edge cases like quoted fields that contain commas.
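
A quick way to see the difference is to feed the same line to `str.split(",")` and to `csv.reader`. The sample row below is made up for illustration, and `io.StringIO` stands in for a real file.

```python
import csv
import io

# One header row and one data row whose first field contains a comma
text = 'Country,GDP_Billion\n"Korea, Republic of",1673.9\n'

# Naive parsing splits inside the quoted field
naive = text.splitlines()[1].split(",")

# csv.reader understands the quotes and keeps the field intact
rows = list(csv.reader(io.StringIO(text)))
proper = rows[1]

print(naive)
print(proper)
```

The naive split yields three pieces with stray quote characters, while `csv.reader` returns the two intended fields.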

In [None]:
import csv

# Create a CSV file with csv.writer (fields containing commas would be quoted automatically)
with open("students.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Name", "House", "Economic Interest"])
    writer.writerow(["Harry Potter", "Gryffindor", "Monetary Policy"])
    writer.writerow(["Hermione Granger", "Gryffindor", "Development Economics"])
    writer.writerow(["Draco Malfoy", "Slytherin", "International Trade"])
    writer.writerow(["Luna Lovegood", "Ravenclaw", "Behavioral Economics"])

# Read the CSV file using csv.reader
with open("students.csv", "r") as file:
    reader = csv.reader(file)
    
    # Skip the header
    header = next(reader)
    print(f"Header: {header}")
    
    # Read the data
    for row in reader:
        name, house, interest = row
        print(f"{name} from {house} is interested in {interest}")

We can also use `csv.DictReader` to work with CSV data as dictionaries, which can be more readable:

In [None]:
# Read the CSV file using csv.DictReader
with open("students.csv", "r") as file:
    reader = csv.DictReader(file)
    
    for row in reader:
        print(f"{row['Name']} from {row['House']} is interested in {row['Economic Interest']}")

Let's also see how to write CSV data using `csv.DictWriter`:

In [None]:
# Write CSV data using csv.DictWriter
with open("output_students.csv", "w", newline="") as file:
    fieldnames = ["Name", "House", "Economic Interest"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    
    writer.writeheader()
    writer.writerow({"Name": "Siddiqur Rahman", "House": "Economics", "Economic Interest": "All of them!"})
    writer.writerow({"Name": "John Doe", "House": "Business", "Economic Interest": "Finance"})

# Read the file to verify
with open("output_students.csv", "r") as file:
    reader = csv.DictReader(file)
    
    for row in reader:
        print(f"{row['Name']} from {row['House']} is interested in {row['Economic Interest']}")

#### Economic Application
The `csv` module is particularly useful for economists because:
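- It handles complex CSV formats with quoted fields, such as values that contain commas
- It properly deals with different line endings across operating systems when the file is opened with `newline=""`
- It provides dictionary access for more readable code
- It reads one row at a time, so large economic datasets do not have to be loaded into memory all at once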
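
Not every "CSV" file uses commas: spreadsheet exports in some locales use semicolons as the delimiter and commas as the decimal mark. `csv.reader` and `csv.DictReader` accept a `delimiter` argument for this. Below is a minimal sketch with made-up figures.

```python
import csv
import io

# Semicolon-delimited text with decimal commas (illustrative values only)
text = "Country;Inflation\nGermany;6,9\nFrance;5,2\n"

reader = csv.DictReader(io.StringIO(text), delimiter=";")
inflation = {}
for row in reader:
    # Convert the decimal comma to a decimal point before calling float()
    inflation[row["Country"]] = float(row["Inflation"].replace(",", "."))

print(inflation)
```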

In [None]:
import csv

# Create a more complex CSV file with economic data
with open("economic_research.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["Researcher", "Institution", "Topic", "Year", "Impact_Factor"])
    writer.writerow(["Dr. Siddiqur Rahman", "Jahangirnagar University", "Monetary Policy in Bangladesh", 2022, 4.5])
    writer.writerow(["Prof. Amartya Sen", "Harvard University", "Welfare Economics", 2021, 5.2])
    writer.writerow(["Dr. Abhijit Banerjee", "MIT", "Poverty Alleviation", 2022, 4.8])
    writer.writerow(["Dr. Esther Duflo", "MIT", "Development Economics", 2021, 4.9])

# Read the CSV file using csv.reader
print("Reading economic research data with csv.reader:")
with open("economic_research.csv", "r") as file:
    reader = csv.reader(file)
    
    # Skip the header
    header = next(reader)
    print(f"Header: {header}")
    
    # Read the data
    for row in reader:
        researcher, institution, topic, year, impact = row
        print(f"{researcher} ({institution}): '{topic}' ({year}), Impact: {impact}")

We can also use `csv.DictReader` to work with CSV data as dictionaries, which can be more readable:

In [None]:
# Read the CSV file using csv.DictReader
print("Reading economic research data with csv.DictReader:")
with open("economic_research.csv", "r") as file:
    reader = csv.DictReader(file)
    
    for row in reader:
        print(f"{row['Researcher']} from {row['Institution']} published '{row['Topic']}' in {row['Year']} with impact factor {row['Impact_Factor']}")

Let's also see how to write CSV data using `csv.DictWriter`:

In [None]:
# Write CSV data using csv.DictWriter
economic_papers = [
    {"Title": "The Impact of COVID-19 on South Asian Economies", "Author": "Siddiqur Rahman", "Journal": "Journal of Asian Economics", "Year": 2022},
    {"Title": "Monetary Policy in Developing Nations", "Author": "Siddiqur Rahman", "Journal": "World Development", "Year": 2021},
    {"Title": "Trade Liberalization and Poverty", "Author": "Siddiqur Rahman", "Journal": "Economic Modelling", "Year": 2023}
]

with open("siddiqur_papers.csv", "w", newline="") as file:
    fieldnames = ["Title", "Author", "Journal", "Year"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    
    writer.writeheader()
    for paper in economic_papers:
        writer.writerow(paper)

# Read the file to verify
print("Siddiqur's economic papers:")
with open("siddiqur_papers.csv", "r") as file:
    reader = csv.DictReader(file)
    
    for row in reader:
        print(f"'{row['Title']}' by {row['Author']} published in {row['Journal']} ({row['Year']})")

Let's create a more complex example with economic time series data:

In [None]:
# Create and read a time series CSV file
import random

# Generate sample time series data
years = list(range(2010, 2023))
gdp_data = []
base_gdp = 100.0  # Starting level; 2010's growth is applied before the 2010 value is recorded

for year in years:
    # Simulate GDP growth with some randomness
    growth_rate = 0.05 + random.uniform(-0.02, 0.03)  # 3-8% growth
    base_gdp *= (1 + growth_rate)
    gdp_data.append({
        "Year": year,
        "GDP": round(base_gdp, 2),
        "Growth_Rate": round(growth_rate * 100, 2)
    })

# Write to CSV
with open("gdp_time_series.csv", "w", newline="") as file:
    fieldnames = ["Year", "GDP", "Growth_Rate"]
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    
    writer.writeheader()
    writer.writerows(gdp_data)

# Read and analyze the time series
print("GDP Time Series Analysis:")
with open("gdp_time_series.csv", "r") as file:
    reader = csv.DictReader(file)
    
    data = list(reader)
    
    # Convert to appropriate types
    for row in data:
        row["Year"] = int(row["Year"])
        row["GDP"] = float(row["GDP"])
        row["Growth_Rate"] = float(row["Growth_Rate"])
    
    # Calculate statistics
    initial_gdp = data[0]["GDP"]
    final_gdp = data[-1]["GDP"]
    total_growth = (final_gdp / initial_gdp - 1) * 100
    avg_growth = sum(row["Growth_Rate"] for row in data) / len(data)
    max_growth = max(data, key=lambda x: x["Growth_Rate"])
    min_growth = min(data, key=lambda x: x["Growth_Rate"])
    
    print(f"Initial GDP ({data[0]['Year']}): ${initial_gdp:.2f} billion")
    print(f"Final GDP ({data[-1]['Year']}): ${final_gdp:.2f} billion")
    print(f"Total Growth: {total_growth:.2f}%")
    print(f"Average Annual Growth: {avg_growth:.2f}%")
    print(f"Highest Growth: {max_growth['Year']} ({max_growth['Growth_Rate']}%)")
    print(f"Lowest Growth: {min_growth['Year']} ({min_growth['Growth_Rate']}%)")
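
The average of yearly growth rates is not the same as the compound annual growth rate (CAGR), the constant rate that would take the initial value to the final value: CAGR = (final / initial)^(1 / periods) - 1. A short sketch follows, with `cagr` as a hypothetical helper and round illustrative numbers.

```python
def cagr(initial, final, periods):
    """Compound annual growth rate between two levels, as a fraction."""
    if initial <= 0 or periods <= 0:
        raise ValueError("initial and periods must be positive")
    return (final / initial) ** (1 / periods) - 1


# 2010 to 2022 is 12 yearly steps, not 13
rate = cagr(100.0, 251.0, 2022 - 2010)
print(f"CAGR: {rate * 100:.2f}%")

# Sanity check: compounding at that rate reproduces the final level
rebuilt = 100.0 * (1 + rate) ** 12
print(f"Rebuilt final value: {rebuilt:.1f}")
```

Note the number of periods: thirteen yearly observations span twelve growth steps, which is an easy off-by-one mistake in time series work.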

<a id='section-7'></a>
## 7. Working with Images

Python can also work with images through Pillow, the maintained fork of PIL (Python Imaging Library), which is installed as the `pillow` package and imported as `PIL`. This is particularly useful for tasks like image processing, computer vision, and creating visualizations of economic data.

In [None]:
# This is a demonstration of image processing
# In a real environment, you would need to install pillow: pip install pillow

try:
    from PIL import Image, ImageDraw, ImageFont
    
    # Create a simple image programmatically for demonstration
    img = Image.new('RGB', (200, 100), color = 'blue')
    d = ImageDraw.Draw(img)
    
    # Add text to the image
    try:
        font = ImageFont.truetype("arial.ttf", 16)
    except OSError:  # font file not found on this system
        font = ImageFont.load_default()
    
    d.text((10,10), "Economic Data", fill=(255,255,0), font=font)
    d.text((10,40), "By Siddiqur Rahman", fill=(255,255,0), font=font)
    
    # Save the image
    img.save('economic_chart.png')
    
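    # Display image info (format is only set for images loaded from a file,
    # so reopen the saved PNG to report it)
    with Image.open('economic_chart.png') as saved:
        print(f"Image format: {saved.format}")
    print(f"Image size: {img.size}")
    print(f"Image mode: {img.mode}")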
    
    # Resize the image
    resized_img = img.resize((100, 50))
    resized_img.save('economic_chart_small.png')
    
    # Convert to grayscale
    gray_img = img.convert('L')
    gray_img.save('economic_chart_gray.png')
    
    print("Images created successfully!")
    
except ImportError:
    print("The 'pillow' package is not installed.")
    print("In a terminal, run: pip install pillow")

#### Economic Application
Image processing has several applications in economics:
- Creating charts and graphs for economic reports
- Processing satellite imagery to measure economic activity (e.g., night lights as a proxy for economic development)
- Analyzing text from scanned economic documents
- Creating visual representations of economic models

In [None]:
# This is a demonstration of image processing
# In a real environment, you would need to install pillow: pip install pillow

try:
    from PIL import Image, ImageDraw, ImageFont
    
    # Create a simple economic chart programmatically for demonstration
    img = Image.new('RGB', (600, 400), color = 'white')
    d = ImageDraw.Draw(img)
    
    # Add title
    try:
        font = ImageFont.truetype("arial.ttf", 16)
    except OSError:  # font file not found on this system
        font = ImageFont.load_default()
    
    d.text((200, 20), "Bangladesh Economic Indicators", fill=(0, 0, 0), font=font)
    d.text((150, 50), "By Siddiqur Rahman, Jahangirnagar University", fill=(0, 0, 0), font=font)
    
    # Draw simple bar chart: bars share a baseline and heights are proportional to the values
    baseline = 200
    scale = 9  # pixels per percentage point
    bars = [
        ("GDP Growth", 10.5, 100, (0, 0, 255)),   # Blue
        ("Inflation", 6.2, 250, (255, 0, 0)),     # Red
        ("Unemployment", 5.1, 400, (0, 255, 0)),  # Green
    ]
    for label, value, x, color in bars:
        d.rectangle([(x, baseline - value * scale), (x + 50, baseline)], fill=color)
        d.text((x, 210), label, fill=(0, 0, 0), font=font)
        d.text((x + 15, 230), f"{value}%", fill=(0, 0, 0), font=font)
    
    # Save the image
    img.save('economic_chart.png')
    
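    # Display image info (format is only set for images loaded from a file,
    # so reopen the saved PNG to report it)
    with Image.open('economic_chart.png') as saved:
        print(f"Image format: {saved.format}")
    print(f"Image size: {img.size}")
    print(f"Image mode: {img.mode}")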
    print("Economic chart created successfully!")
    
    # Resize the image
    resized_img = img.resize((300, 200))
    resized_img.save('economic_chart_small.png')
    
    # Convert to grayscale
    gray_img = img.convert('L')
    gray_img.save('economic_chart_gray.png')
    
    print("Image variations created!")
    
except ImportError:
    print("The 'pillow' package is not installed.")
    print("In a terminal, run: pip install pillow")

Let's look at a more advanced example of creating an economic visualization:

In [None]:
# Advanced economic chart creation
try:
    from PIL import Image, ImageDraw, ImageFont
    
    # Create a line chart for GDP over time
    img = Image.new('RGB', (800, 500), color = 'white')
    d = ImageDraw.Draw(img)
    
    # Try to load a font
    try:
        title_font = ImageFont.truetype("arial.ttf", 20)
        label_font = ImageFont.truetype("arial.ttf", 12)
    except OSError:  # font file not found on this system
        title_font = ImageFont.load_default()
        label_font = ImageFont.load_default()
    
    # Title
    d.text((250, 20), "Bangladesh GDP Growth (2010-2022)", fill=(0, 0, 0), font=title_font)
    d.text((300, 50), "Source: Simulated Data by Siddiqur Rahman", fill=(100, 100, 100), font=label_font)
    
    # Chart area
    chart_left = 80
    chart_top = 100
    chart_width = 640
    chart_height = 300
    chart_bottom = chart_top + chart_height
    chart_right = chart_left + chart_width
    
    # Draw axes
    d.line([(chart_left, chart_bottom), (chart_right, chart_bottom)], fill=(0, 0, 0), width=2)  # X-axis
    d.line([(chart_left, chart_top), (chart_left, chart_bottom)], fill=(0, 0, 0), width=2)  # Y-axis
    
    # Sample GDP data (simulated)
    years = list(range(2010, 2023))
    gdp_values = [100, 108, 116, 125, 135, 146, 158, 171, 185, 200, 216, 233, 251]  # Billion USD
    
    # Scale data to fit chart
    min_gdp = min(gdp_values)
    max_gdp = max(gdp_values)
    gdp_range = max_gdp - min_gdp
    
    # Plot the data
    points = []
    for i, (year, gdp) in enumerate(zip(years, gdp_values)):
        x = chart_left + (i / (len(years) - 1)) * chart_width
        y = chart_bottom - ((gdp - min_gdp) / gdp_range) * chart_height
        points.append((x, y))
        
        # Draw point
        d.ellipse([(x-3, y-3), (x+3, y+3)], fill=(255, 0, 0))
        
        # Draw year label (every other year to avoid crowding)
        if i % 2 == 0:
            d.text((x-15, chart_bottom + 10), str(year), fill=(0, 0, 0), font=label_font)
    
    # Connect the points
    for i in range(len(points) - 1):
        d.line([points[i], points[i+1]], fill=(255, 0, 0), width=2)
    
    # Y-axis labels
    for i in range(6):
        value = min_gdp + (i / 5) * gdp_range
        y = chart_bottom - (i / 5) * chart_height
        d.text((chart_left - 60, y - 5), f"${value:.0f}B", fill=(0, 0, 0), font=label_font)
        
        # Grid line
        d.line([(chart_left, y), (chart_right, y)], fill=(200, 200, 200), width=1)
    
    # Axis labels
    d.text((chart_left + chart_width // 2 - 30, chart_bottom + 40), "Year", fill=(0, 0, 0), font=label_font)
    d.text((chart_left - 70, chart_top + chart_height // 2), "GDP", fill=(0, 0, 0), font=label_font)
    
    # Save the chart
    img.save('bangladesh_gdp_chart.png')
    print("Bangladesh GDP chart created successfully!")
    
except ImportError:
    print("The 'pillow' package is not installed.")
    print("In a terminal, run: pip install pillow")

<a id='problem-1'></a>
## Problem Set 1: Lines

#### Problem Description
In this problem, you'll implement a program that counts the number of lines of code in a Python file, excluding comments and blank lines. Lines of code is a common, if rough, measure of a program's size.

Implement a program in a file called `lines.py` that:
1. Expects exactly one command-line argument, the name (or path) of a Python file
2. Outputs the number of lines of code in that file, excluding comments and blank lines
3. Exits via `sys.exit` if:
   - The user does not specify exactly one command-line argument
   - The specified file's name does not end in `.py`
   - The specified file does not exist

Assume that any line that starts with `#`, optionally preceded by whitespace, is a comment. A docstring should not be considered a comment. Assume that any line that only contains whitespace is blank.

#### Hints
- Use `sys.argv` to access command-line arguments
- Use `os.path.exists()` to check if a file exists
- Use `str.endswith()` to check the file extension
- Use `str.lstrip()` to remove leading whitespace
- Use `str.startswith()` to check if a line is a comment
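
The last two hints can be combined as in the sketch below. `is_code_line` is a hypothetical helper name, and the sketch covers only the line classification, not the command-line handling the problem also asks for.

```python
def is_code_line(line):
    """Return True unless the line is blank or a comment (per the problem's definition)."""
    stripped = line.lstrip()
    if stripped == "":
        return False  # only whitespace
    if stripped.startswith("#"):
        return False  # comment, possibly indented
    return True


samples = [
    "# This is a comment\n",
    "\n",
    "def hello():\n",
    "    # indented comment\n",
    '    """Docstring lines count as code."""\n',
    "    print('Hello, world!')\n",
]

count = sum(1 for line in samples if is_code_line(line))
print(count)
```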

In [None]:
# Write your solution for Problem 1: Lines
# TODO: Implement your solution here

#### Unit Tests for Problem 1

In [None]:
# Unit tests for Problem 1: Lines

import os
import sys
from unittest.mock import patch

# Create a test Python file
with open("test_file.py", "w") as file:
    file.write("# This is a comment\n")
    file.write("\n")
    file.write("def hello():\n")
    file.write("    print('Hello, world!')\n")
    file.write("\n")
    file.write("# Another comment\n")
    file.write("hello()\n")

def test_lines():
    # Test with a valid Python file
    with patch.object(sys, 'argv', ['lines.py', 'test_file.py']):
        try:
            # Import the lines module
            import lines
            # Redirect stdout to capture the output
            from io import StringIO
            old_stdout = sys.stdout
            sys.stdout = result = StringIO()
            try:
                lines.main()
            except SystemExit:
                pass
            finally:
                # Always restore stdout, even if main() raises unexpectedly
                sys.stdout = old_stdout
            output = result.getvalue().strip()
            assert output == "3", f"Expected 3 lines of code, got {output}"
            print("Test 1 passed: Valid Python file")
        except ImportError:
            print("Test 1 skipped: 'lines' module not found")
    
    # Test with invalid file extension
    with patch.object(sys, 'argv', ['lines.py', 'test_file.txt']):
        try:
            import lines
            try:
                lines.main()
                assert False, "Expected SystemExit for invalid file extension"
            except SystemExit:
                print("Test 2 passed: Invalid file extension")
        except ImportError:
            print("Test 2 skipped: 'lines' module not found")
    
    # Test with non-existent file
    with patch.object(sys, 'argv', ['lines.py', 'non_existent.py']):
        try:
            import lines
            try:
                lines.main()
                assert False, "Expected SystemExit for non-existent file"
            except SystemExit:
                print("Test 3 passed: Non-existent file")
        except ImportError:
            print("Test 3 skipped: 'lines' module not found")

# Run the tests
test_lines()

# Clean up
if os.path.exists("test_file.py"):
    os.remove("test_file.py")

#### Solution for Problem 1

In [None]:
# Solution for Problem 1: Lines

import sys
import os

def count_lines_of_code(filename):
    """Count the number of lines of code in a Python file, excluding comments and blank lines."""
    count = 0
    
    with open(filename, "r") as file:
        for line in file:
            # Remove leading whitespace
            stripped = line.lstrip()
            
            # Skip blank lines
            if not stripped:
                continue
            
            # Skip comment lines
            if stripped.startswith("#"):
                continue
            
            # Count the line
            count += 1
    
    return count

def main():
    # Check for exactly one command-line argument
    if len(sys.argv) != 2:
        sys.exit("Too few or too many command-line arguments")
    
    filename = sys.argv[1]
    
    # Check if the file ends with .py
    if not filename.endswith(".py"):
        sys.exit("Not a Python file")
    
    # Check if the file exists
    if not os.path.exists(filename):
        sys.exit("File does not exist")
    
    # Count and print the lines of code
    count = count_lines_of_code(filename)
    print(count)

if __name__ == "__main__":
    main()

<a id='problem-2'></a>
## Problem Set 2: Pizza

#### Problem Description
In this problem, you'll implement a program that formats a CSV file of pizza data as an ASCII table using the `tabulate` package.

Implement a program in a file called `pizza.py` that:
1. Expects exactly one command-line argument, the name (or path) of a CSV file in Pinocchio's format
2. Outputs a table formatted as ASCII art using `tabulate` in grid format
3. Exits via `sys.exit` if:
   - The user does not specify exactly one command-line argument
   - The specified file's name does not end in `.csv`
   - The specified file does not exist

#### Hints
- Install the `tabulate` package: `pip install tabulate`
- Use `tabulate.tabulate()` to format the data
- Use the `tablefmt="grid"` parameter to get the grid format
- Use the `csv` module to read the CSV file
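
As a rough illustration of the data shape involved, `csv.reader` turns CSV text into a list of rows with the header row first. That is the shape `tabulate` expects when called with `headers="firstrow"`. The sketch below uses `io.StringIO` as an in-memory stand-in for a real file, and the sample menu rows are assumptions made for the example.

```python
import csv
import io

# In-memory stand-in for a CSV file; the contents are sample data only
sample = io.StringIO(
    "Sicilian Pizza,Small,Large\n"
    "Cheese,$25.50,$39.95\n"
    "1 item,$27.50,$41.95\n"
)

# Each row becomes a list of strings; the first row is the header
table = list(csv.reader(sample))
header, rows = table[0], table[1:]

print(header)
for row in rows:
    print(row)
```

Passing `table` to `tabulate` with `headers="firstrow"` and `tablefmt="grid"` would then render the header and data rows as a grid.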

In [None]:
# Write your solution for Problem 2: Pizza
# TODO: Implement your solution here

#### Unit Tests for Problem 2

In [None]:
# Unit tests for Problem 2: Pizza

import os
import sys
from unittest.mock import patch

# Create a test CSV file
with open("test_pizza.csv", "w") as file:
    file.write("Sicilian Pizza,Small,Large\n")
    file.write("Cheese,$25.50,$39.95\n")
    file.write("1 item,$27.50,$41.95\n")
    file.write("2 items,$29.50,$43.95\n")

def test_pizza():
    # Test with a valid CSV file
    with patch.object(sys, 'argv', ['pizza.py', 'test_pizza.csv']):
        try:
            # Import the pizza module
            import pizza
            # Redirect stdout to capture the output
            from io import StringIO
            old_stdout = sys.stdout
            sys.stdout = result = StringIO()
            try:
                pizza.main()
            except SystemExit:
                pass
            finally:
                # Always restore stdout, even if main() raises unexpectedly
                sys.stdout = old_stdout
            output = result.getvalue()
            # Check if the output contains the expected table elements
            assert "+" in output and "|" in output, "Output doesn't look like a table"
            assert "Sicilian Pizza" in output, "Table header not found in output"
            assert "Cheese" in output, "First row not found in output"
            print("Test 1 passed: Valid CSV file")
        except ImportError:
            print("Test 1 skipped: 'pizza' module not found")
    
    # Test with invalid file extension
    with patch.object(sys, 'argv', ['pizza.py', 'test_pizza.txt']):
        try:
            import pizza
            try:
                pizza.main()
                assert False, "Expected SystemExit for invalid file extension"
            except SystemExit:
                print("Test 2 passed: Invalid file extension")
        except ImportError:
            print("Test 2 skipped: 'pizza' module not found")
    
    # Test with non-existent file
    with patch.object(sys, 'argv', ['pizza.py', 'non_existent.csv']):
        try:
            import pizza
            try:
                pizza.main()
                assert False, "Expected SystemExit for non-existent file"
            except SystemExit:
                print("Test 3 passed: Non-existent file")
        except ImportError:
            print("Test 3 skipped: 'pizza' module not found")

# Run the tests
test_pizza()

# Clean up
if os.path.exists("test_pizza.csv"):
    os.remove("test_pizza.csv")

#### Solution for Problem 2

In [None]:
# Solution for Problem 2: Pizza

import sys
import os
import csv

try:
    from tabulate import tabulate
except ImportError:
    sys.exit("tabulate package not installed. Run: pip install tabulate")

def main():
    # Check for exactly one command-line argument
    if len(sys.argv) != 2:
        sys.exit("Too few or too many command-line arguments")
    
    filename = sys.argv[1]
    
    # Check if the file ends with .csv
    if not filename.endswith(".csv"):
        sys.exit("Not a CSV file")
    
    # Check if the file exists
    if not os.path.exists(filename):
        sys.exit("File does not exist")
    
    # Read the CSV file (newline="" is what the csv module recommends)
    with open(filename, "r", newline="") as file:
        reader = csv.reader(file)
        table = list(reader)
    
    # Print the table in grid format
    print(tabulate(table, headers="firstrow", tablefmt="grid"))

if __name__ == "__main__":
    main()

<a id='problem-3'></a>
## Problem Set 3: Scourgify

#### Problem Description
In this problem, you'll implement a program that "cleans" CSV data by reformatting it. Specifically, you'll convert a CSV file with student names and houses into a new format with first name, last name, and house as separate columns.

Implement a program in a file called `scourgify.py` that:
1. Expects the user to provide two command-line arguments:
   - The name of an existing CSV file to read as input, with columns for name and house
   - The name of a new CSV to write as output, with columns for first, last, and house
2. Converts the input to the output, splitting each name into a first name and last name
3. Exits via `sys.exit` with an error message if:
   - The user does not provide exactly two command-line arguments
   - The first file cannot be read

#### Hints
- Use the `csv` module to read and write CSV files
- Use `str.split(", ")` to split the name into last name and first name
- Use `csv.DictReader` and `csv.DictWriter` for more readable code
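
The hints above can be combined in a small in-memory sketch. It assumes every name has the form `Last, First` and uses `io.StringIO` objects in place of real files, so nothing is written to disk; the two sample rows are for illustration only.

```python
import csv
import io

# In-memory stand-ins for the input and output files
source = io.StringIO(
    "name,house\n"
    '"Abbott, Hannah",Hufflepuff\n'
    '"Bell, Katie",Gryffindor\n'
)
target = io.StringIO(newline="")

reader = csv.DictReader(source)
writer = csv.DictWriter(target, fieldnames=["first", "last", "house"])
writer.writeheader()

for row in reader:
    # "Abbott, Hannah" -> last="Abbott", first="Hannah"
    last, first = row["name"].split(", ")
    writer.writerow({"first": first, "last": last, "house": row["house"]})

output = target.getvalue()
print(output)
```

A name without the `", "` separator would make the unpacking raise a `ValueError`, so real data may need an extra check.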

In [None]:
# Write your solution for Problem 3: Scourgify
# TODO: Implement your solution here

#### Unit Tests for Problem 3

In [None]:
# Unit tests for Problem 3: Scourgify

import os
import sys
import csv
from unittest.mock import patch

# Create a test input CSV file (newline="" avoids blank rows on Windows)
with open("test_before.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["name", "house"])
    writer.writerow(["Abbott, Hannah", "Hufflepuff"])
    writer.writerow(["Bell, Katie", "Gryffindor"])
    writer.writerow(["Bones, Susan", "Hufflepuff"])

def test_scourgify():
    # Test with valid input and output files
    with patch.object(sys, 'argv', ['scourgify.py', 'test_before.csv', 'test_after.csv']):
        try:
            # Import the scourgify module
            import scourgify
            try:
                scourgify.main()
            except SystemExit:
                pass
            
            # Check if the output file was created
            if os.path.exists("test_after.csv"):
                # Read the output file and check its content
                with open("test_after.csv", "r") as file:
                    reader = csv.DictReader(file)
                    rows = list(reader)
                    
                    # Check if the output has the correct headers
                    assert reader.fieldnames == ["first", "last", "house"], "Incorrect headers in output file"
                    
                    # Check if the first row is correct
                    assert rows[0]["first"] == "Hannah", "Incorrect first name in first row"
                    assert rows[0]["last"] == "Abbott", "Incorrect last name in first row"
                    assert rows[0]["house"] == "Hufflepuff", "Incorrect house in first row"
                    
                    # Check if the second row is correct
                    assert rows[1]["first"] == "Katie", "Incorrect first name in second row"
                    assert rows[1]["last"] == "Bell", "Incorrect last name in second row"
                    assert rows[1]["house"] == "Gryffindor", "Incorrect house in second row"
                    
                    print("Test 1 passed: Valid input and output files")
            else:
                print("Test 1 failed: Output file not created")
        except ImportError:
            print("Test 1 skipped: 'scourgify' module not found")
    
    # Test with incorrect number of arguments
    with patch.object(sys, 'argv', ['scourgify.py', 'test_before.csv']):
        try:
            import scourgify
            try:
                scourgify.main()
                assert False, "Expected SystemExit for incorrect number of arguments"
            except SystemExit:
                print("Test 2 passed: Incorrect number of arguments")
        except ImportError:
            print("Test 2 skipped: 'scourgify' module not found")
    
    # Test with non-existent input file
    with patch.object(sys, 'argv', ['scourgify.py', 'non_existent.csv', 'output.csv']):
        try:
            import scourgify
            try:
                scourgify.main()
                assert False, "Expected SystemExit for non-existent input file"
            except SystemExit:
                print("Test 3 passed: Non-existent input file")
        except ImportError:
            print("Test 3 skipped: 'scourgify' module not found")

# Run the tests
test_scourgify()

# Clean up
if os.path.exists("test_before.csv"):
    os.remove("test_before.csv")
if os.path.exists("test_after.csv"):
    os.remove("test_after.csv")

#### Solution for Problem 3

In [None]:
# Solution for Problem 3: Scourgify

import sys
import csv

def main():
    # Check for exactly two command-line arguments
    if len(sys.argv) != 3:
        sys.exit("Usage: python scourgify.py input.csv output.csv")
    
    input_file = sys.argv[1]
    output_file = sys.argv[2]
    
    try:
        # Read the input CSV file (newline="" is what the csv module recommends)
        with open(input_file, "r", newline="") as infile:
            reader = csv.DictReader(infile)
            
            # Prepare the output data
            students = []
            for row in reader:
                # Split "Last, First" into last name and first name
                last, first = row["name"].split(", ")
                students.append({
                    "first": first,
                    "last": last,
                    "house": row["house"]
                })
    except FileNotFoundError:
        sys.exit(f"Could not read {input_file}")
    
    # Write the output CSV file (outside the try block, so a bad output path
    # is not misreported as a problem reading the input)
    with open(output_file, "w", newline="") as outfile:
        fieldnames = ["first", "last", "house"]
        writer = csv.DictWriter(outfile, fieldnames=fieldnames)
        
        writer.writeheader()
        for student in students:
            writer.writerow(student)

if __name__ == "__main__":
    main()

<a id='problem-4'></a>
## Problem Set 4: Shirt

#### Problem Description
In this problem, you'll implement a program that overlays a shirt image on a person's photo after resizing and cropping the photo to the shirt image's size.

Implement a program in a file called `shirt.py` that:
1. Expects exactly two command-line arguments:
   - The name (or path) of a JPEG or PNG to read as input
   - The name (or path) of a JPEG or PNG to write as output
2. Overlays `shirt.png` on the input after resizing and cropping the input to be the same size as `shirt.png`
3. Saves the result as the output
4. Exits via `sys.exit` if:
   - The user does not specify exactly two command-line arguments
   - The input's or output's name does not end in `.jpg`, `.jpeg`, or `.png` (case-insensitive)
   - The input's name does not have the same extension as the output's name
   - The specified input does not exist

#### Hints
- Install the `pillow` package: `pip install pillow`
- Use `Image.open()` to open the input image
- Use `ImageOps.fit()` to resize and crop the input
- Use the image object's `paste()` method to overlay the shirt
- Use the image object's `save()` method to save the result
- Make sure `shirt.png` is in the same directory as your script
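
The extension rules in the problem description can be checked with the standard library alone, before any image is opened. The following is a minimal sketch: `extension_problem` and `VALID_EXTENSIONS` are hypothetical names chosen for illustration.

```python
import os

VALID_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def extension_problem(input_name, output_name):
    """Return an error message, or None if both names pass the checks.

    Hypothetical helper, for illustration only.
    """
    # splitext returns (root, extension); lower() makes the check case-insensitive
    input_ext = os.path.splitext(input_name)[1].lower()
    output_ext = os.path.splitext(output_name)[1].lower()

    if input_ext not in VALID_EXTENSIONS or output_ext not in VALID_EXTENSIONS:
        return "Invalid file extension"
    if input_ext != output_ext:
        return "Input and output must have the same extension"
    return None


print(extension_problem("before.JPG", "after.jpg"))
print(extension_problem("before.jpg", "after.png"))
print(extension_problem("before.txt", "after.txt"))
```

Note that this comparison treats `.jpg` and `.jpeg` as different extensions, so `before.jpg` paired with `after.jpeg` would be rejected.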

In [None]:
# Write your solution for Problem 4: Shirt
# TODO: Implement your solution here

#### Unit Tests for Problem 4

In [None]:
# Unit tests for Problem 4: Shirt

import os
import sys
from unittest.mock import patch

def test_shirt():
    # Test with incorrect number of arguments
    with patch.object(sys, 'argv', ['shirt.py', 'input.jpg']):
        try:
            import shirt
            try:
                shirt.main()
                assert False, "Expected SystemExit for incorrect number of arguments"
            except SystemExit:
                print("Test 1 passed: Incorrect number of arguments")
        except ImportError:
            print("Test 1 skipped: 'shirt' module not found")
    
    # Test with invalid file extensions
    with patch.object(sys, 'argv', ['shirt.py', 'input.txt', 'output.txt']):
        try:
            import shirt
            try:
                shirt.main()
                assert False, "Expected SystemExit for invalid file extensions"
            except SystemExit:
                print("Test 2 passed: Invalid file extensions")
        except ImportError:
            print("Test 2 skipped: 'shirt' module not found")
    
    # Test with mismatched file extensions
    with patch.object(sys, 'argv', ['shirt.py', 'input.jpg', 'output.png']):
        try:
            import shirt
            try:
                shirt.main()
                assert False, "Expected SystemExit for mismatched file extensions"
            except SystemExit:
                print("Test 3 passed: Mismatched file extensions")
        except ImportError:
            print("Test 3 skipped: 'shirt' module not found")
    
    # Test with non-existent input file
    with patch.object(sys, 'argv', ['shirt.py', 'non_existent.jpg', 'output.jpg']):
        try:
            import shirt
            try:
                shirt.main()
                assert False, "Expected SystemExit for non-existent input file"
            except SystemExit:
                print("Test 4 passed: Non-existent input file")
        except ImportError:
            print("Test 4 skipped: 'shirt' module not found")

# Run the tests
test_shirt()

#### Solution for Problem 4

In [None]:
# Solution for Problem 4: Shirt

import sys
import os

try:
    from PIL import Image, ImageOps
except ImportError:
    sys.exit("pillow package not installed. Run: pip install pillow")

def main():
    # Check for exactly two command-line arguments
    if len(sys.argv) != 3:
        sys.exit("Usage: python shirt.py input.jpg output.jpg")
    
    input_file = sys.argv[1]
    output_file = sys.argv[2]
    
    # Check if the files have valid extensions
    valid_extensions = [".jpg", ".jpeg", ".png"]
    input_ext = os.path.splitext(input_file)[1].lower()
    output_ext = os.path.splitext(output_file)[1].lower()
    
    if input_ext not in valid_extensions or output_ext not in valid_extensions:
        sys.exit("Invalid file extension")
    
    # Check if the input and output have the same extension
    if input_ext != output_ext:
        sys.exit("Input and output must have the same extension")
    
    # Check if the input file exists
    if not os.path.exists(input_file):
        sys.exit("Input does not exist")
    
    try:
        # Open the input image
        input_image = Image.open(input_file)
        
        # Open the shirt image
        shirt_image = Image.open("shirt.png")
        
        # Get the size of the shirt
        size = shirt_image.size
        
        # Resize and crop the input image to match the shirt size
        input_image = ImageOps.fit(input_image, size)
        
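        # Paste the shirt onto the input image, using the shirt itself as the
        # mask so that any transparent pixels leave the photo visible
        input_image.paste(shirt_image, mask=shirt_image)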
        
        # Save the result
        input_image.save(output_file)
    
    except FileNotFoundError:
        sys.exit("Input or shirt file not found")

if __name__ == "__main__":
    main()

## Conclusion

In this lecture, we've explored the fundamentals of File I/O in Python, which allows us to store and retrieve data persistently. We've learned how to:

- Read from and write to text files
- Use the `with` statement for automatic file handling
- Work with CSV files using the `csv` module
- Process and transform data from one format to another
- Manipulate images using the `pillow` package

### Economic Applications of File I/O
File I/O is fundamental to modern economic analysis and research:

1. **Data Analysis:** Reading economic data from CSV files, Excel spreadsheets, or databases for statistical analysis and modeling.

2. **Time Series Analysis:** Working with historical economic data stored in files to identify trends, cycles, and patterns.

3. **Economic Modeling:** Saving and loading model parameters, results, and intermediate calculations for complex economic simulations.

4. **Report Generation:** Creating formatted reports and visualizations from economic analysis results for policymakers and stakeholders.

5. **Data Collection:** Automating the collection and storage of economic data from APIs, web scraping, or other sources.

6. **Reproducible Research:** Ensuring economic research is reproducible by properly documenting and storing data, code, and results.

### Best Practices for Economic Programming

- **Always Use Context Managers:** Use `with` statements to ensure files are properly closed, even when errors occur.

- **Validate Economic Data:** Implement checks to ensure economic data is plausible (e.g., GDP levels are positive, and shares such as unemployment rates fall between 0 and 100 percent).

- **Document Data Sources:** Include metadata in your files about where economic data comes from, when it was collected, and any transformations applied.

- **Handle Missing Values:** Economic data often has missing values; implement strategies to handle them appropriately.

- **Back Up Important Data:** Economic data can be valuable and irreplaceable; implement backup strategies for important datasets.
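
Two of these practices, validating values and handling missing ones, can be sketched with the `csv` module. The column names and figures below are made up for illustration and are not real statistics. The thresholds are assumptions too, so adapt them to your own data. An `io.StringIO` object stands in for a file here; with a real file, the reading loop would sit inside a `with open(...)` block.

```python
import csv
import io

# Made-up sample rows, for illustration only
sample = io.StringIO(
    "year,gdp,unemployment_rate\n"
    "2020,100.0,5.2\n"
    "2021,,6.1\n"
    "2022,-3.0,4.8\n"
    "2023,110.5,140\n"
)

valid_rows = []
problems = []

for row in csv.DictReader(sample):
    year = row["year"]

    # Missing values arrive as empty strings; record them and move on
    if row["gdp"] == "" or row["unemployment_rate"] == "":
        problems.append((year, "missing value"))
        continue

    gdp = float(row["gdp"])
    rate = float(row["unemployment_rate"])

    if gdp <= 0:
        problems.append((year, "GDP level must be positive"))
    elif not 0 <= rate <= 100:
        problems.append((year, "rate outside 0-100"))
    else:
        valid_rows.append({"year": int(year), "gdp": gdp, "unemployment_rate": rate})

print(valid_rows)
print(problems)
```

Collecting problems in a list, rather than stopping at the first one, makes it easier to review every questionable row before deciding how to treat it.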

The problem sets in this lecture have given you hands-on experience with various file operations, from counting lines of code to processing CSV data and manipulating images. These practical skills will serve you well in your future programming endeavors.