# Lesson 4: Working with Files

So far, all the data we've used in our programs has disappeared the moment the program finishes. To store data permanently, we need to save it to a file. In this lesson, we will explore how to read from and write to text files using Python. We will also learn how to handle file paths and manage potential errors.

## 1. File Paths: Locating Your Files

Before you can open a file, you need to tell Python where to find it. This is done using a file path.

* **Relative Path**: A path interpreted relative to the current working directory (the folder Python is run from, which is not necessarily the folder that contains your script). It's shorter and more portable. Example: `data/my_file.txt`.
* **Absolute Path**: A full path starting from the root directory of your file system. It's very specific but less portable. 
    * Windows Example: `C:\Users\YourUser\Documents\my_file.txt`
    * macOS/Linux Example: `/home/youruser/documents/my_file.txt`

**Best Practice:** For projects, it's almost always better to use relative paths. As long as the project folder keeps the same internal layout and the code is run from the expected working directory, it will work on another computer without changes.
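To make the difference concrete, here is a minimal sketch that resolves a relative path against the current working directory. The path `data/my_file.txt` is only an illustration and does not need to exist for this code to run.

```python
import os

# A relative path, written the same way on every machine
relative_path = os.path.join('data', 'my_file.txt')

# Relative paths are resolved against the current working directory
cwd = os.getcwd()

# os.path.abspath() converts a relative path into an absolute one
absolute_path = os.path.abspath(relative_path)

print(f"Current working directory: {cwd}")
print(f"Relative path: {relative_path}")
print(f"Absolute path: {absolute_path}")
print(f"Is the relative path absolute? {os.path.isabs(relative_path)}")
print(f"Is the resolved path absolute? {os.path.isabs(absolute_path)}")
```

The absolute path printed here will differ from computer to computer, while the relative path stays the same. That is why relative paths are the more portable choice.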

### Working with Paths Across Operating Systems

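As you can see, Windows uses a backslash (`\`) as its path separator, while macOS and Linux use a forward slash (`/`). A path with a hard-coded separator may not work on another operating system; for example, a backslash is an ordinary filename character on macOS and Linux, not a separator. The `os` module helps us write code that works everywhere.

The `os.path.join()` function joins path components using the correct separator for the operating system your code is running on.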

In [None]:
import os

# This will create 'data/output' on Mac/Linux and 'data\output' on Windows
folder_path = os.path.join('data', 'output')

print(f"The constructed path is: {folder_path}")
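As a further sketch, `os.path.join()` accepts any number of components, and `os.path.dirname()` and `os.path.basename()` take a path apart again. The file name `report.txt` below is hypothetical and used only for illustration.

```python
import os

# os.sep is the separator used by the current operating system
print(f"Path separator on this system: {os.sep!r}")

# os.path.join() accepts any number of components
# ('report.txt' is a hypothetical file name)
report_path = os.path.join('data', 'output', 'report.txt')
print(f"Joined path: {report_path}")

# Split the path back into its folder part and its file name
folder_part = os.path.dirname(report_path)
file_part = os.path.basename(report_path)
print(f"Folder: {folder_part}")
print(f"File name: {file_part}")
```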

## 2. Setting Up Our Workspace

Let's create the files and folders we'll need for this lesson. We'll create a `data` folder, and inside it, we'll place an `input.txt` file. We'll also create an `output` subfolder inside `data`.

We use `os.makedirs(..., exist_ok=True)` to create directories, including any missing parent directories. With `exist_ok=True`, no error is raised if the directory already exists.

In [None]:
# Create the main data directory
data_dir = 'data'
os.makedirs(data_dir, exist_ok=True)

# Create an output subdirectory
output_dir = os.path.join(data_dir, 'output')
os.makedirs(output_dir, exist_ok=True)

# Define the path for our input file
input_file_path = os.path.join(data_dir, 'input.txt')

# Create and write some content to the input file
# The 'w' mode opens the file for writing: it is created if it doesn't exist and overwritten if it does.
with open(input_file_path, 'w') as f:
    f.write("Hello Python learners!\n")
    f.write("This is the second line.\n")
    f.write("File I/O is a fundamental skill.\n")

print(f"Successfully created '{input_file_path}' and the directory structure.")
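To see what `exist_ok=True` changes, the following sketch creates a nested directory twice, once with the flag and once without. It works inside a temporary directory (an assumption made here so that the example does not touch the lesson's `data` folder).

```python
import os
import tempfile

# Work inside a temporary directory so nothing in the project is touched
with tempfile.TemporaryDirectory() as tmp_dir:
    nested_dir = os.path.join(tmp_dir, 'data', 'output')

    # Creates both 'data' and its 'output' subfolder in one call
    os.makedirs(nested_dir, exist_ok=True)

    # Calling it again is harmless because of exist_ok=True
    os.makedirs(nested_dir, exist_ok=True)

    # Without exist_ok=True, an existing directory raises FileExistsError
    try:
        os.makedirs(nested_dir)
        raised_error = False
    except FileExistsError:
        raised_error = True

    directory_exists = os.path.isdir(nested_dir)

print(f"Directory was created: {directory_exists}")
print(f"Call without exist_ok raised FileExistsError: {raised_error}")
```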

## 3. Reading from Files

The recommended way to open a file is the `with open(...) as ...:` statement. The `with` statement ensures the file is closed automatically when the indented block ends, even if an error occurs inside it.

**Syntax:** `with open('path/to/file', 'mode') as file_variable:`

**Common Modes:**
* `'r'` - **Read** (default). Opens a file for reading; raises `FileNotFoundError` if the file does not exist.
* `'w'` - **Write**. Opens a file for writing; creates the file if it does not exist and **overwrites** its content if it does.
* `'a'` - **Append**. Opens a file for appending; creates the file if it does not exist and adds new content to the end of the file.
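The modes above behave differently when the file is missing. The sketch below tries to read a file that does not exist and handles the resulting `FileNotFoundError` with `try`/`except`, then shows that `'a'` mode creates the same file. The file name `does_not_exist.txt` and the temporary directory are assumptions for illustration only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file that has not been created yet
    missing_path = os.path.join(tmp_dir, 'does_not_exist.txt')

    # 'r' mode raises FileNotFoundError when the file is missing
    try:
        with open(missing_path, 'r') as f:
            content = f.read()
    except FileNotFoundError:
        content = None
        print("The file was not found, so there is nothing to read.")

    # 'a' mode (like 'w') creates the file instead of raising an error
    with open(missing_path, 'a') as f:
        f.write("Created by append mode.\n")

    exists_after_append = os.path.exists(missing_path)

print(f"File exists after opening in 'a' mode: {exists_after_append}")
```

Catching the specific exception, rather than using a bare `except:`, keeps other unexpected errors visible.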

### Method 1: Reading the Entire File (`.read()`)

This reads the entire content of the file into a single string. It's simple but can be inefficient for very large files as it loads everything into memory.

In [None]:
with open(input_file_path, 'r') as f:
    content = f.read()
    print("--- Content read with .read() ---")
    print(content)

### Method 2: Reading Line by Line (Iteration - Most Common)

This is the most common and memory-efficient way to process a file. You can loop over the file object directly to get one line at a time, so the whole file never has to be loaded into memory at once.

In [None]:
print("--- Reading line by line ---")
with open(input_file_path, 'r') as f:
    for line in f:
        # .strip() removes leading and trailing whitespace, including the invisible newline character (\n)
        print(line.strip())
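Note that `.strip()` removes more than the newline. If leading spaces or indentation matter, `.rstrip('\n')` is a narrower alternative. The sample string below is made up for illustration.

```python
# A made-up line with leading spaces, trailing spaces, and a newline
line = "    indented text  \n"

# .strip() removes all leading and trailing whitespace
stripped = line.strip()

# .rstrip('\n') removes only trailing newline characters
newline_removed = line.rstrip('\n')

# repr() makes the remaining whitespace visible
print(repr(stripped))         # 'indented text'
print(repr(newline_removed))  # '    indented text  '
```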

### Method 3: Reading All Lines into a List (`.readlines()`)

This reads all lines and returns them as a list of strings. Each string keeps its trailing newline character `\n` (the last one only if the file itself ends with a newline).

In [None]:
with open(input_file_path, 'r') as f:
    lines_list = f.readlines()
    print("--- Content read with .readlines() ---")
    print(lines_list)
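If the newline characters are not wanted, one option is to call `.read()` and then `.splitlines()` on the result. The sketch below compares the two approaches on a small sample file written to a temporary directory (both the file name and its content are assumptions for illustration).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.txt')

    # Note: the last line deliberately has no trailing newline
    with open(sample_path, 'w') as f:
        f.write("first\nsecond")

    # .readlines() keeps the newline characters that are present
    with open(sample_path, 'r') as f:
        raw_lines = f.readlines()

    # .read().splitlines() drops them
    with open(sample_path, 'r') as f:
        clean_lines = f.read().splitlines()

print(raw_lines)    # ['first\n', 'second']
print(clean_lines)  # ['first', 'second']
```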

## 4. Writing to Files

We use the `'w'` (write) or `'a'` (append) modes to write to files.

### Using Mode `'w'` (Write)

**Warning:** Opening an existing file in `'w'` mode immediately **erases** its contents, even before anything new is written.

In [None]:
output_file_w = os.path.join(output_dir, 'written_file.txt')

with open(output_file_w, 'w') as f:
    f.write("This is the first line.\n")
    f.write("This is the second line.\n")

print(f"'{output_file_w}' created and written to.")

# Let's verify by reading it back
with open(output_file_w, 'r') as f:
    print(f.read())
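The overwrite behavior is easy to miss, so here is a small sketch that opens the same file in `'w'` mode twice. It uses a temporary directory and a hypothetical file name so that the files created in this lesson are left alone.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'overwrite_demo.txt')

    # First write: the file is created
    with open(demo_path, 'w') as f:
        f.write("Original content.\n")

    # Second write in 'w' mode: the original content is erased
    with open(demo_path, 'w') as f:
        f.write("Replacement content.\n")

    with open(demo_path, 'r') as f:
        final_content = f.read()

# Only the text from the second write remains
print(final_content)
```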

### Using Mode `'a'` (Append)

This adds content to the end of the file without deleting its existing content.

In [None]:
with open(output_file_w, 'a') as f:
    f.write("This is a third line, appended later.\n")

print(f"Content appended to '{output_file_w}'.")

# Let's verify again
with open(output_file_w, 'r') as f:
    print(f.read())

## 5. Practice Exercise: Process and Write

Let's combine everything we've learned. Your task is to write a program that:

1.  Reads the content from `data/input.txt` line by line.
2.  Processes each line by converting it to uppercase and adding a line number at the beginning.
3.  Writes the processed lines to a new file called `data/output/summary.txt`.
4.  Prints a confirmation message to the console when done.

In [None]:
# Define file paths
input_path = os.path.join('data', 'input.txt')
output_path = os.path.join('data', 'output', 'summary.txt')

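# Open both files in one with statement so both are closed automatically
with open(input_path, 'r') as infile, open(output_path, 'w') as outfile:
    # enumerate(..., start=1) supplies the line number for each line
    for line_number, line in enumerate(infile, start=1):
        # Process the line: strip whitespace, convert to uppercase, add the line number
        processed_line = f"Line {line_number}: {line.strip().upper()}\n"

        # Write to the output file
        outfile.write(processed_line)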
        
print(f"Successfully processed the file and saved the summary to '{output_path}'.")
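One possible extension is to wrap the same logic in a function so it can be reused and checked. The function name `number_and_uppercase` and the sample file below are hypothetical; this is a sketch of the approach, not the only way to structure it.

```python
import os
import tempfile


def number_and_uppercase(input_path, output_path):
    """Write a numbered, uppercased copy of input_path to output_path.

    Returns the number of lines written.
    """
    count = 0
    with open(input_path, 'r') as infile, open(output_path, 'w') as outfile:
        for count, line in enumerate(infile, start=1):
            outfile.write(f"Line {count}: {line.strip().upper()}\n")
    return count


# Try the function on a small sample file in a temporary directory
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_input = os.path.join(tmp_dir, 'input.txt')
    sample_output = os.path.join(tmp_dir, 'summary.txt')

    with open(sample_input, 'w') as f:
        f.write("Hello Python learners!\n")
        f.write("This is the second line.\n")

    lines_written = number_and_uppercase(sample_input, sample_output)

    # Read the result back to verify it
    with open(sample_output, 'r') as f:
        summary_text = f.read()

print(f"Lines written: {lines_written}")
print(summary_text)
```

Reading the output file back, as done at the end, is a simple way to confirm that the program wrote what you expected.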