# Chapter 3: Python File Operations for Beginners

Welcome! In this notebook, you'll learn how to work with files in Python step by step. We'll keep it practical and beginner-friendly with short examples and mini-exercises.

What you'll learn:
- What file paths are (absolute vs. relative) and how to work with them safely
- How to open and close files the right way (context managers)
- How to read from and write to text files
- How to append to existing files
- How to handle encodings (like UTF-8) and basic errors
- How to work with CSV and JSON files using Python's standard library
- How to read/write binary files (like images)
- A few quick exercises to practice

Tip: Run cells from top to bottom. If a cell fails, re-run the earlier ones first.

In [None]:
# Setup: Create a working folder for this notebook
from pathlib import Path

BASE = Path("./_file_ops_demo")
BASE.mkdir(exist_ok=True)
print("Working in:", BASE.resolve())

# List the files currently in our working folder (prints nothing on the first run)
for p in sorted(BASE.glob("**/*")):
    print("-", p)

## Understanding File Paths

- Absolute path: starts from the filesystem root (e.g., `/home/user/docs/file.txt`, or a drive such as `C:\` on Windows).
- Relative path: starts from the current working directory (usually this notebook's folder).

We'll use `pathlib` for paths. It's safer and more readable than string paths.

Examples:
- `Path("notes.txt")` — a file in the current folder
- `Path("data") / "notes.txt"` — uses `/` to join paths in a cross-platform way
- `p.exists()`, `p.is_file()`, `p.is_dir()` — quick checks
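
To make the absolute/relative distinction concrete, here is a minimal sketch. The file names are hypothetical and nothing is created on disk; `Path.cwd()` is used to anchor the relative path at the current working directory.

```python
from pathlib import Path

# Hypothetical file name, used only for illustration
relative = Path("data") / "notes.txt"
print(relative.is_absolute())     # False

# Joining with the current working directory gives an absolute path
absolute = Path.cwd() / relative
print(absolute.is_absolute())     # True

# Path objects expose their components as attributes
print(relative.name)      # notes.txt
print(relative.stem)      # notes
print(relative.suffix)    # .txt
print(relative.parent)    # data
```

The same code works on Windows, macOS, and Linux because `pathlib` picks the right separator for the platform.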

In [None]:
from pathlib import Path

subdir = BASE / "data"
subdir.mkdir(exist_ok=True)

file_path = subdir / "hello.txt"
print("Absolute:", file_path.resolve())
print("Exists?", file_path.exists())
print("Parent:", file_path.parent)
print("Name:", file_path.name)

## Opening Files Safely with Context Managers

Use `with open(...) as f:` to ensure the file is closed automatically.
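
A quick way to see this is to check the file object's `closed` attribute. The sketch below uses a temporary folder and a hypothetical file name so it does not touch your demo files.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "closed_demo.txt"   # hypothetical file name

    with open(demo, "w", encoding="utf-8") as f:
        f.write("hi\n")
        inside = f.closed      # still open inside the block
    after = f.closed           # closed as soon as the block ends

    # The file is closed even when the block exits with an exception
    try:
        with open(demo, encoding="utf-8") as g:
            raise ValueError("simulated failure")
    except ValueError:
        pass
    after_error = g.closed

print(inside, after, after_error)   # False True True
```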

Common modes:
- `"r"` — read (default)
- `"w"` — write (overwrite if exists)
- `"a"` — append (add to the end)
- `"x"` — create (error if exists)
- add `"b"` for binary (e.g., `"rb"`, `"wb"`)

Text files often use `encoding="utf-8"`.
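
The sketch below walks through `"x"`, `"a"`, and `"w"` on one file so you can compare their effects. It uses a temporary folder and a hypothetical file name.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "modes.txt"   # hypothetical file name

    # "x": create the file; it must not exist yet
    with open(path, "x", encoding="utf-8") as f:
        f.write("first\n")

    # A second "x" fails because the file now exists
    try:
        open(path, "x", encoding="utf-8")
    except FileExistsError as e:
        x_error = type(e).__name__

    # "a": keep existing content and add to the end
    with open(path, "a", encoding="utf-8") as f:
        f.write("second\n")
    after_append = path.read_text(encoding="utf-8")

    # "w": discard existing content and start over
    with open(path, "w", encoding="utf-8") as f:
        f.write("replaced\n")
    after_write = path.read_text(encoding="utf-8")

print(x_error)              # FileExistsError
print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_write))    # 'replaced\n'
```

Mode `"x"` is a handy safeguard when overwriting an existing file by accident would be costly.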

In [None]:
msg_lines = [
    "Hello, file!\n",
    "This is line 2.\n",
    "Unicode: café, 🚀\n",
]

with open(file_path, mode="w", encoding="utf-8") as f:
    # writelines() does not add newlines; each string above already ends with "\n"
    f.writelines(msg_lines)

print("Wrote:", file_path, "(size:", file_path.stat().st_size, "bytes)")

In [None]:
content = file_path.read_text(encoding="utf-8")
print("\nread_text():\n", content)

In [None]:
print("\nIterating lines:")
with open(file_path, encoding="utf-8") as f:
    for i, line in enumerate(f, start=1):
        print(f"{i:02d}:", line.rstrip("\n"))

### Writing vs. Appending vs. Reading

- Write (`"w"`) overwrites the file or creates it if missing.
- Append (`"a"`) adds new content at the end without deleting existing data.
- Read (`"r"`) opens an existing file for reading; it fails if the file doesn't exist.

Helpful methods:
- `read()` — whole content as one string
- `readline()` — one line at a time
- `readlines()` — list of lines (be mindful of memory for big files)
- Iterating `for line in f:` — memory-friendly for large files
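
Here is a small side-by-side sketch of these four approaches. It assumes a three-line file created in a temporary folder; the file name is hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "lines.txt"   # hypothetical file name
    path.write_text("one\ntwo\nthree\n", encoding="utf-8")

    with open(path, encoding="utf-8") as f:
        whole = f.read()          # entire file as one string

    with open(path, encoding="utf-8") as f:
        first = f.readline()      # the trailing newline is kept
        second = f.readline()     # continues where the last call stopped
        rest = f.readlines()      # remaining lines as a list

    with open(path, encoding="utf-8") as f:
        # Iteration reads one line at a time instead of loading everything
        stripped = [line.rstrip("\n") for line in f]

print(repr(whole))    # 'one\ntwo\nthree\n'
print(repr(first))    # 'one\n'
print(repr(second))   # 'two\n'
print(rest)           # ['three\n']
print(stripped)       # ['one', 'two', 'three']
```

Notice that the file object remembers its position: `readlines()` returns only the lines that have not been read yet.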

In [None]:
# Append a new line and show the result
with open(file_path, mode="a", encoding="utf-8") as f:
    f.write("Appended line.\n")

print("After append:")
print(file_path.read_text(encoding="utf-8"))

## Encodings and Common Errors

- Always prefer `encoding="utf-8"` for text files unless you know otherwise.
- If you see `UnicodeDecodeError`, try specifying the correct encoding or use `errors="replace"` to avoid crashes.
- Use `try`/`except` to handle missing files or permission issues.

Examples of exceptions:
- `FileNotFoundError` — path doesn't exist
- `PermissionError` — no permission to read/write
- `IsADirectoryError` — tried to open a directory as a file

In [None]:
missing = BASE / "does_not_exist.txt"
try:
    print(missing.read_text(encoding="utf-8"))
except FileNotFoundError as e:
    print("Caught:", type(e).__name__, e)


In [None]:
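# Encoding example: write UTF-8, read it with the wrong encodings, then fix
utf_file = BASE / "utf_demo.txt"
utf_file.write_text("Café ☕", encoding="utf-8")

# latin-1 maps every byte to some character, so it never raises;
# it silently returns garbled text instead
print("Wrong decoding (latin-1):", repr(utf_file.read_text(encoding="latin-1")))

try:
    # ascii cannot decode bytes above 0x7F, so this raises UnicodeDecodeError
    print(utf_file.read_text(encoding="ascii"))
except UnicodeDecodeError as e:
    print("Decode error:", e)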

print("Correct decoding:", utf_file.read_text(encoding="utf-8"))

## Working with CSV Files (Comma-Separated Values)

CSV is a simple text format for tabular data. We'll use Python's built-in `csv` module.

- Use `csv.writer` to write rows (lists or tuples)
- Use `csv.reader` to read rows
- Pass `newline=""` when opening CSV files, for both reading and writing. Without it, writing can produce extra blank lines on Windows, and newlines inside quoted fields may be misread.
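
Two details often surprise beginners: the `csv` module quotes fields that contain commas, and every value comes back as a string when you read. The sketch below shows both, using a temporary folder and a hypothetical file name.

```python
import csv
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cities.csv"   # hypothetical file name

    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "age", "city"])
        writer.writerow(["Alice", 30, "Paris, France"])   # comma inside a field

    raw = path.read_text(encoding="utf-8")

    with open(path, encoding="utf-8", newline="") as f:
        rows = list(csv.reader(f))

print(raw)       # the city field is wrapped in double quotes
print(rows[1])   # ['Alice', '30', 'Paris, France']
```

The age was written as the integer `30` but is read back as the string `'30'`, so convert with `int()` or `float()` before doing arithmetic.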

In [None]:
import csv

csv_path = BASE / "people.csv"
rows = [
    ["name", "age", "city"],
    ["Alice", 30, "Paris"],
    ["Bob", 25, "Berlin"],
]

with open(csv_path, mode="w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

print("CSV written:", csv_path)

In [None]:
# Read it back
with open(csv_path, encoding="utf-8", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

## Working with JSON Files (JavaScript Object Notation)

JSON stores structured data (dictionaries/lists). We'll use the built-in `json` module.

- `json.dump(obj, file)` writes JSON to a file
- `json.load(file)` reads JSON from a file
- Use `indent=2` for pretty printing
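- Remember to use `encoding="utf-8"` when opening the file, since JSON is text
- Pass `ensure_ascii=False` to keep non-ASCII characters readable instead of writing them as `\uXXXX` escapes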
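
JSON has fewer types than Python, so a round trip can change your data slightly. The sketch below, with a made-up record and a hypothetical file name, shows how a tuple, `True`, and `None` are stored and what comes back.

```python
import json
import tempfile
from pathlib import Path

# Made-up data for illustration
record = {"name": "Zoë", "point": (1, 2), "active": True, "note": None}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "record.json"   # hypothetical file name

    with open(path, "w", encoding="utf-8") as f:
        json.dump(record, f, ensure_ascii=False, indent=2)

    text = path.read_text(encoding="utf-8")

    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)

print(text)              # True is stored as true, None as null
print(loaded["point"])   # [1, 2]
```

The tuple comes back as a list because JSON only has arrays, so `loaded` is not equal to the original `record`.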

In [None]:
import json

json_path = BASE / "config.json"
config = {
    "app": "demo",
    "version": 1,
    "features": ["read", "write", "append"],
}

with open(json_path, mode="w", encoding="utf-8") as f:
    json.dump(config, f, ensure_ascii=False, indent=2)

print("JSON written:", json_path)


In [None]:
with open(json_path, encoding="utf-8") as f:
    data = json.load(f)

print("Loaded JSON:", data)
print("Feature count:", len(data["features"]))

## Binary Files (Images, PDFs, etc.)

Binary files aren't text. Open them with modes like `"rb"` (read binary) or `"wb"` (write binary).

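In binary mode, Python reads and writes raw `bytes` with no decoding and no newline translation. Use it when copying images or other non-text files; text mode can fail on them or corrupt them.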
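
For large files, reading everything at once can use a lot of memory. One common alternative is to copy in fixed-size chunks. The helper `copy_in_chunks` below is a hypothetical name for this sketch, and the 64 KiB chunk size is an arbitrary choice.

```python
import tempfile
from pathlib import Path

def copy_in_chunks(src: Path, dst: Path, chunk_size: int = 64 * 1024) -> int:
    """Copy src to dst in fixed-size chunks; return the number of bytes copied."""
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:      # an empty bytes object means end of file
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "blob.bin"        # hypothetical file names
    dst = Path(tmp) / "blob_copy.bin"
    payload = bytes(range(256)) * 1000  # 256,000 bytes of arbitrary binary data
    src.write_bytes(payload)

    n = copy_in_chunks(src, dst)
    same = dst.read_bytes() == payload

print(n, same)   # 256000 True
```

In everyday code, the standard library's `shutil.copyfile(src, dst)` does this job for you; writing the loop by hand is mainly useful for understanding how binary reads work.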

In [None]:
# Copy an image using binary mode
src_img = Path("./images/img1.png")
dst_img = BASE / "copy_img1.png"

if src_img.exists():
    with open(src_img, "rb") as src, open(dst_img, "wb") as dst:
        # src.read() loads the whole file into memory, which is fine for small files
        dst.write(src.read())
    print("Copied:", src_img, "->", dst_img)
else:
    print("Source image not found at", src_img)

## Mini-Exercises

Try these tasks yourself. Create new cells below each task to write your solution.

1) Text file practice
- Create a file `notes.txt` inside the demo folder
- Write three lines to it using `with open(..., "w", encoding="utf-8")`
- Append one more line using `"a"`
- Read it back and print line numbers

2) CSV practice
- Create a new CSV `scores.csv` with header `name,score`
- Add at least 3 rows
- Read it back and print the average score

3) JSON practice
- Create a JSON file `settings.json` with keys: `theme` (string), `autosave` (bool), `recent_files` (list)
- Read it back and print the number of recent files

Bonus) Robust reading
- Write a function `safe_read(path)` that returns file text if it exists, otherwise returns `"<missing>"` without raising

Tip: Reuse `BASE` for your file paths: `BASE / "notes.txt"`

## Optional Solutions (peek only if stuck)

The following cells show sample solutions. Try on your own first!

In [None]:
# Solution 1) Text file practice
notes = BASE / "notes.txt"
with open(notes, "w", encoding="utf-8") as f:
    f.write("Line A\nLine B\nLine C\n")
with open(notes, "a", encoding="utf-8") as f:
    f.write("Line D\n")

with open(notes, encoding="utf-8") as f:
    for i, line in enumerate(f, 1):
        print(f"{i:02d}:", line.rstrip())

In [None]:
# Solution 2) CSV practice
import csv
scores = BASE / "scores.csv"
with open(scores, "w", encoding="utf-8", newline="") as f:
    w = csv.writer(f)
    w.writerow(["name", "score"])
    w.writerows([["Ann", 80], ["Ben", 92], ["Cara", 76]])

vals = []
with open(scores, encoding="utf-8", newline="") as f:
    r = csv.DictReader(f)
    for row in r:
        vals.append(float(row["score"]))

print("Average:", sum(vals) / len(vals))

In [None]:
# Solution 3) JSON practice
import json
settings = BASE / "settings.json"
obj = {"theme": "light", "autosave": True, "recent_files": ["a.txt", "b.txt"]}
with open(settings, "w", encoding="utf-8") as f:
    json.dump(obj, f, ensure_ascii=False, indent=2)

with open(settings, encoding="utf-8") as f:
    data = json.load(f)
print("Recent count:", len(data.get("recent_files", [])))

In [None]:
# Bonus) Robust reading
from pathlib import Path

def safe_read(path: Path, encoding: str = "utf-8") -> str:
    try:
        return path.read_text(encoding=encoding)
    except FileNotFoundError:
        return "<missing>"

print(safe_read(BASE / "nope.txt"))