### Advanced Exercises: Writing CSV Files

In this notebook you will practice more advanced, but still practical, ways of writing CSV files with Python's built-in `csv` module.

Each exercise describes a scenario and then shows one possible **solution implementation** that follows common best practices:

- Open CSV files with `newline=''` when writing.
- Use an explicit `encoding` (for example `utf-8`).
- Prefer context managers (`with open(...) as f:`).
- Use `csv.DictWriter` when you naturally work with dictionaries.
- Use dialects / quoting / line terminators when you need custom formatting.

In [1]:
import csv
from pathlib import Path

#### Exercise 1 – Write a UTF-8 CSV with Best Practices

You have the following in-memory data (notice the non-ASCII characters):

```python
rows = [
    ['first_name', 'last_name', 'country'],
    ['Łukasz', 'Kowalski', 'Poland'],
    ['Zoë', "D'Amico", 'Italy'],
    ['María', 'García', 'Spain'],
]
```

**Goal:**

1. Write these rows to a file called `people_utf8.csv` using the default `excel` dialect.
2. Follow these best practices:
   * open the file with `newline=''`;
   * specify `encoding='utf-8'`;
   * use a context manager (`with`).
3. After writing, read the file back and print its decoded text so you can manually inspect the result.

A possible solution is shown below.

In [2]:
rows = [
    ["first_name", "last_name", "country"],
    ["Łukasz", "Kowalski", "Poland"],
    ["Zoë", "D'Amico", "Italy"],
    ["María", "García", "Spain"],
]

output_path = Path("people_utf8.csv")

# --- Solution ---
with output_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(rows)

# verify: read raw text back
print(output_path.read_text(encoding="utf-8"))

first_name,last_name,country
Łukasz,Kowalski,Poland
Zoë,D'Amico,Italy
María,García,Spain


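The printed text above hides one detail: reading in text mode normally translates line endings, so you cannot see which terminator the writer actually used. The sketch below is one way to check, assuming a throwaway file in a temporary directory (the name `people_demo.csv` is only illustrative). It reads the file back as bytes, which shows both the `\r\n` terminator of the default `excel` dialect and the multi-byte UTF-8 encoding of `Ł`.

```python
import csv
import tempfile
from pathlib import Path

rows = [
    ["first_name", "last_name"],
    ["Łukasz", "Kowalski"],
]

# Write into a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "people_demo.csv"  # illustrative file name
    with demo_path.open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

    # Bytes are returned untouched: no decoding, no newline translation.
    raw_bytes = demo_path.read_bytes()

print(raw_bytes)
```

Each row in the byte string ends with `\r\n`, which is why `newline=''` matters: it stops the text layer from translating the terminator a second time on platforms that would otherwise do so.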

#### Exercise 2 – Using `DictWriter` with a Custom Header Order

Suppose you are generating a sales report from a list of dictionaries:

```python
sales = [
    {'product': 'Laptop', 'units': 10, 'price': 1200.0},
    {'product': 'Mouse', 'units': 50, 'price': 25.5},
    {'product': 'Monitor', 'units': 20, 'price': 300.0},
]
```

**Goal:**

1. Write these records to a CSV file named `sales_report.csv`.
2. Use `csv.DictWriter`.
3. Ensure the column order in the file is exactly: `product`, `units`, `price`.
4. Write a header row.

Below is one possible solution.

In [3]:
sales = [
    {"product": "Laptop", "units": 10, "price": 1200.0},
    {"product": "Mouse", "units": 50, "price": 25.5},
    {"product": "Monitor", "units": 20, "price": 300.0},
]

fieldnames = ["product", "units", "price"]
sales_path = Path("sales_report.csv")

# --- Solution ---
with sales_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(sales)

print(sales_path.read_text(encoding="utf-8"))

product,units,price
Laptop,10,1200.0
Mouse,50,25.5
Monitor,20,300.0


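Real records are not always as tidy as the `sales` list. As a hedged sketch, the example below uses two optional `csv.DictWriter` parameters: `restval` fills in keys that a record is missing, and `extrasaction="ignore"` silently drops keys that are not in `fieldnames` (the default, `"raise"`, would raise a `ValueError` instead). The `messy_sales` data and the `internal_note` key are made up for illustration, and an in-memory `io.StringIO` buffer stands in for a file.

```python
import csv
import io

fieldnames = ["product", "units", "price"]

# Hypothetical records: one has an extra key, one is missing "price".
messy_sales = [
    {"product": "Laptop", "units": 10, "price": 1200.0, "internal_note": "check stock"},
    {"product": "Mouse", "units": 50},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=fieldnames,
    restval="N/A",            # written for any missing key
    extrasaction="ignore",    # drop keys that are not in fieldnames
)
writer.writeheader()
writer.writerows(messy_sales)

dict_output = buffer.getvalue()
print(dict_output)
```

Whether to ignore or raise on unexpected keys is a design choice: raising surfaces upstream data problems early, while ignoring keeps a report job running.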

#### Exercise 3 – Register and Use a Custom Dialect

Your organization uses a semicolon-separated CSV format with these rules:

- Delimiter: `;`
- Quote character: `"` (double quote)
- Use `csv.QUOTE_MINIMAL` so that only fields containing special characters (the delimiter, the quote character, or a line break) are quoted. Ordinary spaces do not trigger quoting.

You are given this data:

```python
records = [
    ['id', 'name', 'role'],
    ['1', 'Alice Johnson', 'Data; Science'],
    ['2', 'Bob Stone', 'Developer'],
    ['3', 'Carol Smith', 'Data Engineer'],
]
```

**Goal:**

1. Register a dialect named `'org_semicolon'` with the options above.
2. Use `csv.writer(..., dialect='org_semicolon')` to write `records` to `org_records.csv`.
3. Verify the raw file content by printing it.

A possible solution is shown below.

In [4]:
records = [
    ["id", "name", "role"],
    ["1", "Alice Johnson", "Data; Science"],
    ["2", "Bob Stone", "Developer"],
    ["3", "Carol Smith", "Data Engineer"],
]

# --- Solution ---
csv.register_dialect(
    "org_semicolon",
    delimiter=";",
    quotechar="\"",
    quoting=csv.QUOTE_MINIMAL,
)

org_path = Path("org_records.csv")
with org_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, dialect="org_semicolon")
    writer.writerows(records)

print(org_path.read_text(encoding="utf-8"))

id;name;role
1;Alice Johnson;"Data; Science"
2;Bob Stone;Developer
3;Carol Smith;Data Engineer


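To see what `csv.QUOTE_MINIMAL` does and does not quote, and to confirm that the format round-trips, the sketch below writes to an in-memory buffer and reads the text back with the same dialect. It registers the dialect under a separate, illustrative name (`org_semicolon_demo`), and the second record is invented to show how an embedded quote character is doubled.

```python
import csv
import io

# Registered under a demo name so it does not replace 'org_semicolon'.
csv.register_dialect(
    "org_semicolon_demo",
    delimiter=";",
    quotechar='"',
    quoting=csv.QUOTE_MINIMAL,
)

demo_records = [
    ["1", "Alice Johnson", "Data; Science"],        # contains the delimiter
    ["2", 'Bob "The Rock" Stone', "Developer"],     # contains the quote character
]

buffer = io.StringIO()
csv.writer(buffer, dialect="org_semicolon_demo").writerows(demo_records)
written_text = buffer.getvalue()
print(written_text)

# Read the same text back with the same dialect.
buffer.seek(0)
round_trip = list(csv.reader(buffer, dialect="org_semicolon_demo"))
print(round_trip == demo_records)
```

Only the fields containing `;` or `"` are wrapped in quotes; `Alice Johnson` is written as-is even though it contains a space.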

#### Exercise 4 – Appending Rows to an Existing CSV File

Sometimes CSV files are used as simple logs that grow over time.

**Scenario:**

You want to maintain a log file named `event_log.csv` with the columns `timestamp`, `event`, `details`.

1. First, create the file with a header row and a single entry.
2. Later, you append two more entries without rewriting the whole file.

**Goal:**

- Show how to create the CSV (with a header) if it does not exist yet.
- Show how to append new rows using `mode='a'` (append), `newline=''`, and the same encoding and delimiter as the original file.

Below is a complete solution that does both steps.

In [5]:
log_path = Path("event_log.csv")

# --- Solution ---
# Step 1: create the file with a header and one row
header = ["timestamp", "event", "details"]
first_row = ["2025-01-01T10:00:00", "START", "Application started"]

with log_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(header)
    writer.writerow(first_row)

# Step 2: append more rows later
extra_rows = [
    ["2025-01-01T10:05:00", "INFO", "User logged in"],
    ["2025-01-01T10:06:30", "ERROR", "Failed to connect to database"],
]

with log_path.open("a", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerows(extra_rows)

print(log_path.read_text(encoding="utf-8"))

timestamp,event,details
2025-01-01T10:00:00,START,Application started
2025-01-01T10:05:00,INFO,User logged in
2025-01-01T10:06:30,ERROR,Failed to connect to database


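The solution above always recreates the file in step 1. If the log might already exist, one possible approach is to open it in append mode every time and write the header only when the file is missing or empty. The helper name `append_event` and the file name `event_log_demo.csv` below are hypothetical, and the sketch assumes a single writer process (it does no file locking).

```python
import csv
import tempfile
from pathlib import Path

HEADER = ["timestamp", "event", "details"]


def append_event(log_path: Path, row: list[str]) -> None:
    """Append one row, writing the header first if the file is new or empty."""
    needs_header = not log_path.exists() or log_path.stat().st_size == 0
    with log_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if needs_header:
            writer.writerow(HEADER)
        writer.writerow(row)


# Demonstrate in a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_log = Path(tmp_dir) / "event_log_demo.csv"
    append_event(demo_log, ["2025-01-01T10:00:00", "START", "Application started"])
    append_event(demo_log, ["2025-01-01T10:05:00", "INFO", "User logged in"])

    with demo_log.open(newline="", encoding="utf-8") as f:
        logged_rows = list(csv.reader(f))

print(logged_rows)
```

The header appears once, no matter how many times `append_event` is called.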

#### Exercise 5 – Handling Commas and Newlines Inside Fields

Sometimes data contains commas and newlines that must be preserved inside a single field. The `csv` writer handles this by wrapping such fields in quotes, which the default `csv.QUOTE_MINIMAL` setting already does.

You are given this data:

```python
feedback_rows = [
    ['id', 'user', 'comment'],
    [
        1,
        'alice',
        'Great product!\nI really like the new features, especially the dashboard.',
    ],
    [
        2,
        'bob',
        'Not bad, but could be faster. Also, export to CSV would be nice.',
    ],
]
```

**Goal:**

1. Write these rows to a file called `feedback.csv`.
2. Use `quoting=csv.QUOTE_MINIMAL` and `quotechar='"'`.
3. Use a custom `lineterminator='\n'` so that each CSV row ends with a single linefeed (no `\r\n`).
4. Inspect the result by reading and printing the raw text.

A possible solution is shown below.

In [6]:
feedback_rows = [
    ["id", "user", "comment"],
    [
        1,
        "alice",
        "Great product!\nI really like the new features, especially the dashboard.",
    ],
    [
        2,
        "bob",
        "Not bad, but could be faster. Also, export to CSV would be nice.",
    ],
]

feedback_path = Path("feedback.csv")

# --- Solution ---
with feedback_path.open("w", newline="", encoding="utf-8") as f:
    writer = csv.writer(
        f,
        delimiter=",",
        quotechar="\"",
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    writer.writerows(feedback_rows)

print(feedback_path.read_text(encoding="utf-8"))

id,user,comment
1,alice,"Great product!
I really like the new features, especially the dashboard."
2,bob,"Not bad, but could be faster. Also, export to CSV would be nice."

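Because the first comment spans two physical lines, counting lines in the file no longer gives the number of records. The sketch below, using a shortened version of the data and an in-memory buffer, shows that `csv.reader` still reassembles the quoted field into a single value. Note that every value comes back as a string, so the numeric `id` is read as `'1'`.

```python
import csv
import io

demo_rows = [
    ["id", "user", "comment"],
    [1, "alice", "Great product!\nI really like it."],
]

# newline="" keeps line endings untouched, mirroring how the file is opened.
buffer = io.StringIO(newline="")
csv.writer(buffer, lineterminator="\n").writerows(demo_rows)

# Three physical lines in the text...
physical_lines = buffer.getvalue().splitlines()
print(len(physical_lines))

# ...but only two logical CSV rows once parsed.
buffer.seek(0)
parsed_rows = list(csv.reader(buffer))
print(len(parsed_rows))
print(repr(parsed_rows[1][2]))
```

This is also why such files should be parsed with a CSV reader instead of being split on `\n` by hand.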
