# 📘 P1.2.3.3 – Python File Operations
## Topic: CSV Processing (Reading, Writing)

## 🎯 Learning Objectives
By the end of this notebook, you will:
- Understand what CSV files are and when to use them
- Read CSV files with `csv.reader()` and `csv.DictReader()`
- Write CSV files with `csv.writer()` and `csv.DictWriter()`
- Handle CSV headers properly
- Process real-world tabular data

In [1]:
# Setup: ensure data folder exists
import os
os.makedirs("data", exist_ok=True)

## 📊 What is CSV?
CSV = Comma-Separated Values

A simple text format for storing tabular data:
```
Name,Age,City
Alice,25,Bangalore
Bob,30,Mumbai
```

CSV is widely used for data exports, logs, spreadsheets, and database dumps.
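
Because CSV is plain text, the three lines above can be parsed straight from a string. The sketch below is illustrative only: it uses `io.StringIO` as a stand-in for a real file, and the variable names `text` and `rows` are assumptions, not part of the notebook.

```python
import csv
import io

# The same three lines as in the example above, held in a string.
text = "Name,Age,City\nAlice,25,Bangalore\nBob,30,Mumbai\n"

# csv.reader accepts any iterable of lines; io.StringIO makes the string file-like.
rows = list(csv.reader(io.StringIO(text)))

for row in rows:
    print(row)
```

Every value comes back as a string, including the ages.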

## 📖 Reading CSV Files: Basic Approach
Use `csv.reader()` to read rows as lists.

In [2]:
import csv

# Create sample CSV file first
with open("data/employees.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Name", "Age", "Department"])
    writer.writerow(["Alice", "25", "Engineering"])
    writer.writerow(["Bob", "30", "Sales"])
    writer.writerow(["Charlie", "28", "Engineering"])

# Now read it
with open("data/employees.csv", "r", newline="") as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

['Name', 'Age', 'Department']
['Alice', '25', 'Engineering']
['Bob', '30', 'Sales']
['Charlie', '28', 'Engineering']
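
Note that the header comes back as an ordinary row and every value is a string. One possible way to handle both, sketched here on the assumption that the first row is always the header, is to consume it with `next()` and convert the numeric column explicitly. The names `header` and `ages` are illustrative.

```python
import csv
import os

os.makedirs("data", exist_ok=True)

# Recreate the sample file so this cell runs on its own.
with open("data/employees.csv", "w", newline="") as f:
    csv.writer(f).writerows([
        ["Name", "Age", "Department"],
        ["Alice", "25", "Engineering"],
        ["Bob", "30", "Sales"],
        ["Charlie", "28", "Engineering"],
    ])

with open("data/employees.csv", "r", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)  # consume the header row
    # csv values are strings, so convert the Age column (index 1) to int
    ages = [int(row[1]) for row in reader]

print(header)
print(ages)
print(sum(ages) / len(ages))  # average age
```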


## 🔑 Reading CSV with Headers (DictReader)
Use `csv.DictReader()` to access columns by name.

In [3]:
import csv

with open("data/employees.csv", "r", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        print(f"{row['Name']} works in {row['Department']}")

Alice works in Engineering
Bob works in Sales
Charlie works in Engineering
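
Access by column name makes grouping straightforward. The following is a small sketch rather than a prescribed pattern: it reads the same employee rows from an in-memory string (an assumption made so the cell is self-contained) and groups names by department. `by_department` is a hypothetical variable name.

```python
import csv
import io
from collections import defaultdict

# In-memory copy of the employee data, standing in for data/employees.csv.
sample = io.StringIO(
    "Name,Age,Department\n"
    "Alice,25,Engineering\n"
    "Bob,30,Sales\n"
    "Charlie,28,Engineering\n"
)

reader = csv.DictReader(sample)
print(reader.fieldnames)  # column names taken from the first row

# Group employee names under their department.
by_department = defaultdict(list)
for row in reader:
    by_department[row["Department"]].append(row["Name"])

print(dict(by_department))
```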


## ✍️ Writing CSV Files: Basic Approach
Use `csv.writer()` to write rows.

In [4]:
import csv

data = [
    ["Product", "Price", "Stock"],
    ["Laptop", "50000", "10"],
    ["Mouse", "500", "50"],
    ["Keyboard", "1500", "30"]
]

with open("data/products.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(data)

print("✅ CSV written to data/products.csv")

✅ CSV written to data/products.csv
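
Two details are worth knowing when writing: `csv.writer()` converts non-string values such as integers with `str()`, and opening the file in append mode (`"a"`) adds rows without rewriting what is already there. A minimal sketch, assuming a hypothetical file name `data/products_numeric.csv`:

```python
import csv
import os

os.makedirs("data", exist_ok=True)
path = "data/products_numeric.csv"  # hypothetical file name for this sketch

with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Product", "Price", "Stock"])
    writer.writerow(["Laptop", 50000, 10])  # ints are written as text

# Append mode adds a row and leaves the existing header and rows untouched.
with open(path, "a", newline="") as f:
    csv.writer(f).writerow(["Mouse", 500, 50])

with open(path, "r", newline="") as f:
    rows = list(csv.reader(f))

print(rows)
```

Reading the file back shows that the numbers return as strings.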


## 📝 Writing CSV with DictWriter
Use `csv.DictWriter()` to write rows from dictionaries. The `fieldnames` list sets the column order.

In [5]:
import csv

users = [
    {"username": "alice", "email": "alice@example.com", "role": "admin"},
    {"username": "bob", "email": "bob@example.com", "role": "user"},
    {"username": "charlie", "email": "charlie@example.com", "role": "user"}
]

with open("data/users.csv", "w", newline="") as f:
    fieldnames = ["username", "email", "role"]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    
    writer.writeheader()
    writer.writerows(users)

print("✅ Users saved to data/users.csv")

✅ Users saved to data/users.csv
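
Real records do not always match `fieldnames` exactly. By default `csv.DictWriter()` raises `ValueError` on unexpected keys and writes an empty string for missing ones. The sketch below shows one way to relax that with the `extrasaction` and `restval` parameters. The extra `theme` key and the `"unknown"` fill value are made up for illustration.

```python
import csv
import io

records = [
    # "theme" is an extra key that is not in fieldnames (hypothetical)
    {"username": "alice", "email": "alice@example.com", "role": "admin", "theme": "dark"},
    # "role" is missing from this record
    {"username": "bob", "email": "bob@example.com"},
]

buffer = io.StringIO()  # in-memory stand-in for a file
writer = csv.DictWriter(
    buffer,
    fieldnames=["username", "email", "role"],
    restval="unknown",        # value written for missing keys
    extrasaction="ignore",    # silently drop keys not in fieldnames
)
writer.writeheader()
writer.writerows(records)

lines = buffer.getvalue().splitlines()
for line in lines:
    print(line)
```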


## 🧪 Practical Example: Process Sales Data
Read CSV, filter data, write results.

In [6]:
import csv

# Create sample sales data
sales_data = [
    ["Product", "Quantity", "Price"],
    ["Laptop", "5", "50000"],
    ["Mouse", "20", "500"],
    ["Keyboard", "15", "1500"],
    ["Monitor", "3", "20000"]
]

with open("data/sales.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(sales_data)

# Process: keep items whose total (Quantity * Price) exceeds 10000
high_value = []
with open("data/sales.csv", "r", newline="") as f:
    reader = csv.DictReader(f)
    for row in reader:
        total = int(row["Quantity"]) * int(row["Price"])
        if total > 10000:
            high_value.append({
                "Product": row["Product"],
                "Total": total
            })

# Write filtered results
with open("data/high_value_sales.csv", "w", newline="") as f:
    fieldnames = ["Product", "Total"]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(high_value)

print(f"Found {len(high_value)} high-value items")
for item in high_value:
    print(f"  {item['Product']}: ₹{item['Total']}")

Found 3 high-value items
  Laptop: ₹250000
  Keyboard: ₹22500
  Monitor: ₹60000
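
The `int()` conversions in the example raise `ValueError` if a cell is empty or non-numeric, which is common in real exports. One possible approach, sketched here with made-up bad rows, is to skip such rows and record their line numbers. `totals` and `skipped` are illustrative names.

```python
import csv
import io

# Sales data with two deliberately broken rows (invented for this sketch).
raw = io.StringIO(
    "Product,Quantity,Price\n"
    "Laptop,5,50000\n"
    "Mouse,,500\n"         # missing quantity
    "Keyboard,15,n/a\n"    # non-numeric price
    "Monitor,3,20000\n"
)

totals = {}
skipped = []
# The header is line 1, so data rows start at line 2.
for line_number, row in enumerate(csv.DictReader(raw), start=2):
    try:
        totals[row["Product"]] = int(row["Quantity"]) * int(row["Price"])
    except ValueError:
        skipped.append(line_number)  # remember which lines could not be parsed

print(totals)
print(f"Skipped lines: {skipped}")
```

Whether to skip, default, or abort on a bad row depends on the data. Skipping is only one option.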


## ⚠️ Common CSV Issues & Solutions

**Issue 1: Newline characters**
✅ Always pass `newline=""` when opening CSV files, for reading as well as writing. Without it, writing on Windows can produce blank lines between rows.

**Issue 2: Different delimiters**
Some CSVs use `;` or `\t` instead of `,`
```python
csv.reader(f, delimiter=';')
```
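
When the delimiter is not known in advance, the standard library's `csv.Sniffer` can try to guess it from a sample. It is a heuristic and can fail on short or irregular input, so treat this as a sketch. The sample text and the candidate list `";,\t"` are assumptions.

```python
import csv
import io

sample_text = "Name;Score\nAlice;95\nBob;87\n"

# Ask the sniffer to choose among a few likely delimiters.
dialect = csv.Sniffer().sniff(sample_text, delimiters=";,\t")
print(repr(dialect.delimiter))

# Pass the detected dialect to csv.reader to parse the data.
rows = list(csv.reader(io.StringIO(sample_text), dialect))
print(rows)
```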

**Issue 3: Quoting text with commas**
The `csv` module quotes fields that contain commas when writing and unquotes them when reading: `"Smith, John",25,"New York, NY"`
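
A quick round trip illustrates the quoting behaviour. This sketch writes the row to an in-memory buffer instead of a file, and `line` and `parsed` are illustrative names.

```python
import csv
import io

buffer = io.StringIO()
# Fields containing the delimiter are wrapped in double quotes automatically.
csv.writer(buffer).writerow(["Smith, John", 25, "New York, NY"])
line = buffer.getvalue().strip()
print(line)

# Reading the line back removes the quotes and keeps the commas inside each field.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed)
```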

In [None]:
import csv

# Example: CSV with different delimiter
data = [["Name", "Score"], ["Alice", "95"], ["Bob", "87"]]

with open("data/scores.txt", "w", newline="") as f:
    writer = csv.writer(f, delimiter="\t")  # Tab-separated
    writer.writerows(data)

# Read it back
with open("data/scores.txt", "r", newline="") as f:
    reader = csv.reader(f, delimiter="\t")
    for row in reader:
        print(row)

### ✅ Key Takeaways
- CSV is a simple, widely supported format for tabular data
- `csv.reader()` reads rows as lists
- `csv.DictReader()` reads rows as dictionaries (easier to use)
- `csv.writer()` and `csv.DictWriter()` create CSV files
- Always pass `newline=""` when opening CSV files, for both reading and writing
- **In AI/ML:** CSV is common for datasets, feature engineering outputs, and model predictions