### Week 3 – Day 3: File Handling (CSV & JSON)

Today we'll learn how to read and write data using Python's built-in file-handling modules.  
This is the foundation of real ETL work, where you'll interact with raw data files every day.

**Learning Goals 🎯**
- Open, read, and write text files (`.txt`)
- Work with CSV files using the `csv` module
- Parse and update JSON data using the `json` module
- Use `with open()` context managers to automate file closing


In [5]:
# Example 1: Writing and Reading a Text File
file_path = "../../datasets/sample_notes.txt"


# Write to a file
with open(file_path, "w") as f:
    f.write("Python File Handling Basics\n")
    f.write("Week 3 Day 3 Example\n")

# Read the file back
with open(file_path, "r") as f:
    content = f.read()

print("File Contents:\n", content)


File Contents:
 Python File Handling Basics
Week 3 Day 3 Example



# Explanation
- `open(file, mode)` opens a file (`"w"` for write, which overwrites any existing content; `"r"` for read; `"a"` for append).  
- `with open()` is a **context manager** that automatically closes the file after use.  
- `f.write()` adds text to a file, and `f.read()` retrieves its contents.  
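
To make the difference between the modes concrete, here is a minimal sketch that writes, appends, and then reads a file line by line. The file name `notes_demo.txt` and the temporary directory are assumptions made for illustration, so nothing in `datasets/` is touched.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not part of the course data)
tmp_dir = tempfile.mkdtemp()
notes_path = os.path.join(tmp_dir, "notes_demo.txt")

# "w" creates the file, or truncates it if it already exists
with open(notes_path, "w", encoding="utf-8") as f:
    f.write("first line\n")

# "a" keeps the existing content and adds to the end
with open(notes_path, "a", encoding="utf-8") as f:
    f.write("second line\n")

# Iterating over the file object yields one line at a time
with open(notes_path, "r", encoding="utf-8") as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)
# The context manager has already closed the file at this point
print(f.closed)
```

Reading line by line like this avoids loading the whole file into memory at once, which tends to matter for larger inputs.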


In [7]:
# Example 2: Reading a CSV File
import csv

csv_path = "../../datasets/sales.csv"

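# newline="" lets the csv module handle line endings itself (recommended in the csv docs)
with open(csv_path, "r", newline="") as f: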
    reader = csv.reader(f)
    for row in reader:
        print(row)


['order_id', 'region', 'country', 'product', 'quantity', 'price', 'order_date', 'customer_id']
['1', 'North America', 'USA', 'Laptop', '2', '899.99', '2024-01-10', '1']
['2', 'Europe', 'Germany', 'Phone', '5', '499.99', '2024-01-12', '2']
['3', 'Asia', 'India', 'Tablet', '3', '299.99', '2024-01-15', '3']
['4', 'North America', 'Canada', 'Monitor', '4', '199.99', '2024-01-18', '']
['5', 'Europe', 'France', 'Laptop', '1', '999.99', '2024-01-20', '4']
['6', 'Asia', 'Japan', 'Phone', '6', '699.99', '2024-01-22', '5']
['7', 'Europe', 'Germany', 'Monitor', '2', '199.99', '2024-01-23', '2']
['8', 'North America', 'USA', 'Tablet', '4', '299.99', '2024-01-25', '1']


# Explanation

In this example, we use Python's built-in **`csv`** module to read structured tabular data from a `.csv` file.

A CSV (Comma-Separated Values) file stores data in plain text form, where each line represents a record and each value is separated by commas.

We'll:
1. Open the CSV file located in the `datasets/` folder  
2. Use the `csv.reader()` function to iterate over each row  
3. Print each row to understand how the data looks before doing any processing

 *In real data engineering projects, CSV files are one of the most common formats exchanged between systems (e.g., exports from databases, logs, or ETL outputs).*
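
Note that `csv.reader()` returns every value as a string, including numbers and the empty `customer_id` in order 4. As a possible next step, the sketch below uses `csv.DictReader` to access columns by name and convert types. It reads from an in-memory `io.StringIO` holding three rows copied from the output above, so it runs without the dataset file; the `orders` structure and the choice to map an empty `customer_id` to `None` are assumptions for illustration.

```python
import csv
import io

# In-memory stand-in for sales.csv, using three rows copied from the output above
sample = io.StringIO(
    "order_id,region,country,product,quantity,price,order_date,customer_id\n"
    "1,North America,USA,Laptop,2,899.99,2024-01-10,1\n"
    "2,Europe,Germany,Phone,5,499.99,2024-01-12,2\n"
    "4,North America,Canada,Monitor,4,199.99,2024-01-18,\n"
)

orders = []
# DictReader uses the header row as keys, so each row is a dict
for row in csv.DictReader(sample):
    orders.append({
        "order_id": int(row["order_id"]),
        "product": row["product"],
        "quantity": int(row["quantity"]),
        "price": float(row["price"]),
        # An empty customer_id (order 4) becomes None instead of ""
        "customer_id": int(row["customer_id"]) if row["customer_id"] else None,
    })

total_units = sum(order["quantity"] for order in orders)
print(total_units)
```

With a real file, `sample` would be replaced by the file object from `open(csv_path, newline="")`.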


In [9]:
# Example 3: Reading and Updating a JSON File
import json

json_path = "../../datasets/profile.json"

# Read JSON
with open(json_path, "r") as f:
    data = json.load(f)

print("Original Data:\n", data)

# Modify JSON
data["skills"].append("Pandas")
data["certifications"] = ["AWS Data Engineer", "CPHQ"]

# Write new JSON
updated_json_path = "../../datasets/profile_updated.json"
with open(updated_json_path, "w") as f:
    json.dump(data, f, indent=4)

print("\nUpdated JSON saved to:", updated_json_path)
print("Updated Data:\n", data)

Original Data:
 {'name': 'Lakshmi Sharath', 'role': 'Data Engineer in Training', 'skills': ['Python', 'SQL', 'ETL']}

Updated JSON saved to: ../../datasets/profile_updated.json
Updated Data:
 {'name': 'Lakshmi Sharath', 'role': 'Data Engineer in Training', 'skills': ['Python', 'SQL', 'ETL', 'Pandas'], 'certifications': ['AWS Data Engineer', 'CPHQ']}


# Explanation
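JSON files store structured data as key-value pairs, with values that can be strings, numbers, lists, or nested objects.  
Here we read the file with `json.load()`, update the resulting dictionary, and save it to a new file with `json.dump()`.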
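
The `json` module also has string-based counterparts, `json.dumps()` and `json.loads()`, which can be handy for quick checks without touching a file. The following is a small sketch using the same profile shape shown in the "Original Data" output; the use of `.get()` with a default is a suggested habit, not something the original example does.

```python
import json

# Same shape as profile.json, taken from the "Original Data" output above
profile = {
    "name": "Lakshmi Sharath",
    "role": "Data Engineer in Training",
    "skills": ["Python", "SQL", "ETL"],
}

# dumps() serializes to a string; dump() writes to a file object instead
text = json.dumps(profile, indent=4)

# loads() parses a JSON string back into Python objects
restored = json.loads(text)

# .get() with a default avoids a KeyError when a key is missing
certs = restored.get("certifications", [])

print(restored == profile)
print(certs)
```

A round trip like this is a quick way to confirm that a structure survives serialization unchanged.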


# Explanation: Dataset File Handling
We're using the **datasets** folder as a single source for all raw input files.  
Each path starts with `../../datasets/`, which is resolved relative to the notebook's working directory: up two levels (from week3 → root) and then into the datasets directory.

This structure:
- Keeps data organized outside the notebooks  
- Reduces the risk of accidentally overwriting raw inputs, since results are written to new files (e.g., `profile_updated.json`)  
- Matches real-world data engineering pipelines where data lives in a central storage layer  
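
The same relative layout can be expressed with `pathlib`, which builds paths from parts instead of string concatenation. This is a sketch under assumptions: the folder names `project_root` and `notebooks` are hypothetical placeholders, and only the "two levels up from week3" relationship comes from the description above.

```python
from pathlib import Path

# Assumption: the notebook lives two levels below the project root,
# e.g. project_root/notebooks/week3 (hypothetical folder names)
notebook_dir = Path("project_root") / "notebooks" / "week3"

# parents[1] is two levels up: week3 -> notebooks -> project_root
datasets_dir = notebook_dir.parents[1] / "datasets"
sales_path = datasets_dir / "sales.csv"

print(sales_path.as_posix())
```

In a running notebook, `Path.cwd()` would typically take the place of the hypothetical `notebook_dir`, and `sales_path` can be passed directly to `open()`.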
