
### Week 2: File Handling, JSON/CSV, and API Interaction

#### Topics Covered:

1. **File Handling:**
   - **Reading from and Writing to Files:**
     - **Definition:** How to open, read, write, and close files in Python. Opening a file in a `with` block closes it automatically when the block ends, even if an error occurs.
     - **Example:**
       ```python
       # Writing to a file
       with open('example.txt', 'w') as file:
           file.write('Hello, world!')

       # Reading from a file
       with open('example.txt', 'r') as file:
           content = file.read()
           print(content)  # Output: Hello, world!
       ```
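
The example above uses the `'w'` and `'r'` modes. As a minimal sketch of two other common patterns, the block below appends to a file with `'a'` and reads it back line by line. The file name `notes.txt` and the temporary directory are illustrative choices, not part of the course material.

```python
import os
import tempfile

# Work in a temporary directory so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp_dir:
    notes_path = os.path.join(tmp_dir, 'notes.txt')  # hypothetical file name

    # 'w' creates the file, or truncates it if it already exists
    with open(notes_path, 'w', encoding='utf-8') as file:
        file.write('first line\n')

    # 'a' appends to the end of the file instead of overwriting it
    with open(notes_path, 'a', encoding='utf-8') as file:
        file.write('second line\n')

    # Iterating over the file object yields one line at a time
    with open(notes_path, 'r', encoding='utf-8') as file:
        lines = [line.rstrip('\n') for line in file]

print(lines)  # ['first line', 'second line']
```

Reading line by line avoids loading the whole file into memory, which matters once files grow beyond a few megabytes.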

2. **Working with JSON Data:**
   - **Parsing and Generating JSON:**
     - **Definition:** JSON (JavaScript Object Notation) is a lightweight data interchange format. Parsing JSON converts a JSON string into Python objects (a JSON object becomes a `dict`, a JSON array becomes a `list`), and generating JSON serializes Python objects back into a JSON string.
     - **Example:**
       ```python
       import json

       # Parsing JSON
       json_data = '{"name": "John", "age": 30}'
       data = json.loads(json_data)
       print(data['name'])  # Output: John

       # Generating JSON
       data = {'name': 'John', 'age': 30}
       json_data = json.dumps(data)
       print(json_data)  # Output: {"name": "John", "age": 30}
       ```
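
API responses are often nested and are sometimes malformed. The sketch below, using made-up sample data, shows how a JSON array of objects maps to a list of dicts, how `indent` and `sort_keys` produce readable output, and how a parse failure can be caught.

```python
import json

# A JSON array of objects parses to a list of dicts (sample data is made up)
raw = '[{"name": "John", "tags": ["admin", "dev"]}, {"name": "Jane", "tags": []}]'
users = json.loads(raw)
first_tag = users[0]['tags'][0]
print(first_tag)  # admin

# indent and sort_keys make the generated JSON easier to read and compare
pretty = json.dumps(users[1], indent=2, sort_keys=True)
print(pretty)

# Malformed input raises json.JSONDecodeError (a subclass of ValueError)
try:
    json.loads('{"name": "John",}')  # a trailing comma is not valid JSON
    parse_failed = False
except json.JSONDecodeError as err:
    parse_failed = True
    print(f'Invalid JSON: {err.msg}')
```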

3. **CSV File Handling:**
   - **Reading from and Writing to CSV Files:**
     - **Definition:** CSV (Comma-Separated Values) files store tabular data in plain text. Reading a CSV file converts its rows into Python structures such as lists or dictionaries (every value is read as a string), and writing a CSV file converts Python structures into CSV format.
     - **Example:**
       ```python
       import csv

       # Writing to a CSV file
       with open('example.csv', 'w', newline='') as csvfile:
           fieldnames = ['name', 'age']
           writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
           writer.writeheader()
           writer.writerow({'name': 'John', 'age': 30})
           writer.writerow({'name': 'Jane', 'age': 25})

       # Reading from a CSV file
       with open('example.csv', 'r', newline='') as csvfile:
           reader = csv.DictReader(csvfile)
           for row in reader:
               print(row['name'], row['age'])
       ```
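
One detail that often surprises newcomers is that `csv.DictReader` returns every value as a string, including numbers. The sketch below illustrates this with an in-memory stand-in for `example.csv` (via `io.StringIO`, an illustrative shortcut) and converts the `age` column before doing arithmetic.

```python
import csv
import io

# In-memory stand-in for a CSV file with the same columns as example.csv
csv_text = 'name,age\nJohn,30\nJane,25\n'

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)
print(repr(rows[0]['age']))  # '30'  (a string, not an int)

# Convert the age column to int before doing arithmetic
ages = [int(row['age']) for row in rows]
average_age = sum(ages) / len(ages)
print(average_age)  # 27.5
```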

4. **Making HTTP Requests:**
   - **Using the `requests` Library:**
     - **Definition:** The `requests` library allows you to send HTTP requests in Python. Common methods include GET (to retrieve data) and POST (to send data).
     - **Example:**
       ```python
       import requests

       # Sending a GET request
       response = requests.get('http://127.0.0.1:5000/users', timeout=10)
       response.raise_for_status()  # Raise an exception on a 4xx/5xx response
       print(response.json())

       # Sending a POST request
       data = {'name': 'Jane Doe', 'email': 'jane.doe@example.com', 'phone': '123-456-7890'}
       response = requests.post('http://127.0.0.1:5000/users', json=data, timeout=10)
       response.raise_for_status()
       print(response.json())
       ```

5. **Interacting with APIs:**
   - **Sending GET and POST Requests, Handling Responses:**
     - **Definition:** APIs (Application Programming Interfaces) allow different software systems to communicate. Interacting with APIs involves sending requests to API endpoints and processing the responses.
     - **Example:**
       ```python
       import requests

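       # Sending a GET request to the user API endpoint
       response = requests.get('http://127.0.0.1:5000/users', timeout=10)
       if response.ok:  # True for status codes below 400
           users_data = response.json()
           print(users_data)
       else:
           print(f'Request failed with status {response.status_code}')

       # Sending a POST request to the user API endpoint
       new_user = {'name': 'Alice', 'email': 'alice@example.com', 'phone': '123-456-7890'}
       response = requests.post('http://127.0.0.1:5000/users', json=new_user, timeout=10)
       response.raise_for_status()  # Raise an exception on a 4xx/5xx response
       print(response.json())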
       ```
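
The request examples above need the Week 1 API to be listening on `http://127.0.0.1:5000`. If it is not running, one option is a small stand-in server built from the standard library, sketched below. Everything about it is an assumption made for illustration: the record shape (`name`, `email`, `phone`), the sample user, the `201` status on POST, and the names `UsersHandler` and `start_stub_server`. The real Week 1 API may behave differently.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# In-memory "database"; the record shape is assumed from the examples above
USERS = [{'name': 'John', 'email': 'john@example.com', 'phone': '555-0100'}]


class UsersHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the Week 1 /users endpoint."""

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode('utf-8')
        self.send_response(status)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == '/users':
            self._send_json(200, USERS)
        else:
            self._send_json(404, {'error': 'not found'})

    def do_POST(self):
        if self.path != '/users':
            self._send_json(404, {'error': 'not found'})
            return
        length = int(self.headers.get('Content-Length', 0))
        try:
            user = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self._send_json(400, {'error': 'invalid JSON'})
            return
        USERS.append(user)
        self._send_json(201, user)

    def log_message(self, format, *args):
        pass  # Silence per-request logging


def start_stub_server(port=5000):
    """Start the stub API in a background thread and return the server."""
    server = ThreadingHTTPServer(('127.0.0.1', port), UsersHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `server = start_stub_server()` before running the request examples lets them complete against this stub, and `server.shutdown()` stops it afterwards. The data lives only in memory, so it resets each time the server is restarted.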

#### Mini Project:

**Description:**
- Create a Python script to fetch user data from the existing API endpoints created in Week 1, save the data to a JSON file, and then read and process the JSON data to generate a summary report.

**Steps:**
1. **Fetch User Data:**
   - Use the `requests` library to send a GET request to the user API endpoint.

2. **Save Data to a JSON File:**
   - Parse the JSON response and save the data to a local JSON file.

3. **Read and Process JSON Data:**
   - Read the JSON data from the file.
   - Extract relevant information (e.g., user names, emails).
   - Generate a summary report with the extracted information.

**Example Code:**
```python
import requests
import json
import csv

# Fetch user data from the API endpoint
response = requests.get('http://127.0.0.1:5000/users', timeout=10)
response.raise_for_status()  # Stop early on a 4xx/5xx response
users_data = response.json()

# Save the data to a JSON file
with open('users_data.json', 'w') as json_file:
    json.dump(users_data, json_file)

# Read the JSON data back from the file
with open('users_data.json', 'r') as json_file:
    data = json.load(json_file)

# Extract the relevant fields from each user record
user_list = [{'name': user['name'], 'email': user['email']} for user in data]

# Save to CSV for demonstration purposes
with open('users_data.csv', 'w', newline='') as csvfile:
    fieldnames = ['name', 'email']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(user_list)

# Generate summary report
summary_report = {
    'total_users': len(data),
    'users': user_list
}

print(summary_report)
```
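
The summary step can go further than a count and a list. As one possible extension, the sketch below groups users by email domain. The sample records and the helper name `build_summary` are hypothetical; in the project, `data` would come from `users_data.json`.

```python
from collections import Counter

# Hypothetical sample records in the shape the script above expects
data = [
    {'name': 'John', 'email': 'john@example.com'},
    {'name': 'Jane', 'email': 'jane@example.org'},
    {'name': 'Alice', 'email': 'alice@example.com'},
]


def build_summary(users):
    """Return the user count, sorted names, and a count of users per email domain."""
    domains = Counter(user['email'].split('@')[-1].lower() for user in users)
    return {
        'total_users': len(users),
        'names': sorted(user['name'] for user in users),
        'users_per_domain': dict(domains),
    }


summary = build_summary(data)
print(summary['total_users'])       # 3
print(summary['users_per_domain'])  # {'example.com': 2, 'example.org': 1}
```

Keeping the report logic in a function makes it easy to test with small hand-written inputs before pointing it at real API data.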

**Outcome:**
- Students will learn how to handle various file formats (JSON, CSV) and interact with external APIs. This prepares them for real-world data extraction and processing tasks in data engineering.