# Session 16 🐍

☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️☀️

***

# 97. CSV (Comma-Separated Values) Files
CSV is a simple file format used to store tabular data, such as spreadsheets or databases. It's one of the most common data interchange formats because it's human-readable, easy to work with, and supported by virtually all data processing tools.

A CSV file stores tabular data (numbers and text) in plain text form, where:
- Each line represents a record (row)
- Each record consists of fields (columns) separated by commas
- The first line often contains column headers

***

### Example CSV:
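
A small sample, assuming the same name, age, and city columns used in the writing examples later in this session. The text is parsed from an in-memory string (an illustrative choice, so no file is needed) to show how the header line and the records come apart:

```python
import csv
import io

# Sample CSV text: a header line followed by two records.
sample_csv = "name,age,city\nAlice,28,New York\nBob,32,Chicago\n"

# io.StringIO lets csv.reader treat the string like an open file.
rows = list(csv.reader(io.StringIO(sample_csv)))

header, records = rows[0], rows[1:]
print(header)   # ['name', 'age', 'city']
for record in records:
    print(record)
```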

***

# 98. CSV Variations
While "CSV" stands for "Comma-Separated Values", there are variations:
- TSV (Tab-Separated Values): Uses tabs instead of commas
- SSV (Space-Separated Values): Uses spaces
- Custom delimiters: Some files use pipes (|) or semicolons (;)
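
These variations can usually be read with the same csv module by passing a `delimiter` argument. A minimal sketch, using hypothetical in-memory samples rather than real files:

```python
import csv
import io

# Hypothetical samples; real data would normally come from files.
tsv_text = "name\tage\nAlice\t28\n"
pipe_text = "name|age\nBob|32\n"

# The delimiter argument tells the reader which character separates fields.
tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
pipe_rows = list(csv.reader(io.StringIO(pipe_text), delimiter="|"))

print(tsv_rows)   # [['name', 'age'], ['Alice', '28']]
print(pipe_rows)  # [['name', 'age'], ['Bob', '32']]
```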

***

# 99. Working with CSV in Python
Python's standard library includes a csv module for reading and writing CSV files.

***

## 99-1. Reading CSV Files

In [None]:
import csv

# Basic reading
with open('data.csv', 'r', newline='') as file:  # newline='' lets the csv module handle line endings
    reader = csv.reader(file)
    for row in reader:
        print(row)  # Each row is a list of strings

In [None]:
# Reading as dictionaries (using headers)
with open('data.csv', 'r', newline='') as file:  # newline='' lets the csv module handle line endings
    reader = csv.DictReader(file)
    for row in reader:
        print(row['name'], row['age'])  # Access by column name
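
Both readers return every field as a string, so numeric columns need an explicit conversion before doing arithmetic. A short sketch, assuming a hypothetical two-column sample held in memory:

```python
import csv
import io

sample_csv = "name,age\nAlice,28\nBob,32\n"  # hypothetical sample data

people = []
for row in csv.DictReader(io.StringIO(sample_csv)):
    row["age"] = int(row["age"])  # convert the text "28" to the integer 28
    people.append(row)

print(people[0]["age"] + 1)  # 29
```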

***

## 99-2. Writing CSV Files

In [None]:
import csv

# Basic writing
data = [
    ['name', 'age', 'city'],
    ['Alice', '28', 'New York'],
    ['Bob', '32', 'Chicago']
]

with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerows(data)

In [None]:
# Writing from dictionaries
data = [
    {'name': 'Alice', 'age': 28, 'city': 'New York'},
    {'name': 'Bob', 'age': 32, 'city': 'Chicago'}
]

with open('output.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=['name', 'age', 'city'])
    writer.writeheader()
    writer.writerows(data)

***

# 100. JSON (JavaScript Object Notation)
JSON is a lightweight data interchange format that's easy for humans to read and write, and easy for machines to parse and generate. It's become the standard format for data exchange in web applications and APIs.

JSON is a text format that's completely language-independent but uses conventions familiar to programmers of C-style languages (including JavaScript, Python, Java, and many others).

**Key Characteristics:**
- Human-readable - Easy to understand at a glance
- Lightweight - Minimal syntax overhead
- Structured - Organizes data in a clear hierarchy
- Language-independent - Works with virtually all programming languages

***

# 101. JSON Syntax
JSON is built on two universal data structures:

***

## 101-1. Objects: 
Unordered collections of key-value pairs (like Python dictionaries)

In [None]:
{
  "name": "John Doe",
  "age": 30,
  "isStudent": false
}

***

## 101-2. Arrays: 
Ordered lists of values (like Python lists)

In [None]:
["apple", "banana", "orange"] 

***

# 102. Data Types in JSON:
- Strings: "text" (must use double quotes)
- Numbers: 42 or 3.14
- Booleans: true or false
- Null: null
- Objects: {...}
- Arrays: [...]
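
To see how these types arrive in Python, the sketch below parses one value of each kind with `json.loads` and prints the resulting Python type. The key names are arbitrary and chosen only for illustration:

```python
import json

# One value of each JSON type; the keys are arbitrary labels.
text = '{"s": "text", "i": 42, "f": 3.14, "b": true, "n": null, "o": {}, "a": []}'
parsed = json.loads(text)

for key, value in parsed.items():
    print(key, type(value).__name__)
# s str, i int, f float, b bool, n NoneType, o dict, a list
```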

***

# 103. JSON vs Python Dictionaries
While similar to Python dictionaries, JSON has some differences:
- JSON keys must be strings (in double quotes)
- JSON only supports the data types listed above
- JSON doesn't support Python-specific features like tuples, sets, or custom objects
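
A quick way to observe these differences is to pass a few Python values through the json module. This is only a sketch with made-up values:

```python
import json

# Tuples are written as JSON arrays, so they come back as lists.
round_trip = json.loads(json.dumps({"point": (1, 2)}))
print(round_trip)  # {'point': [1, 2]}

# Non-string keys such as integers are converted to strings.
int_key_json = json.dumps({1: "one"})
print(int_key_json)  # {"1": "one"}

# Sets have no JSON equivalent, so json.dumps raises TypeError.
try:
    json.dumps({"tags": {"a", "b"}})
    set_supported = True
except TypeError:
    set_supported = False
print(set_supported)  # False
```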

***

# 104. Working with JSON in Python
Python includes a **json module** in its standard library.

***

## 104-1. Serialization (Python → JSON)

In [1]:
import json

data = {
    "name": "Alice",
    "age": 25,
    "courses": ["Math", "Science"]
}

json_string = json.dumps(data)  # Convert to JSON string
print(json_string)

{"name": "Alice", "age": 25, "courses": ["Math", "Science"]}


***

## 104-2. Deserialization (JSON → Python)

In [2]:
json_data = '{"name": "Bob", "age": 30, "isActive": true}'
python_dict = json.loads(json_data)  # Convert to Python dict
print(python_dict["name"])  

Bob


***

## 104-3. Reading/Writing JSON Files

In [None]:
# Write to file
with open('data.json', 'w') as f:
    json.dump(data, f)

# Read from file
with open('data.json', 'r') as f:
    loaded_data = json.load(f)
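
A self-contained round trip can confirm that what is written is what is read back. The sketch below assumes a temporary directory (so no file is left behind) and uses `indent=2` only to make the saved file easier to read:

```python
import json
import os
import tempfile

data = {"name": "Alice", "age": 25, "courses": ["Math", "Science"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.json")

    # Write the dictionary as pretty-printed JSON.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)

    # Read it back into a Python object.
    with open(path, "r", encoding="utf-8") as f:
        loaded_data = json.load(f)

print(loaded_data == data)  # True
```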

***

# Some Exercises

**1.** Write a function **csv_stats(file_path)** that:
- Reads a CSV file containing numbers in a single column.
- Returns a dictionary with { "min": _, "max": _, "avg": _ }.
- Uses the csv module and handles file exceptions.

***

**2.** Write a function **filter_csv(input_path, output_path, condition_func)** that:
- Reads a CSV file with headers.
- Writes rows to a new CSV file where condition_func(row) returns True.
- Preserves headers in the output.

***

**3.** Write a function **csv_to_json(csv_path, json_path)** that:
- Converts a CSV file to a JSON file.
- Each row becomes a dictionary with headers as keys.
- Output JSON should be a list of objects.

***

**4.** Write a function **json_to_csv(json_path, csv_path)** that:
- Converts a JSON array of objects to a CSV file.
- Uses the first object’s keys as headers.
- Handles missing values (replace with "").

***

**5.** Write a function **merge_csvs(file1, file2, output_path)** that:
- Combines two CSV files with identical headers.
- Skips duplicate rows (all fields must match).
- Uses csv.DictReader and csv.DictWriter.

**Note:** Assume both files have the same structure.

***

**6.** Write a function **validate_json(json_path, schema)** that:
- Checks if a JSON file matches a schema (given as a Python dict).
- Returns True if valid, False otherwise.
- Validates:
    - Required fields.
    - Field types (e.g., "age" must be int).

***

**7.** Write a function **flatten_json(json_path, csv_path)** that:
- Flattens a nested JSON file (e.g., {"user": {"name": "Alice"}} → {"user.name": "Alice"}).
- Writes the flattened data to CSV.
- Handles arbitrary nesting (use dot notation for keys).

***

**8.** Write a function **parse_tsv(tsv_path)** that:
- Reads a TSV file (tab-separated).
- Returns data as a list of dictionaries.
- Uses csv.reader with a custom delimiter.

***

# 🌞 https://github.com/AI-Planet 🌞