# **10. Input/Output Operations**

# 🌐 3. JSON Files in Pandas (Reading and Writing)

In [29]:
import pandas as pd 

## 1️⃣ What It Does and When to Use It

### ✅ What it does:

* **JSON (JavaScript Object Notation)** is a lightweight text format for storing and exchanging data, built from objects (key-value pairs) and arrays.
* In pandas:

  * `pd.read_json()` → **Reads** JSON data into a DataFrame.
  * `df.to_json()` → **Writes** a DataFrame to a JSON file, or returns the JSON as a string when no path is given.
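
As a minimal sketch of the two calls working together, the round trip below serializes a small DataFrame to a JSON string and reads it back. The sample data and the names `df_small`, `json_text`, and `df_back` are made up for illustration; they are not the `employees.json` file used later in this section.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data, used only for this illustration.
df_small = pd.DataFrame({"EmployeeID": [101, 102], "Name": ["Alice", "Bob"]})

# With no path argument, to_json() returns the JSON text as a string.
json_text = df_small.to_json(orient="records")
print(json_text)

# Wrap the string in StringIO so read_json() treats it as a file-like object.
df_back = pd.read_json(StringIO(json_text), orient="records")
print(df_back)
```

The same calls accept a file path in place of the string buffer.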

### 📌 When to use:

* When working with **web APIs**, **NoSQL databases**, or **config files**.
* When exchanging data with **JavaScript applications** or other systems using REST APIs.
* Useful for **nested** or semi-structured data formats.

## 2️⃣ Syntax and Key Parameters

### 🔹 Reading JSON — `pd.read_json()`

```python
pd.read_json(path_or_buf, orient=None, lines=False)
```

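| Parameter     | Description                                                                                                          |
| ------------- | -------------------------------------------------------------------------------------------------------------------- |
| `path_or_buf` | File path, URL, or file-like object. Literal JSON strings are deprecated since pandas 2.1; wrap them in `io.StringIO` |
| `orient`      | Expected layout of the JSON: `'split'`, `'records'`, `'index'`, `'columns'`, `'values'`, `'table'`                   |
| `lines`       | Read the input as line-delimited JSON (one JSON object per line)                                                     |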
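
To make `orient` concrete, here is a small sketch that reads the same two rows from two different JSON layouts. The JSON strings and variable names are invented for the example, and the strings are wrapped in `StringIO` so they are read as file-like objects.

```python
from io import StringIO

import pandas as pd

# Hypothetical data in 'records' layout: a list of row objects.
records_json = '[{"Name": "Alice", "Salary": 50000}, {"Name": "Bob", "Salary": 75000}]'

# The same data in 'split' layout: columns, index, and data stored separately.
split_json = (
    '{"columns": ["Name", "Salary"], "index": [0, 1], '
    '"data": [["Alice", 50000], ["Bob", 75000]]}'
)

df_records = pd.read_json(StringIO(records_json), orient="records")
df_split = pd.read_json(StringIO(split_json), orient="split")

# Both layouts should produce the same rows and columns.
print(df_records)
print(df_split)
```

The value of `orient` has to match the layout that was used when the JSON was written.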

---

### 🔹 Writing JSON — `df.to_json()`

```python
df.to_json(path_or_buf=None, orient='columns', lines=False, indent=None)
```

| Parameter     | Description                                                                                   |
| ------------- | --------------------------------------------------------------------------------------------- |
| `path_or_buf` | Path or buffer to write to; if omitted, the JSON is returned as a string                      |
| `orient`      | Output layout: `'split'`, `'records'`, `'index'`, `'columns'`, `'values'`, `'table'`          |
| `lines`       | Write as line-delimited JSON (requires `orient='records'`)                                    |
| `indent`      | Number of spaces used to indent each level of the output                                      |
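
The sketch below prints what several `orient` values produce for the same small DataFrame, which may make the choice easier. The frame `df_demo` and the `outputs` dictionary are illustrative names, not part of the notebook's data.

```python
import json

import pandas as pd

# Hypothetical two-row frame, used only to compare layouts.
df_demo = pd.DataFrame({"Name": ["Alice", "Bob"], "Salary": [50000, 75000]})

outputs = {}
for orient in ["columns", "records", "index", "split", "values"]:
    outputs[orient] = df_demo.to_json(orient=orient)
    # Report whether the top-level JSON value is an object (dict) or an array (list).
    top_level = type(json.loads(outputs[orient])).__name__
    print(f"{orient:8} {top_level:5} {outputs[orient]}")

# lines=True is only accepted together with orient='records'.
ndjson = df_demo.to_json(orient="records", lines=True)
print(ndjson)
```

`'records'` and `'values'` yield a top-level array, while `'columns'`, `'index'`, and `'split'` yield a top-level object.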


## 3️⃣ Examples of Reading/Writing

### 📥 Reading from JSON

In [30]:
# Read standard JSON file
df = pd.read_json('data files/json/employees.json')

df

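|     | EmployeeID | Name    | Department  | Salary | JoiningDate |
| --- | ---------- | ------- | ----------- | ------ | ----------- |
| 0   | 101        | Alice   | HR          | 50000  | 2020-01-15  |
| 1   | 102        | Bob     | Engineering | 75000  | 2019-07-23  |
| 2   | 103        | Charlie | Sales       | 62000  | 2021-03-12  |
| 3   | 104        | David   | Marketing   | 58000  | 2018-11-30  |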


In [31]:
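# lines=True expects line-delimited JSON (one object per line).
# employees.json is a single JSON array, so the whole array lands in one row.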
df = pd.read_json('data files/json/employees.json', lines=True)

df

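|     | 0                                                 | 1                                                 | 2                                                 | 3                                                 |
| --- | ------------------------------------------------- | ------------------------------------------------- | ------------------------------------------------- | ------------------------------------------------- |
| 0   | {'EmployeeID': 101, 'Name': 'Alice', 'Departme... | {'EmployeeID': 102, 'Name': 'Bob', 'Department... | {'EmployeeID': 103, 'Name': 'Charlie', 'Depart... | {'EmployeeID': 104, 'Name': 'David', 'Departme... |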


In [32]:
df[0]

0    {'EmployeeID': 101, 'Name': 'Alice', 'Departme...
Name: 0, dtype: object
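
The one-row result above comes from applying `lines=True` to a file that holds a single JSON array. For contrast, here is a minimal sketch with input that really is line-delimited, using an invented two-line string (`ndjson_text`) instead of a file.

```python
from io import StringIO

import pandas as pd

# Hypothetical line-delimited JSON: one complete object per line.
ndjson_text = (
    '{"EmployeeID": 101, "Name": "Alice"}\n'
    '{"EmployeeID": 102, "Name": "Bob"}\n'
)

# With lines=True, each line becomes one row.
df_lines = pd.read_json(StringIO(ndjson_text), lines=True)
print(df_lines)

# Without lines=True, the parser stops after the first object and raises an error.
try:
    pd.read_json(StringIO(ndjson_text))
except ValueError as exc:
    print("Without lines=True:", exc)
```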

In [33]:
# With a specific orientation (e.g., 'records')
df = pd.read_json('data files/json/employees.json', orient='records')

df

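|     | EmployeeID | Name    | Department  | Salary | JoiningDate |
| --- | ---------- | ------- | ----------- | ------ | ----------- |
| 0   | 101        | Alice   | HR          | 50000  | 2020-01-15  |
| 1   | 102        | Bob     | Engineering | 75000  | 2019-07-23  |
| 2   | 103        | Charlie | Sales       | 62000  | 2021-03-12  |
| 3   | 104        | David   | Marketing   | 58000  | 2018-11-30  |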


### 📤 Writing to JSON

In [34]:
# Save as standard JSON (default orientation is 'columns')
df.to_json('data files/json/employees write.json')

In [35]:
# Save as list of records (better for API exchange)
df.to_json('data files/json/employees records.json', orient='records')

In [None]:
# Save as line-delimited JSON (useful for streaming large data)
df.to_json(
    'data files/json/employees records lines.json', orient='records', lines=True
)

In [None]:
# Save pretty-formatted JSON
df.to_json(
    'data files/json/employees records pretty.json', orient='records', indent=4
)

## 4️⃣ Common Pitfalls

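| Pitfall                  | Description & Solution                                                                                                                              |
| ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Wrong orientation**    | `pd.read_json()` interprets the file according to `orient`; a mismatch gives a wrong shape or a parse error. Use `orient='records'` for row-wise lists. |
| **Missing `lines=True`** | Reading newline-delimited JSON without `lines=True` raises `ValueError: Trailing data`. Using it on a regular JSON array packs the array into one row. |
| **Nested JSON**          | `read_json()` leaves nested objects as dicts inside cells. Flatten them with `pd.json_normalize()` or a custom parser.                              |
| **Loss of formatting**   | `to_json()` writes datetimes as epoch milliseconds by default, and dtypes are re-inferred on read. Use `date_format='iso'`, `convert_dates`, or `dtype`. |
| **Encoding issues**      | `read_json()` accepts `encoding=` (UTF-8 is the default). `to_json()` escapes non-ASCII characters unless `force_ascii=False`.                      |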
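
The date pitfall is easy to miss, so here is a hedged sketch of one way to handle it: write ISO strings with `date_format='iso'` and name the date column explicitly when reading. The frame `df_dates` is a made-up single-row example.

```python
from io import StringIO

import pandas as pd

# Hypothetical frame with one real datetime column.
df_dates = pd.DataFrame(
    {"Name": ["Alice"], "JoiningDate": pd.to_datetime(["2020-01-15"])}
)

# Default output stores the date as a number (epoch milliseconds).
epoch_json = df_dates.to_json(orient="records")
print(epoch_json)

# date_format='iso' stores a readable ISO 8601 string instead.
iso_json = df_dates.to_json(orient="records", date_format="iso")
print(iso_json)

# Name the column in convert_dates so it is parsed back into datetimes.
df_back = pd.read_json(
    StringIO(iso_json), orient="records", convert_dates=["JoiningDate"]
)
print(df_back.dtypes)
```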


## 5️⃣ Real-World Usage

### 🌐 Web APIs

* APIs often return responses in JSON format. Flat responses load directly with `pd.read_json()`; nested ones usually need `pd.json_normalize()` first.
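
API payloads are often nested, so a sketch of the flattening step may help. The `payload` dictionary below is invented and stands in for an already decoded response (for example, the result of a `response.json()` call); no real API is contacted.

```python
import pandas as pd

# Hypothetical decoded API response with a nested "dept" object per employee.
payload = {
    "status": "ok",
    "employees": [
        {"id": 101, "name": "Alice", "dept": {"name": "HR", "floor": 2}},
        {"id": 102, "name": "Bob", "dept": {"name": "Engineering", "floor": 5}},
    ],
}

# json_normalize() expands nested objects into dotted column names.
df_flat = pd.json_normalize(payload["employees"])
print(df_flat.columns.tolist())
print(df_flat)
```

Each nested key becomes its own column (here `dept.name` and `dept.floor`), so the result can be filtered and aggregated like any other DataFrame.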

### 🔍 Config & Metadata Storage

* JSON is used for saving experiment settings, user preferences, etc.

### 📦 NoSQL/Document Databases

* Databases like MongoDB use JSON-style formats — useful when pulling/exporting collections.

### 📊 Streaming Data

* Line-delimited JSON (`lines=True`) is used for processing logs, sensor data, etc.
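
For large line-delimited files, `pd.read_json()` can also read in pieces via `chunksize`. The following is a small sketch under assumed data: it writes five invented sensor readings to a temporary file (the file name `readings.jsonl` is hypothetical) and sums them two rows at a time.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sensor log, used only for this illustration.
df_log = pd.DataFrame(
    {"sensor": ["a", "b", "a", "b", "a"], "reading": [1.0, 2.5, 1.2, 2.7, 0.9]}
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "readings.jsonl")
    df_log.to_json(path, orient="records", lines=True)

    total = 0.0
    n_chunks = 0
    # chunksize requires lines=True and yields DataFrames of up to 2 rows each.
    with pd.read_json(path, lines=True, chunksize=2) as reader:
        for chunk in reader:
            n_chunks += 1
            total += chunk["reading"].sum()

print(n_chunks, round(total, 2))
```

Only one chunk is held in memory at a time, which is the main reason to prefer this pattern for logs that do not fit in RAM.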

## ✅ Summary Table

| Task                | Method             |
| ------------------- | ------------------ |
| Read JSON file      | `pd.read_json()`   |
| Write JSON file     | `df.to_json()`     |
| Row-wise records    | `orient='records'` |
| Line-delimited JSON | `lines=True`       |
