## Structured, Semi-Structured, and Unstructured Data

**Structured Data:**
- Organized in a fixed format (e.g., tables, relational databases, CSV files).
- Easy to store and query: relational tables can be queried directly with SQL, and CSV files load cleanly into a table or DataFrame.

**Semi-Structured Data:**
- Carries some structure, such as keys, tags, and nesting, but does not enforce a fixed schema (e.g., JSON, XML).
- Requires a format-specific parser (e.g., Python's `json` module) to extract useful information.

**Unstructured Data:**
- No predefined data model (e.g., plain text, images, PDFs).
- Requires more advanced processing, such as NLP or image recognition, to extract useful information.


In [None]:
import pandas as pd
import json
import re

### Processing Structured Data (CSV)

In [None]:
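# Write a small CSV file to disk, then load it into a DataFrame.
csv_data = "id,name,age\n1,Alice,30\n2,Bob,25\n3,Charlie,35\n"
with open("data.csv", "w") as f:
    f.write(csv_data)
df = pd.read_csv("data.csv")  # column types are inferred from the values
print(df.head())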
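
The claim that structured data is easy to query with SQL can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and a hypothetical table name `people`, neither of which appears in the original notebook; the rows mirror the CSV above.

```python
import sqlite3

# Same three records as the CSV example, as (id, name, age) tuples.
rows = [(1, "Alice", 30), (2, "Bob", 25), (3, "Charlie", 35)]

# Assumption: an in-memory database and a table called "people" (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)

# Because every row has the same columns, filtering and sorting is a single query.
older = conn.execute(
    "SELECT name, age FROM people WHERE age >= 30 ORDER BY age"
).fetchall()
conn.close()

print(older)
```

The fixed schema is what makes the `WHERE` and `ORDER BY` clauses possible without any custom parsing.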

### Processing Semi-Structured Data (JSON)

In [None]:
# Write a JSON document to disk, then parse it back into Python dicts and lists.
json_data = '{"employees": [{"id": 1, "name": "Alice", "age": 30}, {"id": 2, "name": "Bob", "age": 25}]}'
with open("data.json", "w") as f:
    f.write(json_data)
with open("data.json", "r") as f:
    data = json.load(f)
print(data["employees"][0])  # first employee record, as a dict
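
Semi-structured records do not have to share the same fields, which is one reason they need more care than a table. The sketch below uses a hypothetical variation of the JSON above in which the second record has no `age` and carries an extra `skills` list; the column list is likewise an assumption chosen for illustration.

```python
import json

# Hypothetical input: Bob's record is missing "age" and has an extra "skills" key.
raw = (
    '{"employees": ['
    '{"id": 1, "name": "Alice", "age": 30}, '
    '{"id": 2, "name": "Bob", "skills": ["SQL", "Python"]}'
    ']}'
)
data = json.loads(raw)

# Project every record onto a fixed set of columns; missing keys become None.
columns = ["id", "name", "age"]
rows = [tuple(emp.get(col) for col in columns) for emp in data["employees"]]

print(rows)
```

Using `dict.get` rather than indexing avoids a `KeyError` when a field is absent, at the cost of having to handle `None` downstream.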

### Processing Unstructured Data (TXT)

In [None]:
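# Write free text to disk, read it back, and pull the ages out with a regular expression.
txt_data = "Alice is 30 years old. Bob is 25 years old."
with open("data.txt", "w") as f:
    f.write(txt_data)
with open("data.txt", "r") as f:
    text = f.read()
# Match only numbers followed by "years old" so unrelated digits are ignored.
ages = [int(age) for age in re.findall(r'(\d+) years old', text)]
print("Extracted ages:", ages)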
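
Extracting numbers alone loses the link between each age and the person it belongs to. A possible next step, sketched here under the assumption that every sentence follows the pattern "<Name> is <N> years old", is to capture both parts with named groups and build structured records. Real free text rarely follows one pattern this neatly, which is where NLP tooling comes in.

```python
import re

text = "Alice is 30 years old. Bob is 25 years old."

# Assumption: names are a single capitalized word and the phrasing is fixed.
pattern = re.compile(r"(?P<name>[A-Z][a-z]+) is (?P<age>\d+) years old")

# Each match becomes one record, with the age converted to an integer.
records = [
    {"name": m.group("name"), "age": int(m.group("age"))}
    for m in pattern.finditer(text)
]

print(records)
```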