# Phase 8: Building Mini-Odibi Core
## Putting it all together -- the architecture

This is where everything you have learned comes together. You will build the core 
of mini-odibi from scratch:

1. **Config system** (Pydantic models from Phase 6)
2. **Base engine** (ABC from Phase 4)
3. **Pandas engine** (Pandas from Phase 7)
4. **Connections** (OOP from Phase 4)
5. **Node execution** (everything combined)

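After this phase, you will have a working core: validated configuration models, an 
engine that reads and writes files, and a node that loads a source file into a DataFrame. 
The steps below build the configuration from Python dicts rather than parsing YAML.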

**IMPORTANT:** From this point on, you write `.py` files in `learning/mini_odibi/`. 
This notebook guides you through what to build. The code goes in real files.

---
## Step 1: Project Structure

Create these files in `learning/mini_odibi/`:
```
mini_odibi/
  __init__.py          # Empty for now
  config.py            # Pydantic config models
  exceptions.py        # Custom exceptions
  engine/
    __init__.py
    base.py            # Abstract BaseEngine
    pandas_engine.py   # Pandas implementation
  connections.py       # Connection classes
  node.py              # Node execution
```
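
If you prefer to create the empty files with a script rather than by hand, the layout above could be scaffolded roughly like this. The `scaffold` helper and the `PROJECT_FILES` list are hypothetical names used only for this sketch, and the demo runs in a temporary directory so nothing in your project is touched.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the tree above
PROJECT_FILES = [
    "__init__.py",
    "config.py",
    "exceptions.py",
    "engine/__init__.py",
    "engine/base.py",
    "engine/pandas_engine.py",
    "connections.py",
    "node.py",
]


def scaffold(root):
    """Create empty project files under root, skipping any that already exist."""
    created = []
    for relative in PROJECT_FILES:
        target = Path(root) / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            target.touch()
            created.append(relative)
    return created


# Demo in a throwaway directory; for the real project pass "learning/mini_odibi"
with tempfile.TemporaryDirectory() as tmp:
    first_run = scaffold(Path(tmp) / "mini_odibi")
    second_run = scaffold(Path(tmp) / "mini_odibi")
```

Because existing files are skipped, running the helper a second time creates nothing and never overwrites code you have already typed.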

---
## Step 2: exceptions.py

Start with the simplest file. Create custom exceptions:

In [None]:
# Write this to mini_odibi/exceptions.py
# DO NOT copy-paste. Type it yourself.

# Here is the pattern from odibi/exceptions.py:

class MiniOdibiError(Exception):
    """Base exception for mini-odibi."""
    pass


class ConfigError(MiniOdibiError):
    """Configuration validation failed."""

    def __init__(self, message, field=None):
        self.field = field
        # Mention the offending field only when one was given
        location = f" ({field})" if field else ""
        super().__init__(f"Config error{location}: {message}")


# NOTE: this name shadows Python's built-in ConnectionError in any module that imports it.
class ConnectionError(MiniOdibiError):
    """Connection failed."""
    pass


class ExecutionError(MiniOdibiError):
    """Node execution failed."""

    def __init__(self, node_name, message, original_error=None):
        self.node_name = node_name
        self.original_error = original_error
        super().__init__(f"Node '{node_name}' failed: {message}")
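
To see why a shared base class is useful, here is a small usage sketch. It repeats two of the classes so that it runs on its own; `load_customers` and the file path are made up for the example. It shows a low-level error being translated into `ExecutionError` and then caught through `MiniOdibiError`.

```python
class MiniOdibiError(Exception):
    """Base exception for mini-odibi."""


class ExecutionError(MiniOdibiError):
    """Node execution failed."""

    def __init__(self, node_name, message, original_error=None):
        self.node_name = node_name
        self.original_error = original_error
        super().__init__(f"Node '{node_name}' failed: {message}")


def load_customers():
    """Hypothetical step that fails with a low-level error."""
    try:
        open("/no/such/dir/customers.csv")
    except OSError as exc:
        # Translate into the framework's own exception, keeping the cause attached
        raise ExecutionError("customers", "could not read source", original_error=exc) from exc


try:
    load_customers()
except MiniOdibiError as err:  # the base class catches every mini-odibi error
    caught = err

print(caught)  # Node 'customers' failed: could not read source
```

Using `raise ... from exc` keeps the original traceback chained, so the low-level cause is still visible when debugging.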

---
## Step 3: config.py

Build Pydantic models for your configuration. Reference Phase 6.

In [None]:
# Write this to mini_odibi/config.py
# Use what you learned in Phase 6
# YOUR IMPLEMENTATION:
#
# 1. Create an EngineType enum (str, Enum): pandas, spark, polars
# 2. Create a WriteMode enum (str, Enum): overwrite, append, upsert
# 3. Create TransformConfig(BaseModel): type (str), params (dict)
# 4. Create NodeConfig(BaseModel): name, source, format, write_mode, keys, transforms
# 5. Create ConnectionConfig(BaseModel): name, type, base_path
# 6. Create PipelineConfig(BaseModel): name, engine, connection, nodes
#
# Add validators:
#   - name must not be empty
#   - keys required when write_mode is upsert
#
# Test by creating a PipelineConfig from a dict
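
Once you have made your own attempt, you can compare it with the sketch below. It assumes Pydantic v2 (`field_validator`, `model_validator`); if Phase 6 used Pydantic v1, the validator decorators differ. The default values (`"csv"`, `"local"`, `"./data"`) and the choice to validate `name` only on `NodeConfig` are assumptions made for the example, not requirements from the list above.

```python
from enum import Enum
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field, ValidationError, field_validator, model_validator


class EngineType(str, Enum):
    PANDAS = "pandas"
    SPARK = "spark"
    POLARS = "polars"


class WriteMode(str, Enum):
    OVERWRITE = "overwrite"
    APPEND = "append"
    UPSERT = "upsert"


class TransformConfig(BaseModel):
    type: str
    params: Dict[str, Any] = Field(default_factory=dict)


class NodeConfig(BaseModel):
    name: str
    source: str
    format: str = "csv"  # default is an assumption
    write_mode: WriteMode = WriteMode.OVERWRITE
    keys: Optional[List[str]] = None
    transforms: List[TransformConfig] = Field(default_factory=list)

    @field_validator("name")
    @classmethod
    def name_not_empty(cls, value: str) -> str:
        if not value.strip():
            raise ValueError("name must not be empty")
        return value

    @model_validator(mode="after")
    def keys_required_for_upsert(self) -> "NodeConfig":
        # Cross-field rule: upsert needs keys to match rows on
        if self.write_mode == WriteMode.UPSERT and not self.keys:
            raise ValueError("keys are required when write_mode is upsert")
        return self


class ConnectionConfig(BaseModel):
    name: str
    type: str = "local"  # default is an assumption
    base_path: str = "./data"


class PipelineConfig(BaseModel):
    name: str
    engine: EngineType = EngineType.PANDAS
    connection: ConnectionConfig
    nodes: List[NodeConfig]


# Build the whole pipeline from a plain dict, as the last step above suggests
pipeline = PipelineConfig(
    **{
        "name": "demo",
        "engine": "pandas",
        "connection": {"name": "local", "type": "local", "base_path": "./data"},
        "nodes": [{"name": "customers", "source": "customers.csv", "format": "csv"}],
    }
)

# An upsert node without keys should be rejected
try:
    NodeConfig(name="orders", source="orders.csv", write_mode="upsert")
    upsert_error = None
except ValidationError as exc:
    upsert_error = exc
```

Nested dicts are converted into the nested models automatically, which is why a single dict is enough to build the full pipeline.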

---
## Step 4: engine/base.py

Create the abstract base engine. Reference Phase 4, Section 5.

In [None]:
# Write this to mini_odibi/engine/base.py
# YOUR IMPLEMENTATION:
#
# Create BaseEngine(ABC) with abstract methods:
#   - read(self, path, format, options=None) -> Any
#   - write(self, df, path, format, mode="overwrite", options=None) -> None
#   - count_rows(self, df) -> int
#   - get_columns(self, df) -> List[str]
#   - filter_rows(self, df, condition) -> Any
#
# Non-abstract methods:
#   - describe(self) -> str  (returns engine class name)
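
One way the contract above might look is sketched here, for comparison after your own attempt. `ListEngine` is a hypothetical toy engine (a "DataFrame" is just a list of dicts) included only to show that a subclass must implement every abstract method before it can be instantiated.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List, Optional


class BaseEngine(ABC):
    """Contract every engine must fulfil."""

    @abstractmethod
    def read(self, path: str, format: str, options: Optional[Dict[str, Any]] = None) -> Any:
        """Load data from path and return an engine-specific DataFrame."""

    @abstractmethod
    def write(self, df: Any, path: str, format: str, mode: str = "overwrite",
              options: Optional[Dict[str, Any]] = None) -> None:
        """Persist df to path."""

    @abstractmethod
    def count_rows(self, df: Any) -> int:
        """Return the number of rows in df."""

    @abstractmethod
    def get_columns(self, df: Any) -> List[str]:
        """Return the column names of df."""

    @abstractmethod
    def filter_rows(self, df: Any, condition: Any) -> Any:
        """Return only the rows of df that match condition."""

    def describe(self) -> str:
        """Shared by all engines: report the concrete class name."""
        return type(self).__name__


class ListEngine(BaseEngine):
    """Hypothetical in-memory engine used only for this sketch."""

    def __init__(self):
        self.storage = {}

    def read(self, path, format, options=None):
        return list(self.storage[path])

    def write(self, df, path, format, mode="overwrite", options=None):
        existing = self.storage.get(path, []) if mode == "append" else []
        self.storage[path] = existing + list(df)

    def count_rows(self, df):
        return len(df)

    def get_columns(self, df):
        return list(df[0].keys()) if df else []

    def filter_rows(self, df, condition):
        # In this toy engine the condition is a callable, not a query string
        return [row for row in df if condition(row)]


# The abstract class itself cannot be instantiated
try:
    BaseEngine()
    abstract_error = None
except TypeError as exc:
    abstract_error = exc

engine = ListEngine()
engine.write([{"id": 1}, {"id": 2}], "memory://ids", "list")
rows = engine.read("memory://ids", "list")
```

Code that only calls the `BaseEngine` methods, such as the node in Step 7, does not need to know which concrete engine it was given.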

---
## Step 5: engine/pandas_engine.py

Implement the engine contract with Pandas. Reference Phase 7.

In [None]:
# Write this to mini_odibi/engine/pandas_engine.py
# YOUR IMPLEMENTATION:
#
# Create PandasEngine(BaseEngine) that implements all abstract methods:
#   - read: use pd.read_csv, pd.read_parquet, pd.read_json based on format
#   - write: use df.to_csv, df.to_parquet based on format and mode
#   - count_rows: return len(df)
#   - get_columns: return list(df.columns)
#   - filter_rows: use df.query(condition)
#
# Handle errors gracefully with try/except
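
A possible shape for the Pandas engine is sketched below. To keep the example runnable on its own, `PandasEngineSketch` does not inherit from `BaseEngine`; in your file it would. The append behaviour for CSV and the decision to reject non-overwrite modes for Parquet are assumptions of this sketch, and the built-in exceptions stand in for your custom ones.

```python
import tempfile
from pathlib import Path

import pandas as pd


class PandasEngineSketch:
    """Stand-alone sketch; in mini_odibi this class would subclass BaseEngine."""

    # Map each supported format name to the pandas reader that handles it
    _READERS = {"csv": pd.read_csv, "parquet": pd.read_parquet, "json": pd.read_json}

    def read(self, path, format, options=None):
        reader = self._READERS.get(format)
        if reader is None:
            raise ValueError(f"Unsupported read format: {format!r}")
        try:
            return reader(path, **(options or {}))
        except FileNotFoundError as exc:
            # Re-raise with the path spelled out; a custom exception would fit here too
            raise FileNotFoundError(f"Source file not found: {path}") from exc

    def write(self, df, path, format, mode="overwrite", options=None):
        options = options or {}
        if format == "csv":
            # Append only when the file already exists, and skip the header in that case
            appending = mode == "append" and Path(path).exists()
            df.to_csv(path, mode="a" if appending else "w", header=not appending,
                      index=False, **options)
        elif format == "parquet":
            if mode != "overwrite":
                raise ValueError("This sketch only supports overwrite for parquet")
            df.to_parquet(path, index=False, **options)
        else:
            raise ValueError(f"Unsupported write format: {format!r}")

    def count_rows(self, df):
        return len(df)

    def get_columns(self, df):
        return list(df.columns)

    def filter_rows(self, df, condition):
        return df.query(condition)


# Round trip through a temporary CSV file
with tempfile.TemporaryDirectory() as tmp:
    engine = PandasEngineSketch()
    target = str(Path(tmp) / "people.csv")
    people = pd.DataFrame({"id": [1, 2, 3], "age": [34, 19, 52]})
    engine.write(people, target, "csv")
    engine.write(people, target, "csv", mode="append")
    loaded = engine.read(target, "csv")
    row_count = engine.count_rows(loaded)
    adults = engine.filter_rows(loaded, "age >= 21")
```

Note that `to_parquet` and `read_parquet` need a Parquet backend such as pyarrow installed, which is why the demo sticks to CSV.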

---
## Step 6: connections.py

Create the connection layer. Reference Phase 4.

In [None]:
# Write this to mini_odibi/connections.py
# YOUR IMPLEMENTATION:
#
# Create BaseConnection(ABC) with:
#   - abstract method: resolve_path(self, relative_path) -> str
#   - connect() and disconnect() methods
#
# Create LocalConnection(BaseConnection):
#   - __init__(self, base_path="./data")
#   - resolve_path joins base_path with relative_path using Path
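
For comparison after your own attempt, the connection layer could look roughly like this. The `connected` flag is an assumption of the sketch; the outline above only asks for `connect()` and `disconnect()` methods and does not say what they should track.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BaseConnection(ABC):
    """Common behaviour for all connections."""

    def __init__(self):
        self.connected = False  # hypothetical state flag

    def connect(self):
        self.connected = True

    def disconnect(self):
        self.connected = False

    @abstractmethod
    def resolve_path(self, relative_path: str) -> str:
        """Turn a path from the config into one the engine can open."""


class LocalConnection(BaseConnection):
    """Resolves paths against a folder on the local file system."""

    def __init__(self, base_path="./data"):
        super().__init__()
        self.base_path = Path(base_path)

    def resolve_path(self, relative_path):
        return str(self.base_path / relative_path)


conn = LocalConnection("learning/mini_odibi/test_data")
conn.connect()
resolved = conn.resolve_path("customers.csv")
```

One detail worth knowing: joining a `Path` with an absolute path discards the base, so you may want `resolve_path` to reject absolute inputs.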

---
## Step 7: node.py

The node ties everything together -- it reads, transforms, and writes data.

In [None]:
# Write this to mini_odibi/node.py
# YOUR IMPLEMENTATION:
#
# Create a Node class that:
#   - Takes: config (NodeConfig), engine (BaseEngine), connection (BaseConnection)
#   - Has an execute() method that:
#     1. Resolves the source path using connection
#     2. Reads data using engine.read()
#     3. Logs the row count
#     4. Returns the DataFrame
#   - Wraps everything in try/except for error handling
#   - Tracks duration using time.time()
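
The execute flow described above might be sketched as follows. So that it runs on its own, the sketch uses hypothetical stand-ins (`StubConfig`, `StubConnection`, `StubEngine`, and a local `ExecutionError`) in place of the classes from the earlier steps; only the `Node` class is the part to compare with your own.

```python
import csv
import logging
import tempfile
import time
from dataclasses import dataclass
from pathlib import Path

logger = logging.getLogger("mini_odibi.node")


class ExecutionError(Exception):
    """Stand-in for mini_odibi.exceptions.ExecutionError."""


@dataclass
class StubConfig:  # stand-in for NodeConfig
    name: str
    source: str
    format: str = "csv"


class StubConnection:  # stand-in for LocalConnection
    def __init__(self, base_path):
        self.base_path = Path(base_path)

    def resolve_path(self, relative_path):
        return str(self.base_path / relative_path)


class StubEngine:  # stand-in for PandasEngine; rows are plain dicts
    def read(self, path, format, options=None):
        with open(path, newline="") as handle:
            return list(csv.DictReader(handle))

    def count_rows(self, df):
        return len(df)


class Node:
    def __init__(self, config, engine, connection):
        self.config = config
        self.engine = engine
        self.connection = connection
        self.duration = None  # seconds spent in the last execute() call

    def execute(self):
        start = time.time()
        try:
            path = self.connection.resolve_path(self.config.source)
            df = self.engine.read(path, self.config.format)
            logger.info("Node '%s' read %d rows", self.config.name, self.engine.count_rows(df))
            return df
        except Exception as exc:
            raise ExecutionError(f"Node '{self.config.name}' failed: {exc}") from exc
        finally:
            # Runs on success and on failure, so the duration is always recorded
            self.duration = time.time() - start


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "customers.csv").write_text("id,name\n1,Alice\n2,Bob\n")
    connection = StubConnection(tmp)
    node = Node(StubConfig("customers", "customers.csv"), StubEngine(), connection)
    rows = node.execute()

    # A missing source surfaces as ExecutionError with the original cause chained
    broken = Node(StubConfig("orders", "missing.csv"), StubEngine(), connection)
    try:
        broken.execute()
        failure = None
    except ExecutionError as exc:
        failure = exc
```

The node depends only on the methods of the engine and connection, which is what lets the stubs replace the real classes here.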

---
## Step 8: Test Your Core

Create a sample CSV file and test the full flow.

In [None]:
# Test your mini-odibi core
# Create a test CSV first:

import csv
from pathlib import Path

# Create test data directory
test_dir = Path("learning/mini_odibi/test_data")
test_dir.mkdir(parents=True, exist_ok=True)  # parents=True also creates missing parent folders

# Write sample CSV
with open(test_dir / "customers.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name", "email"])
    writer.writeheader()
    writer.writerow({"id": 1, "name": "Alice", "email": "alice@test.com"})
    writer.writerow({"id": 2, "name": "Bob", "email": "bob@test.com"})
    writer.writerow({"id": 3, "name": "Charlie", "email": "charlie@test.com"})

print("Test data created!")

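# Now test your mini-odibi (uncomment once your files exist).
# The imports below assume the `learning/` folder is on sys.path.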
# from mini_odibi.config import NodeConfig, PipelineConfig
# from mini_odibi.engine.pandas_engine import PandasEngine
# from mini_odibi.connections import LocalConnection
# from mini_odibi.node import Node
#
# engine = PandasEngine()
# conn = LocalConnection("learning/mini_odibi/test_data")
# config = NodeConfig(name="customers", source="customers.csv", format="csv")
# node = Node(config, engine, conn)
# df = node.execute()
# print(df)
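
Printing the DataFrame shows that something loaded; a few assertions make the check repeatable. The sketch below reads the same three rows from an in-memory string so that it runs without your package; in your own test, `df` would come from `node.execute()` instead.

```python
import io

import pandas as pd

# Same rows as the customers.csv written above, held in memory for the sketch
CSV_TEXT = (
    "id,name,email\n"
    "1,Alice,alice@test.com\n"
    "2,Bob,bob@test.com\n"
    "3,Charlie,charlie@test.com\n"
)

df = pd.read_csv(io.StringIO(CSV_TEXT))  # stand-in for node.execute()

# Checks you could run against the DataFrame your node returns
assert len(df) == 3
assert list(df.columns) == ["id", "name", "email"]
assert df["id"].is_unique
assert df["email"].str.contains("@").all()
```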

---
## Checkpoint

If your test runs and prints the DataFrame, you have a working mini-odibi core.
You built it yourself: config system, engine abstraction, connections, node execution.

**Next:** Phase 9 -- Adding features (registry, transformers, validation, pipeline orchestration).