# Phase 9: Building Mini-Odibi Features
## Adding the capabilities that make a framework useful

In Phase 8, you built the skeleton. Now you add muscle:

1. **Transformer registry** -- register and look up transform functions by name
2. **Built-in transformers** -- rename, drop, add, and cast columns; filter rows
3. **Validation engine** -- data quality checks
4. **Pipeline orchestration** -- run multiple nodes in sequence

After this phase, mini-odibi will be a real, working framework.

---
## Step 1: Transformer Registry

A registry maps string names to functions. When a YAML config says 
`type: rename_columns`, the registry finds the corresponding Python function.

This is exactly how `odibi/registry.py` works.

In [None]:
# Write this to mini_odibi/registry.py
# YOUR IMPLEMENTATION:
#
# Create a TransformRegistry class with:
#   - _functions: Dict[str, Callable] (class variable)
#   - register(name, func) classmethod
#   - get(name) classmethod -- returns the function or raises ValueError
#   - list_functions() classmethod -- returns list of registered names
#
# Create a @transform decorator that registers a function:
#   @transform
#   def rename_columns(df, mapping):
#       return df.rename(columns=mapping)
#
# This combines Phase 4 (classes) and Phase 5 (decorators)
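If you get stuck, here is one minimal sketch of the registry described above. It follows the cell's spec (`_functions`, `register`, `get`, `list_functions`, `@transform`), but the error message wording and the `shout` demo function are assumptions made for illustration; the real `odibi/registry.py` may differ in its details.

```python
from typing import Callable, Dict, List


class TransformRegistry:
    """Maps string names to transform functions."""

    # Class-level dict, so every caller shares the same registry.
    _functions: Dict[str, Callable] = {}

    @classmethod
    def register(cls, name: str, func: Callable) -> None:
        cls._functions[name] = func

    @classmethod
    def get(cls, name: str) -> Callable:
        try:
            return cls._functions[name]
        except KeyError:
            known = ", ".join(sorted(cls._functions)) or "<none>"
            raise ValueError(
                f"Unknown transform '{name}'. Registered: {known}"
            ) from None

    @classmethod
    def list_functions(cls) -> List[str]:
        return sorted(cls._functions)


def transform(func: Callable) -> Callable:
    """Register func under its own __name__ and return it unchanged."""
    TransformRegistry.register(func.__name__, func)
    return func


# Hypothetical demo function; a real transform would take a DataFrame.
@transform
def shout(text: str) -> str:
    return text.upper()


print(TransformRegistry.list_functions())       # ['shout']
print(TransformRegistry.get("shout")("hello"))  # HELLO
```

Because the decorator returns the original function, `shout` can still be called directly as well as looked up by name.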

---
## Step 2: Built-in Transformers

Build these transformer functions and register them:

In [None]:
# Write this to mini_odibi/transformers.py
# YOUR IMPLEMENTATION:
#
# @transform
# def rename_columns(df, mapping):
#     """Rename columns using a mapping dict."""
#     return df.rename(columns=mapping)
#
# @transform
# def filter_rows(df, condition):
#     """Filter rows using a query condition."""
#     return df.query(condition)
#
# @transform
# def drop_columns(df, columns):
#     """Drop specified columns."""
#     return df.drop(columns=columns)
#
# @transform
# def add_column(df, name, expression):
#     """Add a computed column without mutating the input."""
#     return df.assign(**{name: df.eval(expression)})
#
# @transform
# def cast_types(df, type_mapping):
#     """Cast column types."""
#     return df.astype(type_mapping)
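To see how these transformers get used once they are registered, the sketch below applies a list of steps by name. The `REGISTRY` dict is a stand-in for your Step 1 registry so the block runs on its own, and the `type`/`params` shape of each step is an assumption about what your YAML config parses into. This version of `add_column` uses `assign`, so the caller's DataFrame is never modified.

```python
import pandas as pd

# Stand-in for the Step 1 registry so this block is self-contained.
REGISTRY = {}


def transform(func):
    REGISTRY[func.__name__] = func
    return func


@transform
def rename_columns(df, mapping):
    return df.rename(columns=mapping)


@transform
def filter_rows(df, condition):
    return df.query(condition)


@transform
def add_column(df, name, expression):
    # assign() returns a new DataFrame and leaves the input untouched.
    return df.assign(**{name: df.eval(expression)})


@transform
def cast_types(df, type_mapping):
    return df.astype(type_mapping)


# Hypothetical steps, shaped like what a YAML config might parse into.
steps = [
    {"type": "rename_columns", "params": {"mapping": {"qty": "quantity"}}},
    {"type": "cast_types", "params": {"type_mapping": {"quantity": "int64"}}},
    {"type": "add_column",
     "params": {"name": "total", "expression": "quantity * price"}},
    {"type": "filter_rows", "params": {"condition": "total > 10"}},
]

raw = pd.DataFrame({"qty": ["1", "5", "3"], "price": [2.0, 4.0, 10.0]})

result = raw
for step in steps:
    func = REGISTRY[step["type"]]            # look the function up by name
    result = func(result, **step["params"])  # params become keyword arguments

print(result)
```

Every transformer takes a DataFrame and returns a DataFrame, which is what lets the loop chain them without knowing anything about each one.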

---
## Step 3: Validation Engine

Build a simple validation system that checks data quality rules.

In [None]:
# Write this to mini_odibi/validation.py
# YOUR IMPLEMENTATION:
#
# Create validation test functions:
#   - not_null(df, columns) -> list of error strings
#   - unique(df, columns) -> list of error strings
#   - accepted_values(df, column, values) -> list of error strings
#
# Create a validate_node(df, tests) function that:
#   - Runs each test
#   - Collects all errors
#   - Returns (passed: bool, errors: list)
#
# This mirrors odibi/validation/engine.py
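One possible shape for these functions is sketched below. Several things here are assumptions, not part of the spec above: each test is a dict with a `type` key plus keyword arguments, `unique` treats `columns` as a composite key, and the error message wording is made up. Adjust all three to match your own config models.

```python
import pandas as pd


def not_null(df, columns):
    errors = []
    for col in columns:
        null_count = int(df[col].isna().sum())
        if null_count:
            errors.append(f"not_null: '{col}' has {null_count} null value(s)")
    return errors


def unique(df, columns):
    # Assumption: the combination of `columns` must be unique.
    dup_count = int(df.duplicated(subset=columns).sum())
    if dup_count:
        return [f"unique: {dup_count} duplicate row(s) on {columns}"]
    return []


def accepted_values(df, column, values):
    present = df[column].dropna()  # nulls are the job of not_null
    bad = present[~present.isin(values)].unique().tolist()
    if bad:
        return [f"accepted_values: '{column}' has unexpected values {bad}"]
    return []


TESTS = {
    "not_null": not_null,
    "unique": unique,
    "accepted_values": accepted_values,
}


def validate_node(df, tests):
    """Run every test, collect all errors, and return (passed, errors)."""
    errors = []
    for test in tests:
        params = {k: v for k, v in test.items() if k != "type"}
        errors.extend(TESTS[test["type"]](df, **params))
    return (not errors, errors)


df = pd.DataFrame({
    "id": [1, 2, 2],
    "status": ["open", "closed", "lost"],
    "owner": ["ana", None, "raj"],
})
tests = [
    {"type": "not_null", "columns": ["id", "owner"]},
    {"type": "unique", "columns": ["id"]},
    {"type": "accepted_values", "column": "status",
     "values": ["open", "closed"]},
]

passed, errors = validate_node(df, tests)
print(passed)
for error in errors:
    print(error)
```

Collecting every error instead of stopping at the first one means a single run tells you everything that is wrong with the data.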

---
## Step 4: Pipeline Orchestration

Build the pipeline that runs multiple nodes in sequence.

In [None]:
# Write this to mini_odibi/pipeline.py
# YOUR IMPLEMENTATION:
#
# Create a Pipeline class that:
#   - Takes a PipelineConfig
#   - Creates the engine based on config.engine
#   - Creates the connection based on config.connection
#   - Has a run() method that:
#     1. Loops through config.nodes
#     2. Creates a Node for each
#     3. Executes each node
#     4. Applies transforms from the registry
#     5. Runs validation if configured
#     6. Writes output
#     7. Prints a summary at the end
#
# This is the culmination of everything you have learned.
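The loop inside `run()` can be sketched as follows. Your Phase 8 classes are not reproduced here, so everything structural is a stand-in: `NodeConfig` and `PipelineConfig` are hypothetical dataclasses, a plain dict of DataFrames plays the role of engine plus connection, `TRANSFORMS` replaces the registry, and validation is reduced to a `not_null` list. Treat it as a picture of the control flow, not of your final class.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

import pandas as pd


# Hypothetical stand-ins for the Phase 8 config models.
@dataclass
class NodeConfig:
    name: str
    read: str
    write: str
    transforms: List[Dict[str, Any]] = field(default_factory=list)
    not_null: List[str] = field(default_factory=list)


@dataclass
class PipelineConfig:
    name: str
    nodes: List[NodeConfig]


# Stand-in for the transform registry.
TRANSFORMS: Dict[str, Callable] = {
    "rename_columns": lambda df, mapping: df.rename(columns=mapping),
    "filter_rows": lambda df, condition: df.query(condition),
}


class Pipeline:
    def __init__(self, config: PipelineConfig, storage: Dict[str, pd.DataFrame]):
        self.config = config
        self.storage = storage  # dict standing in for engine + connection

    def run(self) -> Dict[str, str]:
        summary = {}
        for node in self.config.nodes:
            df = self.storage[node.read]                      # read
            for step in node.transforms:                      # transform
                func = TRANSFORMS[step["type"]]
                df = func(df, **step.get("params", {}))
            null_counts = {c: int(df[c].isna().sum()) for c in node.not_null}
            failures = {c: n for c, n in null_counts.items() if n}
            if failures:                                      # validate
                summary[node.name] = f"failed validation: nulls in {failures}"
                break  # later nodes may depend on this node's output
            self.storage[node.write] = df                     # write
            summary[node.name] = f"ok ({len(df)} rows)"
        print(f"Pipeline '{self.config.name}':")
        for name, status in summary.items():
            print(f"  {name}: {status}")
        return summary


storage = {
    "orders_raw": pd.DataFrame({"id": [1, 2, 3], "amt": [5.0, 50.0, 500.0]})
}
config = PipelineConfig(
    name="demo",
    nodes=[
        NodeConfig(
            name="clean", read="orders_raw", write="orders_clean",
            transforms=[{"type": "rename_columns",
                         "params": {"mapping": {"amt": "amount"}}}],
            not_null=["amount"],
        ),
        NodeConfig(
            name="big", read="orders_clean", write="orders_big",
            transforms=[{"type": "filter_rows",
                         "params": {"condition": "amount > 10"}}],
        ),
    ],
)
summary = Pipeline(config, storage).run()
```

The second node reads what the first node wrote, which is why the nodes have to run in order and why a validation failure stops the run.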

---
## Step 5: End-to-End Test

Create a YAML config and run your complete pipeline.

In [None]:
# Create a YAML config file and run it through your pipeline
# YOUR IMPLEMENTATION:
#
# 1. Create sample CSV files in test_data/
# 2. Write a pipeline.yaml config
# 3. Load the config with yaml.safe_load -> PipelineConfig
# 4. Create a Pipeline and call .run()
# 5. Verify the output files were created
#
# If this works, you have built a real data pipeline framework from scratch.
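The five steps above can be walked through in one script, sketched here under several assumptions: the YAML keys (`pipeline`, `nodes`, `read`, `write`, `transforms`) and the two Pydantic models are a hypothetical schema, the `run` function is a stand-in for your `Pipeline.run()`, and a temporary directory is used so nothing is left on disk. It needs `pandas`, `pyyaml`, and `pydantic` installed.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, List

import pandas as pd
import yaml
from pydantic import BaseModel


# Hypothetical schema; use your own Phase 8 models instead.
class NodeConfig(BaseModel):
    name: str
    read: str
    write: str
    transforms: List[Dict[str, Any]] = []


class PipelineConfig(BaseModel):
    pipeline: str
    nodes: List[NodeConfig]


CONFIG_YAML = """
pipeline: orders_demo
nodes:
  - name: clean_orders
    read: orders.csv
    write: orders_clean.csv
    transforms:
      - type: rename_columns
        params:
          mapping: {amt: amount}
      - type: filter_rows
        params:
          condition: "amount > 10"
"""

# Stand-in for the transform registry.
TRANSFORMS = {
    "rename_columns": lambda df, mapping: df.rename(columns=mapping),
    "filter_rows": lambda df, condition: df.query(condition),
}


def run(config: PipelineConfig, data_dir: Path) -> None:
    """Stand-in for Pipeline.run(): read, transform, write each node."""
    for node in config.nodes:
        df = pd.read_csv(data_dir / node.read)
        for step in node.transforms:
            df = TRANSFORMS[step["type"]](df, **step.get("params", {}))
        df.to_csv(data_dir / node.write, index=False)


with tempfile.TemporaryDirectory() as tmp:
    # 1. Create a sample CSV file in test_data/
    data_dir = Path(tmp) / "test_data"
    data_dir.mkdir()
    sample = pd.DataFrame({"id": [1, 2, 3], "amt": [5.0, 50.0, 500.0]})
    sample.to_csv(data_dir / "orders.csv", index=False)

    # 2. Write pipeline.yaml
    config_path = Path(tmp) / "pipeline.yaml"
    config_path.write_text(CONFIG_YAML)

    # 3. Load the config: YAML -> dict -> validated model
    config = PipelineConfig(**yaml.safe_load(config_path.read_text()))

    # 4. Run the pipeline
    run(config, data_dir)

    # 5. Verify the output file was created, then load it for inspection
    output_path = data_dir / "orders_clean.csv"
    output_exists = output_path.exists()
    output = pd.read_csv(output_path)

print(output_exists)
print(output)
```

If a key in the YAML is missing or misspelled, the `PipelineConfig(...)` call fails before any data is read, so config mistakes surface early.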

---
## Checkpoint

You now have a working mini-odibi with:
- YAML config -> Pydantic models
- Abstract engine with Pandas implementation
- Transform registry with built-in transformers
- Data validation
- Pipeline orchestration

You built every line yourself. You understand every design decision because you made them.

**Next:** Phase 10 -- Testing and Professional Skills.