# 🧪 Full Pipeline Execution (Notebook Mode)

This notebook demonstrates the full pipeline execution using the master controller script `run_toolkit_pipeline.py`.

- Controlled via: `config/run_toolkit_config.yaml`
- Executes all pipeline modules in sequence (M01–M10)
- Outputs: Dashboards, reports, plots, and the final certified dataset
- ✅ Set `notebook: true` in the YAML to enable inline dashboards

> 📂 Final outputs are exported to the `exports/` and `data/processed/` directories.

---
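
Once a run has finished, a quick way to confirm that files landed where expected is to list the two output directories named above. The following is a minimal sketch using only the standard library; `list_pipeline_outputs` is a hypothetical helper, not part of the toolkit, and it assumes the directories are relative to the current working directory.

```python
from pathlib import Path

# Output directories named in this notebook; adjust if your config writes elsewhere.
OUTPUT_DIRS = ("exports", "data/processed")


def list_pipeline_outputs(base_dir="."):
    """Return {directory: sorted relative file paths} for each output directory.

    Hypothetical helper, not part of the toolkit. A missing directory maps to
    an empty list instead of raising an error.
    """
    base = Path(base_dir)
    found = {}
    for name in OUTPUT_DIRS:
        folder = base / name
        if folder.is_dir():
            # Walk the directory tree and keep regular files only.
            found[name] = sorted(
                p.relative_to(folder).as_posix()
                for p in folder.rglob("*")
                if p.is_file()
            )
        else:
            found[name] = []
    return found


if __name__ == "__main__":
    for directory, files in list_pipeline_outputs().items():
        print(f"{directory}/: {len(files)} file(s)")
        for file_name in files:
            print(f"  {file_name}")
```

An empty list for either directory after a run suggests the config points somewhere else or the run did not complete.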

<details>
<summary><strong>📎 Notes & Use Cases</strong></summary>

**🧭 Notes**
- Fully modular pipeline execution from raw to certified clean data
- Configurable behavior using a single YAML file
- Can be executed interactively (with displays) or headlessly (silent mode)

**💼 Use Cases**
- End-to-end QA audits for new or synthetic datasets
- Validating preprocessing logic during exploratory workflows
- Certifying pipeline output before downstream modeling
- Showcasing toolkit capabilities in interviews or portfolio reviews

</details>

<details>
<summary><strong>🔁 Alternate Modes</strong></summary>

- Set `notebook: false` in the YAML to run this notebook silently (ideal for automation or CI).
- Run the pipeline as a CLI script outside notebooks with:

```bash
python run_toolkit_pipeline.py --config config/run_toolkit_config.yaml
```

</details>

```python
from analyst_toolkit.run_toolkit_pipeline import run_full_pipeline

# Run the full pipeline using the settings in the YAML config
final_df = run_full_pipeline(config_path="config/run_toolkit_config.yaml")
```

## 🛠️ Next Steps

This notebook demonstrates the full analyst pipeline using notebook mode. The following enhancements are planned or encouraged for production workflows:

#### ✅ CLI and Automation
- Use the CLI version for scheduled or automated runs:
  
  ```bash
  python run_toolkit_pipeline.py --config config/run_toolkit_config.yaml
  ```

- Integrate into GitHub Actions or cron jobs for continuous data QA
- Swap YAML configs to support different datasets or audit targets
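
Swapping configs for automated runs can be sketched as a small batch runner that calls the CLI once per YAML file. This is an illustration under assumptions, not part of the toolkit: `build_command` and `run_all` are hypothetical helpers, the extra config file names are placeholders, and the script is assumed to be run from the directory that contains `run_toolkit_pipeline.py`. It uses `sys.executable` in place of a bare `python` so the same interpreter runs every job.

```python
import subprocess
import sys


def build_command(config_path, script="run_toolkit_pipeline.py"):
    """Build the CLI invocation shown above for a single config file."""
    return [sys.executable, script, "--config", str(config_path)]


def run_all(config_paths, dry_run=False):
    """Run the pipeline once per config and return {config: exit code}.

    With dry_run=True the commands are printed instead of executed, and the
    recorded exit code is None.
    """
    results = {}
    for config in config_paths:
        cmd = build_command(config)
        if dry_run:
            print(" ".join(cmd))
            results[str(config)] = None
            continue
        # check=False: record the exit code and continue with the next config.
        completed = subprocess.run(cmd, check=False)
        results[str(config)] = completed.returncode
    return results


if __name__ == "__main__":
    # Placeholder config names; only the first appears in this notebook.
    configs = [
        "config/run_toolkit_config.yaml",
        "config/dataset_b_config.yaml",
    ]
    run_all(configs, dry_run=True)
```

A scheduler such as cron or a CI job could call this script with `dry_run=False` and inspect the returned exit codes to decide whether to flag a run.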

#### 🚀 Planned Iterations
- Add a dynamic changelog to follow data end to end
- Extend the toolkit to a namespace and add additional modules:
  - ML Module Evaluation Suite
  - Visual EDA Suite
- Optional integration with cloud storage (GCS / S3) for inputs and outputs
- Create a streamlined CLI onboarding script (e.g., `init_pipeline.py`) to scaffold configs

#### 📦 Packaging Notes
- The toolkit is TOML-packaged and installable as a local Python module
- Follows modular design to support interactive, notebook, and script-based workflows
