# Data-FlyWheel Blueprint Orchestrated By MLRun

This notebook demonstrates how MLRun orchestrates NVIDIA NIM microservices while providing automatic tracking, logging, and MLOps best practices. Each workflow step is modular: NIM microservices can run independently or as part of an orchestrated workflow.

## Key Benefits

* **Infrastructure Abstraction**: MLRun eliminates glue code complexity, letting you focus on your use case rather than infrastructure management.
* **Complete Lifecycle Management**: From development to production, MLRun handles resource management, auto-scaling, and real-time monitoring.
* **Future Development**: Ongoing iterations will further reduce boilerplate code, making the transition from concept to production even more streamlined.

To learn more about MLRun, visit [mlrun.org](https://mlrun.org).

## 1. Creating an MLRun project
An MLRun project is a container for all your work on a particular ML application.
Projects host functions, workflows, artifacts (datasets, models, etc.), features (sets, vectors), and configuration (parameters, secrets, source, etc.).

As in the previous notebook, you also need to set the NGC API key:
- `NGC_API_KEY`: follow the instructions at [Generating NGC API Keys](https://docs.nvidia.com/ngc/gpu-cloud/ngc-private-registry-user-guide/index.html#generating-api-key)

In [None]:
import mlrun
import os
from getpass import getpass

os.environ['NGC_API_KEY'] = getpass("Enter your NGC API Key")
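
The cell above prompts for the key on every run. As a minimal sketch, the prompt can be skipped when the variable is already set in the environment. The helper name `ensure_secret` is hypothetical and not part of MLRun or this blueprint; it uses only the Python standard library.

```python
import os
from getpass import getpass
from typing import Callable


def ensure_secret(name: str, prompt: Callable[[str], str] = getpass) -> str:
    """Return env var `name`, prompting only when it is unset or empty.

    `ensure_secret` is a hypothetical helper for illustration.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        # Nothing usable in the environment: ask the user once and store it.
        value = prompt(f"Enter your {name}: ").strip()
        if not value:
            raise ValueError(f"{name} must not be empty")
        os.environ[name] = value
    return value


# Usage in this notebook would be:
# ensure_secret("NGC_API_KEY")
```

Passing a custom `prompt` callable makes the helper usable in non-interactive settings such as tests.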

In [None]:
project = mlrun.get_or_create_project(
    name="data-flywheel",
    context="../../",
    parameters={
        "source": "git://github.com/mlrun/nvidia-data-flywheel.git#main",
    },
)

The project now contains everything needed to run the Data-FlyWheel workflow: the required functions and the workflow itself.

## 2. Running Data-FlyWheel job with MLRun

> **Notice**: For this initial version, the workflow can be run only with one configuration at a time.
> To run multiple configurations, run each configuration separately.

The image below shows the Data-Flywheel workflow running in the MLRun UI. To find this view, go to project > Jobs and Workflows > Monitor Workflows.
![Workflow UI](../../img/workflow-ui.png)

In [None]:
configs = [
    {
        "model_name": "meta/llama-3.2-1b-instruct",
        "context_length": 8192,
        "gpus": 1,
        "pvc_size": "25Gi",
        "tag": "1.8.3",
        "customization_enabled": True
    }
]
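
Because the workflow accepts one configuration at a time, a longer list has to be submitted entry by entry. The sketch below shows one way to do that. `run_each_config` and the `run_one` callback are hypothetical names introduced for illustration; the callback is assumed to wrap a `project.run` call like the ones in the following cells.

```python
from typing import Any, Callable, Dict, List


def run_each_config(
    configs: List[Dict[str, Any]],
    run_one: Callable[[List[Dict[str, Any]]], Any],
) -> List[Any]:
    """Call `run_one` once per configuration, passing a single-item list each time.

    Hypothetical helper: it only splits the list; `run_one` does the actual work.
    """
    results = []
    for config in configs:
        # The workflow expects a list, so wrap each config in its own list.
        results.append(run_one([config]))
    return results


# Example wiring (assumes `project` from section 1):
# run_each_config(
#     configs,
#     lambda single: project.run(
#         name="data-flywheel-job",
#         arguments={
#             "workload_id": "primary_assistant",
#             "client_id": "aiva-1",
#             "configs": single,
#         },
#         watch=True,
#         engine="remote",
#         dirty=True,
#     ),
# )
```

The runs execute sequentially, so each one finishes (with `watch=True`) before the next starts.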

## 2.1. Initial Run

For this tutorial, we target the primary customer service agent by setting `workload_id` to "primary_assistant", and we set `client_id` to "aiva-1", which has **300** data points.

In [None]:
data_fly_wheel_workflow = project.run(
    name="data-flywheel-job",
    arguments={
        "workload_id": "primary_assistant",
        "client_id": "aiva-1",
        "configs": configs,
    },
    watch=True,
    engine="remote",
    dirty=True,
)

After the workflow finishes, the evaluation results are available in the MLRun UI: look for the artifact `finalize_evaluation-results-plot`.

You can also display it directly in the notebook:

In [None]:
project.get_artifact("finalize_evaluation-results-plot").to_dataitem().show()

## 2.2. Show Continuous Improvement (Optional)
To extend the flywheel run with additional data, we’ll launch a new job using `client_id` set to "aiva-2", which includes **500** data points, to evaluate the impact of increased data volume on performance.

In [None]:
data_fly_wheel_workflow = project.run(
    name="data-flywheel-job",
    arguments={
        "workload_id": "primary_assistant",
        "client_id": "aiva-2",
        "configs": configs,
    },
    watch=True,
    engine="remote",
    dirty=True,
)

In [None]:
project.get_artifact("finalize_evaluation-results-plot").to_dataitem().show()

Assuming we have now collected even more data points, let's kick off another flywheel run by setting `client_id` to "aiva-3", which includes **1,000** records.

In [None]:
data_fly_wheel_workflow = project.run(
    name="data-flywheel-job",
    arguments={
        "workload_id": "primary_assistant",
        "client_id": "aiva-3",
        "configs": configs,
    },
    watch=True,
    engine="remote",
    dirty=True,
)

In [None]:
project.get_artifact("finalize_evaluation-results-plot").to_dataitem().show()
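
The three runs above differ only in `client_id`, so the repeated cells could be condensed into a loop. The following is a sketch under that assumption. `CLIENT_DATA_POINTS` and `build_arguments` are hypothetical names introduced here; the client IDs and data-point counts are the ones stated in this notebook.

```python
from typing import Any, Dict, List

# Client IDs in the order used in this notebook, with their stated data-point counts.
CLIENT_DATA_POINTS: Dict[str, int] = {"aiva-1": 300, "aiva-2": 500, "aiva-3": 1000}


def build_arguments(
    client_id: str,
    configs: List[Dict[str, Any]],
    workload_id: str = "primary_assistant",
) -> Dict[str, Any]:
    """Build the `arguments` dict passed to `project.run` for one flywheel run."""
    if client_id not in CLIENT_DATA_POINTS:
        raise ValueError(f"unknown client_id: {client_id!r}")
    return {"workload_id": workload_id, "client_id": client_id, "configs": configs}


# Example loop (assumes `project` and `configs` from the cells above):
# for client_id in CLIENT_DATA_POINTS:
#     project.run(
#         name="data-flywheel-job",
#         arguments=build_arguments(client_id, configs),
#         watch=True,
#         engine="remote",
#         dirty=True,
#     )
#     project.get_artifact("finalize_evaluation-results-plot").to_dataitem().show()
```

Keeping the argument construction in one function avoids the three cells drifting apart if a parameter changes.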

After the run with 1,000 data points, we should observe the customized model’s score approaching 1.0. This indicates that the `Llama-3.2-1B-instruct` model achieves accuracy comparable to the much larger `Llama-3.3-70B-instruct` base model deployed in the AI Virtual Assistant, while significantly reducing latency and compute usage thanks to its smaller size.