# What is MLRun and Why It Matters

MLRun is an open-source MLOps orchestration framework that integrates feature stores, model training, deployment, and monitoring into a single, composable environment. It’s Kubernetes-native and designed for real-time and batch ML pipelines with traceability and governance baked in.

> MLRun is for what we call AutoMLOps, where the entire operationalization process is automated. MLRun uses serverless function technology: write the code once, using your preferred development environment and simple “local” semantics, and then run it as-is on different platforms and at scale. MLRun automates the build process, execution, data movement, scaling, versioning, parameterization, output tracking, CI/CD integration, deployment to production, monitoring, and more. MLRun provides an open pluggable architecture, so you have the option to use MLFlow (or any other tool) for the development side, and then use MLRun to automate the production distributed training environment without adding glue logic.  
> [source](https://www.iguazio.com/blog/kubeflow-vs-mlflow-vs-mlrun/)

In [1]:
import mlrun

> 2025-08-07 14:42:52,103 [info] Server and client versions are not the same but compatible: {"parsed_client_version":"Version(major=1, minor=7, patch=2, prerelease=None, build=None)","parsed_server_version":"Version(major=1, minor=9, patch=1, prerelease=None, build=None)"}


In [None]:
# Show the API server URL
mlrun.get_run_db()

In [None]:
# Set the base project name
project_name = "mlrun-demo"

# Initialize the MLRun project object (loads it if it already exists)
project = mlrun.get_or_create_project(
    name=project_name,
    context="./",
    user_project=True)

# user_project=True adds the current username to the project name,
# so read the full name back from the project metadata
project_name = project.metadata.name
print(f"Full project name: {project_name}")

## 1. FeatureSet Ingest

- https://docs.mlrun.org/en/latest/feature-store/feature-sets.html
- https://www.iguazio.com/blog/the-complete-guide-to-using-the-iguazio-feature-store-with-azure-ml-part-2/

In [None]:
import pandas as pd
import mlrun.feature_store as fstore
from mlrun.feature_store import FeatureSet
from mlrun.datastore import ParquetTarget

In [None]:
# read the source data from the CSV file
df_source = pd.read_csv("data/iris.csv")

# Create a string primary key ("id") from the row index;
# the feature set uses it as its entity
df_source.reset_index(drop=False, inplace=True)
df_source.rename(columns={"index": "id"}, inplace=True)
df_source["id"] = df_source["id"].astype(str)

df_source.head()
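
Because `id` becomes the feature set's entity, it is worth confirming that the key is a unique, non-null string before ingesting. The following is a minimal sketch on a small in-memory frame; `df_demo` and its values are invented for illustration and are not taken from `data/iris.csv`.

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents (values invented for illustration)
df_demo = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.0, 6.3],
        "label": [0, 1, 2],
    }
)

# Same key construction as in the cell above:
# turn the positional index into a string "id" column
df_demo = df_demo.reset_index(drop=False).rename(columns={"index": "id"})
df_demo["id"] = df_demo["id"].astype(str)

# Sanity checks on the entity key
key_is_unique = df_demo["id"].is_unique
key_has_nulls = bool(df_demo["id"].isna().any())
key_is_str = all(isinstance(v, str) for v in df_demo["id"])

print(key_is_unique, key_has_nulls, key_is_str)
```

The same three checks can be run against `df_source` right before the ingest step.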

In [None]:
# Create the feature set, keyed by the "id" entity
fs_iris = FeatureSet(name="iris_features",
                     entities=["id"])

# # Add a local Parquet target
# fs_iris.set_targets([ParquetTarget(path=project.artifact_path)], with_defaults=False)

# Ingest the source data
fs_iris.ingest(df_source)
# Module-level equivalent:
# df_iris = fstore.ingest(featureset=fs_iris,
#                         source=df_source)

# Define a feature vector that selects every feature in the set
# and marks "label" as the target column
fv_iris = fstore.FeatureVector(name="iris_vector",
                               features=["iris_features.*"],
                               label_feature="iris_features.label",
                               with_indexes=True)
fv_iris.save()

In [None]:
# # Delete a feature set by name and project
# fstore.delete_feature_set(name="iris_features",
#                           project=project_name,
#                           force=True)


In [None]:
# Retrieve the offline features through the feature vector
print(f"Retrieving offline features from:\n{fv_iris.uri}")

offline_features = fstore.get_feature_vector(fv_iris.uri).get_offline_features()
offline_features.to_dataframe().head()
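
Before training, the retrieved frame usually has to be separated into a feature matrix and a label column. The sketch below shows that step with plain pandas. `df_offline` is a hypothetical stand-in for the frame returned by `to_dataframe()`, with invented values; it assumes the label column is named `label` (as `label_feature` suggests) and that `id` arrives as a regular column, which may differ in a given MLRun version.

```python
import pandas as pd

# Hypothetical stand-in for offline_features.to_dataframe(); values are invented
df_offline = pd.DataFrame(
    {
        "id": ["0", "1", "2", "3"],
        "sepal_length": [5.1, 4.9, 7.0, 6.3],
        "petal_length": [1.4, 1.4, 4.7, 6.0],
        "label": [0, 0, 1, 2],
    }
)

label_column = "label"  # assumed from label_feature="iris_features.label"
entity_column = "id"    # lookup key only, not a model input

# Features: every column except the entity key and the label
X = df_offline.drop(columns=[entity_column, label_column])
y = df_offline[label_column]

print(list(X.columns), y.tolist())
```

If `id` comes back as the DataFrame index instead, only `label_column` needs to be dropped.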

## 2. Register and Run Training

- https://www.iguazio.com/blog/the-complete-guide-to-using-the-iguazio-feature-store-with-azure-ml-part-3/

In [None]:
# create the function for training the model
fn_train = project.set_function(
    func="01_train.py",
    name="train",
    kind="job",
    image="mlrun/mlrun")

In [None]:
# Run the training function; local=True executes it in the current
# environment instead of submitting it as a Kubernetes job
run = fn_train.run(
    inputs={"dataset": fv_iris.uri},
    handler="train_model",
    artifact_path=project.artifact_path,
    local=True)
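
The contents of `01_train.py` are not shown in this notebook. As a rough illustration of the kind of logic a `train_model` handler could wrap, here is a nearest-centroid classifier written with plain pandas. It is a stand-in, not the actual training code: `fit_centroids`, `predict`, and `df_train` are hypothetical names, the values are invented, and the sketch leaves out how the handler turns its `dataset` input into a DataFrame.

```python
import pandas as pd


def fit_centroids(df, label_column="label"):
    """Return one row of per-feature means for each class label."""
    return df.groupby(label_column).mean()


def predict(centroids, row):
    """Return the label whose centroid is closest to `row` (squared Euclidean distance)."""
    distances = ((centroids - row[centroids.columns]) ** 2).sum(axis=1)
    return distances.idxmin()


# Hypothetical training frame (values invented; the "id" key is already dropped)
df_train = pd.DataFrame(
    {
        "sepal_length": [5.0, 5.2, 6.9, 7.1],
        "petal_length": [1.4, 1.5, 4.7, 4.9],
        "label": [0, 0, 1, 1],
    }
)

centroids = fit_centroids(df_train)
features = df_train.drop(columns=["label"])
predictions = [predict(centroids, row) for _, row in features.iterrows()]

# Training-set accuracy, only to show the flow end to end
accuracy = sum(p == t for p, t in zip(predictions, df_train["label"])) / len(df_train)
print(predictions, accuracy)
```

Inside a real handler, a metric like this would typically be reported on the run with `context.log_result("accuracy", accuracy)`, and the fitted model would be logged as an artifact so it can be deployed later.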