# What is MLRun and Why It Matters

MLRun is an open-source MLOps orchestration framework that integrates feature stores, model training, deployment, and monitoring into a single, composable environment. It’s Kubernetes-native and designed for real-time and batch ML pipelines with traceability and governance baked in.

In [1]:
import mlrun

In [2]:
# Show the API server URL
mlrun.get_run_db()

HTTPRunDB('http://dragon:30070')

In [3]:
# Set the base project name
project_name = "mlrun-demo"

# Load the project from the current directory, or create it if it does not exist.
# user_project=True appends the current username to the project name.
project = mlrun.get_or_create_project(
    name=project_name,
    context="./",
    user_project=True,
)

# Read back the resolved project name (base name plus username)
project_name = project.metadata.name
print(f"Full project name: {project_name}")

> 2025-07-09 14:25:30,411 [info] Loading project from path: {"path":"./","project_name":"mlrun-demo","user_project":true}
> 2025-07-09 14:25:30,449 [info] Project loaded successfully: {"path":"./","project_name":"mlrun-demo-johannes","stored_in_db":true}
Full project name: mlrun-demo-johannes
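
The log output shows the effect of `user_project=True`: the base name `mlrun-demo` is resolved to `mlrun-demo-johannes`. The snippet below is a minimal sketch of that naming pattern as it appears in the log, not MLRun's actual implementation. `user_scoped_project_name` is a hypothetical helper, and both the fallback to `getpass.getuser()` and the lowercasing are assumptions made for illustration.

```python
import getpass


def user_scoped_project_name(base_name, username=None):
    """Return '<base_name>-<username>', mirroring the names in the log above.

    Hypothetical helper for illustration; MLRun resolves the user itself.
    """
    if not base_name:
        raise ValueError("a base project name is required")
    # Assumption: fall back to the OS login name when no username is given.
    user = username or getpass.getuser()
    # Assumption: usernames are lowercased before being appended.
    return f"{base_name}-{user.lower()}"


print(user_scoped_project_name("mlrun-demo", "johannes"))
```

Because the resolved name differs per user, later cells should read `project.metadata.name` rather than reuse the base name.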


## 1. FeatureSet Ingest

- https://docs.mlrun.org/en/latest/feature-store/feature-sets.html
- https://www.iguazio.com/blog/the-complete-guide-to-using-the-iguazio-feature-store-with-azure-ml-part-2/

In [4]:
import pandas as pd
import mlrun.feature_store as fstore
from mlrun.feature_store import FeatureSet
from mlrun.datastore import ParquetTarget

In [5]:
# read the source data from the CSV file
df_source = pd.read_csv("data/iris.csv")
df_source.head()

,sepal_length_cm,sepal_width_cm,petal_length_cm,petal_width_cm,target,label
0,5.1,3.5,1.4,0.2,0,setosa
1,4.9,3.0,1.4,0.2,0,setosa
2,4.7,3.2,1.3,0.2,0,setosa
3,4.6,3.1,1.5,0.2,0,setosa
4,5.0,3.6,1.4,0.2,0,setosa


In [6]:
# Create the feature set; the entity is the key column used to index and join rows
fs_iris = FeatureSet(name="iris_features",
                     entities=["sepal_length_cm"])

# # Add a local Parquet target
# fs_iris.set_targets([ParquetTarget(path=project.artifact_path)], with_defaults=False)

# Ingest the source data into the feature set
df_iris = fstore.ingest(featureset=fs_iris,
                        source=df_source)

# Create a feature vector that selects all features and marks `label` as the label column
fv_iris = fstore.FeatureVector(name="iris_vector",
                               features=["iris_features.*"],
                               label_feature="iris_features.label",
                               with_indexes=True)
fv_iris.save()
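
An entity acts as the key of a feature set, so it can help to check how well the chosen column identifies rows before ingesting. The sketch below counts repeated values in a candidate entity column using only the standard library. `duplicate_keys` is a hypothetical helper, the first five rows are copied from the `df_source.head()` output above (two columns only), and the final row is invented purely to show what a key collision looks like.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return {value: count} for key values that occur more than once."""
    counts = Counter(row[key] for row in rows)
    return {value: n for value, n in counts.items() if n > 1}


# First five rows as shown by df_source.head() (two columns only).
rows = [
    {"sepal_length_cm": 5.1, "label": "setosa"},
    {"sepal_length_cm": 4.9, "label": "setosa"},
    {"sepal_length_cm": 4.7, "label": "setosa"},
    {"sepal_length_cm": 4.6, "label": "setosa"},
    {"sepal_length_cm": 5.0, "label": "setosa"},
]
print(duplicate_keys(rows, "sepal_length_cm"))  # no repeats in these five rows

# Hypothetical extra row that reuses an existing key value.
rows.append({"sepal_length_cm": 5.1, "label": "versicolor"})
print(duplicate_keys(rows, "sepal_length_cm"))
```

On the DataFrame itself, `df_source["sepal_length_cm"].is_unique` answers the same question. If the column turns out not to be unique, a dedicated row identifier may be a better entity than a measurement.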



In [7]:
# # Delete a feature set by name and project
# fstore.delete_feature_set(name="iris_features",
#                           project=project_name,
#                           force=True)


In [None]:
# Retrieve the feature vector as an offline dataset
print(f"Retrieving the feature vector from:\n{fv_iris.uri}")

fstore.FeatureVector.get_offline_features(fv_iris.uri).to_dataframe()

Retrieving the feature vector from:
store://feature-vectors/mlrun-demo-johannes/iris_vector


sepal_length_cm,sepal_width_cm,petal_length_cm,petal_width_cm,target,label
5.1,3.5,1.4,0.2,0,setosa
4.9,3.0,1.4,0.2,0,setosa
4.7,3.2,1.3,0.2,0,setosa
4.6,3.1,1.5,0.2,0,setosa
5.0,3.6,1.4,0.2,0,setosa
...,...,...,...,...,...
6.7,3.0,5.2,2.3,2,virginica
6.3,2.5,5.0,1.9,2,virginica
6.5,3.0,5.2,2.0,2,virginica
6.2,3.4,5.4,2.3,2,virginica
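
The URI printed above follows the pattern `store://<kind>/<project>/<name>`. As a rough illustration of how its parts map to the objects created so far, the sketch below splits such a URI with the standard library. `parse_store_uri` is a hypothetical helper, not part of MLRun, and it only handles the plain form shown in this notebook; store URIs may carry extra suffixes (such as a tag) that this sketch does not try to interpret.

```python
from urllib.parse import urlsplit


def parse_store_uri(uri):
    """Split 'store://<kind>/<project>/<name>' into its three parts."""
    parts = urlsplit(uri)
    if parts.scheme != "store":
        raise ValueError(f"not a store URI: {uri!r}")
    # The path looks like '/<project>/<name>'; split on the first slash.
    project, _, name = parts.path.lstrip("/").partition("/")
    if not (parts.netloc and project and name):
        raise ValueError(f"incomplete store URI: {uri!r}")
    return {"kind": parts.netloc, "project": project, "name": name}


info = parse_store_uri("store://feature-vectors/mlrun-demo-johannes/iris_vector")
print(info)
```

Here the kind is `feature-vectors`, the project is the user-scoped project name, and the name matches the `name` passed to `FeatureVector`.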


## 2. Register and Run Training

In [9]:
# Register the training script as a job function in the project
fn_train = project.set_function(
    func="01_train.py",
    name="train",
    kind="job",
    image="mlrun/mlrun",
)

In [None]:
# Run the training handler locally, passing the feature vector as the input dataset
run = fn_train.run(
    inputs={"dataset": fv_iris.uri},
    handler="train_model",
    artifact_path=project.artifact_path,
    local=True,
)

> 2025-07-09 14:25:34,401 [info] Storing function: {"db":"http://dragon:30070","name":"train-train-model","uid":"3fce051dacd641c192eb7dca2ca81cf8"}
s3://mlrun/projects/mlrun-demo-johannes/FeatureStore/iris_features/parquet/sets/iris_features/1752063930692_808/


project,uid,iter,start,state,kind,name,labels,inputs,parameters,results
mlrun-demo-johannes,...a81cf8,0,Jul 09 12:25:34,completed,run,train-train-model,v3io_user=johannes kind=local owner=johannes host=m-vodacom-joaf.lan,dataset,,





> 2025-07-09 14:25:35,950 [info] Run execution finished: {"name":"train-train-model","status":"completed"}
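
The log lines in this notebook share a common shape: `> <timestamp> [<level>] <message>: {<JSON fields>}`. When run output needs to be checked programmatically, for example to confirm the final status, a line can be parsed roughly as follows. This is a sketch based only on the lines shown here, not on a documented log format, and `parse_log_line` is a hypothetical helper.

```python
import json
import re

# Pattern inferred from the log lines in this notebook (an assumption).
LOG_LINE = re.compile(
    r"^> (?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<level>\w+)\] (?P<message>.*?)(?:: (?P<payload>\{.*\}))?$"
)


def parse_log_line(line):
    """Return a dict with timestamp, level, message and fields, or None."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    payload = record.pop("payload")
    # The trailing JSON object is optional; default to no fields.
    record["fields"] = json.loads(payload) if payload else {}
    return record


line = (
    '> 2025-07-09 14:25:35,950 [info] Run execution finished: '
    '{"name":"train-train-model","status":"completed"}'
)
record = parse_log_line(line)
print(record["message"], record["fields"]["status"])
```

Lines that do not match the pattern, such as plain `print` output, return `None` and can be skipped.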
