# MLflow on ProKube Platform - MLflow with Images

This notebook demonstrates how to use MLflow on the ProKube platform with Personal Access Token (PAT) authentication, using a slightly more complex example than the Quick Start: in addition to parameters, metrics, and a model, it logs figures (images) as run artifacts.

## Prerequisites

- Generate your Personal Access Token:
  - Open the [MLflow UI](/mlflow) in your browser
  - Navigate to the [Permissions](/mlflow/oidc/ui/#) page
  - Click the "Create access key" button
  - Copy the generated token and store it securely
  - Note: The token is shown only once; you won't be able to see it again!

- Configure the credentials below with your values

## Authentication

The ProKube MLflow setup uses OIDC authentication with PAT support for programmatic access.
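
For programmatic access, the MLflow client reads `MLFLOW_TRACKING_USERNAME` and `MLFLOW_TRACKING_PASSWORD` and sends them with each request using HTTP Basic authentication. The following is a minimal sketch of how such a header is formed, assuming the ProKube server accepts the PAT in the password position. The helper name `basic_auth_header` and the sample values are illustrative only; MLflow builds this header for you, so you never need to do this by hand.

```python
import base64


def basic_auth_header(username: str, token: str) -> dict:
    """Build an HTTP Basic Authorization header (illustrative helper, not part of MLflow)."""
    # Basic auth joins the two values with a colon and base64-encodes the result.
    raw = f"{username}:{token}".encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Hypothetical values for illustration only
header = basic_auth_header("jane.doe@example.com", "example-token")
print(header["Authorization"].split()[0])  # prints the scheme: Basic
```

Because the token travels with every request, treat it like a password and avoid committing notebooks that contain it.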


In [None]:
import os

# Set your MLflow credentials
os.environ['MLFLOW_TRACKING_URI'] = 'https://<your-domain>/mlflow'
os.environ['MLFLOW_TRACKING_USERNAME'] = '<your-email>'  
os.environ['MLFLOW_TRACKING_PASSWORD'] = '<your-token>'
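
Before running the training cell, it can help to confirm that the placeholders were actually replaced. A small sketch, assuming the placeholder convention used above (values wrapped in `<...>`); the function name `check_mlflow_env` is hypothetical and not part of MLflow:

```python
import os

REQUIRED_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def check_mlflow_env(env=None):
    """Return the names of variables that are missing or still contain a <placeholder>."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        # An empty value or leftover angle brackets means the variable was not filled in.
        if not value or ("<" in value and ">" in value):
            problems.append(name)
    return problems


problems = check_mlflow_env()
if problems:
    print("Please set real values for:", ", ".join(problems))
else:
    print("MLflow credentials look configured.")
```

This only checks that values are present; it does not verify that the server accepts them.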

In [None]:
import mlflow
from mlflow.models import infer_signature
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
import matplotlib.pyplot as plt
import pandas as pd

# Each user should use their own experiment and model name to avoid conflicts
username = os.getenv('MLFLOW_TRACKING_USERNAME').split('@')[0]

# Set experiment - the username suffix gives each user a separate experiment
mlflow.set_experiment(f"MLflow Images {username}")

# Load and prepare data
X, y = datasets.load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model with MLflow tracking
with mlflow.start_run():
    # Configure and train model
    model = LogisticRegression(max_iter=200, solver='lbfgs')
    model.fit(X_train, y_train)
    
    # Make predictions
    y_pred = model.predict(X_test)
    accuracy = accuracy_score(y_test, y_pred)
    
    # Log parameters and metrics
    mlflow.log_params({
        "solver": "lbfgs",
        "max_iter": 200,
        "test_size": 0.2
    })
    mlflow.log_metric("accuracy", accuracy)
    
    # Log model with signature and example
    signature = infer_signature(X_train, model.predict(X_train))
    
    # Register model - NOTE: Model names must be globally unique in MLflow!
    # The username suffix (derived at the top of this cell) keeps each user's model name distinct
    model_info = mlflow.sklearn.log_model(
        model, 
        "iris_model",
        signature=signature,
        input_example=X_train[:5],
        registered_model_name=f"iris-model-{username}"
    )
    
    # Create and log confusion matrix
    cm = confusion_matrix(y_test, y_pred)
    fig, ax = plt.subplots(figsize=(6, 6))
    im = ax.imshow(cm, interpolation='nearest', cmap='Blues')
    ax.figure.colorbar(im, ax=ax)
    
    # Add value annotations
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, str(cm[i, j]), ha="center", va="center",
                   color="white" if cm[i, j] > cm.max() / 2 else "black")
    
    ax.set(title='Confusion Matrix', ylabel='True label', xlabel='Predicted label')
    plt.tight_layout()
    mlflow.log_figure(fig, "confusion_matrix.png")
    plt.close(fig)
    
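    # Log feature importance: model.coef_ has one row per class, so this plots
    # the coefficients of the first class (setosa) only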
    feature_names = datasets.load_iris().feature_names
    importance_df = pd.DataFrame({
        'feature': feature_names,
        'coefficient': model.coef_[0]
    }).sort_values('coefficient', ascending=False)
    
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.barh(importance_df['feature'], importance_df['coefficient'])
    ax.set(title='Feature Importance', xlabel='Coefficient Value')
    plt.tight_layout()
    mlflow.log_figure(fig, "feature_importance.png")
    plt.close(fig)
    
    # Log dataset info
    mlflow.log_dict({
        "dataset": "iris",
        "n_samples_train": len(X_train),
        "n_samples_test": len(X_test),
        "n_features": X_train.shape[1],
        "classes": ["setosa", "versicolor", "virginica"]
    }, "dataset_info.json")



After a successful run, you should see direct links to your experiment and run in the cell output above.
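
Once the run has finished, the registered model can be loaded back for inference. The sketch below assumes the model name pattern from the training cell (`iris-model-<username>`) and uses the `models:/<name>/latest` URI form. The helper names are hypothetical, and `load_latest_model` needs a reachable tracking server with the credentials configured as above.

```python
def registered_model_name(email: str) -> str:
    """Mirror the naming used in the training cell: iris-model-<part before '@'>."""
    return f"iris-model-{email.split('@')[0]}"


def latest_model_uri(email: str) -> str:
    """Build a model registry URI that points at the latest registered version."""
    return f"models:/{registered_model_name(email)}/latest"


def load_latest_model(email: str):
    """Load the latest registered version as a generic pyfunc model (requires server access)."""
    import mlflow.pyfunc  # imported lazily so the helpers above work without MLflow

    return mlflow.pyfunc.load_model(latest_model_uri(email))


# Hypothetical email for illustration only
print(latest_model_uri("jane.doe@example.com"))  # prints models:/iris-model-jane.doe/latest

# Example usage in the notebook, after the training cell has run:
# loaded = load_latest_model(os.environ["MLFLOW_TRACKING_USERNAME"])
# print(loaded.predict(X_test[:5]))
```

Loading through `mlflow.pyfunc` keeps the calling code independent of scikit-learn; the logged figures and `dataset_info.json` remain available under the run's artifacts in the MLflow UI.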