## MLflow Prompt Management Lab 

## Step 1: Install Required Dependencies
Install the necessary packages for MLflow, pandas, scikit-learn, and pyngrok:

In [1]:
!pip install mlflow pandas scikit-learn pyngrok -q


In [2]:
!pip install langchain-google-genai -q


In [None]:
# Authenticate ngrok with your authtoken (pyngrok was already installed in Step 1)
from pyngrok import ngrok
ngrok.set_auth_token("Your_api_key")  # Replace with your ngrok authtoken
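
Hardcoding the authtoken in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the token has been exported as an environment variable beforehand. The helper `read_secret` and the variable name `NGROK_AUTHTOKEN` are illustrative choices, not part of pyngrok or this lab:

```python
import os


def read_secret(name: str, placeholder: str = "Your_api_key") -> str:
    """Return environment variable `name`, rejecting empty or placeholder values."""
    value = os.environ.get(name, "").strip()
    if not value or value == placeholder:
        raise RuntimeError(f"Set the {name} environment variable before running this cell.")
    return value


# Usage (assumes NGROK_AUTHTOKEN was exported before the notebook started):
# ngrok.set_auth_token(read_secret("NGROK_AUTHTOKEN"))
```

The same helper could be reused for any other key the lab needs, so that no secret is ever written into a cell.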



In [4]:
from pyngrok import ngrok

# Start MLflow server in the background
get_ipython().system_raw("mlflow server --host 127.0.0.1 --port 5000 &")

# Create an ngrok tunnel to the MLflow server
public_url = ngrok.connect(5000, "http")
print(f"MLflow UI is available at: {public_url}")

MLflow UI is available at: NgrokTunnel: "https://c8e918c776c4.ngrok-free.app" -> "http://localhost:5000"
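
Because the server is launched in the background, the tunnel can come up before MLflow is actually listening on port 5000. A small sketch of a readiness check using only the standard library. `wait_for_port` is a hypothetical helper, and the 30-second default timeout is an arbitrary assumption:

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to (host, port) succeeds or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and retry
            time.sleep(interval)
    return False


# Usage: call after launching the server and before opening the tunnel.
# if not wait_for_port("127.0.0.1", 5000):
#     raise RuntimeError("MLflow server did not start listening on port 5000")
```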


## Step 2: Initialize MLflow and Set Up Experiment
Set up MLflow tracking and create an experiment for prompt management:

In [5]:
import mlflow
import mlflow.tracking
import pandas as pd
from datetime import datetime
import os
import subprocess

# Point the MLflow client at the local tracking server started in Step 1
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("Prompt Management Lab")

print("✅ MLflow is ready to manage our prompts!")

2025/07/18 04:24:17 INFO mlflow.tracking.fluent: Experiment with name 'Prompt Management Lab' does not exist. Creating a new experiment.


✅ MLflow is ready to manage our prompts!
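
Stray whitespace or a missing scheme in the tracking URI is an easy mistake to make and can lead to confusing connection errors. A sketch of a defensive check to run before calling `mlflow.set_tracking_uri`. `normalize_tracking_uri` is a hypothetical helper, not an MLflow function, and it only covers HTTP(S) servers like the one used in this lab:

```python
from urllib.parse import urlparse


def normalize_tracking_uri(uri: str) -> str:
    """Strip surrounding whitespace and check that `uri` is an http(s) URL with a host."""
    cleaned = uri.strip()
    parsed = urlparse(cleaned)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid HTTP tracking URI: {uri!r}")
    return cleaned.rstrip("/")


# Usage:
# mlflow.set_tracking_uri(normalize_tracking_uri("http://127.0.0.1:5000"))
```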


## Step 3: Register a Prompt Template
Register your first prompt template in MLflow:

In [8]:
import mlflow

system_prompt = mlflow.genai.register_prompt(
    name="chatbot_prompt",
    template="You are a chatbot that can answer questions about IT. Answer this question: {{question}}",
    commit_message="Initial version of chatbot",
)

2025/07/18 04:31:16 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for prompt version to finish creation. Prompt name: chatbot_prompt, version 1
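
The `{{question}}` placeholder marks where a value is substituted each time the prompt is used. As a rough illustration of that substitution, here is a standalone sketch; it is not MLflow's implementation, and `render_template` is a hypothetical helper:

```python
import re

TEMPLATE = (
    "You are a chatbot that can answer questions about IT. "
    "Answer this question: {{question}}"
)


def render_template(template: str, **values: str) -> str:
    """Replace each {{name}} placeholder with values[name]; fail on missing names."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"Missing value for placeholder: {name}")
        return str(values[name])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


print(render_template(TEMPLATE, question="What is MLflow?"))
# You are a chatbot that can answer questions about IT. Answer this question: What is MLflow?
```

Failing loudly on a missing variable is a deliberate choice in this sketch: a prompt sent to the model with an unfilled placeholder is usually harder to notice than an exception.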


## Step 4: Create LangChain Integration
Convert the MLflow prompt to LangChain format and build a processing chain:

In [None]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI

GEMINI_API_KEY = "Your_api_key"  # Replace with your Gemini API key

# Convert the MLflow prompt ({{question}}) to LangChain's single-brace format ({question})
prompt = ChatPromptTemplate.from_template(system_prompt.to_single_brace_format())
llm = ChatGoogleGenerativeAI(
    model="gemini-2.0-flash",
    temperature=0.7,
    google_api_key=GEMINI_API_KEY,
)

# Build the chain: prompt → LLM → output parser
chain = prompt | llm | StrOutputParser()

# Test the chain
question = "What is MLflow?"
print(chain.invoke({"question": question}))
# MLflow is an open-source platform for mana

MLflow is an open-source platform to manage the complete machine learning (ML) lifecycle. Think of it as a toolkit that helps you develop, track, deploy, and manage your ML models more effectively. It's designed to address the common challenges faced when building and deploying machine learning applications, such as:

*   **Experiment Tracking:** Keeping track of different model versions, parameters, metrics, and artifacts (like code, data, and models) during experimentation.
*   **Reproducibility:** Ensuring that you can reliably recreate your experiments and models later on.
*   **Model Packaging:** Packaging models in a standardized format so they can be deployed to different environments.
*   **Model Deployment:** Deploying models to various platforms, such as cloud services, on-premises servers, or edge devices.
*   **Model Registry:** Providing a central repository to store, version, and manage your trained models.

**Here's a breakdown of the key components of MLflow:**

*   **M
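
LangChain templates use single braces (`{question}`), while the MLflow prompt was registered with double braces (`{{question}}`), which is why the cell above calls `to_single_brace_format()`. The conversion can be approximated as follows. This is a simplified sketch for illustration, not the library's code, and `to_single_braces` is a hypothetical name:

```python
import re


def to_single_braces(template: str) -> str:
    """Rewrite {{name}} placeholders as {name}; all other text is left untouched."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", r"{\1}", template)


mlflow_style = "Answer this question: {{question}}"
langchain_style = to_single_braces(mlflow_style)

print(langchain_style)
# Answer this question: {question}
print(langchain_style.format(question="What is MLflow?"))
# Answer this question: What is MLflow?
```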

## Step 5: Enable Model Tracking and Autologging
Set up automatic tracking for all LLM interactions:

In [10]:
# Set the active model for linking traces
mlflow.set_active_model(name="langchain_model")

# Enable autologging - all traces will be automatically linked to the active model
mlflow.langchain.autolog()

2025/07/18 04:31:33 INFO mlflow.tracking.fluent: LoggedModel with name 'langchain_model' does not exist, creating one...
2025/07/18 04:31:34 INFO mlflow.tracking.fluent: Active model is set to the logged model with ID: m-f72a42242da04d1d9353c81087813274
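
Conceptually, autologging wraps each call so that its inputs, outputs, and timing are recorded without any change to the calling code. The toy decorator below illustrates only that idea. It is not how MLflow implements tracing, and `traced`, `TRACES`, and `fake_chain` are made-up names for this sketch:

```python
import functools
import time

TRACES = []  # in-memory stand-in for a trace store


def traced(func):
    """Record the arguments, result, and duration of every call to `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        TRACES.append(
            {
                "name": func.__name__,
                "request": {"args": args, "kwargs": kwargs},
                "response": result,
                "duration_ms": (time.perf_counter() - start) * 1000,
            }
        )
        return result

    return wrapper


@traced
def fake_chain(inputs: dict) -> str:
    """Placeholder for chain.invoke; returns a canned answer instead of calling an LLM."""
    return f"Answer to: {inputs['question']}"


fake_chain({"question": "What is MLflow?"})
print(len(TRACES), TRACES[0]["response"])  # 1 Answer to: What is MLflow?
```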


## Step 6: Run Multiple Test Questions and Track Results
Execute multiple questions and verify trace tracking:

In [11]:
questions = [
    {"question": "What is MLflow Tracking and how does it work?"},
    {"question": "What is Unity Catalog?"},
    {"question": "What are user-defined functions (UDFs)?"},
]
# Invoke the chain once per question; each call produces one trace
outputs = [chain.invoke(q) for q in questions]

# Verify traces are linked to the active model
active_model_id = mlflow.get_active_model_id()
mlflow.search_traces(model_id=active_model_id)

|   | trace_id | trace | client_request_id | state | request_time | execution_duration | request | response | trace_metadata | tags | spans | assessments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6c254e6432264b589ba095e7b7d0ecb3 | Trace(trace_id=6c254e6432264b589ba095e7b7d0ecb3) |  | TraceState.OK | 1752813110810 | 5120 | {'question': 'What are user-defined functions ... | User-Defined Functions (UDFs) are essentially ... | {'mlflow.user': 'root', 'mlflow.source.git.com... | {'mlflow.artifactLocation': 'mlflow-artifacts:... | [{'trace_id': 'iJFvmjPcKRmyFOjsr8/XsQ==', 'spa... | [] |
| 1 | 2af464e94401400bb95a21fe4e07718b | Trace(trace_id=2af464e94401400bb95a21fe4e07718b) |  | TraceState.OK | 1752813106379 | 4415 | {'question': 'What is Unity Catalog?'} | Unity Catalog is a comprehensive data governan... | {'mlflow.user': 'root', 'mlflow.source.git.com... | {'mlflow.artifactLocation': 'mlflow-artifacts:... | [{'trace_id': 'OR9xmpz3XaWXGYbkydoWNw==', 'spa... | [] |
| 2 | e5b5a50b76124bb0988f0cedce2379ab | Trace(trace_id=e5b5a50b76124bb0988f0cedce2379ab) |  | TraceState.OK | 1752813097503 | 8859 | {'question': 'What is MLflow Tracking and how ... | Okay, here's an explanation of MLflow Tracking... | {'mlflow.user': 'root', 'mlflow.source.git.com... | {'mlflow.artifactLocation': 'mlflow-artifacts:... | [{'trace_id': 'AJ9RlC8XTt2Ehfjjv9oktg==', 'spa... | [] |
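
The `request_time` values look like epoch timestamps in milliseconds, and `execution_duration` appears to be in milliseconds as well; both are interpretations of the values shown above rather than something the output states. Under that assumption, a short standard-library sketch that reorders the three traces chronologically and summarizes their latency, with the values copied from the output:

```python
from datetime import datetime, timedelta, timezone

# (trace_id, request_time in epoch ms, execution_duration in ms), copied from the output above
rows = [
    ("6c254e6432264b589ba095e7b7d0ecb3", 1752813110810, 5120),
    ("2af464e94401400bb95a21fe4e07718b", 1752813106379, 4415),
    ("e5b5a50b76124bb0988f0cedce2379ab", 1752813097503, 8859),
]


def ms_to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime without float rounding."""
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc) + timedelta(milliseconds=ms % 1000)


# The output lists the newest trace first; sort oldest-first for reading
chronological = sorted(rows, key=lambda row: row[1])
for trace_id, request_ms, duration_ms in chronological:
    started = ms_to_utc(request_ms).isoformat(timespec="milliseconds")
    print(trace_id[:8], started, f"{duration_ms} ms")

durations = [duration for _, _, duration in rows]
print(f"mean: {sum(durations) / len(durations):.0f} ms, max: {max(durations)} ms")
```

With the DataFrame itself in hand, the same summary could be computed directly on its `request_time` and `execution_duration` columns.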
