# **1. Introduction to MLOps**
MLOps is a set of best practices for automating and managing the ML lifecycle, including:

* Model versioning and tracking
* Experimentation and reproducibility
* Continuous integration/continuous deployment (CI/CD)
* Model monitoring and performance tracking

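MLflow and LangSmith are two widely used tools for these tasks: MLflow covers experiment tracking and model management, while LangSmith focuses on tracing and evaluating LLM applications.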

# **2. Overview of MLflow**
MLflow is an open-source MLOps platform with four key components:

**a. MLflow Tracking**

* Logs and tracks ML experiments (parameters, metrics, artifacts)
* Supports multiple frameworks (TensorFlow, PyTorch, Scikit-learn)
* Enables experiment comparison and visualization
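
The three kinds of data listed above (parameters, metrics, artifacts) map onto MLflow's logging calls roughly as follows. This is a minimal sketch rather than a canonical recipe: `flatten_params` and `log_run` are hypothetical helper names, the artifact file name is arbitrary, and `mlflow` is imported inside `log_run` only so that the flattening helper can run on its own.

```python
from typing import Any


def flatten_params(config: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten a nested config into dot-separated keys (MLflow params are flat key/value pairs)."""
    flat: dict[str, Any] = {}
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_params(value, name))
        else:
            flat[name] = value
    return flat


def log_run(config: dict[str, Any], metrics: dict[str, float], sample_output: str) -> str:
    """Log one experiment run to the active experiment and return its run ID."""
    import mlflow  # assumption: MLflow is installed; imported lazily on purpose

    with mlflow.start_run() as run:
        mlflow.log_params(flatten_params(config))           # parameters
        mlflow.log_metrics(metrics)                         # metrics
        mlflow.log_text(sample_output, "sample_output.txt")  # artifact
        return run.info.run_id
```

A call such as `log_run({"model": {"name": "gpt-3.5-turbo"}}, {"latency_seconds": 0.8}, "Hello!")` would then show up as one run in the experiment comparison view.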

**b. MLflow Projects**

* Standardizes ML code using a project format (an `MLproject` file plus an environment specification such as conda.yaml)
* Ensures reproducibility across different environments
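
To make the project format concrete, the sketch below writes a small `MLproject` file and shows how such a project might be launched from Python. Everything specific here is an assumption for illustration: the project name `genai_project`, the `run.py` script, the `prompt` parameter, and the helper names are hypothetical, and a real project would also need the referenced `conda.yaml` and `run.py` files.

```python
from pathlib import Path

# Hypothetical project definition: one entry point with one string parameter.
MLPROJECT = """\
name: genai_project
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      prompt: {type: string, default: "Hello, world!"}
    command: "python run.py --prompt {prompt}"
"""


def write_mlproject(project_dir: str) -> Path:
    """Write the MLproject file into project_dir and return its path."""
    path = Path(project_dir) / "MLproject"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(MLPROJECT)
    return path


def run_project(project_dir: str, prompt: str):
    """Launch the project's main entry point (requires conda.yaml and run.py to exist)."""
    import mlflow  # assumption: MLflow is installed; imported lazily on purpose

    return mlflow.run(uri=project_dir, entry_point="main", parameters={"prompt": prompt})
```

Because the environment and the command live in the project itself, the same `run_project` call should behave the same way on another machine.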

**c. MLflow Models**

* Packages models for deployment in a standard format (an `MLmodel` file describing one or more "flavors") that downstream tools can load
* Supports multiple deployment targets (Docker, Kubernetes, cloud services)

**d. MLflow Model Registry**

* Manages and version-controls models
* Provides model lifecycle stages: Staging, Production, and Archived (recent MLflow releases deprecate stages in favor of model version aliases)
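
The registry workflow described above can be sketched as a register-then-promote helper. This is an illustration under assumptions, not the only way to do it: `normalize_stage` and `register_and_promote` are hypothetical names, and because stage transitions are deprecated in recent MLflow releases, an alias-based call such as `MlflowClient.set_registered_model_alias` may be the better fit depending on the installed version.

```python
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def normalize_stage(stage: str) -> str:
    """Return the canonical stage name, or raise ValueError for an unknown stage."""
    for valid in VALID_STAGES:
        if stage.lower() == valid.lower():
            return valid
    raise ValueError(f"Unknown stage {stage!r}; expected one of {VALID_STAGES}")


def register_and_promote(model_uri: str, name: str, stage: str) -> str:
    """Register the model at model_uri under name and move the new version to stage."""
    target = normalize_stage(stage)  # validate before touching the registry

    # Assumption: MLflow is installed and a registry-capable backend is configured.
    import mlflow
    from mlflow.tracking import MlflowClient

    version = mlflow.register_model(model_uri, name).version
    MlflowClient().transition_model_version_stage(name=name, version=version, stage=target)
    return version
```

Here `model_uri` would typically point at a logged model, for example a `runs:/<run_id>/model` URI.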

# **3. Introduction to LangSmith**
LangSmith is a debugging, monitoring, and evaluation platform designed specifically for large language model (LLM) applications. It helps with:

  * Tracing, debugging, and visualizing LLM execution paths
  * Evaluating performance using metrics and logs
  * Monitoring and improving GenAI models

LangSmith is useful for GenAI applications such as:

  * LLM-based chatbots
  * AI-driven content generation
  * Text summarization and retrieval-augmented generation (RAG)

# **4. Implementing MLOps for GenAI with MLflow and LangSmith**

In [3]:
!pip install mlflow -q


In [4]:
import mlflow
import subprocess

In [5]:
# Define the MLflow tracking URI with SQLite
MLFLOW_TRACKING_URI = "sqlite:///mlflow.db"

# Start the MLflow server using subprocess
subprocess.Popen(["mlflow", "ui", "--backend-store-uri", MLFLOW_TRACKING_URI, "--port", "5000"])

<Popen: returncode: None args: ['mlflow', 'ui', '--backend-store-uri', 'sqli...>

In [6]:
# Set MLflow tracking URI
mlflow.set_tracking_uri(MLFLOW_TRACKING_URI)
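
Because the UI process was launched in the background with `subprocess.Popen`, it may not be listening yet when the next cell runs. Logging still works, since the client writes to the SQLite store directly, but opening the UI too early can fail. The sketch below polls the UI address until it answers; `wait_for_server` is a hypothetical helper, and the URL assumes the `--port 5000` setting used earlier.

```python
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll url until it answers over HTTP; return False if timeout seconds pass first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            return True  # the server answered, even if with an error status
        except OSError:
            time.sleep(interval)  # not listening yet (connection refused or timed out)
    return False
```

For example, `wait_for_server("http://127.0.0.1:5000")` returns `True` once the UI responds and `False` if it never comes up within the timeout.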

In [7]:
# Set or create an experiment
mlflow.set_experiment("GenAI Experiment")

2025/01/30 10:33:05 INFO mlflow.tracking.fluent: Experiment with name 'GenAI Experiment' does not exist. Creating a new experiment.


<Experiment: artifact_location='/content/mlruns/1', creation_time=1738233185029, experiment_id='1', last_update_time=1738233185029, lifecycle_stage='active', name='GenAI Experiment', tags={}>

In [8]:
!pip install langsmith openai -q

In [12]:
from google.colab import userdata

# Read the API keys from Colab secrets (the secret names must match those saved in Colab)
openai_api = userdata.get("OPENAI_API_KEY")
langsmith_api = userdata.get("langsmith_api")

In [13]:
import os

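# LangSmith configuration: enable tracing and set the project that receives the traces
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGSMITH_API_KEY"] = langsmith_api
os.environ["LANGSMITH_PROJECT"] = "genai_exp"

# OpenAI key, picked up by the OpenAI client when it is created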
os.environ["OPENAI_API_KEY"] = openai_api
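
A missing or empty key tends to surface later as a confusing authentication error, so it can help to check the environment right after setting it. The helper below is a small sketch of that check; `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, and the list of required variables is an assumption based on the settings above.

```python
import os

REQUIRED_VARS = ("LANGSMITH_API_KEY", "LANGSMITH_PROJECT", "OPENAI_API_KEY")


def missing_env_vars(names=REQUIRED_VARS, environ=None) -> list[str]:
    """Return the names that are unset or empty in environ (defaults to os.environ)."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]
```

Calling `missing_env_vars()` and raising an error when the returned list is non-empty makes the notebook fail early with a clear message.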


In [14]:
import openai
from langsmith.wrappers import wrap_openai
from langsmith import traceable

# Wrap OpenAI client with LangSmith tracking
client = wrap_openai(openai.Client())

In [15]:
@traceable  # Automatically trace this function
def pipeline(user_input: str):
    result = client.chat.completions.create(
        messages=[{"role": "user", "content": user_input}],
        model="gpt-3.5-turbo"  # Specify the model
    )
    return result.choices[0].message.content


In [16]:
response = pipeline("Hello, world!")
print(response)

Hello! How can I assist you today?
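
The cells above send traces to LangSmith, but nothing is recorded in the MLflow experiment created earlier. One possible way to connect the two is to wrap each call in an MLflow run that stores the prompt, the response, and a latency metric. This is a sketch under assumptions: `timed_call` and `log_generation` are hypothetical helpers, and the run, metric, and artifact names are arbitrary choices.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def log_generation(prompt: str, generate, model_name: str) -> str:
    """Run generate(prompt) inside an MLflow run and log the prompt, response, and latency."""
    import mlflow  # assumption: MLflow is installed and an experiment is already set

    with mlflow.start_run(run_name="pipeline-call"):
        mlflow.log_param("model", model_name)
        response, latency = timed_call(generate, prompt)
        mlflow.log_metric("latency_seconds", latency)
        mlflow.log_metric("response_chars", len(response))
        mlflow.log_text(prompt, "prompt.txt")
        mlflow.log_text(response, "response.txt")
    return response
```

With the notebook's `pipeline` function, `log_generation("Hello, world!", pipeline, "gpt-3.5-turbo")` would produce a LangSmith trace (through `@traceable`) and an MLflow run for the same call.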
