
---

## 🔹 1. ⚙️ **MLflow Tracking** (🧠 Core Component)

---

### 📌 **What It Does**

MLflow Tracking records each experiment run in a GenAI or agentic AI workflow: the prompts, LLM settings, performance metrics, and intermediate outputs, so that runs can be compared later.

---

### 🚀 **Common Use in GenAI/Agentic AI**

| Scenario                    | How MLflow Helps                                                        |
| --------------------------- | ----------------------------------------------------------------------- |
| Prompt Engineering & Tuning | Log temperature, top\_p, and stop sequences as run parameters           |
| Agent Workflow Logging      | Store each tool call, response, and retry attempt                       |
| LLM Evaluation              | Record BLEU, ROUGE, accuracy, latency, and bias or hallucination scores |
| Fine-tuning LLMs            | Compare multiple training runs with different configs                   |

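To make the LLM Evaluation row concrete, here is a minimal sketch that reduces per-example results to a flat metrics dict and logs it in one call. The record fields (`expected`, `answer`, `latency_ms`), the helper names, and the use of exact-match accuracy as a stand-in for BLEU or ROUGE are all assumptions for illustration; only `mlflow.start_run` and `mlflow.log_metrics` are MLflow calls.

```python
from statistics import mean


def summarize_eval(records):
    """Reduce per-example results to a flat dict of numeric metrics.

    Each record is assumed to be a dict with the hypothetical keys
    'expected', 'answer', and 'latency_ms'.
    """
    if not records:
        raise ValueError("records must not be empty")
    # Case- and whitespace-insensitive exact match, one flag per example.
    hits = [
        r["expected"].strip().lower() == r["answer"].strip().lower()
        for r in records
    ]
    latencies = [r["latency_ms"] for r in records]
    return {
        "exact_match": mean(1.0 if hit else 0.0 for hit in hits),
        "latency_ms_mean": float(mean(latencies)),
        "latency_ms_max": float(max(latencies)),
    }


def log_eval_summary(records, run_name="llm_eval"):
    """Log the summary to a new MLflow run (requires the mlflow package)."""
    import mlflow  # imported lazily so summarize_eval works without MLflow

    with mlflow.start_run(run_name=run_name):
        mlflow.log_metrics(summarize_eval(records))
```
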
---

### ⚙️ **Key Functions with Usage**

| Function                 | Description                                                                | Example Code                                            |
| ------------------------ | -------------------------------------------------------------------------- | ------------------------------------------------------- |
| `mlflow.start_run()`     | Start an experiment run (works as a `with` context manager)                | `mlflow.start_run(run_name="gpt4o_eval")`               |
| `mlflow.log_params()`    | Log a dict of hyperparameters (e.g., temperature, top\_p, retriever\_type) | `mlflow.log_params({"temperature": 0.7, "top_p": 0.9})` |
| `mlflow.log_metrics()`   | Log a dict of numeric metrics such as accuracy, BLEU, or latency           | `mlflow.log_metrics({"BLEU": 0.72, "latency": 102})`    |
| `mlflow.log_artifacts()` | Save a directory of artifacts: prompt templates, tokenizer files, configs  | `mlflow.log_artifacts("./outputs/prompts")`             |
| `mlflow.get_run()`       | Retrieve the metadata, params, and metrics of a specific run               | `mlflow.get_run(run_id="12345abcde")`                   |

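One practical wrinkle with `mlflow.log_params()`: it expects flat key/value pairs, so a nested dict value ends up stringified as one value. A small flattening helper keeps nested chain or agent configs readable as individual params. The sketch below is illustrative; `flatten_params` and `log_config_and_read_back` are hypothetical helper names, not part of MLflow.

```python
def flatten_params(config, parent_key="", sep="."):
    """Flatten a nested dict into {'a.b': value} pairs for log_params."""
    flat = {}
    for key, value in config.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections such as {"llm": {...}}.
            flat.update(flatten_params(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def log_config_and_read_back(config, run_name="config_demo"):
    """Log a nested config as params, then fetch the stored values again."""
    import mlflow  # imported lazily so flatten_params works without MLflow

    with mlflow.start_run(run_name=run_name) as run:
        mlflow.log_params(flatten_params(config))
    # Stored params come back as strings, whatever type was logged.
    return mlflow.get_run(run.info.run_id).data.params
```
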
---


### 🧠 Tip for Agent Workflows

Call MLflow from custom LangGraph step nodes or LangChain callbacks to log every chain, retriever, and tool interaction as it happens. This is most useful for long-running agents, where a single end-of-run metric hides which step was slow or failed.
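
One way to apply this tip without tying the code to a particular framework version is to wrap each step function in a small decorator. The sketch below assumes that a LangGraph node or tool is a plain Python callable; `tracked_step` and `log_step_records` are hypothetical helpers, and the metric and artifact names are arbitrary choices.

```python
import functools
import time


def tracked_step(name, records):
    """Decorator that appends one record per call of the wrapped step.

    `records` is a plain list owned by the caller; nothing here is an
    MLflow or LangGraph API.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            ok = False
            try:
                result = func(*args, **kwargs)
                ok = True
                return result
            finally:
                # Runs on success and on failure, so retries stay visible.
                records.append({
                    "step": name,
                    "ok": ok,
                    "latency_ms": (time.perf_counter() - start) * 1000.0,
                })
        return wrapper
    return decorator


def log_step_records(records):
    """Send collected records to the active MLflow run (requires mlflow)."""
    import mlflow  # imported lazily so tracked_step works without MLflow

    for index, record in enumerate(records):
        # step=index keeps the call order on the metric's x-axis.
        mlflow.log_metric(
            f"{record['step']}_latency_ms", record["latency_ms"], step=index
        )
    # Keep the full trace, including failed attempts, as a JSON artifact.
    mlflow.log_dict({"steps": records}, "agent/steps.json")
```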

---

### 📦 Artifacts Example

| Artifact       | What to Store               | Why It Matters                             |
| -------------- | --------------------------- | ------------------------------------------ |
| `prompts/`     | Prompt templates (JSON/txt) | For versioning and fine-tuning comparisons |
| `configs.yaml` | Chain/agent configuration   | For reproducibility                        |
| `response.txt` | Output from LLM             | Evaluation, audit, or feedback loops       |

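The layout in the table can be staged in a local directory and uploaded in one call. This is a sketch under assumptions: `stage_artifacts` and `log_staged_artifacts` are hypothetical helpers, and the config is passed in as already-serialized YAML text so that no YAML library is required.

```python
from pathlib import Path


def stage_artifacts(root, prompts, config_yaml, response):
    """Write the three artifact types from the table under `root`.

    prompts: dict mapping a file name to prompt template text.
    config_yaml: already-serialized YAML text for the chain/agent config.
    response: raw LLM output to keep for evaluation or audit.
    """
    root = Path(root)
    prompt_dir = root / "prompts"
    prompt_dir.mkdir(parents=True, exist_ok=True)
    for file_name, template in prompts.items():
        (prompt_dir / file_name).write_text(template, encoding="utf-8")
    (root / "configs.yaml").write_text(config_yaml, encoding="utf-8")
    (root / "response.txt").write_text(response, encoding="utf-8")
    return root


def log_staged_artifacts(root, run_name="artifact_demo"):
    """Upload the staged directory to a new MLflow run (requires mlflow)."""
    import mlflow  # imported lazily so stage_artifacts works without MLflow

    with mlflow.start_run(run_name=run_name):
        # log_artifacts uploads the whole directory and keeps its layout.
        mlflow.log_artifacts(str(root))
```
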
---

### 🧩 Visualization Concept (Cheat Sheet)

```
[Start Run] → [Log Params] → [Run Model/Agent] → [Log Metrics + Artifacts] → [End Run]
```

🟢 Run `mlflow ui` to open the tracking dashboard (served at `http://127.0.0.1:5000` by default) and compare LLM runs side by side.

---


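### ✅ End-to-End LangChain Example

```python
import time

import mlflow
# ChatOpenAI now lives in the langchain-openai package; it reads OPENAI_API_KEY
from langchain_openai import ChatOpenAI

# Start a run; the context manager ends it even if the LLM call fails
with mlflow.start_run(run_name="retrieval-qa-agent"):

    # Log parameters
    mlflow.log_params({
        "model_name": "gpt-4o",
        "temperature": 0.2,
        "retriever": "Chroma",
    })

    # Your LangChain logic, timed so the metric reflects the real call
    llm = ChatOpenAI(model="gpt-4o", temperature=0.2)
    start = time.perf_counter()
    result = llm.invoke("Explain LangGraph")
    response_time = time.perf_counter() - start

    # Log the measured latency (in seconds) and the model output
    mlflow.log_metrics({"response_time": response_time})
    with open("response.txt", "w", encoding="utf-8") as f:
        f.write(result.content)
    mlflow.log_artifact("response.txt")
```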

