<a href="https://colab.research.google.com/github/Bluedata-Consulting/GAAPB01-training-code-base/blob/main/Concept_langfuse_prompt_management_version_Control.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Prompt Versioning, Testing & Tracing with **Langfuse**
_A hands‑on guide for Generative AI Architects (local deployment)_

---
## 1  Why Langfuse?
- **Prompt management** – track versions, diff them, and roll back.
- **Prompt testing** – automated checks plus human or LLM‑based evaluations.
- **Tracing** – per‑call observability of inputs, outputs, latency, and token usage.


## 2  Spin up Langfuse locally
```bash
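git clone https://github.com/langfuse/langfuse.git && cd langfuse   # the repo ships docker-compose.yml
docker compose up -d
open http://localhost:3000   # admin / admin (`open` is macOS; use xdg-open on Linux)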
```


In [31]:
!pip install langfuse openai --quiet

In [32]:
import os
os.environ["LANGFUSE_PUBLIC_KEY"] = "pk-lf-xxxxxxxxxxx"
os.environ["LANGFUSE_SECRET_KEY"] = "sk-lf-xxxxxxxxxxxxxxxxxxxx"
os.environ["LANGFUSE_HOST"] = "https://us.cloud.langfuse.com"  # Langfuse Cloud (US); use "http://localhost:3000" for the local deployment from section 2
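
Before creating the client, it can help to fail fast when a credential is missing. The following is a minimal sketch: `missing_langfuse_settings` is a hypothetical helper, not part of the Langfuse SDK, and it only inspects the three environment variables set above.

```python
import os

# The three settings configured in the cell above.
REQUIRED_VARS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY", "LANGFUSE_HOST")

def missing_langfuse_settings(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Use an explicit mapping so the example does not depend on the real environment.
example_env = {
    "LANGFUSE_PUBLIC_KEY": "pk-lf-xxxxxxxxxxx",
    "LANGFUSE_HOST": "http://localhost:3000",
}
print(missing_langfuse_settings(example_env))  # ['LANGFUSE_SECRET_KEY']
```

Calling `missing_langfuse_settings()` with no argument checks the live process environment instead.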

## 3  Prompt Creation via Python SDK

In [33]:
from langfuse import Langfuse
import os
import textwrap

# Azure OpenAI credentials; the AzureOpenAI client in section 4 reads these from the environment.
os.environ['AZURE_OPENAI_ENDPOINT'] = "https://azure-openai-may2-25.openai.azure.com"
os.environ['AZURE_OPENAI_API_KEY'] = "xxxxxxxxxxxxxxxxx"

lf = Langfuse(public_key=os.getenv("LANGFUSE_PUBLIC_KEY"),
              secret_key=os.getenv("LANGFUSE_SECRET_KEY"),
              host=os.getenv("LANGFUSE_HOST", "http://localhost:3000"))

In [34]:
# Create a text prompt
textprompt = lf.create_prompt(
    name="movie-critic",
    type="text",
    prompt="As a {{criticlevel}} movie critic, do you like {{movie}}?",
    labels=["production"],  # directly promote to production
    config={
        "model": "gpt-4o",
        "temperature": 0.7,
        "supported_languages": ["en", "fr"],
    },  # optionally, add configs (e.g. model parameters or model tools) or tags
)

print("Prompt:", textprompt.name, "Version:", textprompt.version)


Prompt: movie-critic Version: 3


In [35]:
print("Prompt:", textprompt.prompt)

Prompt: As a {{criticlevel}} movie critic, do you like {{movie}}?
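
The `{{criticlevel}}` and `{{movie}}` placeholders are filled in at request time; in the SDK that is the job of `prompt.compile(...)`, which section 4 uses. The sketch below imitates the substitution in plain Python to make the mechanics concrete. `render_template` is a hypothetical stand-in, not Langfuse's implementation, and leaving unknown placeholders untouched is an assumption of this sketch.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render_template(template, **variables):
    """Replace each {{name}} with its value; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        return str(variables[name]) if name in variables else match.group(0)
    return PLACEHOLDER.sub(substitute, template)

template = "As a {{criticlevel}} movie critic, do you like {{movie}}?"
full = render_template(template, criticlevel="expert", movie="Dune 2")
partial = render_template(template, movie="Dune 2")
print(full)     # As a expert movie critic, do you like Dune 2?
print(partial)  # As a {{criticlevel}} movie critic, do you like Dune 2?
```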


In [36]:
# Create a chat prompt
chatprompt = lf.create_prompt(
    name="movie-critic-chat",
    type="chat",
    prompt=[
      { "role": "system", "content": "You are an {{criticlevel}} movie critic" },
      { "role": "user", "content": "Do you like {{movie}}?" },
    ],
    labels=["production"],  # directly promote to production
    config={
        "model": "gpt-4o",
        "temperature": 0.7,
        "supported_languages": ["en", "fr"],
    },  # optionally, add configs (e.g. model parameters or model tools) or tags
)

print("Prompt:", chatprompt.name, "Version:", chatprompt.version)

Prompt: movie-critic-chat Version: 2


In [37]:
print("Prompt:", chatprompt.prompt)

Prompt: [{'role': 'system', 'content': 'You are an {{criticlevel}} movie critic'}, {'role': 'user', 'content': 'Do you like {{movie}}?'}]
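
A chat prompt is stored as a list of role/content messages, and each message carries its own placeholders. As a rough illustration of what compiling such a prompt involves, the sketch below fills the placeholders message by message and returns a new list. `render_chat` is a hypothetical helper written for this example, not the SDK's code.

```python
import re

def render_chat(messages, **variables):
    """Return new messages with {{name}} placeholders filled in each content string."""
    def fill(text):
        return re.sub(
            r"\{\{\s*(\w+)\s*\}\}",
            # Unknown names are left as-is (an assumption of this sketch).
            lambda m: str(variables.get(m.group(1), m.group(0))),
            text,
        )
    return [{"role": m["role"], "content": fill(m["content"])} for m in messages]

chat_template = [
    {"role": "system", "content": "You are an {{criticlevel}} movie critic"},
    {"role": "user", "content": "Do you like {{movie}}?"},
]
rendered = render_chat(chat_template, criticlevel="expert", movie="Dune 2")
for message in rendered:
    print(message["role"], "->", message["content"])
```

The result has the same shape as the `messages` argument of a chat completion call, so it can be passed on directly.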


In [38]:
prompt_text = textwrap.dedent("\nYou are an elite **Project Management Advisor** tasked with producing a _comprehensive_ delivery plan.\n\n**Project\u00a0Name:** {{project_name}}  \n**Project\u00a0Scope\u00a0(1\u00a0sentence):** {{project_scope}}\n\n---\n\n## Required Output (markdown):\n\n1. **Executive Summary** \u2013 3\u00a0bullet points.  \n2. **Detailed Work\u2011breakdown Structure (WBS)**  \n   - Use a markdown table with columns: *Work\u2011stream*, *Tasks*, *Owner*, *Start*, *End*.  \n   - At least 6\u00a0work\u2011streams, each with 3\u20115\u00a0tasks.\n3. **Milestones** \u2013 list every high\u2011level milestone in the format `YYYY\u2011MM\u2011DD\u00a0\u2013\u00a0Milestone name`.  \n4. **Risk Register** \u2013 another markdown table with *Risk*, *Impact (1\u20115)*, *Probability\u00a0(%) *, *Mitigation*.  \n5. **Stakeholder\u00a0Map** \u2013 classify the following stakeholders ({{stakeholder_list}}) under *Manage\u00a0Closely*, *Keep\u00a0Satisfied*, *Keep\u00a0Informed*, *Monitor*.  \n6. **Budget\u00a0Summary** \u2013 break down the total budget **{{budget}}** into at least 4\u00a0cost lines (**%** of total).\n7. **Timeline Visualization** \u2013 an ASCII Gantt chart up to **deadline\u00a0{{deadline}}**.\n8. End with an encouraging emoji.\n\n**Rules**\n\n- Sections must appear in order above.  \n- Do **not** hallucinate dates beyond the supplied deadline.  \n- Keep answer under 2000\u00a0tokens.\n")

prompt_v1 = lf.create_prompt(
    name="project-plan-generator",
    type="text",
    prompt=prompt_text,
    config={"model": "gpt-4o-mini", "temperature": 0.2},
    labels=["production"],  # directly promote to production
)

print("Prompt Name:", prompt_v1.name, "Version:", prompt_v1.version)




Prompt Name: project-plan-generator Version: 3


In [39]:
print(prompt_v1.prompt)


You are an elite **Project Management Advisor** tasked with producing a _comprehensive_ delivery plan.

**Project Name:** {{project_name}}  
**Project Scope (1 sentence):** {{project_scope}}

---

## Required Output (markdown):

1. **Executive Summary** – 3 bullet points.  
2. **Detailed Work‑breakdown Structure (WBS)**  
   - Use a markdown table with columns: *Work‑stream*, *Tasks*, *Owner*, *Start*, *End*.  
   - At least 6 work‑streams, each with 3‑5 tasks.
3. **Milestones** – list every high‑level milestone in the format `YYYY‑MM‑DD – Milestone name`.  
4. **Risk Register** – another markdown table with *Risk*, *Impact (1‑5)*, *Probability (%) *, *Mitigation*.  
5. **Stakeholder Map** – classify the following stakeholders ({{stakeholder_list}}) under *Manage Closely*, *Keep Satisfied*, *Keep Informed*, *Monitor*.  
6. **Budget Summary** – break down the total budget **{{budget}}** into at least 4 cost lines (**%** of total).
7. **Timeline Visualization** – an ASCII Gantt chart up

## Fetch prompt from Langfuse Prompt Hub

In [40]:
prompt = lf.get_prompt("project-plan-generator")
print(prompt.prompt)


You are an elite **Project Management Advisor** tasked with producing a _comprehensive_ delivery plan.

**Project Name:** {{project_name}}  
**Project Scope (1 sentence):** {{project_scope}}

---

## Required Output (markdown):

1. **Executive Summary** – 3 bullet points.  
2. **Detailed Work‑breakdown Structure (WBS)**  
   - Use a markdown table with columns: *Work‑stream*, *Tasks*, *Owner*, *Start*, *End*.  
   - At least 6 work‑streams, each with 3‑5 tasks.
3. **Milestones** – list every high‑level milestone in the format `YYYY‑MM‑DD – Milestone name`.  
4. **Risk Register** – another markdown table with *Risk*, *Impact (1‑5)*, *Probability (%) *, *Mitigation*.  
5. **Stakeholder Map** – classify the following stakeholders ({{stakeholder_list}}) under *Manage Closely*, *Keep Satisfied*, *Keep Informed*, *Monitor*.  
6. **Budget Summary** – break down the total budget **{{budget}}** into at least 4 cost lines (**%** of total).
7. **Timeline Visualization** – an ASCII Gantt chart up

In [41]:
prompt.config

{'model': 'gpt-4o-mini', 'temperature': 0.2}
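
The `config` dictionary is free-form, so it can mix request parameters with your own metadata (the movie-critic prompts store `supported_languages` next to `temperature`). One way to use it at call time is to pass through only the keys the chat API understands. This is a sketch under that assumption: `REQUEST_KEYS` is a hand-picked allow-list and `request_kwargs` is a hypothetical helper.

```python
# Keys forwarded to the chat completion call; everything else is treated as metadata.
REQUEST_KEYS = {"temperature", "top_p", "max_tokens"}

def request_kwargs(config, defaults=None):
    """Merge defaults with the allow-listed entries of a prompt config."""
    merged = dict(defaults or {})
    merged.update({key: value for key, value in config.items() if key in REQUEST_KEYS})
    return merged

config = {"model": "gpt-4o", "temperature": 0.7, "supported_languages": ["en", "fr"]}
kwargs = request_kwargs(config, defaults={"max_tokens": 2000})
print(kwargs)  # {'max_tokens': 2000, 'temperature': 0.7}
```

`model` is deliberately left out of the allow-list here because the Azure client in section 4 addresses a deployment name rather than the model name stored in the config.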

## 4  Generate Plan using OpenAI SDK

In [42]:
from langfuse.openai import AzureOpenAI
client = AzureOpenAI(api_version="2024-12-01-preview")

In [43]:

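def generate_plan(variables):
    # Fill the {{placeholders}} of the prompt fetched from Langfuse.
    filled = prompt.compile(**variables)
    response = client.chat.completions.create(
        model="telcogpt",  # Azure OpenAI deployment name
        messages=[{"role": "system", "content": "You are an expert project‑management advisor."},
                  {"role": "user", "content": filled}],
        temperature=prompt.config.get("temperature", 0.2),  # reuse the versioned config
        langfuse_prompt=prompt,  # link the resulting trace to this prompt version
    )
    return response.choices[0].message.content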
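
Compiling a prompt is only as reliable as the variables passed in, so a pre-flight check for forgotten placeholders can be useful. The sketch below extracts placeholder names with a regular expression and compares them with the supplied keys. `placeholder_names` and `missing_variables` are hypothetical helpers, and the short template is a cut-down stand-in for the real prompt.

```python
import re

def placeholder_names(template):
    """Collect the distinct {{name}} placeholders used in a template."""
    return set(re.findall(r"\{\{\s*(\w+)\s*\}\}", template))

def missing_variables(template, variables):
    """Return, in sorted order, placeholders that have no supplied value."""
    return sorted(placeholder_names(template) - set(variables))

template = "**Project Name:** {{project_name}}\n**Budget:** {{budget}}, due {{deadline}}"
supplied = {"project_name": "AI Platform Overhaul", "budget": "$500,000"}
print(missing_variables(template, supplied))  # ['deadline']
```

In the notebook, the same check could run on `prompt.prompt` before calling `generate_plan`.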




In [44]:
sample = generate_plan(dict(
    project_name="AI Platform Overhaul",
    project_scope="Rebuild the ML pipeline end‑to‑end to support GenAI workloads.",
    deadline="2025-10-01",
    stakeholder_list="CTO, Head of Data, Product Manager, Cloud Ops Lead",
    budget="$500,000"
))
print(sample[:500], "...")

# AI Platform Overhaul Delivery Plan

---

## 1. Executive Summary

- Rebuild the entire ML pipeline to enable scalable, efficient GenAI workloads by October 2025.  
- Focus on modular architecture, cloud-native infrastructure, and robust data governance to future-proof the platform.  
- Mitigate risks through phased delivery, continuous stakeholder engagement, and rigorous testing.

---

## 2. Detailed Work-breakdown Structure (WBS)

| Work-stream            | Tasks                              ...
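
Several of the prompt's rules can be tested mechanically, which is what makes prompt testing automatable. As one example, the rule "Do not hallucinate dates beyond the supplied deadline" can be checked with a small function. This is a sketch, not a Langfuse feature: `dates_after_deadline` is a hypothetical helper, and the excerpt is invented test input rather than real model output.

```python
import re
from datetime import date

# Accept the ASCII hyphen and the non-breaking hyphen (U+2011) used in the prompt text.
DATE_PATTERN = re.compile(r"\b(\d{4})[-\u2011](\d{2})[-\u2011](\d{2})\b")

def dates_after_deadline(text, deadline):
    """Return ISO dates found in text that fall after the deadline (YYYY-MM-DD)."""
    limit = date.fromisoformat(deadline)
    late = []
    for year, month, day in DATE_PATTERN.findall(text):
        try:
            candidate = date(int(year), int(month), int(day))
        except ValueError:
            continue  # skip strings that look like dates but are not valid
        if candidate > limit:
            late.append(candidate.isoformat())
    return late

plan_excerpt = "2025-06-15 – Design sign-off\n2025-11-20 – Final rollout"
violations = dates_after_deadline(plan_excerpt, "2025-10-01")
print(violations)  # ['2025-11-20']
```

An empty list means the output respects the deadline; a check like this could gate whether a `candidate` version is promoted.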


In [47]:
current = lf.get_prompt(
    name="project-plan-generator",  # resolves to the version labeled 'production' by default
)

print(f"Current version: {current.version}")




Current version: 3


In [50]:

# Create a new version instead of editing the existing one (safe / preferred)
new_text = textwrap.dedent(current.prompt) + "\n\n9. **Success Criteria** – list 3 KPIs."

prompt_v_next = lf.create_prompt(
    name=current.name,      # same name auto‑increments the version
    prompt=new_text,        # the modified body
    config=current.config,  # keep model + temperature the same
    labels=["candidate"],   # start as 'candidate' until tests pass
)

print(f"Created version {prompt_v_next.version} (name {prompt_v_next.name})")

Created version 5 (name project-plan-generator)
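
Because `get_prompt` resolves the `production` label by default, the new version stays invisible to production callers until that label moves. The sketch below models the mechanism with an in-memory registry to show how promotion and a `production-previous` rollback label interact. It is a toy model of label semantics, not the Langfuse API; `PromptRegistry` and its methods are hypothetical.

```python
class PromptRegistry:
    """Toy model: each label points to exactly one version number."""

    def __init__(self):
        self.versions = {}  # version number -> prompt text
        self.labels = {}    # label -> version number

    def create(self, text, labels=()):
        version = len(self.versions) + 1  # same name, next integer version
        self.versions[version] = text
        for label in labels:
            self.labels[label] = version
        return version

    def promote(self, version):
        if version not in self.versions:
            raise KeyError(f"unknown version {version}")
        # Remember the outgoing production version so rollback is one step.
        if "production" in self.labels:
            self.labels["production-previous"] = self.labels["production"]
        self.labels["production"] = version

    def rollback(self):
        self.labels["production"] = self.labels["production-previous"]

    def get(self, label="production"):
        return self.labels[label]

registry = PromptRegistry()
v1 = registry.create("plan prompt", labels=["production"])
v2 = registry.create("plan prompt + success criteria", labels=["candidate"])
print(registry.get())   # 1  (candidate is not served yet)
registry.promote(v2)
print(registry.get())   # 2
registry.rollback()
print(registry.get())   # 1
```

With the real SDK, the candidate can be fetched explicitly for testing via `lf.get_prompt("project-plan-generator", label="candidate")`.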


## 5  Best Practices Checklist

| Area          | Guideline                                                                                                                          |
| ------------- | ---------------------------------------------------------------------------------------------------------------------------------- |
| Observability | Log **every** prod call (no sampling) for full forensic traces.                                                                    |
| PII           | Hash sensitive text or set `metadata={"pii":True}` for masking.                                                                    |
| Cost          | Tighten retention in prod, use read‑only ClickHouse in staging.                                                                    |
| SemVer        | Langfuse numbers versions with integers; record a semantic version (`1.3.0`) or Git SHA in `commit_message=` when creating a version. |
| Rollback      | Keep `production-previous` label ready for instant rollback.                                                                       |
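
For the PII guideline, one common approach is to replace sensitive values with a keyed hash before the text reaches any trace, so records stay correlatable without exposing the raw value. The sketch below assumes email addresses are the sensitive field. `redact_emails`, the simplified email pattern, and the key are illustrative assumptions, not part of Langfuse.

```python
import hashlib
import hmac
import re

# Deliberately simple email pattern; real-world matching may need to be stricter.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def redact_emails(text, secret):
    """Replace each email address with a short keyed SHA-256 digest."""
    def digest(match):
        normalized = match.group(0).lower().encode("utf-8")
        mac = hmac.new(secret, normalized, hashlib.sha256)
        return f"<email:{mac.hexdigest()[:12]}>"
    return EMAIL.sub(digest, text)

secret = b"rotate-me"  # hypothetical key; load it from a secret store in practice
masked = redact_emails("Escalate to jane.doe@example.com by Friday", secret)
print(masked)
```

The same address always maps to the same token under a given key, so traces can still be grouped per user.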
