Scripts to create a Dbizi dataset and evaluate an assistant by mBerasategui-ehu · Pull Request #227 · Lamb-Project/lamb

mBerasategui-ehu · 2026-01-23T09:49:36Z

Added two scripts in scripts/langsmith:
- dbizi_dataset.py (creates a dataset in LangSmith with 15 questions about Dbizi)
- evaluate_assistant.py (an LLM-as-a-judge evaluates a given lamb assistant against the given dataset in LangSmith)
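For readers unfamiliar with the LLM-as-a-judge pattern mentioned above, here is a minimal sketch of the idea. It is not the code in evaluate_assistant.py: the prompt wording, the 1 to 5 scale, and the names `build_judge_prompt`, `parse_score`, and `judge_answer` are all hypothetical, and the model call is passed in as a plain callable so the sketch runs without any API key.

```python
import re
from typing import Callable, Optional


def build_judge_prompt(question: str, reference: str, answer: str) -> str:
    """Build the grading prompt sent to the judge model (wording is illustrative)."""
    return (
        "You are grading an assistant's answer.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference}\n"
        f"Assistant answer: {answer}\n"
        "Reply with 'SCORE: <n>' where n is an integer from 1 to 5."
    )


def parse_score(reply: str) -> Optional[int]:
    """Extract the 1-5 score from the judge's reply, or None if it is absent."""
    match = re.search(r"SCORE:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def judge_answer(
    question: str,
    reference: str,
    answer: str,
    ask_llm: Callable[[str], str],  # hypothetical hook: prompt in, reply text out
) -> Optional[int]:
    """Ask the judge model to grade one answer and return the parsed score."""
    return parse_score(ask_llm(build_judge_prompt(question, reference, answer)))
```

In a real run, `ask_llm` would wrap a chat-completion call; in a test it can be a stub such as `lambda prompt: "SCORE: 4"`.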

To try them, first configure these LangSmith environment variables in backend/.env:
- LANGCHAIN_TRACING_V2=true
- LANGCHAIN_API_KEY=YOUR_LANGCHAIN_API_KEY
- LANGCHAIN_PROJECT=lamb-assistants
- LANGCHAIN_ENDPOINT=https://api.smith.langchain.com
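A quick way to confirm the variables listed above are actually set before running anything is a small preflight check. This is only a sketch: the helper name `missing_langsmith_vars` is hypothetical, and it assumes the .env file has already been loaded into the process environment (for example with `load_dotenv`).

```python
from typing import List, Mapping

# The four variable names listed in this PR description.
REQUIRED_LANGSMITH_VARS = (
    "LANGCHAIN_TRACING_V2",
    "LANGCHAIN_API_KEY",
    "LANGCHAIN_PROJECT",
    "LANGCHAIN_ENDPOINT",
)


def missing_langsmith_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or blank in `env`."""
    return [name for name in REQUIRED_LANGSMITH_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    import os

    missing = missing_langsmith_vars(os.environ)
    if missing:
        print("Missing LangSmith variables: " + ", ".join(missing))
    else:
        print("All LangSmith variables are set.")
```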

Also set these two variables in evaluate_assistant.py:
- JWT_TOKEN = "your_jwt_token_here"
- ASSISTANT_ID = 1 (this should be the ID of the Dbizi assistant in lamb)

Then run the scripts in this order:
- first, dbizi_dataset.py
- then, evaluate_assistant.py

mBerasategui-ehu · 2026-01-26T11:58:14Z

Fixed issues in evaluate_assistant.py and added dbizi_dataset_eus.py (same as dbizi_dataset.py but in Basque).

mBerasategui-ehu · 2026-01-27T11:41:15Z

These variables are now loaded from lamb-kb-server-stable/backend/.env instead of being hardcoded:
API_BASE_URL = os.getenv("API_BASE_URL")
JWT_TOKEN = os.getenv("JWT_TOKEN")
ASSISTANT_ID = int(os.getenv("ASSISTANT_ID"))
DATASET_NAME = os.getenv("DATASET_NAME")
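One thing to watch with the lines above: `int(os.getenv("ASSISTANT_ID"))` raises a `TypeError` when the variable is unset, because `os.getenv` returns `None`. A possible way to fail with a clearer message is sketched below. The `EvalSettings` dataclass and `load_settings` helper are hypothetical names, not part of this PR; the function takes any mapping, so it can be called with `os.environ` after the .env file is loaded.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EvalSettings:
    """Typed view of the four variables read by the evaluation script."""
    api_base_url: str
    jwt_token: str
    assistant_id: int
    dataset_name: str


def load_settings(env: Mapping[str, str]) -> EvalSettings:
    """Validate and convert the variables, raising ValueError with a readable message."""
    names = ("API_BASE_URL", "JWT_TOKEN", "ASSISTANT_ID", "DATASET_NAME")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise ValueError("Missing variables: " + ", ".join(missing))
    try:
        assistant_id = int(env["ASSISTANT_ID"])
    except ValueError:
        raise ValueError("ASSISTANT_ID must be an integer") from None
    return EvalSettings(
        api_base_url=env["API_BASE_URL"],
        jwt_token=env["JWT_TOKEN"],
        assistant_id=assistant_id,
        dataset_name=env["DATASET_NAME"],
    )
```

Usage would be along the lines of `settings = load_settings(os.environ)` near the top of the script.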

mBerasategui-ehu · 2026-01-28T08:40:07Z

Now EVALUATOR_MODEL is chosen via env var.
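As an illustration of the change described above, the sketch below reads the model name from the environment and falls back to the previously hardcoded "gpt-4.1". Both the fallback behaviour and the assumption that the environment variable is itself named EVALUATOR_MODEL are guesses; the merged script may handle an unset variable differently. `resolve_evaluator_model` is a hypothetical helper name.

```python
from typing import Mapping

# The value that was hardcoded before this change; using it as a fallback is an assumption.
DEFAULT_EVALUATOR_MODEL = "gpt-4.1"


def resolve_evaluator_model(env: Mapping[str, str]) -> str:
    """Return the model named in EVALUATOR_MODEL, or the default if unset or blank."""
    value = env.get("EVALUATOR_MODEL", "").strip()
    return value or DEFAULT_EVALUATOR_MODEL
```

The returned string would then be passed as the `model` argument of the judge call instead of a literal.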

juananpe · 2026-01-27T10:07:13Z

scripts/langsmith/evaluate_assistant.py

+load_dotenv(project_root / "backend" / ".env")
+
+# Configuration
+API_BASE_URL = "http://localhost:9099"


Please use environment variables here instead of hardcoding these values.

juananpe · 2026-01-27T10:07:37Z

scripts/langsmith/evaluate_assistant.py

+    try:
+        # Call the LLM judge (gpt-4.1)
+        response = openai_client.chat.completions.create(
+            model="gpt-4.1",


It should be possible to set the model via an environment variable.

mBerasategui-ehu added 2 commits January 23, 2026 10:42

Scripts to create a Dbizi dataset and evaluate an assistant

bccc27c

Dbizi dataset in basque and fix evaluate_assistant.py

1d71640

load variables from dotenv

e51febb

Choose the evaluator model env var instead of hardcoded

0beb82a

juananpe approved these changes Jan 29, 2026

juananpe merged commit 95588db into Lamb-Project:langsmith Jan 29, 2026

juananpe merged 4 commits into Lamb-Project:langsmith from mBerasategui-ehu:langsmith
