Commit e0bec90

docs: add notebook external-evaluation-pipelines

1 parent 4e6d921 commit e0bec90

10 files changed: +4894 −97 lines

cookbook/_routes.json

Lines changed: 4 additions & 0 deletions
@@ -118,5 +118,9 @@
   {
     "notebook": "integration_ollama.ipynb",
     "docsPath": "docs/integrations/ollama"
+  },
+  {
+    "notebook": "example_external_evaluation_pipelines.ipynb",
+    "docsPath": "docs/scores/external-evaluation-pipelines"
   }
 ]

cookbook/example_external_evaluation_pipelines.ipynb

Lines changed: 3925 additions & 0 deletions
Large diffs are not rendered by default.

pages/docs/scores/_meta.json

Lines changed: 1 addition & 0 deletions
@@ -4,5 +4,6 @@
   "annotation": "Annotation in UI",
   "user-feedback": "User Feedback",
   "model-based-evals": "Model-based Evaluation",
+  "external-evaluation-pipelines": "External Evaluation Pipelines",
   "custom": "Custom via SDKs/API"
 }

pages/docs/scores/custom.mdx

Lines changed: 112 additions & 87 deletions
Large diffs are not rendered by default.

pages/docs/scores/external-evaluation-pipelines.md

Lines changed: 419 additions & 0 deletions
Large diffs are not rendered by default.

pages/docs/scores/model-based-evals.mdx

Lines changed: 10 additions & 6 deletions
@@ -104,10 +104,14 @@ Langfuse ->> User: Analyze evaluation scores via UI & API

 You can run your own model-based evals on data in Langfuse by fetching traces from Langfuse (e.g. via the Python SDK) and then adding evaluation results as [`scores`](/docs/scores) back to the traces in Langfuse. This gives you full flexibility to run various eval libraries on your production data and discover which work well for your use case.

-Popular evaluation packages:
+The example notebook is a good template to get started with building your own evaluation pipeline.

-- OpenAI Evals
-- Langchain Evaluators ([Example notebook](/guides/cookbook/evaluation_with_langchain))
-- RAGAS for RAG applications ([Example notebook](/guides/cookbook/evaluation_of_rag_with_ragas))
-- UpTrain evals ([Example notebook](/guides/cookbook/evaluation_with_uptrain))
-- Whylabs Langkit
+import { FileCode, BookOpen } from "lucide-react";
+
+<Cards num={2}>
+  <Card
+    title="Example: External Evaluation Pipeline"
+    href="/docs/scores/external-evaluation-pipelines"
+    icon={<FileCode />}
+  />
+</Cards>
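The paragraph kept in the hunk above describes the pipeline in one sentence: fetch traces via the Python SDK, evaluate them with whatever library you prefer, and write the results back as scores. A minimal sketch of that loop follows; it assumes the Langfuse Python SDK's `fetch_traces()` and `score()` methods, and `my_eval()` is a hypothetical placeholder for your own evaluation function (the full walkthrough is in the notebook and docs page added by this commit).

```python
# Minimal sketch of an external evaluation pipeline.
# Assumptions: Langfuse Python SDK v2 with fetch_traces()/score();
# my_eval() is a hypothetical stand-in for your evaluation library.
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY from env


def my_eval(trace_input, trace_output) -> float:
    """Hypothetical evaluation: return a numeric quality score."""
    return 1.0 if trace_output else 0.0


# 1. Fetch recent production traces
traces = langfuse.fetch_traces(limit=50).data

for trace in traces:
    # 2. Run your own evaluation on the trace input/output
    result = my_eval(trace.input, trace.output)

    # 3. Attach the result to the trace as a score
    langfuse.score(
        trace_id=trace.id,
        name="my-custom-eval",
        value=result,
        comment="added by external evaluation pipeline",
    )

langfuse.flush()  # make sure queued score events are sent before the job exits
```

In practice such a script would run on a schedule (cron, Airflow, etc.) and filter traces by timestamp or tag so each run only evaluates new data.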

pages/docs/scores/overview.mdx

Lines changed: 4 additions & 4 deletions
@@ -27,8 +27,8 @@ Langfuse supports all forms of evaluation methods due to its open architecture a

 There are two ways to run model-based evaluations in Langfuse:

-- [Via the Langfuse UI (beta)](/docs/scores/model-based-evals#ui)
-- [Via external evaluation pipelines](/docs/scores/model-based-evals#evaluation-pipeline)
+- [Via the Langfuse UI (beta)](/docs/scores/model-based-evals)
+- [Via external evaluation pipelines](/docs/scores/external-evaluation-pipelines)

 ### 2. Manual Annotation (in UI)

@@ -48,9 +48,9 @@ Learn how to configure and utilize `scores` in Langfuse to assess quality, accur

 import { FileCode, BookOpen } from "lucide-react";

-<Cards num={3}>
+<Cards num={2}>
   <Card
-    title="Getting Started with Langfuse Scoring"
+    title="Getting Started with Langfuse Evaluation"
     href="/docs/scores/getting-started"
     icon={<FileCode />}
   />