Commit e0bec90

docs: add notebook external-evaluation-pipelines

1 parent 4e6d921 commit e0bec90

10 files changed: +4894 −97 lines

cookbook/_routes.json

Lines changed: 4 additions & 0 deletions
@@ -118,5 +118,9 @@
   {
     "notebook": "integration_ollama.ipynb",
     "docsPath": "docs/integrations/ollama"
+  },
+  {
+    "notebook": "example_external_evaluation_pipelines.ipynb",
+    "docsPath": "docs/scores/external-evaluation-pipelines"
   }
 ]

cookbook/example_external_evaluation_pipelines.ipynb

Lines changed: 3925 additions & 0 deletions
Large diffs are not rendered by default.

pages/docs/scores/_meta.json

Lines changed: 1 addition & 0 deletions
@@ -4,5 +4,6 @@
   "annotation": "Annotation in UI",
   "user-feedback": "User Feedback",
   "model-based-evals": "Model-based Evaluation",
+  "external-evaluation-pipelines": "External Evaluation Pipelines",
   "custom": "Custom via SDKs/API"
 }

pages/docs/scores/custom.mdx

Lines changed: 112 additions & 87 deletions
Large diffs are not rendered by default.

pages/docs/scores/external-evaluation-pipelines.md

Lines changed: 419 additions & 0 deletions
Large diffs are not rendered by default.

pages/docs/scores/model-based-evals.mdx

Lines changed: 10 additions & 6 deletions
@@ -104,10 +104,14 @@ Langfuse ->> User: Analyze evaluation scores via UI & API

 You can run your own model-based evals on data in Langfuse by fetching traces from Langfuse (e.g. via the Python SDK) and then adding evaluation results as [`scores`](/docs/scores) back to the traces in Langfuse. This gives you full flexibility to run various eval libraries on your production data and discover which work well for your use case.

-Popular evaluation packages:
+The example notebook is a good template to get started with building your own evaluation pipeline.

-- OpenAI Evals
-- Langchain Evaluators ([Example notebook](/guides/cookbook/evaluation_with_langchain))
-- RAGAS for RAG applications ([Example notebook](/guides/cookbook/evaluation_of_rag_with_ragas))
-- UpTrain evals ([Example notebook](/guides/cookbook/evaluation_with_uptrain))
-- Whylabs Langkit
+import { FileCode, BookOpen } from "lucide-react";
+
+<Cards num={2}>
+  <Card
+    title="Example: External Evaluation Pipeline"
+    href="/docs/scores/external-evaluation-pipelines"
+    icon={<FileCode />}
+  />
+</Cards>
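The paragraph kept in the hunk above describes the pipeline in one sentence: fetch traces via the Python SDK, evaluate them with whatever library you prefer, and write the results back as scores. A minimal sketch of that loop follows; it assumes the Langfuse Python SDK's `fetch_traces()` and `score()` methods, and `my_eval()` is a hypothetical placeholder for your own evaluation function (the full walkthrough is in the notebook and docs page added by this commit).

```python
# Minimal sketch of an external evaluation pipeline.
# Assumptions: Langfuse Python SDK v2 with fetch_traces()/score();
# my_eval() is a hypothetical stand-in for your evaluation library.
from langfuse import Langfuse

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY from env


def my_eval(trace_input, trace_output) -> float:
    """Hypothetical evaluation: return a numeric quality score."""
    return 1.0 if trace_output else 0.0


# 1. Fetch recent production traces
traces = langfuse.fetch_traces(limit=50).data

for trace in traces:
    # 2. Run your own evaluation on the trace input/output
    result = my_eval(trace.input, trace.output)

    # 3. Attach the result to the trace as a score
    langfuse.score(
        trace_id=trace.id,
        name="my-custom-eval",
        value=result,
        comment="added by external evaluation pipeline",
    )

langfuse.flush()  # make sure queued score events are sent before the job exits
```

In practice such a script would run on a schedule (cron, Airflow, etc.) and filter traces by timestamp or tag so each run only evaluates new data.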

pages/docs/scores/overview.mdx

Lines changed: 4 additions & 4 deletions
@@ -27,8 +27,8 @@ Langfuse supports all forms of evaluation methods due to its open architecture a

 There are two ways to run model-based evaluations in Langfuse:

-- [Via the Langfuse UI (beta)](/docs/scores/model-based-evals#ui)
-- [Via external evaluation pipelines](/docs/scores/model-based-evals#evaluation-pipeline)
+- [Via the Langfuse UI (beta)](/docs/scores/model-based-evals)
+- [Via external evaluation pipelines](/docs/scores/external-evaluation-pipelines)

 ### 2. Manual Annotation (in UI)

@@ -48,9 +48,9 @@ Learn how to configure and utilize `scores` in Langfuse to assess quality, accur

 import { FileCode, BookOpen } from "lucide-react";

-<Cards num={3}>
+<Cards num={2}>
   <Card
-    title="Getting Started with Langfuse Scoring"
+    title="Getting Started with Langfuse Evaluation"
     href="/docs/scores/getting-started"
     icon={<FileCode />}
   />