Binary file removed img/monitor-dark.png
Binary file removed img/monitor-light.png
Binary file added img/monitor/monitor-filter-dark.png
Binary file added img/monitor/monitor-filter-light.png
Binary file added img/monitor/monitor-filter-options-dark.png
Binary file added img/monitor/monitor-filter-options-light.png
Binary file added img/monitor/monitor-json-dark.png
Binary file added img/monitor/monitor-json-light.png
Binary file added img/monitor/monitor-list-dark.png
Binary file added img/monitor/monitor-list-light.png
Binary file added img/monitor/monitor-page-buckets-dark.png
Binary file added img/monitor/monitor-page-buckets-light.png
Binary file added img/monitor/monitor-page-line-dark.png
Binary file added img/monitor/monitor-page-line-light.png
Binary file added img/monitor/monitor-regex-dark.png
Binary file added img/monitor/monitor-regex-light.png
Binary file added img/monitor/monitor-settings-dark.png
Binary file added img/monitor/monitor-settings-light.png
File renamed without changes
File renamed without changes
20 changes: 10 additions & 10 deletions mint.json
@@ -144,12 +144,8 @@
"pages": ["hub/getting-started", "hub/configuration"]
},
{
"group": "Monitoring",
"pages": ["monitoring/introduction"]
},
{
"group": "Prompt Management",
"pages": ["prompts/quick-start", "prompts/registry", "prompts/sdk-usage"]
"group": "Datasets",
"pages": ["datasets/quick-start", "datasets/sdk-usage"]
},
{
"group": "Playgrounds",
@@ -165,10 +165,6 @@
}
]
},
{
"group": "Datasets",
"pages": ["datasets/quick-start", "datasets/sdk-usage"]
},
{
"group": "Evaluators",
"pages": ["evaluators/intro", "evaluators/custom-evaluator", "evaluators/made-by-traceloop"]
@@ -177,6 +169,14 @@
"group": "Experiments",
"pages": ["experiments/introduction", "experiments/result-overview", "experiments/running-from-code"]
},
{
"group": "Monitoring",
"pages": ["monitoring/introduction", "monitoring/defining-monitors", "monitoring/using-monitors"]
},
{
"group": "Prompt Management",
"pages": ["prompts/quick-start", "prompts/registry", "prompts/sdk-usage"]
},
{
"group": "Integrations",
"pages": ["integrations/posthog"]
104 changes: 104 additions & 0 deletions monitoring/defining-monitors.mdx
@@ -0,0 +1,104 @@
---
title: "Defining Monitors"
description: "Learn how to create and configure monitors to evaluate your LLM outputs"
---

Monitors in Traceloop allow you to continuously evaluate your LLM outputs in real time. This guide walks you through the process of creating and configuring monitors for your specific use cases.

## Creating a Monitor

To create a monitor, you need to complete these steps:

<Steps>
<Step title="Send Traces">
Connect the SDK to your system and add decorators to your flow, as shown in the sketch after these steps. See [OpenLLMetry](/openllmetry/introduction) for setup instructions.
</Step>
<Step title="Choose an Evaluator">
Select the evaluation logic that will run on matching spans. You can define your own custom evaluators or use the pre-built ones provided by Traceloop. See [Evaluators](/evaluators/intro) for more details.
</Step>
<Step title="Define Span Filter">
Set criteria that determine which spans the monitor will evaluate.
</Step>
<Step title="Configure Settings">
Set up how the monitor operates, including sampling rates and other advanced options.
</Step>
</Steps>
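A minimal sketch of the **Send Traces** step with the OpenLLMetry Python SDK might look like the following. The app name, workflow name, and OpenAI call are placeholders for your own code, and the SDK is assumed to read your API key from the `TRACELOOP_API_KEY` environment variable:

```python
from openai import OpenAI
from traceloop.sdk import Traceloop
from traceloop.sdk.decorators import workflow

# Initialize the SDK so spans are exported to Traceloop.
Traceloop.init(app_name="joke_generator")

client = OpenAI()

@workflow(name="joke_creation")
def create_joke() -> str:
    # Calls made inside the decorated function are captured as spans of this workflow.
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Tell me a joke about observability"}],
    )
    return completion.choices[0].message.content
```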

### Basic Monitor Setup

Navigate to the Monitors page and click the **New** button to open the Evaluator Library. Choose the evaluator you want to run in your monitor.
Next, you will be able to configure which spans will be monitored.

## Span Filtering

The span filtering modal shows the actual spans from your system, letting you see how your chosen filters apply to real data.
Add filters by clicking on the <kbd>+</kbd> button.
<Frame>
<img className="block dark:hidden"
src="/img/monitor/monitor-filter-light.png" />
<img className="hidden dark:block"
src="/img/monitor/monitor-filter-dark.png" />
</Frame>


### Filter Options

- **Environment**: Filter by a specific environment
- **Workflow Name**: Filter by the workflow name defined in your system
- **Service Name**: Target spans from specific services or applications
- **AI Data**: Filter based on LLM-specific attributes like model name, token usage, streaming status, and other AI-related metadata
- **Attributes**: Filter based on span attributes


<img className="block dark:hidden"
src="/img/monitor/monitor-filter-options-light.png"
style={{maxWidth: '500px'}}
/>
<img className="hidden dark:block"
src="/img/monitor/monitor-filter-options-dark.png"
style={{maxWidth: '500px'}}
/>


## Monitor Settings

### Map Input

You need to map the appropriate span fields to the evaluator’s input schema.
Browse through the available span field options; once you select a field, the real data is displayed immediately so you can see how it maps to the input.

<Frame>
<img className="block dark:hidden"
src="/img/monitor/monitor-settings-light.png" />
<img className="hidden dark:block"
src="/img/monitor/monitor-settings-dark.png" />
</Frame>

When the field data is not plain text, you can use JSON key mapping or Regex to extract the specific content you need.

For example, if your content is an array and you want to extract the "text" field from the object:

```json
[{"type":"text","text":"explain who are you and what can you do in one sentence"}]
```

You can use a JSON key mapping like `0.text` to extract just the text content. The mapping is applied to the Preview table, so you can see the extracted result in real time.
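As an illustration only, this small Python sketch mirrors what the `0.text` mapping resolves to on the example payload above; it is not part of the product, just a way to reason about the path:

```python
import json

# The raw span field, as in the example above.
span_field = '[{"type":"text","text":"explain who are you and what can you do in one sentence"}]'

# "0.text" reads as: take array index 0, then the "text" key.
extracted = json.loads(span_field)[0]["text"]
print(extracted)  # explain who are you and what can you do in one sentence
```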

<Frame>
<img className="block dark:hidden"
src="/img/monitor/monitor-json-light.png" />
<img className="hidden dark:block"
src="/img/monitor/monitor-json-dark.png" />
</Frame>

You can use a regex like `text":"(.+?)"` to extract just the text content. The regex is applied to the Preview table, so you can see the extracted result in real time.
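For comparison, here is the same extraction sketched with the regex above, purely illustrative and using Python's standard `re` module:

```python
import re

span_field = '[{"type":"text","text":"explain who are you and what can you do in one sentence"}]'

# Non-greedy capture of everything between text":" and the next quote.
match = re.search(r'text":"(.+?)"', span_field)
if match:
    print(match.group(1))  # explain who are you and what can you do in one sentence
```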

<Frame>
<img className="block dark:hidden"
src="/img/monitor/monitor-regex-light.png" />
<img className="hidden dark:block"
src="/img/monitor/monitor-regex-dark.png" />
</Frame>

### Advanced
You can set a **sample rate** to control the percentage of spans within the selected filter group that the monitor will run on.
34 changes: 9 additions & 25 deletions monitoring/introduction.mdx
@@ -3,39 +3,23 @@ title: "Introduction"
description: "Detect hallucinations and regressions in the quality of your LLMs"
---

One of the key features of Traceloop is the ability to monitor the quality of your LLM outputs. It helps you to detect hallucinations and regressions in the quality of your models and prompts.
One of the key features of Traceloop is the ability to monitor the quality of your LLM outputs in **real time**. It helps you to detect hallucinations and regressions in the quality of your models and prompts.

To start monitoring your LLM outputs, make sure you have installed OpenLLMetry and configured it to send data to Traceloop. If you haven't done that yet, you can follow the instructions in the [OpenLLMetry introduction](/openllmetry/introduction).

Next, if you're not using a [supported LLM framework](/openllmetry/tracing/supported#frameworks), [make sure to annotate workflows and tasks](/openllmetry/tracing/annotations).
You can then define any of the following [monitors](https://app.traceloop.com/monitors/prd) to track the quality of your LLM outputs.

<Frame>
<img className="block dark:hidden" src="/img/monitor-light.png" />
<img className="hidden dark:block" src="/img/monitor-dark.png" />
<img className="block dark:hidden" src="/img/monitor/monitor-list-light.png" />
<img className="hidden dark:block" src="/img/monitor/monitor-list-dark.png" />
</Frame>
## Semantic Metrics

- **QA Relevancy:** Asses the relevant of an answer generated by a model with respect to a question. This is especially useful when running RAG pipelines.
- **Faithfulness:** Checks whether some generated content was inferred or deducted from a given context. Relevant for RAG pipelines, entity extraction, summarization, and many other text-related tasks.
- **Text Quality:** Evaluates the overall readability and coherence of text.
- **Grammar Correctness:** Checks for grammatical errors in generated texts.
- **Redundancy Detection:** Identifies repetitive content.
- **Focus Assessment:** Measures whether a given paragraph focuses on a single subject or "jumps" between multiple ones.
## What is a Monitor?

## Syntactic TextMetrics
A monitor is an evaluator that runs on a group of defined spans with specific characteristics in real time. For every span that matches the group filter, it will run the evaluator and log the monitor result. This allows you to continuously assess the quality and performance of your LLM outputs as they are generated in production.

- **Text Length:** Checks if the length of the generated text is within a given range (constant or with respect to an input).
- **Word Count:** Checks if the number of words in the generated text is within a given range.
Monitors can use two types of evaluators:

## Safety Metrics
- **LLM-as-a-Judge**: uses a large language model to evaluate outputs based on semantic qualities. You can create custom evaluators with this method by writing prompts that capture your own criteria.
- **Traceloop built-in evaluators**: deterministic evaluations for structural validation, safety checks, and syntactic analysis.

- **PII Detection:** Identifies personally identifiable information in generated texts or input prompts.
- **Secret Detection:** Identifies secrets and API keys in generated texts or input prompts.
- **Toxicity Detection:** Identifies toxic content in generated texts or input prompts.

## Structural Metrics

- **Regex Validation**: Ensures that the output of a model matches a given regular expression.
- **SQL Validation**: Ensures SQL queries are syntactically correct.
- **JSON Schema Validation**: Ensures that the output of a model matches a given JSON schema.
- **Code Validation**: Ensures that the output of a model is valid code in a given language.
All monitors connect to our comprehensive [Evaluators](/evaluators/intro) library, allowing you to choose the right evaluation approach for your specific use case.
79 changes: 79 additions & 0 deletions monitoring/using-monitors.mdx
@@ -0,0 +1,79 @@
---
title: "Using Monitors"
description: "Learn how to view, analyze, and act on monitor results in your LLM applications"
---

Once you've created monitors, Traceloop continuously evaluates your LLM outputs and provides insights into their performance. This guide explains how to interpret and act on monitor results.

## Monitor Dashboard

The Monitor Dashboard provides an overview of all active monitors and their current status.
It shows each monitor’s health, the number of times it has run, and the most recent execution time.

<Frame>
<img className="block dark:hidden" src="/img/monitor/monitor-list-light.png" />
<img className="hidden dark:block" src="/img/monitor/monitor-list-dark.png" />
</Frame>


## Viewing Monitor Results

### Real-time Monitoring

Monitor results are displayed in real time as your LLM applications generate new spans. You can view:

- **Run Details**: The span value that was evaluated and its result
- **Trend Analysis**: Performance over time
- **Volume Metrics**: Number of evaluations performed
- **Evaluator Output Rates**: Such as success rates for threshold-based evaluators

### Monitor Results Page

Click on any monitor to access its detailed results page. The monitor page provides comprehensive analytics and span-level details.

#### Chart Visualizations

The Monitor page includes multiple chart views to help you analyze your data, and you can switch between chart types using the selector in the top-right corner.

**Line Chart View** - Shows evaluation trends over time:
<Frame>
<img className="block dark:hidden" src="/img/monitor/monitor-page-line-light.png" />
<img className="hidden dark:block" src="/img/monitor/monitor-page-line-dark.png" />
</Frame>

**Bar Chart View** - Displays evaluation results in time buckets:
<Frame>
<img className="block dark:hidden" src="/img/monitor/monitor-page-buckets-light.png" />
<img className="hidden dark:block" src="/img/monitor/monitor-page-buckets-dark.png" />
</Frame>

#### Filtering and Time Controls

The top toolbar provides filtering options:
- **Environment**: Filter by production, staging, etc.
- **Time Range**: 24h, 7d, 14d, or custom ranges
- **Metric**: Select which evaluator output property to measure
- **Bucket Size**: 6h, Hourly, Daily, etc.
- **Aggregation**: Choose average, median, sum, min, max, or count

#### Matching Spans Table

The bottom section shows all spans that matched your monitor's filter criteria:
- **Timestamp**: When the evaluation occurred
- **Input**: The actual content that was mapped to be evaluated
- **Output**: The evaluation result/score
- **Completed Runs**: Total evaluations completed, both successful and errored
- **Error Runs**: Failed evaluation attempts

Each row includes a link icon to view the full span details in the trace explorer:

<Frame>
<img className="block dark:hidden" src="/img/trace/trace-light.png" />
<img className="hidden dark:block" src="/img/trace/trace-dark.png" />
</Frame>

For further information on tracing, refer to [OpenLLMetry](/openllmetry/introduction).

<Tip>
Ready to set up an evaluator for your monitor? Learn more about creating and configuring evaluators in the [Evaluators](/evaluators/intro) section.
</Tip>
4 changes: 2 additions & 2 deletions openllmetry/integrations/traceloop.mdx
@@ -4,8 +4,8 @@
---

<Frame>
<img className="block dark:hidden" src="/img/trace-light.png" />
<img className="hidden dark:block" src="/img/trace-dark.png" />
<img className="block dark:hidden" src="/img/trace/trace-light.png" />
<img className="hidden dark:block" src="/img/trace/trace-dark.png" />
</Frame>

[Traceloop](https://app.traceloop.com) is a platform for observability and evaluation of LLM outputs.
4 changes: 2 additions & 2 deletions openllmetry/introduction.mdx
@@ -3,8 +3,8 @@ title: "What is OpenLLMetry?"
---

<Frame>
<img className="block dark:hidden" src="/img/trace-light.png" />
<img className="hidden dark:block" src="/img/trace-dark.png" />
<img className="block dark:hidden" src="/img/trace/trace-light.png" />
<img className="hidden dark:block" src="/img/trace/trace-dark.png" />
</Frame>

OpenLLMetry is an open source project that allows you to easily start monitoring and debugging the execution of your LLM app.