datasets/quick-start.mdx (new file, 63 additions)
---
title: "Quick Start"
---

<Frame>
<img
className="block dark:hidden"
src="/img/dataset/dataset-list-light.png"
/>
<img className="hidden dark:block" src="/img/dataset/dataset-list-dark.png" />
</Frame>

Datasets are simple data tables for managing the data you use in experiments and evaluations of your AI applications.
Datasets are available in the SDK, and they support versioned snapshots for reproducible testing.
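
If you want to pull a dataset from code, the SDK exposes it by its slug. A minimal Python sketch, reusing the `medical-questions` slug from the SDK usage guide:

```python
from traceloop.sdk import Traceloop

client = Traceloop.init()

# Fetch the current draft of a dataset by its slug
dataset = client.datasets.get_by_slug("medical-questions")

# Fetch a published, immutable version as CSV for reproducible testing
snapshot = client.datasets.get_version_csv(slug="medical-questions", version="v1")
```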

<Steps>
<Step title="Create a new dataset">

Click **New Dataset** to create a dataset. Give it a descriptive name that reflects its purpose or use case, add a description to help your team understand its context, and provide a slug so you can reference the dataset from the SDK.

</Step>

<Step title="Add your data">

Add rows and columns to structure your dataset.
You can add different column types:
- **Text**: For prompts, model responses, or any textual data
- **Number**: For numerical values, scores, or metrics
- **Boolean**: For true/false flags or binary classifications
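
These types map directly to the values you pass when adding rows from the SDK. A minimal sketch, reusing the `product-inventory` example from the SDK usage guide:

```python
from traceloop.sdk import Traceloop

client = Traceloop.init()
my_dataset = client.datasets.get_by_slug("product-inventory")

# One row mixing the three column types
my_dataset.add_rows([{
    "product": "Webcam",        # Text
    "price": 59.99,             # Number
    "in_stock": True,           # Boolean
    "category": "Accessories",  # Text
}])
```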

<Tip>
Use meaningful column names that clearly describe what each field contains.
This makes it easier to work with your dataset in code, keeps evaluator inputs unambiguous, and helps team members collaborate.
</Tip>

</Step>

<Step title="Publish your dataset version">

<Frame>
<img
className="block dark:hidden"
src="/img/dataset/dataset-view-light.png"
/>
<img className="hidden dark:block" src="/img/dataset/dataset-view-dark.png" />
</Frame>

Once you're satisfied with your dataset structure and data, click **Publish Version** to create a stable snapshot. Published versions are immutable and remain accessible from the SDK.
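
Publishing is also available from the SDK. A minimal sketch, assuming a dataset fetched by slug as in the SDK usage guide:

```python
from traceloop.sdk import Traceloop

client = Traceloop.init()
my_dataset = client.datasets.get_by_slug("medical-questions")

# Publish the current draft as a new immutable version
published_version = my_dataset.publish()
```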

</Step>

<Step title="View your version history">

You can access all published versions of your dataset by opening the version history modal. This allows you to:
- Compare different versions of your dataset
- Track changes over time
- Switch between versions
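
Published versions can also be pulled side by side from the SDK. A minimal sketch, assuming versions `v1` and `v2` exist and that `get_version_csv` returns the CSV content as a string:

```python
from traceloop.sdk import Traceloop

client = Traceloop.init()

# Fetch two published versions as CSV to compare them
v1_csv = client.datasets.get_version_csv(slug="medical-questions", version="v1")
v2_csv = client.datasets.get_version_csv(slug="medical-questions", version="v2")
print("identical" if v1_csv == v2_csv else "versions differ")
```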

</Step>
</Steps>
datasets/sdk-usage.mdx (new file, 226 additions)
---
title: "SDK usage"
description: "Access your managed datasets with the Traceloop SDK"
---

## SDK Initialization

First, initialize the Traceloop SDK.

<CodeGroup>

```python Python
from traceloop.sdk import Traceloop

# Initialize the SDK; the returned client exposes dataset operations
client = Traceloop.init()
```

```ts TypeScript
import * as traceloop from "@traceloop/node-server-sdk";

// Initialize with dataset sync enabled
traceloop.initialize({
  appName: "your-app-name",
  apiKey: process.env.TRACELOOP_API_KEY,
  disableBatch: true,
  traceloopSyncEnabled: true,
});

// Wait for initialization to complete
await traceloop.waitForInitialization();

// Get the client instance for dataset operations
const client = traceloop.getClient();
```

</CodeGroup>

<Note>
Make sure you've created an API key and set it as an environment variable
`TRACELOOP_API_KEY` before you start. Check out the SDK's [getting started
guide](/openllmetry/getting-started-python) for more information.
</Note>

The SDK fetches your datasets from Traceloop servers. Changes made through the SDK to a draft dataset version are immediately visible in the UI.

## Dataset Operations

### Create a dataset

You can create datasets in different ways depending on your data source:
- **Python**: Import from CSV file or pandas DataFrame
- **TypeScript**: Import from CSV data or create manually

<CodeGroup>

```python Python
import pandas as pd
from traceloop.sdk import Traceloop

client = Traceloop.init()

# Create dataset from CSV file
dataset_csv = client.datasets.from_csv(
    file_path="path/to/your/data.csv",
    slug="medical-questions",
    name="Medical Questions",
    description="Dataset with patients' medical questions",
)

# Create dataset from pandas DataFrame
data = {
    "product": ["Laptop", "Mouse", "Keyboard", "Monitor"],
    "price": [999.99, 29.99, 79.99, 299.99],
    "in_stock": [True, True, False, True],
    "category": ["Electronics", "Accessories", "Accessories", "Electronics"],
}
df = pd.DataFrame(data)

dataset_df = client.datasets.from_dataframe(
    df=df,
    slug="product-inventory",
    name="Product Inventory",
    description="Sample product inventory data",
)
```

```ts TypeScript
const client = traceloop.getClient();

// Option 1: Create dataset manually
const myDataset = await client.datasets.create({
  name: "Medical Questions",
  slug: "medical-questions",
  description: "Dataset with patients' medical questions"
});

// Option 2: Create and import from CSV data
const csvData = `user_id,prompt,response,model,satisfaction_score
user_001,"What is React?","React is a JavaScript library...","gpt-3.5-turbo",4
user_002,"Explain Docker","Docker is a containerization platform...","gpt-3.5-turbo",5`;

await myDataset.fromCSV(csvData, { hasHeader: true });
```

</CodeGroup>

### Get a dataset
A dataset can be retrieved by its slug, which is shown on the dataset page in the UI.
<CodeGroup>

```python Python
# Get dataset by slug - current draft version
my_dataset = client.datasets.get_by_slug("medical-questions")

# Get specific version as CSV
dataset_csv = client.datasets.get_version_csv(
    slug="medical-questions",
    version="v2",
)
```

```ts TypeScript
// Get dataset by slug - current draft version
const myDataset = await client.datasets.get("medical-questions");

// Get specific version as CSV
const datasetCsv = await client.datasets.getVersionCSV("medical-questions", "v1");
```

</CodeGroup>

### Adding Columns

<CodeGroup>

```python Python
from traceloop.sdk.dataset import ColumnType

# Add a new column to your dataset
new_column = my_dataset.add_column(
    slug="confidence_score",
    name="Confidence Score",
    col_type=ColumnType.NUMBER,
)
```

```ts TypeScript
// Define the schema by adding multiple columns
const columnsToAdd = [
  {
    name: "User ID",
    slug: "user-id",
    type: "string" as const,
    description: "Unique identifier for the user"
  },
  {
    name: "Satisfaction score",
    slug: "satisfaction-score",
    type: "number" as const,
    description: "User satisfaction rating (1-5)"
  }
];

await myDataset.addColumn(columnsToAdd);
console.log("Schema defined with multiple columns");
```

</CodeGroup>

### Adding Rows

Each row maps column slugs to their values:
<CodeGroup>

```python Python
# Add new rows to your dataset
row_data = {
    "product": "TV Screen",
    "price": 1500.0,
    "in_stock": True,
    "category": "Electronics",
}

my_dataset.add_rows([row_data])
```

```ts TypeScript
// Add an individual row to the dataset
const userId = "user_001";
const prompt = "Explain machine learning in simple terms";

const rowData = {
  user_id: userId,
  prompt: prompt,
  response: "This is the model response",
  model: "gpt-3.5-turbo",
  satisfaction_score: 1,
};

await myDataset.addRow(rowData);
```

</CodeGroup>

## Dataset Versions

### Publish a dataset
Dataset versions and history can be viewed in the UI. Versioning lets you run the same evaluations and experiments against different versions of a dataset, making meaningful comparisons possible.
<CodeGroup>

```python Python
# Publish the current dataset state as a new version
published_version = my_dataset.publish()
```

```ts TypeScript
// Publish the current dataset state as a new version
const publishedVersion = await myDataset.publish();
```

</CodeGroup>

experiments/introduction.mdx (new file, 31 additions)
---
title: "Introduction"
---

Building reliable LLM applications means knowing whether a new prompt, model, or flow change actually makes things better.

<Frame>
<img
className="block dark:hidden"
src="/img/experiment/exp-list-light.png"
/>
<img className="hidden dark:block" src="/img/experiment/exp-list-dark.png" />
</Frame>

Experiments in Traceloop give teams a structured workflow for testing and comparing results across different prompts, models, and evaluator checks, all against real datasets.

## What You Can Do with Experiments

<CardGroup cols={2}>
<Card title="Run Multiple Evaluators" icon="list-check">
Execute multiple evaluation checks against your dataset
</Card>
<Card title="View Complete Results" icon="table">
See all experiment run outputs in a comprehensive table view with relevant indicators and detailed reasoning
</Card>
<Card title="Compare Experiment Runs Results" icon="code-compare">
Run the same experiment across different dataset versions to see how it affects your workflow
</Card>
<Card title="Custom Task Pipelines" icon="code">
Add a tailored task to the experiment that produces the evaluator input, such as LLM calls or semantic search
</Card>
</CardGroup>
experiments/result-overview.mdx (new file, 44 additions)
---
title: "Result Overview"
---

Every experiment is executed through the SDK, and all experiments are logged in the Traceloop platform.

<Frame>
<img
className="block dark:hidden"
src="/img/experiment/exp-list-light.png"
/>
<img className="hidden dark:block" src="/img/experiment/exp-list-dark.png" />
</Frame>

## Experiment Runs
An experiment can be run multiple times against different datasets and tasks. All runs are logged in the Traceloop platform to enable easy comparison.

<Frame>
<img
className="block dark:hidden"
src="/img/experiment/exp-run-list-light.png"
/>
<img className="hidden dark:block" src="/img/experiment/exp-run-list-dark.png" />
</Frame>

## Experiment Tasks

An experiment run is made up of multiple tasks, where each task represents the experiment flow applied to a single dataset row.

The task logging captures the following (a minimal sketch follows the screenshot below):

- **Task input**: the data taken from the dataset row.
- **Task outputs**: the results produced by running the task, which are then passed as input to the evaluator.
- **Evaluator results**: the evaluator's assessment based on the task outputs.

<Frame>
<img
className="block dark:hidden"
src="/img/experiment/exp-run-light.png"
/>
<img className="hidden dark:block" src="/img/experiment/exp-run-dark.png" />
</Frame>
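
To make the flow concrete, here is a minimal, hypothetical Python sketch of a task function. The function names and row fields are illustrative assumptions, not the documented API; only the input → outputs → evaluator flow mirrors the description above.

```python
# Hypothetical sketch: names below are illustrative, not the documented API.

def call_my_llm(prompt: str) -> str:
    # Stand-in for a real model call (e.g. an OpenAI chat completion)
    return f"Model answer to: {prompt}"

def my_task(row: dict) -> dict:
    """One task: takes a single dataset row as input, returns outputs for the evaluator."""
    prompt = row["prompt"]             # task input, taken from the dataset row
    completion = call_my_llm(prompt)   # the experiment flow applied to this row
    return {"completion": completion}  # task outputs, passed as input to the evaluator
```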

