15 changes: 14 additions & 1 deletion docs.json
@@ -49,7 +49,6 @@
"group": "BI feature guides",
"icon": "person-chalkboard",
"pages": [
"guides/ai-agents",
"guides/metrics-catalog",
"references/chart-types",
"guides/how-to-create-scheduled-deliveries",
@@ -79,6 +78,20 @@
]
}
]
},
{
"group": "AI agents",
"icon": "robot",
"pages": [
"guides/ai-agents",
"guides/ai-agents/getting-started",
"guides/ai-agents/using-ai-agents",
"guides/ai-agents/agent-memory",
"guides/ai-agents/self-improvement",
"guides/ai-agents/best-practices",
"guides/ai-agents/data-access",
"guides/ai-agents/evaluations"
]
}
]
},
656 changes: 39 additions & 617 deletions guides/ai-agents.mdx

Large diffs are not rendered by default.

24 changes: 24 additions & 0 deletions guides/ai-agents/agent-memory.mdx
@@ -0,0 +1,24 @@
---
title: Agent memory and learning
description: How AI agents learn from your corrections and feedback
---

<Info>Only admins and developers can modify agent memory.</Info>

AI agents can now learn from your corrections and feedback. When you correct an agent's response or guide it to better understand your data, the agent can automatically update its own instructions to remember these preferences for future conversations. The memory will only be saved if you approve the agent's suggested learning.

## How it works

- When an agent makes a mistake or you provide clarification, it can capture this feedback
- The agent updates its instructions field with the new learning
- All future conversations with that agent will benefit from this accumulated knowledge
- Memories are stored directly in the agent's instructions, which you can view and edit in agent settings

## What agents can learn

Agents can learn various types of corrections and preferences:

- **Which tables or explores to use** for specific types of queries
- **Field selection preferences** like "always use net_revenue instead of gross_revenue when generating revenue charts"

- **Filter logic** such as "exclude test accounts when counting customers"
- **General preferences** about formatting, ordering, or analysis approaches
185 changes: 185 additions & 0 deletions guides/ai-agents/best-practices.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,185 @@
---
title: Best practices for AI agents
sidebarTitle: Best practices
description: Learn how to configure and optimize your AI agents for the best results
---

To get the most accurate and useful answers from your AI agents, follow these best practices for preparing your data and configuring your agents.

## Think specialized, not general

Think of AI agents as your specialized analysts - each one can be configured to focus on specific areas of your business. For example, you might create a "Marketing Assistant" that only has access to marketing data like campaign performance, lead generation, and customer acquisition metrics. This focused approach ensures more accurate, relevant responses and prevents sensitive data from being accessible to the wrong teams. To find out more about how to configure specific access, see [Limiting access to specific explores/fields](/guides/ai-agents/getting-started#limiting-access-to-specific-explores-and-fields).

## Document your data thoroughly

Good documentation is crucial for AI to understand your data models and provide meaningful insights. The quality of the results depends on the quality of your metadata and documentation.

- **Write clear, descriptive names** for metrics and dimensions
- **Add detailed descriptions** to all metrics and dimensions explaining what they represent
- **Include example questions** in descriptions that AI could answer with the metric
- **Use AI hints** to provide additional context specifically for AI agents

Remember: If your colleague wouldn't understand your documentation, neither will the AI agent. The more context you provide, the better the AI can interpret and analyze your data.
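
As a rough illustration of the checklist above, the sketch below combines a descriptive name, a description with example questions, and an AI hint on one metric. The `orders` model, the `amount` column, and the `total_net_revenue` metric are hypothetical names chosen for this example, and the layout assumes dbt <=1.9 (for dbt 1.10+, `meta` would sit under `config`).

```yaml
# Sketch only: the model, column, and metric names are hypothetical.
models:
  - name: orders
    columns:
      - name: amount
        description: Order value in USD after discounts, before tax.
        meta:
          metrics:
            total_net_revenue:
              type: sum
              label: Total net revenue
              description: >
                Sum of order value after discounts, before tax.
                Example questions: "What was net revenue last month?",
                "How does net revenue compare across regions?"
              ai_hint:
                - Prefer this metric for revenue questions unless gross revenue is requested explicitly
```

The description is written for people and the hint for the agent, so the two can carry different levels of detail.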

## Using AI hints

AI hints are specialized metadata fields that provide additional context specifically for AI agents. These hints help the AI better understand your data models, business logic, and how to interpret your metrics and dimensions.

<Info>
AI hints are internal metadata used only by AI agents and are not displayed to
users in the Lightdash interface. When both AI hints and descriptions are
present, AI hints take precedence for AI agent prompts.
</Info>

AI hints accept either a single string or an array of strings. The array format lets you organize distinct pieces of information as separate hints, making them easier to read and maintain.

You can add AI hints at three levels:

### Model-level hints

Provide context about the entire table:

<CodeGroup>

```yaml dbt 1.10+
models:
  - name: customers
    config:
      meta:
        ai_hint:
          - This is a customers table containing customer information and derived facts
          - Use this for customer demographics, behavior analysis, and segmentation
```

```yaml dbt <=1.9
models:
  - name: customers
    meta:
      ai_hint:
        - This is a customers table containing customer information and derived facts
        - Use this for customer demographics, behavior analysis, and segmentation
```

</CodeGroup>

String format:

<CodeGroup>

```yaml dbt 1.10+
models:
  - name: customers
    config:
      meta:
        ai_hint: |
          This is a customers table containing customer information and derived facts.
          Use this for customer demographics, behavior analysis, and segmentation.
```

```yaml dbt <=1.9
models:
  - name: customers
    meta:
      ai_hint: |
        This is a customers table containing customer information and derived facts.
        Use this for customer demographics, behavior analysis, and segmentation.
```

</CodeGroup>

### Dimension-level hints

Explain individual columns:

<CodeGroup>

```yaml dbt 1.10+
columns:
  - name: last_name
    config:
      meta:
        dimension:
          ai_hint:
            - Customer's last name
            - Contains PII data - use for identification but be mindful of privacy
```

```yaml dbt <=1.9
columns:
  - name: last_name
    meta:
      dimension:
        ai_hint:
          - Customer's last name
          - Contains PII data - use for identification but be mindful of privacy
```

</CodeGroup>

### Metric-level hints

Clarify what metrics measure:

<CodeGroup>

```yaml dbt 1.10+
columns:
  - name: customer_id
    config:
      meta:
        metrics:
          unique_customer_count:
            type: count_distinct
            ai_hint:
              - Unique customer count for business reporting
              - Use this for customer acquisition and retention analysis
```

```yaml dbt <=1.9
columns:
  - name: customer_id
    meta:
      metrics:
        unique_customer_count:
          type: count_distinct
          ai_hint:
            - Unique customer count for business reporting
            - Use this for customer acquisition and retention analysis
```

</CodeGroup>

## Writing effective instructions

Think of your instructions as teaching your AI agent about your world. The better you explain your business context and preferences, the more useful and relevant your agent's responses will be.

Focus on four key areas: what your agent should know about your industry, your team's goals and constraints, how you like data analyzed, and how results should be communicated.

### What to include

- **Industry terminology and key metrics** including acronyms your team uses regularly (e.g., "CPM means Cost Per Mille, not cost per mile" or "Our ARR calculations exclude one-time setup fees")

- **Communication style** for how results should be presented to your team (e.g., "Keep explanations simple for non-technical stakeholders" or "Always include actionable next steps")
- **Business constraints** like regulatory requirements or budget limitations that affect decision-making
- **Analysis preferences** your team relies on (e.g., "Always compare month-over-month growth" or "Flag any churn rates above 5% as concerning")
- **Context for interpreting your data** (e.g., "Our Q4 always shows higher sales due to holiday promotions" or "Weekend traffic is typically 40% lower")

<Check>
**Good example - Sales Team Agent:** <br />
You analyze sales performance for our SaaS company. Focus on MRR, churn, and pipeline
health. When MRR growth drops below 10% month-over-month, flag it as concerning.
Present insights in simple terms that our sales managers can act on immediately.
Always include trend explanations and next steps.
</Check>

### What to avoid

- **Contradictory instructions** that create confusion about priorities
- **Overly complex rules** that are hard to follow consistently
- **Vague guidance** like "be helpful" without explaining what that means for your situation
- **Too many different focus areas** in one agent. Keep each agent focused; there is no limit on the number of agents you can create.
- **Restating basic features**: don't tell the AI to "create charts", since it already does that

<Danger>
**Poor example - Too vague:** <br />
Be helpful and analyze data well. Create good charts and explain things clearly.
</Danger>
121 changes: 121 additions & 0 deletions guides/ai-agents/data-access.mdx
@@ -0,0 +1,121 @@
---
title: Data access control
description: Understand how AI agents access and use your data
---

AI agents offer flexible data access control to balance insights with privacy and security. By default, agents work with metadata only, but you can optionally enable data access for deeper analysis.

## Data access modes

**Metadata-only mode (default):**

- Agents can see your data structure, field names, and model definitions
- They can generate appropriate queries and visualizations
- No actual data values (except for one-row query results) are shared with the agent
- Perfect for exploring data structure and creating initial analyses

**Data access enabled:**

- Agents receive actual query results in addition to metadata
- Can provide specific insights, identify trends, and analyze patterns in your data
- Offers detailed summaries and data-driven recommendations
- Can search for actual field values to ensure accurate filters when building visualizations
- Only shares data when explicitly enabled per agent

<Info>
Data access is optional and controlled per agent. When disabled, agents only
work with your data model structure and cannot see actual data values. This
ensures sensitive information is only shared when you explicitly choose to
enable this capability.
</Info>

<Accordion title="How to enable data access">
To enable data access, go to your agent settings and toggle the "Data Access"
option.
<Frame>
<img
src="/images/guides/ai-agents/enable-data-access.png"
alt="How to enable data access for AI agents"
/>
</Frame>
</Accordion>

## User attributes and permissions

AI agents automatically respect all data access controls configured through [user attributes](/references/user-attributes). This ensures that agents only access data that the user is authorized to see, maintaining your existing security policies.

### How user attributes flow through AI queries

When an AI agent generates and executes queries on behalf of a user, it inherits that user's attribute values and applies them to all data access:

**Row-level security:**
- Agents automatically apply `sql_filter` rules defined in your models
- Only rows matching the user's attribute values are included in query results
- Example: If a user has `sales_region: 'EMEA'`, the agent will only query data for that region

**Column-level security:**
- Agents respect `required_attributes` on dimensions
- Columns the user cannot access are invisible to the agent
- Metrics derived from restricted columns are also unavailable
- Example: If `salary` requires `is_admin: 'true'`, non-admin users' agents cannot query salary data

**Table-level security:**
- Agents respect `required_attributes` on models
- Tables the user cannot access are completely hidden from the agent
- The agent cannot reference or join restricted tables
- Example: If `payments` requires `is_admin: 'true'`, non-admin users' agents cannot query the payments table
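
The column-level and table-level examples above could be configured roughly as follows. This is a minimal sketch, not a definitive setup: the `employees` model that holds `salary` is a hypothetical name, and the layout assumes dbt <=1.9 (for dbt 1.10+, `meta` would sit under `config`).

```yaml
# Sketch only: the employees model is hypothetical; payments and salary
# mirror the examples in the lists above.
models:
  - name: payments
    meta:
      # Table-level: the whole model is hidden unless is_admin is "true"
      required_attributes:
        is_admin: "true"

  - name: employees
    columns:
      - name: salary
        meta:
          dimension:
            # Column-level: only this dimension is hidden from non-admins
            required_attributes:
              is_admin: "true"
```

Placing `required_attributes` on the model hides every field in it, while placing it on a single dimension leaves the rest of the table available to the agent.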

### Default behavior

<Info>
**In the Lightdash app:** AI agents automatically use the logged-in user's attributes for all queries.

**In Slack:** AI agents currently use the attributes of the user who created the agent. We plan to respect user attributes based on Slack user email in the future—reach out if you need this feature!
</Info>

### How this works behind the scenes

When an agent generates a query:

1. The agent receives the user's complete attribute profile (both direct user attributes and group attributes)
2. All `sql_filter` rules are automatically applied to the generated SQL
3. Dimensions and tables with `required_attributes` are filtered from the available schema
4. The agent only sees and can query data within the user's permissions

This happens transparently—the agent doesn't need special configuration. Your existing user attribute rules automatically protect AI-generated queries just like they protect manual queries.

### Example: Regional sales access

Consider this model configuration:

```yaml
models:
  - name: sales
    meta:
      sql_filter: ${TABLE}.region IN (${lightdash.attributes.sales_region})
    columns:
      - name: revenue
      - name: customer_name
        meta:
          dimension:
            required_attributes:
              can_view_pii: "true"
```

For a user with `sales_region: 'EMEA'` and no `can_view_pii` attribute:

- The agent can query `revenue` data, but only for EMEA region
- The agent cannot see or query `customer_name` (PII restriction)
- If the agent tries to analyze customer names, it will fail with a permissions error
- All generated queries automatically include `WHERE region IN ('EMEA')`

### Security considerations

- **Metadata mode:** Even in metadata-only mode, agents respect user attributes when showing available fields
- **Data access mode:** When data access is enabled, query results are filtered by user attributes
- **Query generation:** Agents cannot generate queries that bypass user attribute restrictions
- **Error handling:** If an agent attempts to access restricted data, the query fails with a permissions error

<Warning>
Custom SQL in table calculations or SQL Runner can potentially bypass user attribute filters. AI agents use the standard query interface and cannot bypass these restrictions.
</Warning>