2 changes: 1 addition & 1 deletion docs/environment-variables.md
@@ -573,7 +573,7 @@ OpenObserve is configured using the following environment variables.
| ZO_QUICK_MODE_ENABLED | false | Indicates if quick mode is enabled. |
| ZO_QUICK_MODE_NUM_FIELDS | 500 | The number of fields to consider for quick mode. |
| ZO_QUICK_MODE_STRATEGY | | Possible values are `first`, `last`, `both`. |

| ZO_QUICK_MODE_FORCE_ENABLED | true | |

## Miscellaneous
| Environment Variable | Default Value | Description |
Binary file modified docs/images/add-regex-pattern.png
Binary file added docs/images/apply-multiple-reg-pattern.png
Binary file added docs/images/config-hash-pattern-query-time.png
Binary file modified docs/images/drop-at-ingestion-time-test-config.png
Binary file modified docs/images/drop-at-query-time-test-config.png
Binary file modified docs/images/extended-retention.png
Binary file added docs/images/hashed-at-ingestion-time.png
Binary file added docs/images/hashed-at-query-time.png
Binary file modified docs/images/redact-at-ingestion-time-test-config.png
Binary file modified docs/images/redact-at-query-test-config.png
Binary file modified docs/images/redact-or-drop-during-regex-pattern-execution.png
Binary file modified docs/images/regex-pattern-execution-time.png
Binary file modified docs/images/regex-selection-view.png
Binary file modified docs/images/stream-details-access.png
Binary file added docs/images/stream-details-configuration.png
Binary file added docs/images/stream-details-schema-settings.png
Binary file modified docs/images/stream-details.png
Binary file added docs/images/stream-fields.png
Binary file added docs/images/stream-name.png
Binary file added docs/images/stream-start-end-time.png
2 changes: 1 addition & 1 deletion docs/ingestion/.pages
@@ -1,5 +1,5 @@
nav:
- Index: index.md
- Ingestion Overview: index.md
- Logs: logs
- Metrics: metrics
- Traces: traces
6 changes: 5 additions & 1 deletion docs/ingestion/index.md
@@ -3,7 +3,6 @@ description: >-
Ingest logs, metrics, and traces into OpenObserve via OTEL, Fluentbit, APIs,
syslog, Prometheus, or programmatically using Go, Python, or curl.
---
# Ingestion

Logs metrics and traces can be ingested into OpenObserve from a variety of sources. This section describes how to ingest data from the following sources:

@@ -17,6 +16,11 @@ Logs metrics and traces can be ingested into OpenObserve from a variety of sourc
1. [Fluent-bit](logs/fluent-bit)
1. [Fluentd](logs/fluentd)
1. [Amazon Kinesis Firehose](logs/kinesis_firehose)
1. [Syslog](logs/syslog)
1. [Python](logs/python)
1. [Go](logs/go)
1. [Curl](logs/curl)


### APIs

129 changes: 58 additions & 71 deletions docs/user-guide/alerts/alerts.md
@@ -3,101 +3,88 @@ description: >-
Learn how alerting works in OpenObserve. Supports real-time and scheduled
alerts with thresholds, frequency, silence periods, and aggregation options.
---
# Alerts
## What are alerts?
Alerts automatically monitor your data streams and notify you when conditions are met. When predefined conditions trigger, alerts send notifications to your configured destinations.

Alerting provides a mechanism to notify users when certain conditions are met. OpenObserve supports both scheduled and real-time alerts. In most cases, use standard alerts: they are more efficient and cover most use cases.

Real-time alerts are useful when you want to be notified of a condition as soon as it occurs. They are best suited to scenarios such as a "panic" in a log line or a known malicious IP address appearing in logs. Real-time alerts are evaluated at ingestion time against the specified condition. Because they are evaluated per record, they can be computationally expensive.

## Concepts

The alert fields are defined as follows:

- **Threshold**: The value above or below which the alert triggers. For example, if the threshold is >100 and the query returns a value of 101, the alert triggers.
    - For Scheduled - Standard:
        - Threshold is measured against the number of records returned by the SQL query
    - For Scheduled - With aggregation:
        - The alert fires whenever the SQL query returns more than `0` records
    - For Scheduled - With SQL:
        - Threshold is measured against the number of records returned by the SQL query

- **Period**: The time range the query covers. For example, a period of 10 minutes means that each run uses the last 10 minutes of data. If the query runs at 4:00 PM, it uses the data from 3:50 PM to 4:00 PM.

- **Frequency**: How often the alert is evaluated. For example, a frequency of 2 minutes means that the query runs every 2 minutes and is evaluated against the configured parameters.

- **Silence notification for**: How long to wait after the alert triggers before sending another notification. For example, if the alert triggers at 4:00 PM and the silence period is 10 minutes, no further notification is sent until 4:10 PM, even if the alert condition is still met 1 minute later. This avoids flooding the user with notifications.

- **Aggregation**: The aggregation function used by the query. For example, if the query is `SELECT COUNT(*) FROM table`, the aggregation function is `COUNT`. If the query is `SELECT AVG(column) FROM table`, the aggregation function is `AVG`.

OpenObserve supports the following kinds of alerts:

## Standard alerts

Standard alerts evaluate the alert condition at a set frequency (every 1 minute by default) over the duration specified in the alert. If the condition evaluates to true, a notification is sent to the alert destination. Additionally, after a notification is sent, the user can delay further notifications for a specified duration.

For example:

> A user wants to be notified if error code 500 occurs more than 15 times within 2 minutes, and wants this evaluation to run every 1 minute.

Watch this video to understand more.

<iframe width="760" height="315" src="https://www.youtube.com/embed/9F0jZ_mZSlo?si=Yrlr4E6tFbD50g3h" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

### Scheduled - Standard

You can configure a condition, which is converted to a SQL query and executed at the specified frequency.

We can configure the alert like this:

**Threshold** is measured against the number of records returned by the SQL query
---

![Standard alert](../../images/alerts/standard_alert.png)
## Alert types
There are two types of alerts in OpenObserve: Real-time and Scheduled.

The above alert configuration will result in the following SQL query (It's simplified for understanding):
- **Real-time alerts**: These monitor data continuously and trigger as soon as conditions are met. Use them for critical events that require immediate action. For example, when the error count exceeds 10, the alert sends a notification to the destination within seconds.
- **Scheduled alerts**: These run at fixed intervals to evaluate aggregated or historical data. Use them for routine monitoring and trend analysis. For example, every hour the alert evaluates your data and checks whether the average response time exceeds 500 ms. If the condition is met, the alert sends a notification.

```sql
select count(*) from default where severity = 'INFO'
```
---

The above query runs every 1 minute over the last 2 minutes of data. If count(*) > 3 (the threshold), the alert triggers. Additionally, the alert does not send another notification for 10 minutes after the first notification is sent.
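The interaction between frequency, period, threshold, and silence can be sketched as a small simulation. This is only an illustration of the behavior described above, not OpenObserve's scheduler: the function name `run_alert`, the in-memory list of event timestamps, and the half-open window boundaries are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from typing import List

# Values taken from the example above; the names are hypothetical.
FREQUENCY = timedelta(minutes=1)   # how often the query runs
PERIOD = timedelta(minutes=2)      # how much data each run looks at
THRESHOLD = 3                      # alert triggers when count > THRESHOLD
SILENCE = timedelta(minutes=10)    # minimum gap between notifications


def run_alert(event_times: List[datetime], start: datetime, end: datetime) -> List[datetime]:
    """Return the run times at which a notification would be sent."""
    notifications: List[datetime] = []
    last_sent = None
    now = start
    while now <= end:
        # Assumption: the window is [now - PERIOD, now), i.e. half-open.
        window_start = now - PERIOD
        count = sum(1 for t in event_times if window_start <= t < now)
        silenced = last_sent is not None and now - last_sent < SILENCE
        if count > THRESHOLD and not silenced:
            notifications.append(now)
            last_sent = now
        now += FREQUENCY
    return notifications


base = datetime(2024, 1, 1, 16, 0)
# Three bursts of five matching records, starting at 16:00, 16:05, and 16:12.
events = [
    base + timedelta(minutes=m, seconds=10 * i)
    for m in (0, 5, 12)
    for i in range(5)
]
sent = run_alert(events, base, base + timedelta(minutes=15))
print([t.strftime("%H:%M") for t in sent])
```

In this sketch the burst at 16:05 crosses the threshold but produces no notification, because it falls inside the 10-minute silence period that started with the first notification.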
## Core components
### Stream
The data source being monitored.

### Scheduled - With aggregation
### Period
Time window evaluated per alert run. If period is 30 minutes, the alert evaluates the last 30 minutes of data each run.

We fire when record count > 0
### Frequency
How often the alert evaluation runs. Real-time alerts run continuously; scheduled alerts run as per the configured frequency.

![Standard alert](../../images/alerts/aggregation_alert.png)
### Condition: Quick mode and SQL mode
Conditions determine when the alert fires.
OpenObserve supports two modes:

??? note "Quick mode (Real-time and scheduled alerts)"
Build conditions using the UI. Combine fields, operators, and values with `OR` or `AND` logic. Group conditions for complex nested logic.

### Scheduled - with SQL
- **Logic**:

- `OR`: Fire if ANY condition is true
- `AND`: Fire only if ALL conditions are true
- **Example**: `error_count > 100` OR `response_time > 500`
- **Operators**: `>` (greater than), `<` (less than), `==` (equal to)
- **Groups**: Nest multiple conditions for complex scenarios
- **Groups example**: `(error_count > 100 AND response_time > 500) OR status == "critical"`: This alert condition fires when BOTH error count AND response time are high, OR when status is critical
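The grouping logic above can be sketched as a small recursive evaluator. This is a minimal illustration of how nested `AND`/`OR` groups combine, not how OpenObserve stores or evaluates conditions internally; the `Condition` and `Group` classes and the `evaluate` function are hypothetical names introduced for the example.

```python
import operator
from dataclasses import dataclass
from typing import Any, Dict, List, Union

# The three operators listed above, mapped to Python comparisons.
OPERATORS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


@dataclass
class Condition:
    field: str
    op: str
    value: Any


@dataclass
class Group:
    logic: str  # "AND" or "OR"
    items: List[Union["Condition", "Group"]]


def evaluate(node: Union[Condition, Group], record: Dict[str, Any]) -> bool:
    """Evaluate a condition or a (possibly nested) group against one record."""
    if isinstance(node, Condition):
        return OPERATORS[node.op](record[node.field], node.value)
    results = (evaluate(item, record) for item in node.items)
    return all(results) if node.logic == "AND" else any(results)


# (error_count > 100 AND response_time > 500) OR status == "critical"
rule = Group("OR", [
    Group("AND", [
        Condition("error_count", ">", 100),
        Condition("response_time", ">", 500),
    ]),
    Condition("status", "==", "critical"),
])

print(evaluate(rule, {"error_count": 150, "response_time": 600, "status": "ok"}))
print(evaluate(rule, {"error_count": 150, "response_time": 100, "status": "ok"}))
print(evaluate(rule, {"error_count": 0, "response_time": 0, "status": "critical"}))
```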

Threshold = number of records returned
??? note "SQL mode (Scheduled alerts only)"
Write custom SQL queries and VRL logic to define precise trigger conditions. Useful for complex filtering, aggregations, and multi-window comparisons.

### Scheduled - with PromQL
- Requires knowledge of SQL and VRL
- Enables advanced workflows with multi-window analysis

TODO
### Destination
Where alerts are sent. Choose one or combine multiple:

- **Email**: Send to team members or distribution lists. Requires SMTP configuration.
- **Webhook**: Send to external systems via HTTP. Integrates with Slack, Microsoft Teams, Jira, ServiceNow, and more.
- **Actions**: Execute custom Python scripts. This is the most flexible option: a script can send to Slack and log to a stream simultaneously. Supports stateful workflows.

## Real time alerts
### Silence Period
The silence period prevents duplicate notifications by temporarily pausing alert triggers after firing.

Real-time alerts are evaluated at ingestion time against the specified condition. They are evaluated per record.
---

For example:
## Multi-window alerts
Compare current data against historical data to detect anomalies and trends.
Raw numbers alone cannot reveal trends. Multi-window alerts provide context by comparing current results with past data to detect anomalies and performance shifts. <br>
For example, 200 errors in 30 minutes is critical if you normally see 50, but normal if you typically see 180-210.

> A user wants to be notified when API response time exceeds 100 ms.
### Workflow:

<kbd>
![Real Time Alert](../../images/alerts_realtime.png)
</kbd>
1. **Set up windows**: Define current window (time period to monitor) and reference windows (historical periods to compare)
2. **Write SQL**: Query data for all windows
3. **Write VRL logic**: Compare results and calculate differences
4. **Set threshold**: Alert triggers if comparison exceeds your condition
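The comparison step of this workflow can be illustrated outside OpenObserve. In the product, the query is written in SQL and the comparison in VRL; the Python below only sketches the arithmetic, assuming the per-window counts have already been computed. The function names and the 50% threshold are hypothetical.

```python
from statistics import mean
from typing import Sequence


def percent_change(current: float, references: Sequence[float]) -> float:
    """Percentage difference between the current window and the reference average."""
    baseline = mean(references)
    if baseline == 0:
        raise ValueError("reference windows have no data to compare against")
    return (current - baseline) / baseline * 100.0


def should_alert(current: float, references: Sequence[float], threshold_pct: float) -> bool:
    """Trigger when the current window exceeds the baseline by more than threshold_pct."""
    return percent_change(current, references) > threshold_pct


# 200 errors in the current window, compared against two different histories.
print(should_alert(200, [50, 50, 50], threshold_pct=50))     # baseline of 50
print(should_alert(200, [180, 195, 210], threshold_pct=50))  # baseline of 195
```

With a baseline of 50, the same 200 errors are a 300% increase and trigger the alert. With a baseline of 195, they are within a few percent of normal and do not.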

Note that we selected the `Slack` destination for this demo, but you can add others under `Alert destination`.
### SQL and VRL editor
After configuring windows, navigate to **Conditions** > **SQL mode** > **View Editor**. Use the **SQL Editor** to query data and **VRL Editor** to process results. Run queries to see output, apply VRL to see combined results, then set your alert condition.

Watch this video to understand more.
### Use cases
- **Spike detection**: Detect sudden increases in error counts by comparing the current window with a previous time period.
- **Performance degradation**: Identify when average response times are trending upward compared to historical data.
- **Anomalous behavior**: Detect unusual activity patterns in user behavior, traffic, or system performance that deviate from expected norms.

<iframe width="760" height="315" src="https://www.youtube.com/embed/QvgyHU3_wME?si=xv03MHM6KoQo8pCm" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
---

## FAQ
**Q**: If I set the frequency to 5 minutes and the current time is 23:03, when will the next runs happen?
**A**: OpenObserve aligns the next run to the nearest upcoming time that is divisible by the frequency, starting from the top of the hour in the configured timezone. This ensures that all runs occur at consistent and predictable intervals.
**Question**: If I set the frequency to 5 minutes and the current time is 23:03, when will the next runs happen? <br>
**Answer**: OpenObserve aligns the next run to the nearest upcoming time that is divisible by the frequency, starting from the top of the hour in the configured timezone. This ensures that all runs occur at consistent and predictable intervals.<br>
**Example**<br>
If the current time is 23:03, here is when the next run will occur for different frequencies:
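The alignment rule in the answer can be sketched as follows. This is an illustration of the stated rule only, not OpenObserve's scheduler code: `next_run` is a hypothetical helper, it uses naive datetimes in place of the configured timezone, and it is restricted to minute frequencies that divide evenly into an hour, since the behavior for other frequencies is not described here.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, frequency_minutes: int) -> datetime:
    """Next run time aligned to multiples of the frequency from the top of the hour."""
    # Assumption: only frequencies that divide 60 are handled in this sketch.
    if frequency_minutes <= 0 or 60 % frequency_minutes != 0:
        raise ValueError("this sketch only handles frequencies that divide 60")
    top_of_hour = now.replace(minute=0, second=0, microsecond=0)
    step = timedelta(minutes=frequency_minutes)
    # Number of whole steps already elapsed this hour, plus one for the next slot.
    slots = (now - top_of_hour) // step + 1
    return top_of_hour + slots * step


now = datetime(2024, 1, 1, 23, 3)
for freq in (5, 10, 15, 30, 60):
    print(freq, next_run(now, freq).strftime("%H:%M"))
```

For a 5-minute frequency at 23:03, the sketch returns 23:05. For a 60-minute frequency it rolls over to 00:00 on the following day.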

@@ -27,7 +27,7 @@ Query your current cluster when you know the data is in your cluster or when you
4. Select one specific cluster from the **Region** dropdown.
5. Select **Run query**.

> For detailed explanation, see **Normal cluster query execution** in the [Federated Search Architecture](../federated-search/federated-search-architecture/) page.
> For detailed explanation, see **Normal cluster query execution** in the [Federated Search Architecture](https://openobserve.ai/docs/user-guide/federated-search/federated-search-architecture/) page.
<br>

**Result**<br>
@@ -54,7 +54,7 @@ Use federated search when you need data from multiple clusters.
4. Leave the **Region** dropdown unselected, or select multiple clusters.
5. Select **Run query**.

> For detailed explanation, see **Federated search for one different cluster** and **Federated search for multiple clusters** in the [Federated Search Architecture](../federated-search-architecture/) page.
> For detailed explanation, see **Federated search for one different cluster** and **Federated search for multiple clusters** in the [Federated search architecture](https://openobserve.ai/docs/user-guide/federated-search/federated-search-architecture/) page.
<br>

**Result**<br>
@@ -75,4 +75,4 @@ Use this quick reference to understand how region selection affects query execut

**Next step**

- [Federated Search Architecture](../federated-search-architecture/)
- [Federated search architecture](https://openobserve.ai/docs/user-guide/federated-search/federated-search-architecture/)
6 changes: 3 additions & 3 deletions docs/user-guide/federated-search/index.md
@@ -41,7 +41,7 @@ Before using federated search, understand these core concepts:

> **Important**: Querying your current cluster uses normal cluster query execution, not federated search architecture.

> For detailed technical explanations of deployment modes, architecture, and how queries execute, see the [Federated Search Architecture](../federated-search-architecture/) page.
> For detailed technical explanations of deployment modes, architecture, and how queries execute, see the [Federated search architecture](https://openobserve.ai/docs/user-guide/federated-search/federated-search-architecture/) page.

## When to use federated search

@@ -61,5 +61,5 @@ Before using federated search, understand these core concepts:

**Next steps**

- [How to Use Federated Search](../how-to-use-federated-search/)
- [Federated Search Architecture](../federated-search-architecture/)
- [How to use federated search](https://openobserve.ai/docs/user-guide/federated-search/how-to-use-federated-search/)
- [Federated search architecture](https://openobserve.ai/docs/user-guide/federated-search/federated-search-architecture/)