2 changes: 2 additions & 0 deletions docs/environment-variables.md
@@ -333,6 +333,8 @@ OpenObserve is configured using the following environment variables.
| ZO_NATS_DELIVER_POLICY | all | Starting point in the stream for message delivery. Allowed values are all, last, new. |
| ZO_NATS_SUB_CAPACITY | 65535 | Maximum subscription capacity. |
| ZO_NATS_QUEUE_MAX_SIZE | 2048 | Maximum queue size in megabytes. |
| ZO_NATS_KV_WATCH_MODULES | - | Defines which internal modules use the NATS Key-Value Watcher instead of the default NATS Queue for event synchronization. Add one or more module prefixes separated by commas, such as /nodes/ or /user_sessions/. When left empty, all modules use the default NATS Queue mechanism. A sketch of how the prefix list is interpreted follows this table. |
| ZO_NATS_EVENT_STORAGE | memory | Controls how NATS JetStream stores event data. Use memory for high-speed, in-memory event storage or file for durable, disk-based storage that persists across restarts. <br> Performance Benchmark Results: <br> • File Storage: 10,965 ops/sec (10.71 MB/s throughput, ~911 µs mean latency) <br>• Memory Storage: 16,957 ops/sec (16.56 MB/s throughput, ~589 µs mean latency) <br> Memory storage offers ~55 percent higher throughput and lower latency, while file storage ensures durability. |
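
The prefix-matching behavior of `ZO_NATS_KV_WATCH_MODULES` can be pictured with a minimal Rust sketch. It is illustrative only and does not reflect OpenObserve's internals; the function names and example keys are hypothetical.

```rust
// Illustrative sketch only: how a comma-separated ZO_NATS_KV_WATCH_MODULES value
// could be parsed and matched against a key prefix. The function names and example
// keys below are hypothetical and are not part of OpenObserve's codebase.
fn kv_watch_modules() -> Vec<String> {
    std::env::var("ZO_NATS_KV_WATCH_MODULES")
        .unwrap_or_default()
        .split(',')
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

// Returns true when a key (for example "/nodes/node-1") matches one of the
// configured prefixes and should therefore be synchronized through the NATS
// Key-Value Watcher instead of the default NATS Queue.
fn uses_kv_watcher(key: &str, modules: &[String]) -> bool {
    modules.iter().any(|prefix| key.starts_with(prefix.as_str()))
}

fn main() {
    // With ZO_NATS_KV_WATCH_MODULES="/nodes/,/user_sessions/":
    let modules = kv_watch_modules();
    println!("{}", uses_kv_watcher("/nodes/node-1", &modules));   // true
    println!("{}", uses_kv_watcher("/alerts/alert-1", &modules)); // false -> default NATS Queue
}
```

With `ZO_NATS_KV_WATCH_MODULES=/nodes/,/user_sessions/`, keys under `/nodes/` or `/user_sessions/` would use the KV Watcher, while everything else continues to use the NATS Queue.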


## S3 and Object Storage
Binary file added docs/images/example-1-query-recommendations.png
Binary file added docs/images/example-2-query-recommendations.png
Binary file added docs/images/match-all-hash.png
Binary file modified docs/images/organization-in-openobserve.png
Binary file modified docs/images/organization-role-permission.png
Binary file added docs/images/select-query-recommendations.png
Binary file added docs/images/use-query-recommendations.png
@@ -59,4 +59,25 @@ When no local disk cache is available:
- The querier fetches the latest enrichment data from the metadata database, such as PostgreSQL, and the remote storage system, such as S3. It then provides the data to the restarting node.


## Region-based caching in multi-region super clusters
In a multi-region super cluster deployment, enrichment tables are typically queried from all regions when a node starts up and rebuilds its cache. While this ensures data completeness, it can slow startup or cause failures if one or more regions are unavailable.

To address this, OpenObserve Enterprise supports primary region–based caching, controlled by the environment variable `ZO_ENRICHMENT_TABLE_GET_REGION`.

### Requirements

- Available only in Enterprise Edition.
- Requires Super Cluster to be enabled.
- The `ZO_ENRICHMENT_TABLE_GET_REGION` variable must specify a valid region name.

### How it works
When a node starts, OpenObserve calls internal methods such as `get_enrichment_table_data()` and `cache_enrichment_tables()` to retrieve enrichment table data. <br>
The boolean parameter `apply_primary_region_if_specified` controls whether to use only the primary region for these fetch operations.

In a multi-region super cluster deployment, when `apply_primary_region_if_specified = true`, OpenObserve checks the value of `ZO_ENRICHMENT_TABLE_GET_REGION`:

- If `ZO_ENRICHMENT_TABLE_GET_REGION` specifies a primary region, the node queries only that region to fetch enrichment table data during cache initialization.
- If `ZO_ENRICHMENT_TABLE_GET_REGION` is not set, or the region name is empty, OpenObserve continues to query all regions as before.
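
The selection logic described above can be summarized with a minimal sketch. This is illustrative only, written as a standalone function under stated assumptions; the function name and region names are hypothetical and do not represent OpenObserve's actual implementation.

```rust
// Illustrative sketch only: the region-selection behavior described above.
fn regions_to_query(all_regions: Vec<String>, apply_primary_region_if_specified: bool) -> Vec<String> {
    // ZO_ENRICHMENT_TABLE_GET_REGION names the primary region, if any.
    let primary = std::env::var("ZO_ENRICHMENT_TABLE_GET_REGION").unwrap_or_default();
    if apply_primary_region_if_specified && !primary.is_empty() {
        // Query only the primary region during cache initialization.
        vec![primary]
    } else {
        // Variable unset or empty: query all regions, as before.
        all_regions
    }
}

fn main() {
    let all = vec!["us-east".to_string(), "us-west".to_string(), "eu-central".to_string()];
    // With ZO_ENRICHMENT_TABLE_GET_REGION="us-east", only ["us-east"] is queried.
    println!("{:?}", regions_to_query(all.clone(), true));
    // With apply_primary_region_if_specified = false, all regions are queried.
    println!("{:?}", regions_to_query(all, false));
}
```

Setting `ZO_ENRICHMENT_TABLE_GET_REGION` to an empty string therefore behaves the same as leaving it unset.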



72 changes: 36 additions & 36 deletions docs/user-guide/identity-and-access-management/organizations.md
@@ -11,48 +11,48 @@ Organizations provide logical boundaries for separating data, users, and access

![Organizations in OpenObserve](../../images/organization-in-openobserve.png)

## Organization Types
## Organization types

OpenObserve supports two types of organizations:

- **Default organization:** Automatically created for each user upon account creation. Typically named **default** and owned by the user. The UI labels it as type **default**.
- **Custom organization:** Any organization other than the **default**. These are created manually using the UI or ingestion (if enabled). Displayed in the UI as type **custom**.

!!! Info "What Is **_meta** Organization?"
**_meta Organization** is considered a **custom** organization. It is a system-level organization that exists in both single-node and multi-node (HA) deployments.

- The **_meta** organization provides visibility into the health and status of the OpenObserve instance, including node metrics, resource usage, and configuration across all organizations.
- Use **IAM > Roles > Permissions** in the **_meta** organization to manage users across all organizations and control who can list, create, update, or delete organizations.

## Access

In OpenObserve, access to organization-level operations, such as listing, creating, updating, or deleting organizations, depends on the deployment mode.

### Open-Source Mode
Any authenticated user can create new organizations using the Add Organization button in the UI.
### Enterprise Mode with RBAC Enabled
- Access to organization management is strictly controlled through RBAC, which must be configured in the _meta organization.
- The **root** user always has unrestricted access to all organizations, including **_meta**.
- Only roles defined in **_meta** can include permissions for managing organizations.
- The **organization** module is available in the role editor only within the **_meta** organization.

!!! Info "How to Grant Organization Management Access?"
To delegate organization management to users in enterprise mode:

1. Switch to the **_meta** organization.
2. Go to **IAM > Roles**.
3. Create a new role or edit an existing one.
4. In the **Permissions** tab, locate the Organizations module.
5. Select the required operations:

- **List**: View existing organizations
- **Create**: Add new organizations
- **Update**: Modify organization details
- **Delete**: Remove organizations
6. Click **Save**. <br>
![Grant Organization Management Access in OpenObserve](../../images/organization-role-permission.png)

Once this role is assigned to a user within the **_meta** organization, they will have access to manage organizations across the system.
### _meta organization
**_meta Organization** is considered a **custom** organization. It is a system-level organization that exists in both single-node and multi-node (HA) deployments.

- The **_meta** organization provides visibility into the health and status of the OpenObserve instance, including node metrics, resource usage, and configuration across all organizations.
- Use **IAM > Roles > Permissions** in the **_meta** organization to manage users across all organizations and control who can list, create, update, or delete organizations.

!!! note "Who can access"
## Who can access
In OpenObserve, access to organization-level operations, such as listing, creating, updating, or deleting organizations, depends on the deployment mode.

### Access in the open-source mode
Any authenticated user can create new organizations using the **Add Organization** button in the UI.
### Access in the enterprise mode with RBAC enabled
- Access to organization management is strictly controlled through RBAC, which must be configured in the **_meta** organization.
- The **root** user always has unrestricted access to all organizations, including **_meta**.
- Only roles defined in **_meta** can include permissions for managing organizations.
- The **organization** module is available in the role editor only within the **_meta** organization.

## How to grant organization management access?
To delegate organization management to users in enterprise mode:

1. Switch to the **_meta** organization.
2. Go to **IAM > Roles**.
3. Create a new role or edit an existing one.
4. In the **Permissions** tab, locate the Organizations module.
5. Select the required operations:

- **Create**: Add new organizations
- **Update**: Modify organization details
!!! note "Note"
By default, OpenObserve displays the list of organizations a user belongs to. You do not need to explicitly grant permission to view or retrieve organization details.
6. Click **Save**. <br>
![Grant Organization Management Access in OpenObserve](../../images/organization-role-permission.png)

Once this role is assigned to a user within the **_meta** organization, they will have access to manage organizations across the system.


## Create an Organization
2 changes: 2 additions & 0 deletions docs/user-guide/management/aggregation-cache.md
@@ -5,6 +5,8 @@ description: Learn how streaming aggregation works in OpenObserve Enterprise.
---
This page explains what streaming aggregation is and shows how to use it to improve query performance with aggregation cache in OpenObserve.

!!! info "Availability"
This feature is available in Enterprise Edition.

=== "Overview"

27 changes: 18 additions & 9 deletions docs/user-guide/management/sensitive-data-redaction.md
@@ -16,17 +16,14 @@ The **Sensitive Data Redaction** feature helps prevent accidental exposure of se
> **Note**: Use ingestion time redaction, hash, or drop when you want to ensure sensitive data is never stored on disk. This is the most secure option for compliance requirements, as the original sensitive data cannot be recovered once it is redacted, hashed, or dropped during ingestion.

- **Redact**: Sensitive data is masked before being stored on disk.
- **Hash**: Sensitive data is replaced with a **hash prefix** to protect the original data.
- **Hash**: Sensitive data is replaced with a [searchable](#search-hashed-values-using-match_all_hash) hash before being stored on disk.
- **Drop**: Sensitive data is removed before being stored on disk.

**Query time**
> **Note**: If you have already ingested sensitive data and it is stored on disk, you can use query time redaction or drop to protect it. This allows you to apply sensitive data redaction to existing data.

- **Redaction**: Sensitive data is read from disk but masked before results are displayed.
- **Hash**: Sensitive data is replaced with a hashed prefix during query evaluation, preserving correlation without revealing the value.
!!! note "Configure hash pattern length"
`ZO_RE_PATTERN_HASH_LENGTH` sets the number of hash characters kept for display and search.
Default 12. Allowed range 12 to 64.
- **Hash**: Sensitive data is read from disk but masked with a [searchable](#search-hashed-values-using-match_all_hash) hash before results are displayed.
- **Drop**: Sensitive data is read from disk but excluded from the query results.

!!! note "Where to find"
@@ -277,8 +274,8 @@ The following regex patterns are applied to the `message` field of the `pii_test
- Other fields remain intact.
- This demonstrates field-level drop at ingestion.

??? "Test 3: Hashed at ingestion time"
### Hashed at ingestion time
??? "Test 3: Hash at ingestion time"
### Hash at ingestion time
**Pattern Configuration**:
![config-hash-pattern-ingestion-time](../../images/config-hash-pattern-ingestion-time.png)

Expand Down Expand Up @@ -347,8 +344,8 @@ The following regex patterns are applied to the `message` field of the `pii_test
- The `message` field with the credit card details gets dropped in query results.
- This demonstrates field-level drop at query time.

??? "Test 6: Hashed at query time"
### Hashed at query time
??? "Test 6: Hash at query time"
### Hash at query time
**Pattern Configuration**:
![config-hash-pattern-query-time](../../images/config-hash-pattern-query-time.png)

@@ -365,6 +362,18 @@ The following regex patterns are applied to the `message` field of the `pii_test
6. Verify results:
![hash-at-query-time](../../images/hashed-at-query-time.png)

## Search hashed values using `match_all_hash`
The `match_all_hash` user-defined function (UDF) complements the SDR Hash feature. It allows you to search for logs that contain the hashed equivalent of a specific sensitive value.
When data is hashed using Sensitive Data Redaction, the original value is replaced with a deterministic hash. You can use `match_all_hash()` to find all records that contain the hashed token, even though the original value no longer exists in storage.
Example:
```sql
match_all_hash('4111-1111-1111-1111')
```
This query returns all records where the SDR Hash of the provided value exists in any field. In the example below, it retrieves the log entry containing `[REDACTED:907fe4882defa795fa74d530361d8bfb]`, the hashed version of the given card number.
![match-all-hash](../../images/match-all-hash.png)


## Limitations

19 changes: 4 additions & 15 deletions docs/user-guide/pipelines/pipelines.md
@@ -34,6 +34,9 @@ Use real-time pipelines when you need immediate processing, such as monitoring l
A scheduled pipeline automates the processing of historical data from an existing stream at user-defined intervals. This is useful when you need to extract, transform, and load (ETL) data at regular intervals without manual intervention.
![Scheduled Pipelines in OpenObserve](../../images/pipelines-new-%20scheduled.png)

!!! note "Performance"
OpenObserve maintains a cache for scheduled pipelines to prevent the alert manager from making unnecessary database calls. This cache becomes particularly beneficial when the number of scheduled pipelines is high. For example, with 500 scheduled pipelines, the cache eliminates 500 separate database queries each time the pipelines are triggered, significantly improving performance.

#### How they work

1. **Source**: To create a scheduled pipeline, you need an existing stream, which serves as the source stream.
@@ -44,7 +44,7 @@ A scheduled pipeline automates the processing of historical data from an existin
![Scheduled Pipelines Transform in OpenObserve](../../images/pipeline-new-scheduled-condition.png)
4. **Destination**: The transformed data is sent to the following destination(s) for storage or further processing:
- **Stream**: The supported destination stream types are Logs, Metrics, Traces, or Enrichment tables. <br>**Note**: Enrichment Tables can only be used as destination streams in scheduled pipelines.
- **Remote**: Select **Remote** if you wish to send data to an [external destination](#external-pipeline-destinations).
- **Remote**: Select **Remote** if you wish to send data to an [external destination](https://openobserve.ai/docs/user-guide/pipelines/remote-destination/).

#### Frequency and Period
The scheduled pipeline runs based on the user-defined **Frequency** and **Period**.
@@ -60,20 +63,6 @@ The scheduled pipeline runs based on the user-defined **Frequency** and **Period
#### When to use
Use scheduled pipelines for tasks that require processing at fixed intervals instead of continuously, such as generating periodic reports and processing historical data in batches.

## External Pipeline Destinations
OpenObserve allows you to route pipeline data to external destinations.

To configure an external destination for pipelines:

1. Navigate to the **Pipeline Destination** configuration page. You can access the configuration page while setting up the remote pipeline destination from the pipeline editor or directly from **Management** (Settings icon in the navigation menu) > **Pipeline Destinations** > **Add Destination**.
2. In the **Add Destination** form, provide a descriptive name for the external destination.
3. Under **URL**, specify the endpoint where the data should be sent.
4. Select the HTTP method based on your requirement.
5. Add headers for authentication. In the **Header** field, enter authentication-related details (e.g., Authorization). In the **Value** field, provide the corresponding authentication token.
6. Use the toggle **Skip TLS Verify** to enable or disable Transport Layer Security (TLS) verification. <br>
**Note**: Enable the **Skip TLS Verify** toggle to bypass security and certificate verification checks for the selected destination. Use with caution, as disabling verification may expose data to security risks. You may enable the toggle for development or testing environments, but this is not recommended for production unless absolutely necessary.
![Remote Destination](../../images/pipeline-new-remote-destination.png)


## Next Steps
- [Create and Use Pipelines](../use-pipelines/)