[DRAFT] feat: Add Apache Doris storage plugin #13271


Open
wants to merge 2 commits into
base: master
from

Conversation

xiongyan

This commit introduces a new storage plugin for Apache Doris.

The plugin allows SkyWalking to use Apache Doris as its backend storage solution. Key features include:

  • Configuration: You can configure Doris connection parameters (host, port, user, password, database) in the application.yml.
  • Client: A DorisClient using the MySQL JDBC driver (due to Doris's MySQL compatibility) handles communication with the Doris cluster.
  • DAOs: Initial implementations for core Data Access Objects (StorageDAO, IMetricsQueryDAO, ITraceQueryDAO) are provided to handle writing and reading data. Other DAOs will need further implementation.
  • Plugin Registration: The plugin is registered via Java's Service Provider Interface (SPI).
  • Unit Tests: Basic unit tests for the DorisClient and DorisMetricsQueryDAO are included, focusing on connection logic and SQL query generation.
  • Documentation: A new document (backend-doris-storage.md) explains how to configure and use the Doris storage plugin.

This plugin provides a new storage option for SkyWalking users, leveraging the capabilities of Apache Doris. Further work will be needed to fully implement all DAO methods and conduct comprehensive integration testing.
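
A minimal sketch of how a client like the `DorisClient` described above might derive its JDBC URL from the configured host, port, and database. The class `DorisJdbcUrlSketch` and its `build` method are hypothetical and are not taken from this PR. The sketch assumes only that Doris speaks the MySQL protocol, so a standard `jdbc:mysql://` URL applies, and that the configured port is the FE's MySQL-protocol query port (9030 by default).

```java
import java.util.Objects;

/** Hypothetical helper, not part of the PR: builds a MySQL-protocol JDBC URL for Doris. */
final class DorisJdbcUrlSketch {

    private DorisJdbcUrlSketch() {
    }

    /**
     * Builds a jdbc:mysql:// URL from the connection parameters listed in the PR
     * description (host, port, database). User and password are intentionally
     * left out of the URL; they would be passed to the driver separately.
     */
    static String build(String host, int port, String database) {
        Objects.requireNonNull(host, "host");
        Objects.requireNonNull(database, "database");
        if (host.isEmpty() || database.isEmpty()) {
            throw new IllegalArgumentException("host and database must not be empty");
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return "jdbc:mysql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Example values only; 9030 is assumed to be the FE query port.
        System.out.println(build("127.0.0.1", 9030, "skywalking"));
    }
}
```

Validating the parameters up front keeps a misconfigured `application.yml` from surfacing later as an opaque driver error.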

  • If this pull request closes/resolves/fixes an existing issue, replace the issue number. Closes #.
  • Update the CHANGES log.

@wu-sheng added the TBD, backend, and plugin labels May 28, 2025
@wu-sheng
Member

Hi, first of all, welcome to the contribution.

As this is a very heavy plugin for the maintainer team, we require a SWIP to review.
We need a very detailed explanation about

  1. How Doris' data structure supports SkyWalking better as a new storage plugin
  2. How Doris covers all features, including metadata, metrics, logs, traces, with different features and scales
  3. Please provide the benchmark of this new storage option to prove <1>
  4. Please provide real use case(s) about why Doris matters.

@wu-sheng
Member

About the code, I can clearly see many issues.
The plugin is not in the final tarball. I have concerns about whether this is a PoC version or a real, practical version of the product.

SkyWalking is currently used by many top-level companies, so we are cautious about accepting this kind of feature.

Please be patient. We will review your SWIP and will ask a lot about your solution. Please provide as many details and numbers as possible.

…ests

This commit significantly advances the Apache Doris storage plugin for SkyWalking.

Key changes include:

- Core DAO Implementations:
    - `DorisStorageDAO`: Refactored to correctly act as a DAO factory.
    - `DorisDAOUtils`: Introduced for generic JDBC operations.
    - `DorisMetricsDAO`: Implemented for writing metrics.
    - `DorisRecordDAO`: Implemented for writing records (segments, logs, alarms).
    - `DorisBatchDAO`: Implemented for batch database operations.
    - `DorisHistoryDeleteDAO`: Implemented for deleting historical data.
    - `DorisMetricsQueryDAO`: Full implementation for querying metrics.
    - `DorisTraceQueryDAO`: Full implementation for querying traces.

- Table Schemas:
    - Defined DDL scripts in `doris_schema.sql` for essential tables (metrics_all, segment, log_record, alarm_record).
    - Documentation updated to reference these schemas.

- Integration Tests:
    - Established `DorisStoragePluginITBase` using Testcontainers with an Apache Doris Docker image (`apache/doris:2.0.3`).
    - `DorisMetricsQueryAndWriteIT`: Tests the write/read cycle for metrics.
    - `DorisTraceQueryAndWriteIT`: Tests the write/read cycle for trace segments, including basic trace brief queries.

- Unit Tests & Documentation:
    - Existing unit tests for client and SQL generation.
    - Documentation for configuring and using the Doris plugin.

This provides a foundational, testable implementation of the Doris storage plugin, covering primary data paths for metrics and traces. Further work would involve implementing more specialized DAOs and expanding test coverage.
@morningman

Hi @xiongyan Nice work! Maybe I can help to improve this PR.

Member

@wu-sheng left a comment

As mentioned, we need your SWIP to be reviewed with design, benchmark, and use cases.

Before starting the SkyWalking OAP server with the Doris storage plugin enabled, you must create the necessary tables and schemas in your Doris database. SkyWalking does not automatically create these tables in Doris.

The Data Definition Language (DDL) scripts for creating these tables are provided in the SkyWalking distribution under the Doris storage plugin directory:
`oap-server/server-storage-plugin/storage-doris-plugin/src/main/resources/doris_schema.sql`.
Member

A static schema script doesn't work. SkyWalking metrics, traces, and logs have an internal ORM for the different storage options.
Schemas are created dynamically.

Author

So it needs to implement the internal ORM.

Member

All storage features are required to be implemented, including this one. Otherwise, when SkyWalking is extended through OAL, MAL, or LAL, this storage will fail, and users would have to read the code and add tables on their own.
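
To illustrate the point about dynamically created schemas, here is a minimal sketch that derives a Doris `CREATE TABLE` statement from a model description at runtime instead of reading it from a static script. The `Column` class, the `createTable` method, and the table and column names are hypothetical; they are not SkyWalking's model or table-installer API. Using the first column as both the duplicate key and the distribution column is an arbitrary choice made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: derive a Doris CREATE TABLE statement from a model at runtime. */
final class DynamicDdlSketch {

    /** Hypothetical column descriptor; not SkyWalking's real model class. */
    static final class Column {
        final String name;
        final String sqlType;

        Column(String name, String sqlType) {
            this.name = name;
            this.sqlType = sqlType;
        }
    }

    private DynamicDdlSketch() {
    }

    /**
     * Builds the DDL for one model. The first column serves as both the
     * duplicate key and the hash-distribution column (illustrative only).
     */
    static String createTable(String table, List<Column> columns, int buckets) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("a model needs at least one column");
        }
        List<String> defs = new ArrayList<>();
        for (Column c : columns) {
            defs.add("`" + c.name + "` " + c.sqlType);
        }
        String key = "`" + columns.get(0).name + "`";
        return "CREATE TABLE IF NOT EXISTS `" + table + "` ("
            + String.join(", ", defs) + ") "
            + "DUPLICATE KEY(" + key + ") "
            + "DISTRIBUTED BY HASH(" + key + ") BUCKETS " + buckets;
    }

    public static void main(String[] args) {
        // A model that an OAL/MAL/LAL extension might introduce at startup (made-up names).
        List<Column> model = Arrays.asList(
            new Column("entity_id", "VARCHAR(512)"),
            new Column("time_bucket", "BIGINT"),
            new Column("value", "BIGINT"));
        System.out.println(createTable("custom_metric", model, 10));
    }
}
```

With this shape, a metric added through an extension only needs a new model description; no user has to edit a schema file by hand.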

@xiaokang

Hi, @xiongyan @wu-sheng, Glad to see this PR and discussion.

I'm a PMC member of the Apache Doris community and focus on Doris for observability. I'd like to be involved and am glad to help.

Apache Doris is a modern OLAP database for real-time analytics. It delivers lightning-fast analytics on real-time data at scale. Doris is popular and its community is very active, with 13.4k GitHub stars, 676 contributors, and more than 5000 enterprise users.

One of its major use cases is logging and observability. It's already in the OpenTelemetry ecosystem as a storage backend. Doris provides the fast full-text search of Elasticsearch while keeping low cost and high-performance aggregation. Some of the outstanding features of Doris for observability are outlined here:

  1. INVERTED INDEX and full-text search, which is very useful in log and trace search.
  2. Easy to operate, due to automatic balancing when scaling and online rolling upgrades.
  3. Columnar storage with a high compression ratio, which reduces cost.
  4. High performance for search, aggregation, and JOIN queries.

@xiongyan
Author

xiongyan commented May 29, 2025

Hi, maybe Doris is optimal, though it needs to implement all interfaces. I will try to do it.

@wu-sheng
Member

Thank you for the general introduction to the project.
At this phase, let's focus on the SWIP, https://skywalking.apache.org/docs/main/next/en/swip/readme/. We need a SkyWalking x Doris architecture and feature adoption design doc.

Regarding use cases, Doris has a large number of end users. I have no doubt about that. My question is purely about this feature, using Doris as the SkyWalking storage database. This is critical to how we need to review and test/benchmark this feature.
Whether it is a default bundled storage plugin or code hosted in the main repo with every release, we are going to be very cautious. Two TSDBs were added and later removed because they did not fit well.

As this is a beginning, be patient. We need to do this slowly and clearly.

@morningman

Hi @xiongyan, this is my email: morningman.cmy@gmail.com, or you can subscribe to the Doris mailing list dev@doris.apache.org. Maybe we can discuss more details there and then come back here to draft a SWIP.

@xiongyan
Author

Here's a breakdown of how I would approach answering these questions, though it's important to note that providing definitive benchmarks and real use cases would require significant effort, including setting up test environments and potentially gathering data from actual deployments:

  1. How Doris' data structure supports SkyWalking better as a new storage plugin:

Current implementation: The current plugin I created uses Doris as a generic JDBC backend. This means it relies on standard SQL table structures (CREATE TABLE, standard data types like VARCHAR, INT, BIGINT) defined in the JDBCTableInstaller and its parent classes. Data is mapped to these relational tables.

Potential Doris-specific optimizations (advanced implementation):

  • Aggregating Merge Model: Doris's table models (Aggregate, Unique, Duplicate) could be leveraged. For SkyWalking metrics, which are often pre-aggregated, Doris's Aggregate model with SUM, MAX, MIN, REPLACE functions during data load could be highly beneficial. This could reduce storage and query time for aggregated metrics.
  • Rollups: Doris's materialized views (rollups) could be used to create pre-aggregated summaries of metrics or traces at different granularities (e.g., service-level, instance-level, endpoint-level metrics from raw trace data). This would significantly speed up common queries on the SkyWalking UI.
  • BITMAP Data Type: For high-cardinality dimensions (like user IDs, specific tags in traces, etc.), Doris's BITMAP type could be used for extremely fast distinct counting (e.g., unique users accessing an endpoint) and set operations (e.g., users who experienced error A AND error B). This is a common SkyWalking query pattern.
  • Partitioning and Bucketing: Doris's data partitioning (by time, as SkyWalking data is time-series) and bucketing (by service ID, instance ID) strategies would be crucial for query performance and data lifecycle management (TTL). The current JDBC plugin has some time-based sharding, but Doris's native capabilities might be more powerful.
  • Columnar Storage: Doris is a columnar MPP database. This is inherently good for the analytical queries SkyWalking performs (OLAP-style queries on traces and metrics).

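
To make the Aggregate-model idea concrete, the following is an in-memory illustration of the merge behavior described above: rows that share the same key collapse into one stored row, with one value column combined by SUM and another by MAX. It is plain Java that does not talk to Doris; the class and method names are hypothetical, and the key layout (service ID plus time bucket) is an assumption chosen to resemble a metrics table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** In-memory illustration of Aggregate-model merging; it does not talk to Doris. */
final class AggregateMergeSketch {

    /** Merged value columns for one key: a SUM column and a MAX column. */
    static final class Agg {
        long sum;
        long max = Long.MIN_VALUE;
    }

    private final Map<String, Agg> rows = new LinkedHashMap<>();

    /** Loads one row; rows sharing the same key collapse into a single stored row. */
    void load(String serviceId, long timeBucket, long latency) {
        String key = serviceId + "@" + timeBucket;
        Agg agg = rows.computeIfAbsent(key, k -> new Agg());
        agg.sum += latency;                      // SUM aggregation
        agg.max = Math.max(agg.max, latency);    // MAX aggregation
    }

    Agg get(String serviceId, long timeBucket) {
        return rows.get(serviceId + "@" + timeBucket);
    }

    int storedRows() {
        return rows.size();
    }

    public static void main(String[] args) {
        AggregateMergeSketch table = new AggregateMergeSketch();
        table.load("svc-a", 202505281200L, 120);
        table.load("svc-a", 202505281200L, 80);
        table.load("svc-b", 202505281200L, 40);
        Agg a = table.get("svc-a", 202505281200L);
        System.out.println(table.storedRows() + " rows, svc-a sum=" + a.sum + " max=" + a.max);
    }
}
```

Three loaded rows end up as two stored rows here, which is the storage and query-time saving the Aggregate model is expected to provide for pre-aggregated metrics.
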
  2. How Doris covers all features (metadata, metrics, logs, traces) with different features and scales:

Traces:

  • Storage: Individual spans can be stored in a "Duplicate" model table, partitioned by time and bucketed by trace ID or service ID.
  • Querying: Efficient querying by trace ID, service, endpoint, duration, tags. Columnar storage helps here.
  • Scalability: Doris's MPP architecture is designed for horizontal scalability to handle large volumes of trace data.

Metrics:

  • Storage: As mentioned, the "Aggregate" model is ideal. Metrics could be stored with dimensions (service, instance, endpoint, metric name) and value columns (sum, count, max, min).
  • Querying: Fast aggregation queries, time-window queries. Rollups would further enhance this.
  • Scalability: Handles high ingest rates and large metric volumes.

Logs:

  • Storage: "Duplicate" model table, partitioned by time and potentially bucketed by application/service. Text search capabilities in Doris (like LIKE or integration with full-text search engines if needed) would be relevant.
  • Querying: Filtering by keywords, timestamp, service, instance.
  • Scalability: Horizontal scaling for log volume.

Metadata:

  • Storage: Smaller tables, potentially "Unique" model for service names, endpoint names, instance properties to ensure no duplicates and fast lookups.
  • Querying: Fast lookups by ID or name.
  • Scalability: Generally less demanding than traces/metrics, but Doris handles it easily.

  3. Please provide the benchmark of this new storage option to prove <1> (Doris data structure benefits):

This is a significant undertaking. To provide meaningful benchmarks, I would need to:

  • Set up comparable SkyWalking instances with:
      • The current generic JDBC-Doris plugin.
      • An enhanced Doris plugin that uses specific features like the Aggregate model, BITMAPs, and rollups (this enhanced plugin would need to be developed first).
      • Another established storage plugin (e.g., Elasticsearch) as a baseline.
  • Generate a realistic and heavy load of traces, metrics, and logs using a tool like SkyWalking's data generator or a custom solution.
  • Define a set of representative SkyWalking queries (e.g., fetching traces for a specific service, querying aggregated metrics for a dashboard, searching logs).
  • Measure:
      • Ingest Throughput: How much data can be written per second.
      • Query Latency: How long different types of queries take.
      • Storage Footprint: How much disk space is used.
      • Resource Utilization: CPU, memory on both SkyWalking OAP and Doris nodes.

Without performing these steps, I can only theorize based on Doris's architecture, as I did in point 1. I cannot provide actual benchmark numbers right now.

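
As a rough sketch of the query-latency measurement listed above, the helper below times repeated runs of a task and reports a nearest-rank percentile. The class and method names are hypothetical, and the workload in `main` is a stand-in; a meaningful benchmark would run real storage queries, add warm-up iterations, and record throughput, storage footprint, and resource usage separately.

```java
import java.util.Arrays;

/** Hypothetical benchmark helper: times repeated runs and reports a latency percentile. */
final class LatencySketch {

    private LatencySketch() {
    }

    /** Runs the task the given number of times and returns each duration in nanoseconds. */
    static long[] time(Runnable task, int iterations) {
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    /** Nearest-rank percentile (0 < p <= 100) computed over a sorted copy of the samples. */
    static long percentile(long[] samples, double p) {
        if (samples.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need samples and 0 < p <= 100");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        // Stand-in workload; a real run would execute a storage query here.
        long[] samples = time(() -> Math.sqrt(12345.678), 1000);
        System.out.println("p50(ns)=" + percentile(samples, 50) + " p99(ns)=" + percentile(samples, 99));
    }
}
```

Reporting percentiles rather than averages keeps a few slow queries from being hidden, which matters when comparing storage backends.
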
  4. Please provide real use case(s) about why Doris matters.

  • Unified Analytics + Observability: Companies already using Doris for Business Intelligence (BI) or data warehousing could consolidate their observability data (traces, metrics, logs from SkyWalking) into the same Doris cluster. This allows them to:
      • Correlate business metrics (e.g., orders, revenue from BI tables) with application performance metrics (e.g., latency, error rates from SkyWalking tables) in a single system using SQL.
      • Reduce infrastructure complexity and operational overhead by managing one less database system.
  • Cost-Effective Large-Scale Storage: For organizations generating massive amounts of observability data, Doris's architecture, especially with its efficient aggregation and columnar storage, can offer a cost-effective solution for long-term retention and analysis compared to some other specialized time-series or search-engine based solutions.
  • SQL Familiarity for Complex Analysis: Teams familiar with SQL can leverage Doris's SQL interface to perform complex ad-hoc analysis on SkyWalking data, potentially going beyond what the standard SkyWalking UI offers. This is useful for deep-dive troubleshooting or custom reporting.
  • Performance for High Cardinality: As mentioned, Doris's BITMAP type can be a game-changer for environments with very high cardinality dimensions (e.g., tracking metrics per customer ID in a large SaaS application), where other databases might struggle with performance for distinct counts or filtering on such dimensions.

@wu-sheng
Member

Please note, in the SWIP, we will need to know how your theoretical points work in the SW storage plugin, and later in the code (PR).
If some features in Doris are beneficial as SW storage, please be clear about how you are using them. If you are using JDBC (pure SQL), you need to explain how it is better.
About the benchmark, I would say we will need that as a merging condition. Currently, JDBC (MySQL and PG), Elasticsearch (and OpenSearch), and BanyanDB have all been maintained by the PMC core team step by step for years, so as a new storage layer, you are facing equally high requirements for a high-performance implementation.
This could be hard for you, but it helps keep the team confident in accepting this feature.

Some previous posts related to new storage implementation, please read

@wu-sheng
Member

wu-sheng commented Jun 3, 2025

@xiongyan In the use case section of SWIP, please be clear which company drives you to write this one, and is going to use this. This is crucial to ensure the SkyWalking community's confidence in the product readiness of this feature.

Comment on lines +183 to +206
public IEventQueryDAO newEventQueryDAO() {
    LOGGER.warn("newEventQueryDAO not implemented yet, returning null.");
    return null;
}

public IBrowserLogQueryDAO newBrowserLogQueryDAO() {
    LOGGER.warn("newBrowserLogQueryDAO not implemented yet, returning null.");
    return null;
}

public ISpanAttachedEventQueryDAO newSpanAttachedEventQueryDAO() {
    LOGGER.warn("newSpanAttachedEventQueryDAO not implemented yet, returning null.");
    return null;
}

public IZipkinQueryDAO newZipkinQueryDAO() {
    LOGGER.warn("newZipkinQueryDAO not implemented yet, returning null.");
    return null;
}

public ITagAutoCompleteQueryDAO newTagAutoCompleteQueryDAO() {
    LOGGER.warn("newTagAutoCompleteQueryDAO not implemented yet, returning null.");
    return null;
}
Member

All these are returning NULL, so this PR is not completely ready. I am going to move this as DRAFT, and wait for your update.
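
While such DAOs remain unfinished, one possible way to make the gap explicit is to fail fast rather than return `null`, so a missing implementation surfaces at the factory call instead of as a later `NullPointerException`. The sketch below is an illustration only and was not proposed in this thread; `QueryDao` is a hypothetical stand-in for SkyWalking's real DAO interfaces, and `notImplemented` is an invented helper.

```java
/** Hypothetical sketch: fail fast for DAOs that are not implemented yet. */
final class FailFastDaoSketch {

    /** Stand-in for a SkyWalking query DAO interface; the real ones are not used here. */
    interface QueryDao {
    }

    private FailFastDaoSketch() {
    }

    /** Shared helper so every unfinished factory method reports the same way. */
    static <T> T notImplemented(String daoName) {
        throw new UnsupportedOperationException(
            daoName + " is not implemented by the Doris storage plugin yet");
    }

    /** Mirrors the shape of a factory method such as newEventQueryDAO() in the diff. */
    static QueryDao newEventQueryDao() {
        return notImplemented("IEventQueryDAO");
    }

    public static void main(String[] args) {
        try {
            newEventQueryDao();
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This does not make the plugin complete; it only changes how the incompleteness shows up at runtime.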

@wu-sheng changed the title from "feat: Add Apache Doris storage plugin" to "[DRAFT] feat: Add Apache Doris storage plugin" Jun 3, 2025
@wu-sheng
Member

Any update here or from the Doris community? I haven't seen any update for two weeks.

@wu-sheng added the "no update" label Jun 16, 2025
4 participants