diff --git a/docs/faq-and-others/faq.md b/docs/faq-and-others/faq.md index bf536cae1..5c80b3085 100644 --- a/docs/faq-and-others/faq.md +++ b/docs/faq-and-others/faq.md @@ -1,206 +1,431 @@ --- -keywords: [performance, compatibility, features, use cases, drivers, Prometheus, Grafana, retention policy, schemaless, S3, integration, metrics, WAL, SQL, DataFusion] -description: Frequently Asked Questions about GreptimeDB, covering use cases, performance, compatibility, features, and more. +keywords: [unified observability, metrics, logs, traces, performance, OpenTelemetry, Prometheus, Grafana, cloud-native, SQL, PromQL] +description: Frequently Asked Questions about GreptimeDB - the unified observability database for metrics, logs, and traces. --- # Frequently Asked Questions +## Core Capabilities + +### What is GreptimeDB? + +GreptimeDB is an open-source, cloud-native unified observability database designed to store and analyze metrics, logs, and traces in a single system. Built with Rust for high performance, it offers: +- Up to 50x lower operational and storage costs +- Sub-second query responses on petabyte-scale datasets +- Native OpenTelemetry support +- SQL, PromQL, and stream processing capabilities +- Compute-storage separation for flexible scaling + ### How is GreptimeDB's performance compared to other solutions? -Please read [How is GreptimeDB's performance compared to other solutions](/user-guide/concepts/features-that-you-concern#how-is-greptimedbs-performance-compared-to-other-solutions). +GreptimeDB delivers superior performance across observability workloads: + +**Write Performance**: +- **2-4.7x faster** than Elasticsearch (up to 470% throughput) +- **1.5x faster** than Loki (121k vs 78k rows/s) +- **2x faster** than InfluxDB (250k-360k rows/s) +- **Matches ClickHouse** performance (111% throughput) + +**Query Performance**: +- **40-80x faster** than Loki for log queries +- **500x faster** for repeated queries (with caching) +- **2-11x faster** than InfluxDB for complex time-series queries +- Competitive with ClickHouse across different query patterns + +**Storage & Cost Efficiency**: +- **87% less storage** than Elasticsearch (12.7% footprint) +- **50% less storage** than ClickHouse +- **50% less storage** than Loki (3.03GB vs 6.59GB compressed) +- **Up to 50x lower** operational costs vs traditional stacks + +**Resource Optimization**: +- **40% less CPU** usage compared to previous versions +- **Lowest memory consumption** among tested databases +- Consistent performance on object storage (S3/GCS) +- Superior high-cardinality data handling -### How does this compare to Loki? Is there a crate with Rust bindings available, preferably as a tracing or logging subscriber? +**Unique Advantages**: +- Single database for metrics, logs, and traces +- Native cloud-native architecture +- Horizontal scalability (handles 1.15B+ rows) +- Full-text search with native indexing -GreptimeDB now supports log data types and has introduced compatibility with various industry protocols in version 0.10. These include Loki Remote Write, Vector plugins, and the full range of OTLP data types (Metrics, Traces, Logs). +Benchmark reports: [vs InfluxDB](https://greptime.com/blogs/2024-08-07-performance-benchmark) | [vs Loki](https://greptime.com/blogs/2025-08-07-beyond-loki-greptimedb-log-scenario-performance-report) | [Log Benchmark](https://greptime.com/blogs/2025-03-10-log-benchmark-greptimedb) -We plan to further refine the log engine, focusing on improving query performance and user experience. 
Future enhancements will include (but are not limited to) extending the functionality of GreptimeDB's log query DSL and implementing compatibility with some Elasticsearch/Loki APIs, providing users with more efficient and flexible log query capabilities. +### How does GreptimeDB handle metrics, logs, and traces? -For more information about using GreptimeDB with logs, refer to the documentation: +GreptimeDB is designed as a unified observability database that natively supports all three telemetry types: +- **Metrics**: Full Prometheus compatibility with PromQL support +- **Logs**: Full-text indexing, Loki protocol support, and efficient compression +- **Traces**: Experimental OpenTelemetry trace storage with scalable querying + +This unified approach eliminates data silos and enables cross-signal correlation without complex data pipelines. + +For detailed documentation: - [Log Overview](/user-guide/logs/overview.md) +- [Trace Overview](/user-guide/traces/overview.md) - [OpenTelemetry compatibility](/user-guide/ingest-data/for-observability/opentelemetry.md) +- [Prometheus compatibility](/user-guide/ingest-data/for-observability/prometheus.md) - [Loki protocol compatibility](/user-guide/ingest-data/for-observability/loki.md) +- [Elasticsearch compatibility](/user-guide/ingest-data/for-observability/elasticsearch.md) - [Vector compatibility](/user-guide/ingest-data/for-observability/vector.md) -### What would be the use cases for a time-series database? +### What are the main use cases for GreptimeDB? -Common use cases for time-series database include but are not limited to the following four scenarios: +GreptimeDB excels in: +- **Unified Observability**: Replace complex monitoring stacks with a single database +- **Edge and Cloud Data Management**: Seamless data synchronization across environments +- **IoT and Automotive**: Process high-volume sensor data efficiently +- **AI/LLM Monitoring**: Track model performance and behavior +- **Real-time Analytics**: Sub-second queries on petabyte-scale datasets -1. Monitor applications and infrastructure -2. Store and access IoT data -3. Process self-driving vehicle data -4. Understand financial trends +## Architecture & Performance -### Does GreptimeDB have a Go driver? +### Can GreptimeDB replace my Prometheus setup? -Yes, you can find our Go SDK [here](https://github.com/GreptimeTeam/greptimedb-ingester-go). +Yes, GreptimeDB provides: +- Native PromQL support with near 100% compatibility +- Prometheus remote write protocol support +- Efficient handling of high-cardinality metrics +- Long-term storage without downsampling +- Better resource efficiency than traditional Prometheus+Thanos stacks -### When will GreptimeDB release its first GA version? +### What indexing capabilities does GreptimeDB offer? -We expect to release the GA version this June. For detailed plans, please refer to: [GreptimeDB 2025 Roadmap Released!](https://greptime.com/blogs/2025-02-06-greptimedb-roadmap2025) +GreptimeDB provides rich indexing options: +- **Inverted indexes**: Fast lookups on tag columns +- **Full-text indexes**: Efficient log searching +- **Skipping indexes**: Accelerate range queries +- **Vector indexes**: Support for AI/ML workloads -### Are there any plans/works done for the official UI for GreptimeDB so that it would be possible to check cluster status, list of tables, statistics etc? +These indexes enable sub-second queries even on petabyte-scale datasets. -Yes, we open sourced the dashboard for users to query and visualize their data. 
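+
+A minimal sketch of how these indexes can be declared when creating a table (the table and column names are hypothetical, and the exact keywords may vary between versions; follow the Index Management guide linked below for the authoritative syntax):
+
+```sql
+-- Hypothetical log table; index keywords follow the current docs but may differ by version.
+CREATE TABLE IF NOT EXISTS app_logs (
+  ts         TIMESTAMP TIME INDEX,
+  host       STRING INVERTED INDEX,   -- fast lookups on a tag column
+  trace_id   STRING SKIPPING INDEX,   -- skips non-matching blocks for sparse, high-cardinality values
+  message    STRING FULLTEXT INDEX,   -- efficient log search
+  latency_ms DOUBLE,
+  PRIMARY KEY (host)
+);
+```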
+For configuration details, see [Index Management](/user-guide/manage-data/data-index.md). -Please check out our initial version on [GitHub Repo](https://github.com/GreptimeTeam/dashboard). +### How does GreptimeDB achieve cost efficiency? -### Can GreptimeDB be used as a Rust alternative to Prometheus in the observable area? +GreptimeDB reduces costs through: +- **Columnar storage**: Superior compression ratios +- **Compute-storage separation**: Independent scaling of resources +- **Efficient cardinality management**: Handles high-cardinality data without explosion +- **Unified platform**: Eliminates need for multiple specialized databases -GreptimeDB has implemented native support for PromQL, with over 90% compatibility that can cover most common usage requirement. We are keeping making it comparable to VictoriaMetrics. - -### Is GreptimeDB compatible with Grafana? +Result: Up to 50x lower operational and storage costs compared to traditional stacks. -Yes, It's compatible with Grafana. +### What makes GreptimeDB cloud-native? -GreptimeDB has an official Grafana plugin: [greptimedb-grafana-datasource](https://github.com/GreptimeTeam/greptimedb-grafana-datasource/) +GreptimeDB is purpose-built for Kubernetes with: +- **Disaggregated architecture**: Separate compute and storage layers +- **Elastic scaling**: Add/remove nodes based on workload +- **Multi-cloud support**: Run across AWS, GCP, Azure seamlessly +- **Kubernetes operators**: Simplified deployment and management +- **Object storage backend**: Use S3, GCS, or Azure Blob for data persistence -GreptimeDB also supports MySQL and PostgreSQL protocol, so you can use [MySQL or PG grafana plugin](https://grafana.com/docs/grafana/latest/datasources/mysql/) to config GreptimeDB as a datasource. Then you can use SQL to query the data. +For Kubernetes deployment details, see the [Kubernetes Deployment Guide](/user-guide/deployments-administration/deploy-on-kubernetes/overview.md). -Also, we are implementing PromQL natively which is frequently used with Grafana. +### Does GreptimeDB support schemaless data ingestion? -### How is the performance of GreptimeDB when used for non-time-series DB tables? +Yes, GreptimeDB supports automatic schema creation when using: +- gRPC protocol +- InfluxDB Line Protocol +- OpenTSDB protocol +- Prometheus Remote Write +- OpenTelemetry protocol +- Loki protocol (for log data) +- Elasticsearch-compatible APIs (for log data) -GreptimeDB supports SQL and can deal with non-time-series data, especially efficient for high concurrent and throughput data writing. However, we develop GreptimeDB for a specific domain (Iot and Observerbility scenarios), and it doesn't support transactions and can't delete data efficiently. +Tables and columns are created automatically on first write, eliminating manual schema management. -### Is there any retention policy? +## Integration & Compatibility -GreptimeDB supports both database-level and table-level TTLs. By default, a table inherits the TTL of its database. However, if a table is assigned a specific TTL, the table-level TTL takes precedence. For details, refer to the official documentation on TTL: [TTL Syntax Documentation](/reference/sql/create.md). +### What protocols and tools does GreptimeDB support? -### Where’s the name “Greptime” coming from? 
+GreptimeDB provides extensive compatibility: +- **Protocols**: OpenTelemetry, Prometheus Remote Write, InfluxDB Line, Loki, Elasticsearch, MySQL, PostgreSQL (see [Protocols Overview](/user-guide/protocols/overview.md)) +- **Query Languages**: SQL, PromQL +- **Visualization**: [Grafana integration](/user-guide/integrations/grafana.md), any MySQL/PostgreSQL compatible tool +- **Data Pipeline**: Vector, Fluent Bit, Telegraf, Kafka +- **SDKs**: Go, Java, Rust, Erlang, Python -Because `grep` is the most useful command line tool on \*nix platform to search data, and time means time series. So Greptime is to help everybody to search/find value in time series data. - -### Does GreptimeDB support schemaless? - -Yes, GreptimeDB is a schemaless database without need for creating tables in advance. The table and columns will be created automatically when writing data with protocol gRPC, InfluxDB Line Protocol, OpenTSDB, Prometheus Remote Write. +### Is GreptimeDB compatible with Grafana? -### Does GreptimeDB support dumping table-level data to S3? +Yes, GreptimeDB offers: +- [Grafana integration](/user-guide/integrations/grafana.md) with official plugin +- [MySQL/PostgreSQL protocol support](/user-guide/integrations/grafana.md#mysql-data-source) for standard Grafana data sources +- [Native PromQL](/user-guide/query-data/promql.md) for Prometheus-style queries +- SQL support for complex analytics -You can use the [`COPY TO` command](/reference/sql/copy.md#s3) to dump table-level data to S3. +### How does GreptimeDB integrate with OpenTelemetry? -### Can GreptimeDB be used for a large-scale internal metrics collection system similar to Fb's Gorilla or Google's Monarch, with a preference for in-memory data and high availability? Are there plans for asynchronous WAL or optional disk storage, and how is data replication handled without WAL? +GreptimeDB is OpenTelemetry-native: +- Direct OTLP ingestion for metrics, logs, and traces +- No translation layer or data loss +- Supports OpenTelemetry Collector and SDKs +- Preserves semantic conventions and resource attributes -GreptimeDB supports asynchronous WAL and is developing a per-table WAL toggle for more control. A tiered storage approach, starting with in-memory caching, is also in development. For data replication, data flushed to remote stores like S3 is replicated independently of WAL. For more about the details of tiered storage, please read the [blog](https://greptime.com/blogs/2025-03-26-greptimedb-storage-architecture). +### What SDKs are available for GreptimeDB? -### If I delete the database, can I use the `DROP DATABASE` command? +- **Go**: [greptimedb-ingester-go](https://github.com/GreptimeTeam/greptimedb-ingester-go) +- **Java**: [greptimedb-ingester-java](https://github.com/GreptimeTeam/greptimedb-ingester-java) +- **Rust**: [greptimedb-ingester-rust](https://github.com/GreptimeTeam/greptimedb-ingester-rust) +- **Erlang**: [greptimedb-ingester-erl](https://github.com/GreptimeTeam/greptimedb-ingester-erl) +- **Python**: Via SQL drivers (MySQL/PostgreSQL compatible) -Yes. You can refer to the official documentation for usage: [`Drop Database`](/reference/sql/drop.md#drop). +### How can I migrate from other databases to GreptimeDB? -### What are the main differences between Greptime and another time-series database built on DataFusion like InfluxDB? 
+GreptimeDB provides migration guides for popular databases: +- **From ClickHouse**: Table schema and data migration +- **From InfluxDB**: Line protocol and data migration +- **From Prometheus**: Remote write and historical data migration +- **From MySQL/PostgreSQL**: SQL-based migration -At GreptimeDB, we share some technical similarities with InfluxDB, both using Datafusion, Arrow, Parquet, and built on object storage. However, we differ in several key aspects: +For detailed migration instructions, see [Migration Overview](/user-guide/migrate-to-greptimedb/overview.md). -- **Open-Source Strategy**: Unlike InfluxDB, which only open-sources its standalone version, our entire distributed cluster version is open-source. Our architecture can even run on edge Android systems. -- **Distributed Architecture**: Our architecture is more aligned with HBase's Region/RegionServer design. Our Write-Ahead Log (WAL) uses Kafka, and we're exploring a quorum-based implementation in the future. -- **Workload and Services**: We specialize in handling various types of observability data—including metrics, logs, traces, and events—while seamlessly integrating them with analytical workloads. This integration aims to enhance resource efficiency and real-time performance for users. We also offer [GreptimeCloud](https://greptime.com/product/cloud), a commercial cloud service. -- **Storage Engine Design**: Our pluggable storage engine is versatile. For scenarios with many small data tables, like in Prometheus, we have a dedicated Metrics storage engine. -- **Query Language Support**: We support PromQL for observability and SQL for data analysis, and incorporate Python for complex data processing. InfluxDB, on the other hand, uses InfluxQL and SQL. +### What disaster recovery options does GreptimeDB provide? -We're a young, rapidly evolving project and always looking to improve. For more details, visit [our Blog](https://greptime.com/blogs/) and [Contributor Guide](/contributor-guide/overview). We welcome your interest and contributions! +GreptimeDB offers multiple disaster recovery strategies to meet different availability requirements: -### As a first-timer looking to contribute to GreptimeDB, where can I find a comprehensive guide to get started? +- **Standalone DR Solution**: Uses remote WAL and object storage, achieving RPO=0 and RTO in minutes for small-scale scenarios +- **Region Failover**: Automatic failover for individual regions with minimal downtime +- **Active-Active Failover** (Enterprise): Synchronous request replication between nodes for high availability +- **Cross-Region Single Cluster**: Spans three regions with zero RPO and region-level error tolerance +- **Backup and Restore**: Periodic data backups with configurable RPO based on backup frequency -Welcome! Please refer to our [contribution guide](https://github.com/GreptimeTeam/greptimedb/blob/main/CONTRIBUTING.md). For those new to GreptimeDB, we have a selected collection of [good first issues](https://github.com/GreptimeTeam/greptimedb/issues?q=is%3Aopen+is%3Aissue+label%3A%22Good+first+issue%22). Feel free to reach us in Slack channel anytime! +Choose the appropriate solution based on your availability requirements, deployment scale, and cost considerations. For detailed guidance, see [Disaster Recovery Overview](/user-guide/deployments-administration/disaster-recovery/overview.md). -### Does GreptimeDB have a way to handle absolute counters that can reset, like InfluxDB's non-negative differential? 
How do aggregations work with these counters, and is PromQL preferred over SQL for them? Also, is there a plan to integrate PromQL functions into SQL, similar to InfluxDB v3? +## Data Management & Processing -GreptimeDB, like Prometheus, handles counters effectively. Functions like` reset()`, `rate()`, or `delta()` in GreptimeDB are designed to automatically detect and adjust for counter resets. While it's not recommended to use the `deriv()` function on a counter since it's meant for gauges, you can apply `rate()` to your counter and then use `deriv()`. PromQL is indeed more suitable for operations involving counters, given its origin in Prometheus. However, we are exploring the integration of PromQL functions into SQL for greater flexibility. If you're interested in implementing functions into GreptimeDB, we have documentation available which you can check out: [Greptime Documentation](https://github.com/GreptimeTeam/greptimedb/blob/main/docs/how-to/how-to-write-aggregate-function.md). +### How does GreptimeDB handle data lifecycle? -### What are the feature differences between the open-source version and the cloud version of GreptimeDB? +**Retention Policies**: +- Database-level and table-level TTL settings +- Automatic data expiration without manual cleanup +- Configurable via [TTL Documentation](/reference/sql/create.md#table-options) -Below are some key points: +**Data Export**: +- [`COPY TO` command](/reference/sql/copy.md#connect-to-s3) for S3, local files +- Standard SQL queries via any compatible client +- Export functionality for backup and disaster recovery: [Back up & Restore Data](/user-guide/deployments-administration/disaster-recovery/back-up-&-restore-data.md) -- **Foundational Features**: The foundational features, including the ingestion protocol, SQL capabilities, and storage functions, are largely identical between the two versions. However, GreptimeCloud offers advanced SQL functions and additional features. -- **Fully Managed Service**: GreptimeCloud is a fully managed service that supports multi-tenancy, data encryption, and security audits for compliance, which are not available in the open-source version. GreptimeCloud supports dedicated deployments as well as a pay-as-you-go serverless model. -- **Enhanced Dashboard**: Another significant advantage of GreptimeCloud is its superior dashboard, which is more user-friendly and includes a unique Prometheus workbench. This workbench facilitates online editing of Prometheus dashboards and alert rules, as well as GitOps integration. +### How does GreptimeDB handle high-cardinality and real-time processing? -As mentioned, the cloud version offers more ready-to-use features to help you get started quickly. The core features are almost identical, especially on our dedicated plan. +**High-Cardinality Management**: +- Advanced indexing strategies prevent cardinality explosion +- Columnar storage with intelligent compression +- Distributed query execution with data pruning +- Handles millions of unique time series efficiently -### Where can I find documentation related to on-premises deployment and performance benchmark reports? +Learn more about indexing: [Index Management](/user-guide/manage-data/data-index.md) -You can find the public TSBS benchmark results [here](https://github.com/GreptimeTeam/greptimedb/tree/main/docs/benchmarks/tsbs) and the deployment documentation [here](/getting-started/installation/overview.md). 
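+
+As an illustration, a typical high-cardinality lookup combines a tag filter with a time-range filter so that the indexes and time-based pruning can skip most of the stored data (the `app_logs` table and its columns here are hypothetical):
+
+```sql
+-- Narrow to one series via the indexed tag, then to a recent window via the time index.
+SELECT ts, latency_ms
+FROM app_logs
+WHERE host = 'web-01'
+  AND ts >= now() - INTERVAL '1 hour'
+ORDER BY ts DESC
+LIMIT 100;
+```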
+**Real-Time Processing**: +- **[Flow Engine](/user-guide/flow-computation/overview.md)**: Real-time stream processing system that enables continuous, incremental computation on streaming data with automatic result table updates +- **[Pipeline](/user-guide/logs/pipeline-config.md)**: Data parsing and transformation mechanism for processing incoming data in real-time, with configurable processors for field extraction and data type conversion across multiple data formats +- **Output Tables**: Persist processed results for analysis -For more about performance reports please read [How is GreptimeDB's performance compared to other solutions?](/user-guide/concepts/features-that-you-concern.md#how-is-greptimedbs-performance-compared-to-other-solutions) +### What are GreptimeDB's scalability characteristics? -### What should I do if the region becomes `DOWNGRADED` and the tables on that node become read-only after the datanode restarts? Is there a way to automatically reactivate it? +**Scale Limits**: +- No strict limitations on table or column count +- Hundreds of tables with minimal performance impact +- Performance scales with primary key design, not table count +- Column-oriented storage ensures efficient partial reads -According to your configuration, the failover in metasrv, which may mark the region as `DOWNGRADED`, is disabled. Another procedure that may mark a region as `DOWNGRADED` is the region migration procedure. Please try running the region migration procedure and provide feedback for further assistance. +**Partitioning & Distribution**: +- Automatic time-based organization within regions +- Manual distributed sharding via PARTITION clause (see [Table Sharding Guide](/user-guide/deployments-administration/manage-data/table-sharding.md)) +- Automatic region splitting planned for future releases +- **Dynamic partitioning without configuration** (Enterprise feature) -### Is there a guide or suggestions for compiling GreptimeDB into a standalone binary with only the necessary modules and features for an embedded environment? +**Core Scalability Features**: +- **Multi-tiered caching**: Write cache (disk-backed) and read cache (LRU policy) for optimal performance +- **Object storage backend**: Virtually unlimited storage via S3/GCS/Azure Blob +- **Asynchronous WAL**: Efficient write-ahead logging with optional per-table controls +- **Distributed query execution**: Multi-node coordination for large datasets +- **Manual Compaction**: Available via [admin commands](/reference/sql/admin.md) -We have prebuilt binaries for Android ARM64 platforms, which have been successfully used in some enterprise projects. However, these binaries are not available for bare metal devices, as some fundamental components of GreptimeDB require a standard library. +**Enterprise Scale Features**: +- Advanced partitioning and automatic rebalancing +- Enhanced multi-tenancy and isolation +- Enterprise-grade monitoring and management tools -### Is there a built-in SQL command like `compaction table t1` that can be used for manual compaction? +For architecture details, see the [storage architecture blog](https://greptime.com/blogs/2025-03-26-greptimedb-storage-architecture). -Please refer [here](/reference/sql/admin.md). +### What are GreptimeDB's design trade-offs? -### Can GreptimeDB be used to store logs? 
+GreptimeDB is optimized for observability workloads with intentional limitations: +- **No ACID transactions**: Prioritizes high-throughput writes over transactional consistency +- **Limited delete operations**: Designed for append-heavy observability data +- **Time-series focused**: Optimized for IoT, metrics, logs, and traces rather than general OLTP +- **Simplified joins**: Optimized for time-series queries over complex relational operations -Yes, please refer [here](/user-guide/logs/overview.md ) for detailed information. +## Deployment & Operations -### How is the query performance for non-primary key fields? Can inverted indexes be set? Will the storage cost be lower compared to Elasticsearch? +### What are the deployment options for GreptimeDB? -Non-primary key fields can also have indexes created. For details, refer to [Index Management](/user-guide/manage-data/data-index.md). +**Cluster Deployment** (Production): +- Minimum 3 nodes for high availability +- Services: metasrv, frontend, and datanode on each node +- Can separate services for larger scale deployments +- See [Capacity Planning Guide](/user-guide/deployments-administration/capacity-plan.md) -GreptimeDB's storage cost is significantly lower than Elasticsearch, with the size of persisted data being only 50% of ClickHouse and 12.7% of Elasticsearch. For more information, see the [Performance Benchmark](https://greptime.com/blogs/2025-03-10-log-benchmark-greptimedb). +**Edge & Standalone**: +- Android ARM64 platforms (prebuilt binaries available) +- Raspberry Pi and constrained environments +- Single-node mode for development/testing +- Efficient resource usage for IoT scenarios -### Is the Log-Structured Merge-Tree engine similar to Kafka's engine model? +**Storage Backends**: +- **Production**: S3, GCS, Azure Blob for data persistence +- **Development**: Local storage for testing +- **Metadata**: MySQL/PostgreSQL backend support for metasrv -From a technical storage perspective, they are similar. However, the actual data formats differ: GreptimeDB reads and writes Parquet format, while Kafka uses its own RecordBatch format. To analyze time-series data temporarily stored in Kafka, it needs to be written into GreptimeDB first. +For deployment and administration details: [Deployments & Administration Overview](/user-guide/deployments-administration/overview.md) -You can use Vector to consume Kafka messages and write them into GreptimeDB. For more details, refer to [this article](https://greptime.com/blogs/2024-10-29-ingest-data-using-vector). +### How does data distribution work? -### Are there limitations on the number of tables or columns in GreptimeDB? Does having many columns affect read and write performance? +**Current State**: +- Manual partitioning via PARTITION clause during table creation (see [Table Sharding Guide](/user-guide/deployments-administration/manage-data/table-sharding.md)) +- Time-based automatic organization within regions +- Manual region migration support for load balancing (see [Region Migration Guide](/user-guide/deployments-administration/manage-data/region-migration.md)) +- Automatic region failover for disaster recovery (see [Region Failover](/user-guide/deployments-administration/manage-data/region-failover.md)) -Generally, there are no strict limitations. With a few hundred tables, as long as there aren't many primary key columns, the impact on write performance is minimal (measured by the number of points written per second, not rows). 
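+
+A minimal sketch of the manual PARTITION clause mentioned above, assuming a hypothetical `sensor_readings` table with made-up split points (see the Table Sharding Guide for the authoritative syntax):
+
+```sql
+-- Three partitions on device_id; the rules must cover all values without overlapping.
+CREATE TABLE IF NOT EXISTS sensor_readings (
+  ts          TIMESTAMP TIME INDEX,
+  device_id   STRING,
+  area        STRING,
+  temperature DOUBLE,
+  PRIMARY KEY (device_id, area)
+)
+PARTITION ON COLUMNS (device_id) (
+  device_id < '100000',
+  device_id >= '100000' AND device_id < '200000',
+  device_id >= '200000'
+);
+```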
+**Roadmap**: +- Automatic region splitting and rebalancing +- Dynamic workload distribution across nodes -Similarly, for reads, if queries only involve a subset of columns, the memory and computational load will not be significantly high. +### How do I monitor and troubleshoot GreptimeDB? + +GreptimeDB provides comprehensive monitoring capabilities including metrics collection, health checks, and observability integrations. For detailed monitoring setup and troubleshooting guides, see the [Monitoring Overview](/user-guide/deployments-administration/monitoring/overview.md). + +## Open Source vs Enterprise vs Cloud -### How many servers are generally needed to set up a reliable GreptimeDB cluster, and how should Frontend, Datanode, and Metasrv be deployed? Should each node run all three services regardless of the number of nodes? +### What are the differences between GreptimeDB versions? -A minimum of 3 nodes is required, with each node running the 3 services: metasrv, frontend, and datanode. However, the exact number of nodes depends on the scale of data being handled. +**Open Source Version**: +- High-performance ingestion and query capabilities +- Cluster deployment with basic read-write separation +- Multiple protocol support (OpenTelemetry, Prometheus, InfluxDB, etc.) +- Basic authentication and access control +- Basic data encryption +- Community support -It is not necessary to deploy all three services on each node. A small-sized cluster can be set up with 3 nodes dedicated to metasrv. Frontend and datanode can be deployed on equal nodes, with one container running two processes. +**Enterprise Version** (all Open Source features plus): +- Cost-based query optimizer for better performance +- Advanced read-write separation and active-active failover (see [Active-Active Failover](/enterprise/deployments-administration/disaster-recovery/dr-solution-based-on-active-active-failover.md)) +- Automatic scaling, indexing, and load balancing +- Layered caching and enterprise-level web console +- Enterprise authorization (RBAC/LDAP integration) +- Enhanced security and audit features +- One-on-one technical support with 24/7 service response +- Professional customization services -For more general advice for deployment, please read [Capacity Plan](/user-guide/deployments-administration/capacity-plan.md). +**GreptimeCloud** (fully managed, all Enterprise features plus): +- Serverless autoscaling with pay-as-you-go pricing +- Fully managed deployment with seamless upgrades +- Independent resource pools with isolated networks +- Configurable read/write capacity and unlimited storage +- Advanced dashboard with Prometheus workbench +- SLA guarantees and automated disaster recovery + +For detailed comparison, see [Pricing & Features](https://greptime.com/pricing#differences). + +### What security features are available? + +**Open Source**: +- Basic username/password authentication +- TLS/SSL support for connections -### In the latest version, does the Flow Engine (pre-computation) feature support PromQL syntax for calculations? +**Enterprise/Cloud**: +- Role-based access control (RBAC) +- Team management and API keys +- Data encryption at rest +- Audit logging for compliance -This is a good suggestion. Currently, the Flow Engine does not support PromQL syntax for calculations. We will evaluate this, as it seems theoretically feasible. +## Technical Details -### Will Metasrv support storage backends like MySQL or PostgreSQL? +### How does GreptimeDB extend Apache DataFusion? 
-The latest version of GreptimeDB now supports PostgreSQL as the storage backend for Metasrv. For details, please refer to [here](/user-guide/deployments-administration/configuration.md#metasrv-only-configuration). +GreptimeDB builds on DataFusion with: +- **Query Languages**: Native PromQL alongside SQL +- **Distributed Execution**: Multi-node query coordination +- **Custom Functions**: Time-series specific UDFs/UDAFs +- **Optimizations**: Rules tailored for observability workloads +- **Counter Handling**: Automatic reset detection in `rate()` and `delta()` functions -### What is the best way to downsample interface traffic rates (maximum rate within every hour) from multiple NICs(network interface controller) across thousands of computers every 30 seconds, so that the data can be kept for many years? +For custom function development: [Function Documentation](https://github.com/GreptimeTeam/greptimedb/blob/main/docs/how-to/how-to-write-aggregate-function.md) -Using a flow table is the appropriate tool for this task. A simple flow task should suffice. The output of a flow task is stored in a normal table, allowing it to be kept indefinitely. +### What's the difference between GreptimeDB and InfluxDB? -### Can GreptimeDB create dynamic day partitions? +Key differences: +- **Open Source**: GreptimeDB's entire distributed system is fully open source +- **Architecture**: Region-based design optimized for observability workloads +- **Query Languages**: SQL + PromQL vs InfluxQL + SQL +- **Unified Model**: Native support for metrics, logs, and traces in one system +- **Storage**: Pluggable engines with dedicated optimizations +- **Cloud-Native**: Built for Kubernetes with disaggregated compute/storage (see [Kubernetes Deployment Guide](/user-guide/deployments-administration/deploy-on-kubernetes/overview.md)) + +For detailed comparisons, see [GreptimeDB vs InfluxDB](https://greptime.com/compare/influxdb). Additional product comparisons (vs. ClickHouse, Loki, etc.) are available in the Resources menu on our website. -Yes. Within a Region, time-series data is dynamically organized by time by default, without requiring any configuration. Please note that this is a different concept from the sharding of distributed tables. One refers to data organization within a shard(called region), while the other refers to the distributed partitioning of the data. +### How does GreptimeDB's storage engine work? -### Which parts of DataFusion are customized in GreptimeDB? +**LSM-Tree Architecture**: +- Based on Log-Structured Merge Tree (LSMT) design +- WAL can use local disk or distributed services (e.g., Kafka) via Log Store API +- SST files are flushed to object storage (S3/GCS) or local disk +- Designed for cloud-native environments with object storage as primary backend +- Optimized for time-series workloads with TWCS (Time-Window Compaction Strategy) -GreptimeDB customizes the following aspects of DataFusion: -- PromQL query support. -- Distributed query execution. -- Custom UDFs (User-Defined Functions) and UDAFs (User-Defined Aggregate Functions). 
-- Custom optimization rules +**Performance Considerations**: +- **Timestamps**: Datetime formats (yyyy-MM-dd HH:mm:ss) have no performance impact +- **Compression**: Measure only data directory; WAL is cyclically reused +- **Append-only tables**: Recommended for better write and query performance, especially for log scenarios +- **Flow Engine**: Currently SQL-based; PromQL support under evaluation -### Does the open-source version of GreptimeDB support fine-grained access control? +### What are best practices for specific use cases? -The open-source version supports basic username-password authentication only. Fine-grained access control like RBAC is available in the enterprise edition. +**Network Monitoring** (e.g., thousands of NICs): +- Use Flow tables for continuous aggregation +- Manual downsampling via Flow Engine for data reduction +- Output to regular tables for long-term storage -### Does writing TIMESTAMP values in datetime format affect query performance? +**Log Analytics**: +- Use append-only tables for better write and query performance +- Create indexes on frequently queried fields ([Index Management](/user-guide/manage-data/data-index.md)) +- Storage efficiency: 50% of ClickHouse, 12.7% of Elasticsearch -No, writing in datetime format (e.g., yyyy-MM-dd HH:mm:ss) does not affect query performance. The underlying storage format remains consistent. +**Table Design & Performance**: +- For table modeling guidance: [Design Table](/user-guide/deployments-administration/performance-tuning/design-table.md) +- For performance optimization: [Performance Tuning Tips](/user-guide/deployments-administration/performance-tuning/performance-tuning-tips.md) + + +## Getting Started + +### Where can I find documentation and benchmarks? + +**Performance & Benchmarks**: +- [TSBS Benchmarks](https://github.com/GreptimeTeam/greptimedb/tree/main/docs/benchmarks/tsbs) +- [Performance Comparisons](/user-guide/concepts/features-that-you-concern.md#how-is-greptimedbs-performance-compared-to-other-solutions) +- [vs InfluxDB](https://greptime.com/blogs/2024-08-07-performance-benchmark) +- [vs Loki](https://greptime.com/blogs/2025-08-07-beyond-loki-greptimedb-log-scenario-performance-report) +- [Log Benchmark](https://greptime.com/blogs/2025-03-10-log-benchmark-greptimedb) + +**Installation & Deployment**: +- [Installation Guide](/getting-started/installation/overview.md) +- [Capacity Planning](/user-guide/deployments-administration/capacity-plan.md) +- [Configuration Guide](/user-guide/deployments-administration/configuration.md) -### When assessing data compression, should I consider only the data directory size or include the wal directory? +### How can I contribute to GreptimeDB? -You only need to consider the data directory size. The WAL directory is cyclically reused and does not factor into data compression metrics. +Welcome to the community! Get started: +- **Code**: [Contribution Guide](https://github.com/GreptimeTeam/greptimedb/blob/main/CONTRIBUTING.md) +- **First Issues**: [Good First Issues](https://github.com/GreptimeTeam/greptimedb/issues?q=is%3Aopen+is%3Aissue+label%3A%22Good+first+issue%22) +- **Community**: [Slack Channel](https://greptime.com/slack) +- **Documentation**: [Help improve these docs!](https://github.com/GreptimeTeam/docs) -### In cluster mode, without using PARTITION during table creation, does data automatically balance across datanodes? +### What's next? -Currently, data does not automatically balance across datanodes without the PARTITION feature. 
This capability requires the implementation of region split and auto-rebalance, which is planned for versions v1.2 or v1.3. +1. **Try GreptimeCloud**: [Free serverless tier](https://greptime.com/product/cloud) +2. **Self-host**: Follow the [installation guide](/getting-started/installation/overview.md) +3. **Explore Integrations**: GreptimeDB supports extensive integrations with Prometheus, Vector, Kafka, Telegraf, EMQX, Metabase, and many more. See [Integrations Overview](/user-guide/integrations/overview.md) for the complete list, or start with [OpenTelemetry](/user-guide/ingest-data/for-observability/opentelemetry.md) or [Prometheus](/user-guide/ingest-data/for-observability/prometheus.md) +4. **Join Community**: Connect with users and maintainers on [Slack](https://greptime.com/slack) \ No newline at end of file diff --git a/docs/user-guide/concepts/features-that-you-concern.md b/docs/user-guide/concepts/features-that-you-concern.md index 2fbc8c18d..908d6b569 100644 --- a/docs/user-guide/concepts/features-that-you-concern.md +++ b/docs/user-guide/concepts/features-that-you-concern.md @@ -66,10 +66,14 @@ Please read the performance benchmark reports: Yes. Please refer to [disaster recovery](/user-guide/deployments-administration/disaster-recovery/overview.md). -## Does GeptimeDB has geospatial index? +## Does GreptimeDB has geospatial index? Yes. We offer [built-in functions](/reference/sql/functions/geo.md) for Geohash, H3 and S2 index. ## Any JSON support? See [JSON functions](/reference/sql/functions/overview.md#json-functions). + +## More Questions? + +For more comprehensive answers to frequently asked questions about GreptimeDB, including deployment options, migration guides, performance comparisons, and best practices, please visit our [FAQ page](/faq-and-others/faq.md). diff --git a/docs/user-guide/concepts/overview.md b/docs/user-guide/concepts/overview.md index ed735ea4f..3acd8d6fc 100644 --- a/docs/user-guide/concepts/overview.md +++ b/docs/user-guide/concepts/overview.md @@ -11,6 +11,7 @@ description: Provides an overview of GreptimeDB, including its features, data mo - [Storage Location](./storage-location.md): This document describes the storage location of GreptimeDB, including local disk, HDFS, and cloud object storage such as S3, Azure Blob Storage, etc. - [Key Concepts](./key-concepts.md): This document describes the key concepts of GreptimeDB, including table, time index, table region and data types. - [Features that You Concern](./features-that-you-concern.md): Describes some features that may be concerned about a unified metrics, logs & events database. +- [Frequently Asked Questions](/faq-and-others/faq.md): Comprehensive FAQ covering common questions about GreptimeDB's capabilities, deployment, and usage. ## Read More diff --git a/docs/user-guide/concepts/why-greptimedb.md b/docs/user-guide/concepts/why-greptimedb.md index 595454cdc..95cef834f 100644 --- a/docs/user-guide/concepts/why-greptimedb.md +++ b/docs/user-guide/concepts/why-greptimedb.md @@ -25,7 +25,7 @@ GreptimeDB leverages cloud object storage (like AWS S3 and Azure Blob Storage et ## High Performance -As for performance optimization, GreptimeDB utilizes different techniques such as LSM Tree, data sharding, and kafka-based WAL design, to handle large workloads of observability data ingestion. 
+As for performance optimization, GreptimeDB utilizes different techniques such as LSM Tree, data sharding, and flexible WAL options (local disk or distributed services like Kafka), to handle large workloads of observability data ingestion. GreptimeDB is written in pure Rust for superior performance and reliability. The powerful and fast query engine is powered by vectorized execution and distributed parallel processing (thanks to [Apache DataFusion](https://datafusion.apache.org/)), and combined with [indexing capabilities](/user-guide/manage-data/data-index.md) such as inverted index, skipping index, and full-text index. GreptimeDB combines smart indexing and Massively Parallel Processing (MPP) to boost pruning and filtering. @@ -75,7 +75,6 @@ GreptimeDB supports multiple data ingestion protocols, making integration with e For data querying, GreptimeDB provides: - **SQL**: For real-time queries, analytics, and database management - **PromQL**: Native support for real-time metrics querying and Grafana integration -- **Python** *(Planned)*: For in-database UDFs and DataFrame operations GreptimeDB integrates seamlessly with your observability stack while maintaining high performance and flexibility. diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/faq-and-others/faq.md b/i18n/zh/docusaurus-plugin-content-docs/current/faq-and-others/faq.md index 4eefbbc74..14bb67c48 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/faq-and-others/faq.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/faq-and-others/faq.md @@ -1,235 +1,418 @@ --- -keywords: [常见问题] -description: 关于 GreptimeDB 的常见问题解答。 +keywords: [统一可观测性, metrics, logs, traces, 性能, OpenTelemetry, Prometheus, Grafana, 云原生, SQL, PromQL] +description: 关于 GreptimeDB 的常见问题解答 - 统一可观测性数据库,支持 metrics、logs 和 traces。 --- # 常见问题 -### GreptimeDB 的性能与其他解决方案相比如何? - -参考[GreptimeDB 对比其他存储或时序数据库的性能如何?](/user-guide/concepts/features-that-you-concern.md#greptimedb-对比其他存储或时序数据库的性能如何)。 +## 核心能力 -### GreptimeDB 与 Loki 相比有什么区别?是否提供 Rust 绑定的库,能否支持 Traces 和 Logs? +### 什么是 GreptimeDB? -GreptimeDB 现在仅支持日志(Log)数据类型,在 v0.10 版本中引入了对多个行业协议的兼容性。这些协议包括 Loki Remote Write、Vector 插件,以及全范围的 OTLP 数据类型(指标、追踪、日志)。 +GreptimeDB 是一个开源、云原生的统一可观测性数据库,旨在在单一系统中存储和分析 metrics、logs 和 traces。基于 Rust 构建以实现高性能,它提供: +- 高达 50 倍的运营和存储成本降低 +- 在 PB 级数据集上实现亚秒级查询响应 +- 原生 OpenTelemetry 支持 +- SQL、PromQL 和流处理能力 +- 计算存储分离,实现灵活扩展 -我们计划进一步优化日志引擎,着重提升查询性能和用户体验。未来的增强功能将包括(但不限于)扩展 GreptimeDB 日志查询 DSL 的功能,并实现与部分 Elasticsearch/Loki API 的兼容,为用户提供更高效、灵活的日志查询能力。 +### GreptimeDB 的性能与其他解决方案相比如何? 
-关于如何使用 GreptimeDB 处理日志的更多信息,您可以参考以下文档: -- [日志概述](/user-guide/logs/overview) -- [OpenTelemetry 兼容性](/user-guide/ingest-data/for-observability/opentelemetry) +GreptimeDB 在可观测性工作负载中提供卓越性能: + +**写入性能**: +- 比 Elasticsearch **快 2-4.7倍**(高达 470% 吞吐量) +- 比 Loki **快 1.5倍**(121k vs 78k rows/s) +- 比 InfluxDB **快 2倍**(250k-360k rows/s) +- **媲美 ClickHouse**(达到 111% 吞吐量) + +**查询性能**: +- 日志查询比 Loki **快 40-80倍** +- 重复查询**快 500倍**(缓存优化) +- 复杂时序查询比 InfluxDB **快 2-11倍** +- 与 ClickHouse 在不同查询模式下性能相当 + +**存储与成本效率**: +- 存储占用比 Elasticsearch **少 87%**(仅需 12.7%) +- 比 ClickHouse **节省 50%** 存储 +- 比 Loki **节省 50%** 存储(3.03GB vs 6.59GB 压缩后) +- 运营成本比传统架构**降低 50倍** + +**资源优化**: +- CPU 使用率**减少 40%** +- 在测试数据库中**内存消耗最低** +- 对象存储(S3/GCS)上性能一致 +- 卓越的高基数数据处理 + +**独特优势**: +- 单一数据库处理 metrics、logs 和 traces +- 原生云原生架构 +- 水平扩展能力(处理 11.5亿+ 行数据) +- 原生全文搜索和索引 + +基准测试报告:[vs InfluxDB](https://greptime.cn/blogs/2024-08-08-report) | [vs Loki](https://greptime.cn/blogs/2025-08-07-beyond-loki-greptimedb-log-scenario-performance-report.html) | [日志基准测试](https://greptime.cn/blogs/2025-03-07-greptimedb-log-benchmark) + +### GreptimeDB 如何处理 metrics、logs 和 traces? + +GreptimeDB 设计为统一可观测性数据库,原生支持三种遥测数据类型: +- **Metrics**:完全兼容 Prometheus,支持 PromQL +- **Logs**:全文索引、Loki 协议支持和高效压缩 +- **Traces**:实验性 OpenTelemetry trace 存储,支持可扩展查询 + +这种统一方法消除了数据孤岛,无需复杂的数据管道即可实现跨信号关联。 + +详细文档: +- [日志概述](/user-guide/logs/overview.md) +- [链路追踪概述](/user-guide/traces/overview.md) +- [OpenTelemetry 兼容性](/user-guide/ingest-data/for-observability/opentelemetry.md) +- [Prometheus 兼容性](/user-guide/ingest-data/for-observability/prometheus.md) - [Loki 协议兼容性](/user-guide/ingest-data/for-observability/loki.md) -- [Vector 兼容性](/user-guide/ingest-data/for-observability/vector) - -### 时序数据库的常见应用场景有哪些? - -时序数据库的常见应用场景包括但不限于以下四种: -- 监控应用程序和基础设施; -- 存储和访问物联网数据; -- 处理自动驾驶汽车数据; -- 分析金融趋势。 - -### GreptimeDB 是否有 Go 驱动? - -用户可以查看 [Go SDK](https://github.com/GreptimeTeam/greptimedb-ingester-go) 的具体信息。 - -### GreptimeDB 什么时候发布 GA(正式发布)版本? - -我们预计在今年六月发布 GA 版本,具体计划参考:[GreptimeDB 2025 Roadmap 发布!](https://greptime.cn/blogs/2025-01-24-greptimedb-roadmap2025) - -### GreptimeDB 是否有官方的 UI 界面用于查看集群状态、表列表和统计信息等? - -是的,我们已经开源了 Dashboard 供用户查询和观测数据。 +- [Elasticsearch 兼容性](/user-guide/ingest-data/for-observability/elasticsearch.md) +- [Vector 兼容性](/user-guide/ingest-data/for-observability/vector.md) -访问 [GitHub 仓库](https://github.com/GreptimeTeam/dashboard)查看详细信息。 +### GreptimeDB 的主要应用场景是什么? -### 在可观测领域,GreptimeDB 可以作为 Rust 版本的 Prometheus 替代品使用吗? +GreptimeDB 擅长于: +- **统一可观测性**:用单一数据库替代复杂的监控堆栈 +- **边缘和云数据管理**:跨环境无缝数据同步 +- **IoT 和汽车**:高效处理大量传感器数据 +- **AI/LLM 监控**:跟踪模型性能和行为 +- **实时分析**:在 PB 级数据集上实现亚秒级查询 -GreptimeDB 已原生支持 PromQL 且兼容性超 90%,可以满足大部分常见的使用需求。我们正在持续改进其功能,以使其与 VictoriaMetrics 相媲美。 +## 架构与性能 -### GreptimeDB 是否与 Grafana 兼容? +### GreptimeDB 能否替代我的 Prometheus 设置? -是的,GreptimeDB 与 Grafana 完全兼容。GreptimeDB 提供了官方的 Grafana 插件:[greptimedb-grafana-datasource](https://github.com/GreptimeTeam/greptimedb-grafana-datasource/)。 +是的,GreptimeDB 提供: +- 原生 PromQL 支持,兼容性接近 100% +- Prometheus remote write 协议支持 +- 高效处理高基数 metrics +- 无需降采样的长期存储 +- 比传统 Prometheus+Thanos 堆栈更高的资源效率 -此外,GreptimeDB 还支持 MySQL 和 PostgreSQL 协议,因此用户可以使用 [MySQL 或 PG 的 Grafana 插件](https://grafana.com/docs/grafana/latest/datasources/mysql/),将 GreptimeDB 配置为数据源,然后使用 SQL 查询数据。 +### GreptimeDB 提供哪些索引能力? -我们还原生实现了 PromQL,能够配合 Grafana 一起使用。 +GreptimeDB 提供丰富的索引选项: +- **倒排索引**:标签列的快速查找 +- **全文索引**:高效日志搜索 +- **跳跃索引**:加速范围查询 +- **向量索引**:支持 AI/ML 工作负载 -### GreptimeDB 用于非时序数据库表时的性能如何? 
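+
+下面给出一个示意性的建表语句,演示如何在建表时声明上述几类索引(表名与列名均为假设示例,具体关键字可能随版本变化,请以[索引管理](/user-guide/manage-data/data-index.md)文档为准):
+
+```sql
+-- 假设的日志表;索引关键字以当前文档为参考,不同版本可能略有差异。
+CREATE TABLE IF NOT EXISTS app_logs (
+  ts         TIMESTAMP TIME INDEX,
+  host       STRING INVERTED INDEX,   -- 标签列的快速查找
+  trace_id   STRING SKIPPING INDEX,   -- 跳过不匹配的数据块,适合稀疏的高基数列
+  message    STRING FULLTEXT INDEX,   -- 日志全文搜索
+  latency_ms DOUBLE,
+  PRIMARY KEY (host)
+);
+```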
+这些索引即使在 PB 级数据集上也能实现亚秒级查询。 -GreptimeDB 支持 SQL 查询,也能够处理非时序数据,尤其在高并发和高吞吐量的数据写入场景下表现出色。然而,GreptimeDB 是针对特定领域(可观测性数据场景)开发的,它不支持事务,也无法高效地删除数据。 +配置详情请参见[索引管理](/user-guide/manage-data/data-index.md)。 -### GreptimeDB 是否有数据保留策略(Retention Policy)? +### GreptimeDB 如何实现成本效益? -GreptimeDB 支持数据库级别和表级别的 TTL(Time To Live,存活时间)。默认情况下,表会继承其所属数据库的 TTL 设置;但是,如果表被赋予了特定的 TTL,则表级 TTL 会优先生效。 +GreptimeDB 通过以下方式降低成本: +- **列式存储**:卓越的压缩比 +- **计算存储分离**:独立扩展资源 +- **高效基数管理**:处理高基数数据而不发生爆炸 +- **统一平台**:消除对多个专用数据库的需求 -有关 TTL 的详细信息,请参考官方文档中的 [TTL 语法说明](/reference/sql/create)。 +结果:比传统堆栈降低高达 50 倍的运营和存储成本。 -### “Greptime”这个名字来源于哪里? -“Greptime”由两个部分组成:`grep` 是 `*nix` 系统中最常用的命令行工具,用于搜索数据;而 time 则代表时序数据。因此,Greptime 寓意着帮助每个人在时序数据中查找/搜索有价值的信息。 +### 什么使 GreptimeDB 成为云原生? -### GreptimeDB 是否支持 Schemaless? +GreptimeDB 专为 Kubernetes 构建: +- **分解架构**:分离计算和存储层 +- **弹性扩展**:根据工作负载添加/删除节点 +- **多云支持**:无缝运行在 AWS、GCP、Azure +- **Kubernetes operators**:简化部署和管理 +- **对象存储后端**:使用 S3、GCS 或 Azure Blob 进行数据持久化 -是的,GreptimeDB 是一个 Schemaless 的数据库,无需事先创建表。在使用 gRPC、InfluxDB Line Protocol、OpenTSDB 和 Prometheus Remote Write 协议写入数据时,会自动创建表和列。 +Kubernetes 部署详情请参见 [Kubernetes 部署指南](/user-guide/deployments-administration/deploy-on-kubernetes/overview.md)。 -### GreptimeDB 是否支持将表级数据导出到 S3? +### GreptimeDB 支持无 schema 数据摄入吗? -是的,用户可以使用 `COPY TO` 命令将表级数据导出到 S3。 +是的,GreptimeDB 在使用以下协议时支持自动 schema 创建: +- gRPC 协议 +- InfluxDB Line Protocol +- OpenTSDB 协议 +- Prometheus Remote Write +- OpenTelemetry 协议 +- Loki 协议(日志数据) +- Elasticsearch 兼容 API(日志数据) -### GreptimeDB 可以与夜莺集成吗?其兼容性如何? +表和列在首次写入时自动创建,无需手动 schema 管理。 -可以,参考这篇[博客](https://greptime.cn/blogs/2024-09-04-nightingale)。 -目前,GreptimeDB 的兼容性工作主要集中在原生 PromQL 的实现上。未来,我们将继续增强与 MetricQL 扩展语法的兼容性。 +## 集成与兼容性 -### GreptimeDB 是否适用于大规模内部指标收集系统(类似于 Facebook 的 Gorilla 或 Google 的 Monarch)?对于偏好内存数据存储和高可用性场景,有无异步 WAL 或可选磁盘存储的计划?数据复制在无 WAL 的情况下如何处理? +### GreptimeDB 如何与现有工具和系统集成? -GreptimeDB 支持异步 WAL(Write Ahead Log,预写日志),正在开发基于每张表的 WAL 开关功能,以便更灵活地控制日志写入。同时,GreptimeDB 正在构建分层存储机制,其中以内存缓存为起点的存储策略已在开发中。 +**协议支持**: +- **数据写入**:OpenTelemetry、Prometheus Remote Write、InfluxDB Line、Loki、Elasticsearch、gRPC(参见[协议概述](/user-guide/protocols/overview.md)) +- **数据查询**:MySQL、PostgreSQL 协议兼容 +- **查询语言**:SQL、PromQL -在数据复制方面,写入到远程存储(如 S3)的数据会独立于 WAL 进行复制。有关分层存储的详细信息,可以阅读[本篇博客](https://greptime.cn/blogs/2025-03-26-greptimedb-storage-architecture)。 +**可视化与监控**: +- **Grafana**:[Grafana 集成](/user-guide/integrations/grafana.md)(含官方插件)+ [MySQL/PostgreSQL 数据源支持](/user-guide/integrations/grafana.md#mysql-数据源) +- **原生 PromQL**:[直接支持](/user-guide/query-data/promql.md) Prometheus 风格查询和仪表板 +- **任何 SQL 工具**:通过 MySQL/PostgreSQL 协议兼容 -### 如果想删除数据库,可以使用 `DROP DATABASE` 这个命令来实现吗? +**数据管道集成**: +- **OpenTelemetry**:原生 OTLP 摄入,无转换层,保留所有语义约定 +- **数据收集**:Vector、Fluent Bit、Telegraf、Kafka +- **实时流处理**:直接从 Kafka、Vector 等系统接收数据 -是的。请参考[`Drop Database`](/reference/sql/drop.md#drop)文档。 +**SDK 和客户端库**: +- **Go**:[greptimedb-ingester-go](https://github.com/GreptimeTeam/greptimedb-ingester-go) +- **Java**:[greptimedb-ingester-java](https://github.com/GreptimeTeam/greptimedb-ingester-java) +- **Rust**:[greptimedb-ingester-rust](https://github.com/GreptimeTeam/greptimedb-ingester-rust) +- **Erlang**:[greptimedb-ingester-erl](https://github.com/GreptimeTeam/greptimedb-ingester-erl) +- **Python**:通过标准 SQL 驱动程序(MySQL/PostgreSQL 兼容) -### GreptimeDB 和其他基于 DataFusion 的时序数据库(如 InfluxDB)有什么主要区别? +### 如何从其他数据库迁移到 GreptimeDB? 
-GreptimeDB 与 InfluxDB 在一些技术上有相似之处,例如都使用了 DataFusion、Arrow、Parquet 并基于对象存储。但在以下几个方面有所不同: +GreptimeDB 为流行数据库提供迁移指南: +- **从 ClickHouse**:表结构和数据迁移 +- **从 InfluxDB**:Line protocol 和数据迁移 +- **从 Prometheus**:Remote write 和历史数据迁移 +- **从 MySQL/PostgreSQL**:基于 SQL 的迁移 -- 开源策略: - - InfluxDB 只开放其单机版本,而 GreptimeDB 完全开源,包括其分布式集群版本; - - GreptimeDB 的架构可以运行在边缘设备(如 Android 系统)上。 +详细迁移说明请参见[迁移概述](/user-guide/migrate-to-greptimedb/overview.md)。 -- 分布式架构: - - 我们的架构更接近 HBase 的 Region/RegionServer 设计; - - 预写日志(WAL)基于 Kafka,同时也在探索基于 quorum 的实现方案。 - -- 混合负载与服务支持: - - GreptimeDB 聚焦于多种可观测数据的统一处理,以及与分析型负载的结合,以提升资源效率和实时性能; - - 提供商业化云服务 GreptimeCloud,方便用户快速上手。 +### GreptimeDB 提供哪些灾备选项? -- 存储引擎设计: - - 可插拔存储引擎适配多种场景,特别是 Prometheus 的小表场景,GreptimeDB 提供专门的 Metrics 存储引擎。 +GreptimeDB 提供多种灾备策略以满足不同的可用性需求: -- 查询语言支持: - - 支持 PromQL(用于监控)、SQL(用于分析)以及 Python(用于复杂数据处理); - 相比之下,InfluxDB 使用 InfluxQL 和 SQL。 +- **单机灾备方案**:使用远程 WAL 和对象存储,可实现 RPO=0 和分钟级 RTO,适合小规模场景 +- **Region 故障转移**:个别区域的自动故障转移,停机时间最短 +- **双机热备**(企业版):节点间同步请求复制,提供高可用性 +- **跨区域单集群**:跨越三个区域,实现零 RPO 和区域级容错 +- **备份与恢复**:定期数据备份,可根据备份频率配置 RPO -GreptimeDB 是一个快速发展的开源项目,欢迎社区的反馈和贡献!了解更多细节可以访问我们的 [Blog](https://greptime.cn/blogs/) 和 [Contributor Guide](/contributor-guide/overview/). +根据您的可用性需求、部署规模和成本考虑选择合适的解决方案。详细指导请参见[灾难恢复概述](/user-guide/deployments-administration/disaster-recovery/overview.md)。 -### 作为新手,我应该如何开始为 GreptimeDB 贡献代码?? +## 数据管理与处理 -欢迎参与 GreptimeDB 的贡献!新手请阅读[贡献指南](https://github.com/GreptimeTeam/greptimedb/blob/main/CONTRIBUTING.md)。 +### GreptimeDB 如何处理数据生命周期? -除此之外,我们还挑选了一些[适合新手贡献的 PR(good first issues)](https://github.com/GreptimeTeam/greptimedb/issues?q=is%3Aopen+is%3Aissue+label%3A%22Good+first+issue%22)供您参考。 +**保留策略**: +- 数据库级和表级 TTL 设置 +- 无需手动清理的自动数据过期 +- 通过 [TTL 文档](/reference/sql/create.md#表选项)配置 -### GreptimeDB 是否支持类似于 InfluxDB 的非负微分方法(处理可重置的绝对计数器)? +**数据导出**: +- 用于 S3、本地文件的 [`COPY TO` 命令](/reference/sql/copy.md#连接-s3) +- 通过任何兼容客户端的标准 SQL 查询 +- 备份和灾难恢复的导出功能:[备份与恢复数据](/user-guide/deployments-administration/disaster-recovery/back-up-&-restore-data.md) -是的,GreptimeDB 具有与 Prometheus 类似的功能,可有效处理计数器重置问题。例如,`reset()`、`rate()` 或 `delta()` 等函数可以自动检测和调整计数器重置。 +### GreptimeDB 如何处理高基数和实时处理? -但是以下几点需要注意: -- 不建议对计数器使用 `deriv()` 函数(此函数适用于 gauge 类型),可以先对计数器应用 `rate()`,然后使用 `deriv()`。 -在计数器操作方面 PromQL 更适合,因为它源于 Prometheus 的生态; -- 我们也在探索将 PromQL 函数集成到 SQL 中,为用户提供更大的灵活性。如果您对这方面有兴趣,可参考我们的[文档](https://github.com/GreptimeTeam/greptimedb/blob/main/docs/how-to/how-to-write-aggregate-function.md)。 +**高基数管理**: +- 高级索引策略防止基数爆炸 +- 具有智能压缩的列式存储 +- 带数据修剪的分布式查询执行 +- 高效处理数百万唯一时间序列 -### GreptimeDB 开源版本与云版本有哪些功能差异? +了解更多索引信息:[索引管理](/user-guide/manage-data/data-index.md) -以下是主要区别: -- 基础功能: - - 两个版本在基础功能上几乎相同,包括数据写入协议、SQL 能力和存储功能; - - GreptimeCloud 提供了一些高级 SQL 功能和额外特性。 +**实时处理**: +- **[Flow Engine](/user-guide/flow-computation/overview.md)**:实时流数据处理系统,对流式数据进行连续增量计算,自动更新结果表 +- **[Pipeline](/user-guide/logs/pipeline-config.md)**:实时数据解析转换机制,通过可配置处理器对各种入库数据进行字段提取和数据类型转换 +- **输出表**:持久化处理结果用于分析 -- 托管服务: - - GreptimeCloud 支持多租户、数据加密和安全审计,以满足合规需求,这些功能在开源版本中不可用。 - - GreptimeCloud 支持独占部署和按量付费的 Serverless 模式 +### GreptimeDB 的可扩展性特征是什么? -- 增强型控制台: - - 云版本的仪表盘更加友好,独有的 Prometheus 工作台支持在线编辑 Prometheus 仪表盘和告警规则,支持 GitOps 集成。 +**扩展限制**: +- 表或列数量无严格限制 +- 数百个表的性能影响最小 +- 性能随主键设计而非表数量扩展 +- 列式存储确保高效的部分读取 -如上所述,云版本提供了更多即用型功能,可帮助用户快速上手。 +**分区与分布**: +- region 内基于时间的自动组织 +- 通过 PARTITION 子句进行手动分布式分片(参见[表分片指南](/user-guide/deployments-administration/manage-data/table-sharding.md)) +- 计划未来版本的自动 region 拆分 +- **无需配置的动态分区**(企业版功能) -### 哪里可以找到有关本地部署和性能基准测试报告的相关文档? 
+**核心扩展性功能**: +- **多层缓存**:写入缓存(磁盘支持)和读取缓存(LRU 策略)优化性能 +- **对象存储后端**:通过 S3/GCS/Azure Blob 实现几乎无限存储 +- **异步 WAL**:高效的预写日志,支持可选的按表控制 +- **分布式查询执行**:多节点协调处理大型数据集 +- **手动压缩**:通过[管理命令](/reference/sql/admin.md)提供 -用户可以查看我们的 [TSBS 基准测试结果](https://github.com/GreptimeTeam/greptimedb/tree/main/docs/benchmarks/tsbs)并查看相关的[部署文档](/getting-started/installation/overview)。 +**企业级扩展功能**: +- 高级分区和自动重新平衡 +- 增强的多租户和隔离功能 +- 企业级监控和管理工具 -更多性能报告请参考[GreptimeDB 对比其他存储或时序数据库的性能如何](/user-guide/concepts/features-that-you-concern.md#greptimedb-对比其他存储或时序数据库的性能如何)。 +架构详情请参见[存储架构博客](https://greptime.cn/blogs/2024-12-24-observability)。 +### GreptimeDB 的设计权衡是什么? -### 是否有将 GreptimeDB 编译为仅包含必要模块和功能的嵌入式独立二进制文件的参考指南? +GreptimeDB 针对可观测性工作负载进行了优化,具有以下设计限制: +- **无 ACID 事务**:优先考虑高吞吐量写入而非事务一致性 +- **有限删除操作**:为追加重的可观测性数据设计 +- **时序数据专注**:针对 IoT、metrics、logs 和 traces 而非通用 OLTP 进行优化 +- **简化连接**:针对时序查询而非复杂关系操作进行优化 -我们提供适用于 Android ARM64 平台的预构建二进制文件,这些文件已在部分企业项目中成功应用。然而,这些二进制文件暂不支持裸机设备,因为 GreptimeDB 的一些核心组件依赖标准库。 +## 部署与运维 -### 是否有内置 SQL 命令(例如 compaction table t1)可用于手动进行数据压缩? +### GreptimeDB 部署和运维指南 -请参考[该文档](/reference/sql/admin)。 +**部署选项**: -### GreptimeDB 是否可以用于存储日志? +*集群部署(生产环境)*: +- 最少 3 个节点实现高可用性 +- 服务架构:metasrv、frontend、datanode(可同节点或分离部署) +- 存储后端:S3、GCS、Azure Blob(生产)或本地存储(测试) +- 元数据存储:MySQL/PostgreSQL 后端支持 metasrv +- 参见[容量规划指南](/user-guide/deployments-administration/capacity-plan.md) -可以,详细信息请参考[这里](/user-guide/logs/overview)。 +*边缘与单机部署*: +- Android ARM64、Raspberry Pi 等受限环境 +- 单节点模式适用于开发测试和 IoT 场景 +- 高效资源使用,支持边缘计算 -### 非主键字段的查询性能如何?是否可以设置倒排索引?与 Elasticsearch 相比,存储成本是否更低? +**数据分布策略**: +- **当前**:通过 PARTITION 子句手动分区(参见[表分片指南](/user-guide/deployments-administration/manage-data/table-sharding.md)),region 内自动时间组织,支持手动 region 迁移进行负载均衡(参见[Region 迁移指南](/user-guide/deployments-administration/manage-data/region-migration.md)) +- 自动 region 故障转移容灾(参见[Region 故障转移](/user-guide/deployments-administration/manage-data/region-failover.md)) +- **路线图**:自动 region 拆分、动态负载均衡 -非主键字段也可以创建索引,参考[索引管理](/user-guide/manage-data/data-index.md)。 +**监控与运维**: -GreptimeDB 的存储成本比 Elasticsearch 低的多,持久化数据大小仅为 ClickHouse 的 50% 和 Elasticsearch 的 12.7%,具体请见[性能评测](https://greptime.cn/blogs/2025-03-07-greptimedb-log-benchmark)。 +GreptimeDB 提供全面的监控能力,包括指标收集、健康检查和可观测性集成。详细的监控设置和故障排除指南请参见[监控概述](/user-guide/deployments-administration/monitoring/overview.md)。 -### Log-Structured Merge-Tree(LSM)引擎是否与 Kafka 的引擎模型类似? +部署和管理详情:[部署与管理概述](/user-guide/deployments-administration/overview.md) -从技术存储角度来看,它们确实有相似之处。然而,实际的数据格式有所不同: +## 开源版 vs 企业版 vs 云版本 -GreptimeDB 采用 Parquet 格式进行读写;Kafka 使用其专有的 RecordBatch 格式。若需要对暂时存储在 Kafka 中的时序数据进行分析,必须先将数据写入 GreptimeDB。 +### GreptimeDB 各版本的区别是什么? -你可以使用 Vector 消费 Kafka 并将消息写入 GreptimeDB,参见[这篇文章](https://greptime.cn/blogs/2024-10-29-vector#%E5%88%A9%E7%94%A8-vector-%E5%B0%86-kafka-%E4%B8%AD%E7%9A%84%E6%97%A5%E5%BF%97%E6%95%B0%E6%8D%AE%E9%AB%98%E6%95%88%E5%86%99%E5%85%A5-greptimedb)。 +**开源版本**: +- 高性能写入和查询能力 +- 集群部署和基础读写分离 +- 多协议支持(OpenTelemetry、Prometheus、InfluxDB 等) +- 基础访问控制和加密 +- 基础性能诊断 +- 社区支持 -### 在 GreptimeDB 中,表和列的数量是否有限制?列数较多是否会影响读写性能? 
+**企业版本**(包含所有开源版功能,另增加): +- 基于成本的查询优化器,提升性能 +- 高级读写分离和双活灾备(参见[双机热备灾备方案](/enterprise/deployments-administration/disaster-recovery/dr-solution-based-on-active-active-failover.md)) +- 自动扩展、索引和负载均衡 +- 分层缓存和企业级管理控制台 +- 企业授权(RBAC/LDAP 集成) +- 增强的安全和审计功能 +- 一对一技术支持和 7x24 服务响应 +- 专业定制服务 -通常情况下并没有严格的限制。在几百个表的场景下,主键列数量不多时对写入性能的影响是可以忽略的(性能以每秒写入点数而非行数衡量)。 +**GreptimeCloud**(全托管,包含所有企业版功能,另增加): +- Serverless 自动扩展,按用量付费 +- 全托管部署,无缝升级 +- 独立资源池和网络隔离 +- 可配置读写容量和无限存储 +- 具有 Prometheus 工作台的高级仪表板 +- SLA 保证和自动灾难恢复 -同样,对于读取性能来说,如果查询只涉及部分列,内存和计算的开销也不会显著增加。 +详细对比请参见[价格与功能](https://greptime.cn/pricing#differences)。 -### 通常需要多少台服务器来搭建一个可靠的 GreptimeDB 集群?Frontend、Datanode 和 Metasrv 应如何部署?是否所有节点都需要运行这三个服务? +### 有哪些安全功能可用? -搭建 GreptimeDB 集群至少需要 3 台节点,每台节点运行 Metasrv、Frontend 和 Datanode 三个服务。然而,具体的节点数量取决于需要处理的数据规模。 +**开源版本**: +- 基本用户名/密码身份验证 +- 连接的 TLS/SSL 支持 + +**企业版/云版本**: +- 基于角色的访问控制(RBAC) +- 团队管理和 API 密钥 +- 静态数据加密 +- 合规审计日志 -并不需要在每个节点上同时部署所有服务。对于小型集群,可以单独使用 3 台节点运行 Metasrv,而 Frontend 和 Datanode 可以部署在其他等量节点上,每个容器运行两个进程。 +## 技术细节 -更多部署建议请参考[容量规划文档](/user-guide/deployments-administration/capacity-plan.md)。 +### GreptimeDB 如何扩展 Apache DataFusion? -### 最新版本的 Flow Engine(预计算功能)是否支持 PromQL 语法进行计算? +GreptimeDB 基于 DataFusion 构建: +- **查询语言**:原生 PromQL 与 SQL 并存 +- **分布式执行**:多节点查询协调 +- **自定义函数**:时序特定的 UDF/UDAF +- **优化**:针对可观测性工作负载的规则 +- **计数器处理**:在 `rate()` 和 `delta()` 函数中自动重置检测 -目前 Flow Engine 暂不支持 PromQL 语法计算。我们会对该需求进行评估,从理论上看似乎是可行的。 +自定义函数开发:[函数文档](https://github.com/GreptimeTeam/greptimedb/blob/main/docs/how-to/how-to-write-aggregate-function.md) -### Metasrv 是否会支持 MySQL 或 PostgreSQL 作为存储后端? +### GreptimeDB 和 InfluxDB 的区别是什么? -最新版本的 GreptimeDB 已支持 PostgreSQL 作为 Metasrv 的存储后端。具体信息请参考[这里](/user-guide/deployments-administration/configuration.md#仅限于-metasrv-的配置)。 +主要区别: +- **开源策略**:GreptimeDB 的整个分布式系统完全开源 +- **架构**:针对可观测性工作负载优化的基于 region 的设计 +- **查询语言**:SQL + PromQL vs InfluxQL + SQL +- **统一模型**:在一个系统中原生支持 metrics、logs 和 traces +- **存储**:具有专用优化的可插拔引擎 +- **云原生**:为 Kubernetes 构建,具有分解的计算/存储(参见 [Kubernetes 部署指南](/user-guide/deployments-administration/deploy-on-kubernetes/overview.md)) -### 如何最佳地对成千上万台计算机中多个网卡的接口流量速率(每 30 秒取最大值)进行降采样,以便长期保存(如多年的数据)? +详细比较请参见 [GreptimeDB vs InfluxDB](https://greptime.cn/compare/influxdb)。更多产品比较(如 vs. ClickHouse、Loki 等)可在官网的资源菜单中找到。 -使用流表(Flow Table)是完成此任务的最佳工具。一个简单的流任务即可满足需求。流任务的输出会存储到普通表中,支持长期保存。请阅读 [Flow 指南](/user-guide/flow-computation/overview.md)。 +### GreptimeDB 的存储引擎如何工作? -### GreptimeDB 是否支持动态的按天分区? +**LSM-Tree 架构**: +- 基于日志结构合并树(LSMT)设计 +- WAL 可以使用本地磁盘或分布式服务(如 Kafka)通过 Log Store API +- SST 文件刷写到对象存储(S3/GCS)或本地磁盘 +- 面向云原生环境设计,以对象存储为主要后端 +- 使用 TWCS(时间窗口压缩策略)优化时序工作负载 + +**性能考量**: +- **时间戳**:日期时间格式(yyyy-MM-dd HH:mm:ss)无性能影响 +- **压缩**:仅测量数据目录;WAL 循环重用 +- **Append Only 表**:建议使用,对写入和查询性能更友好,尤其适合日志场景 +- **Flow Engine**:目前基于 SQL;PromQL 支持正在评估中 -是的。在 Region 内,时间序列数据默认按照时间来动态组织,无需配置。请注意,它跟分布式表的分片是两个层次的概念。一个是分片内的数据组织,一个是数据的分布式切分。 +### 特定用例的最佳实践是什么? -### GreptimeDB 中对 DataFusion 的哪些部分进行了定制? +**网络监控**(如数千个网卡): +- 使用 Flow 表进行连续聚合 +- 通过 Flow Engine 手动降采样进行数据缩减 +- 输出到常规表进行长期存储 -GreptimeDB 针对 DataFusion 的以下部分进行了定制: -- 支持 PromQL 查询 -- 执行分布式查询 -- 自定义 UDF(用户自定义函数)和 UDAF(用户自定义聚合函数)。 -- 自定义优化规则 +**日志分析**: +- 使用 Append Only 表获得更好的写入和查询性能 +- 在频繁查询的字段上创建索引([索引管理](/user-guide/manage-data/data-index.md)) +- 存储效率:ClickHouse 的 50%,Elasticsearch 的 12.7% -### 开源版本的 GreptimeDB 是否支持细粒度的访问控制? 
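+
+针对上面的网络监控降采样场景,下面给出一个 Flow 任务草图,仅作示意:源表 `nic_traffic_30s`、结果表 `nic_traffic_1h` 及列名均为假设,具体语法请以 [Flow 文档](/user-guide/flow-computation/overview.md)为准:
+
+```sql
+-- 把 30 秒粒度的网卡流量持续聚合为每小时最大值,结果写入普通表长期保存
+CREATE FLOW IF NOT EXISTS nic_traffic_downsample
+SINK TO nic_traffic_1h
+AS
+SELECT
+  host,
+  iface,
+  max(rate_bps) AS max_rate_bps,
+  date_bin(INTERVAL '1 hour', ts) AS time_window
+FROM nic_traffic_30s
+GROUP BY host, iface, time_window;
+```
+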
+**表设计与性能**: +- 表建模指导:[设计表](/user-guide/deployments-administration/performance-tuning/design-table.md) +- 性能优化:[性能调优提示](/user-guide/deployments-administration/performance-tuning/performance-tuning-tips.md) -开源版仅支持基础的用户名密码认证。企业版提供了细粒度的访问控制功能,例如 RBAC(基于角色的访问控制)。 -### 以日期时间格式写入 `TIMESTAMP` 值会影响查询性能吗? +## 入门指南 + +### 如何开始使用 GreptimeDB? + +**📚 学习资源**: + +*文档与指南*: +- [安装指南](/getting-started/installation/overview.md) - 快速开始部署 +- [容量规划](/user-guide/deployments-administration/capacity-plan.md) - 生产环境规划 +- [配置指南](/user-guide/deployments-administration/configuration.md) - 详细配置说明 + +*性能基准*: +- [TSBS 基准测试](https://github.com/GreptimeTeam/greptimedb/tree/main/docs/benchmarks/tsbs) +- [性能对比分析](/user-guide/concepts/features-that-you-concern.md#greptimedb-对比其他存储或时序数据库的性能如何) +- [vs InfluxDB](https://greptime.cn/blogs/2024-08-08-report) +- [vs Loki](https://greptime.cn/blogs/2025-08-07-beyond-loki-greptimedb-log-scenario-performance-report.html) +- [日志基准测试](https://greptime.cn/blogs/2025-03-07-greptimedb-log-benchmark) -不会。以日期时间格式(如 yyyy-MM-dd HH:mm:ss)写入并不会影响查询性能。底层存储格式保持一致。 +**🚀 快速上手路径**: -### 评估数据压缩时,是否只需考虑数据目录的大小,还是包括 WAL 目录的大小? +1. **云端体验**:[GreptimeCloud 免费版](https://greptime.cn/product/cloud) - 无需安装即可试用 +2. **本地部署**:按照[安装指南](/getting-started/installation/overview.md)自托管部署 +3. **集成现有系统**:GreptimeDB 支持与 Prometheus、Vector、Kafka、Telegraf、EMQX、Metabase 等众多系统的广泛集成。完整列表请参见[集成概述](/user-guide/integrations/overview.md),或从以下开始: + - [OpenTelemetry 集成](/user-guide/ingest-data/for-observability/opentelemetry.md) + - [Prometheus 迁移](/user-guide/ingest-data/for-observability/prometheus.md) + - Grafana 仪表板配置 -只需考虑数据目录的大小即可,WAL 目录会循环重用,因此不纳入数据压缩指标。 +**🤝 社区与贡献**: -### 在集群模式下,如果创建表时未使用 `PARTITION`,数据是否会自动均衡到各个 Datanode 上? +*参与社区*: +- [Slack 频道](https://greptime.com/slack) - 与用户和开发者交流 +- [GitHub Discussions](https://github.com/GreptimeTeam/greptimedb/discussions) - 技术讨论 -当前版本中,未使用 `PARTITION` 的情况下,数据不会自动均衡到各 Datanode。实现区域分裂和自动均衡的功能计划在 v1.2 或 v1.3 版本中推出。 \ No newline at end of file +*贡献代码*: +- [贡献指南](https://github.com/GreptimeTeam/greptimedb/blob/main/CONTRIBUTING.md) - 开发环境搭建 +- [适合新手的 Issue](https://github.com/GreptimeTeam/greptimedb/issues?q=is%3Aopen+is%3Aissue+label%3A%22Good+first+issue%22) - 第一次贡献 +- [文档改进](https://github.com/GreptimeTeam/docs) - 帮助完善中英文文档 \ No newline at end of file diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/features-that-you-concern.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/features-that-you-concern.md index aa26061a0..ba367b9a0 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/features-that-you-concern.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/features-that-you-concern.md @@ -65,12 +65,16 @@ GreptimeDB 在 [ClickHouse 的 JSONBench 测试中 Cold Run 斩获第一](https: 有的,请参阅[灾难恢复文档](/user-guide/deployments-administration/disaster-recovery/overview.md)。 -## GeptimeDB 有地理空间索引吗? +## GreptimeDB 有地理空间索引吗? 我们提供 [内置函数](/reference/sql/functions/geo.md) 支持 Geohash, H3 and S2 索 引。 -## GeptimeDB 支持 JSON 数据吗? +## GreptimeDB 支持 JSON 数据吗? 我们提供 [内置函数](/reference/sql/functions/overview.md#json-functions) 支持访问 JSON 数据类型。 + +## 更多问题? 
+ +有关 GreptimeDB 的更多常见问题解答,包括部署选项、迁移指南、性能对比和最佳实践等,请访问我们的[常见问题页面](/faq-and-others/faq.md)。 diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/overview.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/overview.md index 1a87bd42e..a79689f75 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/overview.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/overview.md @@ -11,6 +11,7 @@ description: 概述 GreptimeDB 的特点和优势,并提供相关文档链接 - [存储位置](./storage-location.md):介绍了 GreptimeDB 的存储位置,包括本地磁盘、HDFS、AWS S3 和阿里云 OSS 等云对象存储。 - [核心概念](./key-concepts.md):介绍了 GreptimeDB 的核心概念,包括表、时间索引约束、表 Region 和数据类型等。 - [关键特性](./features-that-you-concern.md): 介绍了 TSDB 用户较为关心的指标(metrics)、日志(logs)和事件(events)数据库的特性。 +- [常见问题](/faq-and-others/faq.md): 全面的 FAQ,涵盖关于 GreptimeDB 能力、部署和使用的常见问题。 ## 阅读更多 diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/why-greptimedb.md b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/why-greptimedb.md index d7351c7e9..0e881757a 100644 --- a/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/why-greptimedb.md +++ b/i18n/zh/docusaurus-plugin-content-docs/current/user-guide/concepts/why-greptimedb.md @@ -25,7 +25,7 @@ GreptimeDB 采用云对象存储(如 AWS S3、阿里云 OSS 和 Azure Blob Sto ## 高性能 -在性能优化方面,GreptimeDB 运用了多种技术,如 LSM Tree、数据分片和基于 Kafka 的 WAL 设计,以处理大规模可观测数据的写入。 +在性能优化方面,GreptimeDB 运用了多种技术,如 LSM Tree、数据分片和灵活的 WAL 选项(本地磁盘或 Kafka 等分布式服务),以处理大规模可观测数据的写入。 GreptimeDB 使用纯 Rust 编写,具有卓越的性能和可靠性。强大而快速的查询引擎由向量化执行和分布式并行处理(感谢 [Apache DataFusion](https://datafusion.apache.org/))驱动,并结合了丰富的[索引选项](/user-guide/manage-data/data-index.md),例如倒排索引、调数索引和全文索引等。GreptimeDB将智能索引和大规模并行处理 (MPP) 结合在一起,以提升查询过程中数据剪枝和过滤的效率。 @@ -76,7 +76,6 @@ GreptimeDB 支持多种数据摄入协议,从而实现与现有可观测性技 在数据查询方面,GreptimeDB 提供: - **SQL**:用于实时查询、复杂分析和数据库管理 - **PromQL**:原生支持实时指标查询和 Grafana 集成 -- **Python**:(计划中)支持数据科学场景的数据库内 UDF 和 DataFrame 操作 GreptimeDB 与您的现有可观测性技术栈无缝集成,同时保持高性能和灵活性。
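+
+为了直观说明 SQL 与 PromQL 在同一接口中的并存方式,这里给出一个示意片段:表名 `http_requests_total`、列名与时间范围均为假设,TQL 的具体语法请以参考文档为准:
+
+```sql
+-- SQL:按 5 分钟窗口做聚合分析
+SELECT date_bin(INTERVAL '5 minutes', ts) AS bucket, sum(val) AS requests
+FROM http_requests_total
+GROUP BY date_bin(INTERVAL '5 minutes', ts)
+ORDER BY bucket;
+
+-- PromQL:通过 TQL EVAL(开始时间, 结束时间, 步长)直接执行 PromQL 表达式
+TQL EVAL (1704067200, 1704070800, '1m') sum(rate(http_requests_total[5m]));
+```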