diff --git a/docs/doc/90-contributing/10-how-to-write-a-system-table.md b/docs/doc/90-contributing/10-how-to-write-a-system-table.md
new file mode 100644
index 0000000000000..309fa6a83b62c
--- /dev/null
+++ b/docs/doc/90-contributing/10-how-to-write-a-system-table.md
@@ -0,0 +1,148 @@
+---
+title: How to Create a System Table
+---
+
+System tables are tables that provide information about Databend's internal state, such as databases, tables, functions, and settings. If you're familiar with the Databend code structure and have basic knowledge of Rust, you can also create your own system tables as needed.
+
+Creating a system table mainly involves defining the table information (table name and schema) and how to generate and retrieve data for the table. This is done by implementing the `SyncSystemTable` or `AsyncSystemTable` trait.
+
+This guide shows you how to create a new system table for Databend, using the table [system.credits](https://databend.rs/doc/sql-reference/system-tables/system-credits) as an example. The table provides information about Databend's upstream dependencies, and its code is located at `src/query/storage/system/src/credits_table.rs`.
+
+:::note
+Databend recommends storing the code for new system tables in the directory `src/query/storage/system/src/`. However, there may be situations where you cannot do so, such as issues related to the build process. In such cases, you can place it temporarily in the directory `src/query/service/src/databases/system` (although this is not recommended).
+:::
+
+## Creating a System Table
+
+The following walks through the implementation of the table `system.credits` step by step.
+
+1. Define a struct for your system table that contains only the fields for storing the table information.
+
+    ```rust
+    pub struct CreditsTable {
+        table_info: TableInfo,
+    }
+    ```
+
+2. Implement a `create` method for your system table struct that takes `table_id` as an argument and returns `Arc<dyn Table>`. The `table_id` is generated by `sys_db_meta.next_table_id()` when creating a new system table.
+
+    ```rust
+    pub fn create(table_id: u64) -> Arc<dyn Table>
+    ```
+
+3. Define a schema for your system table using `TableSchemaRefExt` and `TableField`. The schema describes the structure of your system table: the field names and types depend on the data you want to store in it.
+
+    For string-type data, you can use `TableDataType::String`; other basic types are similar. If a field needs to allow null values, such as an optional 64-bit unsigned integer field, use `TableDataType::Nullable(Box::new(TableDataType::Number(NumberDataType::UInt64)))` instead. `TableDataType::Nullable` indicates that null values are allowed; `TableDataType::Number(NumberDataType::UInt64)` specifies a 64-bit unsigned integer type.
+
+    ```rust
+    let schema = TableSchemaRefExt::create(vec![
+        TableField::new("name", TableDataType::String),
+        TableField::new("version", TableDataType::String),
+        TableField::new("license", TableDataType::String),
+    ]);
+    ```
+
+4. Define metadata for your system table, such as the description (`desc`), `name`, and `meta`. You can follow existing examples and fill in these fields accordingly.
+
+    ```rust
+    let table_info = TableInfo {
+        desc: "'system'.'credits'".to_string(),
+        name: "credits".to_string(),
+        ident: TableIdent::new(table_id, 0),
+        meta: TableMeta {
+            schema,
+            engine: "SystemCredits".to_string(),
+            ..Default::default()
+        },
+        ..Default::default()
+    };
+
+    SyncOneBlockSystemTable::create(CreditsTable { table_info })
+    ```
+
+5. Create an instance of your system table struct with these fields and wrap it with either `SyncOneBlockSystemTable` or `AsyncOneBlockSystemTable`, depending on whether your data retrieval is synchronous or asynchronous.
+
+6. Implement either the `SyncSystemTable` or the `AsyncSystemTable` trait for your system table struct. `SyncSystemTable` requires you to define a `NAME` constant and implement four methods: `get_table_info()`, `get_full_data()`, `get_partitions()`, and `truncate()`. However, the last two methods have default implementations, so you don't need to implement them yourself in most cases. (`AsyncSystemTable` is similar, but it does not have a `truncate()` method.)
+
+    The `NAME` constant follows the format `system.<name>`.
+
+    ```rust
+    const NAME: &'static str = "system.credits";
+    ```
+
+    The `get_table_info()` method returns the table information stored in the struct.
+
+    ```rust
+    fn get_table_info(&self) -> &TableInfo {
+        &self.table_info
+    }
+    ```
+
+    The `get_full_data()` method is the most important part because it contains the logic for generating or retrieving the data for your system table. The three fields of the credits table are handled in a similar way, so only the license field is shown as an example.
+
+    The license field information is obtained from an environment variable named `DATABEND_CREDITS_LICENSES` (see `common-building`). The data items are separated by commas.
+
+    String-type columns are eventually converted from `Vec<Vec<u8>>`, where each string needs to be converted to `Vec<u8>`. So we use `.as_bytes().to_vec()` to do this conversion when iterating over the data.
+
+    ```rust
+    let licenses: Vec<Vec<u8>> = env!("DATABEND_CREDITS_LICENSES")
+        .split_terminator(',')
+        .map(|x| x.trim().as_bytes().to_vec())
+        .collect();
+    ```
+
+7. Return the retrieved data as a `DataBlock`. Use `from_data` for non-null types and `from_opt_data` for nullable types. For example:
+
+    ```rust
+    Ok(DataBlock::new_from_columns(vec![
+        StringType::from_data(names),
+        StringType::from_data(versions),
+        StringType::from_data(licenses),
+    ]))
+    ```
+
+8. Edit `system_database.rs` to register the new table with `SystemDatabase`.
+
+    ```rust
+    impl SystemDatabase {
+        pub fn create(sys_db_meta: &mut InMemoryMetas, config: &Config) -> Self {
+            ...
+            CreditsTable::create(sys_db_meta.next_table_id()),
+            ...
+        }
+    }
+    ```
+
+## Testing a New System Table
+
+The system table tests are located at `tests/it/storages/system.rs`. For tables whose content changes infrequently, you can use golden file testing, which writes the table to a specified file and compares it with an expected file. For example:
+
+```rust
+#[tokio::test(flavor = "multi_thread")]
+async fn test_columns_table() -> Result<()> {
+    let (_guard, ctx) = crate::tests::create_query_context().await?;
+    let mut mint = Mint::new("tests/it/storages/testdata");
+    let file = &mut mint.new_goldenfile("columns_table.txt").unwrap();
+    let table = ColumnsTable::create(1);
+    run_table_tests(file, ctx, table).await?;
+    Ok(())
+}
+```
+
+For tables with dynamically changing content or external dependencies, testing methods are limited. You can test relatively fixed patterns such as the number of rows and columns, or verify that the output contains specific content. For example:
+
+```rust
+#[tokio::test(flavor = "multi_thread")]
+async fn test_metrics_table() -> Result<()> {
+    ...
+    let result = stream.try_collect::<Vec<_>>().await?;
+    let block = &result[0];
+    assert_eq!(block.num_columns(), 4);
+    assert!(block.num_rows() >= 1);
+    let output = pretty_format_blocks(result.as_slice())?;
+    assert!(output.contains("test_test_metrics_table_count"));
+    #[cfg(feature = "enable_histogram")]
+    assert!(output.contains("test_test_metrics_table_histogram"));
+    Ok(())
+}
+```
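
The string-to-bytes conversion described in step 6 can be tried outside Databend. The following is a minimal sketch under stated assumptions, not the code from `credits_table.rs`: the helper name `split_to_bytes` and the sample input are hypothetical, and a plain `&str` argument stands in for the build-time value of `env!("DATABEND_CREDITS_LICENSES")`.

```rust
/// Hypothetical helper (not part of Databend): splits a comma-separated list
/// into byte vectors, mirroring the `split_terminator` / `trim` / `as_bytes`
/// chain shown in step 6.
fn split_to_bytes(raw: &str) -> Vec<Vec<u8>> {
    raw.split_terminator(',')
        .map(|item| item.trim().as_bytes().to_vec())
        .collect()
}

fn main() {
    // Made-up sample input; the real table reads a build-time environment variable.
    let licenses = split_to_bytes("MIT, Apache-2.0 ,BSD-3-Clause,");

    // `split_terminator` does not yield an empty item after the trailing comma,
    // and `trim` removes the whitespace around each entry.
    assert_eq!(licenses.len(), 3);
    assert_eq!(licenses[1], b"Apache-2.0".to_vec());

    for license in &licenses {
        println!("{}", String::from_utf8_lossy(license));
    }
}
```

Each element of the result is a `Vec<u8>`, so the whole value has the `Vec<Vec<u8>>` shape that the guide passes to `StringType::from_data`.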