diff --git a/docs/reference/sql/functions/overview.md b/docs/reference/sql/functions/overview.md
index dbb94e9f8..e0161d9d1 100644
--- a/docs/reference/sql/functions/overview.md
+++ b/docs/reference/sql/functions/overview.md
@@ -48,9 +48,40 @@ Where the `datatype` can be any valid Arrow data type in this [list](https://arr
 DataFusion [String Function](./df-functions.md#string-functions).
 
 GreptimeDB provides:
-* `matches_term(expression, term)` for full text search.
+* `matches_term(expression, term)` for full text search. For details, read the [Fulltext Search](/user-guide/logs/fulltext-search.md).
+* `regexp_extract(str, regexp)` to extract the first substring in a string that matches a regular expression. Returns `NULL` if no match is found.
 
-For details, read the [Fulltext Search](/user-guide/logs/fulltext-search.md).
+#### regexp_extract
+
+Extracts the first substring in a string that matches a [regular expression](https://docs.rs/regex/latest/regex/#syntax). Returns `NULL` if no match is found.
+
+```sql
+regexp_extract(str, regexp)
+```
+
+**Arguments:**
+
+- **str**: String expression to operate on. Can be a constant, column, or function, and any combination of operators.
+- **regexp**: Regular expression to match against. Can be a constant, column, or function.
+
+**Note on Escaping:**
+
+GreptimeDB's regex escape behavior differs between MySQL and PostgreSQL compatibility modes:
+- **MySQL mode**: Requires double backslashes for escape sequences (e.g., `\\d`, `\\s`)
+- **PostgreSQL mode**: Single backslashes work by default (e.g., `\d`, `\s`), or use `E''` prefix for consistency with MySQL (e.g., `E'\\d'`)
+
+**Examples:**
+
+```sql
+SELECT regexp_extract('version 1.2.3', '\d+\.\d+\.\d+');
+-- Returns: 1.2.3
+
+SELECT regexp_extract('Phone: 123-456-7890', '\d{3}-\d{3}-\d{4}');
+-- Returns: 123-456-7890
+
+SELECT regexp_extract('no match here', '\d+\.\d+\.\d+');
+-- Returns: NULL
+```
 
 ### Math Functions
 
@@ -138,6 +169,8 @@ SELECT date_sub('2023-12-06 07:39:46.222'::TIMESTAMP_MS, '5 day'::INTERVAL);
 
 * `date_format(expression, fmt)` to format Timestamp, Date, or DateTime into string by the format:
 
+Supports `Date32`, `Date64`, and all `Timestamp` types.
+
 ```sql
 SELECT date_format('2023-12-06 07:39:46.222'::TIMESTAMP, '%Y-%m-%d %H:%M:%S:%3f');
 ```
diff --git a/i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/functions/overview.md b/i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/functions/overview.md
index 4d1cd63be..0bef5dbde 100644
--- a/i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/functions/overview.md
+++ b/i18n/zh/docusaurus-plugin-content-docs/current/reference/sql/functions/overview.md
@@ -47,9 +47,40 @@ arrow_cast(expression, datatype)
 DataFusion [字符串函数](./df-functions.md#string-functions)。
 
 GreptimeDB 提供:
-* `matches_term(expression, term)` 用于全文检索。
+* `matches_term(expression, term)` 用于全文检索。阅读[查询日志](/user-guide/logs/fulltext-search.md)文档获取更多详情。
+* `regexp_extract(str, regexp)` 提取字符串中与正则表达式匹配的第一个子串。如果没有找到匹配项则返回 `NULL`。
 
-阅读[查询日志](/user-guide/logs/fulltext-search.md)文档获取更多详情。
+#### regexp_extract
+
+提取字符串中与[正则表达式](https://docs.rs/regex/latest/regex/#syntax)匹配的第一个子串。如果没有找到匹配项则返回 `NULL`。
+
+```sql
+regexp_extract(str, regexp)
+```
+
+**参数:**
+
+- **str**: 要操作的字符串表达式。可以是常量、列或函数,以及运算符的任意组合。
+- **regexp**: 要匹配的正则表达式。可以是常量、列或函数。
+
+**关于转义的说明:**
+
+GreptimeDB 在 MySQL 和 PostgreSQL 兼容模式下的正则表达式转义行为有所不同:
+- **MySQL 模式**:转义序列需要使用双反斜杠(例如 `\\d`、`\\s`)
+- **PostgreSQL 模式**:默认情况下单反斜杠即可(例如 `\d`、`\s`),或者使用 `E''` 前缀以与 MySQL 保持一致(例如 `E'\\d'`)
+
+**示例:**
+
+```sql
+SELECT regexp_extract('version 1.2.3', '\d+\.\d+\.\d+');
+-- 返回: 1.2.3
+
+SELECT regexp_extract('Phone: 123-456-7890', '\d{3}-\d{3}-\d{4}');
+-- 返回: 123-456-7890
+
+SELECT regexp_extract('no match here', '\d+\.\d+\.\d+');
+-- 返回: NULL
+```
 
 ### 数学函数
 
@@ -134,6 +165,8 @@ SELECT date_sub('2023-12-06 07:39:46.222'::TIMESTAMP_MS, '5 day'::INTERVAL);
 
 * `date_format(expression, fmt)` 将 Timestamp、Date 或 DateTime 格式化:
 
+支持 `Date32`、`Date64` 和所有 `Timestamp` 类型。
+
 ```sql
 SELECT date_format('2023-12-06 07:39:46.222'::TIMESTAMP, '%Y-%m-%d %H:%M:%S:%3f');
 ```
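
Review note: the semantics this diff documents for `regexp_extract` (return the first match, `NULL` when nothing matches) can be checked outside the database. The snippet below is a minimal Python sketch under stated assumptions, not GreptimeDB's implementation: Python's `re` module stands in for the Rust `regex` syntax the docs link to, the helper reuses the name `regexp_extract` only for illustration, and `None` plays the role of SQL `NULL`.

```python
import re
from typing import Optional


def regexp_extract(text: str, pattern: str) -> Optional[str]:
    """Illustrative stand-in: first substring of `text` matching `pattern`, else None."""
    match = re.search(pattern, text)  # re.search stops at the first (leftmost) match
    return match.group(0) if match else None


# The three examples from the documented SQL, using raw strings so the
# regex engine receives single backslashes.
print(regexp_extract("version 1.2.3", r"\d+\.\d+\.\d+"))              # 1.2.3
print(regexp_extract("Phone: 123-456-7890", r"\d{3}-\d{3}-\d{4}"))    # 123-456-7890
print(regexp_extract("no match here", r"\d+\.\d+\.\d+"))              # None

# Escaping layers: a string literal that processes backslash escapes
# needs '\\d' to deliver '\d' to the regex engine.
assert "\\d+" == r"\d+"
```

The final assertion mirrors the escaping note in the diff: when the string-literal layer consumes one level of backslashes, as the note describes for MySQL mode and for PostgreSQL `E''` strings, the pattern must be written with doubled backslashes to reach the regex engine unchanged.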