From 80b8bd53651ef5b2fb1dd4a4ce9c43ad152342cf Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Tue, 9 Aug 2022 07:57:24 -0600 Subject: [PATCH 1/5] rough out content --- .../user-guide/sql/datafusion-functions.md | 92 +------- docs/source/user-guide/sql/index.rst | 2 +- .../source/user-guide/sql/scalar_functions.md | 205 ++++++++++++++++++ 3 files changed, 208 insertions(+), 91 deletions(-) create mode 100644 docs/source/user-guide/sql/scalar_functions.md diff --git a/docs/source/user-guide/sql/datafusion-functions.md b/docs/source/user-guide/sql/datafusion-functions.md index e37ba11e84c4..651fe7576c78 100644 --- a/docs/source/user-guide/sql/datafusion-functions.md +++ b/docs/source/user-guide/sql/datafusion-functions.md @@ -17,94 +17,6 @@ under the License. --> -# DataFusion-Specific Functions +# DataFusion Functions -These SQL functions are specific to DataFusion, or they are well known and have functionality which is specific to DataFusion. Specifically, the `to_timestamp_xx()` functions exist due to Arrow's support for multiple timestamp resolutions. - -## `to_timestamp` - -`to_timestamp()` is similar to the standard SQL function. It performs conversions to type `Timestamp(Nanoseconds, None)`, from: - -- Timestamp strings - - `1997-01-31T09:26:56.123Z` # RCF3339 - - `1997-01-31T09:26:56.123-05:00` # RCF3339 - - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T - - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified - - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset - - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds -- An Int64 array/column, values are nanoseconds since Epoch UTC -- Other Timestamp() columns or values - -Note that conversions from other Timestamp and Int64 types can also be performed using `CAST(.. AS Timestamp)`. However, the conversion functionality here is present for consistency with the other `to_timestamp_xx()` functions. 
- -## `to_timestamp_millis` - -`to_timestamp_millis()` does conversions to type `Timestamp(Milliseconds, None)`, from: - -- Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of Milliseconds resolution) - - `1997-01-31T09:26:56.123Z` # RCF3339 - - `1997-01-31T09:26:56.123-05:00` # RCF3339 - - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T - - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified - - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset - - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds -- An Int64 array/column, values are milliseconds since Epoch UTC -- Other Timestamp() columns or values - -Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolution; this function is the only way to convert/cast to millisecond resolution. - -## `to_timestamp_micros` - -`to_timestamp_micros()` does conversions to type `Timestamp(Microseconds, None)`, from: - -- Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of microseconds resolution) - - `1997-01-31T09:26:56.123Z` # RCF3339 - - `1997-01-31T09:26:56.123-05:00` # RCF3339 - - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T - - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified - - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset - - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds -- An Int64 array/column, values are microseconds since Epoch UTC -- Other Timestamp() columns or values - -Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolution; this function is the only way to convert/cast to microsecond resolution. 
- -## `to_timestamp_seconds` - -`to_timestamp_seconds()` does conversions to type `Timestamp(Seconds, None)`, from: - -- Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of secondseconds resolution) - - `1997-01-31T09:26:56.123Z` # RCF3339 - - `1997-01-31T09:26:56.123-05:00` # RCF3339 - - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T - - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified - - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset - - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds -- An Int64 array/column, values are seconds since Epoch UTC -- Other Timestamp() columns or values - -Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolution; this function is the only way to convert/cast to seconds resolution. - -## `extract` - -`extract(field FROM source)` - -- The `extract` function retrieves subfields such as year or hour from date/time values. - `source` must be a value expression of type timestamp, Data32, or Data64. `field` is an identifier that selects what field to extract from the source value. - The `extract` function returns values of type u32. 
- - `year` :`extract(year FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 2020` - - `month`:`extract(month FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 9` - - `week` :`extract(week FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 37` - - `day`: `extract(day FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 8` - - `hour`: `extract(hour FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 12` - - `minute`: `extract(minute FROM to_timestamp('2020-09-08T12:01:00+00:00')) -> 1` - - `second`: `extract(second FROM to_timestamp('2020-09-08T12:00:03+00:00')) -> 3` - -## `date_part` - -`date_part('field', source)` - -- The `date_part` function is modeled on the postgres equivalent to the SQL-standard function `extract`. - Note that here the field parameter needs to be a string value, not a name. - The valid field names for `date_part` are the same as for `extract`. - - `date_part('second', to_timestamp('2020-09-08T12:00:12+00:00')) -> 12` +This content has moved to [scalar functions](scalar_functions.md) diff --git a/docs/source/user-guide/sql/index.rst b/docs/source/user-guide/sql/index.rst index f6d3a0bbed3a..97753d708e1b 100644 --- a/docs/source/user-guide/sql/index.rst +++ b/docs/source/user-guide/sql/index.rst @@ -25,4 +25,4 @@ SQL Reference select ddl aggregate_functions - DataFusion Functions + scalar_functions diff --git a/docs/source/user-guide/sql/scalar_functions.md b/docs/source/user-guide/sql/scalar_functions.md new file mode 100644 index 000000000000..b2e28a1498e7 --- /dev/null +++ b/docs/source/user-guide/sql/scalar_functions.md @@ -0,0 +1,205 @@ + + +# Scalar Functions + +## Math Functions + +| Function | Notes | +| --------------------- | ------------------------------------------------- | +| abs(x) | absolute value | +| acos(x) | inverse cosine | +| asin(x) | inverse sine | +| atan(x) | inverse tangent | +| atan2(y, x) | inverse tangent of y / x | +| ceil(x) | nearest integer greater than or equal to argument | +| cos(x) | cosine | +| exp(x) | 
exponential | +| floor(x) | nearest integer less than or equal to argument | +| ln(x) | natural logarithm | +| log10(x) | base 10 logarithm | +| log2(x) | base 2 logarithm | +| power(base, exponent) | base raised to the power of exponent | +| round(x) | round to nearest integer | +| signum(x) | sign of the argument (-1, 0, +1) | +| sin(x) | sine | +| sqrt(x) | square root | +| tan(x) | tangent | +| trunc(x) | truncate toward zero | + +## Conditional Functions + +| Function | Notes | +| -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| coalesce | Returns the first of its arguments that is not null. Null is returned only if all arguments are null. It is often used to substitute a default value for null values when data is retrieved for display. | +| nullif | Returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the `coalesce` expression. | + +## String Functions + +| Function | Notes | +| ---------------- | ----- | +| ascii | | +| bit_length | | +| btrim | | +| char_length | | +| character_length | | +| concat | | +| concat_ws | | +| chr | | +| initcap | | +| left | | +| length | | +| lower | | +| lpad | | +| ltrim | | +| md5 | | +| octet_length | | +| repeat | | +| replace | | +| reverse | | +| right | | +| rpad | | +| rtrim | | +| digest | | +| split_part | | +| starts_with | | +| strpos | | +| substr | | +| translate | | +| trim | | +| upper | | + +## Regular Expression Functions + +| Function | Notes | +| -------------- | ----- | +| regexp_match | | +| regexp_replace | | + +## Temporal Functions + +### `to_timestamp` + +`to_timestamp()` is similar to the standard SQL function. 
It performs conversions to type `Timestamp(Nanoseconds, None)`, from: + +- Timestamp strings + - `1997-01-31T09:26:56.123Z` # RCF3339 + - `1997-01-31T09:26:56.123-05:00` # RCF3339 + - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T + - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified + - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset + - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds +- An Int64 array/column, values are nanoseconds since Epoch UTC +- Other Timestamp() columns or values + +Note that conversions from other Timestamp and Int64 types can also be performed using `CAST(.. AS Timestamp)`. However, the conversion functionality here is present for consistency with the other `to_timestamp_xx()` functions. + +### `to_timestamp_millis` + +`to_timestamp_millis()` does conversions to type `Timestamp(Milliseconds, None)`, from: + +- Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of Milliseconds resolution) + - `1997-01-31T09:26:56.123Z` # RCF3339 + - `1997-01-31T09:26:56.123-05:00` # RCF3339 + - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T + - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified + - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset + - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds +- An Int64 array/column, values are milliseconds since Epoch UTC +- Other Timestamp() columns or values + +Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolution; this function is the only way to convert/cast to millisecond resolution. 
+ +### `to_timestamp_micros` + +`to_timestamp_micros()` does conversions to type `Timestamp(Microseconds, None)`, from: + +- Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of microseconds resolution) + - `1997-01-31T09:26:56.123Z` # RCF3339 + - `1997-01-31T09:26:56.123-05:00` # RCF3339 + - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T + - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified + - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset + - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds +- An Int64 array/column, values are microseconds since Epoch UTC +- Other Timestamp() columns or values + +Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolution; this function is the only way to convert/cast to microsecond resolution. + +### `to_timestamp_seconds` + +`to_timestamp_seconds()` does conversions to type `Timestamp(Seconds, None)`, from: + +- Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of secondseconds resolution) + - `1997-01-31T09:26:56.123Z` # RCF3339 + - `1997-01-31T09:26:56.123-05:00` # RCF3339 + - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T + - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified + - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset + - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds +- An Int64 array/column, values are seconds since Epoch UTC +- Other Timestamp() columns or values + +Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolution; this function is the only way to convert/cast to seconds resolution. + +### `extract` + +`extract(field FROM source)` + +- The `extract` function retrieves subfields such as year or hour from date/time values. 
+ `source` must be a value expression of type timestamp, Data32, or Data64. `field` is an identifier that selects what field to extract from the source value. + The `extract` function returns values of type u32. + - `year` :`extract(year FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 2020` + - `month`:`extract(month FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 9` + - `week` :`extract(week FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 37` + - `day`: `extract(day FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 8` + - `hour`: `extract(hour FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 12` + - `minute`: `extract(minute FROM to_timestamp('2020-09-08T12:01:00+00:00')) -> 1` + - `second`: `extract(second FROM to_timestamp('2020-09-08T12:00:03+00:00')) -> 3` + +### `date_part` + +`date_part('field', source)` + +- The `date_part` function is modeled on the postgres equivalent to the SQL-standard function `extract`. + Note that here the field parameter needs to be a string value, not a name. + The valid field names for `date_part` are the same as for `extract`. 
+ - `date_part('second', to_timestamp('2020-09-08T12:00:12+00:00')) -> 12` + +### Other Temporal Functions: + +| Function | Notes | +| -------------------- | ------------ | +| date_trunc | | +| from_unixtime | | +| now() | current time | + +## Other Functions + +| Function | Notes | +| -------- | ----- | +| array | | +| in_list | | +| random | | +| sha224 | | +| sha256 | | +| sha384 | | +| sha512 | | +| struct | | +| to_hex | | \ No newline at end of file From 668a52dc51489b05c7d68daa7e952aa399f32661 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Tue, 9 Aug 2022 08:03:07 -0600 Subject: [PATCH 2/5] use sections instead of tables --- .../source/user-guide/sql/scalar_functions.md | 206 +++++++++++------- 1 file changed, 127 insertions(+), 79 deletions(-) diff --git a/docs/source/user-guide/sql/scalar_functions.md b/docs/source/user-guide/sql/scalar_functions.md index b2e28a1498e7..cfd60da3f92b 100644 --- a/docs/source/user-guide/sql/scalar_functions.md +++ b/docs/source/user-guide/sql/scalar_functions.md @@ -16,81 +16,133 @@ specific language governing permissions and limitations under the License. 
--> - # Scalar Functions ## Math Functions -| Function | Notes | -| --------------------- | ------------------------------------------------- | -| abs(x) | absolute value | -| acos(x) | inverse cosine | -| asin(x) | inverse sine | -| atan(x) | inverse tangent | -| atan2(y, x) | inverse tangent of y / x | -| ceil(x) | nearest integer greater than or equal to argument | -| cos(x) | cosine | -| exp(x) | exponential | -| floor(x) | nearest integer less than or equal to argument | -| ln(x) | natural logarithm | -| log10(x) | base 10 logarithm | -| log2(x) | base 2 logarithm | -| power(base, exponent) | base raised to the power of exponent | -| round(x) | round to nearest integer | -| signum(x) | sign of the argument (-1, 0, +1) | -| sin(x) | sine | -| sqrt(x) | square root | -| tan(x) | tangent | -| trunc(x) | truncate toward zero | +### abs(x) + +absolute value + +### acos(x) + +inverse cosine + +### asin(x) + +inverse sine + +### atan(x) + +inverse tangent + +### atan2(y, x) + +inverse tangent of y / x + +### ceil(x) + +nearest integer greater than or equal to argument + +### cos(x) + +cosine + +### exp(x) + +exponential + +### floor(x) + +nearest integer less than or equal to argument + +### ln(x) + +natural logarithm + +### log10(x) + +base 10 logarithm + +### log2(x) + +base 2 logarithm + +### power(base, exponent) + +base raised to the power of exponent + +### round(x) + +round to nearest integer + +### signum(x) + +sign of the argument (-1, 0, +1) + +### sin(x) + +sine + +### sqrt(x) + +square root + +### tan(x) + +tangent + +### trunc(x) + +truncate toward zero ## Conditional Functions -| Function | Notes | -| -------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| coalesce | Returns the first of its arguments that is not null. Null is returned only if all arguments are null. 
It is often used to substitute a default value for null values when data is retrieved for display. | -| nullif | Returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the `coalesce` expression. | +### coalesce + +Returns the first of its arguments that is not null. Null is returned only if all arguments are null. It is often used to substitute a default value for null values when data is retrieved for display. + +### nullif + +Returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the `coalesce` expression. | ## String Functions -| Function | Notes | -| ---------------- | ----- | -| ascii | | -| bit_length | | -| btrim | | -| char_length | | -| character_length | | -| concat | | -| concat_ws | | -| chr | | -| initcap | | -| left | | -| length | | -| lower | | -| lpad | | -| ltrim | | -| md5 | | -| octet_length | | -| repeat | | -| replace | | -| reverse | | -| right | | -| rpad | | -| rtrim | | -| digest | | -| split_part | | -| starts_with | | -| strpos | | -| substr | | -| translate | | -| trim | | -| upper | | +### ascii +### bit_length +### btrim +### char_length +### character_length +### concat +### concat_ws +### chr +### initcap +### left +### length +### lower +### lpad +### ltrim +### md5 +### octet_length +### repeat +### replace +### reverse +### right +### rpad +### rtrim +### digest +### split_part +### starts_with +### strpos +### substr +### translate +### trim +### upper ## Regular Expression Functions -| Function | Notes | -| -------------- | ----- | -| regexp_match | | -| regexp_replace | | +### regexp_match +### regexp_replace ## Temporal Functions @@ -182,24 +234,20 @@ Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolut The valid field names for `date_part` are the same as for `extract`. 
- `date_part('second', to_timestamp('2020-09-08T12:00:12+00:00')) -> 12` -### Other Temporal Functions: +### date_trunc +### from_unixtime +### now() -| Function | Notes | -| -------------------- | ------------ | -| date_trunc | | -| from_unixtime | | -| now() | current time | +current time ## Other Functions -| Function | Notes | -| -------- | ----- | -| array | | -| in_list | | -| random | | -| sha224 | | -| sha256 | | -| sha384 | | -| sha512 | | -| struct | | -| to_hex | | \ No newline at end of file +### array +### in_list +### random +### sha224 +### sha256 +### sha384 +### sha512 +### struct +### to_hex \ No newline at end of file From 18f35d29182433a27f77103584886ccb6d310b5c Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Tue, 9 Aug 2022 08:05:39 -0600 Subject: [PATCH 3/5] formatting --- .../source/user-guide/sql/scalar_functions.md | 126 +++++++++--------- 1 file changed, 63 insertions(+), 63 deletions(-) diff --git a/docs/source/user-guide/sql/scalar_functions.md b/docs/source/user-guide/sql/scalar_functions.md index cfd60da3f92b..f83611a97250 100644 --- a/docs/source/user-guide/sql/scalar_functions.md +++ b/docs/source/user-guide/sql/scalar_functions.md @@ -20,124 +20,124 @@ ## Math Functions -### abs(x) +### `abs(x)` absolute value -### acos(x) +### `acos(x)` inverse cosine -### asin(x) +### `asin(x)` inverse sine -### atan(x) +### `atan(x)` inverse tangent -### atan2(y, x) +### `atan2(y, x)` inverse tangent of y / x -### ceil(x) +### `ceil(x)` nearest integer greater than or equal to argument -### cos(x) +### `cos(x)` cosine -### exp(x) +### `exp(x)` exponential -### floor(x) +### `floor(x)` nearest integer less than or equal to argument -### ln(x) +### `ln(x)` natural logarithm -### log10(x) +### `log10(x)` base 10 logarithm -### log2(x) +### `log2(x)` base 2 logarithm -### power(base, exponent) +### `power(base, exponent)` base raised to the power of exponent -### round(x) +### `round(x)` round to nearest integer -### signum(x) +### `signum(x)` sign 
of the argument (-1, 0, +1) -### sin(x) +### `sin(x)` sine -### sqrt(x) +### `sqrt(x)` square root -### tan(x) +### `tan(x)` tangent -### trunc(x) +### `trunc(x)` truncate toward zero ## Conditional Functions -### coalesce +### `coalesce` Returns the first of its arguments that is not null. Null is returned only if all arguments are null. It is often used to substitute a default value for null values when data is retrieved for display. -### nullif +### `nullif` Returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the `coalesce` expression. | ## String Functions -### ascii -### bit_length -### btrim -### char_length -### character_length -### concat -### concat_ws -### chr -### initcap -### left -### length -### lower -### lpad -### ltrim -### md5 -### octet_length -### repeat -### replace -### reverse -### right -### rpad -### rtrim -### digest -### split_part -### starts_with -### strpos -### substr -### translate -### trim -### upper +### `ascii` +### `bit_length` +### `btrim` +### `char_length` +### `character_length` +### `concat` +### `concat_ws` +### `chr` +### `initcap` +### `left` +### `length` +### `lower` +### `lpad` +### `ltrim` +### `md5` +### `octet_length` +### `repeat` +### `replace` +### `reverse` +### `right` +### `rpad` +### `rtrim` +### `digest` +### `split_part` +### `starts_with` +### `strpos` +### `substr` +### `translate` +### `trim` +### `upper` ## Regular Expression Functions @@ -234,20 +234,20 @@ Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolut The valid field names for `date_part` are the same as for `extract`. 
- `date_part('second', to_timestamp('2020-09-08T12:00:12+00:00')) -> 12` -### date_trunc -### from_unixtime -### now() +### `date_trunc` +### `from_unixtime` +### `now` current time ## Other Functions -### array -### in_list -### random -### sha224 -### sha256 -### sha384 -### sha512 -### struct -### to_hex \ No newline at end of file +### `array` +### `in_list` +### `random` +### `sha224` +### `sha256` +### `sha384` +### `sha512` +### `struct` +### `to_hex` \ No newline at end of file From 103beaf14f683872faf6b6989de20b79b208c7f6 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Tue, 9 Aug 2022 08:13:41 -0600 Subject: [PATCH 4/5] prettier --- .../source/user-guide/sql/scalar_functions.md | 249 ++++++++++-------- 1 file changed, 145 insertions(+), 104 deletions(-) diff --git a/docs/source/user-guide/sql/scalar_functions.md b/docs/source/user-guide/sql/scalar_functions.md index f83611a97250..a368ed3e09c9 100644 --- a/docs/source/user-guide/sql/scalar_functions.md +++ b/docs/source/user-guide/sql/scalar_functions.md @@ -16,6 +16,7 @@ specific language governing permissions and limitations under the License. 
--> + # Scalar Functions ## Math Functions @@ -28,69 +29,69 @@ absolute value inverse cosine -### `asin(x)` +### `asin(x)` -inverse sine +inverse sine -### `atan(x)` +### `atan(x)` -inverse tangent +inverse tangent -### `atan2(y, x)` +### `atan2(y, x)` -inverse tangent of y / x +inverse tangent of y / x -### `ceil(x)` +### `ceil(x)` -nearest integer greater than or equal to argument +nearest integer greater than or equal to argument -### `cos(x)` +### `cos(x)` -cosine +cosine -### `exp(x)` +### `exp(x)` -exponential +exponential -### `floor(x)` +### `floor(x)` -nearest integer less than or equal to argument +nearest integer less than or equal to argument -### `ln(x)` +### `ln(x)` -natural logarithm +natural logarithm -### `log10(x)` +### `log10(x)` -base 10 logarithm +base 10 logarithm -### `log2(x)` +### `log2(x)` -base 2 logarithm +base 2 logarithm -### `power(base, exponent)` +### `power(base, exponent)` -base raised to the power of exponent +base raised to the power of exponent -### `round(x)` +### `round(x)` -round to nearest integer +round to nearest integer -### `signum(x)` +### `signum(x)` -sign of the argument (-1, 0, +1) +sign of the argument (-1, 0, +1) -### `sin(x)` +### `sin(x)` -sine +sine -### `sqrt(x)` +### `sqrt(x)` -square root +square root -### `tan(x)` +### `tan(x)` -tangent +tangent ### `trunc(x)` @@ -104,44 +105,74 @@ Returns the first of its arguments that is not null. Null is returned only if al ### `nullif` -Returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the `coalesce` expression. | +Returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the `coalesce` expression. 
| ## String Functions ### `ascii` + ### `bit_length` -### `btrim` -### `char_length` -### `character_length` -### `concat` -### `concat_ws` -### `chr` -### `initcap` -### `left` -### `length` + +### `btrim` + +### `char_length` + +### `character_length` + +### `concat` + +### `concat_ws` + +### `chr` + +### `initcap` + +### `left` + +### `length` + ### `lower` -### `lpad` -### `ltrim` -### `md5` -### `octet_length` -### `repeat` -### `replace` -### `reverse` + +### `lpad` + +### `ltrim` + +### `md5` + +### `octet_length` + +### `repeat` + +### `replace` + +### `reverse` + ### `right` -### `rpad` -### `rtrim` -### `digest` -### `split_part` -### `starts_with` -### `strpos` -### `substr` -### `translate` -### `trim` -### `upper` + +### `rpad` + +### `rtrim` + +### `digest` + +### `split_part` + +### `starts_with` + +### `strpos` + +### `substr` + +### `translate` + +### `trim` + +### `upper` ## Regular Expression Functions ### regexp_match + ### regexp_replace ## Temporal Functions @@ -151,12 +182,12 @@ Returns a null value if value1 equals value2; otherwise it returns value1. This `to_timestamp()` is similar to the standard SQL function. 
It performs conversions to type `Timestamp(Nanoseconds, None)`, from: - Timestamp strings - - `1997-01-31T09:26:56.123Z` # RCF3339 - - `1997-01-31T09:26:56.123-05:00` # RCF3339 - - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T - - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified - - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset - - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds + - `1997-01-31T09:26:56.123Z` # RCF3339 + - `1997-01-31T09:26:56.123-05:00` # RCF3339 + - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T + - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified + - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset + - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds - An Int64 array/column, values are nanoseconds since Epoch UTC - Other Timestamp() columns or values @@ -167,12 +198,12 @@ Note that conversions from other Timestamp and Int64 types can also be performed `to_timestamp_millis()` does conversions to type `Timestamp(Milliseconds, None)`, from: - Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of Milliseconds resolution) - - `1997-01-31T09:26:56.123Z` # RCF3339 - - `1997-01-31T09:26:56.123-05:00` # RCF3339 - - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T - - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified - - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset - - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds + - `1997-01-31T09:26:56.123Z` # RCF3339 + - `1997-01-31T09:26:56.123-05:00` # RCF3339 + - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T + - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified + - `1997-01-31 
09:26:56.123` # close to RCF3339 but uses a space and timezone offset + - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds - An Int64 array/column, values are milliseconds since Epoch UTC - Other Timestamp() columns or values @@ -183,12 +214,12 @@ Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolut `to_timestamp_micros()` does conversions to type `Timestamp(Microseconds, None)`, from: - Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of microseconds resolution) - - `1997-01-31T09:26:56.123Z` # RCF3339 - - `1997-01-31T09:26:56.123-05:00` # RCF3339 - - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T - - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified - - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset - - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds + - `1997-01-31T09:26:56.123Z` # RCF3339 + - `1997-01-31T09:26:56.123-05:00` # RCF3339 + - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T + - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified + - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset + - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds - An Int64 array/column, values are microseconds since Epoch UTC - Other Timestamp() columns or values @@ -199,12 +230,12 @@ Note that `CAST(.. 
AS Timestamp)` converts to Timestamps with Nanosecond resolut `to_timestamp_seconds()` does conversions to type `Timestamp(Seconds, None)`, from: - Timestamp strings, the same as supported by the regular timestamp() function (except the output is a timestamp of secondseconds resolution) - - `1997-01-31T09:26:56.123Z` # RCF3339 - - `1997-01-31T09:26:56.123-05:00` # RCF3339 - - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T - - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified - - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset - - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds + - `1997-01-31T09:26:56.123Z` # RCF3339 + - `1997-01-31T09:26:56.123-05:00` # RCF3339 + - `1997-01-31 09:26:56.123-05:00` # close to RCF3339 but with a space er than T + - `1997-01-31T09:26:56.123` # close to RCF3339 but no timezone et specified + - `1997-01-31 09:26:56.123` # close to RCF3339 but uses a space and timezone offset + - `1997-01-31 09:26:56` # close to RCF3339, no fractional seconds - An Int64 array/column, values are seconds since Epoch UTC - Other Timestamp() columns or values @@ -217,13 +248,13 @@ Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolut - The `extract` function retrieves subfields such as year or hour from date/time values. `source` must be a value expression of type timestamp, Data32, or Data64. `field` is an identifier that selects what field to extract from the source value. The `extract` function returns values of type u32. 
- - `year` :`extract(year FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 2020` - - `month`:`extract(month FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 9` - - `week` :`extract(week FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 37` - - `day`: `extract(day FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 8` - - `hour`: `extract(hour FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 12` - - `minute`: `extract(minute FROM to_timestamp('2020-09-08T12:01:00+00:00')) -> 1` - - `second`: `extract(second FROM to_timestamp('2020-09-08T12:00:03+00:00')) -> 3` + - `year` :`extract(year FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 2020` + - `month`:`extract(month FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 9` + - `week` :`extract(week FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 37` + - `day`: `extract(day FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 8` + - `hour`: `extract(hour FROM to_timestamp('2020-09-08T12:00:00+00:00')) -> 12` + - `minute`: `extract(minute FROM to_timestamp('2020-09-08T12:01:00+00:00')) -> 1` + - `second`: `extract(second FROM to_timestamp('2020-09-08T12:00:03+00:00')) -> 3` ### `date_part` @@ -232,22 +263,32 @@ Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolut - The `date_part` function is modeled on the postgres equivalent to the SQL-standard function `extract`. Note that here the field parameter needs to be a string value, not a name. The valid field names for `date_part` are the same as for `extract`. 
- - `date_part('second', to_timestamp('2020-09-08T12:00:12+00:00')) -> 12` + - `date_part('second', to_timestamp('2020-09-08T12:00:12+00:00')) -> 12` -### `date_trunc` -### `from_unixtime` -### `now` +### `date_trunc` -current time +### `from_unixtime` + +### `now` + +current time ## Other Functions -### `array` -### `in_list` -### `random` -### `sha224` -### `sha256` -### `sha384` -### `sha512` -### `struct` -### `to_hex` \ No newline at end of file +### `array` + +### `in_list` + +### `random` + +### `sha224` + +### `sha256` + +### `sha384` + +### `sha512` + +### `struct` + +### `to_hex` From 3f032246e011b3309e48824b1224136e23830896 Mon Sep 17 00:00:00 2001 From: Andy Grove Date: Tue, 9 Aug 2022 09:00:51 -0600 Subject: [PATCH 5/5] Update docs/source/user-guide/sql/scalar_functions.md Co-authored-by: Andrew Lamb --- docs/source/user-guide/sql/scalar_functions.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/source/user-guide/sql/scalar_functions.md b/docs/source/user-guide/sql/scalar_functions.md index a368ed3e09c9..0791cdf3af83 100644 --- a/docs/source/user-guide/sql/scalar_functions.md +++ b/docs/source/user-guide/sql/scalar_functions.md @@ -267,6 +267,8 @@ Note that `CAST(.. AS Timestamp)` converts to Timestamps with Nanosecond resolut ### `date_trunc` +### `date_bin` + ### `from_unixtime` ### `now`