150 changes: 119 additions & 31 deletions docs/en/docs/advanced/time-zone.md
Expand Up @@ -26,46 +26,42 @@ under the License.

# Time Zone

Doris supports multiple time zone settings

## Noun Interpretation

* FE: Frontend, the front-end node of Doris. Responsible for metadata management and request access.
* BE: Backend, Doris's back-end node. Responsible for query execution and data storage.
Doris supports custom time zone settings

## Basic concepts

There are multiple time zone related parameters in Doris

* `system_time_zone`:

When the server starts, it will be set automatically according to the time zone set by the machine, which cannot be modified after setting.
The following two time zone related parameters exist within Doris:

* `time_zone`:

Server current time zone, set it at session level or global level.
- `system_time_zone` : When the server starts up, it will be set automatically according to the time zone set by the machine, and cannot be modified after it is set.
- `time_zone` : The current time zone of the cluster.

## Specific operations

1. `SHOW VARIABLES LIKE '%time_zone%'`

View the current time zone related configuration

2. `SET time_zone = 'Asia/Shanghai'`
2. `SET [global] time_zone = 'Asia/Shanghai'`

This command sets the time zone at the session level. If the `global` keyword is used, Doris FE persists the parameter and it takes effect for all new sessions afterwards.

## Data source

The time zone data contains the name of the time zone, the corresponding time offset, and the change of daylight saving time. On the machine where the BE is located, the sources of the data are as follows:

This command can set the session level time zone, which will fail after disconnection.
1. the directory specified by the `TZDIR` environment variable
2. the `/usr/share/zoneinfo` directory
3. the `zoneinfo` directory generated under the Doris BE deployment directory, which comes from `resource/zoneinfo.tar.gz` in the Doris repository.

3. `SET global time_zone = 'Asia/Shanghai'`

This command can set time zone parameters at the global level. The FE will persist the parameters and will not fail when the connection is disconnected.
The data sources above are searched in order, and the first one found is used. If none of the three is found, the Doris BE fails to start; in that case, rebuild the BE correctly or obtain the distribution package.
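
The lookup order can be illustrated with a minimal Python sketch. This is not the BE's actual code: the function names and the `be_home` parameter are hypothetical, and it assumes `TZDIR` is read from the process environment.

```python
import os
from pathlib import Path
from typing import List, Optional, Sequence


def candidate_zoneinfo_dirs(be_home: str) -> List[Path]:
    """Return the lookup locations in the documented order.

    `be_home` is a hypothetical name for the BE deployment directory.
    """
    candidates = []
    tzdir = os.environ.get("TZDIR")  # 1. TZDIR, only if it is set
    if tzdir:
        candidates.append(Path(tzdir))
    candidates.append(Path("/usr/share/zoneinfo"))  # 2. system default
    candidates.append(Path(be_home) / "zoneinfo")   # 3. copy shipped with the BE
    return candidates


def first_existing_dir(candidates: Sequence[Path]) -> Optional[Path]:
    """Return the first candidate that is a directory, or None if none exists."""
    for path in candidates:
        if path.is_dir():
            return path
    return None
```

A `None` result corresponds to the case in which the BE would fail to start.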

### Impact of time zone
## Impact of time zone

Time zone setting affects the display and storage of time zone sensitive values.
### 1. Functions

It includes the values displayed by time functions such as `NOW()` or `CURTIME()`, as well as the time values in `SHOW LOAD` and `SHOW BACKENDS` statements.
This includes values displayed by time functions such as `NOW()` or `CURTIME()`, as well as time values in `show load` and `show backends`.

However, it does not affect the `LESS THAN VALUE` of the time-type partition column in the `CREATE TABLE` statement, nor does it affect the display of values stored as `DATE/DATETIME` type.
However, it does not affect the `LESS THAN` value of time-type partition columns in `create table`, nor does it affect the display of values stored as `date/datetime` types.

Functions affected by time zone:

Expand All @@ -79,18 +75,110 @@ Functions affected by time zone:

* `CONVERT_TZ`: Converts a date and time from one specified time zone to another.

## Restrictions
### 2. Values of time types

For `DATE`, `DATEV2`, `DATETIME`, `DATETIMEV2` types, we support time zone conversion when inserting data.

- If the data comes with a time zone, such as "2020-12-12 12:12:12+08:00", and the current Doris `time_zone = +00:00`, you get the actual value "2020-12-12 04:12:12".
- If the data does not come with a time zone, such as "2020-12-12 12:12:12", the time is considered to be absolute and no conversion occurs.
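
The two cases above can be checked with a short Python sketch. It only mirrors the arithmetic with fixed-offset zones and is not how Doris implements the conversion; `store_datetime` and `session_offset_hours` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def store_datetime(literal: str, session_offset_hours: int) -> str:
    """Mimic the documented insert rule using fixed-offset zones only.

    A literal carrying an offset is shifted into the session zone; a literal
    without one is kept unchanged.
    """
    parsed = datetime.fromisoformat(literal)
    if parsed.tzinfo is None:
        # No time zone suffix: treated as absolute, no conversion.
        return parsed.strftime("%Y-%m-%d %H:%M:%S")
    session_zone = timezone(timedelta(hours=session_offset_hours))
    return parsed.astimezone(session_zone).strftime("%Y-%m-%d %H:%M:%S")


print(store_datetime("2020-12-12 12:12:12+08:00", 0))  # 2020-12-12 04:12:12
print(store_datetime("2020-12-12 12:12:12", 0))        # 2020-12-12 12:12:12
```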

### 3. Daylight Saving Time

Daylight Saving Time is essentially the actual time offset of a named time zone, which changes on certain dates.

For example, the `America/Los_Angeles` time zone contains a Daylight Saving Time adjustment that begins and ends approximately in March and November of each year. That is, the actual `America/Los_Angeles` offset changes from `-08:00` to `-07:00` when Daylight Saving Time starts in March, and from `-07:00` back to `-08:00` when it ends in November.
If you do not want Daylight Saving Time to take effect, set `time_zone` to `-08:00` instead of `America/Los_Angeles`.

## Usage

Time zone values can be given in a variety of formats. The following standard formats are well supported in Doris:

1. standard named time zone formats, such as "Asia/Shanghai", "America/Los_Angeles".
2. standard offset formats, such as "+02:30", "-10:00".
3. abbreviated time zone formats, currently only support:
    1. "GMT", "UTC", equivalent to "+00:00" time zone
    2. "CST", which is equivalent to the "Asia/Shanghai" time zone
4. single letter Z, for Zulu time zone, equivalent to "+00:00" time zone
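
As an illustration of the four formats listed above, the following Python sketch normalizes a value to either a named zone or a `+HH:MM` offset. It is an assumption-laden model rather than Doris's parser: `normalize_time_zone` and both patterns are hypothetical, matching is case-sensitive, and named zones are not checked against the tz database.

```python
import re

# Abbreviations named in the list above, mapped to their documented equivalents.
ABBREVIATIONS = {"GMT": "+00:00", "UTC": "+00:00", "CST": "Asia/Shanghai", "Z": "+00:00"}
OFFSET_PATTERN = re.compile(r"^[+-]\d{2}:\d{2}$")
# Hypothetical shape check for "Area/Location" names; existence is not verified.
NAMED_PATTERN = re.compile(r"^[A-Za-z_]+(/[A-Za-z0-9_+-]+)+$")


def normalize_time_zone(value: str) -> str:
    """Map a listed time zone spelling to a named zone or a +HH:MM offset.

    Raises ValueError for anything outside the four listed formats.
    """
    if value in ABBREVIATIONS:
        return ABBREVIATIONS[value]
    if OFFSET_PATTERN.match(value) or NAMED_PATTERN.match(value):
        return value
    raise ValueError(f"unsupported time zone format: {value!r}")


print(normalize_time_zone("CST"))     # Asia/Shanghai
print(normalize_time_zone("Z"))       # +00:00
print(normalize_time_zone("-10:00"))  # -10:00
```

Rejecting everything else by default matches the advice not to rely on unlisted formats.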

Note: Some other formats are currently supported in some imports in Doris due to different implementations. **Production environments should not rely on these formats that are not listed here, and their behaviour may change at any time**, so keep an eye on the relevant changelog for version updates.

## Best Practices

### Time Zone Sensitive Data

Three settings influence how time zone sensitive data is handled:

1. session variable `time_zone` -- cluster timezone
2. header `timezone` specified during import (Stream Load, Broker Load, etc.) -- importing timezone
3. timezone type literal "+08:00" in "2023-12-12 08:00:00+08:00" -- data timezone

We can understand it as follows:

Doris currently supports importing data from any time zone. Since time types such as `DATETIME` do not contain time zone information, the time type data in the Doris cluster can be divided into two categories:

1. absolute time
2. time in a specific time zone

Absolute time is associated with a data scenario that is independent of any time zone. Such data should be imported without a time zone suffix, and it is stored as is. Because it is not associated with an actual time zone, the result of a function such as `unix_timestamp` is meaningless for it. Changes to the cluster `time_zone` do not affect its use.

Time in a specific time zone is the other category, where the "specific time zone" is the session variable `time_zone`. As a matter of best practice, this variable should be set before data is imported **and never changed**. Under that condition, this type of time data in the Doris cluster means: time in the `time_zone` time zone. Example:

```sql
mysql> select @@time_zone;
+----------------+
| @@time_zone |
+----------------+
| Asia/Hong_Kong |
+----------------+
1 row in set (0.12 sec)

mysql> insert into dtv23 values('2020-12-12 12:12:12+02:00'); --- absolute timezone is +02:00
Query OK, 1 row affected (0.27 sec)

mysql> select * from dtv23;
+-------------------------+
| k0 |
+-------------------------+
| 2020-12-12 18:12:12.000 | --- converted to Doris' cluster timezone Asia/Hong_Kong. This semantics should be maintained.
+-------------------------+
1 row in set (0.19 sec)

mysql> set time_zone = 'America/Los_Angeles';
Query OK, 0 rows affected (0.15 sec)

mysql> select * from dtv23;
+-------------------------+
| k0 |
+-------------------------+
| 2020-12-12 18:12:12.000 | --- If time_zone is modified, the time value does not change and its meaning is disturbed.
+-------------------------+
1 row in set (0.18 sec)

mysql> insert into dtv23 values('2020-12-12 12:12:12+02:00');
Query OK, 1 row affected (0.17 sec)

Time zone values can be given in several formats, case-insensitive:
mysql> select * from dtv23;
+-------------------------+
| k0 |
+-------------------------+
| 2020-12-12 02:12:12.000 |
| 2020-12-12 18:12:12.000 |
+-------------------------+ --- the data has been misplaced.
2 rows in set (0.19 sec)
```
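
The arithmetic behind the session above can be reproduced with a small Python sketch. It stands in fixed offsets for the named zones, which is an assumption that holds for this December date (`Asia/Hong_Kong` at +08:00, `America/Los_Angeles` at -08:00); `stored_value` is a hypothetical helper, not a Doris function.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the named zones on 2020-12-12 (assumption).
HONG_KONG = timezone(timedelta(hours=8))
LOS_ANGELES = timezone(timedelta(hours=-8))


def stored_value(literal: str, session_zone: timezone) -> str:
    """Convert an offset-bearing literal to wall-clock text in the session zone."""
    parsed = datetime.fromisoformat(literal)
    return parsed.astimezone(session_zone).strftime("%Y-%m-%d %H:%M:%S")


literal = "2020-12-12 12:12:12+02:00"
first = stored_value(literal, HONG_KONG)     # inserted before time_zone changed
second = stored_value(literal, LOS_ANGELES)  # inserted after time_zone changed
print(first)   # 2020-12-12 18:12:12
print(second)  # 2020-12-12 02:12:12
```

The same instant ends up as two different stored values, which is why the column no longer has a single meaning once `time_zone` has been changed between inserts.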

* A string representing UTC offset, such as '+10:00' or '-6:00'.
In summary, the best practice for dealing with time zone issues is to:

* Standard time zone formats, such as "Asia/Shanghai", "America/Los_Angeles"
1. Determine the time zone that the cluster represents and set `time_zone` before use; do not change it afterwards.
2. Set the header `timezone` on import to match the cluster `time_zone`.
3. For absolute time, import without a time zone suffix; for time in a time zone, import with a specific time zone suffix, which will be converted to the Doris `time_zone` time zone after import.

* Abbreviated time zone formats such as MET and CTT are not supported. Because the abbreviated time zone is ambiguous in different scenarios, it is not recommended to use it.
### Daylight Saving Time

* In order to be compatible with Doris and support CST abbreviated time zone, CST will be internally transferred to "Asia/Shanghai", which is Chinese standard time zone.
The start and end times for Daylight Saving Time are taken from the [current time zone data source](#data-source) and may not correspond exactly to the officially recognised times for the time zone's location in the current year. This data is maintained by ICANN. If you need Daylight Saving Time to behave as specified for the current year, make sure that the data source selected by Doris is the latest ICANN published time zone data, which can be downloaded from the links in [Extended Reading](#extended-reading).

## Time zone format list
## Extended Reading

[List of TZ database time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones)
- [List of tz database time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones)
- [IANA Time Zone Database](https://www.iana.org/time-zones)
- [The tz-announce Archives](https://mm.icann.org/pipermail/tz-announce/)
4 changes: 2 additions & 2 deletions docs/en/docs/advanced/variables.md
Expand Up @@ -393,11 +393,11 @@ Note that the comment must start with /*+ and can only follow the SELECT.

* `system_time_zone`

Displays the current system time zone. Cannot be changed.
Set to the current system time zone when the cluster is initialised. It cannot be changed.

* `time_zone`

Used to set the time zone of the current session. The time zone has an effect on the results of certain time functions. For the time zone, see [here](./time-zone.md).
Used to set the time zone for the current session. Defaults to the value of `system_time_zone`. It affects the results of certain time functions. For more information, see the [time zone](./time-zone) documentation.

* `tx_isolation`

Expand Up @@ -95,7 +95,7 @@ Parameter introduction:

9. strict_mode: The user specifies whether to enable strict mode for this import. The default is off. To enable it, pass `-H "strict_mode: true"`.

10. timezone: Specify the time zone used for this import. The default is UTC+8. This parameter affects the results of all time zone-related functions involved in the import.
10. timezone: Specifies the timezone used for this import. The default is "+08:00". This variable replaces the session variable `time_zone` in this import transaction. See the section "Importing with timezones" in [Best Practice](#best-practice) for more information.

11. exec_mem_limit: Import memory limit. Default is 2GB. The unit is bytes.

Expand Down Expand Up @@ -462,3 +462,11 @@ separated by commas.

Doris also limits the number of import tasks running at the same time in the cluster, usually ranging from 10-20. Import jobs submitted after that will be rejected.

10. Importing with timezones

Doris currently has no built-in time type that carries a time zone. All `DATETIME`-related types represent absolute points in time without time zone information, and their values do not change when the time zone of the Doris system changes. Therefore, when importing data that carries a time zone, we uniformly **convert it to data in a specific target time zone**. In the Doris system, this is the time zone represented by the session variable `time_zone`.

In an import, however, the target time zone is specified by the parameter `timezone`, which replaces the session variable `time_zone` both when time zone conversions occur and when time zone sensitive functions are evaluated. Therefore, unless there are special circumstances, `timezone` should be set in the import transaction to match the `time_zone` of the current Doris cluster, so that all time data with a time zone is converted to that time zone.
For example, if the Doris system timezone is "+08:00", and the time column in the imported data contains two pieces of data, "2012-01-01 01:00:00Z" and "2015-12-12 12:12:12-08:00", then after we specify the timezone of the imported transaction via `-H "timezone: +08:00"` during import, both pieces of data will be converted to that timezone, resulting in the results "2012-01-01 09:00:00" and "2015-12-13 04:12:12".
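
The two conversions in this example can be verified with a Python sketch. It only models the arithmetic for an offset-style `timezone` header; `parse_offset` and `convert_for_import` are hypothetical helpers, not part of Doris or Stream Load.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(offset: str) -> timezone:
    """Turn a '+HH:MM' or '-HH:MM' string into a fixed-offset tzinfo."""
    sign = -1 if offset[0] == "-" else 1
    hours, minutes = offset[1:].split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def convert_for_import(literal: str, import_timezone: str) -> str:
    """Convert an offset-bearing literal into the import's target time zone."""
    # fromisoformat accepts a trailing 'Z' only on Python 3.11+, so spell it out.
    if literal.endswith("Z"):
        literal = literal[:-1] + "+00:00"
    parsed = datetime.fromisoformat(literal)
    if parsed.tzinfo is None:
        # No time zone in the data: keep the value unchanged.
        return parsed.strftime("%Y-%m-%d %H:%M:%S")
    target = parse_offset(import_timezone)
    return parsed.astimezone(target).strftime("%Y-%m-%d %H:%M:%S")


print(convert_for_import("2012-01-01 01:00:00Z", "+08:00"))       # 2012-01-01 09:00:00
print(convert_for_import("2015-12-12 12:12:12-08:00", "+08:00"))  # 2015-12-13 04:12:12
```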

For a more detailed understanding, see [time-zone](../../../../advanced/time-zone) document.
35 changes: 32 additions & 3 deletions docs/en/docs/sql-manual/sql-reference/Data-Types/DATETIME.md
Expand Up @@ -41,7 +41,36 @@ The form of printing is 'yyyy-MM-dd HH:mm:ss.SSSSSS'

### note

DATETIME supports precision up to microseconds.
DATETIME supports temporal precision up to microseconds. When imported DATETIME data is parsed on the BE side (e.g. using Stream Load, Spark Load, etc.), or on the FE side with [Nereids](../../../query-acceleration/nereids) enabled, fractional digits exceeding the current precision are **rounded**.
When reading DATETIME input, a time zone is resolved if it directly follows the original DATETIME literal:
```sql
<date> <time>[<timezone>]
```

### keywords
DATETIME
For the specific supported formats for `<timezone>`, see [timezone](../../../advanced/time-zone). Note that the `DATE`, `DATEV2`, `DATETIME`, and `DATETIMEV2` types **don't** contain time zone information. For example, if the input time string "2012-12-12 08:00:00+08:00" is parsed under the current time zone "+02:00", the converted value "2012-12-12 02:00:00" is stored in the DATETIME column. After that, the stored value does not change, no matter how the cluster's variables are changed.

### example

```sql
mysql> select @@time_zone;
+----------------+
| @@time_zone |
+----------------+
| Asia/Hong_Kong |
+----------------+
1 row in set (0.11 sec)

mysql> insert into dtv23 values ("2020-12-12 12:12:12Z"), ("2020-12-12 12:12:12GMT"), ("2020-12-12 12:12:12+02:00"), ("2020-12-12 12:12:12America/Los_Angeles");
Query OK, 4 rows affected (0.17 sec)

mysql> select * from dtv23;
+-------------------------+
| k0 |
+-------------------------+
| 2020-12-12 20:12:12.000 |
| 2020-12-12 20:12:12.000 |
| 2020-12-13 04:12:12.000 |
| 2020-12-12 18:12:12.000 |
+-------------------------+
4 rows in set (0.15 sec)
```
Expand Up @@ -49,25 +49,7 @@ illustrate:
> Note:
>
> 1. Only ADMIN users can set variables to take effect globally
> 2. The globally effective variable does not affect the variable value of the current session, but only affects the variable in the new session.

Variables that support both the current session and the global effect include:

- `time_zone`
- `wait_timeout`
- `sql_mode`
- `enable_profile`
- `query_timeout`
- <version since="dev" type="inline">`insert_timeout`</version>
- `exec_mem_limit`
- `batch_size`
- `allow_partition_column_nullable`
- `insert_visible_timeout_ms`
- `enable_fold_constant_by_be`

Variables that only support global effects include:

- `default_rowset_type`
> 2. The globally effective variable affects the current session and new sessions thereafter, but does not affect other sessions that currently exist.

### Example

Expand All @@ -87,5 +69,3 @@ Variables that only support global effects include:

SET, VARIABLE

### Best Practice
