
Implement support of syslog format in the parseDateTimeBestEffort() function #50925

Merged
merged 22 commits into from Jun 15, 2023

Conversation


@sirvickr sirvickr commented Jun 13, 2023

Closes #45306

[attached screenshot: syslog timestamp]

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Added the possibility to use date and time arguments in syslog timestamp format in functions parseDateTimeBestEffort*() and parseDateTime64BestEffort*().
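A minimal usage sketch of the change (the exact results depend on the current date, since syslog timestamps omit the year):

```sql
-- Syslog timestamps carry no year, e.g. "Jun 13 14:05:17".
-- With this change, the BestEffort parsers accept such strings and
-- infer the year relative to the current moment.
SELECT
    parseDateTimeBestEffort('Jun 13 14:05:17') AS dt,
    parseDateTime64BestEffort('Jun 13 14:05:17') AS dt64;
```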

@rvasin

This comment was marked as outdated.

@rschu1ze rschu1ze self-assigned this Jun 13, 2023
@rschu1ze rschu1ze added the can be tested Allows running workflows for external contributors label Jun 13, 2023
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-improvement Pull request with some product improvements label Jun 13, 2023

robot-ch-test-poll4 commented Jun 13, 2023

This is an automated comment for commit bd16233 with a description of existing statuses. It is updated for the latest CI run.
The full report is available here
The overall status of the commit is 🔴 failure

| Check name | Description | Status |
|---|---|---|
| AST fuzzer | Runs randomly generated queries to catch program errors. The build type is optionally given in parentheses. If it fails, ask a maintainer for help | 🟢 success |
| CI running | A meta-check that indicates the running CI. Normally it is in success or pending state. A failed status indicates some problem with the PR | 🟢 success |
| ClickHouse build check | Builds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often have enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log by grepping for cmake. Use these options and follow the general build process | 🟢 success |
| Compatibility check | Checks that the clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help | 🟢 success |
| Docker image for servers | The check to build and optionally push the mentioned image to Docker Hub | 🟢 success |
| Docs Check | Builds and tests the documentation | 🟢 success |
| Fast test | Normally the first check run for a PR. It builds ClickHouse and runs most of the stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here | 🟢 success |
| Flaky tests | Checks whether newly added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer and additional randomization of thread scheduling. Integration tests are run up to 10 times. If a new test fails at least once, or runs too long, this check turns red. We don't allow flaky tests; read the doc | 🟢 success |
| Install packages | Checks that the built packages are installable in a clean environment | 🟢 success |
| Integration tests | The integration tests report. The package type is given in parentheses, and the optional part/total tests in square brackets | 🟢 success |
| Mergeable Check | Checks if all other necessary checks are successful | 🟢 success |
| Performance Comparison | Measures changes in query performance. The performance test report is described in detail here. The optional part/total tests are given in square brackets | 🟢 success |
| Push to Dockerhub | The check for building and pushing the CI-related Docker images to Docker Hub | 🟢 success |
| SQLancer | Fuzzing tests that detect logical bugs with the SQLancer tool | 🟢 success |
| Sqllogic | Runs ClickHouse on the sqllogic test set against sqlite and checks that all statements pass | 🟢 success |
| Stateful tests | Runs stateful functional tests for ClickHouse binaries built in various configurations: release, debug, with sanitizers, etc. | 🟢 success |
| Stateless tests | Runs stateless functional tests for ClickHouse binaries built in various configurations: release, debug, with sanitizers, etc. | 🔴 failure |
| Stress test | Runs stateless functional tests concurrently from several clients to detect concurrency-related errors | 🟢 success |
| Style Check | Runs a set of checks to keep the code style clean. If some checks fail, see the related log from the report | 🟢 success |
| Unit tests | Runs the unit tests for different release types | 🟢 success |
| Upgrade check | Runs stress tests on the server version from the last release and then tries to upgrade it to the version from the PR. It checks whether the new server can start up successfully without errors, crashes, or sanitizer asserts | 🟢 success |

@@ -0,0 +1,46 @@
WITH
86400 AS secs_in_day,
now() AS ts_now,
Member

Using now() in a test is a recipe for unstable test results.

In general, the test setup looks a bit overengineered. Maybe we can use a dumber test structure (in particular, constant input values) like in 00813_parse_date_time_best_effort_more.sql or 01442_date_time_with_params.sql?

Contributor Author

I was thinking about constant timestamps in the first place. Unfortunately, parseDateTimeBestEffort() uses now() internally (namely, it uses time() to get the current moment), so it looks like I can't avoid using it in the test.
I came up with calculating a "time shift" between a reference timestamp and the current moment, thereby "freezing" the test's point in time regardless of when it actually runs.
However, the CI run shows that I failed to avoid flakiness, and I'm now trying to tackle this issue and make the test drastically more reliable.

Contributor Author

I managed to pinpoint the origin of the problem and work around it. I'll try to explain everything from scratch.
Here is a debug query close to the original flaky test:

WITH
    now() AS dt_now, 
    86400 AS delta,
    dt_now - delta AS dt_before,
    dt_now + delta AS dt_after,
    toDateTime('2023-06-10 00:00:00') AS dt_ref,
    dateDiff('second', dt_ref, dt_now) AS time_shift,
    formatDateTime(dt_around, '%b %e %T') AS syslog_arg
SELECT
    dt_now,
    time_shift, 
    dt_before, dt_after, dt_around, syslog_arg,
    parseDateTimeBestEffort(syslog_arg) AS res_actual,
    formatDateTime(dt_around - time_shift, '%b %e %T') AS around_ref_point,
    res_actual - time_shift AS res_reference
FROM (SELECT arrayJoin([dt_before, dt_after]) AS dt_around)
FORMAT PrettySpaceNoEscapes;

Here are the results of two consecutive runs of the query (one right after the other, within the same second). The first is correct (the frequent case); the second is erroneous (it reproduces quite rarely):

              dt_now   time_shift             dt_before              dt_after             dt_around   syslog_arg                 res_actual   around_ref_point         res_reference

 2023-06-15 08:27:08       462428   2023-06-14 08:27:08   2023-06-16 08:27:08   2023-06-14 08:27:08   Jun 14 08:27:08   2023-06-14 08:27:08   Jun  9 00:00:00    2023-06-09 00:00:00 
 2023-06-15 08:27:08       462428   2023-06-14 08:27:08   2023-06-16 08:27:08   2023-06-16 08:27:08   Jun 16 08:27:08   2022-06-16 08:27:08   Jun 11 00:00:00    2022-06-11 00:00:00 

              dt_now   time_shift             dt_before              dt_after             dt_around   syslog_arg                 res_actual   around_ref_point         res_reference

 2023-06-15 08:27:08       462428   2023-06-14 08:27:08   2023-06-16 08:27:08   2023-06-14 08:27:09   Jun 14 08:27:09   2023-06-14 08:27:09   Jun  9 00:00:01    2023-06-09 00:00:01 
 2023-06-15 08:27:08       462428   2023-06-14 08:27:08   2023-06-16 08:27:08   2023-06-16 08:27:09   Jun 16 08:27:09   2022-06-16 08:27:09   Jun 11 00:00:01    2022-06-11 00:00:01 

First, I get the current moment: now() AS dt_now - here it happened to be 2023-06-15 08:27:08.
Second, I step backwards and forward in time by one day (minus/plus 86400 seconds):
dt_now - delta AS dt_before and dt_now + delta AS dt_after, i.e. 2023-06-14 08:27:08 and 2023-06-16 08:27:08.
After that, I join these two time points into a temporary dataset, from which they are extracted under the name dt_around:
FROM (SELECT arrayJoin([dt_before, dt_after]) AS dt_around).
Then each point is formatted as the syslog timestamp string under test: formatDateTime(dt_around, '%b %e %T') AS syslog_arg, i.e. Jun 14 08:27:08 and Jun 16 08:27:08, and passed to parseDateTimeBestEffort as an argument:
parseDateTimeBestEffort(syslog_arg) AS res_actual.
Of course, res_actual, as well as dt_now and the other transient values, changes every second the test is run, so these are a poor fit for a .reference file and thus for a functional test.
Therefore, I virtually move every test run to the same, arbitrarily selected "reference point", regardless of when the test actually runs. Here is my reference point: toDateTime('2023-06-10 00:00:00') AS dt_ref. The test virtually runs at this time point every time.
So, I calculate the time shift (in seconds) between the reference point and the current moment:
dateDiff('second', dt_ref, dt_now) AS time_shift.
Then I convert the current (transient) result of the function into the reference result, as if the function had been called at the reference time point: res_actual - time_shift AS res_reference.
On top of all that, just to make the case clearer, I also print the "reference" arguments (around_ref_point) instead of their transient counterparts (syslog_arg):
formatDateTime(dt_around - time_shift, '%b %e %T') AS around_ref_point.
This is how the test is meant to work, and it does work most of the time.
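The "time shift" normalization can be sketched in isolation (a minimal example using the same 2023-06-10 reference point; the output may still be off near year-inference edge cases):

```sql
-- Sketch of the time-shift normalization: parse a transient syslog
-- string derived from now(), then shift the result back to the fixed
-- reference point so the output no longer depends on when it runs.
WITH
    now() AS dt_now,
    toDateTime('2023-06-10 00:00:00') AS dt_ref,
    dateDiff('second', dt_ref, dt_now) AS time_shift
SELECT
    parseDateTimeBestEffort(formatDateTime(dt_now, '%b %e %T')) - time_shift AS res_reference;
-- Should print 2023-06-10 00:00:00 whenever the parser infers the year correctly.
```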

Now let's look at the problem.

Contributor Author

@sirvickr sirvickr Jun 15, 2023

[attached screenshot: issue]

Let's take a close look at this part of both query results.
As a reminder, this is how I create and use the two time points (one in the past, one in the future):
FROM (SELECT arrayJoin([dt_before, dt_after]) AS dt_around).
Namely, I put the previously calculated values into a dataset and then extract them for further use under the name dt_around. So I expect to see them unchanged, and most of the time they are.
However, from time to time the date-times read back from the dataset differ from the values that were put in, as the screenshot above shows.
Namely, I join 2023-06-14 08:27:08 and 2023-06-16 08:27:08 but read 2023-06-14 08:27:09 and 2023-06-16 08:27:09 instead. One second ahead.

Contributor Author

This used to be the only reason for the test's flakiness.
I worked around it by getting rid of arrayJoin() in the queries.
However, the strange behavior of arrayJoin() itself remains unexplained.
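One way to drop arrayJoin() while still probing both time points (a hypothetical sketch, not necessarily the exact shape of the merged test) is to compute each point as its own expression:

```sql
-- Hypothetical arrayJoin()-free variant: each probe point is a separate
-- column, so no values pass through a subquery dataset on the way to
-- parseDateTimeBestEffort().
WITH
    now() AS dt_now,
    86400 AS delta,
    toDateTime('2023-06-10 00:00:00') AS dt_ref,
    dateDiff('second', dt_ref, dt_now) AS time_shift
SELECT
    parseDateTimeBestEffort(formatDateTime(dt_now - delta, '%b %e %T')) - time_shift AS res_before,
    parseDateTimeBestEffort(formatDateTime(dt_now + delta, '%b %e %T')) - time_shift AS res_after;
```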

Member

Thanks for the nice explanation. Very helpful. I think the way you solved the problem is fine.
I also don't have a good explanation of what went wrong. Randomized database settings could be an option (though even then it is not clear how) - anyway, afaik the settings in normal functional tests are unchanged (randomization only happens in stress tests).

If you can reproduce the issue locally, it would be interesting to investigate further; if not, we might as well keep things as they are for now.

@rschu1ze rschu1ze merged commit 74cb797 into ClickHouse:master Jun 15, 2023
279 of 281 checks passed
@rschu1ze
Member

@sirvickr The new tests failed sporadically on ARM for unclear reasons (only the Auckland time zone is affected). I tried to reproduce on an ARM machine but no success, unfortunately.

Perhaps you have an idea what went wrong?

@sirvickr
Contributor Author

I doubt it depends on the architecture. I'll take a look.

@sirvickr
Contributor Author

I guess I have a clue. Robert, should we continue in PR #51081?

@rschu1ze
Member

Doesn't matter too much. If you make a new PR, just set me as reviewer.

@sirvickr
Contributor Author

sirvickr commented Jun 16, 2023

Got it. I'll articulate the idea in #51081.
Looks like I've located the problem.

Successfully merging this pull request may close these issues.

Support syslog datetime format in parseDateTimeBestEffort()