Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DateTime64 data type #7170

Merged
merged 103 commits into from Dec 12, 2019
Merged

DateTime64 data type #7170

merged 103 commits into from Dec 12, 2019

Conversation

@Enmk
Copy link
Contributor

Enmk commented Oct 2, 2019

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

For changelog. Remove if this is non-significant change.

Category (leave one):

  • New Feature

Short description (up to few sentences):
DateTime64 datatype with configurable sub-second precision.
...

Things not addressed in this PR and to be implemented later:

  • Dates outside of LUT range.
  • Custom formatting of sub-second part via format string (there is no consensus on format spec)
  • Support of SQL INTERVALs for DateTime64 comparison and arithmetic.
Enmk added 9 commits Oct 2, 2019
Added more tests for casting and inserting values.
…ield type.

Since DateTime64 is just a typedef, and there is no explicit Field-type
for it, we have to solely rely on type_hint provided by origin column.
If the hint is missing, there is no way of distinguishing DateTime64
from Decimal64.

Alternative could be having explicit converting code, (and it looks like
it has to be added at some point).
@@ -6,6 +6,7 @@
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeDateTime.h>
#include <DataTypes/DataTypeDateTime64.h>

This comment has been minimized.

Copy link
@KochetovNicolai

KochetovNicolai Dec 6, 2019

Contributor

Is it needed here?

This comment has been minimized.

Copy link
@Enmk

Enmk Dec 11, 2019

Author Contributor

fixed

@@ -26,6 +26,7 @@
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeDateTime.h>
#include <DataTypes/DataTypeDateTime64.h>

This comment has been minimized.

Copy link
@KochetovNicolai

KochetovNicolai Dec 6, 2019

Contributor

Is it needed here?

This comment has been minimized.

Copy link
@Enmk

Enmk Dec 11, 2019

Author Contributor

Apparently, nope, removed

Copy link
Contributor

KochetovNicolai left a comment

The first iteration is finished.

Generally, everything is good. But I still need to test how it works.
It would be even better to add perftests.
Also, it would be probably more convenient to add scale argument to toDateTime64. And format scale in formatDateTime somehow.

Also, it would be better to split pr to smaller steps next time. E.g. single pr with decimal refactoring and new type, several prs with functions and pr with asof join.

x = negative ? -res : res;
if (check_overflow && buf.count() - initial > std::numeric_limits<T>::digits10)

This comment has been minimized.

Copy link
@KochetovNicolai

KochetovNicolai Dec 10, 2019

Contributor

This check is only for decimal overflow. E.g. std::numeric_limits<char>::digits10 is 2, but number 111 can be represented in char.

This comment has been minimized.

Copy link
@Enmk

Enmk Dec 11, 2019

Author Contributor

I agree, but correct overflow check would require to use mulOverflow, addOverflow and additional runtime checks on last few digits, roughly:

case '9':
    if constexpr (check_overflow == ReadIntTextCheckOverflow::CHECK_OVERFLOW)
    {
        // perform relativelly slow overflow check only when number of decimal digits so far is close to the max for given type.
        if (buf.count() - initial_pos >= std::numeric_limits<T>::max_digits10)
        {
            if (common::mulOverflow(res, static_cast<decltype(res)>(10), res)
                || common::addOverflow(res, static_cast<decltype(res)>(*buf.position() - '0'), res))
                return ReturnType(false);
            break;
        }
    }
    res *= 10;
    res += *buf.position() - '0';
    break;

This comment has been minimized.

Copy link
@KochetovNicolai

KochetovNicolai Dec 11, 2019

Contributor

I mean, it's good that we have this check, and it's perfect for decimals.
Just maybe rename it to CHECK_DECIMAL_OVERFLOW, to specify what we really check.

* put functions in DecimalFunctions into DecimalUtils namespace
* fixed possible buffer overflow in parseDateTimeBestEffortImpl
* fixed readDateTimeTextImpl not to require fractional part separator
  (if fractional part is missing)
* Tests for code from DecimalFunctions.h
* Fixed serializing DateTime64 to string with writeDateTimeText(), fixed tests
@Enmk Enmk force-pushed the Enmk:DateTime64 branch from 2252da8 to f56b2b6 Dec 11, 2019
@Enmk

This comment has been minimized.

Copy link
Contributor Author

Enmk commented Dec 11, 2019

@KochetovNicolai Thanks for a review!

Also, it would be probably more convenient to add scale argument to toDateTime64

well, it is already there, current function prototype is toDateTime64(value, [scale, timezone]) and it is properly verified in tests already.

Also, it would be better to split pr to smaller steps next time. E.g. single pr with decimal refactoring and new type, several prs with functions and pr with asof join.

I think that ASOF JOIN part can be safely excluded from this PR, as for other things... Well, it is complicated, since this PR is based on another one (#5187) and it is hard to split those now.

And format scale in formatDateTime somehow.

There is no consensus on what format specifier to use and how to set precision. And it is explicitly marked in PR description that we'll do that later (as you've mentioned PR is already HUGE), so let's postpone formatting till everything else is resolved.

* More precise overflow check in readIntTextImpl
* writeDateTimeText now always writes sub-second part for DateTime64
* comment for validateFunctionArgumentTypes
* DateTime64-related fixes for FunctionConvertFromString
* other minoe fixes: comments, removed commented out code, variable
  renamings, etc.
@KochetovNicolai

This comment has been minimized.

Copy link
Contributor

KochetovNicolai commented Dec 11, 2019

Surely, I don't suppose to add all features to current pr :) Just have written some ideas which are good to do later.

KochetovNicolai and others added 4 commits Dec 11, 2019
* Fixed precision calculation in DataTypeDecimalBase c-tor
* Fixed max precision calculation in getLeastSupertype
* Fixed reading past end of vector in FunctionsConversion with extractToDecimalScale
* More verbose comments on FunctionArgumentTypeValidator and validateFunctionArgumentTypes
* style and other minor fixes.
@alexey-milovidov

This comment has been minimized.

Copy link
Member

alexey-milovidov commented Dec 11, 2019

We decided to wait for support for arithmetic operations on INTERVALs and dateDiff function.

@KochetovNicolai

This comment has been minimized.

Copy link
Contributor

KochetovNicolai commented Dec 12, 2019

Will do it in other pr. This one is huge already.

@KochetovNicolai KochetovNicolai merged commit 7ad2a6d into ClickHouse:master Dec 12, 2019
28 checks passed
28 checks passed
ClickHouse build check 15/15 builds are OK
Details
Code coverage Coverage prepared
Details
Compatibility check Compatibility check passed
Details
Description check Description OK
Details
Functional stateful tests (address) fail:0, passed:97
Details
Functional stateful tests (debug) fail:0, passed:97
Details
Functional stateful tests (memory) fail:0, passed:97
Details
Functional stateful tests (release) fail:0, passed:97
Details
Functional stateful tests (release, processors) fail:0, passed:97
Details
Functional stateful tests (thread) fail:0, passed:95, skipped:2
Details
Functional stateful tests (ubsan) fail:0, passed:97
Details
Functional stateless tests (address) fail:0, passed:1466, skipped:2
Details
Functional stateless tests (debug) fail:0, passed:1465, skipped:3
Details
Functional stateless tests (memory) fail:24, passed:55
Details
Functional stateless tests (release) fail:0, passed:1468
Details
Functional stateless tests (release, processors) fail:0, passed:1466, skipped:2
Details
Functional stateless tests (thread) fail:0, passed:1465, skipped:3
Details
Functional stateless tests (ubsan) fail:0, passed:1466, skipped:2
Details
Functional stateless tests (unbundled) fail:0, passed:1456, skipped:12
Details
Integration tests (asan) failed:0, passed:263, error:0
Details
Integration tests (release) failed:0, passed:263, error:0
Details
PVS check New errors 0, total errors 20
Details
Performance test 1382/1383 queries are OK
Details
Split build smoke test Server started and responded
Details
Stress test (address) No errors found
Details
Stress test (thread) No errors found
Details
Style check Style check passed
Details
Unit tests fail: 0, passed: 5062
Details
@alexey-milovidov

This comment has been minimized.

Copy link
Member

alexey-milovidov commented Dec 12, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants
You can’t perform that action at this time.