Skip to content
This repository has been archived by the owner on Jan 28, 2021. It is now read-only.

Make sure dates and times are within the supported range #698

Closed
erizocosmico opened this issue Apr 25, 2019 · 5 comments · Fixed by #699
Closed

Make sure dates and times are within the supported range #698

erizocosmico opened this issue Apr 25, 2019 · 5 comments · Fixed by #699
Assignees
Labels
bug Something isn't working

Comments

@erizocosmico
Copy link
Contributor

Related to src-d/gitbase#805

### Background

According to the docs:

The supported range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'.

This is not exactly true and the behaviour is a little bit more complicated and weird than that.

If you pass the upper limit, NULL is returned:

SELECT CONVERT('10000-12-31 23:59:59', DATETIME)

However, if you set a date lower than the lower bound, it does not:

SELECT CONVERT('0000-12-31 23:59:59', DATETIME)

This returns:

0000-12-31 23:59:59

The issue with Convert

Convert method of sql.Timestamp and sql.Date types. Convert can either return a value or an error, not return nil. So we need something else to handle conversion to dates that can make them nil, just in case the upper bound is passed.

To illustrate the previous problem, consider this usage inside the code:

value, err := expr.Eval(ctx, row)
if err != nil {  /* handle err */ }

if value == nil {
        return nil, nil
}

value, err = sql.Date.Convert(value)
if err != nil { /* handle err */ }

The nil value is checked after Eval, not after Convert, because conversions are done on non-nil values and are expected to return non-nil values. To change this behaviour, it would require to add the following block to each usage of a Convert method:

if value == nil { return nil, nil }

This is potentially dangerous, because people may forget about it and can lead to a panic.

Non-processed values

Consider the following query:

SELECT date_col FROM table

There was no sql.Date.Convert here. Just a value from the table that will be projected.
We could also check this during the conversion to a SQL value to sent through the wire protocol. But what if table.date_col is not nullable? The client won't expect a null if the format is invalid.

Proposed solution

A solution that will always work correctly is just wrapping any usage of a DATE or DATETIME with CONVERT(x, DATE) or CONVERT(x, DATETIME). That way, the result will be nullable and the client will receive what it expects. Plus, we wouldn't care what clients send, because at some point it would be sanitized.
A single analyzer rule would solve the whole problem.

Caveats

Could this impact performance? Even if it does have a performance penalty, I don't expect it to be very significative. Mostly because once the value has passed through the first convert it will be just a couple ifs.

WDYT? @src-d/data-processing

@erizocosmico erizocosmico added the bug Something isn't working label Apr 25, 2019
@erizocosmico erizocosmico self-assigned this Apr 25, 2019
@juanjux
Copy link
Contributor

juanjux commented Apr 25, 2019

When you say wrapping any usage of DATE or DATETIME with CONVERT, you mean at the SQL level or at the implementation level? Because if it can be done for the later it would ensure the user never forgets to call CONVERT.

@erizocosmico
Copy link
Contributor Author

At the implementation level. I mean an analyzer rule that wraps them in convert.

@juanjux
Copy link
Contributor

juanjux commented Apr 25, 2019

Then I think that's is the right approach. Performance impact is probably negligible but it could be tested anyway.

@kuba--
Copy link
Contributor

kuba-- commented Apr 25, 2019

Can we do it at the end by just overwriting sth. like String?
Or we can try some kind of lazy evaluation, where you accumulate all converts and call them on eval. I'm not sure if it makes sense int his case. I don't know how many potential overflow cases we may have.

@erizocosmico
Copy link
Contributor Author

Eval is already lazy. It's only called when it's needed. There's nothing more we can overwrite, because the thing is a non-nullable date can become null if the format is not valid (which will not happen often, I guess, but can happen).

Another solution might be to just wrap literals in convert and ask datasources to make sure to return nicely formatted dates. But that does not ensure an error won't happen in the client because some datasource forgot to do it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants