Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign upsql: Align timestamp math with PostgreSQL #31146
Conversation
bobvawter
requested review from
danhhz and
knz
Oct 9, 2018
bobvawter
requested a review
from cockroachdb/sql-rest-prs
as a
code owner
Oct 9, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
cc @dt |
knz
approved these changes
Oct 9, 2018
LGTM but would welcome additional reviewer's opinions.
pkg/util/duration/duration.go Outdated
knz
requested a review
from
mjibson
Oct 9, 2018
knz
added this to Triage
in SQL Front-end, Lang & Semantics
via automation
Oct 9, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
bobvawter
Oct 9, 2018
Contributor
Does the timestamp-math change need to be hidden behind a setting or cluster version? I’m wondering what happens if you run select some_timestamp + '1 month' from table. Is more than one node going to be evaluating that addition?
|
Does the timestamp-math change need to be hidden behind a setting or cluster version? I’m wondering what happens if you run |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
knz
Oct 9, 2018
Member
Yes more than 1 node is going to be evaluating this. This is an "interesting" question.
|
Yes more than 1 node is going to be evaluating this. This is an "interesting" question. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
knz
Oct 9, 2018
Member
@jordanlewis do you want to provide guidance about who to talk to to make this behavior properly conditional?
|
@jordanlewis do you want to provide guidance about who to talk to to make this behavior properly conditional? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
jordanlewis
Oct 9, 2018
Member
Hmm. Bumping the DistSQL minimum version will prevent problems with DistSQL, but it won't prevent the fact that, during a rolling upgrade, two nodes might give two different answers.
I believe the right thing to do is gate the new behavior on a cluster version. I don't remember how this is done.
|
Hmm. Bumping the DistSQL minimum version will prevent problems with DistSQL, but it won't prevent the fact that, during a rolling upgrade, two nodes might give two different answers. I believe the right thing to do is gate the new behavior on a cluster version. I don't remember how this is done. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
knz
Oct 9, 2018
Member
@bobvawter here is an example of cluster version gate
pkg/sql/sqlbase/structured.go:
if !st.Version.IsMinSupported(cluster.VersionBitArrayColumns) {
|
@bobvawter here is an example of cluster version gate
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
knz
Oct 9, 2018
Member
I would recommend a (hidden) session setting in SessionData
which is initialized when the session starts usiong the cluster version gate
(SEssionData variables are transferred across distsql flows)
|
I would recommend a (hidden) session setting in SessionData (SEssionData variables are transferred across distsql flows) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
danhhz
Oct 9, 2018
Contributor
I haven't been in the duration stuff (or even thought about it) in ~2 years, so going to remove myself from the review, if that's okay.
|
I haven't been in the duration stuff (or even thought about it) in ~2 years, so going to remove myself from the review, if that's okay. |
danhhz
removed their request for review
Oct 9, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
bobvawter
Oct 9, 2018
Contributor
Is the SessionData or cluster version available from a static context, or is this going to require a ton of plumbing to get the necessary version down into Add()?
|
Is the SessionData or cluster version available from a static context, or is this going to require a ton of plumbing to get the necessary version down into |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
bobvawter
Oct 9, 2018
Contributor
It looks like there's an EvalContext available at many/most of the call-chains that eventually call into Add(). I'll see what getting all of that plumbing set up looks like.
|
It looks like there's an |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
mjibson
Oct 9, 2018
Member
I'm not convinced we need cluster or session settings to handle this. Don't we already change implementation of functions regularly without gating them behind any kind of setting?
|
I'm not convinced we need cluster or session settings to handle this. Don't we already change implementation of functions regularly without gating them behind any kind of setting? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
knz
Oct 9, 2018
Member
@mjibson that's a valid argument in general. -- if we could argue the previous behavior was unequivocally a bug , then you'd be right we don't need a flag.
However it's not so clear "on its face" that "Jan 31 + 1 month" should be equal to Feb 29 or March 1 (there are other places in postgres, and in cockroachdb, where 1 month = 30 days no matter what). There is an argument to be made in favor of either way, so I can't confidently say that the current behavior would be perceived as a bug by every user looking at it carefully.
|
@mjibson that's a valid argument in general. -- if we could argue the previous behavior was unequivocally a bug , then you'd be right we don't need a flag. However it's not so clear "on its face" that "Jan 31 + 1 month" should be equal to Feb 29 or March 1 (there are other places in postgres, and in cockroachdb, where 1 month = 30 days no matter what). There is an argument to be made in favor of either way, so I can't confidently say that the current behavior would be perceived as a bug by every user looking at it carefully. |
bobvawter
requested review from
cockroachdb/cli-prs
as
code owners
Oct 10, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
bobvawter
Oct 10, 2018
Contributor
PTAL: I've updated this patch with a session variable and cluster setting to allow users to disable the new date math in case their app depends on the golang math. I'm also using this as an opportunity to delay switching to the new logic until a cluster is fully upgraded.
|
PTAL: I've updated this patch with a session variable and cluster setting to allow users to disable the new date math in case their app depends on the golang math. I'm also using this as an opportunity to delay switching to the new logic until a cluster is fully upgraded. |
bobvawter
reviewed
Oct 10, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/util/duration/duration.go, line 275 at r1 (raw file):
Previously, knz (kena) wrote…
You have peeked into pg's implementation to determine what to do here. Please reference the particular pg version you looked at, and the filename inside the source tree you looked at, and the C function name(s) (or tests), in the comment.
Added this info to the doc on AdditionModeCompatible.
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
mjibson
Oct 10, 2018
Member
Here's a more thorough argument about why we should not have either a cluster version check or a session setting and instead just fix it and move on.
Let's begin by looking at the recent changes in builtins that changed the results of a SQL function call (i.e., not fixes for panics):
- sql: properly ignore invalid chars in encoding names
- sql: hashes with NULL input should return NULL
- sql: Fix the handling of NULL arguments in string_agg
- sql: correctly handle NULL arrays in generator funcs
And some for eval.go:
I think every one of the above changes would behave differently during a cluster upgrade, just like this one. I believe none of them had session setting or cluster version checks, we just called it a fixed bug and moved on with life. If someone was depending on the old behavior, too bad.
Another argument, a sort of proof by contradiction: I really really don't want to have a policy that we maintain with session settings any old behavior that doesn't match postgres in addition to the new behavior that does match it. For example, if we are changing the dates as above to make it match postgres and introduce various settings to allow users to choose which implementation they want, would we also not do the same for, say, integer division? We made the choice years ago to have int / int return a decimal instead of an int like postgres does. I think if we apply the same thought process about the date math above to integer division, we would have to change it to match postgres and add a setting to allow for the previous behavior. I think it would be a bad choice to do this (have a setting) for int division, so I also think it's a bad choice for date math.
(I am in full support of just changing int division to match postgres if our policy is to just match postgres during edge cases. I'm equally happy to leave it as is.)
|
Here's a more thorough argument about why we should not have either a cluster version check or a session setting and instead just fix it and move on. Let's begin by looking at the recent changes in builtins that changed the results of a SQL function call (i.e., not fixes for panics):
And some for eval.go: I think every one of the above changes would behave differently during a cluster upgrade, just like this one. I believe none of them had session setting or cluster version checks, we just called it a fixed bug and moved on with life. If someone was depending on the old behavior, too bad. Another argument, a sort of proof by contradiction: I really really don't want to have a policy that we maintain with session settings any old behavior that doesn't match postgres in addition to the new behavior that does match it. For example, if we are changing the dates as above to make it match postgres and introduce various settings to allow users to choose which implementation they want, would we also not do the same for, say, integer division? We made the choice years ago to have (I am in full support of just changing int division to match postgres if our policy is to just match postgres during edge cases. I'm equally happy to leave it as is.) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
knz
Oct 10, 2018
Member
Bob I think I am OK with you taking that approach, matt's arguments notwithstanding, for two reasons:
- I'll defer to your judgement about seriousness
- it's also a good exercise for us to understand how such a version conditional needs to be implemented in practice. You could consider this issue as an opportunity to teach something to the rest of the team we'll need to use in other serious cases.
|
Bob I think I am OK with you taking that approach, matt's arguments notwithstanding, for two reasons:
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
bobvawter
Oct 10, 2018
Contributor
PTAL: Here's hoping this is the last rev. After a discussion with Ben, this patch drops the user-configurable knobs, but is still version-aware to avoid inconsistencies within mixed-mode clusters. The version check is performed once and memoized in the SessionData.
|
PTAL: Here's hoping this is the last rev. After a discussion with Ben, this patch drops the user-configurable knobs, but is still version-aware to avoid inconsistencies within mixed-mode clusters. The version check is performed once and memoized in the |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
dt
Oct 10, 2018
Member
honestly you can still have a nasty heisenbug with drivers that connection pool since every statement you run might get a different session -- I'm increasingly in the "just rip the band-aid off" school.
|
honestly you can still have a nasty heisenbug with drivers that connection pool since every statement you run might get a different session -- I'm increasingly in the "just rip the band-aid off" school. |
bobvawter
reviewed
Oct 10, 2018
@dt, true, but at least within any given execution it will be consistent.
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/settings/cluster/cockroach_versions.go, line 288 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Yeah, things get a little weird if we do this out of order. Today's the day to make the 2.1 cluster version anyway.
But I don't think it's a great idea to tie a change like this to the cluster version - the change happens at exactly the point where the upgrade becomes irreversible. I think it might be better to just make the change as soon as the new binary is used while rolling back is still possible.
I thought we wanted all of the nodes to agree that they can use the new date math before any of them start to do so to avoid inconsistency within any given session, but that we're ok with just changing the math behavior out from under the user. Is there some other means to inform the decision about which mode to use in a SessionData?
pkg/sql/sessiondata/session_data.go, line 81 at r2 (raw file):
Previously, mjibson (Matt Jibson) wrote…
Put a TODO here to remove this in the 2.2 release cycle, since at that point we know it's safe to remove the old behavior.
Done.
bdarnell
reviewed
Oct 10, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/settings/cluster/cockroach_versions.go, line 288 at r2 (raw file):
Previously, bobvawter (Bob Vawter) wrote…
I thought we wanted all of the nodes to agree that they can use the new date math before any of them start to do so to avoid inconsistency within any given session, but that we're ok with just changing the math behavior out from under the user. Is there some other means to inform the decision about which mode to use in a SessionData?
I think a distsql version is a good idea, which will prevent inconsistency within a session.
knz
reviewed
Oct 10, 2018
I'm rather comfortable with the end result here. Just a minor nit about the generator interface and this will be good to co from my perspective.
Reviewed 1 of 4 files at r1, 21 of 21 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained
pkg/sql/sem/builtins/generator_builtins.go, line 260 at r2 (raw file):
// with integer bounds. type seriesValueGenerator struct { ctx *tree.EvalContext
It'd be better and less error-prone to change the interface of Next() to take an evalcontext as argument, than to store a reference to it here.
bobvawter
reviewed
Oct 10, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/settings/cluster/cockroach_versions.go, line 288 at r2 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
I think a distsql version is a good idea, which will prevent inconsistency within a session.
Ah, I see now that's actually a separate thing from the cluster version. Will revise.
bobvawter
requested a review
from cockroachdb/distsql-prs
as a
code owner
Oct 11, 2018
bobvawter
reviewed
Oct 11, 2018
Reviewable status:
complete! 0 of 0 LGTMs obtained
a discussion (no related file):
PTAL: Now with a new distsql version check to enable legacy mode.
pkg/sql/sem/builtins/generator_builtins.go, line 260 at r2 (raw file):
Previously, knz (kena) wrote…
It'd be better and less error-prone to change the interface of Next() to take an evalcontext as argument, than to store a reference to it here.
That was my first thought, but it turns into a serious sprawl through a whole lot of other value-generation methods, since we would wind up having to also change the sql.valueIterator and sql.planNode interfaces in order to update projectSetNode.Next().
knz
moved this from Triage
to Current milestone
in SQL Front-end, Lang & Semantics
Oct 11, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
bdarnell
Oct 11, 2018
Member
LGTM (from a versioning perspective; I didn't look at the implementation closely)
|
LGTM (from a versioning perspective; I didn't look at the implementation closely) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
bors r+ |
bot
pushed a commit
that referenced
this pull request
Oct 12, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
craig
bot
commented
Oct 12, 2018
Build failed |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
bobvawter
Oct 12, 2018
Contributor
Retrying bors due to setup failure in orm tests; looks like pip couldn't pull down a package?
bors r+
|
Retrying bors due to setup failure in orm tests; looks like pip couldn't pull down a package? bors r+ |
bot
pushed a commit
that referenced
this pull request
Oct 12, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
craig
bot
commented
Oct 12, 2018
Build failed |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
bobvawter
Oct 12, 2018
Contributor
orm passed in retry, now roachtest flaked (was ok in first build). Retrying.
bors r+
|
orm passed in retry, now roachtest flaked (was ok in first build). Retrying. bors r+ |
bot
pushed a commit
that referenced
this pull request
Oct 12, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
craig
bot
commented
Oct 12, 2018
Build failed |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
today is not your lucky day it seems... |
bobvawter
requested a review
from cockroachdb/sql-bulk-prs
as a
code owner
Oct 12, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
Retrying after rebase bors r+ |
bot
pushed a commit
that referenced
this pull request
Oct 12, 2018
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
craig
bot
commented
Oct 12, 2018
Build succeeded |
bobvawter commentedOct 9, 2018
Our implementation of time+duration math delegates to golang's
time.Time.AddDate(), which has different rounding semantics when compared to
PostgreSQL when normalizing invalid month-of-day values.
The most likely input case is that the duration will have no month component
and we can fall back to go's implementation. Another fast-path of checking the
day-of-month allows us to skip the special-case logic for start dates that
aren't the 29th, 30th, or 31st of a month.
Resolves #30976
Release note (sql change): The normalized results of certain timestamp +
duration operations involving year or month durations have been adjusted to
agree with the values returned by PostgreSQL. In cases such as
'2018-01-31'::TIMESTAMP + '1 month', where an intermediate result of February
31st needs to be normalized, previous versions of CockroachDB would "round up"
to March 1. Instead, we would now "round down" to February 28th to agree with
the values returned by PostgreSQL. This change also affects the results of
generate_sequence()when used with timestamps.