Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: don't copy old pk when changing bucket size on sharded index #45894

Merged
merged 1 commit into from
May 6, 2020

Conversation

rohany
Copy link
Contributor

@rohany rohany commented Mar 9, 2020

Fixes #45889.

Release note (sql change): This PR causes primary key changes to
not create a copy of the old primary key if the primary key change
only changes the bucket count of a hash sharded index.

@rohany rohany requested a review from otan March 9, 2020 17:38
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@otan
Copy link
Contributor

otan commented Mar 9, 2020

Just for sanity - why do we not want this behaviour?

@rohany
Copy link
Contributor Author

rohany commented Mar 9, 2020

The argument is as follows:

We make a copy of the old primary key so that your query performance doesn't regress after the primary key change. However, in the hash sharded case, the change being made is going to affect performance (i.e. what is the bucket count, or what columns are hashed). Keeping around the old primary key means keeping around the old primary key with the incorrect number of buckets etc. What does @ajwerner think about this?

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @rohany here, at least for the tested case. I can't see any reason why a user would want to retain the old bucket count.

Is there are case where the user would want to retain the index if they changed the set of columns but both were hashed? That one is less clear to me.

Consider:

CREATE TABLE t (x INT NOT NULL, y INT NOT NULL, ts TIMESTAMPTZ NOT NULL, PRIMARY KEY (ts, x, y) USING HASH WITH BUCKET_COUNT = 2);

then:

ALTER TABLE t ALTER PRIMARY KEY USING COLUMNS (ts, y, x) USING HASH WITH BUCKET_COUNT=3

I could see it either way in this case.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @otan)

@@ -563,7 +563,9 @@ func (n *alterTableNode) startExec(params runParams) error {
// does, but doesn't store anything. We only recreate the index if
// the table has a primary key (no DROP PRIMARY KEY statements have
// been executed), and the primary key is not the default rowid key.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe updated comment for hash sharded cases too

@jordanlewis
Copy link
Member

Does this want to be merged?

@rohany rohany changed the title sql: don't create old when changing between hash sharded indexes sql: don't copy old pk when changing bucket size on sharded index May 6, 2020
@rohany
Copy link
Contributor Author

rohany commented May 6, 2020

Reviving this PR! I am more satisfied with it now -- I changed it to only not create a copy if only the bucket count of the hash sharded index is changing. PTAL

@rohany rohany requested a review from ajwerner May 6, 2020 16:04
Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Completely optional bikeshedding comment.

oldPK := desc.PrimaryIndex
return desc.HasPrimaryKey() && !desc.IsPrimaryIndexDefaultRowID() &&
!(oldPK.IsSharded() && newPK.IsSharded() &&
// The first column in the columnIDs is the shard column, which will be different.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a teensy bit hard to read but makes sense after squinting.

wondering if breaking down to the if statements might be better... but it's personal taste for me.


func shouldCopyPrimaryKey(desc *MutableTableDescriptor, newPK *sqlbase.IndexDescriptor) bool {
  // no primary key means we have nothing to copy
  if !desc.HasPrimaryKey() {
     return false
  }
  // we never want to copy default row id
  if desc.IsPrimaryIndexDefaultRowID() {
    return false
  }
  if oldPK.IsSharded() && newPK.IsSharded() {
     // ...
  }
  return true
}
```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm down for this, I find it hard to read as well.

@blathers-crl
Copy link

blathers-crl bot commented May 6, 2020

❌ The GitHub CI (Cockroach) build has failed on af0526d3.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Fixes cockroachdb#45889.

Release note (sql change): This PR causes primary key changes to
not create a copy of the old primary key if the primary key change
only changes the bucket count of a hash sharded index.
@rohany
Copy link
Contributor Author

rohany commented May 6, 2020

bors r=otan

@craig
Copy link
Contributor

craig bot commented May 6, 2020

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

sql: don't create old pk as an unique index when alter primary key of hash sharded index
5 participants