sql: non-uniform hash-sharded index with power-of-2 number of buckets #91109
We've discovered a case where non-random data and a power-of-2 number of buckets cause an uneven distribution of data among buckets in a hash-sharded index. There's an old trick of using a prime number of buckets in hash tables to avoid this sort of unfortunate clustering. Let's use that.

Fixes: cockroachdb#91109
Epic: None

Release note (performance improvement): We now recommend using a prime number of buckets in hash-sharded indexes to increase the chance of an even distribution of data among buckets. The default value of `sql.defaults.default_hash_sharded_index_bucket_count` has been changed to a prime number.
I think this is a good idea. The question is how to pick a prime that's larger than the bucket count to mod with first. If we had a limit on the number of buckets, which I'm sure we should have but I'm less sure we do have, we could just do the smallest prime larger than that limit.
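Finding the smallest prime larger than a bucket-count limit is cheap at the sizes involved. A trial-division sketch (not CockroachDB code, just an illustration of the idea):

```python
def is_prime(n: int) -> bool:
    """Trial-division primality check; fine for small bucket limits."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return False
        f += 2
    return True


def smallest_prime_above(limit: int) -> int:
    """Return the smallest prime strictly greater than limit."""
    n = limit + 1
    while not is_prime(n):
        n += 1
    return n
```

For example, with a hypothetical bucket limit of 2048, `smallest_prime_above(2048)` returns 2053.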
We've discovered a case where a combination of non-random data and a power-of-two number of buckets causes an uneven distribution of rows in a hash-sharded index. While this specific case is very contrived, it illustrates a small weakness in the current hash-shard calculation: modulo by a power-of-two number of buckets only uses the last few bits of the hash value. Radu suggested this would be a problem in cockroachdb#67865 and also suggested a fix: add an intermediate modulo by a larger prime before modulo by num buckets, so we'll try that.

Fixes: cockroachdb#91109
Epic: None

Release note (performance improvement): This change updates the shard calculation of newly-created hash-sharded indexes so that uneven distributions of rows are less likely.
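The before/after shard calculation could be sketched like this in Python. Here `fnv1a32` is the standard 32-bit FNV-1a hash and `INTERMEDIATE_PRIME` is a hypothetical constant; the exact hash variant, encoding, and prime used by CockroachDB may differ:

```python
FNV32_OFFSET = 2166136261
FNV32_PRIME = 16777619


def fnv1a32(data: bytes) -> int:
    """Standard 32-bit FNV-1a hash."""
    h = FNV32_OFFSET
    for b in data:
        h ^= b
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h


# Hypothetical large prime; the actual constant in the fix may differ.
INTERMEDIATE_PRIME = 2147483647  # 2^31 - 1


def old_shard(key: bytes, buckets: int) -> int:
    # When buckets is a power of two, only the low log2(buckets)
    # bits of the hash influence the result.
    return fnv1a32(key) % buckets


def new_shard(key: bytes, buckets: int) -> int:
    # The intermediate modulo by a large prime mixes the high bits
    # of the hash into the final bucket choice.
    return (fnv1a32(key) % INTERMEDIATE_PRIME) % buckets
```

With a prime intermediate modulus, patterns confined to the low bits of the hash no longer map directly onto bucket numbers.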
The thing that's weird about your input is that it has a lot of repetition of bytes because the values are all the same for every tuple. I don't know how often that comes up. If you did this:
Then you get a perfect distribution.
Deep down, I think what's going on here is that the bitstring is the same if the column values are the same. This is not a real problem in practice. I'm going to leave this around as documentation but am closing as won't fix.
@odessit55 found another case using:

```sql
CREATE TABLE products (
  ts DECIMAL NOT NULL PRIMARY KEY USING HASH WITH (bucket_count=16),
  product_id INT8
);
INSERT INTO products SELECT generate_series(0, 1023), 101;
```

which shows a similar problem:
Even if we expect real-world data to be better distributed, it doesn't look great when simple user tests result in non-uniform distributions.
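To see how skew like the `products` example surfaces, one can simulate the bucket assignment and tally rows per bucket. The sketch below uses the standard 32-bit FNV-1a hash and a naive fixed-width encoding; it is not CockroachDB's actual `fnv32` or `crdb_internal.datums_to_bytes`, so the counts will differ from the real index, but the tallying approach is the same:

```python
from collections import Counter


def fnv1a32(data: bytes) -> int:
    """Standard 32-bit FNV-1a hash."""
    h = 2166136261
    for b in data:
        h ^= b
        h = (h * 16777619) & 0xFFFFFFFF
    return h


# Naive stand-in for crdb_internal.datums_to_bytes: NOT the real
# CockroachDB encoding, just enough to exercise the idea.
def encode(ts: int, product_id: int) -> bytes:
    return ts.to_bytes(8, "big") + product_id.to_bytes(8, "big")


buckets = 16
# Mirror the INSERT above: ts = 0..1023, product_id = 101 for every row.
counts = Counter(fnv1a32(encode(ts, 101)) % buckets for ts in range(1024))
for bucket in sorted(counts):
    print(bucket, counts[bucket])
```

A perfectly even distribution would put 64 rows in each of the 16 buckets; any large deviation in the printed counts is the kind of skew being discussed.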
@michae2 in that most recent example, the bucket count was specified manually, so this issue goes beyond just changing the default bucket count. Do you think we should change the hashing function?
Yes, I think we should.
One idea is to use
Looks like the hash bucket calculation

```sql
mod(fnv32(crdb_internal.datums_to_bytes(x, y)), z)
```

is poorly distributed when x and y are INT8 columns with equal values, and z is a power of 2. For example:

Gives:
The hash-sharded index is still unevenly distributed with 16 buckets:
With 14 it's evenly distributed across half the buckets:
With a prime number of buckets it's fine:
This example is (admittedly) contrived, but there could be other cases like this. Maybe we should default to creating a prime number of buckets?
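The power-of-two weakness above comes down to bit masking: for any non-negative integer, `h % 2**k` is exactly `h & (2**k - 1)`, so the high bits of the hash never influence the bucket. A minimal Python illustration:

```python
# For a power-of-two modulus, only the low bits of the hash survive:
# h % 2**k == h & (2**k - 1) for any non-negative h.
for h in (0x12345678, 0xDEADBEEF, 0xFFFFFFF0):
    assert h % 16 == h & 0xF

# So any pattern in the low 4 bits of the hash values maps directly
# to a pattern in the bucket numbers; the high bits never matter.
print(0xFFFFFFF0 % 16)  # low 4 bits are zero, so bucket 0
```

A prime modulus has no such alignment with the binary representation, which is why prime bucket counts (or an intermediate prime modulo) spread clustered hashes more evenly.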
Jira issue: CRDB-21107
Epic CRDB-27601