perf(sql): support uuid and long256 in parallel GROUP BY #4140

Merged: 16 commits, Jan 22, 2024

Commits on Jan 16, 2024

  1. Addition of GroupByLong128HashSet

    Current microbenchmark results (YMMV):
    
    Benchmark                                                 Mode  Cnt  Score   Error  Units
    GroupByLong128HashSetBenchmark.baseline                   avgt    3  0.009 ± 0.002  us/op
    GroupByLong128HashSetBenchmark.testFastMap                avgt    3  0.197 ± 0.056  us/op
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet  avgt    3  0.146 ± 0.009  us/op
    nwoolmer committed Jan 16, 2024
    d177365
  2. Addition of CountDistinctUuidGroupByFunction

    Includes port of tests from similar CountDistinctLongGroupByFunctionFactory.
    nwoolmer committed Jan 16, 2024
    dd641a5
  3. Addition of GroupByLong256HashSet

    Current benchmark results:
    
    Benchmark                                                 Mode  Cnt  Score   Error  Units
    GroupByLong256HashSetBenchmark.baseline                   avgt    3  0.015 ± 0.001  us/op
    GroupByLong256HashSetBenchmark.testFastMap                avgt    3  0.214 ± 0.023  us/op
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet  avgt    3  0.251 ± 0.039  us/op
    nwoolmer committed Jan 16, 2024
    3a956d8
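
The hash sets added in the commits above store fixed-width 128-bit and 256-bit keys. As a rough illustration of the general technique only, here is a minimal on-heap sketch of an open-addressing set for 128-bit keys with linear probing. It is not the PR's implementation: the class name `Long128SetSketch`, the `long[]` backing array, the 0.5 load factor, and the hash function are all assumptions made for the example, and the all-zero key is assumed to mark an empty slot.

```java
/** Hypothetical sketch: open-addressing set of 128-bit keys kept as (lo, hi) pairs in a flat long[]. */
final class Long128SetSketch {
    private long[] slots;   // slot i occupies slots[2 * i] (lo) and slots[2 * i + 1] (hi)
    private int capacity;   // number of slots, always a power of two
    private int size;

    Long128SetSketch(int initialCapacity) {
        // Round up to the next power of two, with a minimum of 4 slots.
        capacity = Integer.highestOneBit(Math.max(4, initialCapacity) * 2 - 1);
        slots = new long[2 * capacity];
    }

    /** Returns true if the key was added, false if it was already present. */
    boolean add(long lo, long hi) {
        if (lo == 0 && hi == 0) {
            // Assumption of this sketch: the all-zero key marks an empty slot.
            throw new IllegalArgumentException("all-zero key is reserved as the empty marker");
        }
        if ((size + 1) * 2 > capacity) { // keep the load factor at or below 0.5
            rehash();
        }
        return insert(lo, hi);
    }

    int size() {
        return size;
    }

    private boolean insert(long lo, long hi) {
        int mask = capacity - 1;
        int i = hash(lo, hi) & mask;
        while (true) {
            int base = 2 * i; // compute the slot offset once, then read both halves
            long slotLo = slots[base];
            long slotHi = slots[base + 1];
            if (slotLo == 0 && slotHi == 0) { // empty slot: claim it
                slots[base] = lo;
                slots[base + 1] = hi;
                size++;
                return true;
            }
            if (slotLo == lo && slotHi == hi) {
                return false; // key already present
            }
            i = (i + 1) & mask; // linear probing
        }
    }

    private void rehash() {
        long[] old = slots;
        capacity *= 2;
        slots = new long[2 * capacity];
        size = 0;
        for (int j = 0; j < old.length; j += 2) {
            if (old[j] != 0 || old[j + 1] != 0) {
                insert(old[j], old[j + 1]);
            }
        }
    }

    // Illustrative mixing function only; not the hash used in the PR.
    private static int hash(long lo, long hi) {
        long h = lo * 0x9E3779B97F4A7C15L ^ Long.rotateLeft(hi * 0xC2B2AE3D27D4EB4FL, 32);
        return (int) (h ^ (h >>> 32));
    }

    public static void main(String[] args) {
        Long128SetSketch set = new Long128SetSketch(4);
        set.add(1L, 0L);
        set.add(0L, 1L);
        set.add(1L, 0L); // duplicate, ignored
        System.out.println(set.size()); // prints 2
    }
}
```

Reserving one key value as the empty marker avoids a separate occupancy bitmap, at the cost of requiring callers to remap that value before insertion.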

Commits on Jan 17, 2024

  1. Addition of CountDistinctLong256GroupByFunction

    Currently debugging issues with the tests. Added to_long256 to help align the tests across the long/long128/long256 count distinct group-by functions. Inner expressions aren't parsing as expected.
    nwoolmer committed Jan 17, 2024
    ed4f9dd
  2. Addition of QuaternaryFunction initialisation fixes for LongsToLong256FunctionFactory.
    
    Tests now pass.
    nwoolmer committed Jan 17, 2024
    640a329
  3. Updates to GroupByLong128HashSetBenchmark and GroupByLong256HashSetBenchmark.
    
    Benchmark                                                  (size)  Mode  Cnt    Score    Error  Units
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet     5000  avgt    3  133.559 ± 24.516  ns/op
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet    50000  avgt    3  184.387 ± 11.863  ns/op
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet   500000  avgt    3  180.291 ± 38.545  ns/op
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet  5000000  avgt    3  180.389 ± 21.280  ns/op
    GroupByLong128HashSetBenchmark.testOrderedMap                5000  avgt    3  244.329 ± 29.777  ns/op
    GroupByLong128HashSetBenchmark.testOrderedMap               50000  avgt    3  194.403 ±  6.899  ns/op
    GroupByLong128HashSetBenchmark.testOrderedMap              500000  avgt    3  191.533 ±  6.748  ns/op
    GroupByLong128HashSetBenchmark.testOrderedMap             5000000  avgt    3  191.723 ±  2.794  ns/op
    
    Benchmark                                                  (size)  Mode  Cnt    Score     Error  Units
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet     5000  avgt    3  310.210 ± 135.663  ns/op
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet    50000  avgt    3  256.012 ±  32.347  ns/op
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet   500000  avgt    3  258.856 ±  56.371  ns/op
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet  5000000  avgt    3  275.527 ± 326.510  ns/op
    GroupByLong256HashSetBenchmark.testOrderedMap                5000  avgt    3  205.034 ±  38.691  ns/op
    GroupByLong256HashSetBenchmark.testOrderedMap               50000  avgt    3  205.936 ±  94.791  ns/op
    GroupByLong256HashSetBenchmark.testOrderedMap              500000  avgt    3  199.510 ±  38.080  ns/op
    GroupByLong256HashSetBenchmark.testOrderedMap             5000000  avgt    3  215.477 ± 159.120  ns/op
    nwoolmer committed Jan 17, 2024
    f8f13ca
  4. Align GroupByLongHashSetBenchmark with Long128 and Long256.

    Benchmark                                            (size)  Mode  Cnt    Score    Error  Units
    GroupByLongHashSetBenchmark.testGroupByLongHashSet     5000  avgt    3   10.319 ±  0.965  ns/op
    GroupByLongHashSetBenchmark.testGroupByLongHashSet    50000  avgt    3   14.073 ±  1.312  ns/op
    GroupByLongHashSetBenchmark.testGroupByLongHashSet   500000  avgt    3   33.237 ±  1.161  ns/op
    GroupByLongHashSetBenchmark.testGroupByLongHashSet  5000000  avgt    3   60.773 ±  7.917  ns/op
    GroupByLongHashSetBenchmark.testOrderedMap             5000  avgt    3   27.105 ±  8.215  ns/op
    GroupByLongHashSetBenchmark.testOrderedMap            50000  avgt    3   27.707 ± 11.362  ns/op
    GroupByLongHashSetBenchmark.testOrderedMap           500000  avgt    3  134.037 ± 93.003  ns/op
    GroupByLongHashSetBenchmark.testOrderedMap          5000000  avgt    3  183.419 ± 32.628  ns/op
    nwoolmer committed Jan 17, 2024
    a8f4d77

Commits on Jan 18, 2024

  1. 2bfab9e
  2. Addressing PR comments.

    CountDistinctLong256GroupByFunction/CountDistinctUuidGroupByFunction:
    Adjusted the zero -> null mapping so that a key is mapped to null only when it is entirely zero.
    Swapped isParallelismSupported() to use the superclass version for Long/Long128/Long256.
    
    GroupByLong128HashSet/GroupByLong256HashSet
    Adjusted ASCII table comments.
    Swapped keyAt functions to keyAddrAt and inlined the offsets, so keyAddr is only calculated once.
    Calculate the address once in setKeyAt.
    Reduced benchmark sizes to a half (Long128) or a quarter (Long256) to account for the larger inserts, so that cache effects show up in the benchmark.
    
    Hash
    Corrected the algorithm for hashLong256.
    
    Long256Impl
    Added isNull(LLLL) overload.
    
    LongsToLong256FunctionFactory
    Adjust whitespace and remove unnecessary 'this'.
    Use new Long256Impl.isNull(LLLL).
    
    QuaternaryFunction
    Adjust whitespace and naming.
    
    RecordSinkSPI and Unordered16Map
    Adjust names.
    nwoolmer committed Jan 18, 2024
    ee8d00f
  3. Merge remote-tracking branch 'fork/group-by-int-128-256-hash-set' into group-by-int-128-256-hash-set

    nwoolmer committed Jan 18, 2024
    343a777
  4. Merge updates, add capacity check, and re-bench.

    Long128
    
    Benchmark                                                  (size)  Mode  Cnt    Score    Error  Units
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet     2500  avgt    3   76.840 ±  3.901  ns/op
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet    25000  avgt    3  175.917 ± 55.816  ns/op
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet   250000  avgt    3  168.134 ± 78.253  ns/op
    GroupByLong128HashSetBenchmark.testGroupByLong128HashSet  2500000  avgt    3  164.173 ± 27.133  ns/op
    GroupByLong128HashSetBenchmark.testOrderedMap                2500  avgt    3  229.978 ± 22.202  ns/op
    GroupByLong128HashSetBenchmark.testOrderedMap               25000  avgt    3  204.169 ±  4.719  ns/op
    GroupByLong128HashSetBenchmark.testOrderedMap              250000  avgt    3  196.680 ± 32.682  ns/op
    GroupByLong128HashSetBenchmark.testOrderedMap             2500000  avgt    3  196.723 ± 27.970  ns/op
    
    Long256
    
    Benchmark                                                  (size)  Mode  Cnt    Score    Error  Units
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet     1250  avgt    3  231.235 ± 61.326  ns/op
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet    12500  avgt    3  232.874 ± 47.436  ns/op
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet   125000  avgt    3  227.261 ± 11.838  ns/op
    GroupByLong256HashSetBenchmark.testGroupByLong256HashSet  1250000  avgt    3  235.956 ± 28.635  ns/op
    GroupByLong256HashSetBenchmark.testOrderedMap                1250  avgt    3  206.105 ± 34.569  ns/op
    GroupByLong256HashSetBenchmark.testOrderedMap               12500  avgt    3  203.992 ± 12.676  ns/op
    GroupByLong256HashSetBenchmark.testOrderedMap              125000  avgt    3  204.173 ±  3.784  ns/op
    GroupByLong256HashSetBenchmark.testOrderedMap             1250000  avgt    3  203.260 ±  2.665  ns/op
    nwoolmer committed Jan 18, 2024
    6a92668
  5. Whitespace

    nwoolmer committed Jan 18, 2024
    a5cbe7a
  6. Broken commit to show new tests for UUID/Long256 will fail.

    UUID
    -------------------------------------------------------------------------------
    Test set: io.questdb.test.griffin.engine.functions.groupby.CountDistinctUuidGroupByFunctionFactoryTest
    -------------------------------------------------------------------------------
    Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.419 sec <<< FAILURE! - in io.questdb.test.griffin.engine.functions.groupby.CountDistinctUuidGroupByFunctionFactoryTest
    testMappingZeroToNulls(io.questdb.test.griffin.engine.functions.groupby.CountDistinctUuidGroupByFunctionFactoryTest)  Time elapsed: 0.046 sec  <<< FAILURE!
    java.lang.AssertionError: expected:<a	s
    a	4
    > but was:<a	s
    a	3
    >
    
    Long256
    -------------------------------------------------------------------------------
    Test set: io.questdb.test.griffin.engine.functions.groupby.CountDistinctLong256GroupByFunctionFactoryTest
    -------------------------------------------------------------------------------
    Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.288 sec <<< FAILURE! - in io.questdb.test.griffin.engine.functions.groupby.CountDistinctLong256GroupByFunctionFactoryTest
    testMappingZeroToNulls(io.questdb.test.griffin.engine.functions.groupby.CountDistinctLong256GroupByFunctionFactoryTest)  Time elapsed: 0.054 sec  <<< FAILURE!
    java.lang.AssertionError: expected:<a	s
    a	4
    > but was:<a	s
    a	3
    >
    nwoolmer committed Jan 18, 2024
    c3d2e64
  7. New tests for UUID/Long256 should pass.

    UUID
    Tests run: 12, Failures: 0, Errors: 0, Skipped: 0
    
    Long256
    Tests run: 12, Failures: 0, Errors: 0, Skipped: 0
    nwoolmer committed Jan 18, 2024
    6be4a18
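
The "Addressing PR comments" commit above states that the zero -> null mapping should apply only when a key is entirely zero. The sketch below shows one way such a rule could look for 128-bit keys. It is an illustration under stated assumptions, not the PR's code: the class and method names are hypothetical, `Long.MIN_VALUE` in both halves is assumed to be the null marker, null values are assumed to be excluded from the distinct count, and a plain `HashSet` stands in for a set that reserves the all-zero key as its empty marker.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a zero -> null key mapping for a distinct count over 128-bit values. */
final class ZeroToNullMappingSketch {
    // Assumption: a value is null when both 64-bit halves equal this marker.
    static final long NULL_HALF = Long.MIN_VALUE;

    static boolean isNull(long lo, long hi) {
        return lo == NULL_HALF && hi == NULL_HALF;
    }

    /**
     * Remaps a key for storage. Only the all-zero key is remapped; a key with a
     * single zero half, such as (0, 5), is stored unchanged so it stays distinct.
     */
    static long[] mapKey(long lo, long hi) {
        if (lo == 0 && hi == 0) {
            // The null marker is free to reuse because null values are never stored.
            return new long[]{NULL_HALF, NULL_HALF};
        }
        return new long[]{lo, hi};
    }

    /** Counts distinct non-null values given as {lo, hi} pairs. */
    static int countDistinct(long[][] values) {
        Set<List<Long>> seen = new HashSet<>();
        for (long[] v : values) {
            if (isNull(v[0], v[1])) {
                continue; // nulls do not contribute to the count
            }
            long[] key = mapKey(v[0], v[1]);
            seen.add(Arrays.asList(key[0], key[1]));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        long[][] values = {
                {0, 0}, {0, 5}, {7, 0}, {3, 3}, {NULL_HALF, NULL_HALF}, {0, 0}
        };
        System.out.println(countDistinct(values)); // prints 4
    }
}
```

If the mapping fired whenever any single half was zero, keys such as (0, 5) and (7, 0) would collapse into one entry and the count would come out too low.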

Commits on Jan 19, 2024

  1. Updates to tests.

    Added oracles to the hash set test.
    
    Added extra 0/null mapping cases to the griffin functions.
    nwoolmer committed Jan 19, 2024
    31e2dab
  2. 497a992