Skip to content

accelerate ByteArray.hashCode#7622

Merged
Jackie-Jiang merged 1 commit intoapache:masterfrom
richardstartin:accelerate-bytearray-hashcode
Oct 24, 2021
Merged

accelerate ByteArray.hashCode#7622
Jackie-Jiang merged 1 commit intoapache:masterfrom
richardstartin:accelerate-bytearray-hashcode

Conversation

@richardstartin
Copy link
Member

Description

Speed up ByteArray.hashCode which speeds up building column statistics. Unfortunately the algorithm needs to be maintained in case it has been used for partitioning, but it can be sped up easily by breaking a data dependency in the loop.

@State(Scope.Benchmark)
public class BenchmarkByteArray {

  @Param({"7", "127", "1023"})
  int _size;

  ByteArray _bytes;

  @Setup(Level.Trial)
  public void init() {
    byte[] bytes = new byte[_size];
    ThreadLocalRandom.current().nextBytes(bytes);
    _bytes = new ByteArray(bytes);
  }

  @Benchmark
  public int byteArrayHashCode() {
    return _bytes.hashCode();
  }

  @Benchmark
  public int arraysHashCode() {
    return Arrays.hashCode(_bytes.getBytes());
  }
}
Benchmark                             (_size)  Mode  Cnt    Score   Error  Units
BenchmarkByteArray.arraysHashCode           7  avgt    5    6.823 ± 3.155  ns/op
BenchmarkByteArray.arraysHashCode         127  avgt    5   83.459 ± 5.859  ns/op
BenchmarkByteArray.arraysHashCode        1023  avgt    5  678.600 ± 4.842  ns/op
BenchmarkByteArray.byteArrayHashCode        7  avgt    5    6.294 ± 0.021  ns/op
BenchmarkByteArray.byteArrayHashCode      127  avgt    5   56.796 ± 1.975  ns/op
BenchmarkByteArray.byteArrayHashCode     1023  avgt    5  406.462 ± 3.246  ns/op

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

  • Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed
  • Yes (Please label this PR as release-notes and complete the section on Release Notes)

Release Notes

Documentation

@richardstartin richardstartin force-pushed the accelerate-bytearray-hashcode branch from e8e51a0 to 4f6eabe Compare October 22, 2021 22:35
@richardstartin richardstartin force-pushed the accelerate-bytearray-hashcode branch from 4f6eabe to d4b1136 Compare October 22, 2021 22:45
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are replacing the JDK implementation with our own, and I feel JIT might be able to optimize on the JDK default method, and the benchmark might have different result in different environment. If some new optimizations are introduced in the future java version, we won't be able to get that. Do you think this method might be slower in certain environments?

@richardstartin
Copy link
Member Author

We are replacing the JDK implementation with our own, and I feel JIT might be able to optimize on the JDK default method, and the benchmark might have different result in different environment. If some new optimizations are introduced in the future java version, we won't be able to get that. Do you think this method might be slower in certain environments?

As far as I am aware there is no Java or C++ compiler capable of performing this transformation, as simple as it is. It's possible that this will be treated with a vectorized intrinsic (the same trick is scalable) in the future, but Pinot doesn't run on JDK16 without adding a lot of --add-opens so shall we cross that bridge when it exists?

@codecov-commenter
Copy link

codecov-commenter commented Oct 22, 2021

Codecov Report

Merging #7622 (d4b1136) into master (a114cd2) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #7622      +/-   ##
============================================
- Coverage     71.61%   71.60%   -0.01%     
- Complexity     3938     3941       +3     
============================================
  Files          1562     1562              
  Lines         79370    79376       +6     
  Branches      11748    11750       +2     
============================================
- Hits          56843    56841       -2     
- Misses        18692    18697       +5     
- Partials       3835     3838       +3     
Flag Coverage Δ
integration1 29.37% <0.00%> (-0.06%) ⬇️
integration2 27.85% <0.00%> (+0.09%) ⬆️
unittests1 68.58% <100.00%> (+0.01%) ⬆️
unittests2 14.68% <0.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ain/java/org/apache/pinot/spi/utils/ByteArray.java 62.50% <100.00%> (+5.35%) ⬆️
...nction/DistinctCountBitmapAggregationFunction.java 47.42% <0.00%> (-4.64%) ⬇️
.../pinot/core/query/scheduler/PriorityScheduler.java 80.82% <0.00%> (-2.74%) ⬇️
.../java/org/apache/pinot/spi/data/TimeFieldSpec.java 88.63% <0.00%> (-2.28%) ⬇️
...core/startree/operator/StarTreeFilterOperator.java 85.10% <0.00%> (-1.42%) ⬇️
.../helix/core/realtime/SegmentCompletionManager.java 72.00% <0.00%> (-1.02%) ⬇️
...troller/helix/core/retention/RetentionManager.java 82.60% <0.00%> (-0.87%) ⬇️
...ata/manager/realtime/RealtimeTableDataManager.java 67.88% <0.00%> (-0.82%) ⬇️
.../controller/helix/core/SegmentDeletionManager.java 75.60% <0.00%> (-0.82%) ⬇️
...nMaxValueBasedSelectionOrderByCombineOperator.java 74.60% <0.00%> (-0.80%) ⬇️
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a114cd2...d4b1136. Read the comment docs.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. This video gives some context to the optimization: https://youtu.be/PnVw1uFxSyw?t=586

@Jackie-Jiang Jackie-Jiang merged commit 72b8fc1 into apache:master Oct 24, 2021
kriti-sc pushed a commit to kriti-sc/incubator-pinot that referenced this pull request Dec 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants