Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-12888][SQL][follow-up] benchmark the new hash expression #10917

Closed
wants to merge 6 commits into from

Conversation

cloud-fan
Copy link
Contributor

Adds the benchmark results as comments.

The codegen version is slower than the interpreted version for simple case becasue of 3 reasons:

  1. codegen version use a more complex hash algorithm than interpreted version, i.e. Murmur3_x86_32.hashInt vs simple multiplication and addition.
  2. codegen version will write the hash value to a row first and then read it out. I tried to create a GenerateHasher that can generate code to return hash value directly and got about 60% speed up for the simple case, does it worth?
  3. the row in simple case only has one int field, so the runtime reflection may be removed because of branch prediction, which makes the interpreted version faster.

The array case is also slow for similar reasons, e.g. array elements are of same type, so interpreted version can probably get rid of runtime reflection by branch prediction.

@cloud-fan
Copy link
Contributor Author

cc @nongli @rxin

@rxin
Copy link
Contributor

rxin commented Jan 26, 2016

@nongli maybe we should just use the simpler multiplication and addition?

@nongli
Copy link
Contributor

nongli commented Jan 26, 2016

@cloud-fan Simple is just a single int right? It's not even doing anything in the previous case?

@cloud-fan
Copy link
Contributor Author

@nongli It's not doing anything to get the hash code of the int field, but do a simple multiplication and addition to get the hash code of the row.

@SparkQA
Copy link

SparkQA commented Jan 26, 2016

Test build #50072 has finished for PR 10917 at commit 8207dc1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@nongli
Copy link
Contributor

nongli commented Jan 26, 2016

LGTM. We can have different hash functions with different entropy later but this seems okay to me.

@cloud-fan
Copy link
Contributor Author

The map case should have been slower than the array case for codegen, but it doesn't because of a generator bug that has been fixed at #10930. Should we merge this one first and update map case result after #10930 is merged?

@nongli
Copy link
Contributor

nongli commented Jan 26, 2016

No, let's re-run them when the results are easier to explain. Can you also tune the iterations so that the iterations is a higher value. The harness does some rounding with the less significant digits.

@SparkQA
Copy link

SparkQA commented Feb 3, 2016

Test build #50634 has started for PR 10917 at commit 4104d80.

@yhuai
Copy link
Contributor

yhuai commented Feb 3, 2016

test this please

@SparkQA
Copy link

SparkQA commented Feb 3, 2016

Test build #50640 has finished for PR 10917 at commit 4104d80.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 3, 2016

Test build #50642 has finished for PR 10917 at commit 4104d80.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

So many flaky tests...

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 3, 2016

Test build #50653 has finished for PR 10917 at commit 4104d80.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Intel(R) Core(TM) i7-4960HQ CPU @ 2.60GHz
Hash For map: Avg Time(ms) Avg Rate(M/s) Relative Rate
-------------------------------------------------------------------------------
interpreted version 64709.73 0.00 1.00 X
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long does this benchmark take to run? THis looks really long. I think we should keep benchmarks to run in low number of seconds total if possible.

@rxin
Copy link
Contributor

rxin commented Feb 5, 2016

cc @cloud-fan on follow-up

@@ -124,7 +124,7 @@ private[spark] object Benchmark {
}
val best = runTimes.min
val avg = runTimes.sum / iters
Result(avg / 1000000, num / (best / 1000), best / 1000000)
Result(avg / 1000000, num.toDouble / (best / 1000), best / 1000000)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to keep the precision here. The bestMs/avgMs can be well controlled in an appropriate number, but the rate can't. And we use rate as a divisor later, so if rate is small(assume we are benchmarking some slow operations), we will get large deviation. BTW, we use %10.1f to print rate but previously rate is always integral.

cc @davies

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hot fixed in master.

@SparkQA
Copy link

SparkQA commented Feb 6, 2016

Test build #50858 has finished for PR 10917 at commit 315af8c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Hash For normal: Best/Avg Time(ms) Rate(M/s) Per Row(ns) Relative
-------------------------------------------------------------------------------------------
interpreted version 2209 / 2271 0.9 1053.4 1.0X
codegen version 1887 / 2018 1.1 899.9 1.2X
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the generated version is slower?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

codegen version is 20% faster, because it doesn't have runtime reflection.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I read it wrong (Maybe I commented in wrong line, meant the previous result).

@SparkQA
Copy link

SparkQA commented Feb 7, 2016

Test build #50899 has finished for PR 10917 at commit 9a1f8ff.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Copy link
Contributor

davies commented Feb 9, 2016

LGTM, merging this into master, thanks!

@asfgit asfgit closed this in 7fe4fe6 Feb 9, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants