Introduce a benchmark template #26792

Merged: chancancode merged 1 commit into master from benchmark-template on Oct 15, 2016

Conversation

chancancode (Member)

This replaces the boilerplate in the “benchmark your code” section of the contributors’ guide with an executable template. I also amended the text to encourage best practices and codified them in the template.

For now, this is only suitable for relatively self-contained changes that can be inlined into a simple script. In the future, it can be expanded to cover measuring the difference between two commits.
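Roughly, a script built from the template amounts to something like the following sketch (assumptions: the benchmark-ips gem, a pure-Ruby `String#fast_blank?` standing in for the change being measured, and guessed scenario strings; this is an illustration, not the template itself):

```ruby
require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"
  gem "benchmark-ips"
  gem "activesupport"
end

require "benchmark/ips"
require "active_support/core_ext/object/blank"

class String
  # Hypothetical pure-Ruby stand-in for the optimized implementation.
  def fast_blank?
    empty? || /\A[[:space:]]*\z/.match?(self)
  end
end

SCENARIOS = {
  "Empty"             => "",
  "Single Space"      => " ",
  "Two Spaces"        => "  ",
  "Mixed Whitespaces" => " \t\r\n",
  "Very Long String"  => " " * 100
}

SCENARIOS.each_pair do |name, value|
  puts
  puts " #{name} ".center(80, "=")
  puts

  Benchmark.ips do |x|
    x.report("blank?")      { value.blank? }
    x.report("fast_blank?") { value.fast_blank? }
    x.compare!
  end
end
```

Such a script runs directly with `ruby benchmark.rb`; `bundler/inline` installs the gems on first run.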

The output looks like this:

```
==================================== Empty =====================================

Warming up --------------------------------------
              blank?   225.963k i/100ms
         fast_blank?   238.147k i/100ms
Calculating -------------------------------------
              blank?      8.825M (± 6.4%) i/s -     44.063M in   5.014824s
         fast_blank?      9.311M (± 6.3%) i/s -     46.439M in   5.009153s

Comparison:
         fast_blank?:  9310694.8 i/s
              blank?:  8824801.7 i/s - same-ish: difference falls within error


================================= Single Space =================================

Warming up --------------------------------------
              blank?    56.581k i/100ms
         fast_blank?   232.774k i/100ms
Calculating -------------------------------------
              blank?    813.985k (±16.7%) i/s -      4.017M in   5.076576s
         fast_blank?      9.547M (± 5.2%) i/s -     47.719M in   5.013204s

Comparison:
         fast_blank?:  9547414.0 i/s
              blank?:   813985.0 i/s - 11.73x  slower


================================== Two Spaces ==================================

Warming up --------------------------------------
              blank?    58.265k i/100ms
         fast_blank?   244.056k i/100ms
Calculating -------------------------------------
              blank?    823.343k (±16.2%) i/s -      4.020M in   5.014213s
         fast_blank?      9.484M (± 4.9%) i/s -     47.347M in   5.005339s

Comparison:
         fast_blank?:  9484021.6 i/s
              blank?:   823343.1 i/s - 11.52x  slower


============================== Mixed Whitespaces ===============================

Warming up --------------------------------------
              blank?    53.919k i/100ms
         fast_blank?   237.103k i/100ms
Calculating -------------------------------------
              blank?    763.435k (±16.8%) i/s -      3.720M in   5.018029s
         fast_blank?      9.672M (± 5.8%) i/s -     48.369M in   5.019356s

Comparison:
         fast_blank?:  9672467.2 i/s
              blank?:   763435.4 i/s - 12.67x  slower


=============================== Very Long String ===============================

Warming up --------------------------------------
              blank?    34.037k i/100ms
         fast_blank?   240.366k i/100ms
Calculating -------------------------------------
              blank?    409.731k (± 8.9%) i/s -      2.042M in   5.028235s
         fast_blank?      9.794M (± 4.3%) i/s -     49.035M in   5.016328s

Comparison:
         fast_blank?:  9794225.2 i/s
              blank?:   409731.4 i/s - 23.90x  slower
```

@chancancode changed the title from “Introduce a benchmark template [ci skip]” to “Introduce a benchmark template” on Oct 15, 2016
@tenderlove (Member) left a comment

yes

@chancancode (Member Author)

@fxn also reviewed this in person 😄

@chancancode merged commit 0bf90fa into master on Oct 15, 2016
@chancancode deleted the benchmark-template branch on October 15, 2016 at 10:36
@kaspth (Contributor) commented Oct 15, 2016

Nice! ❤️

@jonathanhefner (Member)

It is very easy to make an optimization that improves performance for a specific scenario you care about but regresses on other common cases. Therefore, you should test your change against a list of representative scenarios.

Also worth noting: looping over scenarios inside a micro-benchmark can skew measurements. Writing the loop around the benchmark, as codified in this PR, is more accurate, but it can make evaluating performance cumbersome when you want aggregated stats. I wrote a gem (repo) to help with this, which I hope could be useful.
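For contrast, here is a minimal sketch (assuming benchmark-ips, and the same `blank?`/`fast_blank?` methods as in the PR description) of the loop-inside-the-benchmark style being warned about here:

```ruby
require "benchmark/ips"

# Assumes String#blank? and String#fast_blank? are defined, as in the
# template sketch above.
SCENARIOS = ["", " ", "  ", " \t\r\n", " " * 100]

# Anti-pattern: the loop runs inside the measured block, so the reported
# i/s includes Array#each overhead and blends all scenarios together; a
# regression in any one scenario can be masked by the others.
Benchmark.ips do |x|
  x.report("blank?")      { SCENARIOS.each { |s| s.blank? } }
  x.report("fast_blank?") { SCENARIOS.each { |s| s.fast_blank? } }
  x.compare!
end
```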

Some examples:

```ruby
require "benchmark/inputs"                       # the gem mentioned above
require "active_support/core_ext/object/blank"   # provides String#blank?
# (assumes a String#fast_blank? implementation is defined, as in the template example)

WEIGHTED_SCENARIOS = [
  # 20% empty strings
  "", "",
  # 20% short blank strings
  " ", " \n",
  # 10% long blank strings
  " " * 100,
  # 50% non-blank strings of various lengths
  "abc", "xyz", "abcxyz", "abc xyz"
]

Benchmark.inputs(WEIGHTED_SCENARIOS) do |x|
  x.report('blank?')      {|value| value.blank? }
  x.report('fast_blank?') {|value| value.fast_blank? }
  x.compare!
end

# OUTPUT:
#
# blank?
#   883983.8 i/s (±8.41%)
# fast_blank?
#   10418275.3 i/s (±2.35%)
# 
# Comparison:
#   fast_blank?:  10418275.3 i/s
#        blank?:    883983.8 i/s - 11.79x slower
```

```ruby
AGGREGATED_SCENARIOS = {
  "Empty"            => [""],
  "Short Blank"      => [" ", "  ", " \t\r\n"],
  "Long Blank"       => [" " * 20, " " * 100],
  "Short Non-blank"  => ["abc", "abc xyz"],
  "Long Non-blank"   => ["x" * 20, "x" * 100],
}

AGGREGATED_SCENARIOS.each_pair do |name, values|
  puts
  puts " #{name} ".center(80, "=")
  puts

  Benchmark.inputs(values) do |x|
    x.report('blank?')      {|value| value.blank? }
    x.report('fast_blank?') {|value| value.fast_blank? }
    x.compare!
  end
end

# OUTPUT:
#
# ==================================== Empty =====================================
# 
# blank?
#   9173564.6 i/s (±2.49%)
# fast_blank?
#   10144075.8 i/s (±5.31%)
# 
# Comparison:
#   fast_blank?:  10144075.8 i/s
#        blank?:   9173564.6 i/s - 1.11x slower
# 
# 
# ================================= Short Blank ==================================
# 
# blank?
#   619318.1 i/s (±10.54%)
# fast_blank?
#   10222090.9 i/s (±3.30%)
# 
# Comparison:
#   fast_blank?:  10222090.9 i/s
#        blank?:    619318.1 i/s - 16.51x slower
# 
# 
# ================================== Long Blank ==================================
# 
# blank?
#   374167.5 i/s (±7.30%)
# fast_blank?
#   10409157.6 i/s (±6.49%)
# 
# Comparison:
#   fast_blank?:  10409157.6 i/s
#        blank?:    374167.5 i/s - 27.82x slower
# 
# 
# =============================== Short Non-blank ================================
# 
# blank?
#   1294028.6 i/s (±5.78%)
# fast_blank?
#   9928018.4 i/s (±4.39%)
# 
# Comparison:
#   fast_blank?:   9928018.4 i/s
#        blank?:   1294028.6 i/s - 7.67x slower
# 
# 
# ================================ Long Non-blank ================================
# 
# blank?
#   1256066.8 i/s (±5.38%)
# fast_blank?
#   10460655.2 i/s (±3.16%)
# 
# Comparison:
#   fast_blank?:  10460655.2 i/s
#        blank?:   1256066.8 i/s - 8.33x slower
```

@chancancode (Member Author)

That's pretty nice! I think the ideal approach (something I am hoping to try for my inflector changes) is to record the actual calls from production and replay them, in the same order and frequency, in the benchmarks. I think that is the most realistic way to do it, but it might make sharing the dataset more difficult, since it may contain sensitive information.
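A rough sketch of how that could look (the log path and recording hook are hypothetical, and `String#fast_blank?` is assumed to be defined as in the earlier examples), reusing `Benchmark.inputs` from above so the recorded values are replayed in the same order and frequency:

```ruby
require "benchmark/inputs"
require "active_support/core_ext/object/blank"

# Recording side (hypothetical): production code appends each receiver of
# the call under study to a log, one value per line.
#   File.open("blank_calls.log", "a") { |f| f.puts(value) }

# Replay side: drive the benchmark directly from the recorded stream.
recorded = File.readlines("blank_calls.log", chomp: true)

Benchmark.inputs(recorded) do |x|
  x.report("blank?")      { |value| value.blank? }
  x.report("fast_blank?") { |value| value.fast_blank? }
  x.compare!
end
```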
