Improve performance (part III) #1604

denisdefreyne · 2022-10-15T18:20:22Z

Detailed description

This continues the progress made in improving performance.

The main lesson: Set#any? is slow. It is much faster to call Set#empty? and negate the result.

I also realised that Sets can be avoided if the possible elements are limited and known ahead of time. Set is rather slow, and more verbose code can be a lot faster.
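
As a rough illustration of that trade-off, here is a minimal sketch that assumes only four possible elements. The class names and the four property names are hypothetical, chosen for the example, and are not taken from the Nanoc code in this PR.

```ruby
require 'set'

# Set-based version: compact, but every query goes through Set.
class SetProps
  def initialize(names)
    @names = Set.new(names)
  end

  def active?
    !@names.empty?
  end

  def include?(name)
    @names.include?(name)
  end
end

# Flag-based version: more verbose, but only reads instance variables.
# It works because the possible elements are fixed and known up front.
class FlagProps
  def initialize(raw_content: false, attributes: false, compiled_content: false, path: false)
    @raw_content = raw_content
    @attributes = attributes
    @compiled_content = compiled_content
    @path = path
  end

  def active?
    @raw_content || @attributes || @compiled_content || @path
  end

  def include?(name)
    case name
    when :raw_content then @raw_content
    when :attributes then @attributes
    when :compiled_content then @compiled_content
    when :path then @path
    else false
    end
  end
end

set_props = SetProps.new([:attributes])
flag_props = FlagProps.new(attributes: true)

# Both versions answer the same questions the same way.
puts set_props.active? == flag_props.active?
puts set_props.include?(:path) == flag_props.include?(:path)
```

Whether the flag-based version is actually faster in a given code path is something to confirm with a profiler or benchmark.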

To do

Tests
- It’d be nice to have a test case for #props_for returning nil, but by now it’s increasingly unlikely that any versions of Nanoc old enough to need this tiny bit of backwards compatibility are still around.

Related issues

Two other performance-focused PRs:

DivineDominion · 2022-11-11T21:37:28Z

@denisdefreyne I'm curious: do you have a hunch why Set#any? is slow? Once you have everything hashed, which might take a while, sure, the hash lookup should be super quick.

denisdefreyne · 2022-11-12T09:32:59Z

Set#any? is slow because Set does not implement #any? directly, but delegates to Enumerable#any? instead. Enumerable#any? seems to be implemented in a way in which it starts iterating over elements, and then breaks out after the first element, something like this:

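module MyEnumerable
  # Simplified sketch: the real Enumerable#any? returns true only for a truthy element.
  def any?
    # Start a regular #each iteration, then bail out on the first element.
    each do
      return true
    end

    false
  end
end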

That is what the CPU profile seems to suggest, at least.
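
One way to check this reading without a profiler is a small collection that counts its own iteration. The sketch below is illustrative only: CountingCollection is a hypothetical class, not part of Ruby or Nanoc, and it says nothing about how Set is implemented internally. It only shows that Enumerable#any? has to enter #each to answer, while a directly implemented #empty? does not.

```ruby
# Hypothetical collection that records how often #each is entered
# and how many elements it hands out.
class CountingCollection
  include Enumerable

  attr_reader :each_calls, :yields

  def initialize(elements)
    @elements = elements
    @each_calls = 0
    @yields = 0
  end

  def each
    @each_calls += 1
    @elements.each do |element|
      @yields += 1
      yield element
    end
  end

  # Answered directly, without iterating.
  def empty?
    @elements.empty?
  end
end

collection = CountingCollection.new([1, 2, 3])

collection.empty?           # does not touch #each
puts collection.each_calls  # prints 0

collection.any?             # Enumerable#any? drives #each, then stops early
puts collection.each_calls  # prints 1
puts collection.yields      # prints 1
```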

It also explains the significant difference in performance. Here’s a benchmark:

Warming up --------------------------------------
                any?   276.965k i/100ms
             !empty?     3.323M i/100ms
Calculating -------------------------------------
                any?      2.756M (± 2.6%) i/s -     13.848M in   5.027698s
             !empty?     33.343M (± 0.2%) i/s -    169.469M in   5.082677s

Comparison:
             !empty?: 33342764.7 i/s
                any?:  2756310.9 i/s - 12.10x  (± 0.00) slower

Benchmark implementation:

require 'benchmark/ips'
require 'set'

set = Set.new
100.times { set << rand }

Benchmark.ips do |x|
  x.report("any?") do |times|
    i = 0
    while i < times
      set.any?
      i += 1
    end
  end

  x.report("!empty?") do |times|
    i = 0
    while i < times
      !set.empty?
      i += 1
    end
  end

  x.compare!
end

denisdefreyne · 2022-11-12T20:03:49Z

One more thing: #any? also returns false for non-empty collections that contain only falsy elements: [false, nil].any? is false.

So, switching from xyz.any? to !xyz.empty? is not only faster, but also more correct. (Though in Nanoc’s case, the correctness wasn’t an issue.)
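
A short sketch of that difference, using plain arrays and a Set (the variable names are only for illustration):

```ruby
require 'set'

falsy_only = [false, nil]

any_result = falsy_only.any?           # false: no element is truthy
not_empty_result = !falsy_only.empty?  # true: the array has two elements

set_any_result = Set[false, nil].any?  # false: same behaviour for a Set
mixed_result = [nil, 0].any?           # true: 0 is truthy in Ruby

puts any_result
puts not_empty_result
puts set_any_result
puts mixed_result
```

The two expressions only agree when the collection can never hold false or nil.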

denisdefreyne added 2 commits October 15, 2022 19:32

Speed up affected_props in OutdatednessRule

20c7f8e

Replace Set#any? with negated Set#empty?

0dccd04

This is far faster.

denisdefreyne marked this pull request as ready for review October 15, 2022 18:20

denisdefreyne added 2 commits October 15, 2022 20:23

Optimise #dependencies_causing_outdatedness_of

9ba05d9

The props-to-hash-to-props dance is not needed, and was causing slowness.

Speed up Document#[]=

113ae81

denisdefreyne force-pushed the denis/speedup-3 branch from 0b1651d to 113ae81 Compare October 15, 2022 18:24

denisdefreyne merged commit e4af56b into main Oct 15, 2022

denisdefreyne deleted the denis/speedup-3 branch October 15, 2022 18:26
