Make Checksummer faster and more correct
The new implementation is faster for two reasons:

- It more aggressively avoids calculating checksums of already-calculated items
- It updates a single hash in place rather than building a new Immutable::Set at each step
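
To make the first point concrete, here is a rough sketch (not the Nanoc code itself) that counts how many objects each tracking strategy walks. The names `visits_path_scoped` and `visits_global` are hypothetical, and a plain `Set` stands in for `Immutable::Set` so the snippet needs only the standard library.

```ruby
require 'set'

# Path-scoped tracking, in the spirit of the older approach: only the
# ancestors on the current path are remembered, so a child that is shared
# by several parents is walked again on every reference.
def visits_path_scoped(obj, ancestors = Set.new)
  return 1 if ancestors.include?(obj)
  return 1 unless obj.is_a?(Array)

  # Set#| returns a new set and leaves `ancestors` untouched.
  1 + obj.sum { |el| visits_path_scoped(el, ancestors | [obj]) }
end

# Global tracking, in the spirit of the newer approach: one mutable hash
# remembers every object seen so far, so a repeat costs a single lookup.
def visits_global(obj, visited = {})
  return 1 if visited.key?(obj)

  visited[obj] = visited.length
  return 1 unless obj.is_a?(Array)

  1 + obj.sum { |el| visits_global(el, visited) }
end

# Build a structure in which every level references the level below twice.
node = ['leaf']
10.times { node = [node, node] }

puts visits_path_scoped(node) # => 3071
puts visits_global(node)      # => 22
```

With ten levels of sharing, the path-scoped walk revisits the same subtrees over and over, while the global walk touches each distinct object once and then only pays for back-references.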

It is also more correct because it replaces the `<recur>` marker with a numbered reference to the previously seen object.

Each newly checksummed object will have a number suffix, e.g.
`Array#5<…>`. Each previously seen object will have an @-reference, e.g.
`Array#9<@4,@5,>`.
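
The numbering scheme can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual checksummer: the hypothetical `describe` helper builds a readable string instead of feeding a digest, it only knows how to descend into arrays, and the trailing comma after each element is inferred from the `Array#9<@4,@5,>` example above.

```ruby
# Hypothetical, simplified stand-in for the checksummer. It appends to a
# string so the numbering is visible; the real code updates a digest.
def describe(obj, out = +'', visited = {})
  num = visited[obj]
  if num
    # Already seen: emit a back-reference to the number assigned earlier.
    out << "@#{num}"
  else
    # First sighting: assign the next number before descending.
    num = visited.length
    visited[obj] = num

    out << obj.class.to_s << "##{num}<"
    if obj.is_a?(Array)
      # Assumed array behavior: each element is followed by a comma.
      obj.each do |el|
        describe(el, out, visited)
        out << ','
      end
    else
      out << obj.to_s
    end
    out << '>'
  end
  out
end

shared = 'x'
puts describe([shared, shared]) # => Array#0<String#1<x>,@1,>
```

Because `visited` is an ordinary hash here, as in the diff, lookups go through `hash` and `eql?`, so two equal strings share one number even when they are distinct objects.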

This invalidates checksums already stored in Nanoc sites; that is unavoidable and worth the tradeoff.
denisdefreyne committed Apr 4, 2024
1 parent f1ff297 commit 0ff8793
Showing 6 changed files with 96 additions and 71 deletions.
29 changes: 27 additions & 2 deletions nanoc-core/lib/nanoc/core/checksummer.rb
@@ -45,6 +45,13 @@ def calc(obj, digest_class = CompactDigest)
   digest.to_s
 end

+# TODO: remove (older, slower implementation)
+def calc_v1(obj, digest_class = CompactDigest)
+  digest = digest_class.new
+  update_v1(obj, digest)
+  digest.to_s
+end
+
 def calc_for_content_of(obj)
   obj.content_checksum_data || obj.checksum_data || Nanoc::Core::Checksummer.calc(obj.content)
 end
@@ -61,14 +68,32 @@ def define_behavior(klass, behavior)

 private

-def update(obj, digest, visited = Immutable::Set.new)
+def update(obj, digest, visited = {})
+  num = visited[obj]
+  if num
+    # If there already is an entry for this object, refer to it by its number.
+    digest.update("@#{num}")
+  else
+    # This object isn’t known yet. Assign it a new number.
+    num = visited.length
+    visited[obj] = num
+
+    digest.update(obj.class.to_s)
+    digest.update("##{num}<")
+    behavior_for(obj).update(obj, digest) { |o| update(o, digest, visited) }
+    digest.update('>')
+  end
+end
+
+# TODO: remove (older, slower implementation)
+def update_v1(obj, digest, visited = Immutable::Set.new)
   digest.update(obj.class.to_s)

   if visited.include?(obj)
     digest.update('<recur>')
   else
     digest.update('<')
-    behavior_for(obj).update(obj, digest) { |o| update(o, digest, visited.add(obj)) }
+    behavior_for(obj).update(obj, digest) { |o| update_v1(o, digest, visited.add(obj)) }
     digest.update('>')
   end
 end
4 changes: 2 additions & 2 deletions nanoc-core/spec/nanoc/core/action_sequence_spec.rb
@@ -147,9 +147,9 @@
 example do
   expect(subject).to eql(
     [
-      [:filter, :erb, 'PeWUm2PtXYtqeHJdTqnY7kkwAow='],
+      [:filter, :erb, 'B1gmzMdP+iEDgTz7SylLoB6yLNw='],
       [:snapshot, [:bar], true, ['/foo.md']],
-      [:layout, '/default.erb', '97LAe1pYTLKczxBsu+x4MmvqdkU='],
+      [:layout, '/default.erb', 'QQW0vu/3fP4Ihc5xhQKuPer3xUc='],
     ],
   )
 end