Optimize single-argument methods #210

JacobEvelyn · 2021-08-27T15:26:04Z

This commit adds a performance optimization to one-argument
methods; these methods now have their memoization hashes
stored in an array rather than in an outer hash. This offers
a slight speedup because array accesses are faster than
hash lookups.
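
A minimal sketch of the two layouts described above, for illustration only. The constants `HASH_STORE`, `ARRAY_STORE`, `DOUBLE_INDEX`, and `SQUARE_INDEX` and both lookup helpers are hypothetical names, not part of the gem, which keeps this state in instance variables such as `@_memo_wise_single_argument`.

```ruby
# Before: an outer hash keyed by method name, one inner hash per method.
HASH_STORE = {
  double: { 2 => 4 },
  square: { 3 => 9 }
}.freeze

# After: an array with one inner hash per method, addressed by an integer
# index assigned to each method ahead of time (hypothetical assignment).
DOUBLE_INDEX = 0
SQUARE_INDEX = 1
ARRAY_STORE = [
  { 2 => 4 }, # for the method at index 0 (double)
  { 3 => 9 }  # for the method at index 1 (square)
].freeze

# Outer lookup goes through a hash keyed by the method name.
def lookup_via_hash(method_name, arg)
  HASH_STORE[method_name][arg]
end

# Outer lookup is an array access by integer index.
def lookup_via_array(index, arg)
  ARRAY_STORE[index][arg]
end
```

Both helpers return the same memoized value; only the outer lookup differs, which is where the speedup described above would come from.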

Before merging:

- Copy the table printed at the end of the latest benchmark results into the README.md and update this PR
- If this change merits an update to CHANGELOG.md, add an entry following Keep a Changelog guidelines with semantic versioning
- Manually re-confirm that this code is consistently faster than the previous no-array implementation

codecov · 2021-08-27T15:26:48Z

Codecov Report

Merging #210 (01b4fb3) into main (4a7b5cc) will not change coverage.
The diff coverage is 100.00%.

❗ Current head 01b4fb3 differs from pull request most recent head 6bab0b0. Consider uploading reports for the commit 6bab0b0 to get more accurate results

@@            Coverage Diff            @@
##              main      #210   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            2         2           
  Lines          164       204   +40     
=========================================
+ Hits           164       204   +40

Impacted Files	Coverage Δ
lib/memo_wise.rb	`100.00% <100.00%> (ø)`
lib/memo_wise/internal_api.rb	`100.00% <100.00%> (ø)`

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

JacobEvelyn · 2021-08-27T15:34:45Z

lib/memo_wise.rb

+          #       another_single_arg_method_name: 0
+          #     }
+          #   }
+          @_memo_wise_indices ||= Hash.new { |h, k| h[k] = {} }


I know we avoided Hash default procs elsewhere due to the Marshal issue, but I can't see that being an issue here. Please double-check my reasoning though!

@ms-ati I'd love your thoughts on this as well.

@JacobEvelyn Is your reasoning that this is safe for Marshal, because the Hash with default proc lives in a Class vs in an instance? Do we have tests of the Marshal case that verify this?

My reasoning is more one of usage—I find it unlikely that someone would start loading a class (e.g. memo_wise one method), then Marshal it, then load it, and then memo_wise another method. But maybe I should just change this so we don't worry about it? Adding a test for it feels like it may be overkill though.

nvm, I realized this was no longer needed and have removed it.
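
For context on the Marshal concern in this thread, here is a small sketch of the underlying Ruby behavior: a hash created with a default proc cannot be serialized with `Marshal.dump`. The helper name `marshalable?` is hypothetical and exists only for this example.

```ruby
# Returns true if Marshal can dump the object, false if it raises TypeError.
def marshalable?(object)
  Marshal.dump(object)
  true
rescue TypeError
  false
end

plain_hash        = {}
default_proc_hash = Hash.new { |hash, key| hash[key] = {} }

marshalable?(plain_hash)        # a plain hash can be dumped
marshalable?(default_proc_hash) # a hash with a default proc cannot
```

Any object that holds such a hash in an instance variable fails to dump in the same way, which is presumably why default procs were avoided elsewhere.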

JacobEvelyn · 2021-08-27T15:36:10Z

lib/memo_wise.rb

            end
-          end
+          END_OF_METHOD
+        else # :splat, :double_splat


This is kind of annoying, but throughout this PR I use `else` when I really want to be explicit about one specific case. Unfortunately, if this isn't `else`, our code coverage fails because it reports no test coverage of the "else" condition; it doesn't realize that our explicit options are in fact exhaustive.

In the end I decided to use `else` to appease code coverage, but left comments indicating the original cases I had.

Curious to hear more about why use `else` to mean `:splat, :double_splat` instead of making the `else` a no-op (or a better version of `raise "Impossible"`) case?

Because then to get to 100% code coverage we'd need to add a test that exercises the `else` case, which is impossible/very difficult because our `case` branches are exhaustive. Does that make sense?
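
A rough sketch of the pattern discussed in this thread, using the symbols quoted in the diff. The method name `storage_kind` and its return values are hypothetical; the point is only the shape of the `case` expression.

```ruby
# The final branch is written as `else` so that branch coverage sees every
# path exercised; the comment records which cases it is meant to handle.
def storage_kind(method_arguments)
  case method_arguments
  when :one_required_positional, :one_required_keyword
    :single_argument
  else # :splat, :double_splat
    :multiple_arguments
  end
end
```

The trade-off is that an unexpected symbol silently takes the `else` branch instead of raising, which is the concern behind the `raise "Impossible"` suggestion.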

JacobEvelyn · 2021-08-27T15:36:55Z

lib/memo_wise.rb

+    when :one_required_positional, :one_required_keyword
+      hash = (@_memo_wise_single_argument[api.index(method_name)] ||= {})
+      hash[method_arguments == :one_required_positional ? args.first : kwargs.first.last] = yield
+    when :splat, :double_splat


In many parts of this PR, I went back and forth on when to combine different cases and when to separate them. I'm very open to suggestions.

Yeah reading this code (before this comment) I was wondering why not keep them as separate cases. I think it's slightly cleaner that way instead of using the ternary operator within the switch statement.

👍 works for me
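
A sketch of what separate branches could look like, as an illustration rather than the code that was merged. `preset_value` and its parameters are hypothetical; `store` stands in for the per-method hash from the diff.

```ruby
# Each argument style gets its own `when` branch instead of a ternary
# inside a combined branch.
def preset_value(store, method_arguments, args, kwargs, value)
  case method_arguments
  when :one_required_positional
    store[args.first] = value
  when :one_required_keyword
    # kwargs.first is a [name, value] pair, so .last is the keyword's value.
    store[kwargs.first.last] = value
  end
  store
end
```

For example, `preset_value({}, :one_required_keyword, [], { a: 2 }, :b)` keys the stored result by `2`, the value passed for the keyword.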

JacobEvelyn · 2021-08-27T15:38:16Z

lib/memo_wise/internal_api.rb

+    # @return [Integer] the array index in `@_memo_wise_single_argument` to use
+    #   to find the memoization data for the given method
+    def index(method_name)
+      indices = target_class.instance_variable_get(:@_memo_wise_indices) ||


Is this search of three different objects (two are searched in target_class) to find the instance variable too janky? I couldn't come up with a better way.

I don't know another way to do it either

@ms-ati I'd love your thoughts on this as well. I'm still worried about edge cases in which multiple @_memo_wise_indices instance vars exist on e.g. both target_class and target and cause incorrect behavior when multiple memoized methods share the same name, but I'm not sure what additional tests I can add that I haven't already added.

I agree that this doesn't seem like the correct pattern @JacobEvelyn. Should we schedule a pairing session to review the code, and look for ways to have a single code path that knows "where" to look?

I may be misunderstanding, but it seems like we should know whether we are in a "def", "def self.", or "class << self" code path, shouldn't we?
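
To make the concern concrete, here is a sketch of the plain Ruby behavior involved: a class, its singleton class, and each instance are separate objects, so an instance variable with the same name can exist on each independently. The class `Example`, the stored values, and the helper `indices_on` are hypothetical; which objects memo_wise actually stores `@_memo_wise_indices` on is not shown in the excerpt above.

```ruby
class Example; end

# The class object and its singleton class each carry their own
# instance variables, even when the names match.
Example.instance_variable_set(:@_memo_wise_indices, { name: 0 })
Example.singleton_class.instance_variable_set(:@_memo_wise_indices, { name: 1 })

# Reads the variable from whichever object is passed in; returns nil when
# that object has no such variable.
def indices_on(object)
  object.instance_variable_get(:@_memo_wise_indices)
end

indices_on(Example)                 # the hash set on the class
indices_on(Example.singleton_class) # a different hash, set on the singleton class
indices_on(Example.new)             # nil, nothing set on the instance
```

A lookup that tries several of these objects in turn can therefore pick up the wrong hash when a method name is memoized in more than one place.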

JacobEvelyn · 2021-08-27T15:39:28Z

lib/memo_wise/internal_api.rb

      obj.instance_variable_set(:@_memo_wise, {}) unless obj.instance_variables.include?(:@_memo_wise)
+      unless obj.instance_variables.include?(:@_memo_wise_single_argument)


Worth noting—if we similarly optimize the no-args case, I'd probably rename this array @_memo_wise and have the hash be @_memo_wise_multiple_arguments or something similar. I'm open to naming suggestions.

I wonder if it might be worth renaming the hash to @_memo_wise_multiple_arguments already within this PR?

I'd rather not do that now since at this point that hash also contains 0-arity methods, and I think it's possible that this optimization won't end up being faster for 0-arity methods because of the sentinel issue. Thoughts?

That makes sense to me
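
The "sentinel issue" is not spelled out in this thread. One plausible reading, offered here as an assumption, is that a zero-argument method stored directly in an array slot needs a marker to tell "not memoized yet" apart from a memoized `nil`. The names `NOT_SET` and `ZeroArgStore` below are hypothetical.

```ruby
# Unique marker object meaning "nothing memoized in this slot yet".
NOT_SET = Object.new.freeze

class ZeroArgStore
  def initialize(size)
    # Every slot starts out holding the sentinel.
    @values = Array.new(size, NOT_SET)
  end

  # Returns the memoized value for the slot, computing it via the block
  # only on the first call, even when the computed value is nil or false.
  def fetch(index)
    value = @values[index]
    return value unless value.equal?(NOT_SET)

    @values[index] = yield
  end
end
```

The extra identity check on every call is the kind of cost that could offset the gain from using an array for 0-arity methods.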

JacobEvelyn · 2021-08-27T15:40:05Z

lib/memo_wise/internal_api.rb

+    # @param method_name [Symbol] the name of the memoized method
+    # @return [Integer] the array index in `@_memo_wise_single_argument` to use
+    #   to find the memoization data for the given method
+    def index(method_name)


I tried using a more descriptive name but it got annoying. 🤷‍♀️

JacobEvelyn · 2021-08-27T15:42:35Z

spec/memo_wise_spec.rb

-        it "does not memoize the class methods" do
-          expect(Array.new(4) { class_with_memo.with_positional_args(1, 2) }).
-            to all eq("class_with_positional_args: a=1, b=2")
+        context "for methods with no arguments" do


I'm concerned about bugs in my logic manifesting when there are multiple memoized methods with the same name, so I added a handful of methods testing that, here and elsewhere in the specs below. I couldn't come up with easy ways to generalize these or test them more broadly/in all cases (modules, etc.) but am definitely open to doing so if y'all have ideas here.

@ms-ati I'd love your thoughts on DRYing up these tests.

Never mind, I think I came up with a DRY approach in the Round 2 changes commit.

JacobEvelyn · 2021-08-27T15:47:56Z

spec/reset_memo_wise_spec.rb

@@ -388,6 +388,48 @@
          expect(instance2.no_args_counter).to eq(1)
        end
      end
+
+      context "when method name is the same as a memoized class method" do


Should we add similar tests to preset_memo_wise_spec? I mostly viewed these as a way to test that InternalAPI#index works correctly, but open to other thoughts.

I think it makes sense to either leave a comment explaining this decision or to add similar tests to preset_memo_wise_spec, otherwise they're asymmetrical without a documented explanation for why.

jemmaissroff · 2021-08-27T18:55:22Z

lib/memo_wise.rb

            end
-          end
+          END_OF_METHOD
+        else # :splat, :double_splat


Curious to hear more about why use else to mean :splat, :double_splat instead of making an else no-op (or a better version of raise "Impossible") case?

jemmaissroff · 2021-08-27T18:59:54Z

lib/memo_wise.rb

+    when :one_required_positional, :one_required_keyword
+      hash = (@_memo_wise_single_argument[api.index(method_name)] ||= {})
+      hash[method_arguments == :one_required_positional ? args.first : kwargs.first.last] = yield
+    when :splat, :double_splat


Yeah reading this code (before this comment) I was wondering why not keep them as separate cases. I think it's slightly cleaner that way instead of using the ternary operator within the switch statement.

jemmaissroff · 2021-08-27T19:05:52Z

lib/memo_wise/internal_api.rb

+      #
+      # `@_memo_wise_single_argument` looks like:
+      #   [
+      #     { arg1 => :memoized_result, ... }, # For method 1


Such a nitpick, but can we use `For method 0` and `For method 1` in the comments here? I find that easier to follow with the indexing we have to use here.

jemmaissroff · 2021-08-27T19:06:13Z

lib/memo_wise/internal_api.rb

+      #     { arg1 => :memoized_result, ... }, # For method 1
+      #     { arg1 => :memoized_result, ... }, # For method 2
+      #   ]
+      # This is essentially a faster alternative to:


Can we remove "essentially" here?

jemmaissroff · 2021-08-27T19:07:15Z

lib/memo_wise/internal_api.rb

      obj.instance_variable_set(:@_memo_wise, {}) unless obj.instance_variables.include?(:@_memo_wise)
+      unless obj.instance_variables.include?(:@_memo_wise_single_argument)


I wonder if it might be worth renaming the hash to @_memo_wise_multiple_arguments already within this PR?

jemmaissroff · 2021-08-27T19:10:58Z

lib/memo_wise/internal_api.rb

+          "#{name}#{':' if type == :keyreq}"
+        end.join(", ")
+      else
+        raise ArgumentError, "Unexpected arguments for #{method.name}"


Just considering what I said re the `else` comment above, I think we should try to stay consistent with how we treat this within the codebase.

jemmaissroff · 2021-08-30T15:50:21Z

spec/reset_memo_wise_spec.rb

@@ -388,6 +388,48 @@
          expect(instance2.no_args_counter).to eq(1)
        end
      end
+
+      context "when method name is the same as a memoized class method" do


I think it makes sense to either leave a comment explaining this decision or to add similar tests to preset_memo_wise_spec, otherwise they're asymmetrical without a documented explanation for why.

jemmaissroff · 2021-09-01T13:03:22Z

lib/memo_wise/internal_api.rb

      obj.instance_variable_set(:@_memo_wise, {}) unless obj.instance_variables.include?(:@_memo_wise)
+      unless obj.instance_variables.include?(:@_memo_wise_single_argument)


That makes sense to me

jemmaissroff · 2021-09-01T13:08:18Z

lib/memo_wise/internal_api.rb

+    # @return [Integer] the array index in `@_memo_wise_single_argument` to use
+    #   to find the memoization data for the given method
+    def index(method_name)
+      indices = target_class.instance_variable_get(:@_memo_wise_indices) ||


I don't know another way to do it either

ms-ati

LGTM

JacobEvelyn mentioned this pull request Aug 27, 2021

Optimize one-arg methods #205

Closed

2 tasks

JacobEvelyn force-pushed the array-optimization-3 branch 3 times, most recently from c4db0b7 to 3fa4cf1 Compare August 27, 2021 15:34

JacobEvelyn commented Aug 27, 2021

JacobEvelyn commented Aug 27, 2021

JacobEvelyn force-pushed the array-optimization-3 branch from 3fa4cf1 to 1ebcc82 Compare August 27, 2021 15:46

JacobEvelyn commented Aug 27, 2021

JacobEvelyn commented Aug 27, 2021

jemmaissroff suggested changes Aug 30, 2021

jemmaissroff approved these changes Sep 1, 2021

jemmaissroff approved these changes Sep 10, 2021

JacobEvelyn force-pushed the array-optimization-3 branch from bf94391 to 094ae58 Compare September 14, 2021 21:38

ms-ati approved these changes Sep 22, 2021

Optimize single-argument methods

6bab0b0

JacobEvelyn force-pushed the array-optimization-3 branch from 17b4020 to 6bab0b0 Compare September 22, 2021 21:40

JacobEvelyn merged commit 422de9e into main Sep 22, 2021

JacobEvelyn deleted the array-optimization-3 branch September 22, 2021 21:51

JacobEvelyn mentioned this pull request Oct 19, 2021

Explore using array as base data structure instead of hash #191

Closed
