add a coerce_collections key_factory for @memoized to reduce boilerplate #7127

Conversation

cosmicexplorer (Contributor)

Problem

I accidentally left a TODO in a previous PR to allow @memoized_method to coerce lists into tuples instead of erroring out (because lists aren't hashable), and left it again in #7126.

It would be nice not to have to make things tuples everywhere (weird and unclear syntax, imho) just so we can memoize them.

Solution

  • Introduce coercing_arg_normalizer to apply a dict of type coercions to the arguments used by a key_factory.
  • Introduce coerce_collections and coerce_collections_per_instance as key_factory implementations for @memoized which coerce list and OrderedSet inputs to tuples so they can be hashed (a rough sketch follows below).
  • Add testing.
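A minimal sketch of what this could look like, pieced together from the diff fragments quoted in the review comments below; the _coerce_value helper, the docstring bodies, and the OrderedSet import path are assumptions, while coercing_arg_normalizer's signature, collection_coercions, the tuple return, and equal_args (the existing key factory defined earlier in this file) come from the PR:

from twitter.common.collections import OrderedSet  # import path assumed


def _coerce_value(value, type_coercions):
  # Apply the first matching coercion (e.g. list -> tuple); otherwise return the value unchanged.
  for src_type, dst_type in type_coercions.items():
    if isinstance(value, src_type):
      return dst_type(value)
  return value


def coercing_arg_normalizer(type_coercions, base_normalizer, args, kwargs):
  """Build a memoization key with base_normalizer, coercing each entry per type_coercions.

  :rtype: tuple
  """
  args_key = base_normalizer(*args, **kwargs)
  coerced = [_coerce_value(v, type_coercions) for v in args_key]
  return tuple(coerced)


collection_coercions = {
  list: tuple,
  OrderedSet: tuple,
}


def coerce_collections(*args, **kwargs):
  """A key_factory for @memoized that hashes list and OrderedSet arguments as tuples."""
  return coercing_arg_normalizer(collection_coercions, equal_args, args, kwargs)

# coerce_collections_per_instance would follow the same pattern with the per-instance key factory.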

Result

Some boilerplate is removed when interacting with functions or instance methods decorated with forms of @memoized.
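To make the before/after concrete, here is a hypothetical call site (total, total_coerced, and some_list are illustrative names, and the module path in the import is assumed; the @memoized(key_factory=...) parameterization is as described in this PR's title):

from pants.util.memo import memoized  # module path assumed

some_list = [1, 2, 3]

# Without coercion, list arguments must be wrapped so the memoization key is hashable:
@memoized
def total(xs):
  return sum(xs)

total(tuple(some_list))

# With the new key_factory, callers can pass the list (or an OrderedSet) directly:
@memoized(key_factory=coerce_collections)
def total_coerced(xs):
  return sum(xs)

total_coerced(some_list)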

benjyw (Contributor) commented Jan 23, 2019

I think @jsirois knows the most about this code.

cosmicexplorer (Contributor, Author)

Thanks, added! I definitely thought I had already clicked that, but I think a lot of things.

cosmicexplorer (Contributor, Author) commented Jan 23, 2019

The only CI failures are in building the Linux and OSX wheels, which are also failing in other PRs, so this is reviewable (I will look into why that is happening, though perhaps tomorrow).

jsirois (Member) left a comment:

It would be nice to actually see one real result as described in this PR - is there one instance of boilerplate that can be removed?

self.assertEqual(6, g(OrderedSet(x)))
self.assertEqual(2, g_called._calls)

class C(self._Called):
jsirois (Member): This should break out into its own test method; it's testing a different function.

def g(self, x):
  self._called()
  return sum(x)

c = C()
jsirois (Member): The key_factory is _per_instance but you only ever make one instance. It would be good to make two and confirm each instance carries its own independent cache.
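A sketch of the kind of test being asked for, meant as a sibling of the existing test above; it assumes self._Called records calls in a per-instance _calls counter and that @memoized_method accepts a key_factory argument (both assumptions about the surrounding helpers):

def test_memoized_method_coerce_collections_per_instance(self):
  class C(self._Called):
    @memoized_method(key_factory=coerce_collections_per_instance)
    def g(self, x):
      self._called()
      return sum(x)

  c1, c2 = C(), C()
  self.assertEqual(6, c1.g([1, 2, 3]))
  self.assertEqual(6, c1.g([1, 2, 3]))  # same instance, same coerced args: cache hit
  self.assertEqual(1, c1._calls)
  self.assertEqual(6, c2.g([1, 2, 3]))  # a second instance should carry its own cache
  self.assertEqual(1, c2._calls)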

return tuple(coerced)


collection_coercions = {
jsirois (Member): Maybe _COLLECTION_COERCIONS - this serves in the private constant role IIUC.

@@ -24,6 +26,44 @@ def equal_args(*args, **kwargs):
return key


def coercing_arg_normalizer(type_coercions, base_normalizer, args, kwargs):
jsirois (Member): Seems this can be _private; fwict you only intend to export coerce_collections and coerce_collections_per_instance.


:rtype: tuple
"""
args_key = base_normalizer(*args, **kwargs)
jsirois (Member): There is no contract whatsoever on what shape the output cache key takes; it could be a cryptographic-grade hash value. Normalization should really apply to *args and **kwargs, with the transformed versions of those passed to the base_normalizer. As things stand you're presenting a pretend API that actually only works with equal_args et al.

One option is to not pretend this is a generic API: you could take a bool flag instead of base_normalizer, say per_instance. If True, pick per_instance here, and otherwise use equal_args. By reparameterizing in that way you could leave the code here as you have it and couple to the knowledge of the output shape of those key factories.
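A sketch of the reshape being suggested: coerce the raw *args/**kwargs and hand the coerced values to a known base key factory, rather than transforming that factory's opaque output. It reuses _coerce_value and collection_coercions from the earlier sketch; equal_args and per_instance are the existing key factories this comment names, and everything else is an assumption about how the code could look:

def _coerce_args(type_coercions, args, kwargs):
  # Normalize the raw inputs, not the finished key, so the shape of the base
  # key factory's output never needs to be known here.
  coerced_args = tuple(_coerce_value(a, type_coercions) for a in args)
  coerced_kwargs = {k: _coerce_value(v, type_coercions) for k, v in kwargs.items()}
  return coerced_args, coerced_kwargs


def coerce_collections(*args, **kwargs):
  """Key on equal_args over coerced arguments."""
  coerced_args, coerced_kwargs = _coerce_args(collection_coercions, args, kwargs)
  return equal_args(*coerced_args, **coerced_kwargs)


def coerce_collections_per_instance(*args, **kwargs):
  """Key on per_instance over coerced arguments."""
  coerced_args, coerced_kwargs = _coerce_args(collection_coercions, args, kwargs)
  return per_instance(*coerced_args, **coerced_kwargs)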

jsirois (Member), on the same coercing_arg_normalizer definition: It would be wonderful to keep consistent terminology in a single file: base_normalizer is elsewhere key_factory. Maybe coercing_key_factory and base_key_factory (or underlying), or else rename everything related in this file to *normalizer.

cosmicexplorer (Contributor, Author) commented Jan 24, 2019

is there one instance of boilerplate that can be removed?

Not in this PR (though I will look for one), but in #7126: there's a TODO that mentions this issue, and the boilerplate occurs here:

run_before_synthesized_task_types = self._synthesize_task_types(tuple(self._run_before_task_types))
python_create_distributions_task_type = self._testing_task_type
run_after_synthesized_task_types = self._synthesize_task_types(tuple(self._run_after_task_types))

After having refactored that code since it was first written, the @classproperty values could be declared as tuples, as we do with e.g. subsystem_dependencies(). (In a previous iteration of that PR this was not the case, and the memoization was needed for correctness because we were calling _synthesize_task_types() multiple times in the same method, which is what motivated this separate change.) The rationale for this change is that the collection type (tuple) required for subclasses to implement e.g. _run_before_task_types takes on an arbitrary importance due to the implementation detail of memoization, which is confusing; that is much more the reason than "weird syntax".

I will address your inline comments and look for another example of boilerplate that may make this more convincing today.

Eric-Arellano (Contributor)

Closing due to being stale and review feedback that this would benefit from a use case being included in the PR. Feel free to reopen, of course, if you still have a need for the feature.
