Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add word cache reset function #152

Merged
merged 13 commits into from
Jun 17, 2023
Merged

Conversation

eliotwrobson
Copy link
Collaborator

See title, adds a word cache reset function and precomputes the digraph used by the DFA. Resolves #148

In looking at this, I thought about using a cached property for some of the methods that return fixed information about the given DFAs. However, switching these to cached properties (with the decorator) requires using a slightly different API. @caleb531 what are your thoughts about changing this? Is there a way to provide this caching without changing the API? It would be great if there were a way to do this cleanly.

@coveralls
Copy link

coveralls commented Jun 17, 2023

Coverage Status

coverage: 99.883%. remained the same when pulling 5313990 on eliotwrobson:caching into 2aa2dda on caleb531:develop.

docs/fa/class-dfa.md Outdated Show resolved Hide resolved
tests/test_dfa.py Outdated Show resolved Hide resolved
automata/fa/dfa.py Outdated Show resolved Hide resolved
Copy link
Owner

@caleb531 caleb531 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eliotwrobson Overall looks good, but left you a few comments with some requested changes (mostly the name of the method itself).

Regarding preserving the API, I believe the @functools.lru_cache() decorator is the stdlib way to cache the output of methods. The only thing to note is that the method returns the cached output based on the arguments you pass to the function at call time. So foo(first='Eliot') and foo(last='Robson') will each be cache misses on the first call. But I think this is the behavior we want in most cases, anyway.

@eliotwrobson
Copy link
Collaborator Author

@caleb531 sounds good! As you pointed out, I made a mistake in the docs with the name in the first place 😅 so changing it is no problem.

Regarding caching, using lru cache here is actually a mistake: https://youtu.be/sVjtp6tGo0g

The use here would be for caching something like isempty, where there's no sense in recomputing the result each time for an immutable DFA. The canonical way of doing this as of 3.8 is with this decorator: https://docs.python.org/3/library/functools.html#functools.cached_property

However, this changes the method into a property (so it is accessed by the client code differently). This might be more natural since users may expect cached properties to behave this way, but I also don't know if changing the API this way is the right move.

@caleb531
Copy link
Owner

@eliotwrobson Ah yes, I forgot that lru_cache is problematic for methods (ironically, this is something I myself have mentioned in a previous issue, yet have apparently forgotten 😅). However, it seems that there are at least two workarounds:

  1. Implement the weakref-based decorator for caching methods (if we go this approach, I would call this decorator cached_method instead of memoized_method to achieve parity with the below package, plus I just like the name better)
  2. Require the cached_method package as an additional dependency

The package doesn't seem like much overhead, but given how small this decorator is, we could just as easily include the implementation directly in the code. I could go either way.

So this whole post of mine obviously implies that I'd still prefer members like isempty to be methods, not properties. For me, it isn't just about preserving API backwards-compatibility, but about the semantics of methods vs. properties and the flexibility that methods provide.

Methods can have additional arguments in a way that's effectively future-proof with keyword arguments; properties can't no matter what you do. And as for the semantic argument: I tend to use properties to represent nouns/things, but when I have some check or operation to perform, I use a method.

I recognize that Automaton.input_parameters is the only exception to my rule of thumb because it represents a thing (or things), not an operation. Plus, the code is pretty simple and hardly expensive.

Okay, thank you for coming to my TED talk. Please let me know your thoughts!

@eliotwrobson
Copy link
Collaborator Author

I tend to agree, having these be methods allows more flexibility and conveys these are something to be computed. In the interest of writing less code, I'm inclined to go with the adding the optional dependency. So I will do that along with the renames described.

@caleb531 caleb531 marked this pull request as draft June 17, 2023 21:29
@caleb531
Copy link
Owner

@eliotwrobson Just want to say please feel free to continue to make changes, and just convert this PR out of Draft mode when it's ready for my re-review.

@eliotwrobson
Copy link
Collaborator Author

@caleb531 looking over the changes I think this is ready to go! Flipping to open.

@eliotwrobson eliotwrobson marked this pull request as ready for review June 17, 2023 21:41
@caleb531 caleb531 self-requested a review June 17, 2023 21:42
Copy link
Owner

@caleb531 caleb531 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eliotwrobson LGTM! 👍

@eliotwrobson eliotwrobson merged commit 766090f into caleb531:develop Jun 17, 2023
5 checks passed
@caleb531
Copy link
Owner

Actually, one quick afterthought:

How will all this additional caching behave if a user enables mutable automata? I guess they can just clear the caches manually as they make changes, correct?

@eliotwrobson eliotwrobson mentioned this pull request Jun 17, 2023
@eliotwrobson
Copy link
Collaborator Author

How will all this additional caching behave if a user enables mutable automata? I guess they can just clear the caches manually as they make changes, correct?

@caleb531 yes, they can manually clear cached methods by emptying the __dict__. I don't think people will run into issues (as the API still assumes immutability even though the data structures themselves don't prevent this behavior).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants