Make validation optional #95
Comments
@Tagl I have definitely thought about the performance impact of the validation, and I do like the idea of adding an option to skip the validation. However, since the validation check is more of an implementation behavior rather than a property that is intrinsic to the automaton (per the theory), I would prefer not to make this an input parameter to Rather, I think it would be better to make this a global setting as part of a new base module (see below). We could (and maybe should) also offer an option to disable automaton immutability. Therefore, perhaps we could introduce an
The @eliotwrobson What do you think about this? |
A global config seems like a nice approach. The unnecessary freezing was avoided by just passing in frozensets and frozendicts instead of sets and dicts, something the user already has control over. I think optional immutability might complicate implementations that rely on immutability. Yes, |
@caleb531 I also like the global config approach, that's a nice way to give users control over what's going on under the hood if necessary. And I agree with @Tagl , changing the freezing behavior using the global config could get confusing and it isn't a big deal as long as the user is providing frozensets in their input. I suppose we could do it just for fun to see if anyone uses it, but I think it's very unlikely to be something people are really taking advantage of in any substantial way. And yes, the Also @Tagl out of curiosity, what is your code doing exactly? Even large artificial workloads I was using didn't have that many operations. |
Examining permutation classes and determining whether they have an infinite or finite number of simple permutations.
@Tagl Isn't there still a performance impact, though? Because I am still recursively processing the transitions, etc. regardless of the type of iterable you pass in, even if it's a |
@caleb531 I think the issue persists even without the freezing, as you make a copy of all of the sets / dicts given as input, instead of just keeping aliases to them (even though the copies are themselves mutable). It would be great if there were a way in Python to transfer ownership of a data structure to the class (something move-constructor-esque, like in C++). If you want to allow for this though, you could add a global flag that turns off copying these data structures with the burden on the user not to modify them anymore. As an aside, I think that the recursive processing doesn't quite work as intended. For example, I ran into issues trying to pass in an iterable directly as the set of |
Tuples, frozensets and frozendicts are just passed through as objects (by reference) since |
That makes sense actually, the automaton constructor doesn't need to make a copy because the given structure itself is immutable (just needs a reference to the immutable data). In that case it probably doesn't make sense to have a flag that controls freezing or whether the constructors make copies, just inform the user they should pass in things as frozensets (or iterables) when possible. |
@eliotwrobson @Tagl The fact that my However, it does raise an interesting question. Before we introduced immutability, every input parameter was copied so that pass-by-reference wouldn't get in the way (e.g. if you passed the same transitions dict to two different NFA constructors, However, we have an inconsistency in this new immutability-based implementation where some input params are copied (e.g. if you pass a And what if the user passes a custom object (which doesn't inherit from Therefore, because of all this, I'm actually starting to wonder if we should strip out Of course, the above would mean we just wasted all that time refactoring the tests to not mutate the automaton's attributes (e.g. What do you guys think? |
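The pass-by-reference concern raised in this comment can be illustrated with a small sketch. `Machine` and its `copy_input` parameter are hypothetical stand-ins invented for this example, not part of the library; the class simply stores its transitions either as a copy or as an alias to the caller's dict.

```python
import copy


class Machine:
    """Hypothetical stand-in for an automaton class (not the library's API)."""

    def __init__(self, transitions, copy_input=True):
        # Copying protects the instance from later changes to the caller's
        # dict; keeping an alias is cheaper but shares state with the caller.
        self.transitions = copy.deepcopy(transitions) if copy_input else transitions


shared = {"q0": {"a": {"q1"}}}

aliased_1 = Machine(shared, copy_input=False)
aliased_2 = Machine(shared, copy_input=False)
copied = Machine(shared, copy_input=True)

# Mutating the caller's dict silently changes both aliased machines...
shared["q0"]["a"].add("q2")

print(aliased_1.transitions["q0"]["a"] == {"q1", "q2"})  # True
print(aliased_2.transitions is aliased_1.transitions)    # True
# ...while the copied machine keeps the transitions it was built with.
print(copied.transitions["q0"]["a"] == {"q1"})           # True
```

The trade-off is the one discussed in the thread: copying costs time on every construction, while aliasing is only safe when the input cannot change afterwards.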
I'm kind of in favor of keeping things as they are. Enforcing the immutability (in my mind) is less about the ability to hash the automatons and more about removing a way of accidentally modifying / breaking the language they represent somewhere in your code. I'm actually ok leaving the freezing behavior as-is (even though there's some inconsistency with the way that references get passed in). I'm not sure what the standard behavior for objects like this is, but it's generally hard (impossible) to prevent users from doing things they aren't supposed to in Python (even with current safeguards, the user can just call the object setattr function themselves and make changes). Also, I think that even using the copy module, immutable objects simply return references to themselves, since it makes no difference for the end user whether a shallow copy or a reference to the original is returned for the final object: https://stackoverflow.com/questions/37100944/i-think-immutable-types-like-frozenset-and-tuple-not-actually-copied-what-is-th Although it's technically an inconsistency to use the same reference to the given input sets, these are really under-the-hood performance details that don't make a huge difference to the end user in terms of functionality. The main benefit is giving people fewer ways of accidentally causing problems for themselves.
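The claim about the copy module is easy to check in an interpreter. The sketch below uses only the standard library; the constructor behavior in the last two assertions is a CPython implementation detail rather than a language guarantee.

```python
import copy

mutable = {1, 2, 3}
frozen = frozenset(mutable)
pair = (1, 2)

# Shallow-copying a mutable set creates a new, equal object.
assert copy.copy(mutable) == mutable
assert copy.copy(mutable) is not mutable

# For immutable built-ins, copy.copy returns the very same object.
assert copy.copy(frozen) is frozen
assert copy.copy(pair) is pair

# In CPython, calling the constructor on an exact frozenset or tuple
# also returns the original object instead of building a new one.
assert frozenset(frozen) is frozen
assert tuple(pair) is pair
```

This is why passing frozensets into a constructor that freezes its inputs avoids the copying cost: there is nothing left to copy.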
@eliotwrobson You have some very good points. And I agree with keeping immutability from the validation perspective. Running a quick test in the Python interpreter, I did also confirm that the
As far as this issue goes, I think we've established that we should not add a global configuration setting for either immutability or copying (and, as you said, just recommend passing in frozensets/frozendicts). So the only configuration setting we would have at this point is to globally disable validation, correct?
@caleb531 Yes, based on the discussion it seems like the only global config that makes sense is for the validation.
I agree, keep immutability. The recommendation to pass in frozensets/frozendicts could be noted in the documentation as a performance tip. A flag for validation is necessary in my opinion, as validation is one of the most demanding computations and is mainly a quality-of-life check when constructing the automata.
@Tagl @eliotwrobson Great, thank you for the feedback. I already have a separate branch live for the global configuration, and will submit a PR for that within the next few days.
Closing this ticket since the recent v7 release has added this configuration functionality that enables optional validation. |
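For readers who want to see the general shape of the approach agreed on above, here is a minimal sketch of a module-level flag that gates validation. The names `config`, `should_validate`, and `TinyDFA` are hypothetical and chosen only for illustration; the library's actual configuration module and attribute names may differ, so check its documentation.

```python
from types import SimpleNamespace

# Hypothetical global settings object (an assumption for this sketch,
# not the library's real configuration module).
config = SimpleNamespace(should_validate=True)


class TinyDFA:
    """Toy automaton used only to show where the flag is consulted."""

    def __init__(self, states, initial_state, final_states):
        self.states = frozenset(states)
        self.initial_state = initial_state
        self.final_states = frozenset(final_states)
        # Validation runs only while the global flag is enabled.
        if config.should_validate:
            self.validate()

    def validate(self):
        if self.initial_state not in self.states:
            raise ValueError("initial state is not a known state")
        if not self.final_states <= self.states:
            raise ValueError("final states must be a subset of states")


# With validation on (the default), an invalid automaton is rejected.
try:
    TinyDFA({"q0"}, "q9", set())
    raised = False
except ValueError:
    raised = True

# With validation off, the same construction skips the check entirely.
config.should_validate = False
unchecked = TinyDFA({"q0"}, "q9", set())
config.should_validate = True  # restore the default
```

Keeping the switch global rather than a constructor argument matches the reasoning in the thread: validation is an implementation behavior, not a property of the automaton itself.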
I was trying to optimize some code of mine using the library (billions of DFA operations) so I ran a profiler and saw a lot of time was spent doing unions, specifically in PartitionRefinement.refine, cross product graph creation and then what surprised me was the `__init__` function. And half of the time was due to validation. I think validation should be optional, so the user could pass in `validate=False` as a keyword argument when the user is 100% sure that the construction is valid. Thoughts on this?