Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maybe a problem during finalization #8

Closed
amirouche opened this issue Aug 16, 2020 · 6 comments · Fixed by #10
Closed

Maybe a problem during finalization #8

amirouche opened this issue Aug 16, 2020 · 6 comments · Fixed by #10
Assignees

Comments

@amirouche
Copy link
Contributor

amirouche commented Aug 16, 2020

In the search_lss method:

suffix = state.longest_strict_suffix
if suffix.longest_strict_suffix is None:
self.search_lss(suffix)
for symbol, next_state in suffix.transitions.items():
if (symbol not in state.transitions and
suffix != self._zero_state):
state.transitions[symbol] = next_state

The line before the last is strange: suffix != self._zero_state that test can be done before we enter the loop.

So there might be some performance to gain during finalization.

Let me know what you think 🙂

@FrederikP
Copy link
Collaborator

The test can certainly be done before. I will take closer look when I get a chance.
Thanks for pointing that out.

@FrederikP FrederikP self-assigned this Aug 19, 2020
FrederikP added a commit that referenced this issue Aug 19, 2020
@FrederikP FrederikP linked a pull request Aug 19, 2020 that will close this issue
@FrederikP
Copy link
Collaborator

@amirouche Thanks for finding this. I must've overlooked it during some refactoring or so. As you can see in #10 it really speeds up setup considerably.
I'll do some more performance tests and release a new version soon.

@amirouche
Copy link
Contributor Author

By the way, this is not a performance improvement but a remark about the following:

if state.longest_strict_suffix is None:
parent = state.parent
traversed = parent.longest_strict_suffix
while True:
if state.symbol in traversed.transitions and\
traversed.transitions[state.symbol] != state:
state.longest_strict_suffix =\
traversed.transitions[state.symbol]
break
elif traversed == self._zero_state:
state.longest_strict_suffix = self._zero_state
break
else:
traversed = traversed.longest_strict_suffix
suffix = state.longest_strict_suffix
if suffix.longest_strict_suffix is None:
self.search_lss(suffix)

The test if suffix.longest_strict_suffix is None: is done at the beginning and at the end of the snippet before the call. That will save a python call. So maybe it is a benefit.

Thanks a lot for sharing this library 👍

@amirouche
Copy link
Contributor Author

Last thing, in my implementation I replaced the list to_process with a set, instead of append I use the equivalent of set.add. What I noticed, is that there a few calls less to search_lss (still my implementation is less fast). I am wondering if it does matter to use a set or list in that place?

@FrederikP
Copy link
Collaborator

  1. Concerning the first comment: True, I will remove one of those occurrences. It doesn't change performance much, but the code is a little less ugly. Thanks again 👍
  2. If I remember correctly I had a set there as well, but for performance reasons switched to using a list. It's not really the correct datatype from a logical standpoint, but it's just an optimization decision.

@FrederikP
Copy link
Collaborator

Released performance fix with 1.6.0

Thanks for contributing with such detailed suggestions @amirouche

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants