Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memoize convert2RegExp #1089

Closed
erikvold opened this issue Mar 17, 2010 · 7 comments
Closed

Memoize convert2RegExp #1089

erikvold opened this issue Mar 17, 2010 · 7 comments
Milestone

Comments

@erikvold
Copy link
Contributor

Anthony brought this issue to my attention in issue 1088 and I agree it is a good idea.

@erikvold
Copy link
Contributor Author

Here is a commit I wrote to memoize convert2RegExp on my improved-convert2RegExp branch.

This is blazing fast btw.

@arantius
Copy link
Collaborator

arantius commented Apr 2, 2010

commit 9beb729

@erikvold
Copy link
Contributor Author

erikvold commented Apr 2, 2010

I thought about a limit to the memoization of convert2RegExp, but with limit implemented this way, when I have > limit unique @includes & @excludes, and since they will all be executed in the same order, over and over, this will render the memoization meaningless, and it will simply waste cpu + memory.

This memoization will be fine to use with GM_isGreasemonkeyable, but not convert2RegExp in this context.

The other way to implement limit is so that the first 1000 (or w/e) results are cached and once the cache is full no more results are cached. But I don't think this makes sense in convert2RegExp's case, because this will mean that on every page load reg exp's that are not already in memory, are created, put into memory (so the memory is used any how), then removed, and this is repeated for every page load.

@arantius
Copy link
Collaborator

arantius commented Apr 2, 2010

The idea behind limit is that I don't want to create a memory leak. If we memoize GM_isGreasemonkeyable (which I did) or anything else that applies to i.e. every page loaded, it has the potential to grow arbitrarily large. If the cache never expires, memory gets eaten up.

Perhaps a larger limit is a good enough compromise? (The average # of include&exclude rules per script is 2.5. Even at just limit=1000, and assuming all patterns are completely distinct, that means a user needs 400 scripts before this is a problem.)

@erikvold
Copy link
Contributor Author

erikvold commented Apr 3, 2010

memoizing GM_isGreasemonkeyable with a limit and using shift() makes sense in GM_isGreasemonkeyable's case. Using a limit and shift() does not make sense in convert2RegExp's case, because it is a memory leak in that case (and also a burden on the cpu).

I suppose if we implement a limit such that when the limit is reached the cache does not change (for convert2RegExp's case), that will be alright, can we make it a configurable limit though?.

@erikvold
Copy link
Contributor Author

erikvold commented Apr 3, 2010

After thinking about it, I don't think memoizing GM_isGreasemonkeyable is a good idea, because I figure 90% of the time a avg user uses GM_isGreasemonkeyable it will be for a new url.

@erikvold
Copy link
Contributor Author

erikvold commented Apr 3, 2010

Hmm,

I thought I read somewhere that if I create two reg exp that are exactly the same, then two pieces of memory would not be taken in order to store the two expressions, only one piece. I wish I could verify if this is true or not..

If it is true though, then webmonkey seems to have a solution which I think is perfect, which is to build the regexps once when their Script is created, then access that directly when testing against urls. So the regexps are created once and cached by default, then convert2RegExp isn't needed until a scripts @include/@exclude list changes, or when a new user script is created.

In GM's case, for convert2RegExp I don't think that there should be a limit, and to avoid memory leaks when removing a user script, or updating one, the specific @includes/@excludes that are being removed should be removed from the cache, a lazy fix can be to wipe the cache when these events occur.

Photodeus pushed a commit to Photodeus/greasemonkey that referenced this issue Nov 22, 2012
* Define a general purpose memoize decorator, use it for GM_isGreasemonkeyable() and convert2RegExp().
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants