-
-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run DELETE on expired session based on a timed check instead of a random factor. #13
Conversation
@voxpelli do you have any feedback on this, if it's possible to merge or not ? Thanks. |
Haven't had time to look at and think about this yet unfortunately. Am I understanding it correctly that the purpose is to lessen the performance impact the random delete might cause? Or are there any other purpose for choosing this new option rather than the old one? If we are to support both we should probably explain in the Readme when one might want to chose this option rather than the standard one so that it's easy for new users to figure out. |
I've already updated the readme to reflect this option and difference between one or the other method. Default behavior is keeping random check for retro-compatibility. I've simply added the options to use a timed check if preferred. The timed check is indeed to lessen the number of delete queries, because lots of requests means more checks, and is not relevant on elapsed time, on which expiry is based |
Instead of internalising different approaches for when to purge expired sessions, as different projects can have varying needs, it might be simpler to just make the "should we purge"-logic extensible: hugowetterberg@1f5e6b0 |
What your suggesting is indeed a good approach for flexibility. Here my reflection on all this: I think the question should be "why should it be an option?". We are talking about the "garbage collection" of elapsed session (since the SELECT makes sure that you never retrieve a timed out session). I don't see the need to cleanup the same data ~5000 times because there is 10000 requests at the same moment; especially because we are comparing a TIMED value. The first request would do the garbage collection for all elapsed sessions, and the 4999 other requests would run for nothing because time hasn't changed. It's an extreme example, but it's to make my point on the fact that the check should be only timed, because we are comparing time values. Providing an option to have your own custom garbage collection is a nice to have, but I can't see when it could be useful to do this differently because the "need" represented here doesn't vary. Garbage collection and that's it. The only option should be to set a delay on which you want garbage collection to occur, in case you have a lot of short sessions occurring in your program that leaves a lot of garbage data in a short period of time. Does any of that makes sense ? Thanks for your input ! |
I think @paquettea's analysis is solid here. I see no reason to issue a DELETE statement more frequently than some minimum threshold (perhaps 1 minute). I also see no reason this particularly needs to be tied to an incoming request arriving. I think basically a |
+1 on @paquettea solution |
I will update my PR soon to use setInterval. |
OK great. FYI a simple configuration optional setting of the interval value
with a default of 1 minute will probably get it done.
|
+1 with caveat: if the aim isn't flexibility it's probably better to just replace the existing purge logic entirely instead of supporting both. Can't see any real benefits with multiple solutions. |
And that was not a "Bohoo!" over my solution not getting traction ;) I just don't see the value proposition in keeping the "random" method. |
haha! don't worry, I understood the right intention with the first comment :) here is the (final ?) solution I propose : |
along with paquettea@044587e |
fixed self this closure issue in delete query callback
ran the last version in my own project and seems to do the trick |
I agree that this solution is a better one (less arbitrary as #15 points out). I've created a new branch where I've pulled it in and where I've done some alterations to it to match the coding style of the rest a bit more: https://github.com/voxpelli/node-connect-pg-simple/tree/session-pruning Changes I've done on top of the code from here (which I squashed and pulled in its entirety):
I also added another branch on top of this branch which added a full-blown linting setup, using the Lintlovin Grunt boilerplate, and that will be connected to Travis CI once the branch for this PR has been merged. Any feedback on the session-pruning branch before I merge it? Will try to merge and release it later during this weekend. |
That's an even better implementation. looking forward for the merge of session-pruning branch ! :) |
We might want to at least consider and document the implications for this
in a clustered/multihost deployment. I don't think there's any harm in
running the DELETE sql too frequently but it's worth consideration I think.
|
Looking at it now, it seems like there's no index on the expire column by default (this module wasn't originally intended for larger scale deployments) – so if anyone wants to do a follow up PR which adds an index to it then there shouldn't be any harm in running the delete-statement often. I'll make sure to mention something about the implications in the changelog entry. |
I now realize that this is going to be a breaking change as it will require additional code to not block graceful shutdowns. By default it will now prune session tokens infinitely no matter if the rest of the app is closed or not while previously it wouldn't do anything unless the rest of the app told it to, which meant it would shutdown the moment the rest of the app shutdown. I'm therefore going to add a |
Released now as |
Since the random method is based on a random factor match per request, more requests means more expiry checks in the database. Since expiry is based on time, I though it would make sense to base the checks frequency on a timed delay instead. This was added as an option, defaulting to the original random factor checks. I've also updated the readme to reflect my changes.