Chain concurrent gets #73

Closed · kpdecker opened this issue Sep 3, 2014 · 8 comments · Fixed by #79
Labels: bug

@kpdecker (Contributor) commented Sep 3, 2014

Ideally the framework should monitor which get requests are still in flight and chain any duplicate calls onto the pending one to reduce upstream load. For example:

get('foo', callback1)
get('foo', callback2)
(callback1 called)
(callback2 called)

Should only call the relevant engine method once.

@hueniverse hueniverse added the request label Sep 5, 2014
@hueniverse (Member) commented Sep 5, 2014

I assume you mean for only getOrGenerate?

@kpdecker (Contributor, author) commented Sep 5, 2014

That is our primary use case but the get primitive seems like it could benefit from this behavior as well.

@chapel commented Sep 5, 2014

We have thought about this as well, but it would require having an internal memory cache in catbox itself, and it would break down across the various engine modules.

As an aside, I could see something like CLS being used for this, where after the first call, any subsequent calls in the same call stack/chain could use the in-memory cached version. With that said, I actually think this would be better served as a module that wraps catbox to provide this functionality, and to let catbox itself focus on what it needs to do.

@kpdecker (Contributor, author) commented Sep 5, 2014

How does this require an in-memory cache? You just need a list of pending requests, basically a hash of cache keys and an array of callbacks to call once the data is available. This data need not be kept around any longer than the duration of the request. If another get comes in while there are no pending operations then the cycle is restarted.

My end goal is that this behavior occurs in hapi. I don't particularly care where the logic is implemented, although catbox would be easier for the cases where I use catbox directly.
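The pending-request bookkeeping described above might be sketched like this (a minimal illustration only: `makeCoalescedGet` and the `engine.get(key, callback)` shape are assumptions for the sketch, not catbox's actual API):

```javascript
// Sketch of request coalescing: a hash of cache keys to arrays of
// waiting callbacks, kept only while a fetch is in flight. When the
// engine answers, every waiter is called and the entry is dropped, so
// a later get for the same key starts a fresh cycle.
function makeCoalescedGet(engine) {

    const pending = {};                         // key -> callbacks awaiting the in-flight fetch

    return function coalescedGet(key, callback) {

        if (pending[key]) {
            pending[key].push(callback);        // piggyback on the in-flight request
            return;
        }

        pending[key] = [callback];
        engine.get(key, (err, value) => {

            const waiters = pending[key];
            delete pending[key];                // next get restarts the cycle
            for (const cb of waiters) {
                cb(err, value);
            }
        });
    };
}
```

With this in place, two concurrent `coalescedGet('foo', …)` calls result in a single `engine.get('foo', …)` hitting the upstream store, and both callbacks receive that one result.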

@chapel commented Sep 5, 2014

So even in that case you have the problem of different instances of catbox. They wouldn't be able to share the result since they wouldn't know about each other.

@kpdecker (Contributor, author) commented Sep 5, 2014

I don't think that different instances of catbox can assume that different cache values are the same for a given key. Even when they are, this is a performance optimization whose goal is to reduce overhead where possible, not to reduce all unnecessary calls to whatever the engine speaks to. 1 cache request per catbox instance at any given time is much nicer on the upstream system than 1 cache request per server request.

@hueniverse (Member) commented Sep 5, 2014

I don't want to touch get(). The whole point of the cache engine is to do this really fast. This only makes sense for getOrGenerate() since that's where you have a costly generate method that you want to avoid calling twice.
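Coalescing only the generation step might be sketched as follows (illustrative only: the `cache`/`generate` signatures here are assumptions for the sketch, not catbox's real interface). A cache hit returns immediately with no coalescing; on a miss, only the first caller runs the costly generate function and concurrent callers for the same key wait for that single result:

```javascript
// Sketch of getOrGenerate()-style coalescing: the engine get() stays
// untouched and fast; only concurrent invocations of the expensive
// generate function are deduplicated per key.
function makeGetOrGenerate(cache) {

    const pendingGenerate = {};             // key -> callbacks waiting on a running generate

    return function getOrGenerate(key, generate, callback) {

        cache.get(key, (err, cached) => {

            if (!err && cached !== undefined) {
                return callback(null, cached);          // cache hit: answer directly
            }

            if (pendingGenerate[key]) {
                pendingGenerate[key].push(callback);    // generate already running for this key
                return;
            }

            pendingGenerate[key] = [callback];
            generate((genErr, value) => {

                if (!genErr) {
                    cache.set(key, value);              // store for later hits
                }

                const waiters = pendingGenerate[key];
                delete pendingGenerate[key];
                for (const cb of waiters) {
                    cb(genErr, value);
                }
            });
        });
    };
}
```

This also addresses the race-condition concern raised later in the thread: because only one generate runs per key at a time, concurrent misses cannot write conflicting values.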

@hueniverse hueniverse added bug and removed request labels Sep 6, 2014
@hueniverse hueniverse added this to the 3.2.0 milestone Sep 6, 2014
@hueniverse hueniverse self-assigned this Sep 6, 2014
@hueniverse (Member) commented Sep 6, 2014

I've decided this is actually a bug. Best case, it just causes duplicated effort. Worst case, it generates conflicting values at the same time via a race condition.
