
Cache subscribers list to improve performance #1320

Closed
4 tasks
johnsimons opened this issue Jul 18, 2013 · 16 comments

Comments

@johnsimons
Member

Background

At the moment we are querying the storage every time we publish a message to retrieve the list of subscribers.
But this list does not actually change that frequently once the endpoint is started.
So it is a good candidate for some good old caching 😉

Proposed Solution

  • Have a background thread that keeps a cached list of subscribers in sync
  • Make this polling configurable with something like:
<appSettings>
   <add key="NServiceBus/Publisher/SubscribersListPollingInterval" value="# of seconds"/>
</appSettings>
  • Make the default polling interval 60 seconds?
  • At start-up we may need to poll more frequently because we may still be receiving subscribe requests (this is especially true while debugging the pub/sub sample).
    So maybe we poll every 5 seconds for the first minute and then switch to the configured polling interval.
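The polling scheme proposed above could be sketched roughly as follows. This is a hedged illustration only: Python is used as a neutral sketch language (the actual implementation would be C#), and the names `SubscriberCache` and `load_subscribers` are hypothetical, not NServiceBus APIs.

```python
import threading
import time

class SubscriberCache:
    """Caches the subscriber list and refreshes it on a background thread.

    `load_subscribers` stands in for the storage query (hypothetical name).
    The loop polls every `startup_interval` seconds for the first
    `startup_duration` seconds, then backs off to `polling_interval`
    (the configured value, defaulting to 60 seconds as proposed).
    """

    def __init__(self, load_subscribers, polling_interval=60,
                 startup_interval=5, startup_duration=60):
        self._load = load_subscribers
        self._polling_interval = polling_interval
        self._startup_interval = startup_interval
        self._startup_duration = startup_duration
        self._subscribers = load_subscribers()  # prime the cache at start-up
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def start(self):
        threading.Thread(target=self._poll_loop, daemon=True).start()

    def stop(self):
        self._stop.set()

    def get_subscribers(self):
        # Publish path reads from the cache instead of hitting storage.
        with self._lock:
            return list(self._subscribers)

    def _poll_loop(self):
        started = time.monotonic()
        while not self._stop.is_set():
            # Poll faster during the first minute, then use the configured interval.
            elapsed = time.monotonic() - started
            interval = (self._startup_interval
                        if elapsed < self._startup_duration
                        else self._polling_interval)
            if self._stop.wait(interval):
                break
            fresh = self._load()
            with self._lock:
                self._subscribers = fresh
```

Note that publishing only ever reads the cached list; only the background loop touches storage.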
@SimonCropp
Contributor

Another perspective: for many people the list of subscribers is effectively static and known at start-up time, so a cache-forever option would be preferable for them.

@mauroservienti
Member

When using RavenDB the server/client infrastructure caches things for us by default, and for a specific query we can use the aggressive caching feature. I also like Kijana's suggestion to use the out-of-the-box push capability of RavenDB via the Changes API.

@danielmarbach
Contributor

I'm also a fan of using the default caching of the persistence layer. NH has various caching capabilities, and, as mentioned, so does RavenDB. Don't implement it yourself.


@johnsimons
Member Author

Guys, there is nothing preventing us from using the native caches or more efficient native APIs.

For example, to make RavenDB use the new "Changes API", all you would need to do is change the current implementation.
So what I'm saying is that even though we have a background thread polling, it doesn't mean we are going to hit the storage every time; we are still just calling GetSubscriberAddressesForMessage

@andreasohlund
Member

How about adding an optional "push enabled storage" interface as well, so storages that support it can bypass the polling?


@mauroservienti
Member

Do we really need it?

As John said, the infrastructure will only call GetSubscriberAddressesForMessage (https://github.com/NServiceBus/NServiceBus/blob/master/src/NServiceBus.Core/Unicast/Subscriptions/MessageDrivenSubscriptions/ISubscriptionStorage.cs#L31), so it is an internal matter of the subscription component to decide how to cache/refresh/poll/do nothing.

Or am I missing something?
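The point that caching can live entirely inside the subscription storage, behind the same call, could be sketched as a decorator. This is a hypothetical illustration in Python, not the actual C# `ISubscriptionStorage` interface; the method and class names are made up for the sketch.

```python
class CachingSubscriptionStorage:
    """Hypothetical decorator: wraps any subscription storage and caches
    per-message-type lookups, so callers keep calling the same lookup
    method unchanged and never know caching is happening."""

    def __init__(self, inner):
        self._inner = inner
        self._cache = {}

    def get_subscriber_addresses_for_message(self, message_types):
        # Only hit the underlying storage on a cache miss.
        key = tuple(message_types)
        if key not in self._cache:
            self._cache[key] = self._inner.get_subscriber_addresses_for_message(
                message_types)
        return self._cache[key]

    def invalidate(self):
        # A push-capable storage (e.g. one using the RavenDB Changes API)
        # could call this when the underlying subscriptions change.
        self._cache.clear()
```

The framework keeps calling one method; whether that call polls, caches forever, or is refreshed by push is a private decision of the storage implementation.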


@johnsimons
Member Author

How about adding an optional "push enabled storage" interface as well, so storages that support it can bypass the polling?

Let's please not add more marker interfaces!

@johnsimons
Member Author

The only issue is that we end up having two caches: the one maintained by the persister and the one maintained by the framework. But I don't really see a problem with that.


@andreasohlund
Member

I was talking conceptually :)


@andreasohlund
Member

This was my main concern. Also the delay: even if you use Raven push, you would still have to wait for the poller. Feels like a suboptimal solution to me.


@johnsimons
Member Author

Agree 😄

@eben-roux

If I may:

For a production environment, adding subscribers and/or publishers is a non-trivial affair (as with most things in production), so the subscriber list should be 100% consistent across the various publishers. Polling introduces some blocking overhead for a couple of milliseconds every so often, along with a slight window of inconsistency (low probability).

So (manually or automated):

  1. When a new subscriber is introduced for existing events, the store should be updated and the publishers restarted. Well, the publisher for the relevant event, really :)

  2. When a new event is introduced, it stands to reason that at least one subscriber goes along with it. The subscription store is updated and the new/updated event publisher is deployed, so it starts with the consistent list.

This really shouldn't happen often, and the small overhead of ensuring 100% consistency makes caching forever worth it. The subscription management implementation used should be responsible for the caching (however it chooses to do so).

@chrisbednarski
Contributor

Is there a need for polling at all?

Unless the list of subscribers is updated outside of the NSB framework, couldn't the cached list of subscribers be retrieved once (at start-up) and then updated on each subscribe/unsubscribe message?

This would get a little more complex when using the master node/distributor model ... but it should work if the master node writes to the subscriber storage and forwards subscriptions to each worker node (for local caching)
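The event-driven alternative described above could look roughly like this. Again a hedged sketch only: Python for illustration, and the class, method, and address names are hypothetical, not NServiceBus APIs.

```python
class EventDrivenSubscriberCache:
    """Sketch of the no-polling approach: load the subscriber list once at
    start-up, then mutate the cache on each subscribe/unsubscribe control
    message instead of re-querying storage on a timer."""

    def __init__(self, initial_subscribers):
        # initial_subscribers: iterable of (message_type, address) pairs,
        # as loaded from subscription storage once at start-up.
        self._subscribers = {}  # message_type -> set of addresses
        for message_type, address in initial_subscribers:
            self._subscribers.setdefault(message_type, set()).add(address)

    def on_subscribe(self, message_type, address):
        # Called when a subscribe control message arrives.
        self._subscribers.setdefault(message_type, set()).add(address)

    def on_unsubscribe(self, message_type, address):
        # Called when an unsubscribe control message arrives.
        self._subscribers.get(message_type, set()).discard(address)

    def subscribers_for(self, message_type):
        # Publish path: a pure in-memory lookup, no storage access.
        return set(self._subscribers.get(message_type, set()))
```

As Andreas notes below, this only stays consistent if every subscribe/unsubscribe message passes through this endpoint; with a shared subscription store, updates made elsewhere would never reach the cache.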

@andreasohlund
Member

There could be other reasons for sharing storage but without the distributor in play, e.g. OldCRM and NewCRM sharing the same subscription DB. So I don't think we can assume that all subscription requests will arrive at the same endpoint at all times.


@andreasohlund
Member

Performance tests show that we're quite good (924 msg/s with MSMQ + in-memory persistence, and 724 msg/s with Raven).

This is with a local Raven, and we haven't tested SQL yet.

@andreasohlund
Member

With the NH caching in v5 and the performance of the others being good, I'd say we can close this one?

@johnsimons johnsimons modified the milestone: Backlog Oct 3, 2014