Measuring mailbox size #383

Closed
stefansedich opened this issue Sep 9, 2014 · 13 comments

Comments

@stefansedich
Contributor

After a discussion with Aaron this morning over here: https://github.com/Aaronontheweb/akka-monitoring/issues/2

I thought it would be a good idea to bring this here. Does anyone have any ideas on how this can best be implemented in the core? I'm happy to help with the implementation if required.

Cheers
Stefan

@stefansedich
Contributor Author

As an example, this is how I am going about it at the moment, using reflection to read the non-public property. If we could expose the required properties publicly, that would be a start, unless there is a more elegant way to do this.

    protected PropertyInfo NumberOfMessagesProperty =
        typeof(Mailbox).GetProperty("NumberOfMessages", BindingFlags.Instance | BindingFlags.NonPublic);

    protected void ReportMailboxSize()
    {
        var context = Context as ActorCell;
        if (context == null)
            return;

        var mailbox = context.Mailbox;
        var numberOfMessages = (int) NumberOfMessagesProperty.GetValue(mailbox);

        Context.Gauge("mailbox", numberOfMessages);
    }

@HCanber
Contributor

HCanber commented Sep 10, 2014

I don't mind making HasMessages and NumberOfMessages public.
However, Akka on the JVM has this comment on numberOfMessages, which sounds right to me and is something we should have as well.

Should return the current number of messages held in this queue; may
always return 0 if no other value is available efficiently. Do not use
this for testing for presence of messages, use hasMessages instead.

@Aaronontheweb
Member

cc @akkadotnet/developers this plays into a larger issue that I'd like to bring up, which is adding some monitoring hooks to measure the internals of Akka.NET in production. Stuff like mailbox queue length, supervisor restarts / shutdowns, transport errors, etc... You can capture a lot of this in userspace via manual logging calls inside actors and the logging system (which is how Akka.Monitoring currently does it) but some of this data you can't really access and it's onerous to capture everything manually inside each actor.

The way this gets handled in canonical Akka is often via AspectJ and bytecode injection - .NET doesn't really have anything that can do that (PostSharp can kind of do this.)

Is there any interest in designing some of our internals to support optional monitoring hooks? The default behavior would be NoOp - no monitoring, so there shouldn't be any overhead, but I'd love to be able to instrument some of these metrics in production.

@rogeralsing
Contributor

Yes, we should add some infrastructure for monitoring imo.
Could you make a list of all the interception points you currently need?
Just so we focus effort on the points that matter and don't throw interception all over the place just because.

How is your monitoring designed right now?
Using the event stream?

@Aaronontheweb
Member

@rogeralsing some of it uses the event stream (unhandled messages and debug statistics) and others require a manual call inside Actor.OnReceive, PreStart, or PostStop inside each actor implementation.

So here's what I would do for a standard monitoring API:

Actors (by type)

  1. AroundReceive
  2. Around Post/Pre Start/Stop
  3. Supervisor directives

Mailboxes

  1. Queue length by actor type

Remoting

  1. Transport associates / disassociates
  2. Network packets
  3. Handled network errors
  4. Failure detector "heartbeat failed"

A bunch of this stuff is available inside internal classes inside the EventBus, so a third party component can't really subscribe to it without an API exposing it.

What I'd propose is to create a standard Counter/Gauge/Timer increment interface, let's call it IAkkaMetric for now, that gets updated whenever the monitoring hook is invoked. Then it's up to a third-party monitoring plugin to provide a concrete implementation of that interface and a mechanism for pushing those counter values to the monitoring service.

The IAkkaMetric instance for each counter type would be created lazily once and from there it would maintain something like an AtomicCounter internally to update its values - or it can do what Akka.Monitoring.NStatsD currently does and just write to a UDP socket. Depends on how the monitoring plugin is implemented.
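To make the proposal a bit more concrete, here is a minimal sketch of what such an interface might look like. Everything in it is an assumption for illustration: IAkkaMetric is only a working name from this thread, and the method names, NoOpMetric, InMemoryMetric, and MetricRegistry are hypothetical and not part of Akka.NET. It shows a no-op default, an implementation backed by atomic counters, and lazy one-time creation per metric name.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical interface: the name comes from this thread, the members are assumed.
public interface IAkkaMetric
{
    void Increment(long delta = 1);   // counter
    void Gauge(long value);           // gauge (last observed value)
    void Timing(TimeSpan elapsed);    // timer
}

// Default behavior: do nothing, so unmonitored systems pay almost no cost.
public sealed class NoOpMetric : IAkkaMetric
{
    public static readonly NoOpMetric Instance = new NoOpMetric();
    private NoOpMetric() { }
    public void Increment(long delta = 1) { }
    public void Gauge(long value) { }
    public void Timing(TimeSpan elapsed) { }
}

// Keeps values in memory using atomic operations; a real plugin
// might instead write each update to a UDP socket.
public sealed class InMemoryMetric : IAkkaMetric
{
    private long _count;
    private long _lastGauge;
    private long _totalTicks;

    public long Count => Interlocked.Read(ref _count);
    public long LastGauge => Interlocked.Read(ref _lastGauge);
    public TimeSpan TotalTime => TimeSpan.FromTicks(Interlocked.Read(ref _totalTicks));

    public void Increment(long delta = 1) => Interlocked.Add(ref _count, delta);
    public void Gauge(long value) => Interlocked.Exchange(ref _lastGauge, value);
    public void Timing(TimeSpan elapsed) => Interlocked.Add(ref _totalTicks, elapsed.Ticks);
}

// Creates each named metric lazily and hands back the same instance afterwards.
public sealed class MetricRegistry
{
    private readonly ConcurrentDictionary<string, IAkkaMetric> _metrics =
        new ConcurrentDictionary<string, IAkkaMetric>();
    private readonly Func<string, IAkkaMetric> _factory;

    public MetricRegistry(Func<string, IAkkaMetric> factory = null)
    {
        // Without a plugin-supplied factory, every metric is the shared no-op.
        _factory = factory ?? (_ => NoOpMetric.Instance);
    }

    // Under a race the factory may run more than once, but only one
    // instance is stored and returned for a given name.
    public IAkkaMetric Get(string name) => _metrics.GetOrAdd(name, _factory);
}
```

A monitoring plugin would pass its own factory to the registry, and the hook call sites would only ever see IAkkaMetric, so the default path stays a no-op.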

@rogeralsing
Contributor

I'm all for this, and the lifecycle events should be hooked directly into the actorcell call sites, right?
That would make all of that automatic and wouldn't require the actor to call base methods or the like.

At what point should mailbox size be measured? For each posted message?

@wiig-with-a-k

Hi, has there been any progress on this? I think it is a very important area for monitoring where messages are backing up and what's slowing down the system.

@AlbertoMonteiro

Any news?

@splitice

+1

@alexvaut

alexvaut commented Nov 7, 2018

+1

@Aaronontheweb
Member

So we support this inside Phobos, Petabridge's Enterprise DevOps Suite for Akka.NET: https://phobos.petabridge.com/articles/monitoring/configuration.html

If you set phobos.monitoring.monitor-mailbox-depth = on, we record these metrics and emit them automatically to whatever monitoring back-end you've configured.

I should note here that Phobos is a commercial product and requires a license.

@Aaronontheweb
Member

If you want to access this data yourself in Akka.NET today, there's a way to do it but it requires getting access to some of the "internal," yet publicly exposed APIs:

// from within an actor
var count = Context.AsInstanceOf<ActorCell>().Mailbox.MessageQueue.Count;

I think that should allow you to get access to the number, but note that on really high-throughput actors this may have a performance impact, as I believe accessing the Count locks the underlying queue temporarily. In the majority of cases that should be fine; you'll only really see something come up if your actors are processing messages in the millions / second range. Nevertheless, that's why we have this setting turned off by default inside Phobos.
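If the cost of reading the count is a concern, one possible mitigation (an assumption on my part, not something Akka.NET or Phobos is stated to do) is to read it only once every N processed messages. The sketch below is self-contained and does not reference Akka.NET; SampledGauge and its parameters are hypothetical names. The `read` delegate could wrap the count expression above, and `report` could forward to whatever gauge call your monitoring library provides.

```csharp
using System;

// Hypothetical helper: reads a value and reports it only every Nth call.
// Intended to be called from inside a single actor, so it is not thread-safe.
public sealed class SampledGauge
{
    private readonly Func<int> _read;      // e.g. a delegate returning the mailbox count
    private readonly Action<int> _report;  // e.g. a delegate that emits a gauge
    private readonly int _every;
    private int _seen;

    public SampledGauge(Func<int> read, Action<int> report, int every)
    {
        if (every < 1) throw new ArgumentOutOfRangeException(nameof(every));
        _read = read ?? throw new ArgumentNullException(nameof(read));
        _report = report ?? throw new ArgumentNullException(nameof(report));
        _every = every;
    }

    // Call once per processed message; the value is read on every Nth call only.
    public void Tick()
    {
        if (++_seen < _every) return;
        _seen = 0;
        _report(_read());
    }
}
```

With `every` set to 1000, for example, the count would be read once per thousand messages, trading freshness of the metric for less contention on the queue.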

@alexvaut

alexvaut commented Nov 7, 2018

Thanks, I will try that. We don't handle millions of events per second, but we do load the machine heavily with other processes that can consume a lot of CPU...
