Generalized runtime backpressure #2264

Merged: 72 commits, Nov 17, 2017

Conversation

5 participants
SeanTAllen (Member) commented Oct 9, 2017

This is a first draft of generalized runtime backpressure. A final version would require changes to TCPConnection and anything else that can become "overloaded" and would need to exert backpressure based on external conditions (such as a slow receiver).

SeanTAllen requested a review from sylvanc Oct 9, 2017

Praetonus (Member) commented Oct 9, 2017

Nice! I've left some comments on implementation details.

src/libponyrt/actor/actor.c:
}
}
void maybe_mute(pony_ctx_t* ctx, pony_actor_t* to, pony_msg_t* first,

Praetonus (Member) commented Oct 9, 2017

The function name should be ponyint_maybe_mute according to the runtime naming conventions.

src/libponyrt/actor/actor.c:
pony_msg_t* m = first;
while(m != last)

Praetonus (Member) commented Oct 9, 2017

Iterating the message chain here is going to be expensive, so I think it would be nice to refactor in order to remove the need for the loop.

In the current state of things, a message chain cannot contain ORCA messages, so a possible alternative would be to take the message chain length as a parameter. If we want to future-proof the function now, the parameter could be the number of application messages in the chain instead.
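The idea discussed here (tracking the application-message count while a chain is built, so the receiver never has to walk it) can be sketched roughly as follows. This is an illustrative sketch, not the libponyrt implementation: the types, the MSG_APP_START boundary, and the helper names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative boundary: ids below this are runtime/ORCA messages. */
#define MSG_APP_START 16

typedef struct msg_t
{
  uint32_t id;
  struct msg_t* next;
} msg_t;

typedef struct chain_t
{
  msg_t* first;
  msg_t* last;
  size_t app_count; /* maintained at link time; no iteration on send */
} chain_t;

static void chain_append(chain_t* chain, msg_t* m)
{
  /* Mirrors the pony_assert suggested below: chains are assumed to
   * carry only application messages. */
  assert(m->id >= MSG_APP_START);

  m->next = NULL;
  if(chain->last != NULL)
    chain->last->next = m;
  else
    chain->first = m;
  chain->last = m;
  chain->app_count++;
}

/* The sender can consult the count directly: if it is zero, the mute
 * check can be skipped entirely. */
static int needs_mute_check(const chain_t* chain)
{
  return chain->app_count > 0;
}
```

With this shape, sending a chain is O(1) with respect to its length for the purpose of the mute check.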

jemc (Member) commented Oct 9, 2017

If we use that assumption about message chains not containing ORCA messages, we should likely try to add a pony_assert somewhere to ensure that.

Praetonus (Member) commented Oct 9, 2017

That's true. I think the best place for that assertion would be in pony_chain.

SeanTAllen (Member) commented Oct 9, 2017

@Praetonus if we had the number of application messages in the chain, that would become much easier: if there are more than 0, we do our check; if there are none, we don't need the check. I was going to bring that up.

Praetonus (Member) commented Oct 11, 2017

Discussed this with @SeanTAllen. I'm going to implement that change (counting the number of application messages in the chain) since it also requires a change to one of the optimisation passes, and I'll submit the patch in this PR.

SeanTAllen (Member) commented Oct 13, 2017

Praetonus' patch was applied.

src/libponyrt/actor/actor.c:
@@ -212,8 +212,20 @@ bool ponyint_actor_run(pony_ctx_t* ctx, pony_actor_t* actor, size_t batch)
app++;
try_gc(ctx, actor);
// if we become muted as a result of handling a message, bail out now.
if(actor->muted > 0)

Praetonus (Member) commented Oct 9, 2017

This should be an atomic_load_explicit with memory_order_relaxed. Atomic operations without an explicit memory order implicitly use memory_order_seq_cst and that's a performance hit.
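For context, a minimal illustration of the difference being requested. The `muted` field name follows the diff above, but the surrounding code is a hypothetical sketch, not runtime code.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint8_t muted;

/* A plain atomic read defaults to memory_order_seq_cst, which may emit
 * fences or ordered loads on weakly ordered hardware. Equivalent to
 * atomic_load_explicit(&muted, memory_order_seq_cst). */
static int is_muted_seq_cst(void)
{
  return atomic_load(&muted) > 0;
}

/* When no ordering with surrounding memory operations is required, a
 * relaxed load is just an ordinary load that is still race-free. */
static int is_muted_relaxed(void)
{
  return atomic_load_explicit(&muted, memory_order_relaxed) > 0;
}
```

Both functions return the same value; only the ordering guarantees (and cost) differ.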

src/libponyrt/actor/actor.c:
@@ -225,6 +237,15 @@ bool ponyint_actor_run(pony_ctx_t* ctx, pony_actor_t* actor, size_t batch)
// We didn't hit our app message batch limit. We now believe our queue to be
// empty, but we may have received further messages.
pony_assert(app < batch);
pony_assert(actor->muted == 0);

Praetonus (Member) commented Oct 9, 2017

atomic_load_explicit here.

src/libponyrt/actor/actor.c:
if(ponyint_messageq_push(&to->q, first, last))
{
if(!has_flag(to, FLAG_UNSCHEDULED))
if(!has_flag(to, FLAG_UNSCHEDULED) && (to->muted == 0)) {

Praetonus (Member) commented Oct 9, 2017

atomic_load_explicit here.

src/libponyrt/actor/actor.c:
// 2. the sender isn't overloaded
// AND
// 3. we are sending to another actor (as compared to sending to self)
if((has_flag(to, FLAG_OVERLOADED) || (to->muted > 0)) &&

Praetonus (Member) commented Oct 9, 2017

atomic_load_explicit here.

src/libponyrt/sched/scheduler.c:
for(uint32_t i = 0; i < scheduler_count; i++)
{
if(&scheduler[i] != sched)
send_msg(i, SCHED_UNMUTE_ACTOR, (intptr_t)actor);

Praetonus (Member) commented Oct 9, 2017

Now that all schedulers can broadcast, send_msg_single is unsafe. Calls to send_msg_single should be replaced by calls to send_msg.

SeanTAllen (Member) commented Oct 9, 2017

Does that mean that we should remove send_msg_single entirely? I believe that is the implication, but I want to verify.

SeanTAllen (Member) commented Oct 10, 2017

I only find one instance, in send_msg_all in scheduler.c.

Updating.

Praetonus (Member) commented Oct 11, 2017

Yes, this is what I meant.

SeanTAllen (Member) commented Oct 11, 2017

It's gone!

src/libponyrt/sched/scheduler.c:
if(r == NULL)
{
ponyint_muteset_putindex(&mref->value, sender, index2);
sender->muted += 1;

Praetonus (Member) commented Oct 9, 2017

This is currently equivalent to atomic_fetch_add_explicit(&sender->muted, 1, memory_order_seq_cst), i.e. a very expensive operation.

As far as I can see only one scheduler can mute/unmute a given actor at a time, so this can be replaced with

uint8_t muted = atomic_load_explicit(&sender->muted, memory_order_relaxed);
atomic_store_explicit(&sender->muted, muted + 1, memory_order_relaxed);

If I'm wrong and multiple schedulers can modify the muted field of an actor at the same time, this should be atomic_fetch_add_explicit(&sender->muted, 1, memory_order_relaxed) instead.
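A compilable sketch of the two options described above, under the stated assumption that only one scheduler mutates `muted` at a time; the surrounding names are illustrative, not runtime code.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint8_t muted;

/* Single-writer pattern: a relaxed load followed by a relaxed store.
 * Avoids the read-modify-write (and, by default, the seq_cst ordering)
 * of atomic_fetch_add. Only safe if no other thread writes `muted`
 * concurrently; concurrent readers are fine. */
static void mute_single_writer(void)
{
  uint8_t m = atomic_load_explicit(&muted, memory_order_relaxed);
  atomic_store_explicit(&muted, (uint8_t)(m + 1), memory_order_relaxed);
}

/* If multiple writers were possible, this would be the safe (and still
 * relaxed, hence cheap) alternative. */
static void mute_multi_writer(void)
{
  atomic_fetch_add_explicit(&muted, 1, memory_order_relaxed);
}
```

Both produce the same counter value; the single-writer version simply relies on the external guarantee that increments never race.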

SeanTAllen (Member) commented Oct 9, 2017

That's correct. Only a single scheduler can mute an actor at a time, as it happens on message send.

src/libponyrt/sched/scheduler.c:
void ponyint_sched_unmute(pony_ctx_t* ctx, pony_actor_t* actor, bool inform)
{
// this needs a better name. its not unmuting actor.

Praetonus (Member) commented Oct 9, 2017

I'd suggest ponyint_sched_unmute_senders.

src/libponyrt/sched/scheduler.c:
while((muted = ponyint_muteset_next(&mref->value, &i)) != NULL)
{
pony_assert(muted->muted > 0);
muted->muted -= 1;

Praetonus (Member) commented Oct 9, 2017

Same as in ponyint_sched_mute, these two lines and the if below can be replaced with

uint8_t muted_count = atomic_load_explicit(&muted->muted, memory_order_relaxed);
pony_assert(muted_count > 0);
muted_count--;
atomic_store_explicit(&muted->muted, muted_count, memory_order_relaxed);

if(muted_count == 0)
...
slfritchie (Contributor) commented Oct 10, 2017

@SeanTAllen Would you consider also adding some variation of https://gist.github.com/slfritchie/0dab74fd729b7ecdd2a11c32c1f984cb?

SeanTAllen (Member) commented Oct 11, 2017

@slfritchie I think that's reasonable.

Do you think coarse-grained tracking of the number of times an actor is overloaded or muted would be interesting? (Also overload cleared and unmuted.)

src/libponyrt/sched/scheduler.c:
muted_count--;
atomic_store_explicit(&muted->muted, muted_count, memory_order_relaxed);
if (muted->muted == 0)

Praetonus (Member) commented Oct 11, 2017

This needs to either be an atomic_load_explicit, or use the muted_count local.

SeanTAllen (Member) commented Oct 11, 2017

Thanks, I missed that.

slfritchie (Contributor) commented Oct 11, 2017

@SeanTAllen tl;dr: yes.

DTrace (and presumably SystemTap) permits easy dynamic probes at entry and exit to any function. This code is structured almost well enough that dynamic function entry probes could tell you most of what you'd like to know. Overload and not-overload have dedicated functions, but mute and unmute do not: muting status changes are buried far inside ponyint_sched_mute() and ponyint_sched_unmute_senders().

The code could be restructured to give dedicated small functions for the actual mute state changes; then dynamic function entry probes are easy. However, there's infrastructure value in defining static probes for important events in the system. These new events are the kinds of things that affect scheduling, and visibility into scheduling is Good.

SeanTAllen (Member) commented Oct 11, 2017

I've found a couple of problems with this implementation.

  1. Program termination doesn't take into account that there might be muted actors (unscheduled) which will become scheduled again, which can lead to early program termination. The fewer actors running, the more likely that is to occur. Really, we need to know if there are any muted actors before termination, because if there are, we should keep trying to steal actors rather than exiting that scheduler thread.

  2. The incrementing and decrementing of the mute value for an actor isn't thread safe. More than one scheduler could try to decrement the mute value at a time, which could result in a data race and FUN. We'd probably want a CAS operation for that, OR, something that would also solve issue 1: an actor like the cycle detector that handles all muting and unmuting, and that can know if there are any "live" muted actors around (in which case a scheduler shouldn't exit).
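If the race in point 2 were real, the CAS operation mentioned could look roughly like this sketch (as established later in the thread, an actor only ever sits in one mutemap at a time, so the runtime did not end up needing it); names here are illustrative.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint8_t muted;

/* Atomically decrement `muted` even under concurrent writers, without
 * seq_cst cost. Returns the new value; 0 means fully unmuted. */
static uint8_t unmute_cas(void)
{
  uint8_t old = atomic_load_explicit(&muted, memory_order_relaxed);
  uint8_t desired;

  do
  {
    desired = (uint8_t)(old - 1);
    /* On failure, `old` is reloaded with the current value and the
     * loop retries with a freshly computed decrement. */
  } while(!atomic_compare_exchange_weak_explicit(&muted, &old, desired,
    memory_order_relaxed, memory_order_relaxed));

  return desired;
}
```

atomic_fetch_sub_explicit would do the same job more simply; the CAS form is shown because it generalizes to conditional updates (for example, refusing to decrement below zero).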

Praetonus (Member) commented Oct 12, 2017

@SeanTAllen The first problem can be solved by two small modifications to the work stealing and quiescence detection algorithm.

  1. A scheduler shouldn't send SCHED_BLOCK if the size of its mutemap isn't 0.
  2. When a scheduler is looping in steal and reschedules a previously muted actor as a result of receiving SCHED_UNMUTE_ACTOR, it should resume its normal execution.

Could you detail the circumstances in which the second problem can occur? It seems to me that a given actor can only be in one mutemap at a time.
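The two proposed modifications can be modeled as two small predicates. This is a hedged sketch with hypothetical names (`sched_model_t` is not a scheduler.c type), meant only to pin down the decision logic.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct sched_model_t
{
  /* Number of muted actors this scheduler is currently tracking. */
  size_t mutemap_size;
  /* True once the scheduler has rescheduled a previously muted actor
   * after receiving SCHED_UNMUTE_ACTOR. */
  bool rescheduled_unmuted;
} sched_model_t;

/* Rule 1: never announce SCHED_BLOCK while still holding muted actors,
 * since they may be unmuted later and need a live scheduler to run on. */
static bool may_send_block(const sched_model_t* s)
{
  return s->mutemap_size == 0;
}

/* Rule 2: while looping in steal, rescheduling a previously muted actor
 * counts as finding work, so the scheduler resumes normal execution. */
static bool should_leave_steal_loop(const sched_model_t* s, bool stole_actor)
{
  return stole_actor || s->rescheduled_unmuted;
}
```

Together these keep a scheduler with pending muted actors participating in quiescence detection rather than exiting early.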

SeanTAllen (Member) commented Oct 12, 2017

@Praetonus excellent ideas. And now that I am a little less tired, I realize you are correct about an actor only being able to be in a single mutemap at a time. I need to add comments to that effect.

SeanTAllen (Member) commented Oct 12, 2017

At the next sync, I'd like to discuss what sort of documentation this might need. Inclusion in the tutorial? On the website? Just notes in the code?

Praetonus (Member) commented Oct 12, 2017

@SeanTAllen Here's the diff containing the changes needed to remove the message chain iteration in ponyint_maybe_mute: https://gist.github.com/Praetonus/e6f9d24d1f88e4d1fbfd97dbdc340fef

SeanTAllen (Member) commented Oct 13, 2017

@Praetonus patch applied. Looking good.

SeanTAllen (Member) commented Oct 19, 2017

@Praetonus everything we talked about is in place. Sylvan helped me track down a bug.

I have to add the ability for actors to manually indicate they can't make progress, so that it is included in the backpressure system and perf testing. But this is getting close.

Right now, "can't make progress" would be a TCPConnection that is unable to send (due to backpressure, for example).

SeanTAllen (Member) commented Oct 21, 2017

@slfritchie I added the telemetry info. Can you have a look to make sure I did it correctly?

SeanTAllen (Member) commented Oct 21, 2017

Things I need to do:

  • add docs for the Backpressure package
  • performance testing using Wallaroo

Please give this another review. It's ready for more feedback.

Question: how, if at all, should we document this somewhat advanced feature beyond the package-level docs in Backpressure?

src/libponyrt/actor/actor.c:
if(ponyint_messageq_push_single(&to->q, first, last))
{
if(!has_flag(to, FLAG_UNSCHEDULED))
if(!has_flag(to, FLAG_UNSCHEDULED) &&
(atomic_load_explicit(&to->muted, memory_order_relaxed) == 0)) {

SeanTAllen (Member) commented Oct 21, 2017

Should I move this to a nicely named function?

SeanTAllen (Member) commented Oct 21, 2017

I just updated ProcessMonitor to use the backpressure mechanism to prevent unbounded pending queue growth. This is a breaking API change to the constructor, as an ApplyReleaseBackpressureAuth token is now required.

SeanTAllen (Member) commented Oct 21, 2017

OK, with the updates to the TCPConnection documentation and the addition of backpressure to ProcessMonitor, it appears to me that all of the "runaway memory growth" actors in the standard library have some sort of backpressure coverage.

SeanTAllen (Member) commented Oct 22, 2017

There's a problem with work stealing and block messages. At the time a scheduler enters steal(), it might have a muted actor. This will cause it to not send a block message. When that actor is unmuted, it might be stolen by another scheduler, leaving the existing scheduler blocked but looping in steal, without ever being able to exit and without ever having sent a block message.

slfritchie (Contributor) commented Oct 23, 2017

> I added the telemetry info. Can you have a look to make sure I did it correctly?

The change to examples/dtrace/telemetry.d looks fine, @SeanTAllen.

SeanTAllen (Member) commented Oct 23, 2017

We did performance testing with Wallaroo.

Using our standard testing app under a normal load of 3 million messages a second, we saw no change in latencies. Awesome!

src/libponyrt/actor/actor.c:
ponyint_sched_unmute_senders(ctx, actor, true);
}
PONY_API void pony_apply_backpressure()

Praetonus (Member) commented Oct 25, 2017

I think this should follow the convention for runtime functions and take a pony_ctx_t* parameter.

SeanTAllen (Member) commented Oct 25, 2017

This was intentionally done this way to make calling from Pony straightforward. Sylvan C and I spent a while coming up with this approach.

Praetonus (Member) commented Oct 25, 2017

Ok, that makes sense.

src/libponyrt/actor/actor.c:
set_flag(pony_ctx()->current, FLAG_UNDER_PRESSURE);
}
PONY_API void pony_release_backpressure()

Praetonus (Member) commented Oct 25, 2017

Same as above.

SeanTAllen (Member) commented Oct 25, 2017

See above.

src/libponyrt/actor/actor.c:
ponyint_sched_unmute_senders(ctx, ctx->current, true);
}
bool ponyint_triggers_muting(pony_actor_t* actor)

Praetonus (Member) commented Oct 25, 2017

Same as above.

SeanTAllen (Member) commented Oct 25, 2017

Given this only needs the actor, it's unclear to me why we should do that.

Praetonus (Member) commented Oct 25, 2017

That's true, I missed that.

src/libponyrt/actor/actor.c:
ponyint_is_muted(actor);
}
bool ponyint_is_muted(pony_actor_t* actor)

Praetonus (Member) commented Oct 25, 2017

Same as above.

SeanTAllen (Member) commented Oct 25, 2017

See above.

SeanTAllen (Member) commented Oct 26, 2017

I want to add DTrace probes for when an actor is muted and unmuted. However, there's nothing "magical" about that. Really, there are already probes for that: an actor is UNSCHEDULED when muted and SCHEDULED when unmuted. Those would appear to be the correct probes for those events.

And, then, probes for...

  • actor is overloaded
  • actor no longer overloaded
  • actor under pressure
  • actor no longer under pressure

which should be the last addition to this before merging.

@slfritchie @sylvanc @Praetonus @jemc

Do those seem like reasonable plans for probes?

EDIT: those are the probes I've added and pushed. Sound good?

slfritchie (Contributor) commented Oct 26, 2017

@SeanTAllen The probes added in commit 3b75716 are mostly reasonable. If I were to quibble, it would be that the new probes are related pairs, but the naming convention of the pairs isn't the same. One uses 'condition' and 'condition-change' while the other uses 'changeA-condition' and 'condition-changeB'. To find names of probes, DTrace users frequently do queries based on glob-style wildcard matches, e.g., dtrace -ln 'pony$target:::condition*'. A consistent naming convention makes it easier to craft helpful glob patterns.

A quick fix: change the names of the new pressure probes to 'actor-pressure' and 'actor-pressure-released'. A bigger change: change the probes to be simply 'condition' and also add an argument of 0=condition off or 1=condition on.

SeanTAllen (Member) commented Oct 28, 2017

I'm ready for this to be squashed and merged if everyone wants to sign off.

@slfritchie I'm going to leave DTrace as is for now. I think it would be better to come back and revisit all the DTrace probes and make sure there is a consistent scheme for how to do things (then make those changes and document them). If you feel like taking that on, let me know.

@SeanTAllen

This comment has been minimized.

Show comment
Hide comment
@SeanTAllen

SeanTAllen Nov 16, 2017

Member

@jemc @Praetonus @mfelsche @sylvanc i added comments and did some clean up. please have a look. where should there be additional explanation, comments etc.

Member

SeanTAllen commented Nov 16, 2017

@jemc @Praetonus @mfelsche @sylvanc I added comments and did some cleanup. Please have a look. Where should there be additional explanation, comments, etc.?

SeanTAllen added some commits Nov 16, 2017


Member

SeanTAllen commented Nov 16, 2017

Latest perf testing round looks good. On to the cleanup, blog post, etc.

@EpicEric EpicEric referenced this pull request Nov 16, 2017

Closed

Backpressure support #6

jemc added a commit that referenced this pull request Nov 17, 2017

A microbenchmark for measuring message passing rates in the Pony runtime.

This microbenchmark executes a sequence of intervals.  During an interval,
1 second long by default, the SyncLeader actor sends an initial
set of ping messages to a static set of Pinger actors.  When a Pinger
actor receives a ping() message, the Pinger will randomly choose
another Pinger to forward the ping() message.  This technique limits
the total number of messages "in flight" in the runtime to avoid
causing unnecessary memory consumption & overhead by the Pony runtime.

This small program has several intended uses:

* Demonstrate use of three types of actors in a Pony program: a timer,
  a SyncLeader, and many Pinger actors.

* As a stress test for Pony runtime development, for example, finding
  deadlocks caused by experiments in the "Generalized runtime
  backpressure" work in pull request
  #2264

* As a stress test for measuring message send & receive overhead for
  experiments in the "Add DTrace probes for all message push and pop
  operations" work in pull request
  #2295

Member

SeanTAllen commented Nov 17, 2017

I'm planning on squashing and merging this today. Here's the planned commit comment. Anything else that should be included? If no, I'll get this merged down then start working on a blog post that would announce the feature.


This commit has backpressure to Pony runtime scheduling.

Prior to this commit, it was possible to write Pony programs that caused runaway memory growth due to a producer/consumer imbalance in message sending. A variety of actor topologies could cause the problem.

Because Pony actor queues are unbounded, runaway memory growth is possible. This commit contains a program that demonstrates this: examples/overload has a large number of actors sending to a single actor. Under the original scheduler algorithm, each of these actors would receive a fairly equivalent number of chances to process messages. Each time an actor is given access to a scheduler, it is allowed to process up to batch size messages. The default batch size is 100. In the overload example, many actors send to a single actor that can't keep up with the incoming messages.

This commit adjusts the Pony scheduler to apply backpressure. The basic idea is:

1- Pony message queues are unbounded
2- Memory can grow without end if an actor isn't able to keep up with the incoming messages
3- We need a way to detect if an actor is overloaded and if it is, apply backpressure

With this commit, we apply backpressure according to the following rules:

1- If an actor processes a full batch of application messages, then it is overloaded; it wasn't able to drain its message queue during a scheduler run.
2- Sending to an overloaded actor will result in the sender being "muted"
3- Muting means that an actor won't be scheduled for a period of time allowing overloaded actors to catch up

Particular details on this

1- Sending to an overloaded or muted actor will result in the sender being muted unless the sender is overloaded.
2- Muted actors will remain unscheduled until any actors that they sent to that were muted/overloaded are no longer muted/overloaded

With this commit, the basics of backpressure are in place. Still to come:

Backpressure isn't currently applied from the cycle detector, so its queue can still grow in an unbounded fashion. More work/thought needs to go into addressing that problem.

It's possible that, due to implementation bugs, this commit results in deadlocks for some actor topologies. I found a number of implementation issues that had to be fixed after my first pass. The basic algorithm, though, should be fine.

There are a number of additional work items that could be added on to the basic scheme. Some might turn out to be actual improvements, some might turn out to not make sense.

1- Allow for notification of senders when they send to a muted/overloaded actor. This would allow application level decisions on possible load shedding or other means to address the underlying imbalance.

2- Allow an actor to know that it has become overloaded so it can take application-level action

3- Allow actors to have different batch sizes that might result in better performance for some actor topologies

This work was performance tested at Wallaroo Labs and was found under heavy loads to have no noticeable impact on performance.


Member

jemc commented Nov 17, 2017

Excellent! 👍

One small typo I noticed in the first line: I think "has" should be "adds".

@SeanTAllen SeanTAllen merged commit 1104a6c into master Nov 17, 2017

2 checks passed

continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed

@SeanTAllen SeanTAllen deleted the sean-runtime-backpressure branch Nov 17, 2017
