New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Short call leakage mitigation (v2) #610
Conversation
2eb660f
to
1758779
Compare
This should now avoid call setup rate issues. |
e34f523
to
f2578b5
Compare
By delegating call routing to Call actors
f2578b5
to
4f81c9a
Compare
4f81c9a
to
01050a6
Compare
Avoids leaking call actors in cases where the call ended before it was dispatched and we missed the real End event.
They are racey and add a blocking operation where we might want to be totally async. Dispatching events to dead calls is not interesting in logs; the window in which this condition applies is so slim it almost never occurs and it is not actionable.
01050a6
to
5e7b65d
Compare
I've tested this in a M.R.C. test app, and found the following in both JRuby and CRuby:
Thank you for doing this! 👍 |
I am trying to speed up our Sippy Cup testing by running a bunch of scenarios in parallel. While I never had any trouble with a single SIPp instance, possibly because it staggers the calls ever so slightly, launching several at the same time causes calls to jam up. It's fine if I add a short sleep between them. I traced the problem through to this line in Adhearsion::Events.trigger_immediately :call_routed, call: call, route: self Since this line isn't critical, commenting it out makes things work perfectly. It looks like this pull request might deal with this issue so I'll try rebasing it. Is it otherwise good to go? Was there a reason it wasn't merged? |
I don’t know why this wasn’t merged. I don’t object to it as long as the tests pass. |
Looks good to me, but are not the tests failing? |
Thanks for the info. I've just rebased it, trying to get the tests passing. Note that I'm doing this against 3.0.0.rc1, not using v2 any more. I'll file a new pull request when ready. |
@@ -7,15 +7,15 @@ module Adhearsion | |||
# This manages the list of calls the Adhearsion service receives | |||
class Calls | |||
def initialize | |||
@mutex = ::Monitor.new | |||
@mutex = ::Mutex.new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After rebasing this on 3.0.0.rc1, all the tests pass except for this one change. I must admit that I don't really understand what the difference is. I'm also unsure if the change below is strictly related as that seems more like a pure performance tweak to avoid method_missing
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, forgot to say how it broke. Using a Mutex first results in a weird error relating to the spec helpers but once you comment out the block that prints the backtrace, it reveals the real error to be:
<ThreadError> deadlock; recursive locking
/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:80:in `synchronize'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:80:in `method_missing'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:32:in `block in remove_inactive_call'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:25:in `synchronize'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:25:in `remove_inactive_call'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/call.rb:363:in `clear_from_active_calls'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/call.rb:253:in `block in register_initial_handlers'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:162:in `call_handler'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:141:in `block (3 levels) in trigger_handler'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:136:in `catch'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:136:in `block (2 levels) in trigger_handler'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:134:in `each'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:134:in `find'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:134:in `block in trigger_handler'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:133:in `catch'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:133:in `trigger_handler'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/call.rb:201:in `block in deliver_message'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/foundation/exception_handler.rb:5:in `catching_standard_errors'
/home/jlecuirot/Open/adhearsion/lib/adhearsion/call.rb:200:in `deliver_message'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/calls.rb:63:in `dispatch'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, I hadn't noticed that the rest of this file has changed quite a lot in the meantime because of de171a9. I guess I'll just skip this commit entirely.
Closing, as this was superseded by #625 |
Currently this most likely reduces Adhearsion's call setup capacity; I'd guess at a CPS reduction of around 2-5x, because of the serial routing this introduces. Calls should be routed in their own threads to improve this.
Give this a try while I work on that, @sfgeorge? I think this fixes your issue with out of order event dispatch.