Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Short call leakage mitigation (v2) #610

Closed

Conversation

benlangfeld
Copy link
Member

Currently this most likely reduces Adhearsion's call setup capacity; I'd guess at a CPS reduction of around 2-5x, because of the serial routing this introduces. Calls should be routed in their own threads to improve this.

Give this a try while I work on that, @sfgeorge? I think this fixes your issue with out of order event dispatch.

@benlangfeld benlangfeld force-pushed the bugfix/2.x.x/ordered-event-dispatch branch from 2eb660f to 1758779 Compare August 11, 2016 14:04
@benlangfeld
Copy link
Member Author

This should now avoid call setup rate issues.

@coveralls
Copy link

coveralls commented Aug 11, 2016

Coverage Status

Coverage decreased (-2.4%) to 95.065% when pulling 1758779 on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.6%) to 95.781% when pulling fe8f6f3 on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.7%) to 95.693% when pulling e34f523 on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.7%) to 95.693% when pulling e34f523 on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.7%) to 95.693% when pulling e34f523 on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

@benlangfeld benlangfeld force-pushed the bugfix/2.x.x/ordered-event-dispatch branch from e34f523 to f2578b5 Compare August 22, 2016 14:10
@benlangfeld benlangfeld force-pushed the bugfix/2.x.x/ordered-event-dispatch branch from f2578b5 to 4f81c9a Compare August 22, 2016 14:13
@coveralls
Copy link

Coverage Status

Coverage decreased (-1.7%) to 95.747% when pulling 4f81c9a on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

@coveralls
Copy link

Coverage Status

Coverage decreased (-1.7%) to 95.747% when pulling 4f81c9a on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

@benlangfeld benlangfeld force-pushed the bugfix/2.x.x/ordered-event-dispatch branch from 4f81c9a to 01050a6 Compare August 22, 2016 15:36
@benlangfeld benlangfeld changed the title Ordered event dispatch Short call leakage mitigation Aug 22, 2016
@coveralls
Copy link

Coverage Status

Coverage decreased (-1.7%) to 95.743% when pulling 01050a6 on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

Avoids leaking call actors in cases where the call ended before it was dispatched and we missed the real End event.
They are racey and add a blocking operation where we might want to be totally async. Dispatching events to dead calls is not interesting in logs; the window in which this condition applies is so slim it almost never occurs and it is not actionable.
@benlangfeld benlangfeld force-pushed the bugfix/2.x.x/ordered-event-dispatch branch from 01050a6 to 5e7b65d Compare August 22, 2016 16:03
@coveralls
Copy link

Coverage Status

Coverage decreased (-1.7%) to 95.741% when pulling 5e7b65d on bugfix/2.x.x/ordered-event-dispatch into c0a9e9b on support/2.x.x.

@sfgeorge
Copy link
Member

I've tested this in a M.R.C. test app, and found the following in both JRuby and CRuby:

  • Upgrading to the latest changes from this PR solved the issue for Accepting routes, as you mentioned
  • Adding a rescue clause that you provided resolved the issue for Unaccepting routes

Thank you for doing this! 👍

@chewi
Copy link
Contributor

chewi commented Mar 20, 2018

I am trying to speed up our Sippy Cup testing by running a bunch of scenarios in parallel. While I never had any trouble with a single SIPp instance, possibly because it staggers the calls ever so slightly, launching several at the same time causes calls to jam up. It's fine if I add a short sleep between them. I traced the problem through to this line in Adhearsion::Router::Route#dispatch.

Adhearsion::Events.trigger_immediately :call_routed, call: call, route: self

Since this line isn't critical, commenting it out makes things work perfectly. It looks like this pull request might deal with this issue so I'll try rebasing it. Is it otherwise good to go? Was there a reason it wasn't merged?

@benlangfeld
Copy link
Member Author

I don’t know why this wasn’t merged. I don’t object to it as long as the tests pass.

@lpradovera
Copy link
Member

Looks good to me, but are not the tests failing?

@chewi
Copy link
Contributor

chewi commented Mar 20, 2018

Thanks for the info. I've just rebased it, trying to get the tests passing. Note that I'm doing this against 3.0.0.rc1, not using v2 any more. I'll file a new pull request when ready.

@@ -7,15 +7,15 @@ module Adhearsion
# This manages the list of calls the Adhearsion service receives
class Calls
def initialize
@mutex = ::Monitor.new
@mutex = ::Mutex.new
Copy link
Contributor

@chewi chewi Mar 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After rebasing this on 3.0.0.rc1, all the tests pass except for this one change. I must admit that I don't really understand what the difference is. I'm also unsure if the change below is strictly related as that seems more like a pure performance tweak to avoid method_missing.

Copy link
Contributor

@chewi chewi Mar 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, forgot to say how it broke. Using a Mutex first results in a weird error relating to the spec helpers but once you comment out the block that prints the backtrace, it reveals the real error to be:

<ThreadError> deadlock; recursive locking
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:80:in `synchronize'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:80:in `method_missing'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:32:in `block in remove_inactive_call'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:25:in `synchronize'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/calls.rb:25:in `remove_inactive_call'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/call.rb:363:in `clear_from_active_calls'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/call.rb:253:in `block in register_initial_handlers'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:162:in `call_handler'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:141:in `block (3 levels) in trigger_handler'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:136:in `catch'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:136:in `block (2 levels) in trigger_handler'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:134:in `each'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:134:in `find'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:134:in `block in trigger_handler'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:133:in `catch'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/has-guarded-handlers-1.6.3/lib/has_guarded_handlers.rb:133:in `trigger_handler'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/call.rb:201:in `block in deliver_message'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/foundation/exception_handler.rb:5:in `catching_standard_errors'
	/home/jlecuirot/Open/adhearsion/lib/adhearsion/call.rb:200:in `deliver_message'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `public_send'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/calls.rb:26:in `dispatch'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/calls.rb:63:in `dispatch'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/cell.rb:60:in `block in invoke'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/cell.rb:71:in `block in task'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/actor.rb:357:in `block in task'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/tasks.rb:57:in `block in initialize'
	/home/jlecuirot/.rvm/gems/ruby-2.4.3@ahn/gems/celluloid-0.16.0/lib/celluloid/tasks/task_fiber.rb:15:in `block in create'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, I hadn't noticed that the rest of this file has changed quite a lot in the meantime because of de171a9. I guess I'll just skip this commit entirely.

@chewi chewi changed the title Short call leakage mitigation Short call leakage mitigation (v2) Mar 21, 2018
@bklang
Copy link
Member

bklang commented Aug 9, 2019

Closing, as this was superseded by #625

@bklang bklang closed this Aug 9, 2019
@bklang bklang deleted the bugfix/2.x.x/ordered-event-dispatch branch August 9, 2019 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants