
Cantata crashes mopidy by overwhelming mopidy's stream backend. #1762

Closed
crocket opened this issue May 16, 2019 · 8 comments
Labels
C-bug Category: This is a bug

Comments

@crocket

crocket commented May 16, 2019

Cantata's timeout for streams is shorter than Mopidy's.
Thus, Cantata times out before Mopidy does and issues new requests while Mopidy is still handling the old ones.
Mopidy is quickly overwhelmed and crashes.

@kingosticks
Member

Every MPD client I have seen before has a configurable timeout. Where is this setting in Cantata? There are many things that can take a while to complete, especially when dealing with any of the streaming services Mopidy provides. And when Mopidy is running on a slow system (e.g. a Raspberry Pi), it also responds more slowly. I'm not convinced this is our bug.

Do you actually mean Mopidy crashes? We have a configurable setting for how many connections we accept before further ones are simply rejected.

@adamcik
Member

adamcik commented May 16, 2019

This sounds like classic resource exhaustion. Either we are busy dealing with a request that would time out, or the actor has crashed. Then Cantata tries again and takes another connection/worker down, until eventually they are all busy or dead.

So to some degree this is Cantata's fault for possibly being too aggressive with retries; backoff would be good to have here. And it's also Mopidy's fault for not being more resilient and not replacing dead actors (assuming they are dying and not just busy).
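For illustration, a minimal sketch of the kind of retry backoff a client could apply, with jitter so a struggling server isn't hammered; the function and parameter names are hypothetical and not taken from Cantata or Mopidy.

```python
import random
import time


def reconnect_with_backoff(connect, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Retry `connect` with exponential backoff and jitter.

    `connect` is any callable that raises on failure; all names here are
    illustrative, not Cantata's or Mopidy's actual code.
    """
    for attempt in range(max_attempts):
        try:
            return connect()
        except OSError:
            # Sleep 1s, 2s, 4s, ... (capped), plus jitter, instead of
            # immediately opening yet another connection.
            delay = min(base_delay * 2 ** attempt, max_delay)
            time.sleep(delay + random.uniform(0, delay / 2))
    raise ConnectionError("gave up after %d attempts" % max_attempts)
```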

@kingosticks
Member

A complete debug log showing the chain of events would be handy.

@crocket
Author

crocket commented May 16, 2019

To reproduce the log, in Cantata go to Internet > TuneIn > By Location > Asia > South Korea > All Stations > CBS 뮤직 FM.

mopidy.log
mopidy-debug.log

@adamcik
Member

adamcik commented May 16, 2019

OK, so this is indeed a case of the actors dying and not being cleaned up.

We could perhaps wrap https://github.com/mopidy/mopidy/blob/develop/mopidy/internal/network.py#L423 in a try: ... except: self.connection.stop('Unhandled exception.') (just with some logging etc.).

This would fix the long-standing bug this really is, where you could use up max_connections by crashing our handler code.
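Roughly the shape being suggested, as a standalone sketch; the class, method, and attribute names here only mirror what the comment refers to and are not verified against the actual code at that line.

```python
import logging

logger = logging.getLogger(__name__)


class _ConnectionStub:
    """Stand-in for mopidy.internal.network.Connection in this sketch."""

    def stop(self, reason):
        logger.info("Connection stopped: %s", reason)


class LineHandlerSketch:
    """Illustrative protocol handler, not Mopidy's actual class."""

    def __init__(self):
        self.connection = _ConnectionStub()

    def on_line_received(self, line):
        # Stand-in for the real handler; raise to simulate a handler bug.
        raise ValueError("simulated handler bug")

    def handle_line(self, line):
        try:
            self.on_line_received(line)
        except Exception:
            # Without this, an unhandled exception kills the actor without
            # closing the socket, so the connection keeps occupying one of
            # the max_connections slots.
            logger.exception("Unhandled exception in protocol handler")
            self.connection.stop("Unhandled exception.")


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    LineHandlerSketch().handle_line("status")
```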

@kingosticks
Member

I'm having a hard time reproducing the actual crash, but I can see a problem in our handling when:

  1. The MpdSession protocol handler suffers an unhandled exception for whatever reason. Its actor loop handles this exception and prints a traceback. The actor then dies without ever running on_stop to clean up and close the socket.
  2. Some time later, the client, having had no response, gives up and closes the socket.
  3. Mopidy's Connection notices this and tries to tell the MpdSession to stop and clean up. But the actor is already dead, so this just raises ActorDeadError and the cleanup never happens.

We could handle this by wrapping the tell() with a try block, or by defining a LineProtocol.on_failure hook to ensure we always clean up.
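A rough sketch of the second option, using Pykka's on_failure hook, which is called when an unhandled exception escapes the actor's message handling; the connection attribute is a placeholder for Mopidy's Connection object, and the class is illustrative rather than the real MpdSession.

```python
import pykka


class MpdSessionSketch(pykka.ThreadingActor):
    """Illustrative only: the on_failure hook is the point of this sketch."""

    def __init__(self, connection):
        super().__init__()
        self.connection = connection

    def on_failure(self, exception_type, exception_value, traceback):
        # Runs when the actor dies from an unhandled exception, so the
        # socket (and its max_connections slot) is released even though
        # on_stop never gets a chance to run.
        self.connection.stop(f"Actor failed: {exception_value}")
```

The first option would similarly amount to catching pykka.ActorDeadError around the tell() call and closing the socket directly when the session actor is already dead.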

kingosticks added a commit to kingosticks/mopidy that referenced this issue May 22, 2019
kingosticks added a commit to kingosticks/mopidy that referenced this issue May 22, 2019
kingosticks added a commit to kingosticks/mopidy that referenced this issue May 26, 2019
kingosticks added a commit to kingosticks/mopidy that referenced this issue Jun 26, 2019
@jodal jodal modified the milestone: v2.2.4 Jun 26, 2019
kingosticks added a commit to kingosticks/mopidy that referenced this issue Jun 26, 2019
@jodal jodal added this to the v2.3.0 milestone Sep 30, 2019
@jodal
Member

jodal commented Sep 30, 2019

With #1765 merged, what more do we want to do here?

@jodal jodal added the C-bug Category: This is a bug label Sep 30, 2019
@kingosticks
Member

Ideally, to try again with Cantata, but I could never reproduce it.

@jodal jodal modified the milestones: v2.3.0, v2.3.1 Oct 3, 2019
@jodal jodal removed this from the v2.3.2 milestone Dec 14, 2019
@jodal jodal closed this as completed Dec 21, 2019