Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Capacity Cap for B #706

Closed
eladmallel opened this issue Jan 23, 2019 · 8 comments
Closed

Capacity Cap for B #706

eladmallel opened this issue Jan 23, 2019 · 8 comments
Assignees

Comments

@eladmallel
Copy link
Contributor

eladmallel commented Jan 23, 2019

Is your feature request related to a problem? Please describe.
To improve customer experience, B should respond with a clear error message whenever it's at its max capacity of concurrent streams and a new RTMP stream comes in.

Describe the solution you'd like
The user/customer should receive clear feedback from B that it is now at max capacity and therefore cannot handle new streams.

We probably want to add yet another CLI param, such as maxConcurrentStreams of type int. B should reject any incoming stream once its concurrent stream count reaches maxConcurrentStreams.

Not sure yet how to best achieve this goal?

  • Does RTMP support this kind of response?
  • Should we rely on webhooks for this kind of response?
  • Any other approaches?

Describe alternatives you've considered
Discussed above.

Additional context
None.

@eladmallel
Copy link
Contributor Author

@j0sh @ericxtang @darkdarkdragon can you please share your thoughts on how we should approach this?

@j0sh
Copy link
Collaborator

j0sh commented Jan 23, 2019

Does RTMP support this kind of response?

Not really, the best we can do here is disconnect the stream. We might be able to add a special code to the connect response prior to disconnect, but that requires changes to LPMS and probably joy4 (the golang RTMP implementation), in addition to any specialized handling on the client (customer) side. Most RTMP application hooks do not expose this type of detail, so it's probably more trouble than it's worth.

Should we rely on webhooks for this kind of response?

We should certainly be able to stop the stream based on the webhook response, but I'm inclined to also add capacity caps as a separate node-level feature. Enforcing capacity via webhook means the endpoint(s) need to have a synchronized view of the broadcaster's internal state, which is error-prone.

Any other approaches?

This is pretty easy to implement. However, for M2, I'd say O-level caps are more important since this will allow the transcoding network to self-balance.

From experience with other RTMP providers, most customers will probably take a non-explicit approach to load balancing their B nodes:

  • Over-provision RTMP servers slightly above anticipated demand
  • Randomly assign customers to servers (whether assigning them a fixed node, or dynamically via ingest restreaming)

Of course, a capacity cap is still useful here, if only to preserve existing QoS at the upper limit and/or alert some monitoring system.

@eladmallel
Copy link
Contributor Author

I'd say O-level caps are more important

I agree that's important, and we have a separate issue for that as part of M2.

This is pretty easy to implement.

I'd love to understand more specifically how you see this implemented? What I had in mind was to introduce a new webhook message that lets the user know why the stream was rejected. A bit more in detail:

  • We could name this message something like StreamRejectedDueToMaxCapacity
  • It should contain the host the rejected the stream, and the user-defined ManifestID, so the user can have sufficient information to decide how to react
  • B would invoke this webhook immediately after rejecting an RTMP stream

Does this sound good, or did you have a different approach in mind?

cc @darkdarkdragon

@darkdarkdragon
Copy link
Contributor

@j0sh

Enforcing capacity via webhook means the endpoint(s) need to have a synchronized view of the broadcaster's internal state

Can you elaborate on that?

@darkdarkdragon
Copy link
Contributor

@eladmallel

  • We could name this message something like StreamRejectedDueToMaxCapacity
  • It should contain the host the rejected the stream, and the user-defined ManifestID, so the user can have sufficient information to decide how to react
  • B would invoke this webhook immediately after rejecting an RTMP stream

Sounds good for me

@j0sh
Copy link
Collaborator

j0sh commented Jan 24, 2019

@eladmallel I may have mis-interpreted this part in the writeup: "Should we rely on webhooks for this kind of response?"

Specifically, this sounded like we would be outsourcing the logic for tracking capacity and rejecting streams beyond the max to the endpoint receiving the stream auth webhook (running outside the node). This would be problematic.

If we're just talking about an event notification that the capacity cap has been reached -- then all this sounds fine. I see that it's also mentioned within the issue for system-wide instrumentation, #671 so perhaps the specific approach to notifications (webhooks, existing metrics, collectd, something else) could be decided as part of addressing the instrumentation feature.

It should contain the host the rejected the stream

Note this will probably require changes deeper down, since I don't think connection level information is exposed to the goclient. But having such info available would generally be useful, so I'm in favor of making this change at eventually.

@darkdarkdragon Happy to elaborate on the problematic aspects if it's still relevant, but didn't want to get sidetracked over my own misunderstanding.

I'd love to understand more specifically how you see this implemented?

// Check this within createRTMPStreamIDHandler and/or gotRTMPStreamIDHandler
func (s *LivepeerServer) isAtCapacity (url *url.URL) bool {
    s.connectionLock.RLock()
    defer s.connectionLock.RUnlock()
    if len(s.rtmpConnections) > MaxBroadcastStreams {
            go sendAtCapacityNotif(url) 
            return true
     }
    return false
}

@eladmallel
Copy link
Contributor Author

@j0sh I think you now understand my intention well. When I was thinking to use webhooks it was with the purpose of helping the user/customer of B be aware that:

  1. B rejected a stream just now for a good and specific reason
  2. You should know exactly which B so you can potentially direct traffic to other Bs

I think that your pseudocode is in that same line of thinking.

Are we then good with relying on webhooks to provide users of B with this useful information? @j0sh @darkdarkdragon

@j0sh
Copy link
Collaborator

j0sh commented Jan 31, 2019

Are we then good with relying on webhooks to provide users of B with this useful information?

This feels quite close to #704 -- the "B at capacity" notification really is just another event, so let's take the discussion of the actual mechanism there. Also related is #671

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants