High-Availability OME setup with CDN/Cloudflare #1153
-
One tip for high availability is to use the Origins configuration so that an Edge can pull from multiple Origin servers. The settings below make the Edge start ingesting from the next URL if no input packet arrives for more than 3 seconds. This lets the publishers (LLHLS, WebRTC) keep running continuously: even when the OriginMap fails over to the next origin URL, the publisher's stream is not deleted and recreated, it simply continues. That makes this a good option for high availability. Of course, origin1.com/app/stream and origin2.com/app/stream must use exactly the same encoding options so that the player does not malfunction.
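For readers wanting a concrete starting point, here is a minimal sketch of what such an Origins block in the Edge's Server.xml might look like, reusing the hypothetical origin1.com/origin2.com hosts from above; the exact element names, ports, and defaults should be checked against the OME clustering documentation:

```xml
<!-- Edge Server.xml (inside <VirtualHost>), sketch only -->
<Origins>
    <!-- Assumed failover properties: switch to the next <Url> when no input
         packet arrives for 3000 ms; verify against the OME docs. -->
    <Properties>
        <NoInputFailoverTimeout>3000</NoInputFailoverTimeout>
        <UnusedStreamDeletionTimeout>60000</UnusedStreamDeletionTimeout>
    </Properties>
    <Origin>
        <Location>/app/stream</Location>
        <Pass>
            <Scheme>ovt</Scheme>
            <Urls>
                <Url>origin1.com:9000/app/stream</Url>
                <Url>origin2.com:9000/app/stream</Url>
            </Urls>
        </Pass>
    </Origin>
</Origins>
```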
-
@getroot I wanted to reference a statement you made back in this issue, where you indicated that if we want to put Edge nodes in front of an Origin node and properly leverage a CDN, we should not use Edge servers for LLHLS reverse proxying, but rather a formal reverse proxy such as NGINX or similar. The issue stems from the packetizing done on the Edge node and its reliance on the session identifier matching across subsequent requests from the same player, which would also require sticky sessions and a load balancer to be put in place. Sticky sessions are not ideal in an HA setup. Moreover, if you're using a standard CDN with DNS round-robin (also suggested) to send requests to a bank of Edge servers, sticky sessions are no longer a viable option, and to truly scale you want direct-to-IP transfers from the CDN to the Edge rather than going through a load balancer (which will hit networking limits at some point).

We have a simple case: multiple Origins (we host streams across multiple transcoding servers). This does not work with the way the CDN routes to any Edge, because if the session IDs don't match, the fragment fails to load. We even tried putting the Edge in OriginMode, but that failed as well. We really want to use OME in Edge mode for this purpose :)

If we take the Edge out of the picture, we fall into another problem set where we no longer have an intelligent edge layer aware of which origin to pull from (a benefit of the Redis/OVT setup you've developed). If we put NGINX servers in front of the Origin servers, we then have to build some other routing-table lookup system that NGINX can pull from; doable, but not desirable, since you already have all the goodies in OME.

Is there a way for each Edge to be placed in an LLHLS-optimized reverse-proxy mode: let the work be done on the Origin, forget about session IDs, and leverage the Redis lookup system as-is, so that requests hitting any Edge can properly route to and pull the correct content from the Origin? That way the CDN can be dumb, send requests to as many Edge servers as we choose to run in parallel, and it all just works.

Right now we're unable to use our Edge servers for delivery via the CDN, and routing clients directly to our Edge servers is not an option due to bandwidth concerns. We have the Edge servers in place to offload the Origins, and we have the CDN in place to offload our network. How would you propose we solve this implementation using OME? I'll also add that we do not care about WebRTC in this implementation, only LLHLS.
-
For those of us using OME who would like a highly available setup with a CDN (in our case, we're leaning toward Cloudflare), I thought it would be useful to start a thread to determine best practices and what you'll need to do.
Providing HA RTMP ingest is very difficult due to the protocol. I think at best we can have quick failover via Cloudflare's Load Balancing service if RTMP ingest (whether OME or otherwise) goes down - I am unsure exactly how their DNS query counting works, so perhaps that will be too expensive to be viable.
HA WebRTC ingest may work more smoothly, but I'm unfamiliar. Anyone able to chime in?
CDN for LLHLS files seems straightforward - set up a cache rule to not cache the `llhls.m3u8` file, cache all the other LLHLS files, and you're good to go. I am unsure how this works when, say, a stream goes live and generates `1.ts`, `2.ts`, `3.ts` (I forget the exact filenames at the moment and can't easily look them up), the stream goes offline and then online again, and OME generates a new `1.ts`, `2.ts`, `3.ts`, which cause false positives in the CDN's cache. Perhaps an API call to the CDN to clear those files (see the sketch after this comment)? That also adds some unknown latency.

CDN/Cloudflare interaction with WebRTC playback I'm unsure about. @Wallacy commented here about how they use Cloudflare to automatically scale their cloud servers, which isn't particularly applicable to us with two dedicated servers, but perhaps they can explain how the load balancing was set up with OME and Cloudflare. I suspect the CDN caching won't matter for WebRTC, but hopefully we can at least proxy WebRTC behind Cloudflare to hide our IP, at a minimum, and maybe load balance between the two OME servers.
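On the cache-purge idea above, here is a hedged sketch of clearing specific URLs through Cloudflare's cache purge API; the zone ID, API token, hostname, and segment URLs are placeholders, and the real segment file names would need to be read from the live playlist, since they depend on OME's LLHLS packager:

```sh
# Sketch: purge stale LLHLS segments from Cloudflare after a stream restart.
# ZONE_ID, CF_API_TOKEN, and the listed file URLs are hypothetical placeholders.
curl -X POST "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache" \
  -H "Authorization: Bearer ${CF_API_TOKEN}" \
  -H "Content-Type: application/json" \
  --data '{"files": [
    "https://cdn.example.com/app/stream/llhls.m3u8",
    "https://cdn.example.com/app/stream/1.ts",
    "https://cdn.example.com/app/stream/2.ts"
  ]}'
```

The purge call is scoped to individual URLs, so it could be triggered from a stream-ended webhook or admission hook, but the extra round trip is part of the "unknown latency" mentioned above.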
On the OME side, there are some robust clustering options. For a simple double-origin/double-edge setup (e.g. two instances of OME that both ingest and serve streams) we can use the OriginMapStore to automatically allow anyone to stream to, and watch from, either server, presumably handling it intelligently if one server dies abruptly? A sketch of that block follows.
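For reference, a minimal sketch of the OriginMapStore block that each OME server would carry in its Server.xml, based on my reading of the clustering docs; the Redis address, password, and host name below are placeholders, and the structure should be verified against the official configuration reference:

```xml
<!-- Sketch: OriginMapStore (inside <VirtualHost>), placeholder values only. -->
<OriginMapStore>
    <RedisServer>
        <Host>redis.example.com:6379</Host>
        <Auth>changeme</Auth>
    </RedisServer>
    <!-- The address other nodes would use to pull this server's streams over OVT. -->
    <OriginHostName>ome1.example.com</OriginHostName>
</OriginMapStore>
```

Both servers would point at the same Redis instance, so whichever one receives the ingest registers the stream, and the other can look it up and relay it on demand.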
This is all pretty surface-level, but I plan to update as we test and implement something along these lines, and hopefully provide some useful info to others.