High-Availability OME setup with CDN/Cloudflare #1153
-
One tip for high availability is to use the Origins configuration so that an Edge can pull from multiple Origin servers. The settings below make the Edge start ingesting from the next URL if no input packet arrives for more than 3 seconds. This lets the publishers (LLHLS, WebRTC) keep running continuously: even when the OriginMap fails over to the next origin URL, the publisher's stream is not deleted and recreated, it simply continues. That makes this a good option for high availability. Of course, origin1.com/app/stream and origin2.com/app/stream must use exactly the same encoding options so that the player does not malfunction.
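For readers wanting a concrete starting point, here is a minimal sketch of what such an Origins block in the Edge's Server.xml might look like, reusing the hypothetical origin1.com/origin2.com hosts from above; the exact element names, ports, and defaults should be checked against the OME clustering documentation:

```xml
<!-- Edge Server.xml (inside <VirtualHost>), sketch only -->
<Origins>
    <!-- Assumed failover properties: switch to the next <Url> when no input
         packet arrives for 3000 ms; verify against the OME docs. -->
    <Properties>
        <NoInputFailoverTimeout>3000</NoInputFailoverTimeout>
        <UnusedStreamDeletionTimeout>60000</UnusedStreamDeletionTimeout>
    </Properties>
    <Origin>
        <Location>/app/stream</Location>
        <Pass>
            <Scheme>ovt</Scheme>
            <Urls>
                <Url>origin1.com:9000/app/stream</Url>
                <Url>origin2.com:9000/app/stream</Url>
            </Urls>
        </Pass>
    </Origin>
</Origins>
```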
-
@getroot I wanted to reference a statement you made back in this issue, where you indicated that if we want to put Edge nodes in front of an Origin node and properly leverage a CDN, we should not use Edge servers for LLHLS reverse proxying, but rather a formal reverse proxy such as NGINX or similar. The issue stems from the packetizing done on the Edge node and its reliance on the session identifier matching across subsequent requests from the same player, which would also require sticky sessions and a load balancer to be put in place. Sticky sessions are not ideal in an HA setup. Moreover, if you're using a standard CDN with DNS round-robin (also suggested) to send requests to a bank of Edge servers, sticky sessions are no longer a viable option, and to truly scale you want direct-to-IP transfers from the CDN to the Edge rather than going through a load balancer (which will hit networking limits at some point).

We have a simple case: multiple Origins (we host streams across multiple transcoding servers). This does not work with the way the CDN routes to any Edge, because if the session IDs don't match, the fragment fails to load. We even tried putting the Edge in OriginMode, but that failed as well. We really want to use OME in Edge mode for this purpose :)

If we take the Edge out of the picture, we fall into another problem set where we no longer have an intelligent edge layer aware of which origin to pull from (a benefit of the Redis/OVT setup you've developed). If we put NGINX servers in front of the Origin servers, we then have to build some other routing-table lookup system that NGINX can pull from; doable, but not desirable, since you already have all the goodies in OME.

Is there a way for each Edge to be placed in an LLHLS-optimized reverse-proxy mode: let the work be done on the Origin, forget about session IDs, and leverage the Redis lookup system as-is, so that requests hitting any Edge can properly route to and pull the correct content from the Origin? That way the CDN can be dumb, send requests to as many Edge servers as we choose to run in parallel, and it all just works.

Right now we're unable to use our Edge servers for delivery via the CDN, and routing clients directly to our Edge servers is not an option due to bandwidth concerns. We have the Edge servers in place to offload the Origins, and we have the CDN in place to offload our network. How would you propose we solve this implementation using OME? I'll also add that we do not care about WebRTC in this implementation, only LLHLS.
-
For those of us using OME who would like a highly available setup with a CDN (in our case, we're leaning toward Cloudflare), I thought it would be useful to start a thread to determine best practices and what you'll need to do.
Providing HA RTMP ingest is very difficult due to the protocol. I think at best we can have quick failover via Cloudflare's Load Balancing service if RTMP ingest (whether OME or otherwise) goes down - I am unsure exactly how their DNS query counting works, so perhaps that will be too expensive to be viable.
HA WebRTC ingest may work more smoothly, but I'm unfamiliar. Anyone able to chime in?
CDN for LLHLS files seems straightforward - set up a cache rule to not cache the `llhls.m3u8` file, cache all the other LLHLS files, and you're good to go. I am unsure how this works when, say, a stream goes live and generates `1.ts`, `2.ts`, `3.ts` (I forget the exact filenames at the moment and can't easily look them up), the stream goes offline and then online again, and OME generates a new `1.ts`, `2.ts`, `3.ts`, which cause false positives in the CDN's cache. Perhaps an API call to the CDN to clear those files (see the sketch after this comment)? That also adds some unknown latency.

CDN/Cloudflare interaction with WebRTC playback I'm unsure about. @Wallacy commented here about how they use Cloudflare to automatically scale their cloud servers, which isn't particularly applicable to us with two dedicated servers, but perhaps they can explain how the load balancing was set up with OME and Cloudflare. I suspect the CDN caching won't matter for WebRTC, but hopefully we can at least proxy WebRTC behind Cloudflare to hide our IP, at a minimum, and maybe load balance between the two OME servers.
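On the cache-purge idea above, here is a hedged sketch of clearing specific URLs through Cloudflare's cache purge API; the zone ID, API token, hostname, and segment URLs are placeholders, and the real segment file names would need to be read from the live playlist, since they depend on OME's LLHLS packager:

```sh
# Sketch: purge stale LLHLS segments from Cloudflare after a stream restart.
# ZONE_ID, CF_API_TOKEN, and the listed file URLs are hypothetical placeholders.
curl -X POST "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache" \
  -H "Authorization: Bearer ${CF_API_TOKEN}" \
  -H "Content-Type: application/json" \
  --data '{"files": [
    "https://cdn.example.com/app/stream/llhls.m3u8",
    "https://cdn.example.com/app/stream/1.ts",
    "https://cdn.example.com/app/stream/2.ts"
  ]}'
```

The purge call is scoped to individual URLs, so it could be triggered from a stream-ended webhook or admission hook, but the extra round trip is part of the "unknown latency" mentioned above.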
On the OME side, there are some robust clustering options. For a simple double-origin/double-edge setup (e.g. two instances of OME that both ingest and serve streams) we can use the OriginMapStore to automatically allow anyone to stream to, and watch from, either server, presumably handling it intelligently if one server dies abruptly? A sketch of that block follows.
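For reference, a minimal sketch of the OriginMapStore block that each OME server would carry in its Server.xml, based on my reading of the clustering docs; the Redis address, password, and host name below are placeholders, and the structure should be verified against the official configuration reference:

```xml
<!-- Sketch: OriginMapStore (inside <VirtualHost>), placeholder values only. -->
<OriginMapStore>
    <RedisServer>
        <Host>redis.example.com:6379</Host>
        <Auth>changeme</Auth>
    </RedisServer>
    <!-- The address other nodes would use to pull this server's streams over OVT. -->
    <OriginHostName>ome1.example.com</OriginHostName>
</OriginMapStore>
```

Both servers would point at the same Redis instance, so whichever one receives the ingest registers the stream, and the other can look it up and relay it on demand.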
This is all pretty surface-level, but I plan to update as we test and implement something along these lines, and hopefully provide some useful info to others.