Cluster: K8s: Proxy to extend origin servers. #3138

Closed
winlinvip opened this issue Aug 8, 2022 · 8 comments

@winlinvip
Member

winlinvip commented Aug 8, 2022

Both cluster and proxy address system load balancing; in short, each serves a large set of connections or clients, but the two approaches differ.

A cluster works as a whole: a set of servers in a cluster behaves like a single server. It therefore supports a large number of streams, and each stream supports a large number of connections. For example, a cluster can serve 100k streams and 100m viewers; in effect, its system capacity is unlimited.

A proxy works as an agent for SRS Origin servers: it proxies streams to a specific set of origin servers, and always proxies a given stream to the same server. The proxy itself does not extend per-stream capacity, because it relays every UDP or TCP packet, but it helps media servers with load balancing; viewed at the system level, it still allows overall capacity to be scaled out.
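To make the relaying point concrete, below is a minimal sketch in Go of a byte-level TCP relay, assuming a plain TCP protocol such as RTMP; the address origin0:1935 and the fixed routing decision are placeholders, not the SRS implementation.

```go
package main

import (
	"io"
	"log"
	"net"
)

func main() {
	// Listen where clients connect, e.g. the RTMP port.
	ln, err := net.Listen("tcp", ":1935")
	if err != nil {
		log.Fatal(err)
	}
	for {
		client, err := ln.Accept()
		if err != nil {
			continue
		}
		go func(client net.Conn) {
			defer client.Close()
			// Placeholder: a real proxy inspects the request to pick
			// the origin that owns the requested stream.
			origin, err := net.Dial("tcp", "origin0:1935")
			if err != nil {
				return
			}
			defer origin.Close()
			// Relay every byte in both directions; because all traffic
			// flows through the proxy, it adds routing, not capacity.
			go io.Copy(origin, client)
			io.Copy(client, origin)
		}(client)
	}
}
```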

The proxy is designed to make origin servers stateless, so that a cluster can be built from isolated, stateless origin servers; the proxy can also be part of a larger cluster.

Architecture

The proxy works in front of SRS Origin servers; the stream flow looks like this:

Client ---> LB --> Proxy Servers --> Origin Servers

OBS/FFmpeg --RTMP--> K8s(Service) --Proxy--> SRS(pod A)

Browsers --FLV/HLS/SRT--> K8s(Service) --Proxy--> SRS(pod A)

Browsers --+---HTTP-API--> K8s(Service) --Proxy--> SRS(pod A)
           +---WebRTC----> K8s(Service) --Proxy--> SRS(pod A)

LB is a load balancer, such as a K8s Service or a cloud load balancer. Generally, the LB binds a public internet IP for clients to connect to.

The proxy is stateless, so you can easily deploy a set of proxy servers; for example, use a K8s Deployment to deploy many proxies.

When a client requests a stream from a proxy server, the proxy looks the stream up in its infrastructure and proxies the request to the corresponding origin server, so that each stream is always served by one specific backend server. For a fresh stream, the proxy randomly picks an origin server the first time, as sketched below.
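A minimal sketch of this sticky stream-to-origin mapping, assuming the table lives inside a single proxy (a multi-proxy deployment would keep it in shared infrastructure such as Redis); the Router type and Pick method are illustrative names, not SRS APIs.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// Router maps each stream to one origin server, so that all clients
// of a stream reach the same backend.
type Router struct {
	mu       sync.Mutex
	origins  []string          // candidate origin servers
	byStream map[string]string // stream URL -> chosen origin
}

func NewRouter(origins []string) *Router {
	return &Router{origins: origins, byStream: map[string]string{}}
}

// Pick returns the origin for a stream, randomly choosing one the
// first time the stream is seen and sticking to it afterwards.
func (r *Router) Pick(stream string) string {
	r.mu.Lock()
	defer r.mu.Unlock()
	if origin, ok := r.byStream[stream]; ok {
		return origin
	}
	origin := r.origins[rand.Intn(len(r.origins))]
	r.byStream[stream] = origin
	return origin
}

func main() {
	r := NewRouter([]string{"origin0", "origin1", "origin2"})
	fmt.Println(r.Pick("/live/livestream")) // random on first call
	fmt.Println(r.Pick("/live/livestream")) // same origin ever after
}
```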

Service Discovery

Service discovery involves two aspects: how the proxy discovers the origin servers, and how the proxy obtains stream information from other proxy servers.

There are two service discovery mechanisms: one for simple scenarios such as demos or small clusters, and another that provides the stability needed by large-scale online products.

  1. Simple Static Discovery: The origin and proxy server IP addresses are configured directly in the configuration file. This lets the proxy reach the target origin server for API and media streams, and fetch stream information from the other proxy servers.

Client --> Proxy 0-2 --> Origin 0-2

Proxy X config:
id: proxyX
origin: origin0, origin1, origin2

Origin X config:
proxy: proxy0, proxy1, proxy2

  2. Robust Dynamic Discovery: The origin and proxy servers use an HTTP API to register with and query a service manager, which uses Redis to store server and stream metadata. The entire system can be deployed as a Kubernetes Deployment.

Client --> Proxy 0-2 --> Origin 0-2
Proxy --> Service Manager --> Redis
Origin --> Service Manager --> Redis

Proxy X config:
id: proxyX
origin: service manager

Origin X config:
id: originX
proxy: service manager

In this architecture, the proxy and origin servers, as well as the service manager, all use the same HTTP API for service discovery, and the configuration file shares a consistent format across all components.
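As a hedged illustration of the registration side, the sketch below has an origin heartbeat its metadata to the service manager; the endpoint path /api/v1/register and the JSON shape are assumptions, since the design above only specifies an HTTP API backed by Redis.

```go
package main

import (
	"bytes"
	"encoding/json"
	"log"
	"net/http"
	"time"
)

// Registration is a guessed payload; the design only states that
// servers register and query over HTTP, with metadata kept in Redis.
type Registration struct {
	ID      string   `json:"id"`      // e.g. "originX" or "proxyX"
	Role    string   `json:"role"`    // "origin" or "proxy"
	IP      string   `json:"ip"`
	Streams []string `json:"streams"` // streams currently published here
}

func register(manager string, r Registration) error {
	body, err := json.Marshal(r)
	if err != nil {
		return err
	}
	resp, err := http.Post(manager+"/api/v1/register",
		"application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	// Re-register periodically as a heartbeat, so the service manager
	// can expire dead servers from Redis.
	for range time.Tick(5 * time.Second) {
		err := register("http://service-manager", Registration{
			ID: "originX", Role: "origin", IP: "10.0.0.12",
		})
		if err != nil {
			log.Println("register failed:", err)
		}
	}
}
```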

Proxy of Proxies

Edge is actually another type of proxy, but one that requires the origin IP to be configured, and origin pod IPs are variable rather than fixed. A proxy does not need the origin IP configured, because it relies on Redis or another service discovery mechanism, so a proxy can also serve as the upstream server for an edge, removing the edge's dependency on origin IPs. In this situation, the proxy acts like a K8s Service of origin servers for the edge servers, and the edge sources from that fixed service name.

Client --RTMP--> Edge ---RTMP--> Proxy --RTMP--> Origin
Edge is (K8s Edge Service --RTMP--> K8s Edge Pods)
Proxy is (K8s Proxy Service --RTMP--> K8s Proxy Pods)
Origin is (K8s Origin Pods)

Note: Both edge and proxy sit behind a K8s Service, not as bare pods, while Origin is deployed directly as a set of K8s pods.

With this architecture, we can support a huge set of streams and viewers without an origin cluster, which is stateful and complex. Note that the edge only works for live streaming protocols such as RTMP/FLV. Other kinds of edge also work well; for example, an HLS edge cluster works with the proxy, with NGINX fetching and caching HLS from it. Once WebRTC supports cascading, the cascade servers can likewise pull streams from the proxy.

Proxy Mirrors

To extend per-stream concurrency, the proxy can mirror a stream to multiple origin servers, so the stream can be played from different origins, providing fault tolerance and scaling out cluster capacity.

Proxy --+---> Origin Server A
        +---> Origin Server B
        +---> Origin Server C

For example, if an RTC origin server can serve about 300 to 700 players per stream, you can use the proxy to mirror the same stream to 3 origin servers, enabling 900 to 2100 players.
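Below is a simplified Go sketch of the packet-duplication idea only; it deliberately ignores per-connection protocol state (real WebRTC mirroring must negotiate ICE/DTLS with each origin), and all addresses are placeholders.

```go
package main

import (
	"log"
	"net"
)

func main() {
	// Placeholder origin servers to mirror the stream to.
	origins := []string{"origin-a:8000", "origin-b:8000", "origin-c:8000"}
	conns := make([]net.Conn, 0, len(origins))
	for _, addr := range origins {
		c, err := net.Dial("udp", addr)
		if err != nil {
			log.Fatal(err)
		}
		conns = append(conns, c)
	}

	// Receive the publisher's packets and duplicate each one to every
	// origin, so viewers can be spread across all of them.
	ln, err := net.ListenPacket("udp", ":8000")
	if err != nil {
		log.Fatal(err)
	}
	buf := make([]byte, 1500)
	for {
		n, _, err := ln.ReadFrom(buf)
		if err != nil {
			continue
		}
		for _, c := range conns {
			c.Write(buf[:n])
		}
	}
}
```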

Limitation

The limitation of the proxy is the number of viewers per stream, which must never exceed a single server's capacity, because the proxy always sends the same stream to the same backend server. For example, SRS supports about 5k viewers for RTMP/FLV and about 500 viewers for WebRTC; please measure the capacity with srs-bench.

Supporting a very large set of viewers is the responsibility of a cluster, such as the Edge Cluster for RTMP/FLV or the HLS Cluster for HLS. WebRTC does not support clustering yet; please read the WebRTC wiki.

For most scenarios, the proxy is a much simpler and more useful load-balancing mechanism, because there is a set of streams to serve but not too many, and each stream has a set of viewers but not too many. For example, in a system with 1k streams where each stream has 1k viewers, the total number of connections is no more than 1m.

The proxy also works together with a cluster. For example, if you have 1k streams and each stream has 100k FLV viewers, the architecture looks like this:

Publisher ---RTMP--> Proxy --> Origin Servers 
Origin Server --RTMP--> Proxy --RTMP--> Edge Cluster
Edge Cluster --RTMP--> Players

Keep in mind that the proxy is designed for the origin server, so there should always be a proxy in front of an origin server, even when an edge server pulls streams from it. The proxy is a solution similar to the origin cluster, but it is much simpler and works for all protocols (RTMP/FLV/HLS/WebRTC/SRT), while the origin cluster only works for RTMP.

Notes

The proxy should be written in Go or C++, keeping it stable and simple; this also makes statelessness easier to achieve. It should probably be C++, because we might use eBPF.

As the kernel keeps improving, it is also possible to forward IP packets directly in the kernel, steering traffic to different processes without a userspace relay, which reduces the proxy's CPU usage.

The proxy is much simpler than a cluster, because both the proxy and the origin server are stateless and can be deployed with a K8s Deployment.

@winlinvip winlinvip self-assigned this Aug 8, 2022
@winlinvip winlinvip added Feature It's a new feature. help wanted Extra attention is needed labels Aug 8, 2022
@winlinvip winlinvip changed the title Proxy: Support proxy to use one IP. 支持代理减少暴露的IP. K8s: Proxy: Support proxy to use one IP. 支持代理减少暴露的IP. Sep 1, 2022
@winlinvip winlinvip added the Kubernetes For K8s, Prometheus, APM and Grafana. label Sep 1, 2022
@winlinvip winlinvip changed the title K8s: Proxy: Support proxy to use one IP. 支持代理减少暴露的IP. K8s: Proxy: Support proxy to extend multiple origins. 代理支持扩展多个独立源站。 Sep 11, 2022
@winlinvip winlinvip changed the title K8s: Proxy: Support proxy to extend multiple origins. 代理支持扩展多个独立源站。 K8s: Proxy to extend multiple origins. 代理支持扩展多个独立源站。 Sep 11, 2022
@winlinvip winlinvip added this to the 5.0 milestone Sep 11, 2022
@qiantaossx

Could this be built by modifying frp?

@winlinvip winlinvip changed the title K8s: Proxy to extend multiple origins. 代理支持扩展多个独立源站。 Cluster: K8s: Proxy to extend multiple origins. 代理支持扩展多个独立源站。 Nov 24, 2022
@winlinvip winlinvip changed the title Cluster: K8s: Proxy to extend multiple origins. 代理支持扩展多个独立源站。 Cluster: K8s: Proxy to extend origin servers. 代理支持扩展多个源站。 Nov 24, 2022
@not2007

not2007 commented Nov 24, 2022

May I ask which existing proxy solutions could be studied and adapted for this?

@winlinvip
Member Author

Progress update: the proxy is essentially a load-balancing problem, and eBPF is well suited to implementing it.

Also, libbpf with C++ is a better fit than cilium/ebpf with Go; the two differ quite a lot.

@jinleileiking

jinleileiking commented Dec 20, 2022

WebRTC has to return an SDP, but the real servers behind the LB are not exposed, so this seems unsolvable; I'm considering whether NodePort can solve the problem.

@jinleileiking

jinleileiking commented Dec 20, 2022

eBPF is too complex.

Our current implementation for HLS:

pub -> lb1 -> srs -> pub hook -> go server -> save pub pod ip to mysql

play -> lb2 -> go server -> get pod ip -> get m3u8, ts from srs pod -> return

Disadvantages: it depends on a central store and on business logic; once the go server is added, it becomes a somewhat complex system.

Advantages: it is simple, and for small traffic in the short term it is absolutely sufficient, with a local cache of pod IPs added.

@qiantaossx

qiantaossx commented Dec 23, 2022

For the business scenario of converting RTMP streams to WebRTC, the origin server itself is stateful and cannot be put on K8s, so currently we can only deploy our service directly on physical machines.

Could adding this proxy layer solve the origin server's state problem?

player -> loadbalance -> proxy (forwards HTTP and UDP packets) -> SRS origin

Since the proxy layer is stateless, it can easily be deployed with a Deployment, exposing a unified load balancer and domain name.

A UDP LB does not rewrite the source IP and source port of UDP packets (https://exampleloadbalancer.com/nlbudp_detail.html), so the proxy can use this two-tuple to decide which backend SRS a packet should be forwarded to.
@winlinvip May I ask, can it be used this way?
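A minimal sketch of this two-tuple routing idea in Go; the backend names, port, and round-robin first assignment are illustrative placeholders.

```go
package main

import (
	"log"
	"net"
)

func main() {
	backends := []string{"srs-0:8000", "srs-1:8000"} // placeholder SRS pods
	ln, err := net.ListenUDP("udp", &net.UDPAddr{Port: 8000})
	if err != nil {
		log.Fatal(err)
	}
	// Map each client two-tuple (source IP:port) to one backend, so
	// the same client's packets always reach the same SRS.
	assigned := map[string]*net.UDPConn{}
	next := 0
	buf := make([]byte, 1500)
	for {
		n, client, err := ln.ReadFromUDP(buf)
		if err != nil {
			continue
		}
		key := client.String() // the two-tuple, e.g. "1.2.3.4:5678"
		conn, ok := assigned[key]
		if !ok {
			addr, err := net.ResolveUDPAddr("udp", backends[next%len(backends)])
			if err != nil {
				continue
			}
			next++
			conn, err = net.DialUDP("udp", nil, addr)
			if err != nil {
				continue
			}
			assigned[key] = conn
			// Relay the backend's replies to this client.
			go func(c *net.UDPConn, client *net.UDPAddr) {
				b := make([]byte, 1500)
				for {
					m, err := c.Read(b)
					if err != nil {
						return
					}
					ln.WriteToUDP(b[:m], client)
				}
			}(conn, client)
		}
		conn.Write(buf[:n])
	}
}
```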

@winlinvip
Member Author

winlinvip commented Dec 25, 2022

@qiantaossx That is exactly how the proxy is designed: it inspects the request to see which stream the client is asking for, then forwards it to the backend server, so it works for origin servers of all protocols.

Also, regarding the origin server's state: complete statelessness is actually not achievable, but the dependency can be avoided through retries. For example, if an origin server dies, the publisher's retry will select a new origin, and a playback retry will locate that new origin again.

This way both the origin and the proxy can be deployed statelessly, making operations very simple.

PS: The origin cluster is actually stateless too; it is only treated as stateful because the current configuration requires an address. If it were improved to fetch the other origins' addresses from an API, it would be stateless as well.

PS: The proxy is tentatively planned for SRS 6.0, with development expected to start in early 2023; anyone interested is welcome to participate.

@winlinvip winlinvip modified the milestones: 5.0, 6.0 Jan 2, 2023
@jinleileiking
Copy link

I've been looking at LiveKit recently; its implementation is worth referencing: each pod discovers its own external address (through NAT) and puts that address into the SDP, which achieves clustered WebRTC origin distribution.

@winlinvip winlinvip linked a pull request Mar 25, 2023 that will close this issue
@winlinvip winlinvip changed the title Cluster: K8s: Proxy to extend origin servers. 代理支持扩展多个源站。 Cluster: K8s: Proxy to extend origin servers. Jul 18, 2023
@ossrs ossrs locked and limited conversation to collaborators Jul 18, 2023
@winlinvip winlinvip converted this issue into discussion #3634 Jul 18, 2023
@winlinvip winlinvip added TransByAI Translated by AI/GPT. and removed TransByAI Translated by AI/GPT. labels Jul 28, 2023
