RTC: Refine performance about 700+ streams. 4.0.71
winlinvip committed Feb 10, 2021
1 parent b7c7d65 commit b431ad7
Showing 2 changed files with 2 additions and 1 deletion.
1 change: 1 addition & 0 deletions README.md
@@ -155,6 +155,7 @@ For previous versions, please read:

## V4 changes

+* v4.0, 2021-02-10, RTC: Refine performance about 700+ streams. 4.0.71
* v4.0, 2021-02-08, RTC: Print stat for pli and timer every 5s. 4.0.70
* v4.0, 2021-02-07, RTC: Parse PT fast and refine udp handler. 4.0.69
* v4.0, 2021-02-05, RTC: Refine UDP packet peer fast id. 4.0.68
2 changes: 1 addition & 1 deletion trunk/src/core/srs_core_version4.hpp
@@ -24,6 +24,6 @@
#ifndef SRS_CORE_VERSION4_HPP
#define SRS_CORE_VERSION4_HPP

-#define SRS_VERSION4_REVISION 70
+#define SRS_VERSION4_REVISION 71

#endif

1 comment on commit b431ad7

@winlinvip (Member, Author) commented on b431ad7 Feb 28, 2021


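This round of RTC performance optimization greatly increased the number of streams that can be published and played. The key optimizations are listed below.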

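First, performance optimization is only possible with a solid benchmark tool for load testing. srs-bench, built on Pion, provides RTC benchmarks for both publishing and playing streams.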

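Next, when handling a large volume of UDP packets, ContextId is also something that needs optimizing:

  • Avoid setting the ContextId frequently. With a large volume of UDP packets, frequently creating and freeing ContextId objects has a significant cost: 19a7c76, e31169d
  • In the UDP receive coroutine, avoid setting the ContextId on every packet and switch it only on error: 5eafcea
  • Avoid a global map that caches ContextId information. Use the ST thread key instead, that is, read it directly from the private data of the current coroutine, because a ContextId is bound to a coroutine: 102434b
  • When a Connection handles RTP packets, do not switch the ContextId for RTP: 51e630d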
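
The idea of keeping the ContextId in per-coroutine private data instead of a global map can be sketched roughly as follows. This is only an illustration, not the SRS code: `ContextId`, `current_context_id` and `set_context_id` are hypothetical names, and a C++ `thread_local` slot stands in for the coroutine's private data that ST exposes through its per-thread key mechanism.

```cpp
#include <string>

// Hypothetical stand-in for a context id; this is not the SRS type.
struct ContextId {
    std::string value;
};

// One slot per execution context, so no global map lookup is needed.
// Assumption: in an ST-based server this slot would live in the
// coroutine's private data rather than in a thread_local.
inline ContextId& current_context_id() {
    thread_local ContextId cid;
    return cid;
}

// Write the slot only when the id actually changes, for example on an
// error path, instead of on every received packet.
inline bool set_context_id(const std::string& v) {
    ContextId& cid = current_context_id();
    if (cid.value == v) {
        return false;  // nothing to do, no object created or freed
    }
    cid.value = v;
    return true;
}
```

Reading the id then costs a single slot access, and the hot path for RTP packets never has to touch it.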

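In addition, because the port is shared, each UDP packet requires looking up its Connection and the object that handles it, and this lookup needs optimizing:

  • Improve the Publisher lookup by parsing the SSRC quickly instead of parsing the full RTP header each time: cd06f2d, 80985c7, dffbebf
  • Parse the PT quickly to decide whether to drop the packet, without parsing the full RTP header: 79a6907
  • Avoid dynamic cast. Every UDP packet has to find its Connection, so casting each time is expensive: 9c17721
  • Parse the TWCC SN quickly instead of parsing the full RTP header each time: 9a9efb8, 719df6f
  • Speed up the Connection lookup: instead of searching by a string five-tuple each time, use an 8-byte integer as the map key: 9519397, d41a925, c3414a3, 2b73c1c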
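
Two of the points above, peeking at the PT and SSRC at fixed offsets and using an 8-byte integer as the lookup key, might look like the sketch below. The field offsets follow the RTP fixed header in RFC 3550. Everything else is an assumption made for illustration: the function names, the `Connection` and `ConnectionTable` types, and the way the IPv4 address and port are packed into the key are not taken from SRS.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// RTP fixed header (RFC 3550): byte 1 is M (1 bit) + PT (7 bits),
// bytes 8..11 are the SSRC in network byte order.
constexpr std::size_t kRtpFixedHeaderSize = 12;

// Read only the payload type, without decoding the rest of the header.
inline bool peek_rtp_payload_type(const uint8_t* buf, std::size_t size, uint8_t& pt) {
    if (buf == nullptr || size < kRtpFixedHeaderSize) return false;
    pt = buf[1] & 0x7f;
    return true;
}

// Read only the SSRC, without decoding the rest of the header.
inline bool peek_rtp_ssrc(const uint8_t* buf, std::size_t size, uint32_t& ssrc) {
    if (buf == nullptr || size < kRtpFixedHeaderSize) return false;
    ssrc = (uint32_t(buf[8]) << 24) | (uint32_t(buf[9]) << 16) |
           (uint32_t(buf[10]) << 8) | uint32_t(buf[11]);
    return true;
}

// Pack an IPv4 peer address and port into one 8-byte key.
// Assumption: this bit layout is illustrative only.
inline uint64_t make_peer_key(uint32_t ipv4, uint16_t port) {
    return (uint64_t(port) << 32) | uint64_t(ipv4);
}

struct Connection { int id; };  // hypothetical placeholder type

// Integer-keyed table: comparing keys is a single 64-bit comparison
// rather than a string comparison of a formatted five-tuple.
class ConnectionTable {
public:
    void add(uint64_t key, Connection* c) { by_key_[key] = c; }
    Connection* find(uint64_t key) const {
        std::map<uint64_t, Connection*>::const_iterator it = by_key_.find(key);
        return it == by_key_.end() ? nullptr : it->second;
    }
private:
    std::map<uint64_t, Connection*> by_key_;
};
```

A receiver could call `peek_rtp_payload_type` first to discard unwanted packets, then `make_peer_key` and `find` to reach the Connection, and only afterwards decode the full header if it is still needed.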

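Then there are the basic memory and buffer optimizations:

  • Improve UDP packet receiving by reusing objects and buffers, avoiding frequent memory allocation: 2b85ad1, b020802
  • Improve encryption and decryption by reusing buffers instead of allocating a new one each time: 0c07459, ef279a8, aec2745, 7f4d8a4
  • Avoid copying objects unconditionally; defer the copy until it is actually needed: 43d4240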
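
As a rough illustration of buffer reuse, the sketch below allocates storage once and refills it for every datagram. `ReusablePacketBuffer` is a hypothetical class written for this example and does not mirror the SRS implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical receive buffer that is allocated once and then reused.
class ReusablePacketBuffer {
public:
    explicit ReusablePacketBuffer(std::size_t capacity)
        : storage_(capacity), size_(0) {}

    // Copy one datagram into the existing storage. Returns false when
    // the datagram does not fit; the storage is never reallocated.
    bool fill(const uint8_t* data, std::size_t n) {
        if (n > storage_.size()) return false;
        if (n > 0) std::memcpy(storage_.data(), data, n);
        size_ = n;
        return true;
    }

    const uint8_t* data() const { return storage_.data(); }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return storage_.size(); }

private:
    std::vector<uint8_t> storage_;  // allocated once in the constructor
    std::size_t size_;              // bytes valid for the current packet
};
```

The same pattern applies to scratch buffers for encryption and decryption: size the buffer for the largest expected packet and keep it alive for the lifetime of the connection.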

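Finally, optimizations to related mechanisms:

  • Improve the NACK check so that it is triggered by a timer rather than by every UDP packet (there are too many packets): 407ae1d
  • When ST has many entries in its sleep queue, for example timers or read/write calls with timeouts, timing errors within (0, 1ms) can occur and cause epoll_wait to spin idly: 9ada516
  • Enable ASM optimization by default, which has a large effect on SRTP: 6e3bd61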
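
The change to the NACK check can be pictured as splitting the work in two: the per-packet path only records the sequence number, and the scan for gaps runs when a timer fires. The sketch below is a simplified illustration with hypothetical names (`NackTracker`, `on_packet`, `on_timer`); it ignores sequence number wraparound and is not the SRS implementation.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical tracker: cheap bookkeeping per packet, gap scan on a timer.
class NackTracker {
public:
    // Per-packet path: record the sequence number only, no gap scan.
    void on_packet(uint16_t seq) { received_.insert(seq); }

    // Timer path: scan once for gaps between the lowest and highest
    // sequence numbers seen so far. Assumption: no wraparound handling.
    std::vector<uint16_t> on_timer() {
        ++checks_;
        std::vector<uint16_t> missing;
        if (received_.empty()) return missing;
        uint32_t lowest = *received_.begin();
        uint32_t highest = *received_.rbegin();
        for (uint32_t s = lowest; s <= highest; ++s) {
            if (received_.count(uint16_t(s)) == 0) {
                missing.push_back(uint16_t(s));
            }
        }
        return missing;
    }

    // Number of gap scans performed, to show it is independent of packet count.
    int checks() const { return checks_; }

private:
    std::set<uint16_t> received_;
    int checks_ = 0;
};
```

With this split, the number of gap scans depends on the timer interval rather than on the packet rate, which is the point of the change described above.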
