feat(server): finish the reactor server, but hit a problem during testing #26
Merged
Signed-off-by: Trino <sujun.trinoooo@gmail.com>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #26      +/-   ##
==========================================
+ Coverage   70.53%   72.81%   +2.28%
==========================================
  Files           3        4       +1
  Lines         733      401     -332
==========================================
- Hits          517      292     -225
+ Misses        140       83      -57
+ Partials       76       26      -50
View full report in Codecov by Sentry.
One more note: the problem occurs in ReactorServer. Please ignore the other server implementations; I will clean them up later.
Adding more detailed context:
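1. Added Prometheus metrics to make troubleshooting easier. Instead of having the application open a port for the Prometheus server to scrape, a PushGateway runs as a sidecar relay, mainly because during debugging the program runs only briefly and metrics are hard to collect.
2. Fixed `runtime: marked free object in span 0x12a642b58, elemsize=32 freeindex=18 (bad use of unsafe.Pointer? try -d=checkptr)`. The cause is that converting *conn to *byte through unsafe.Pointer leaves the part of the conn struct beyond 8 bytes without a reference, so it gets garbage-collected. (This is my own guess; I still need to dig into the runtime package to reach a firm conclusion. As for why `udata = *(**byte)(unsafe.Pointer(&conn))` was used: that is how cloudwego/netpoll does it.)
3. syscall.EINTR means a system call may be interrupted. Added a special case for it in the waiter, reactor, and acceptor goroutines, which would otherwise crash on any syscall error.
4. Added buffers to the channels between stages to improve how well the stages work together. Judging by accept QPS there is no obvious gain; I will measure request QPS later and report a conclusion.
5. Because of the accept bottleneck (10 connections accepted per second), under high concurrency a large number of requests are reset by the server once the backlog fills up. Next I plan to check whether my usage is correct, and to measure what pooling connection objects gains in memory allocation and reclamation.
Follow-up: switch kqueue to non-blocking, level-triggered mode; let the reactor handle protocol parsing and the handler deal purely with business logic.
Signed-off-by: Trino <sujun.trinoooo@gmail.com>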
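The special case for interrupted system calls in item 3 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `retryOnEINTR` and `flakyCall` are hypothetical names, and `flakyCall` only simulates a blocking call (such as a kqueue wait or accept) that is interrupted a few times before succeeding.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// retryOnEINTR (hypothetical helper) keeps calling fn while it fails with
// EINTR, which only means a signal interrupted the call, and returns the
// first result that is not EINTR.
func retryOnEINTR(fn func() (int, error)) (int, error) {
	for {
		n, err := fn()
		if errors.Is(err, syscall.EINTR) {
			continue // interrupted, not a real failure: try again
		}
		return n, err
	}
}

// flakyCall (hypothetical) simulates a blocking syscall that returns EINTR
// `failures` times and then reports one ready event.
func flakyCall(failures int) func() (int, error) {
	calls := 0
	return func() (int, error) {
		calls++
		if calls <= failures {
			return 0, syscall.EINTR
		}
		return 1, nil
	}
}

func main() {
	n, err := retryOnEINTR(flakyCall(2))
	fmt.Println("events:", n, "err:", err)
}
```

With a wrapper like this, a goroutine that treats every syscall error as fatal only sees errors other than EINTR.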
to the reactor; stop storing conn in kqueue's udata, because it causes strange problems. Signed-off-by: Trino <sujun.trinoooo@gmail.com>
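One possible alternative to keeping conn in udata is a table keyed by file descriptor, so that every connection stays referenced from ordinary Go memory that the garbage collector can see. The sketch below is an assumption about how that could look, not the code from this PR: the `conn` fields, `connTable`, and its methods are all hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for the server's connection type.
type conn struct {
	fd   int
	peer string
}

// connTable (hypothetical) maps fd -> *conn. The map holds an ordinary Go
// reference to each conn, so nothing relies on a pointer hidden in udata.
type connTable struct {
	mu    sync.RWMutex
	conns map[int]*conn
}

func newConnTable() *connTable {
	return &connTable{conns: make(map[int]*conn)}
}

// add registers a connection, typically right after accept.
func (t *connTable) add(c *conn) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.conns[c.fd] = c
}

// get looks a connection up by the fd reported in a ready event.
func (t *connTable) get(fd int) (*conn, bool) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	c, ok := t.conns[fd]
	return c, ok
}

// remove drops the connection, typically right before closing the fd.
func (t *connTable) remove(fd int) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.conns, fd)
}

func main() {
	table := newConnTable()
	table.add(&conn{fd: 42, peer: "127.0.0.1:50000"})

	// In an event loop, the fd would come from the event itself
	// (for a read filter, the kevent identifier is the descriptor).
	if c, ok := table.get(42); ok {
		fmt.Println("ready:", c.peer)
	}

	table.remove(42)
	_, ok := table.get(42)
	fmt.Println("still registered:", ok)
}
```

The trade-off is one map lookup and a lock per event, in exchange for no unsafe.Pointer conversions.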
RocooHash
approved these changes
Jun 24, 2024
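@RocooHash I recently put together a reactor server, but during testing I ran into a problem I cannot solve, so I am hoping you can help :)
Problem description:
When I run the test cases, the debug output shown below appears. It indicates that the socket is being closed on the server side for an unknown reason (this is as far as I could narrow the problem down, and it may not be accurate).
Further supporting evidence is a goroutine pprof snapshot, which shows that the reactors are all blocked in select, the waiters are all blocked in syscall, and the dispatcher is blocked receiving from connections:
Additional information:
The problem only appears once concurrency is raised above 18 (the threshold may differ between machines); my guess is that it is hard to reproduce at low concurrency. Thanks for looking into this! Please get in touch any time if you have ideas :)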
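For anyone trying to reproduce this, a goroutine snapshot like the one mentioned above can be captured from inside the process with the standard runtime/pprof package. The sketch below is only an illustration: `goroutineDump` and `hasState` are hypothetical helper names, and the parked goroutine in `main` merely stands in for a blocked reactor or dispatcher.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"time"
)

// goroutineDump (hypothetical helper) returns the stacks of all goroutines
// as text. With debug=2 each goroutine gets a header naming its wait state.
func goroutineDump() string {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return ""
	}
	return buf.String()
}

// hasState (hypothetical helper) reports whether any goroutine header in the
// dump starts with the given state, e.g. "select" or "chan receive".
func hasState(dump, state string) bool {
	return strings.Contains(dump, "["+state)
}

func main() {
	block := make(chan struct{})
	go func() { <-block }() // stands in for a goroutine stuck on a channel

	time.Sleep(50 * time.Millisecond) // give the goroutine time to park

	dump := goroutineDump()
	fmt.Println("parked on channel receive:", hasState(dump, "chan receive"))
	fmt.Println("blocked in select:", hasState(dump, "select"))
	close(block)
}
```

Printing the full dump instead of only checking states shows which stage each blocked goroutine belongs to.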