Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

butex_wake_all support nosignal flag #1751

Merged
merged 1 commit into from
Jun 6, 2022

Conversation

yanglimingcn
Copy link
Contributor

butex_wake*的需求 #1749

@@ -284,15 +286,19 @@ int butex_wake(void* arg, bool nosched) {
if (!nosched) {
TaskGroup::exchange(&g, bbw->tid);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exchange是不是也要传nosignal参数进去

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exchange是希望调度的,所以应该是默认signal的,此外exchange也没有nosignal的参数

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exchange的时候会挂起原来的bthread,这里nosignal的含义是是否立即调度原来的bthread,可以给exchange加个参数。就像bthread_start_urgent也可以传nosignal。

src/bthread/countdown_event.cpp Outdated Show resolved Hide resolved
src/bthread/butex.cpp Outdated Show resolved Hide resolved
src/bthread/butex.cpp Outdated Show resolved Hide resolved
src/bthread/butex.cpp Outdated Show resolved Hide resolved
@yanglimingcn yanglimingcn force-pushed the butex_wake_nosched branch 4 times, most recently from a1f5745 to 7307477 Compare April 26, 2022 06:43
@yanglimingcn
Copy link
Contributor Author

如果在TaskGroup::exchange的参数变成 TaskGroup::exchange(TaskGroup** pg, bthread_t next_tid, bool nosched, bool nosignal) 把nosched的逻辑放到exchange里面,整个代码逻辑能更简单些。但是exchange的本意就是要切换协程,不知道这样是否合理。

@wwbmmm
Copy link
Contributor

wwbmmm commented Apr 27, 2022

如果在TaskGroup::exchange的参数变成 TaskGroup::exchange(TaskGroup** pg, bthread_t next_tid, bool nosched, bool nosignal) 把nosched的逻辑放到exchange里面,整个代码逻辑能更简单些。但是exchange的本意就是要切换协程,不知道这样是否合理。

exchange本意是切换协程,加上nosched参数不太合理。但是nosignal参数是有意义的。

src/bthread/task_group_inl.h Outdated Show resolved Hide resolved
src/bthread/butex.cpp Outdated Show resolved Hide resolved
src/bthread/butex.cpp Outdated Show resolved Hide resolved
src/bthread/butex.cpp Outdated Show resolved Hide resolved
@yanglimingcn yanglimingcn force-pushed the butex_wake_nosched branch 2 times, most recently from 972e867 to 625af37 Compare May 13, 2022 07:22
@yanglimingcn
Copy link
Contributor Author

已经按照你的建议修改了,这样代码清晰很多

@yanglimingcn
Copy link
Contributor Author

yanglimingcn commented May 13, 2022

nosched参数是不是可以去掉了,如果nosignal是true,就不要调用exchange,而是调用ready_to_run
另外,我觉得exchange里面的nosignal=true会有问题,因为当前协程切出去了,再次回来需要一个flush,这个flush不知道谁去执行。这和其它地方的nosignal不一样,其它地方都是本协程唤醒其它协程,最后执行一次flush

@yanglimingcn yanglimingcn requested a review from wwbmmm May 15, 2022 23:25
@wwbmmm
Copy link
Contributor

wwbmmm commented May 16, 2022

nosched参数是不是可以去掉了,如果nosignal是true,就不要调用exchange,而是调用ready_to_run 另外,我觉得exchange里面的nosignal=true会有问题,因为当前协程切出去了,再次回来需要一个flush,这个flush不知道谁去执行。这和其它地方的nosignal不一样,其它地方都是本协程唤醒其它协程,最后执行一次flush

你的意思是只需要支持nosched=true && nosignal=true 和 nosched=false && nosignal=false这两种场景是吗?
如果暂时没有nosched=true && nosignal=false和nosched=false && nosignal=true这两种场景的需求,也可以先去掉nosched参数。将来有需求再添加

@yanglimingcn
Copy link
Contributor Author

yanglimingcn commented May 16, 2022

nosched参数是不是可以去掉了,如果nosignal是true,就不要调用exchange,而是调用ready_to_run 另外,我觉得exchange里面的nosignal=true会有问题,因为当前协程切出去了,再次回来需要一个flush,这个flush不知道谁去执行。这和其它地方的nosignal不一样,其它地方都是本协程唤醒其它协程,最后执行一次flush

你的意思是只需要支持nosched=true && nosignal=true 和 nosched=false && nosignal=false这两种场景是吗? 如果暂时没有nosched=true && nosignal=false和nosched=false && nosignal=true这两种场景的需求,也可以先去掉nosched参数。将来有需求再添加

嗯,这个需求最初的意思其实是让当前协程不被切换出去,尽量把一些事情处理完了,再统一去调度其他的协程。我再发出一版,辛苦你再Review一下。

@yanglimingcn yanglimingcn changed the title butex_wake_all support nosched flag butex_wake_all support nosignal flag May 16, 2022
@wwbmmm wwbmmm merged commit 341ad4d into apache:master Jun 6, 2022
guodongxiaren added a commit to guodongxiaren/incubator-brpc that referenced this pull request Jun 19, 2022
* rm ParseHostname and restore http_message.cpp

* reduce ParseUrl() for Init()

* update en/http_client.mp

* Update circuit_breaker.md

* Update overview.md

* community: update lorin's oncall report from 08/23/2021

* redesign InitSingle

* update docs

* Update oncall.md

Add caojin's  weekly oncall report

* add cases.md

* add ut

add ut

* Update oncall.md

add  link to helei's oncall report

* Add use case of Baidu

* Improve client doc

* Improve client en doc

* Improve error_code doc

* Improve error_code en doc

* Improve server doc

* Improve server en doc

* Improve cpu_profiler doc

* Improve status doc

* Improve status en doc

* Improve bthread_id doc

* EndPoint support ipv4 and unix domain socket

* Add EndPoint test for ipv4 and unix domain socket

* Support ipv6 and unix domain socket for brpc

* Support ipv6 and uds for example echo_c++

* Add doc for IPV6 and Unix domain socket

* Add test for consistent_hashing LB with IPV6 and UDS

* Update cases.md

* Create cases.md

* Fix noflush compile fail with glog

* Fix typo

* Update sampler.cpp

fix typo

* community: update oncall.md

* Fix LA selection runs too long

* wr/wrr policy degradation

* only support http not http2

* doc(circuit_breaker): add ema wiki link

* Update cases.md

* Update cases.md

* update oncall report

* community: add release doc

* Update release_cn.md

* release 1.0.0

* Update http2_rpc_protocol.cpp

grpc add H2 SETTINGS extension,remove this WARNING log。

* Update http2_rpc_protocol.cpp

* Add flag `pb_single_repeated_to_array'

Add an option to allow serialize/deserialize to/from a json array.

* Fix a test case on json to pb

* Add test case for `pb_single_repeated_to_array'

* Fix a typo of compress type

* Update client doc about channel destruction

* Fix typo

* fix小typo

found when following the instruction

* update gnp edit command options

* when http header parse error, send 400 response

* Exclude mesalink/openssl/ssl.h when find openssl

Resolve apache#1622

* change varable name

* fix status

* add release check related content

* Update releasecheck.md

* update docs

* Update images in docs to remove hostname and ip

* Add brpc logo

Supply a vector graphic logo of brpc(svg)

* Update gifs in docs to remove hostname and ip

* Update rpc_press doc to remove specific hostname

* Replace baidu internal wiki address

* minor, fix typo in Stream::OnReceived

* Update load_balancer.cpp

* added brpc in iqiyi

* added case in iqiyi

* minor, fix typo in execution_queue.md

* Update cpu profiler doc

* fix: rpc_view defer_close_second flag typo

* update oncall report

* doc: add use case

* docs(circuit_breaker): fix word error

* Add rpc_replay BUILD file

* Improvement: brpc support higher version of protobuf

* fix(input_messenger) client side retry policy

client retry parse message when baidu_std fall to streaming_rpc

* Update oncall.md

* Create newcommitter.md

* add http_host gflag for rpc_replay

* update doc for rpc_replay

* Support parse proto-text format http request body

* make msg more clear

* Update oncall.md

* change fatal to error

* add unittest for proto-text content-type

* add protobuf_text content-type ut

* docs: fix typo

* Fix work_stealing_queue_unittest for ARM.

* Fix LatencyRecorder qps not accurate

* Fix unstable RecorderTest case

* Fix unstable UT link error

* Fix unstable UT link error

* Add `;` for consistency to origin code.

* Typo Fix: Change Resourse to resource

* fix(input_messenger) update preferred index

fix comments

* Update flat_map_unittest.cpp insert test case

* [Compile] avoid compile error after gcc11 with --std=c++20

* flat_map: fix code annotation

* Update newcommitter.md

* Update Thrift download url to avoid pr build failed

thrift download url  `https://downloads.apache.org/thrift/0.11.0/thrift-0.11.0.tar.gz` has moved to `https://archive.apache.org/dist/thrift/0.11.0/thrift-0.11.0.tar.gz`

* ByteSize() is deprecated, use ByteSizeLong() instead

* Fix compiler fail after merge apache#1560 and apache#1460

* Fix merge conflicts

* Adjust parameter name to be more obvious

* Add docs for option `pb_single_repeated_to_array'

* Release 1.1.0

* Update newcommitter.md

* Update newcommitter.md

* Update year in NOTICE

* Add list-sigs in release step

* add macro

* use method instead of macro

* change to camel case

* Add vote results to incubator vote mail template

* doc: Update incubator vote mail template

* Update oncall.md

* doc: update download link in release doc

* [UPDATE] fix c struct compile error

solve this error. make it the same as bthread.h

* doc: Update announce step in release doc

* Oncall report from Apr 10th to Apr 16th

* add redis request empty notice

* add redis request empty notice

* add redis request empty notice

* Add redis auth doc

* Add oncall report Apr.17th - Apr.24th 2022

* client.md 文档优化

https:// 后面补一个空格,防止后面的中文被Github网页识别成链接的一部分

* fix auto https check

* Update oncall.md

* ``

* client.md 文档优化 (apache#1755)

* client.md 文档优化

https:// 后面补一个空格,防止后面的中文被Github网页识别成链接的一部分

* ``

* Update oncall.md

* comment fix

* Update client.md (apache#1767)

* json2pb::JsonToProtoMessage() supports parsing multiple jsons

* butex_wake* support nosignal flag, use bthread_flush signal batch (apache#1751)

* Fix compile error due to std limits header absent (apache#1764)

fix compile error like `incubator-brpc/src/brpc/redis_command.cpp:411:29: error: ‘numeric_limits’ is not a member of ‘std’`

* Fix _dl_sym undefined reference

* Fix a deadlock happened in ClearAbandonedStreamsImpl path(issues/1778) (apache#1781)

* Update newcommitter.md

* Update newcommitter.md

* Update newcommitter.md

* Update newcommitter.md

* Update newcommitter.md

* Support apple silicon

* Elaborate how-to-build steps for MacOS

* BugFix: Send WindowUpdate when ClearAbandonedStreams is called (apache#1786)

* add try catch for thrift

* update catch exception

* update exception

* update exception

* update exception

* Update oncall.md

* Fix discovery naming service core (apache#1802)

* Update weighted_randomized_load_balancer.cpp

* Update weighted_round_robin_load_balancer.cpp

Co-authored-by: LorinLee <lorinlee1996@gmail.com>
Co-authored-by: serverglen <serverglen@gmail.com>
Co-authored-by: Tanzhongyi(Jerry Tan) <13718272827@163.com>
Co-authored-by: Jiashun Zhu <zhujiashun2010@gmail.com>
Co-authored-by: tanzhongyi003 <jerrytan@apache.org>
Co-authored-by: jamesge <jge666@gmail.com>
Co-authored-by: wangweibing <wangweibing@baidu.com>
Co-authored-by: YIXIAO SHI <aieruishi@gmail.com>
Co-authored-by: jiangtao <hohojiang@126.com>
Co-authored-by: yangshuaijie <yangshuaijie@baidu.com>
Co-authored-by: huxiguo <uestc.hugo@gmail.com>
Co-authored-by: jiaozilang <qql7041@gmail.com>
Co-authored-by: caidj <31362185+cdjingit@users.noreply.github.com>
Co-authored-by: cdjingit <642580887@qq.com>
Co-authored-by: 李磊 <lilei.rd@bytedance.com>
Co-authored-by: Jairo <947600215@qq.com>
Co-authored-by: Zhangyi Chen <frozen.zju@gmail.com>
Co-authored-by: wxf <xiaofeng.wang@bqvision.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: egolearner <45122959+egolearner@users.noreply.github.com>
Co-authored-by: therealnick233 <nickpanmail@gmail.com>
Co-authored-by: 焦龙 <jiaolong02@meituan.com>
Co-authored-by: zhouwk <zhouwk@zenme.com>
Co-authored-by: wwbmmm <wwbmmm@163.com>
Co-authored-by: Clement Ma <mahongweichina@gmail.com>
Co-authored-by: Jiatao Tao <245915794@qq.com>
Co-authored-by: ehds <ehds@qq.com>
Co-authored-by: caidaojin <caidaojin@qiyi.com>
Co-authored-by: dl239 <dl239@126.com>
Co-authored-by: taoxu <taoxu@bilibili.com>
Co-authored-by: Shuai Zhang <zhangshuai.ustc@gmail.com>
Co-authored-by: HongboLiu <lhbf@qq.com>
Co-authored-by: Tudou <gao568321994@outlook.com>
Co-authored-by: TousakaRin <lhestz@163.com>
Co-authored-by: Yangshen⚡Deng <yangshen.d@bupt.edu.cn>
Co-authored-by: mwish <anmmscs_maple@qq.com>
Co-authored-by: Xiaofeng Wang <wasphin@gmail.com>
Co-authored-by: Franklin Shan <37760924+wolfdan666@users.noreply.github.com>
Co-authored-by: lzfhust <lzfhust@126.com>
Co-authored-by: devin.zhang <devin.zhang@shopee.com>
Co-authored-by: lei.li <lei.li@clickzetta.com>
Co-authored-by: 372046933 <372046933@users.noreply.github.com>
Co-authored-by: Yang,Liming <liming.yang@139.com>
Co-authored-by: GOGOYAO <shunyuyao@126.com>
Co-authored-by: helei.sig11 <helei.sig11@bytedance.com>
@MrGuin
Copy link
Contributor

MrGuin commented Dec 13, 2023

nosched参数是不是可以去掉了,如果nosignal是true,就不要调用exchange,而是调用ready_to_run 另外,我觉得exchange里面的nosignal=true会有问题,因为当前协程切出去了,再次回来需要一个flush,这个flush不知道谁去执行。这和其它地方的nosignal不一样,其它地方都是本协程唤醒其它协程,最后执行一次flush

no_signal 为 true,不调用 exchange,ready_to_run* 会传递 no_signal 参数,被唤醒的 bthread 不就无法被及时调度了吗?
我们有同样的场景,不希望当前执行 butex_wake 的 bthread 被 exchange 转去执行被唤醒的 bthread,所以使用了这个 feature,现在观察到被唤醒的 bthread 从唤醒到被调度执行会有上百毫秒的延迟,因为现在的实现 no_signal 传递给了 ready_to_run*,导致无法被及时调度处理。

@MrGuin
Copy link
Contributor

MrGuin commented Dec 13, 2023

现在我这里每次 butex_wake 后主动调 bthread_flush 解决了不能及时调度的问题。
我看了下上下文,感觉当前的 no_signal 的实现针对场景是消费者 bthread 批量 butex_wake,然后再一次 bthread_flush,no_signal 参数不希望被调度,顺带也不会把当前 bthread 给 exchange 过去。但如果消费者无法批量去 butex_wake 最后统一 flush 的话,就只能传 no_signal 参数然后手动去 bthread_flush 了。
比如我们是 braft 状态机 on_apply 线程,每次修改状态机然后唤醒等待在 butex 上的 RPC线程,因为每个 raft group 的 on_apply 是严格串行的,是性能瓶颈,所以我们不希望它在唤醒别人的时候被切走,所以传了 no_signal,但被唤醒的 RPC bthread 也是要及时调度执行的,不然会影响 RPC 服务的 responsiveness。如果只是要达到不让当前调用 butex_wake 的线程被 exchange 走的效果,只能 no_signal 然后 bthread_flush,感觉不够优雅。
@wwbmmm 佬怎么看呢?

@wwbmmm
Copy link
Contributor

wwbmmm commented Dec 19, 2023

只能 no_signal 然后 bthread_flush,感觉不够优雅

虽然不够优雅,但能解决问题吧,再加参数的话感觉这个接口也太复杂了。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants