Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: fix data race around spop Dobinlog #2674

Merged

Conversation

cheniujh
Copy link
Collaborator

@cheniujh cheniujh commented May 23, 2024

这个PR针对Issue #2673

引发Issue #2673的原因概述:Spop从指定set中随机移除一个或者多个元素,所以Binlog不能直接发spop给从节点,否则在从节点中spop所移除的元素可能和主节点不一样,所以之前有PR #2541 进行过相应修复,让Spop命令实际产生的binlog是Srem,该PR大体上没有问题,就是遗漏了针对SpopCmd这个类的拷贝构造函数的重写,一般来说,其他普通的的Cmd都是没有重写拷贝构造,所以从全局的cmd_table进行Clone的时候,一律都是值拷贝。
但是Spop因为对Binlog进行过修改,所以不能直接值拷贝,原因如下:Spop为了能写Srem的Binlog,给自己新增了一个成员:std::shared_ptr srem_cmd_ ,这个成员在构造时会进行make_shared赋值。而并发执行Spop命令时,所有的SpopCmd都是从全局的cmd_table中clone过来的,Spop没有重写拷贝构造的话,srem_cmd_就是浅拷贝,也就是所有并发的SpopCmd在写binlog的时候,用的都是同一个SremCmd对象,进而引发了data race等问题。

所以这个PR进行的修补就是给Spop重写了拷贝构造,确保每个Spop实例都使用一个独立的SremCmd对象来写Binlog。

This PR addresses Issue #2673

Summary of the cause of Issue #2673: SPOP removes one or more elements randomly from a specified set. Therefore, the Binlog cannot directly send SPOP to the slave node, as the elements removed by SPOP on the slave node may differ from those on the master node. Hence, in a previous PR #2541, a fix was implemented to have the Binlog for the SPOP command actually generate an SREM command. This PR was largely correct but missed overriding the copy constructor for the SpopCmd class. Generally, other common Cmd classes do not override the copy constructor, so when cloning from the global cmd_table, a value copy is made. However, because Spop modifies the Binlog, it cannot be value copied directly for the following reason: To write the SREM Binlog, Spop adds a new member: std::shared_ptr<SremCmd> srem_cmd_, which is assigned via make_shared during construction. When SPOP commands are executed concurrently, all SpopCmd instances are cloned from the global cmd_table. Without overriding the copy constructor, srem_cmd_ is shallow copied, meaning all concurrent SpopCmd instances use the same SremCmd object to write the Binlog, leading to data races and other issues.

Therefore, this PR's fix is to override the copy constructor for Spop to ensure each Spop instance uses an independent SremCmd object to write the Binlog.

@github-actions github-actions bot added the ☢️ Bug Something isn't working label May 23, 2024
@AlexStocks AlexStocks merged commit 1b2bfbe into OpenAtomFoundation:unstable May 24, 2024
12 checks passed
chenbt-hz pushed a commit to chenbt-hz/pika that referenced this pull request Jun 3, 2024
* fix spop binlog data race

* adjust format

---------

Co-authored-by: cjh <1271435567@qq.com>
@cheniujh cheniujh deleted the fix_data_race_spop_binlog branch June 24, 2024 03:21
@cheniujh cheniujh restored the fix_data_race_spop_binlog branch June 25, 2024 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.5.5 4.0.0 ☢️ Bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants