bugfix: fix the issue that the Lua script is not synchronized when the redis sentinel master node is down #5990

Merged
merged 22 commits into from Nov 6, 2023

Conversation

PeppaO
Contributor

@PeppaO PeppaO commented Nov 2, 2023

  • I have registered the PR changes.

Ⅰ. Describe what this PR did

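With Redis master-replica replication, after the master node goes down and a replica is elected as the new master, run script load again whenever a NOSCRIPT exception is encountered.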
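
A minimal sketch of the reload-and-retry behavior described above. It is not the code from this PR: `ScriptClient`, `NoScriptException`, `NoScriptRetry`, and `FakeRedis` are hypothetical stand-ins for the Jedis client and its NOSCRIPT exception, so the example runs without a Redis server.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the client's NOSCRIPT error (not the real Jedis class). */
class NoScriptException extends RuntimeException {
    NoScriptException(String message) {
        super(message);
    }
}

/** Hypothetical minimal view of the two Redis calls the sketch needs. */
interface ScriptClient {
    Object evalsha(String sha, List<String> keys, List<String> args);

    String scriptLoad(String script);
}

final class NoScriptRetry {
    private NoScriptRetry() {
    }

    /**
     * Runs a cached script by SHA. If the server reports NOSCRIPT (for example
     * because a newly promoted master never received the script), loads the
     * script source again and retries once with the SHA the server returns.
     */
    static Object evalWithReload(ScriptClient client, String sha, String scriptSource,
                                 List<String> keys, List<String> args) {
        try {
            return client.evalsha(sha, keys, args);
        } catch (NoScriptException e) {
            String reloadedSha = client.scriptLoad(scriptSource);
            return client.evalsha(reloadedSha, keys, args);
        }
    }
}

/** In-memory fake with an empty script cache, imitating a freshly promoted master. */
class FakeRedis implements ScriptClient {
    private final Map<String, String> cache = new HashMap<>();
    int loadCount = 0;

    @Override
    public Object evalsha(String sha, List<String> keys, List<String> args) {
        if (!cache.containsKey(sha)) {
            throw new NoScriptException("NOSCRIPT No matching script.");
        }
        return "ran:" + cache.get(sha);
    }

    @Override
    public String scriptLoad(String script) {
        loadCount++;
        // Fake digest for the demo only; a real server returns the SHA-1 of the script.
        String sha = Integer.toHexString(script.hashCode());
        cache.put(sha, script);
        return sha;
    }
}
```

Calling `NoScriptRetry.evalWithReload(new FakeRedis(), "stale-sha", "return 1", keys, args)` takes the NOSCRIPT path, loads the script once, and then succeeds on the retry.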

Ⅱ. Does this pull request fix one issue?

#5936

Ⅲ. Why don't you add test cases (unit test/integration test)?

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

@PeppaO PeppaO changed the title from "bugfix: fix the redis master node does not synchronize the eval scriptLoad operation causing NOSCRIPT" to "bugfix: fix the issue that the Lua script is not synchronized when the redis sentinel master node is down" Nov 3, 2023
@funky-eyes funky-eyes added this to the 2.0.0 milestone Nov 3, 2023
@funky-eyes funky-eyes added type: bug Category issues or prs related to bug. module/server server module labels Nov 3, 2023
Contributor

@funky-eyes funky-eyes left a comment

todo: The Lua scripts are kept external, rather than in the source code, so that when a bug is found in a script's execution someone can manually fix the script and run script load again. For now this looks like a pseudo-requirement; re-recording the SHA1 digest can wait until users with a real need for it appear. As it stands, this PR's handling is sufficient. Even if there are thread-safety issues, for example script load may be executed more than once, that has no effect on the execution result, and the probability of the master node going down is not high, so there is no need for thread safety across the distributed deployment. It may be worth considering whether a single node can avoid multiple script loads by locking at the granularity of the Lua file name.
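
One way the per-file lock suggested above could look on a single node, as a sketch only. `PerFileScriptLoader` and its parameters are hypothetical names, not part of this PR; the "already loaded" check and the load action are passed in as callbacks so the example stays independent of any Redis client.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: serializes script loading per Lua file name within one JVM. */
final class PerFileScriptLoader {
    // One lock object per Lua file name, created lazily and atomically.
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    /**
     * Runs {@code load} at most once per burst of concurrent callers for the same file:
     * the first caller loads, later callers see {@code alreadyLoaded} return true and skip.
     */
    void loadOnce(String luaFileName, BooleanSupplier alreadyLoaded, Runnable load) {
        Object lock = locks.computeIfAbsent(luaFileName, name -> new Object());
        synchronized (lock) {
            // Re-check inside the lock: another thread may have loaded while we waited.
            if (!alreadyLoaded.getAsBoolean()) {
                load.run();
            }
        }
    }
}
```

With a real client, `alreadyLoaded` could plausibly be backed by a SCRIPT EXISTS check and `load` by SCRIPT LOAD; that mapping is an assumption here. The lock only protects threads in one process, which matches the comment's point that cross-node coordination is not needed.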


private static final String WHITE_SPACE = " ";

private static final String ANNOTATION_LUA = "--";

private static Map<String, String> LUA_FILE_MAP = new HashMap<>();
Contributor

static final

Contributor Author

done

try {
return jedis.evalsha(luaSHA, keys, args);
}catch (JedisNoScriptException e) {
LOGGER.warn("jedis ex: " + e.getMessage());
Contributor

The log message here should explain what is happening; otherwise users will think something has gone wrong, when in fact we go on to reload the script and re-execute the Lua script.

Contributor Author

done

codecov bot commented Nov 3, 2023

Codecov Report

Merging #5990 (643dbf4) into 2.x (967906b) will decrease coverage by 0.73%.
The diff coverage is 13.63%.

❗ Current head 643dbf4 differs from pull request most recent head 92d95ec. Consider uploading reports for the commit 92d95ec to get more accurate results

Additional details and impacted files

@@             Coverage Diff              @@
##                2.x    #5990      +/-   ##
============================================
- Coverage     49.29%   48.56%   -0.73%     
+ Complexity     4734     4668      -66     
============================================
  Files           911      911              
  Lines         31305    31297       -8     
  Branches       3772     3768       -4     
============================================
- Hits          15432    15200     -232     
- Misses        14339    14584     +245     
+ Partials       1534     1513      -21     
Files Coverage Δ
...seata/server/storage/redis/JedisPooledFactory.java 24.48% <0.00%> (-1.05%) ⬇️
.../java/io/seata/server/storage/redis/LuaParser.java 36.73% <37.50%> (+0.14%) ⬆️
...eata/server/storage/redis/lock/RedisLuaLocker.java 17.39% <0.00%> (ø)
...e/redis/store/RedisLuaTransactionStoreManager.java 9.09% <0.00%> (ø)

... and 26 files with indirect coverage changes

@leizhiyuan leizhiyuan self-requested a review November 6, 2023 02:19
Contributor

@leizhiyuan leizhiyuan left a comment

LGTM

Contributor

@funky-eyes funky-eyes left a comment

LGTM

@funky-eyes funky-eyes merged commit b1c43b9 into apache:2.x Nov 6, 2023
6 checks passed
Labels
module/server server module type: bug Category issues or prs related to bug.
3 participants