[Bug]: Consistency verification failed during stability test in distributed mode #15452

Closed
1 task done
aressu1985 opened this issue Apr 11, 2024 · 4 comments

Comments

@aressu1985
Contributor

aressu1985 commented Apr 11, 2024

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch Name

1.1-dev

Commit ID

55eb0f9

Other Environment Information

- Hardware parameters:
3*CN: 16C 64G
1*DN: 16C 64G
3*LOG: 4C 16G
- OS type:
- Others:

Actual Behavior

After about 10 hours of stability testing in distributed mode, the TPC-C consistency verification failed with:

2024-04-11 11:02:05 INFO ConsistencyCheck:109 - db=mo
2024-04-11 11:02:05 INFO ConsistencyCheck:109 - driver=com.mysql.cj.jdbc.Driver
2024-04-11 11:02:05 INFO ConsistencyCheck:109 - conn=jdbc:mysql://10.222.6.254:6001/tpcc_10?characterSetResults=utf8&continueBatchOnError=false&useServerPrepStmts=true&alwaysSendSetIsolation=false&useLocalSessionState=true&zeroDateTimeBehavior=CONVERT_TO_NULL&failoverReadOnly=false&serverTimezone=Asia/Shanghai&useSSL=false&socketTimeout=60000
2024-04-11 11:02:05 INFO ConsistencyCheck:109 - user=tpcc_test:admin
2024-04-11 11:02:11 ERROR ConsistencyCheck:86 - Consistency verification failed for sql : (select d_w_id, sum(d_ytd) from bmsql_district group by d_w_id) except(Select w_id, w_ytd from bmsql_warehouse);
2024-04-11 11:02:11 ERROR ConsistencyCheck:87 - The exceptional result are :
5 21999220.40

2024-04-11 11:02:11 ERROR ConsistencyCheck:86 - Consistency verification failed for sql : (Select w_id, w_ytd from bmsql_warehouse) except (select d_w_id, sum(d_ytd) from bmsql_district group by d_w_id);
2024-04-11 11:02:11 ERROR ConsistencyCheck:87 - The exceptional result are :
5 21998714.97

For each warehouse, sum(d_ytd) in bmsql_district must equal w_ytd in bmsql_warehouse.
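
The failed check can be reproduced in miniature. The following is a minimal sketch, not the test harness's actual code: it runs the same two EXCEPT queries against an in-memory SQLite database. The trimmed-down schema, the integer-cent amounts, the sample rows, and the helper names `build_demo_db` and `find_ytd_mismatches` are all assumptions made for illustration. SQLite does not accept parenthesized operands around EXCEPT, so the queries are written without them.

```python
import sqlite3

# Hypothetical, trimmed-down schema: only the columns the check touches.
# Amounts are stored as integer cents so the comparison is exact.
SCHEMA = """
CREATE TABLE bmsql_warehouse (w_id INTEGER PRIMARY KEY, w_ytd INTEGER);
CREATE TABLE bmsql_district (
    d_w_id INTEGER, d_id INTEGER, d_ytd INTEGER,
    PRIMARY KEY (d_w_id, d_id)
);
"""

DISTRICT_SIDE = "SELECT d_w_id, SUM(d_ytd) FROM bmsql_district GROUP BY d_w_id"
WAREHOUSE_SIDE = "SELECT w_id, w_ytd FROM bmsql_warehouse"


def build_demo_db():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Warehouse 1 is consistent; warehouse 2 is missing 50 in w_ytd.
    conn.executemany(
        "INSERT INTO bmsql_warehouse VALUES (?, ?)", [(1, 300), (2, 500)]
    )
    conn.executemany(
        "INSERT INTO bmsql_district VALUES (?, ?, ?)",
        [(1, 1, 100), (1, 2, 200), (2, 1, 250), (2, 2, 300)],
    )
    return conn


def find_ytd_mismatches(conn):
    """Run the check in both directions; two empty lists mean consistent."""
    only_district = conn.execute(
        f"{DISTRICT_SIDE} EXCEPT {WAREHOUSE_SIDE}"
    ).fetchall()
    only_warehouse = conn.execute(
        f"{WAREHOUSE_SIDE} EXCEPT {DISTRICT_SIDE}"
    ).fetchall()
    return only_district, only_warehouse


if __name__ == "__main__":
    district_rows, warehouse_rows = find_ytd_mismatches(build_demo_db())
    print("district side only:", district_rows)
    print("warehouse side only:", warehouse_rows)
```

Running the check in both directions reports the mismatching warehouse once per side, which is why the log above shows warehouse 5 twice with two different totals.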

The test environment is on srv-128. Log in with:
mysql -h10.222.6.254 -utpcc_test:admin -p111 -P6001

mo-log:
https://shanghai.idc.matrixorigin.cn:30001/explore?panes=%7B%22bYQ%22:%7B%22datasource%22:%22loki%22,%22queries%22:%5B%7B%22refId%22:%22A%22,%22expr%22:%22%7Bnamespace%3D%5C%22mo-nightly-55eb0f9-20240411%5C%22%7D%20%7C%3D%20%60%60%22,%22queryType%22:%22range%22,%22datasource%22:%7B%22type%22:%22loki%22,%22uid%22:%22loki%22%7D,%22editorMode%22:%22builder%22%7D%5D,%22range%22:%7B%22from%22:%221712767760731%22,%22to%22:%221712805399352%22%7D%7D%7D&schemaVersion=1&orgId=1

Expected Behavior

No response

Steps to Reproduce

1. Start an MO cluster.
2. Run the TPC-H 10G queries in a loop.
3. Run the TPC-C 10-10 long-running test.
4. Run the sysbench mixed-case long-running test.

Additional information

No response

@aressu1985
Contributor Author

image

It was not caused by a single transaction error.

@LeftHandCold
Contributor

LeftHandCold commented Apr 11, 2024

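After investigation, the difference between the two tables has stayed at "4495.9" the whole time and has not kept changing as described, so we suspect the problem was caused by a single transaction. Looking through the statement info, we found a suspicious transaction, and all of the information matches.
image
Using the transaction ID, we found that this transaction was an orphan transaction at the time, which means the primary keys it processed were not locked.
image
After discussion, we learned that two TPC-C clients were running concurrently in this case, and analysis of the DN data confirms that data for the same primary key was overwritten.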
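
The overwrite described above has the shape of a classic lost update. The following toy model is only a sketch of that general pattern under stated assumptions, not MatrixOne's lock service or the actual transaction path: the function `apply_payments`, the starting balance, and the payment amounts are hypothetical, and the district totals are assumed to be updated correctly because they live on other rows.

```python
def apply_payments(payments, lock_rows, start_ytd=1000):
    """Simulate overlapping read-modify-write updates of one warehouse row.

    payments:  amounts added by concurrent transactions (hypothetical values)
    lock_rows: True  -> each transaction holds the row lock for its whole
                        read-modify-write, so the updates serialize
               False -> every transaction reads before any of them writes,
                        as can happen when the primary key is not locked
    Returns (w_ytd, district_total) after all transactions commit.
    """
    w_ytd = start_ytd
    district_total = start_ytd

    if lock_rows:
        for amount in payments:
            snapshot = w_ytd            # read under lock
            w_ytd = snapshot + amount   # write before the lock is released
    else:
        # All transactions read the same stale value first.
        snapshots = [w_ytd for _ in payments]
        for snapshot, amount in zip(snapshots, payments):
            # Each write is based on the stale read, so a later write
            # overwrites the earlier one instead of adding to it.
            w_ytd = snapshot + amount

    # Assumption: the district rows are distinct and receive every payment.
    district_total += sum(payments)
    return w_ytd, district_total


if __name__ == "__main__":
    print("with locking:   ", apply_payments([100, 50], lock_rows=True))
    print("without locking:", apply_payments([100, 50], lock_rows=False))
```

In this model the unlocked run leaves the warehouse total below the district total by a fixed amount, which would not change afterwards unless another overwrite occurred.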

@zhangxu19830126
Contributor

fixed

@aressu1985
Contributor Author

done
