Fix unstable tests reindextable_while_reindex_idx #156

Merged 1 commit on Aug 24, 2023

@Ray-Eldath (Contributor) commented Aug 18, 2023

closes: #26


Change logs

Three unstable tests matching reindex/reindextable_while_reindex_idx_*_part_btree are fixed. They failed in the same way, so they can be fixed in the same manner.

These tests assume that when txn1 commits, its lock is acquired by txn3, leaving txn2 blocked. Normally this is the case, but when the system load is high (e.g., when parallel is enabled, or other processes stress the system), the scheduler may pass the access exclusive lock acquired by txn1 to txn2 (not txn3) in L22-L23, invalidating the assumption and causing the test to fail. I work around this by no longer relying on the single expected file, which cannot check both behaviours; instead, an SQL statement directly checks the correctness of one column.
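To illustrate why a single expected file is fragile here, the following is a minimal Python sketch, not the actual test code. It replays the two waiting transactions in both possible lock-grant orders and compares a golden-log check against an order-independent check. The names `run_waiters`, `golden_check`, `invariant_check` and the starting value `100` are hypothetical and chosen only for illustration.

```python
from itertools import permutations

INITIAL_RELFILENODE = 100  # hypothetical catalog value before any reindex


def run_waiters(grant_order):
    """Replay the blocked transactions in the order the lock is granted."""
    log = []
    relfilenode = INITIAL_RELFILENODE
    for txn in grant_order:
        # Assumption: each waiter rebuilds the index, which changes the value.
        relfilenode += 1
        log.append(f"{txn}: REINDEX done")
    return log, relfilenode


# A single expected output can encode only one ordering (txn3 before txn2).
EXPECTED_LOG = ["txn3: REINDEX done", "txn2: REINDEX done"]


def golden_check(log):
    """Order-sensitive: passes only if the log matches the expected file."""
    return log == EXPECTED_LOG


def invariant_check(before, after):
    """Order-independent: only asks whether the index was rebuilt."""
    return after != before


results = {}
for order in permutations(["txn2", "txn3"]):
    log, after = run_waiters(order)
    results[order] = (
        golden_check(log),
        invariant_check(INITIAL_RELFILENODE, after),
    )

for order, (golden_ok, invariant_ok) in results.items():
    print(order, "golden:", golden_ok, "invariant:", invariant_ok)
```

The golden-log check passes for only one grant order, while the check on the final value passes for both, which is the property the SQL-based check in this PR relies on.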

This problem is also found in the upstream, mainline GPDB.


Note that this is an initial fix for one of the three problematic cases. I'll fix the others once this passes review.

@avamingli (Collaborator)

LGTM, please fix other similar cases.

See: Issue#26 <cloudberrydb#26 (comment)>
@Ray-Eldath (Contributor Author) commented Aug 18, 2023

All three cases have now been fixed.

@my-ship-it my-ship-it merged commit be2bddc into cloudberrydb:main Aug 24, 2023
6 checks passed
@Ray-Eldath Ray-Eldath deleted the fix-26 branch August 24, 2023 07:00
Development

Successfully merging this pull request may close these issues.

[Bug] Regression reindextable_while_reindex_idx_ao_part_btree failed occasionally