ADBDEV-5187: Fix case pg_rewind_fail_missing_xlog #902
Allure report https://allure.adsw.io/launch/67378
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1201318
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1201319
Failed job Regression tests with Postgres on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1201310
Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1201312
Failed job Regression tests with Postgres on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1201311
Failed job Regression tests with ORCA on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1201313
Allure report https://allure.adsw.io/launch/67991
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1228565
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1228566
Allure report https://allure.adsw.io/launch/68038
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1229441
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1229442
Allure report https://allure.adsw.io/launch/68102
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1231287
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1231288
Allure report https://allure.adsw.io/launch/68323
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1241065
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1241066
I hit this failure about once in ~20 iterations of the test (run locally). Have you encountered it?
@@ -488,10 +2019,11 @@
-- Wait for the segment promotion finished and accept the connection
3: select connectSeg(600,port,hostname) from gp_segment_configuration where content = 1 and role = 'p';
- connectseg
-------------
- t
-(1 row)
+ERROR: Exception: wait connection timeout (plpy_elog.c:114)
+CONTEXT: Traceback (most recent call last):
+ PL/Python function "connectseg", line 14, in <module>
+ raise Exception("wait connection timeout")
+PL/Python function "connectseg"
-- Reset faults and confirm FTS configuration
3: SELECT gp_inject_fault('wal_sender_loop', 'reset', dbid) FROM gp_segment_configuration WHERE role='p' AND content = 1;
gp_inject_fault
The description has a couple of issues (inherited from the original commit); it would be nice to fix them:
Is a comma missing in "connection failed error occurs"?
"role change" -> "role changes"?
No
changed
Allure report https://allure.adsw.io/launch/68746
Thus I think it is better not to block on this. But we'll need to monitor the CI tests for a while to ensure that it isn't reproduced there.
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1268021
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1268023
@RekGRpth, Fix Co-authored-by
fixed
When the fault `checkpoint_after_redo_calculated` or `checkpoint_control_file_updated` is injected on a newly promoted mirror node, the connection fails with an error caused by the fatal log 'mirror is being promoted'. That is, when the connection state is MIRROR_READY but the role changes to primary during promotion, the connection is declined to avoid confusion.
Wait for the segment promotion to finish and for the segment to accept connections before injecting the fault.
Co-authored-by: Xing Guo <higuoxing@gmail.com>
This is a backport of commit 01d9c59.
The connectSeg function has been adapted for the old Python.
Allure report https://allure.adsw.io/launch/68833
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1272128
Failed job Regression tests with ORCA on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1274526
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/1275457
Fix case pg_rewind_fail_missing_xlog
When the fault `checkpoint_after_redo_calculated` or
`checkpoint_control_file_updated` is injected on a newly promoted
mirror node, the connection fails with an error caused by the fatal
log 'mirror is being promoted'. That is, when the connection state is
MIRROR_READY but the role changes to primary during promotion, the
connection is declined to avoid confusion.
Wait for the segment promotion to finish and for the segment to
accept connections before injecting the fault.
Co-authored-by: Xing Guo <higuoxing@gmail.com>
This is a backport of commit 01d9c59.
The connectSeg function has been adapted for the old Python.
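For readers unfamiliar with the pattern, the "wait until the segment accepts connections" step can be sketched roughly as below. This is not the actual connectSeg from the test suite: the name `wait_for_connection`, its parameters, and the injectable `try_connect` callable are all hypothetical, chosen only to illustrate the retry-then-time-out logic whose failure ("wait connection timeout") appears in the diff above.

```python
import time


def wait_for_connection(try_connect, retries, delay=1.0, sleep=time.sleep):
    """Retry try_connect() until it succeeds or the retries run out.

    try_connect is a hypothetical callable that raises on failure
    (for example, while the segment still reports
    'mirror is being promoted'). Written without Python-3-only
    syntax so it also runs on an old interpreter.
    """
    for _ in range(retries):
        try:
            try_connect()
            return True
        except Exception:
            # Segment not ready yet: back off and try again.
            sleep(delay)
    # Mirrors the message seen in the failing test output.
    raise Exception("wait connection timeout")
```

In the real test, the connection attempt would target the promoted segment's host and port taken from gp_segment_configuration; here it is left as a callable so the sketch stays self-contained.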