[Platform][Backup] Backup is failing in itests #10907
Hi @OlegLoginov, can you do the 2.8 backport ASAP? The 2.8.1 release is going out later today. Thanks.
OlegLoginov added a commit that referenced this issue on Dec 21, 2021
Summary: The previous fix changed the `YBBackup.find_data_dirs` method:
Commit: d6b5658
Diff: https://phabricator.dev.yugabyte.com/D14323

In the old implementation, the method called `run_ssh_cmd()` (to run `egrep` on every TS). `run_ssh_cmd()` implicitly runs `upload_cloud_config()` for every TS. After that fix, the config upload happens in the next step, in `find_snapshot_directories()`. That method is called in parallel for every data dir, so if a TS has 2 dirs, `upload_cloud_config()` is implicitly called twice. (The second upload should fail because the config file is already present on the remote node.)

This fix calls `upload_cloud_config()` explicitly beforehand, to prevent the race among multiple parallel calls of `find_snapshot_directories()`.

Test Plan:
ybd --cxx-test tools_yb-backup-test_ent
ybd --java-test org.yb.pgsql.TestYbBackup --tp 1
ybd --java-test org.yb.cql.TestYbBackup --tp 1
ybd --java-test org.yb.cql.ParameterizedTestYbBackup --tp 1

Reviewers: mihnea, achauhan
Reviewed By: achauhan
Subscribers: jenkins-bot, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D14441
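The ordering described in this commit message can be illustrated with a minimal sketch. This is not the actual `yb_backup.py` code: the `BackupSketch` class, the `find_all` helper, and the stubbed method bodies are hypothetical, and the use of `concurrent.futures` is an assumption made only for illustration. The idea shown is to upload the config once per tablet server, serially, and only then fan out the per-data-dir work in parallel.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BackupSketch:
    """Hypothetical stand-in for YBBackup; not the real yb_backup.py code."""

    def __init__(self):
        self.uploads = []              # records every upload call, in order
        self._lock = threading.Lock()  # protects self.uploads

    def upload_cloud_config(self, server):
        # Stand-in for the real upload (copying cloud_cfg to the node).
        with self._lock:
            self.uploads.append(server)

    def find_snapshot_directories(self, server, data_dir):
        # Stand-in: the real method runs a remote command per data dir.
        # In this sketch it never triggers an upload itself.
        return (server, data_dir)

    def find_all(self, dirs_by_server):
        # Step 1: upload the config exactly once per tablet server,
        # serially, before any parallel work starts.
        for server in dirs_by_server:
            self.upload_cloud_config(server)

        # Step 2: fan out over (server, data_dir) pairs in parallel.
        pairs = [(s, d) for s, dirs in dirs_by_server.items() for d in dirs]
        with ThreadPoolExecutor(max_workers=8) as pool:
            return list(
                pool.map(lambda p: self.find_snapshot_directories(*p), pairs)
            )
```

With two data dirs on one server, `find_all` still records a single upload for that server, which is the property the fix is after.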
OlegLoginov added a commit that referenced this issue on Dec 21, 2021
…uploading.

Summary: The previous fix changed the `YBBackup.find_data_dirs` method:
Commit: d6b5658
Diff: https://phabricator.dev.yugabyte.com/D14323

In the old implementation, the method called `run_ssh_cmd()` (to run `egrep` on every TS). `run_ssh_cmd()` implicitly runs `upload_cloud_config()` for every TS. After that fix, the config upload happens in the next step, in `find_snapshot_directories()`. That method is called in parallel for every data dir, so if a TS has 2 dirs, `upload_cloud_config()` is implicitly called twice. (The second upload should fail because the config file is already present on the remote node.)

This fix calls `upload_cloud_config()` explicitly beforehand, to prevent the race among multiple parallel calls of `find_snapshot_directories()`.

Original commit: 6ca4b2d
Original diff: https://phabricator.dev.yugabyte.com/D14441

Test Plan:
Jenkins: rebase: 2.8, hot
ybd --cxx-test tools_yb-backup-test_ent
ybd --java-test org.yb.pgsql.TestYbBackup --tp 1
ybd --java-test org.yb.cql.TestYbBackup --tp 1
ybd --java-test org.yb.cql.ParameterizedTestYbBackup --tp 1

Reviewers: mihnea, achauhan
Reviewed By: achauhan
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D14457
OlegLoginov added a commit that referenced this issue on Dec 22, 2021
…uploading.

Summary: The previous fix changed the `YBBackup.find_data_dirs` method:
Commit: d6b5658
Diff: https://phabricator.dev.yugabyte.com/D14323

In the old implementation, the method called `run_ssh_cmd()` (to run `egrep` on every TS). `run_ssh_cmd()` implicitly runs `upload_cloud_config()` for every TS. After that fix, the config upload happens in the next step, in `find_snapshot_directories()`. That method is called in parallel for every data dir, so if a TS has 2 dirs, `upload_cloud_config()` is implicitly called twice. (The second upload should fail because the config file is already present on the remote node.)

This fix calls `upload_cloud_config()` explicitly beforehand, to prevent the race among multiple parallel calls of `find_snapshot_directories()`.

Original commit: 6ca4b2d
Original diff: https://phabricator.dev.yugabyte.com/D14441

Test Plan:
Jenkins: rebase: 2.6, hot
ybd --cxx-test tools_yb-backup-test_ent
ybd --java-test org.yb.pgsql.TestYbBackup --tp 1
ybd --java-test org.yb.cql.TestYbBackup --tp 1

Reviewers: mihnea, achauhan
Reviewed By: achauhan
Subscribers: jenkins-bot, yql
Differential Revision: https://phabricator.dev.yugabyte.com/D14458
Description
Multiple tests are failing due to errors in the backup functionality.
Examples:
The error looks like this:
2021-12-17 12:53:54,319 ERROR: Failed to run command [[ scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i /opt/yugabyte/yugaware/data/keys/0c53ead8-5a21-4982-81f8-d1ee0a4c709d/yb-itest-7b9dd2c6a4-20211217-123309_0c53ead8-5a21-4982-81f8-d1ee0a4c709d-key.pem -P 54422 -q /tmp/yb_backup_rflogzfcfxqstdgz/cloud_cfg yugabyte@10.9.203.45:/tmp/yb_backup_rflogzfcfxqstdgz ]]: code=1 output=scp: /tmp/yb_backup_rflogzfcfxqstdgz/cloud_cfg: Permission denied
This probably happens because two upload processes are running simultaneously:
2021-12-17 12:53:53,158 INFO: Uploading /tmp/yb_backup_rflogzfcfxqstdgz/cloud_cfg to server 10.9.203.45
2021-12-17 12:53:53,538 INFO: Uploading /tmp/yb_backup_rflogzfcfxqstdgz/cloud_cfg to server 10.9.203.45
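One way to check a backup log for this symptom is to count how often the same file is uploaded to the same server. The sketch below is only an illustration: the `repeated_uploads` helper is hypothetical, and the regular expression assumes the `INFO: Uploading <path> to server <host>` message format seen in the log lines above.

```python
import re
from collections import Counter

# Assumed message format, taken from the log lines quoted in this issue.
UPLOAD_RE = re.compile(r"INFO: Uploading (\S+) to server (\S+)")


def repeated_uploads(log_text):
    """Return {(path, server): count} for uploads that appear more than once."""
    counts = Counter(UPLOAD_RE.findall(log_text))
    return {key: n for key, n in counts.items() if n > 1}
```

Any non-empty result means the same config file was sent to the same node more than once, which matches the suspected race.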