
[BACKPORT][v1.6.1][BUG] The activated DR volume does not contain the latest data. #7946

Closed
github-actions bot opened this issue Feb 16, 2024 · 5 comments
Assignees: shuo-wu, roger-ryao
Labels: area/volume-backup-restore (Volume backup restore), kind/backport (Backport request), kind/bug, reproduce/always (100% reproducible), require/backport (Require backport; only used when the specific versions to backport have not been defined), require/manual-test-plan (Require adding/updating manual test cases if they can't be automated), require/qa-review-coverage (Require QA to review coverage)
Milestone: v1.6.1

Comments

@github-actions

backport #7945

@github-actions github-actions bot added the area/volume-backup-restore, kind/backport, kind/bug, reproduce/always, require/backport, require/manual-test-plan, and require/qa-review-coverage labels and the v1.6.1 milestone on Feb 16, 2024
@shuo-wu shuo-wu self-assigned this Feb 20, 2024

longhorn-io-github-bot commented Feb 20, 2024

Pre Ready-For-Testing Checklist

@roger-ryao roger-ryao self-assigned this Feb 23, 2024
@roger-ryao

Verified on v1.6.x-head 20240223

The test steps

Test Method 1: #7945 (comment)

  1. Launch 2 clusters, both with Longhorn installed.
  2. Set up a backup target.
  3. Create two volumes and write data to them in the 1st cluster, then create a backup of each volume.
  4. Restore the two backups as DR volumes in the 2nd cluster.
  5. Set the backup poll interval to a large value (see the kubectl sketch after this list).
  6. Write more data to the volumes in the 1st cluster and create the 2nd backups.
  7. Activate one of the DR volumes in the 2nd cluster, then verify the data.

Test Method 2: refer to #7947 (comment)
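
For steps 5 and 7, here is a minimal kubectl sketch. It assumes the Longhorn setting backupstore-poll-interval (in seconds) and the status.lastBackup / status.lastBackupAt fields on the Volume CR; the exact field layout can differ between Longhorn versions, and the DR-volume activation itself is normally done through the Longhorn UI or API.

```bash
# Hedged sketch for steps 5 and 7; namespace and volume name are examples.
NS=longhorn-system
VOLUME=pvc-3da73260-6e9d-4a18-80ac-ae63864ed6c2   # example volume from this report

# Step 5: raise the backup poll interval (seconds) so the DR volume cannot
# pick up the 2nd backup through periodic polling before activation.
kubectl -n "$NS" patch settings.longhorn.io backupstore-poll-interval \
  --type=merge -p '{"value":"86400"}'

# Step 7 sanity check: before and after activation, look at the last backup
# the DR volume knows about (status field names are assumed).
kubectl -n "$NS" get volumes.longhorn.io "$VOLUME" \
  -o jsonpath='{.status.lastBackup}{"  "}{.status.lastBackupAt}{"\n"}'
```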

Result
Hi @shuo-wu
The activated DR volume does not contain the latest data. However, manually updating backupVolume.Spec.SyncRequestedAt causes the DR volume to sync up with the latest backup (see the sketch below).
Could you help check it?

supportbundle_91df4b65-3a16-4cb1-8d8f-d901ffb280da_2024-02-23T10-10-56Z.zip
Volume name: pvc-3da73260-6e9d-4a18-80ac-ae63864ed6c2
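
A minimal sketch of the manual workaround mentioned above, assuming Spec.SyncRequestedAt is serialized as spec.syncRequestedAt on the BackupVolume CR (the field name and the BackupVolume name are assumptions; verify them with kubectl explain / kubectl get on your cluster):

```bash
# Force a backup-volume sync by bumping SyncRequestedAt to "now".
NS=longhorn-system
BACKUP_VOLUME=pvc-3da73260-6e9d-4a18-80ac-ae63864ed6c2   # usually matches the volume name

kubectl -n "$NS" patch backupvolumes.longhorn.io "$BACKUP_VOLUME" \
  --type=merge \
  -p "{\"spec\":{\"syncRequestedAt\":\"$(date -u +%Y-%m-%dT%H:%M:%SZ)\"}}"
```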


shuo-wu commented Feb 23, 2024

Weird...
The activation is triggered (and finished) at 2024-02-23T10:06:23Z, and backupVolume.Spec.SyncRequestedAt should carry the same timestamp, but its value in the support bundle is 2024-02-23T10:06:25Z. I have not found what updates it again. BTW, the creation time of the 2nd backup backup-8fefd2d832114dcb is 2024-02-23T10:06:24Z.

2024-02-23T10:06:23.531811509Z time="2024-02-23T10:06:23Z" level=info msg="Activating volume pvc-3da73260-6e9d-4a18-80ac-ae63864ed6c2 with frontend blockdev" func="manager.(*VolumeManager).Activate" file="volume.go:422"
2024-02-23T10:06:23.849213385Z time="2024-02-23T10:06:23Z" level=info msg="Restore/DR volume finished with the last restored backup backup-c0d99731ad034d59" func="controller.(*VolumeController).checkAndFinishVolumeRestore" file="volume_controller.go:3214" accessMode=rwx controller=longhorn-volume frontend=blockdev migratable=false node=ryao-16x-w1-29d1f2d8-bwq5j owner=ryao-16x-w1-29d1f2d8-bwq5j shareEndpoint= shareState=stopped state=attached volume=pvc-3da73260-6e9d-4a18-80ac-ae63864ed6c2

Is this case always reproducible? Can this be reproduced in other versions (master or v1.5.x)?
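
In case it helps to line up the timestamps discussed above, a hedged inspection sketch (the jsonpath field names spec.syncRequestedAt and status.lastSyncedAt are assumptions based on the Go fields named in this thread):

```bash
NS=longhorn-system
BACKUP_VOLUME=pvc-3da73260-6e9d-4a18-80ac-ae63864ed6c2

# Compare when a sync was requested with when the backup volume last synced.
kubectl -n "$NS" get backupvolumes.longhorn.io "$BACKUP_VOLUME" \
  -o jsonpath='syncRequestedAt={.spec.syncRequestedAt}  lastSyncedAt={.status.lastSyncedAt}{"\n"}'

# List backup CRs with their creation timestamps to compare against the
# activation time in the volume-controller log.
kubectl -n "$NS" get backups.longhorn.io \
  -o custom-columns=NAME:.metadata.name,CREATED:.metadata.creationTimestamp
```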

@roger-ryao

> Is this case always reproducible? Can this be reproduced in other versions (master or v1.5.x)?

I set up the test environment 3 times, and the issue was reproduced on v1.6.x each time. However, I didn't observe the issue on v1.5.4-rc3. I think I should test it on v1.5.4-rc4. I haven't tested it with master-head yet.

@roger-ryao

Verified on v1.6.x-head 20240301

The test steps are the same as in the previous verification (Test Method 1: #7945 (comment); Test Method 2: #7947 (comment)).

Result: Passed
