[BACKPORT][v1.6.1][BUG] Longhorn api-server PUT request rate #8152

Closed
github-actions bot opened this issue Mar 11, 2024 · 7 comments
Labels
  • area/performance: System, volume performance
  • component/longhorn-manager: Longhorn manager (control plane)
  • investigation-needed: Need to identify the case before estimating and starting the development
  • kind/backport: Backport request
  • kind/bug
  • priority/0: Must be fixed in this release (managed by PO)
  • release/post-outstanding-issue: Outstanding issues after the release

Comments

@github-actions

backport #8114

github-actions bot added the area/performance, component/longhorn-manager, investigation-needed, kind/backport, kind/bug, and priority/0 labels on Mar 11, 2024
github-actions bot added this to the v1.6.1 milestone on Mar 11, 2024
innobead added the release/post-outstanding-issue label on Mar 12, 2024
@innobead
Member

ref: #8167

cc @ejweber

@longhorn-io-github-bot

longhorn-io-github-bot commented Mar 13, 2024

Pre Ready-For-Testing Checklist

@roger-ryao

Verified on v1.6.1-rc1 20240314

The test steps
longhorn/longhorn-manager#2432 (comment)
#7425 (comment)

Result: Passed

  1. No matter how long you wait, ReplicaA is never cleaned up.
  2. After deleting ReplicaB and waiting for the volume to rebuild from ReplicaA, the data is consistent.

@ejweber
Contributor

ejweber commented Mar 15, 2024

Hello @roger-ryao, can you please confirm that you also ran longhorn/longhorn-manager#2685 (comment) to test this fix? Sorry for the confusion. I do think longhorn/longhorn-manager#2432 (comment) should be run again (as you have done), but the former are the primary test steps.

Moving this back to Ready for Testing. Please feel free to close it again directly if you have already run these steps.

@ejweber ejweber reopened this Mar 15, 2024
@roger-ryao

roger-ryao commented Mar 18, 2024

Verified on v1.6.1-rc1 20240318

The test steps
longhorn/longhorn-manager#2685 (comment)
There are four cases to verify:

  1. Ensure that the upgrade correctly populates replicaTransitionTimeMap
  2. Confirm that the rebuild operation correctly populates replicaTransitionTimeMap and lastHealthyAt after replica deletion
  3. Validate that the rebuild operation correctly populates replicaTransitionTimeMap and lastHealthyAt after replica failure
  4. Ensure that replicas are not updated unnecessarily when the cluster is stable.
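
For case 4, one possible way to check for unnecessary replica updates is to count update events in the Kubernetes API server audit log. The following is only a minimal sketch, not the method used for this verification: it assumes audit logging is enabled and written as one JSON event per line, and the function name `count_replica_updates` is hypothetical. In audit events, an HTTP PUT is recorded with the verb `update`.

```python
import json
from collections import Counter


def count_replica_updates(audit_lines):
    """Count completed update (HTTP PUT) audit events per Longhorn replica.

    Assumes each line is one audit.k8s.io/v1 Event serialized as JSON.
    """
    counts = Counter()
    for line in audit_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(event, dict):
            continue
        ref = event.get("objectRef") or {}
        if (
            event.get("verb") == "update"
            # Only count the final stage so each request is counted once.
            and event.get("stage") == "ResponseComplete"
            and ref.get("apiGroup") == "longhorn.io"
            and ref.get("resource") == "replicas"
        ):
            counts[ref.get("name", "<unknown>")] += 1
    return counts
```

On a stable cluster with the fix applied, the returned counter would be expected to stay empty (or nearly so) over the observation window; a steadily growing count per replica would suggest the PUT requests persist.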

Result

  1. The replicaTransitionTimeMap is not visible on v1.6.1-rc1, but the test passed on master-head.
  2. As the attached screenshots show, PUT requests to replica resources persist on v1.6.1-rc1. After updating Longhorn from v1.6.1-rc1 to master-head, no PUT requests to replica resources were observed.

Screenshot_20240318_134627

Screenshot_20240318_144821

cc. @ejweber

@ejweber
Contributor

ejweber commented Mar 18, 2024

This is correct. The necessary changes did not make it to longhorn-manager in https://github.com/longhorn/longhorn-manager/commits/v1.6.1-rc1. They are only in https://github.com/longhorn/longhorn-manager/commits/v1.6.x/.

cc @roger-ryao

@roger-ryao

roger-ryao commented Mar 19, 2024

Verified on v1.6.x-head 20240319

Result: Passed

  1. We observed that the replicaTransitionTimeMap is visible on v1.6.x-head.
  2. The rebuild operation correctly populates replicaTransitionTimeMap and lastHealthyAt after replica deletion.
  3. The rebuild operation correctly populates replicaTransitionTimeMap and lastHealthyAt after replica failure.
  4. After updating Longhorn from v1.6.1-rc1 to v1.6.x-head, no PUT requests to replica resources were observed.

Screenshot_20240319_141934

Screenshot_20240319_141902
