Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug: we should start conversion webhook before initiating Longhorn datastore #2812

Merged
merged 2 commits into from
May 22, 2024

Conversation

PhanLe1010
Copy link
Contributor

@PhanLe1010 PhanLe1010 commented May 21, 2024

@PhanLe1010
Copy link
Contributor Author

Putting this PR in draft until doing some more testing and have better code style

Copy link
Member

@derekbit derekbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@PhanLe1010
Copy link
Contributor Author

@derekbit Can you review the PR again? It actually still needs a k8s client. Just do not need the Longhorn datastore.

The new modification will create 2 clients:

  1. One without Longhorn datastore
  2. One with Longhorn datastore.

I think it should not create a high memory usage issue since the one without Longhorn datastore doesn't have any cache. However, we should still double-check the test related to multiple clients in Longhorn manager. I forgot the ticket number, do you remember it?

@PhanLe1010 PhanLe1010 marked this pull request as ready for review May 21, 2024 02:35
@derekbit
Copy link
Member

@derekbit Can you review the PR again? It actually still needs a k8s client. Just do not need the Longhorn datastore.

The new modification will create 2 clients:

  1. One without Longhorn datastore
  2. One with Longhorn datastore.

I think it should not create a high memory usage issue since the one without Longhorn datastore doesn't have any cache. However, we should still double-check the test related to multiple clients in Longhorn manager. I forgot the ticket number, do you remember it?

longhorn#6866?

@ChanYiLin
Copy link
Contributor

LGTM
@PhanLe1010 do you mean this one? longhorn/longhorn#6936

@PhanLe1010
Copy link
Contributor Author

Yes, it is the test plan at longhorn/longhorn#6936 (comment)

Thank you @ChanYiLin and @derekbit !

Copy link
Member

@derekbit derekbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The separate clients LGTM.
I think there is no memory usage issue as @PhanLe1010 said.

ChanYiLin
ChanYiLin previously approved these changes May 21, 2024
@PhanLe1010
Copy link
Contributor Author

Looks like there is an new race condition arise after this PR. Please hold on merging

@PhanLe1010 PhanLe1010 marked this pull request as draft May 21, 2024 03:29
@innobead
Copy link
Member

@PhanLe1010 is this still draft?

datastore

longhorn-8578

Signed-off-by: Phan Le <phan.le@suse.com>
longhorn-manager

This is needed because after Longhorn manager daemonset was
deleted (aka conversion webhook is not running), we cannot
delete CRs in older API version in deleteRecreatedCRs().

Signed-off-by: Phan Le <phan.le@suse.com>
@PhanLe1010
Copy link
Contributor Author

PhanLe1010 commented May 22, 2024

It turned out that we not only have problem with upgrade stuck but also the problem with uninstall stuck because we delete the longhorn-manager daemonset (AKA the conversion webhook server) then list and the the BackupTarget which is still in v1beta1 version. This flow will fail. To solve this problem:

  1. A quick hack is ad32177
  2. A better solution would be to migrate the API versions for all Longhorn resources in the upgrade path (thus eliminate the need for conversion webhook for future Longhorn upgrade [IMPROVEMENT] Disable v1beta1 resource support longhorn#7943) OR separating the conversion webhook from longhorn-manager [IMPROVEMENT] Investigate if we should separate the conversion webhook from longhorn-manager longhorn#8612

Due to time pressure, I think we can go with the first approach for now and do the 2nd approach later . WYDT cc @derekbit @innobead @ChanYiLin @ejweber

@innobead
Copy link
Member

Sounds good to me.

@PhanLe1010 PhanLe1010 marked this pull request as ready for review May 22, 2024 00:46
@innobead innobead merged commit 23ccdd1 into longhorn:master May 22, 2024
7 checks passed
@innobead
Copy link
Member

@mergify backport v1.6.x v1.5.x

Copy link

mergify bot commented May 22, 2024

backport v1.6.x v1.5.x

✅ Backports have been created

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants