Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(notready): condition failed #197

Merged
merged 1 commit into from
Dec 15, 2023

Conversation

redref
Copy link
Contributor

@redref redref commented Aug 18, 2023

Reworked condition around "NotReady" node.

Workflow :

  • Node goes to NotReady
  • At some point, pod is deleted by apiserver as node is NotReady for a long time
  • PVC got delete and reschedule as skipLv cannot find matching POD even if annotation is not there

Should not impact any other usecase. Only default when pod is not found.

Signed-off-by: Anthony BESCOND <anthony.bescond@kiln.fi>
@redref
Copy link
Contributor Author

redref commented Aug 18, 2023

/assign @antmoveh

@antmoveh
Copy link
Contributor

antmoveh commented Oct 8, 2023

Your design modifies the conditions under which pods are rebuilt. Only those with special annotations are not allowed to migrate.

https://github.com/carina-io/carina/blob/main/docs/manual_zh/failover.md#%E8%8A%82%E7%82%B9notready%E5%AE%B9%E5%99%A8%E8%BF%81%E7%A7%BB

This can cause some unnecessary trouble, such as node Notready, Pod migration pvc is rebuilt data loss

In most cases, data migration is performed in the database master/slave cluster

Data loss occurs when a single-master database is migrated

@redref
Copy link
Contributor Author

redref commented Oct 10, 2023

Your design modifies the conditions under which pods are rebuilt. Only those with special annotations are not allowed to migrate.

https://github.com/carina-io/carina/blob/main/docs/manual_zh/failover.md#%E8%8A%82%E7%82%B9notready%E5%AE%B9%E5%99%A8%E8%BF%81%E7%A7%BB

This can cause some unnecessary trouble, such as node Notready, Pod migration pvc is rebuilt data loss

In most cases, data migration is performed in the database master/slave cluster

Data loss occurs when a single-master database is migrated

Hello @antmoveh ,

My understanding is on the contrary, it make the default to skip if error or annotation not found.

Here the call where we skip clearPod and LV clearing. Return true or error will not take any action, and keep Volume where it is according to function naming "skipLv". https://github.com/carina-io/carina/blob/main/controllers
/node_controller.go#L196`

Now my modifications :

I might fix the last line to be triggered only when annotation is true (instead of not false), but to my understanding, other lines are fine and fix behavior.

Thanks for reading and please advise in case I missed something.

@antmoveh
Copy link
Contributor

/approve

@antmoveh
Copy link
Contributor

/lgtm

@antmoveh antmoveh added approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. labels Dec 15, 2023
@carina-ci-bot carina-ci-bot merged commit ced6e44 into carina-io:main Dec 15, 2023
3 of 6 checks passed
@carina-ci-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: antmoveh, redref

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants