Revert "HDDS-5740. Enable ratis by default for SCM." #3362
This reverts commit 3eb7235.
@adoroszlai I agree that SCM HA should not be enabled by default for upgrades, but for other profiles I feel we can still enable SCM HA?
Also, can you please include the problem faced during upgrade in the description?
@mukul1987 SCM HA is definitely recommended for new deployments. Please correct me if I'm wrong, but my understanding is that simply enabling Ratis for a single-node SCM is not SCM HA, as SCM HA requires multiple nodes and explicit configuration.
When upgrading from an old version to a version supporting SCM HA, none of the SCM HA configs may be turned on before finalization, to prevent writing incompatible data. With the original change, when a non-Ratis SCM was upgraded to a version supporting SCM HA, the Ratis-enabled flag was automatically set to true. The upgrade framework stopped this before any changes could be made, so every upgraded pre-SCM HA cluster fails immediately on startup with the message:

I am +1 for reverting the change for now to get the incompatibility out of master. However, I see @mukul1987's point that for new clusters, at least 1-node Ratis for SCM would be the preferred default. I think we can come up with a way to do this in a follow-up PR.
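To illustrate the failure mode described above, here is a minimal sketch of a pre-finalization startup check. It is only a model, not Ozone's upgrade framework code: the class `ScmStartupCheck`, its methods, and the exception message are all hypothetical. It assumes the effective flag is the explicit setting if present and the shipped default otherwise, which is why flipping the default to true affects clusters that never set the flag.

```java
import java.util.Optional;

/** Hypothetical model of the check described above; not Ozone's actual code. */
final class ScmStartupCheck {
    private ScmStartupCheck() {}

    /** The explicit setting wins; otherwise the shipped default applies. */
    static boolean effectiveRatisEnabled(Optional<Boolean> explicitSetting,
                                         boolean defaultValue) {
        return explicitSetting.orElse(defaultValue);
    }

    /** Rejects startup when Ratis is enabled on a cluster that is not yet finalized. */
    static void validate(boolean finalized, boolean ratisEnabled) {
        if (!finalized && ratisEnabled) {
            // Hypothetical message; the real one is not reproduced here.
            throw new IllegalStateException(
                "SCM Ratis cannot be enabled before the upgrade is finalized");
        }
    }

    public static void main(String[] args) {
        // An upgraded, not yet finalized cluster that never set the flag explicitly.
        Optional<Boolean> unset = Optional.empty();

        // Default false: the check passes and SCM can start.
        validate(false, effectiveRatisEnabled(unset, false));

        // Default true: the same cluster is rejected on startup.
        try {
            validate(false, effectiveRatisEnabled(unset, true));
        } catch (IllegalStateException e) {
            System.out.println("startup rejected: " + e.getMessage());
        }
    }
}
```

In this model the check itself behaves as intended; what changes the outcome for existing clusters is the default value.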
I agree: if enabling SCM Ratis for a single node has benefits, we can have an improvement Jira for that. But it is totally out of scope for this revert, as the original change is clearly breaking upgrades.
information provided, change request out of scope
Thanks @errose28, @mukul1987 for the review.
What changes were proposed in this pull request?
Revert #2637. The PR was merged despite my concern about changing the default setting without providing a seamless upgrade path.
It turns out that enabling Ratis by default is indeed causing problems during upgrade. SCM fails to start with:
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-5740
How was this patch tested?
Regular CI:
https://github.com/adoroszlai/hadoop-ozone/actions/runs/2237408109