Upgrade scylla major version #51

Closed
dahankzter opened this issue Feb 11, 2020 · 8 comments · Fixed by #294
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@dahankzter
Contributor

Currently we employ a check that the Scylla version of the target state is the same as the previous one.
This prohibits an upgrade, and we need to devise a way to allow the user to upgrade.

@dahankzter added the kind/feature label Feb 11, 2020
@dahankzter
Contributor Author

What do you think, @tzach: should we put this on 1.0? Right now there is hard-coded logic preventing an upgrade. Removing it requires some care and validation to keep the upgrade consistent.

@dahankzter added this to the 1.0 milestone Feb 11, 2020
@tzach
Contributor

tzach commented Feb 12, 2020

I do not think this is very urgent.
We can push it, unless it's trivial to add.

@dahankzter
Contributor Author

I think perhaps a middle ground would be safe and fairly easy: allow upgrades within a major version. We can perhaps handle major upgrades later, although that is also a matter of verifying that the user knows what they are doing.

@dahankzter
Contributor Author

This is potentially a sizeable amount of work, what with snapshots being needed between major versions and some procedure for restoring to an older version. Let's postpone it as we discussed.

@dahankzter removed this from the 1.0 milestone Feb 17, 2020
@dahankzter changed the title from "Upgrade scylla" to "Upgrade scylla major version" Feb 20, 2020
@dahankzter
Contributor Author

Perhaps a scheme suggested by @penberg can work.

We allow a user to upgrade to the latest minor version, and only then do we allow them to upgrade to a new major version. Essentially, this lets them upgrade to any minor version, but move to a new major version only if they are already on the latest minor version.

How do we check this? We use the Docker Hub REST API (https://docs.docker.com/registry/spec/api/), from which we can deduce the appropriate rules.

Did I get that right, @penberg? What do you think, @tzach?
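
To make the rule concrete, here is a minimal sketch in Python. It is not the operator's code. It assumes tags of the form `MAJOR.MINOR.PATCH`, treats the first component as the major version, and takes the list of published tags as a plain argument (fetching it from the registry and filtering out non-numeric tags is left out). The names `parse_version` and `is_upgrade_allowed` are made up for illustration.

```python
from typing import Iterable, Tuple

Version = Tuple[int, int, int]


def parse_version(tag: str) -> Version:
    """Parse a 'MAJOR.MINOR.PATCH' tag into a tuple of ints (assumed tag format)."""
    major, minor, patch = (int(part) for part in tag.split("."))
    return (major, minor, patch)


def is_upgrade_allowed(current: str, target: str, available_tags: Iterable[str]) -> bool:
    """Apply the proposed rule: any upgrade inside the current major version is
    allowed, but a new major version is allowed only from the latest release
    published for the current major version."""
    cur = parse_version(current)
    tgt = parse_version(target)

    # Same version or a downgrade is not an upgrade at all.
    if tgt <= cur:
        return False

    # Staying within the same major version is always allowed.
    if tgt[0] == cur[0]:
        return True

    # Crossing a major boundary: the cluster must already run the newest
    # published release of its current major version.
    same_major = [v for v in map(parse_version, available_tags) if v[0] == cur[0]]
    latest = max(same_major, default=cur)
    return cur >= latest
```

With the hypothetical tag list `["4.0.0", "4.0.5", "4.1.0", "4.1.3", "5.0.0"]`, going from `4.0.5` to `5.0.0` is refused, while `4.0.5` to `4.1.3` and then `4.1.3` to `5.0.0` are both allowed.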

@dahankzter added this to the 1.0 milestone May 5, 2020
@gnumoreno
Contributor

I know that upgrades sound simple but they are not. I would avoid major upgrades on the operator until this is handled by scylla-manager itself. There are a lot of requirements.

@dahankzter
Contributor Author

After my discussion with @penberg the other day it became much clearer. Is the above strategy not correct, more or less? Waiting for Manager to do upgrades is a long way off, and it is also an orthogonal feature that I am not sure we want in Manager at all, at least not initially.

@mmatczuk added the noob label Sep 24, 2020
@espindola
Contributor

I am not sure I understand why requiring an upgrade to the last minor makes things simpler for the operator. I can understand why it might be a good thing for Scylla overall (fewer combinations for QA), but as long as we support upgrading the major version, knowing that we have the latest minor doesn't make things simpler as far as I can tell.

We also only support backups with the scylla manager, no? How would we handle upgrades without the scylla manager? Do we implement backups directly first?

Is this really a noob task?

@zimnx removed the noob label Nov 23, 2020
@zimnx self-assigned this Dec 2, 2020
zimnx added a commit that referenced this issue Dec 10, 2020
zimnx added a commit that referenced this issue Dec 10, 2020
zimnx added a commit that referenced this issue Dec 10, 2020
zimnx added a commit that referenced this issue Dec 10, 2020
zimnx added a commit that referenced this issue Dec 10, 2020
The current patch version upgrade procedure is run when the cluster is being
upgraded to the next patch version.
The new major upgrade procedure handles all other upgrades.
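
As an illustration of that split, a small Python sketch follows. It is only a guess at the shape of the decision, not the code from the commit. It assumes `MAJOR.MINOR.PATCH` versions, treats any higher patch number within the same `MAJOR.MINOR` as a patch upgrade, and rejects downgrades, which the commit message does not address. `select_upgrade_procedure` and its return values are hypothetical names.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, int, int]:
    """Parse a 'MAJOR.MINOR.PATCH' string into a tuple of ints (assumed format)."""
    major, minor, patch = (int(part) for part in tag.split("."))
    return (major, minor, patch)


def select_upgrade_procedure(current: str, target: str) -> str:
    """Return which procedure should run: 'none', 'patch' or 'major'."""
    cur = parse_version(current)
    tgt = parse_version(target)

    if tgt == cur:
        return "none"  # nothing to do
    if tgt < cur:
        # Assumption: downgrades are out of scope and refused.
        raise ValueError(f"downgrade from {current} to {target} is not supported")
    if tgt[:2] == cur[:2]:
        return "patch"  # only the patch component increased
    return "major"  # every other upgrade goes through the new procedure
```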

Procedure:
* Check if the cluster has schema agreement (using API call)
* Take `system` and `system_schema` tables snapshot on all nodes in parallel.

For each node:
* Drain node
* Backup the data - snapshot of all data keyspaces
* Update Scylla image by restarting Pod
* Validate if node is up and version is updated via API call
** Node is UN
** Native transport port is UP (GET /storage_service/native_transport)
** Version of pod is updated to desired one
* Clear data snapshot

After last node:
* Delete `system` and `system_schema` table snapshots on all nodes in parallel

Fixes #51
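
The procedure in this commit message can be sketched roughly as below. This is a Python outline under stated assumptions, not the operator's implementation: `NodeClient` and all of its methods are hypothetical stand-ins for whatever API calls the operator makes, the snapshot tags are invented, and a real controller would poll and retry the validation step instead of checking once.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Protocol, Sequence

SYSTEM_KEYSPACES = ("system", "system_schema")
SYSTEM_TAG = "upgrade-system"  # hypothetical snapshot tag
DATA_TAG = "upgrade-data"      # hypothetical snapshot tag


class NodeClient(Protocol):
    """Hypothetical client; none of these method names come from the operator."""

    def has_schema_agreement(self) -> bool: ...
    def snapshot(self, node: str, keyspaces: Sequence[str], tag: str) -> None: ...
    def delete_snapshot(self, node: str, tag: str) -> None: ...
    def data_keyspaces(self, node: str) -> List[str]: ...
    def drain(self, node: str) -> None: ...
    def restart_with_image(self, node: str, version: str) -> None: ...
    def is_up_normal(self, node: str) -> bool: ...
    def native_transport_up(self, node: str) -> bool: ...
    def version(self, node: str) -> str: ...


def node_is_upgraded(client: NodeClient, node: str, target_version: str) -> bool:
    """The three validation checks: node is UN, native transport is up, version matches."""
    return (
        client.is_up_normal(node)
        and client.native_transport_up(node)
        and client.version(node) == target_version
    )


def upgrade_cluster(client: NodeClient, nodes: Sequence[str], target_version: str) -> None:
    # Refuse to start unless the cluster agrees on the schema.
    if not client.has_schema_agreement():
        raise RuntimeError("no schema agreement; refusing to upgrade")

    # Snapshot the system keyspaces on all nodes in parallel.
    with ThreadPoolExecutor() as pool:
        list(pool.map(lambda n: client.snapshot(n, SYSTEM_KEYSPACES, SYSTEM_TAG), nodes))

    # Upgrade nodes one at a time.
    for node in nodes:
        client.drain(node)
        client.snapshot(node, client.data_keyspaces(node), DATA_TAG)
        client.restart_with_image(node, target_version)
        if not node_is_upgraded(client, node, target_version):
            raise RuntimeError(f"node {node} did not come up on {target_version}")
        client.delete_snapshot(node, DATA_TAG)

    # After the last node, drop the system snapshots on all nodes in parallel.
    with ThreadPoolExecutor() as pool:
        list(pool.map(lambda n: client.delete_snapshot(n, SYSTEM_TAG), nodes))
```

In this sketch a failed validation stops the rollout and leaves the snapshots in place, so that an operator could still restore from them; whether the real procedure behaves this way is not stated in the commit message.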
zimnx added a commit that referenced this issue Dec 10, 2020
zimnx added a commit that referenced this issue Dec 10, 2020
zimnx added a commit that referenced this issue Dec 11, 2020
zimnx added a commit that referenced this issue Dec 11, 2020
zimnx added a commit that referenced this issue Dec 11, 2020
zimnx added a commit that referenced this issue Dec 11, 2020
zimnx added a commit that referenced this issue Dec 16, 2020
zimnx added a commit that referenced this issue Dec 16, 2020
zimnx added a commit that referenced this issue Dec 16, 2020
zimnx added a commit that referenced this issue Dec 17, 2020
zimnx added a commit that referenced this issue Dec 18, 2020
6 participants