[Feature] [GSOC]VelaHA Part1-Cluster Meta backup & restore #5483

Open
jefree-cat opened this issue Feb 11, 2023 · 10 comments
Labels
area/controller K8s controller related issue summber-of-code and lfx the issue for lfx and google/alibaba summer of code type/enhancement New feature or request

Comments

@jefree-cat
Member

jefree-cat commented Feb 11, 2023

VelaHA Part 1: cluster meta backup & recovery:
  • back up the management cluster's Kubernetes meta resources to NFS or object storage;
  • support full and incremental backups;
  • support restoring to the cluster (refer to Velero).

Cluster meta backup & restore is the first step:
  • implement Velero as a KubeVela addon that installs Velero in the KubeVela cluster.
  • implement a CLI command, vela ha enable, to enable cluster meta backup on a configured cron schedule.
  • implement a CLI command, vela ha restore, to restore cluster meta.

This issue lets GSoC mentees learn:
  • how to customize a KubeVela addon
  • how to implement a vela CLI command

Refer to #5418.
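
To make the scheduled-backup goal concrete, here is a minimal sketch of the kind of Velero Schedule object that a cron-based backup could create. The `velero.io/v1` Schedule kind and its `schedule` and `template` fields come from Velero; the object name, the `velero` namespace, the TTL, and the helper function itself are assumptions for illustration, not part of this proposal.

```python
import json


def velero_schedule(name, cron, namespaces=("*",), ttl="720h0m0s"):
    """Build a Velero Schedule manifest as a plain dict (hypothetical helper)."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Schedule",
        # Assumption: Velero runs in its default "velero" namespace.
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            # Cron expression that controls when backups are taken.
            "schedule": cron,
            # The template is a regular Velero Backup spec.
            "template": {
                "includedNamespaces": list(namespaces),
                "ttl": ttl,
            },
        },
    }


# Hypothetical name and schedule: one backup every day at 02:00.
manifest = velero_schedule("vela-meta-daily", "0 2 * * *")
print(json.dumps(manifest, indent=2))
```

The JSON output could be applied with kubectl, since Kubernetes accepts JSON manifests as well as YAML.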

@jefree-cat jefree-cat added the summber-of-code and lfx the issue for lfx and google/alibaba summer of code label Feb 11, 2023
@jefree-cat jefree-cat changed the title [Feature]VelaHA Part1-Cluster Meta backup & recovery: [Feature] [GSOC]VelaHA Part1-Cluster Meta backup & recovery: Feb 11, 2023
@jefree-cat jefree-cat added the type/enhancement New feature or request label Feb 11, 2023
@jefree-cat jefree-cat changed the title [Feature] [GSOC]VelaHA Part1-Cluster Meta backup & recovery: [Feature] [GSOC]VelaHA Part1-Cluster Meta backup & restore: Feb 11, 2023
@jefree-cat jefree-cat changed the title [Feature] [GSOC]VelaHA Part1-Cluster Meta backup & restore: [Feature] [GSOC]VelaHA Part1-Cluster Meta backup & restore Feb 11, 2023
@wonderflow
Collaborator

This is part of the GSoC program; refer to cncf/mentoring#814.

As for the commands, I suggest we use the vela system command:

  • to enable cluster meta backup on a configured (cron) schedule:
vela system backup --to <the target cluster endpoint>
  • to restore cluster meta:
vela system restore --from <the target cluster endpoint>
  • to disable the backup:
vela system backup --stop
  • to see the status:
vela system status
> shows the backup progress along with other system info.
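
The command surface suggested above can be mocked up to check that the flags fit together. This is only a sketch in Python with argparse; the real vela CLI is written in Go, and nothing here is existing vela code. Treating `--to` and `--stop` as mutually exclusive is an assumption, as are the function names.

```python
import argparse


def build_parser():
    """Mock of the proposed `vela system` subcommands (illustrative only)."""
    parser = argparse.ArgumentParser(prog="vela")
    groups = parser.add_subparsers(dest="group", required=True)
    system = groups.add_parser("system")
    actions = system.add_subparsers(dest="action", required=True)

    backup = actions.add_parser("backup")
    # Assumption: a backup call either starts (--to) or stops (--stop).
    mode = backup.add_mutually_exclusive_group(required=True)
    mode.add_argument("--to", dest="target")
    mode.add_argument("--stop", action="store_true")

    restore = actions.add_parser("restore")
    restore.add_argument("--from", dest="source", required=True)

    actions.add_parser("status")
    return parser


def describe(argv):
    """Return a one-line description of what the parsed command would do."""
    args = build_parser().parse_args(argv)
    if args.action == "backup":
        return "stop backup" if args.stop else f"back up to {args.target}"
    if args.action == "restore":
        return f"restore from {args.source}"
    return "show status"


print(describe(["system", "backup", "--to", "<endpoint>"]))
```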

@wonderflow wonderflow added the area/controller K8s controller related issue label Feb 13, 2023
@wonderflow
Collaborator

As for the implementation, we should:

  1. keep the installation of the Velero ( https://velero.io/ ) component as a separate addon, without the vela metadata syncing logic.
  2. have vela system backup check whether the velero addon is installed and install it if not. It is possible to rely on another backup-restore implementation in this case.
  3. have vela system backup generate the syncing policy once the velero addon exists.
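
The three steps above can be sketched as a small control flow. This is a toy model with an in-memory stand-in for the cluster; `FakeCluster`, `system_backup`, and the policy shape are all hypothetical names, and the `backend` parameter only illustrates the point that another backup-restore implementation could be plugged in.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCluster:
    """In-memory stand-in for the control plane (hypothetical)."""
    addons: set = field(default_factory=set)
    policies: list = field(default_factory=list)


def system_backup(cluster, schedule, backend="velero"):
    """Ensure the backup addon exists, then record a syncing policy."""
    steps = []
    # Step 2: install the addon only when it is missing.
    if backend not in cluster.addons:
        cluster.addons.add(backend)
        steps.append(f"installed addon {backend}")
    # Step 3: generate the syncing policy once the addon is present.
    cluster.policies.append({"backend": backend, "schedule": schedule})
    steps.append("generated sync policy")
    return steps


cluster = FakeCluster()
print(system_backup(cluster, "0 2 * * *"))
print(system_backup(cluster, "0 2 * * *"))
```

Keeping the addon check separate from policy generation mirrors step 1: the addon itself carries no vela-specific syncing logic.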

@wonderflow
Collaborator

The vela system restore case may not be that easy; it can be affected by the following:

  1. the clusters must be joined successfully and aligned with the old control plane, especially the cluster names.
  2. all related metadata must be restored, including:
    • Vela-Core stuff:
      • applications
      • x-definitions(component,trait,workflowstep,policy)
      • applicationrevisions
      • resourcetracker
    • Workflow stuff:
      • several configmaps for context
    • VelaUX stuff:
      • several configmaps/secrets
    • Addon stuff

That would be complicated if we took everything into consideration. To keep it simple, we can consider just vela-core and workflow for the initial implementation.

The most important thing is WE MUST NOT CAUSE ANY UNDERLYING WORKLOAD TO RESTART during the restore process.
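
As a rough illustration of the initial scope (vela-core and workflow only), the sketch below builds a Velero Backup manifest limited to those resource types. `includedResources` is a real Velero Backup field; the fully qualified resource names assume the KubeVela CRDs live in the `core.oam.dev` group and should be checked with `kubectl api-resources`. The source does not say which ConfigMaps hold workflow context, so the sketch includes all ConfigMaps, which is broader than needed. The helper and object names are hypothetical.

```python
import json

# Assumption: KubeVela CRDs are served under the core.oam.dev API group.
VELA_CORE_RESOURCES = [
    "applications.core.oam.dev",
    "componentdefinitions.core.oam.dev",
    "traitdefinitions.core.oam.dev",
    "workflowstepdefinitions.core.oam.dev",
    "policydefinitions.core.oam.dev",
    "applicationrevisions.core.oam.dev",
    "resourcetrackers.core.oam.dev",
]

# Assumption: workflow context is stored in ConfigMaps; no selector is known.
WORKFLOW_RESOURCES = ["configmaps"]


def velero_backup(name, include_workflow=True):
    """Build a Velero Backup manifest covering only control-plane metadata."""
    resources = list(VELA_CORE_RESOURCES)
    if include_workflow:
        resources += WORKFLOW_RESOURCES
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Backup",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {"includedResources": resources},
    }


print(json.dumps(velero_backup("vela-meta-initial"), indent=2))
```

Workload kinds such as Deployments are deliberately absent from the list, so a restore of this backup would not recreate them directly. Whether restored applications trigger a re-dispatch of workloads depends on controller behavior and is not addressed by this sketch.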

@jefree-cat
Member Author

Exactly, backup & restore will coordinate with those use cases.

@swastik959

swastik959 commented Feb 26, 2023

Hi @jefree-cat, I found this feature quite interesting and would be willing to work on it. I have a background in Golang and Kubernetes, but I am new to KubeVela. Could you give me some guidance on how I can become an integral part of this community?

@Alipebt

Alipebt commented Feb 27, 2023

Hello! @jefree-cat
I'm interested in this GSoC project. I would like to learn about it and contribute to the community.
I hope to finish this project this summer with GSoC!

@jefree-cat
Member Author

jefree-cat commented Feb 28, 2023

@swastik959 @Alipebt Nice to hear that. You can contact me on Slack, in the #kubevela channel under CNCF.

@gueFDF

gueFDF commented Feb 28, 2023

Hello! @jefree-cat
I have a strong desire to participate in this GSOC project and want to contribute to the community.
I want to finish it this summer!

@charlie0129
Member

As to the commands, I suggest we use the vela system command:

vela system backup --to <the target cluster endpoint>

I'm interested in this project too.

I have a question about the suggested command. It seems Velero typically backs up to object storage, not to a specific cluster. Maybe we can let users configure their target object storage credentials when installing the Velero addon. Then we can just call vela system backup to back up to the predefined location.
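
A sketch of that idea, assuming the addon creates a Velero BackupStorageLocation at install time and later backups only reference it by name. The BackupStorageLocation kind and its `provider`, `objectStorage`, `config`, and `credential` fields, as well as the Backup field `storageLocation`, are Velero's; every concrete value (location name, provider, bucket, region, secret name and key) is a placeholder.

```python
def backup_storage_location(name, provider, bucket, region,
                            secret_name, secret_key="cloud"):
    """Build a Velero BackupStorageLocation manifest (hypothetical helper)."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "BackupStorageLocation",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "provider": provider,
            "objectStorage": {"bucket": bucket},
            "config": {"region": region},
            # Points at a Secret holding the object storage credentials.
            "credential": {"name": secret_name, "key": secret_key},
        },
    }


def backup_to_location(backup_name, location_name):
    """Build a Backup that targets the predefined storage location by name."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Backup",
        "metadata": {"name": backup_name, "namespace": "velero"},
        "spec": {"storageLocation": location_name},
    }


# Placeholder values, configured once when the addon is installed.
location = backup_storage_location(
    "vela-meta", "aws", "example-bucket", "us-east-1", "example-credentials"
)
# A later backup call needs no endpoint argument, only the location name.
backup = backup_to_location("vela-meta-manual", location["metadata"]["name"])
print(backup["spec"])
```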

@Major-333

Hi, I would like to take up this issue for this year's GSoC. I'm going through the KubeVela docs. Any advice regarding the project would be much appreciated.
Thanks!
