Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi local storage handling for kvm #6699

Merged
merged 19 commits into from Nov 16, 2023

Conversation

DK101010
Copy link
Contributor

@DK101010 DK101010 commented Sep 5, 2022

Description

Is currently a POC, if someone have additional ideas or hints.

Motivation
Users sometimes only have the option to use local storages, so they should have the option to use local storage like other storage types.

Implementation
Decided to use already existing code to prevent breaks in current behavior(adapted agent code by user to add local storage)
With this implementation it is possible to add and use local storage like other storage types.

Key properties to add a local storage
Scope: HOST
Protocol: Filesytem
Path: mount path in KVM

Condition:
Enable local storage for User VMs = true in zone detail settings

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

Currently only tested in own environment

@weizhouapache
Copy link
Member

@DK101010

multiple local storage is supported since #6147
can you test it ?

@DK101010
Copy link
Contributor Author

DK101010 commented Sep 5, 2022

@DK101010

multiple local storage is supported since #6147 can you test it ?

@weizhouapache Damn, the last time I checked the PRs, I didn't see that one.... never mind

Hmm ... you implemented this via agent properties and my implementation works via UI like the other storages.
For user acceptance reasons I prefer my solution. User can use local storages like other storage types without touch agent properties. Also no restarts required.
Excuse me for insisting, but It will be hard to explain my users that they have to manipulate the agent properties instead to use the ui to add local storages but for other storages they have to use the ui. :/

Perhaps we find a compromise. But I'm afraid that our implementations are disjunct. Do you have an idea?

@github-actions
Copy link

github-actions bot commented Sep 5, 2022

This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

@weizhouapache
Copy link
Member

@DK101010
multiple local storage is supported since #6147 can you test it ?

@weizhouapache Damn, the last time I checked the PRs, I didn't see that one.... never mind

Hmm ... you implemented this via agent properties and my implementation works via UI like the other storages. For user acceptance reasons I prefer my solution. User can use local storages like other storage types without touch agent properties. Also no restarts required. Excuse me for insisting, but It will be hard to explain my users that they have to manipulate the agent properties instead to use the ui to add local storages but for other storages they have to use the ui. :/

Perhaps we find a compromise. But I'm afraid that our implementations are disjunct. Do you have an idea?

@DK101010

The reason I choose agent properties is, this can only be operated by administrator. they can do it by automation tools like chef, ansible, etc.

agree with you it would be nice to support it via api and UI, it will be more user-friendly. for example
(1) add host with multiple local storage pools on UI
(2) list all local storage pools of a host on UI (of course API as well)
(3) add local storage pool to a host (create libvirt storage pool, update db and agent.properties)
(4) remove local storage from a host (remove libvirt storage pool, update db and agent.properties)

1 is optional if 2,3,4 are supported. we can start by adding a host with only 1 local storage pool.

@DK101010
Copy link
Contributor Author

@weizhouapache

I would also prefer to ignore number 1.
For everything else, I can adapt the source code.

But I still have two questions.
(on point 3. and 4. - create/remove libvirt storage pool )
Does it make sense that Cloudstack can create local storage? I had thought about that, but I came to the conclusion that it would be a break in logic. As far as I know (KVM, VMware) cloudstack does not create storages. CS only adds them to its own infrastructure.

Why do the agent properties have to be adapted?
My PR works completely without the agent properties and so far I have not noticed any errors that would require an adjustment in the agent properties.
I can add/remove local storages. Create/delete/migrate VM's.

@weizhouapache
Copy link
Member

@weizhouapache

I would also prefer to ignore number 1. For everything else, I can adapt the source code.

But I still have two questions. (on point 3. and 4. - create/remove libvirt storage pool ) Does it make sense that Cloudstack can create local storage? I had thought about that, but I came to the conclusion that it would be a break in logic. As far as I know (KVM, VMware) cloudstack does not create storages. CS only adds them to its own infrastructure.

Why do the agent properties have to be adapted? My PR works completely without the agent properties and so far I have not noticed any errors that would require an adjustment in the agent properties. I can add/remove local storages. Create/delete/migrate VM's.

@DK101010
when cloudstack agent is started or restarted, it will
(1) read the storage pools (uuid and path) in agent.properties
(2) create libvirt storage pools (not the directory itself) if not exist.

you can check createLocalStoragePool method in
https://github.com/apache/cloudstack/blob/main/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/LibvirtComputingResource.java#L3419

@weizhouapache
Copy link
Member

@blueorangutan package

@blueorangutan
Copy link

@weizhouapache a Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result: ✖️ el7 ✖️ el8 ✖️ debian ✖️ suse15. SL-JID 4284

@weizhouapache
Copy link
Member

@DK101010
when the coding is done, please mark this "Ready for review"

@DK101010
Copy link
Contributor Author

DK101010 commented Oct 5, 2022

@DK101010 when the coding is done, please mark this "Ready for review"

@weizhouapache it isn't done. Currently I try to find out, why it is important to use the agent.properties file instead to create entries directly in the db via rest api.
Also open the libvirt part at these points. (Until now it is possible to attach, detach storages)
(3) add local storage pool to a host (create libvirt storage pool, update db and agent.properties)
(4) remove local storage from a host (remove libvirt storage pool, update db and agent.properties)

@DK101010 DK101010 force-pushed the feat/multi_local_storage_handling branch 2 times, most recently from 67ff6dc to 3a19913 Compare June 9, 2023 07:20
DK101010 and others added 3 commits November 8, 2023 10:35
@shwstppr
Copy link
Contributor

shwstppr commented Nov 8, 2023

@blueorangutan package

@blueorangutan
Copy link

@shwstppr a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 7675

@DaanHoogland
Copy link
Contributor

@blueorangutan test

@blueorangutan
Copy link

@DaanHoogland a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan
Copy link

[SF] Trillian test result (tid-8276)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 45281 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr6699-t8276-kvm-centos7.zip
Smoke tests completed. 115 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_08_upgrade_kubernetes_ha_cluster Failure 682.70 test_kubernetes_clusters.py
test_08_migrate_vm Error 0.07 test_vm_life_cycle.py

Copy link
Collaborator

@rajujith rajujith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DK101010 Unfortunately I do not see an option to select the 'filesystem' protocol. Neither do I get an option to add via API. 'Enable local storage for User Instances' is enabled. Restarted the management server as well.
image

image

Screenshot 2023-11-10 at 5 50 53 PM

@DK101010
Copy link
Contributor Author

@DK101010 Unfortunately I do not see an option to select the 'filesystem' protocol. Neither do I get an option to add via API. 'Enable local storage for User Instances' is enabled. Restarted the management server as well. image

image

Screenshot 2023-11-10 at 5 50 53 PM

Strange, we use it already in our environment without problems. hmm ... I will check this again next week.

@DaanHoogland
Copy link
Contributor

@DK101010 Unfortunately I do not see an option to select the 'filesystem' protocol. Neither do I get an option to add via API. 'Enable local storage for User Instances' is enabled. Restarted the management server as well. image

image

Screenshot 2023-11-10 at 5 50 53 PM

Strange, we use it already in our environment without problems. hmm ... I will check this again next week.

@DK101010 , is there maybe a prereq procedure that needs documentation? like installing/enabling a provider and or setting global configurations.

@DaanHoogland
Copy link
Contributor

@blueorangutan package

@blueorangutan
Copy link

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@DK101010
Copy link
Contributor Author

@rajujith @DaanHoogland many thanks for testing. The problem was that the filesystem protocol is dependent on scope and zone. If the zone was not selected or the zone was selected before the scope, the protocol is not updated. With the new scope event it should now also work for you

@blueorangutan
Copy link

Packaging result [SF]: ✔️ el7 ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 7731

Copy link
Collaborator

@rajujith rajujith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Steps I followed:
Enable 'Enable local storage for User Instances' at zone.

Create Local storage:

1. Create a directory in one of the KVM host.
2. Click on add primary storage in the cloudstack UI. 
3. Fill the form selecting the scope as 'HOST" and the above host. Selected the protocol 'Filesystem'.
4. Input the directory path from the above step 1. 

If the directory is not present the primary storage addition fails, I believe this is expected.
Deleted the above primary storage, the primary storage is not visible anymore in the UI. The directory remains in the KVM host filesystem.
Verified volume deployments, and migrations.

@DaanHoogland
Copy link
Contributor

@blueorangutan test

@blueorangutan
Copy link

@DaanHoogland a [SL] Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@DaanHoogland
Copy link
Contributor

@JoaoJandre can you approve now?

Copy link
Contributor

@shwstppr shwstppr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM

@DaanHoogland
Copy link
Contributor

api-rate limit again:
Smoke tests completed. 115 look OK, 2 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
test_02_trigger_shutdown Failure 336.69 test_safe_shutdown.py
test_02_upgrade_kubernetes_cluster Failure 551.92 test_kubernetes_clusters.py

I think we can call this a success

Copy link
Contributor

@JoaoJandre JoaoJandre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM

@DaanHoogland DaanHoogland merged commit 6001772 into apache:main Nov 16, 2023
18 of 26 checks passed
dhslove pushed a commit to ablecloud-team/ablestack-cloud that referenced this pull request Nov 29, 2023
Co-authored-by: DK101010 <dirk.klahre@itelligence.de>
Co-authored-by: João Jandre <48719461+JoaoJandre@users.noreply.github.com>
@andrijapanicsb
Copy link
Contributor

Whoever is reviewing the PR, can we please try to ensure that the proper documentation is also provided for a feature? I'll add the related docs for this one. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants