After upgrade from 4.18.0 to 4.18.1 cloudstack-agent not starting #8604
Comments
@yashi4engg |
@weizhouapache -- we tried a workaround by replacing the content of redhat-release with that of the oracle-release file, and we are now able to add the node to the cluster. However, we are now unable to create a VM and get the error below, even though we have enough resources. 2024-02-05 14:44:19,773 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-14:ctx-5789063c job-295587) (logid:5f922a22) Unexpected exception while executing org.apache.cloudstack.api.command.admin.vm.DeployVMCmdByAdmin |
On the hypervisor side we can see the error below in the agent logs: |
@yashi4engg |
We are able to create VMs now, and the hosts have been added back to CloudStack. But we still have one question. Did something change between 4.18.0 and 4.18.1 that causes this issue? The same hypervisors were added to CloudStack under 4.18.0 without any change, but as soon as we upgraded to 4.18.1, with the OS version unchanged and no OS files updated, the hosts could not be added and the Host.OS property had to be changed. Expected: the hosts should be added back without any change, as they were added earlier with the same properties. |
I agree with you, @yashi4engg. Any idea how to fix it, @DaanHoogland? This is related to #7570 |
If I read this correctly, the file /etc/redhat-release was edited. This is not the correct procedure. Instead, the host details for the hosts in the cluster should be updated. I see this didn't make it into the release notes. |
@DaanHoogland -- I agree with you, but we did that as a workaround. The Host.OS property was already showing Oracle in the DB, but the host was still unable to join the cluster, so we made this change and the host was able to join. Do you suggest updating the Host.OS property to Red Hat rather than changing the release file? |
I would suggest editing the host detail in the database for the hosts in the cluster to match the contents of the redhat-release file. That way, freshly installed hosts should be able to join the cluster without further manipulation in /etc. Can you share the original contents of /etc/redhat-release and the value that you replaced it with? |
cat /etc/redhat-release cat /etc/oracle-release I have now fixed the issue by updating the Host.OS value in the DB, and I have reverted the redhat-release contents to those of the default installation, as above. |
The issue is now resolved for us after updating the Host.OS value, but our concern is that this should not be needed in the general case: a host should be added back by default, without any change, after an upgrade. |
@yashi4engg this is an omission in the installation notes. |
On second thought, I'll first give it some thought as to whether it can be, or should have been, automated. |
@DaanHoogland which includes
If we get version from
|
Your PR would solve the issue completely as we can just add strings like "Red" and "Red Hat" in the list. |
I checked this in a bit more detail and found the file responsible for checking the hypervisor OS version: "/usr/share/cloudstack-common/scripts/vm/hypervisor/versions.sh". According to that file, it first looks for redhat-release and, if the file exists, gets the details from there. if [ -f /etc/redhat-release ] ; then |
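To illustrate why the lookup order matters, here is a minimal sketch of a release-file check that prefers /etc/redhat-release. It is not the actual content of versions.sh: the function name `detect_host_os`, the optional root-prefix argument, the `sed` expression, and the /etc/os-release fallback are all assumptions made for the example. On a host that ships both a redhat-release file and another vendor file, a check ordered like this reports whatever redhat-release says.

```shell
#!/bin/sh
# Hypothetical helper (NOT the real versions.sh): print the host OS name,
# checking /etc/redhat-release before anything else.
# $1 is an optional root prefix, used here only so the sketch can be
# exercised against a scratch directory instead of the live /etc.
detect_host_os() {
    root="${1:-}"
    if [ -f "$root/etc/redhat-release" ]; then
        # Keep only the product name, e.g.
        # "Red Hat Enterprise Linux release 8.9 (Ootpa)" -> "Red Hat Enterprise Linux"
        sed -e 's/ release.*//' "$root/etc/redhat-release" | head -n 1
    elif [ -f "$root/etc/os-release" ]; then
        # Assumed fallback: read NAME from os-release in a subshell so the
        # sourced variables do not leak into the caller.
        ( . "$root/etc/os-release" && printf '%s\n' "$NAME" )
    else
        echo "Unknown"
    fi
}
```

Calling `detect_host_os` with no argument reads the files under the real /etc. If the name it prints differs from the Host.OS value stored for the other hosts in the cluster, that is the kind of mismatch the management server rejects in the log above.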
yes, this can be improved. |
ISSUE TYPE
COMPONENT NAME
CLOUDSTACK VERSION
CONFIGURATION
OS / ENVIRONMENT
SUMMARY
We are trying to upgrade from 4.18.0 to 4.18.1.
We have upgraded the management node, and it is up with systemVM version 4.18.1.
While upgrading the hypervisors, cloudstack-agent does not start after the package upgrade.
The logs are below:
2024-02-02 14:26:39,507 INFO [cloud.agent.AgentShell] (main:null) (logid:) Implementation Version is 4.18.1.0
2024-02-02 14:26:39,508 INFO [cloud.agent.AgentShell] (main:null) (logid:) agent.properties found at /etc/cloudstack/agent/agent.properties
2024-02-02 14:26:39,546 INFO [cloud.agent.AgentShell] (main:null) (logid:) Defaulting to using properties file for storage
2024-02-02 14:26:39,546 INFO [cloud.agent.AgentShell] (main:null) (logid:) Defaulting to the constant time backoff algorithm
2024-02-02 14:26:39,580 INFO [cloud.utils.LogUtils] (main:null) (logid:) log4j configuration found at /etc/cloudstack/agent/log4j-cloud.xml
2024-02-02 14:26:39,581 INFO [cloud.agent.AgentShell] (main:null) (logid:) Using default Java settings for IPv6 preference for agent connection
2024-02-02 14:26:39,655 INFO [cloud.agent.Agent] (main:null) (logid:) id is 0
2024-02-02 14:26:39,665 ERROR [kvm.resource.LibvirtComputingResource] (main:null) (logid:) uefi properties file not found due to: Unable to find file uefi.properties.
2024-02-02 14:26:39,706 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Failed to find passphrase for keystore: cloud.jks
2024-02-02 14:26:39,709 INFO [kvm.resource.LibvirtConnection] (main:null) (logid:) No existing libvirtd connection found. Opening a new one
2024-02-02 14:26:39,799 WARN [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Ignoring libvirt error.
org.libvirt.LibvirtException: Network not found: no network with matching name 'default'
at org.libvirt.ErrorHandler.processError(Unknown Source)
at org.libvirt.ErrorHandler.processError(Unknown Source)
at org.libvirt.Connect.networkLookupByName(Unknown Source)
at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.configure(LibvirtComputingResource.java:1081)
at com.cloud.agent.Agent.<init>(Agent.java:190)
at com.cloud.agent.AgentShell.launchNewAgent(AgentShell.java:452)
at com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:431)
at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:415)
at com.cloud.agent.AgentShell.start(AgentShell.java:511)
at com.cloud.agent.AgentShell.main(AgentShell.java:541)
2024-02-02 14:26:39,916 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) IO uring driver for Qemu: disabled
2024-02-02 14:26:39,977 INFO [kvm.storage.KVMStoragePoolManager] (main:null) (logid:) adding storage adaptor for com.cloud.hypervisor.kvm.storage.LinstorStorageAdaptor
2024-02-02 14:26:39,980 INFO [kvm.storage.KVMStoragePoolManager] (main:null) (logid:) adding storage adaptor for com.cloud.hypervisor.kvm.storage.StorPoolStorageAdaptor
2024-02-02 14:26:39,980 WARN [kvm.storage.KVMStoragePoolManager] (main:null) (logid:) Duplicate StorageAdaptor type PowerFlex, not loading com.cloud.hypervisor.kvm.storage.ScaleIOStorageAdaptor
2024-02-02 14:26:39,980 INFO [kvm.storage.KVMStoragePoolManager] (main:null) (logid:) adding storage adaptor for com.cloud.hypervisor.kvm.storage.IscsiAdmStorageAdaptor
2024-02-02 14:26:39,981 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) No libvirt.vif.driver specified. Defaults to BridgeVifDriver.
2024-02-02 14:26:40,116 INFO [cloud.serializer.GsonHelper] (main:null) (logid:) Default Builder inited.
2024-02-02 14:26:40,116 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) iscsi session clean up is disabled
2024-02-02 14:26:40,118 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) Skipping the memory balloon stats period setting, since there are no VMs (active Libvirt domains) on this host.
2024-02-02 14:26:40,119 INFO [kvm.resource.LibvirtComputingResource] (main:null) (logid:) The [vm.memballoon.stats.period] property is set to '0', this prevents memory statistics from being displayed correctly. Adjust (increase) the value of this parameter to correct this.
We are using the KVM native bridge for networking.
On the management server we can see this exception:
2024-02-02 14:46:06,722 DEBUG [c.c.a.m.AgentManagerImpl] (AgentConnectTaskPool-1175:ctx-9a210df2) (logid:139886e2) Failed to handle host connection:
java.lang.IllegalArgumentException: Can't add host: x.x.x.x with hostOS, "Red Hat Enterprise Linux"into a cluster, in which there are "Oracle Linux Server" hosts added.
STEPS TO REPRODUCE
EXPECTED RESULTS
ACTUAL RESULTS