Description
ISSUE TYPE
- Bug Report
COMPONENT NAME
API / Backend
CLOUDSTACK VERSION
4.19.1.x
CONFIGURATION
The issue occurs with both basic and advanced networking. The configuration is shown below.
OS / ENVIRONMENT
Used setup:
Management + KVM host:
- OS: Ubuntu (jammy)
- Network: both use the same CIDR
Management:
- serves MySQL
- serves NFS shares for primary and secondary storage
KVM host:
- cloudbr0 configured
- Agent installed
- Joined via public ssh key (root)
SUMMARY
When setting up a zone (basic or advanced), the KVM host joins the cluster, but the SSVM and the CPVM stay stuck in "Starting". The SSVM log file shows an SSL error:
2024-11-20 23:54:22,618 WARN [cloud.agent.Agent] (main:null) NIO Connection Exception com.cloud.utils.exception.NioConnectionException: SSL Handshake failed while connecting to host: **.**.**.** port: 8250
The same log indicates that the cloud agent on the SSVM was not able to load the keystore:
2024-11-20 23:54:21,927 WARN [utils.nio.Link] (main:null) Failed to load keystore, using trust all manager
After some experimenting, I found that the cloud agent expects a keystore named cloud.jks in the /usr/local/cloud/systemvm/conf directory, which is populated from the /etc/cloudstack directory. Unfortunately, /etc/cloudstack is empty on the VM.
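A quick way to confirm this from a shell on the system VM is sketched below. The helper name check_keystore is hypothetical (it is not part of CloudStack), the two directories are simply the ones named above, and the check only tests whether a non-empty cloud.jks file exists; it does not validate the keystore contents.

```shell
#!/usr/bin/env bash
# check_keystore is a hypothetical helper, not part of CloudStack.
# It reports whether a non-empty cloud.jks exists in the given directory.
check_keystore() {
    local dir="$1"
    local keystore="${dir}/cloud.jks"
    if [ -s "$keystore" ]; then
        echo "OK: ${keystore}"
        return 0
    fi
    echo "MISSING: ${keystore}"
    return 1
}

# Directories taken from the description above (assumed, adjust as needed).
for dir in /etc/cloudstack /usr/local/cloud/systemvm/conf; do
    check_keystore "$dir" || true
done
```

On an affected VM this would be expected to print a MISSING line for a directory without a keystore, which would match the "Failed to load keystore" warning.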
I already tried to work around this by setting the global configuration parameter ca.plugin.root.auth.strictness to false. That did not really work for me and gave unexpected results:
- The agent state of the system VMs immediately turned to Up, while the overall state remained Starting.
- After restarting the CloudStack management server and/or the CloudStack agent, both states were Up.
- Creating a compute instance with a guest network is not possible; the virtual router instance aborts with an error state (possibly due to the same keystore issue).
STEPS TO REPRODUCE
Set up the management server:
apt-get install -y \
apt-transport-https \
bridge-utils \
ca-certificates \
curl \
chrony \
gnupg \
lsb-release \
mysql-server \
net-tools \
nfs-kernel-server \
quota \
software-properties-common \
unattended-upgrades
cat <<'EOF' > /etc/mysql/mysql.conf.d/cloudstack.cnf
[mysqld]
server-id=1
innodb_rollback_on_timeout=1
innodb_lock_wait_timeout=600
max_connections=350
log-bin=mysql-bin
binlog-format = 'ROW'
EOF
systemctl restart mysql
wget -O - https://download.cloudstack.org/release.asc | tee /etc/apt/trusted.gpg.d/cloudstack.asc
echo "deb https://download.cloudstack.org/ubuntu noble 4.19" | tee /etc/apt/sources.list.d/cloudstack.list
apt-get update
apt-get install -y cloudstack-management
mkdir -p /export/primary /export/secondary
echo "/export *(rw,async,no_root_squash,no_subtree_check,insecure)" >> /etc/exports
exportfs -a
sed -i -e 's/^RPCMOUNTDOPTS="--manage-gids"$/RPCMOUNTDOPTS="-p 892 --manage-gids"/g' /etc/default/nfs-kernel-server
sed -i -e 's/^STATDOPTS=$/STATDOPTS="--port 662 --outgoing-port 2020"/g' /etc/default/nfs-common
echo "NEED_STATD=yes" >> /etc/default/nfs-common
sed -i -e 's/^RPCRQUOTADOPTS=$/RPCRQUOTADOPTS="-p 875"/g' /etc/default/quota
service nfs-kernel-server restart
cloudstack-setup-databases ***:***@localhost --deploy-as=root -i 127.0.0.1
cloudstack-setup-management
Set up the KVM host:
apt-get install -y \
apt-transport-https \
bridge-utils \
ca-certificates \
curl \
chrony \
gnupg \
lsb-release \
net-tools \
quota \
software-properties-common \
unattended-upgrades
cat <<'EOM' > /etc/netplan/01-netcfg.yaml
network:
  version: 2
  ethernets:
    eth0: {}
  bridges:
    cloudbr0:
      addresses:
        - **.**.**.**/**
      nameservers:
        addresses:
          - **.**.**.**
      routes:
        - to: default
          via: **.**.**.**
          metric: 100
      interfaces: [eth0]
EOM
chmod 600 /etc/netplan/01-netcfg.yaml
mv /etc/netplan/50-cloud-init.yaml /etc/netplan/50-cloud-init.yaml.dist
netplan generate && netplan apply
wget -O - https://download.cloudstack.org/release.asc | tee /etc/apt/trusted.gpg.d/cloudstack.asc
echo "deb https://download.cloudstack.org/ubuntu noble 4.19" | tee /etc/apt/sources.list.d/cloudstack.list
apt-get update
apt-get install -y qemu-kvm cloudstack-agent
sed -i -e 's/\#vnc_listen.*$/vnc_listen = "0.0.0.0"/g' /etc/libvirt/qemu.conf
systemctl mask libvirtd.socket libvirtd-ro.socket libvirtd-admin.socket libvirtd-tls.socket libvirtd-tcp.socket
systemctl restart libvirtd
mv /etc/libvirt/libvirtd.conf /etc/libvirt/libvirtd.conf.dist
cat <<'EOM' > /etc/libvirt/libvirtd.conf
listen_tls=0
listen_tcp=0
tcp_port = "16509"
mdns_adv = 0
auth_tcp = "none"
EOM
systemctl restart libvirtd
modprobe br_netfilter
echo 'net.bridge.bridge-nf-call-arptables = 0' >> /etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-iptables = 0' >> /etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-ip6tables = 0' >> /etc/sysctl.conf
sysctl -p
- Copy the SSH key of the CloudStack management server to the KVM host.
- Wait until the management server is ready.
- Log in and create an advanced zone using the gateway from the CloudStack subnet, assigning reserved IP ranges to the pod and to public traffic.
- Create primary/secondary storage and join the host using the SSH key and the root account.
- Enable the zone.
- Navigate to SystemVM under the Infrastructure menu and see 2 VMs in the Starting state.
EXPECTED RESULTS
The SSVM and CPVM start up and the cloud agent is running. Creating compute instances that use a virtual router (isolated guest network) is possible.
ACTUAL RESULTS
Here is the (hopefully) relevant log snippet.
2024-11-20 23:54:21,734 INFO [cloud.agent.Agent] (main:null) Agent [id = new : type = PremiumSecondaryStorageResource : zone = 1 : pod = 1 : workers = 5 : host = **.**.**.** : port = 8250
2024-11-20 23:54:21,809 INFO [utils.nio.NioClient] (main:null) Connecting to **.**.**.**:8250
2024-11-20 23:54:21,828 INFO [utils.nio.Link] (main:null) Conf file found: /usr/local/cloud/systemvm/conf/agent.properties
2024-11-20 23:54:21,927 WARN [utils.nio.Link] (main:null) Failed to load keystore, using trust all manager
2024-11-20 23:54:22,597 ERROR [utils.nio.Link] (main:null) SSL error caught during unwrap data: Received fatal alert: bad_certificate, for local address=/**.**.**.**:43322, remote address=/**.**.**.**:8250. The client may have invalid ca-certificates.
2024-11-20 23:54:22,602 ERROR [utils.nio.NioClient] (main:null) SSL Handshake failed while connecting to host: **.**.**.** port: 8250
2024-11-20 23:54:22,604 ERROR [utils.nio.NioConnection] (main:null) Unable to initialize the threads.
java.io.IOException: SSL Handshake failed while connecting to host: **.**.**.** port: 8250
at com.cloud.utils.nio.NioClient.init(NioClient.java:67)
at com.cloud.utils.nio.NioConnection.start(NioConnection.java:95)
at com.cloud.agent.Agent.start(Agent.java:286)
at com.cloud.agent.AgentShell.launchNewAgent(AgentShell.java:454)
at com.cloud.agent.AgentShell.launchAgentFromClassInfo(AgentShell.java:431)
at com.cloud.agent.AgentShell.launchAgent(AgentShell.java:415)
at com.cloud.agent.AgentShell.start(AgentShell.java:511)
at com.cloud.agent.AgentShell.main(AgentShell.java:541)
2024-11-20 23:54:22,618 WARN [cloud.agent.Agent] (main:null) NIO Connection Exception com.cloud.utils.exception.NioConnectionException: SSL Handshake failed while connecting to host: **.**.**.** port: 8250
2024-11-20 23:54:22,618 INFO [cloud.agent.Agent] (main:null) Attempted to connect to the server, but received an unexpected exception, trying again...
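To pull just these messages out of a longer agent log, something like the following can help. This is only a sketch: scan_agent_log is a hypothetical helper, the log path has to be supplied by the caller (no particular location is assumed here), and the two patterns are simply the message texts from the snippet above.

```shell
#!/usr/bin/env bash
# scan_agent_log is a hypothetical helper, not part of CloudStack.
# It counts the lines in a log file that match the two messages seen above.
scan_agent_log() {
    local logfile="$1"
    local keystore_hits handshake_hits
    # grep -c prints the number of matching lines (0 if none); "|| true"
    # hides grep's non-zero exit status when there are no matches.
    keystore_hits=$(grep -c 'Failed to load keystore' "$logfile" || true)
    handshake_hits=$(grep -c 'SSL Handshake failed' "$logfile" || true)
    echo "keystore_failures=${keystore_hits} handshake_failures=${handshake_hits}"
}

# Usage: pass the path of the agent log as the first argument.
if [ -n "${1:-}" ]; then
    scan_agent_log "$1"
fi
```

A non-zero keystore_failures count alongside handshake failures would point at the missing keystore rather than at a plain network problem.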
Thanks for looking at it✌️