Extension fails to load after node restart #137
This seems to be an issue with |
I cannot reproduce this issue, are you using some extra mount under
Also, this error is not from iscsi-tools:
spec: failed to generate spec: failed to mkdir "/usr/local/etc/iscsi": mkdir /usr/local/etc: read-only file system
It must be some workload that's on the cluster.
Hi @frezbo again! Correct, I'm using the democratic-csi plugin; apologies if it's related to that. The specific container that seems to fail is driver-registrar.
For me this happens consistently on a restarted node (normally when moving a VM around, which isn't that frequent and could be avoided, but it's still important that it works as before):
I only have two nodes currently. The one that's running is identical in configuration (to my knowledge) and is working; the bottom one has been rebooted. If I reboot the other one, my prediction from what I've seen thus far is that it would suffer the same problem (I'm reluctant to do that as it's running some stuff like HA/Omada etc.). I've spun up fresh nodes using talos wipe --> send new configuration file, and it works every time; after a restart, this issue appears. The node configuration file is exactly the same; only the IP/host, and therefore the node name, changes. Apologies if it's my lack of understanding in the area or something silly I'm doing; your help is appreciated in all cases! If this should be raised with democratic-csi then that's fine (and sorry for the time), but the extension on the node led me to believe it might be the Talos extension.
This seems likely to be an issue with the CSI. From the looks of the error, maybe it wasn't patched to run the commands in the iscsi-tools extension PID namespace, and it is trying to create files on the host, which is read-only.
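As a rough illustration of what "running the commands in the extension's namespace" could look like, here is a minimal sketch that locates a process by name and builds an `nsenter` command line targeting that process's namespaces. It is not taken from democratic-csi or Talos: the function names are hypothetical, the process name `iscsid` and the choice of the mount and network namespaces are assumptions, and the binary path passed in is only an example.

```python
import os
from typing import List, Optional


def find_pid_by_name(name: str, proc_root: str = "/proc") -> Optional[int]:
    """Return the lowest PID whose <proc_root>/<pid>/comm equals `name`, or None."""
    pids = sorted(int(entry) for entry in os.listdir(proc_root) if entry.isdigit())
    for pid in pids:
        try:
            with open(os.path.join(proc_root, str(pid), "comm")) as fh:
                if fh.read().strip() == name:
                    return pid
        except OSError:
            # The process may have exited, or the entry may be unreadable.
            continue
    return None


def build_nsenter_command(pid: int, argv: List[str]) -> List[str]:
    """Build an nsenter invocation that runs `argv` inside the mount and
    network namespaces of process `pid` (assumed to be the extension's daemon)."""
    return [
        "nsenter",
        f"--mount=/proc/{pid}/ns/mnt",
        f"--net=/proc/{pid}/ns/net",
        "--",
        *argv,
    ]


if __name__ == "__main__":
    # Assumption: the extension's daemon shows up as "iscsid" on the host.
    pid = find_pid_by_name("iscsid")
    if pid is None:
        print("iscsid not found")
    else:
        print(" ".join(build_nsenter_command(pid, ["iscsiadm", "-m", "session"])))
```

Seeing the host's processes from inside a pod generally requires the pod to share the host PID namespace, so a sketch like this only applies where that is already the case.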
I've updated to Talos v1.4.6 and Kubernetes v1.27.1, and this is still occurring. I'm not sure where the issue is, but here is some extra information: I had a power cut, which forced all nodes to restart. What is new is that, even though all three had very similar configuration (I would say identical for all intents and purposes), one of them has worked after a restart:
Therefore it is able to create such files in the usual cases, which still makes me think it's extension or Talos related. It's still giving the same error on the nodes that failed to set up iSCSI.
Failed node 1 describe pod:
Failed node 2 describe pod [post-restart]:
Working node [post-restart]:
At the moment the only fix I'm aware of is to re-roll the node with the same config, which is very frustrating. I'm considering switching back to NFS or any other recommended storage solution (I am running Ceph). I'm not sure if this would be a problem with any of the CSI drivers or just iSCSI. Any help appreciated.
I'm using the iSCSI extension in my machine install section:
install:
  disk: /dev/sda
  image: ghcr.io/siderolabs/installer:v1.3.5
  bootloader: true
  wipe: true
  extensions:
    - image: ghcr.io/siderolabs/iscsi-tools:v0.1.4
This might actually be an issue specific to the iscsi-tools extension and not Talos, but I'm reporting it nonetheless.
For a new node this works great and as expected, but as soon as I restart the node I get this error:
spec: failed to generate spec: failed to mkdir "/usr/local/etc/iscsi": mkdir /usr/local/etc: read-only file system
I was expecting the node restart to behave the same. I suspect something is done on initial install with the iscsi-tools install that isn't recreated on a restart after the initial installation. It is very frustrating to have to re-roll nodes whenever they need a reboot.
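To narrow down where the "read-only file system" part of the error comes from, one option is to check which mount a path falls under and whether that mount carries the `ro` option. The sketch below parses text in the `/proc/mounts` format; the function names are hypothetical, and it ignores details such as escaped spaces in mount points.

```python
from typing import List, Optional, Tuple


def find_mount(path: str, mounts_text: str) -> Optional[Tuple[str, List[str]]]:
    """Return (mount_point, options) for the longest mount point containing `path`.

    When two entries share a mount point, the later one wins, since later
    mounts shadow earlier ones.
    """
    best: Optional[Tuple[str, List[str]]] = None
    normalized = path.rstrip("/") + "/"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # not a well-formed mounts line
        mount_point, options = fields[1], fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if normalized.startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, options)
    return best


def is_read_only(path: str, mounts_text: str) -> bool:
    """True if the mount containing `path` is mounted with the `ro` option."""
    mount = find_mount(path, mounts_text)
    return mount is not None and "ro" in mount[1]


if __name__ == "__main__":
    # Reads the mount table of whatever namespace this script runs in.
    with open("/proc/mounts") as fh:
        table = fh.read()
    for candidate in ("/usr/local/etc", "/var"):
        print(candidate, find_mount(candidate, table), is_read_only(candidate, table))
```

Run inside the failing container and on the host, the two results may differ, because each mount namespace has its own mount table.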
Environment
Tag: v1.3.5
SHA: 03edf8c1
Built:
Go version: go1.19.6
OS/Arch: linux/amd64
Enabled: RBAC
Server Version: v1.26.1
Linux