Can't attach volume when the trident DaemonSet can't connect to trident-csi for the first time. #283
Comments
I saw this with K8s 1.15.3 + Trident 19.07.0 last week.
Does the fix also resolve scenarios where the connection is lost after some time, and not just on startup? I've now experienced both issues.
@travisghansen Each CSI Trident node (i.e. each instance of the daemonset) must contact the Trident controller so that the controller knows the node info such as its name and iSCSI IQN. As long as that occurs at least once, it shouldn't matter if a node comes and goes any time after that.
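To make the registration dependency concrete, here is a minimal sketch of a controller-side node registry. It is not Trident's actual code: `nodeInfo`, `nodeRegistry`, `register`, and `iqnForAttach` are hypothetical names chosen for illustration. It only shows why an attach for a node that never registered would fail with a "not found" style error.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeInfo holds the details a node reports about itself.
// The field names are illustrative, not Trident's real schema.
type nodeInfo struct {
	Name string
	IQN  string
}

// nodeRegistry is a hypothetical in-memory view of the registered nodes.
type nodeRegistry struct {
	mu    sync.RWMutex
	nodes map[string]nodeInfo
}

func newNodeRegistry() *nodeRegistry {
	return &nodeRegistry{nodes: make(map[string]nodeInfo)}
}

// register records (or overwrites) a node's details.
func (r *nodeRegistry) register(n nodeInfo) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.nodes[n.Name] = n
}

// iqnForAttach returns the IQN needed at attach time, or an error if the
// node has never registered with this controller.
func (r *nodeRegistry) iqnForAttach(nodeName string) (string, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	n, ok := r.nodes[nodeName]
	if !ok {
		return "", fmt.Errorf("node %s not found", nodeName)
	}
	return n.IQN, nil
}
```

In this sketch, a single successful `register` call is enough for every later `iqnForAttach` lookup to succeed, which mirrors the "at least once" point above.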
@clintonk understood. Something else must be up then. I initially ran into this issue when the Trident pod was starting up. I just had another issue where a (non-Trident) pod never started because the Trident pod on the node lost its connection to the controller after running for several days. The pod was sitting in an I essentially want to make sure that it continually monitors and tries to connect over time.
Also note, if I understand correctly, the iSCSI/IQN handling happens at attach time, so a connection to the controller would still need to function beyond startup of the pod. I may have misunderstood how/when that info gets added, though. That also assumes the controller, rather than the node pods, makes those API calls, which may not be the case; I'm not sure.
@travisghansen Prior to this fix, if the CSI Trident node was restarted for whatever reason, and even if it had previously registered with the controller, if it could not contact the controller after the most recent restart, then it would give up after 90 seconds and no longer respond to attachment requests. This fix continues trying the registration indefinitely until it succeeds, and it does not block other node operations while doing so. FWIW, while attaching a volume to a pod, a CSI provisioner is first called at the controller followed by the node. The controller must know the node details, because only the controller interacts with the storage management interfaces, and knowing (for example) the node IQN allows us to update igroups as needed at attach time.
@clintonk
After discussion on Slack with @clintonk it became clear I misread some logs. Connection after startup is irrelevant. Thanks for the help! Looking forward to the fix.
brackets for datalif, required for mounting when specified by user
Trident 19.07.0 with CRD can't attach volume when the trident DaemonSet can't connect to trident-csi for the first time.
I think the Trident DaemonSet pod must crash (or retry forever) when it can't register itself.
The details are below.
(I changed the node name to my-worker1 for security reasons.)
I found the pods below in my cluster, which mount Trident volumes.
I checked Trident's log (trident-csi).
The log said "node Not Found", but the node existed.
I found that there were no tridentnodes.
I checked the DaemonSet pod.
This was the cause of the trouble!
I deleted the pod and checked that the re-created pod works well.
I checked tridentnodes.
My pods are working!
I think the Trident DaemonSet pod must crash (or retry forever) when it can't register itself.