
iscsiadm: initiator reported error (19 - encountered non-retryable iSCSI login failure) / Could not log into all portals #140

Closed
ahgraber opened this issue Dec 23, 2021 · 11 comments


@ahgraber

I am suddenly getting this new error message; it seems similar to #112. I had not changed my democratic-csi config since mid-October, and the error started a few days ago. I'm running democratic-csi chart 0.8.3 with the freenas-api-iscsi driver.

MountVolume.MountDevice failed for volume "pvc-5a7349c4-b25e-45a6-9475-c18e4bc7cc6c" : rpc error: code = Internal desc = {"code":19,"stdout":"Logging in to [iface: default, target: iqn.2005-10.org.freenas.ctl:flux-nextcloud-nextcloud-db, portal: 10.2.1.1,3260] (multiple)\n","stderr":"iscsiadm: Could not login to [iface: default, target: iqn.2005-10.org.freenas.ctl:flux-nextcloud-nextcloud-db, portal: 10.2.1.1,3260].\niscsiadm: initiator reported error (19 - encountered non-retryable iSCSI login failure)\niscsiadm: Could not log into all portals\n"}

Per #112, I have tried running systemctl restart scst on SCALE, although several targets were already available when I started receiving the error. I also tried restarting SCALE, updating SCALE from TrueNAS-SCALE-22.02-RC.1 to TrueNAS-SCALE-22.02-RC.2, and restarting nodes, to no avail.

LMK if I can hunt down other logs or provide additional config files that might be of use.
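For reference, the login that democratic-csi attempts can be reproduced by hand from the affected node; a rough sketch using the target/portal from the error above:

```sh
# Discover targets exposed by the TrueNAS portal (portal taken from the error above)
iscsiadm -m discovery -t sendtargets -p 10.2.1.1:3260

# Attempt the same login the CSI node plugin performs
iscsiadm -m node \
  -T iqn.2005-10.org.freenas.ctl:flux-nextcloud-nextcloud-db \
  -p 10.2.1.1:3260 --login
```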

@travisghansen
Member

Yeah, get the scst logs from the server to see what's going on server-side.

@ahgraber
Author

ahgraber commented Dec 23, 2021

systemctl status scst reports a bunch of

Dec 23 15:55:47 truenas.mydomain.com iscsi-scstd[1735709]: Connect from 10.2.113.31:35336 to 10.2.1.1:3260
Dec 23 15:55:47 truenas.mydomain.com iscsi-scstd[1735709]: Initiator iqn.1993-08.org.debian:01:4efdaa48c143 not allowed to connect to target iqn.2005-10.org.freenas.>

Edit --
Looks like the freenas box might be denying the connection, but that portal is set to allow all initiators...

Also, two days ago (when I first started seeing these errors), the error reported a different reason (cannot allocate memory). A reboot seems to have resolved that problem, though.

Dec 21 09:17:18 truenas.mydomain.com iscsi-scstd[13361]: Connect from 10.2.113.31:50938 to 10.2.1.1:3260
Dec 21 09:17:18 truenas.mydomain.com iscsi-scstd[13361]: Can't create sess 0x4941000003d0200 (tid 8, initiator iqn.1993-08.org.debian:01:4efdaa48c143): Cannot allocate memory
Dec 21 09:17:19 truenas.mydomain.com iscsi-scstd[13361]: Connect from 10.2.113.30:53482 to 10.2.1.1:3260

@travisghansen
Member

Use journalctl to get the full logs. Probably send over the scst.conf file and the output of lsmod as well.
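Something along these lines should capture it all (assuming scst.conf lives at /etc/scst.conf on SCALE; adjust the path if not):

```sh
# Full scst service log for the current boot
journalctl -u scst -b > scst-journal.log

# Generated SCST configuration and currently loaded kernel modules
cp /etc/scst.conf scst.conf
lsmod > lsmod.txt
```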

@ahgraber
Author

Requested data attached

debug_logs.zip

@travisghansen
Member

travisghansen commented Dec 23, 2021

Things generally look sane with the exception of the scst.conf file. It appears the extents have disappeared (for the nextcloud volumes). If you look at the SCALE admin UI, how many extents do you see in the list?

Essentially the targets are pointing to non-existent extents in the config file, which would explain the failures I'm guessing. If you look in the TARGET sections you'll see a line like `LUN 0 flux-nextcloud-nextcloud-db`, where `flux-nextcloud-nextcloud-db` matches exactly a DEVICE name/id in the earlier part of the file (and those DEVICE entries are clearly missing).

If the extents do show up in the admin UI then there's some breakdown in the config file generation process; if they do not show up in the admin UI then it begs the question how/why did they get deleted?
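To illustrate what a consistent mapping looks like, here is a rough sketch of the two halves of scst.conf that have to line up (the handler name and zvol path are illustrative, not taken from your config):

```
HANDLER vdisk_blockio {
    DEVICE flux-nextcloud-nextcloud-db {
        filename /dev/zvol/tank/flux-nextcloud-nextcloud-db
    }
}

TARGET_DRIVER iscsi {
    TARGET iqn.2005-10.org.freenas.ctl:flux-nextcloud-nextcloud-db {
        LUN 0 flux-nextcloud-nextcloud-db
        enabled 1
    }
}
```

In your file the TARGET block is present but the matching DEVICE block is not, so scst has nothing to back LUN 0 with.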

@ahgraber
Author

ahgraber commented Dec 23, 2021

I do see the extents in SCALE ui:

Screen Shot 2021-12-23 at 17 30 21

I had an extra extent present in the UI that should have been removed previously. I deleted it and will see if that was offsetting the index somehow and preventing the extents from mapping.

@travisghansen
Member

Can you send over a screenshot of the associated targets tab as well?

What exactly do you mean by extra?

@ahgraber
Author

ahgraber commented Dec 23, 2021

Targets:

Screen Shot 2021-12-23 at 18 45 55

Associated Targets:

Screen Shot 2021-12-23 at 18 46 01

I had a leftover extent `flux-vaultwarden-test-config-vaultwarden` whose backing zfs volume (and associated PV and PVC) had already been deleted.

After removing this "leftover" extent, restarting the scst service, and redeploying the k8s workload, the error no longer occurs and the deployment succeeds.
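For anyone hitting the same thing, the recovery boiled down to roughly this (stale extent deleted via the SCALE UI first; the namespace/workload names below are placeholders for my setup):

```sh
# On the TrueNAS SCALE box, after deleting the stale extent in the UI
systemctl restart scst

# On the k8s side, recreate the pods so the mount is retried
kubectl -n <namespace> rollout restart deployment <workload>
```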

I'm unsure why the extent was left behind after the targets were removed. Perhaps because the iscsi storageClass and volumeSnapshotClass are set to Retain, so even if I kubectl delete the PV and PVC and then zfs destroy the associated volumes, something lingers in the iscsi config?

@travisghansen
Member

If it was provisioned by this project then the extent should be deleted and for sure everything should get torn down (assuming a Delete policy on the pv).

The second issue seems to be that the TrueNAS middleware should handle that scenario more gracefully when generating the config file: ignore invalid entries but continue with the valid ones.

@ahgraber
Author

With a Retain policy, what is the appropriate way to remove a volume?

@travisghansen
Member

Just kubectl delete the pv. Retain doesn't really do anything special other than prevent the pv from being deleted when a bound pvc is deleted.
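In command form that's roughly the following (the PVC name/namespace are placeholders; the PV name is the example from the error earlier in this thread):

```sh
# Remove the claim if it still exists, then the PV object itself
kubectl delete pvc <claim-name> -n <namespace>
kubectl delete pv pvc-5a7349c4-b25e-45a6-9475-c18e4bc7cc6c
```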
