Nodes/PODs lost access to iSCSI based PVCs during / after upgrade to 25.02.1 #1001

@ptrkmkslv

Describe the bug
We recently upgraded Trident from 24.10.1 to 25.02.1.

After the operator completed the upgrade and the trident-node pods were replaced, we faced severe consequences: pods lost access to most of their volumes (but not all of them). The same thing happened when we upgraded from 24.10.0 to 24.10.1.

Before the upgrade, the clusters were checked: all running pods had access to all of their PVC volumes, both paths shown by 'multipath -ll' were healthy for every dm-xxx device, and no orphaned or ghost /dev/sdxx block devices were present on the system.
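The pre-upgrade check described above can be scripted. The following is only a minimal sketch, assuming the usual 'multipath -ll' text layout (a header line containing the dm-N name, followed by path lines of the form "H:C:T:L sdX major:minor dm-state path-state online-state"). The function names, the SAMPLE text and its WWIDs are hypothetical and are not taken from the affected cluster.

```python
import re

# Path lines look like: "| `- 3:0:0:0 sdb 8:16 active ready running"
PATH_RE = re.compile(r"(\d+:\d+:\d+:\d+)\s+(sd\w+)\s+\d+:\d+\s+(\w+)\s+(\w+)\s+(\w+)")
# Map header lines carry the device-mapper name, e.g. "<wwid> dm-3 NETAPP,LUN C-Mode"
MAP_RE = re.compile(r"\b(dm-\d+)\b")


def summarize_multipath(text):
    """Parse 'multipath -ll' output into {dm_name: [(sd_device, is_healthy), ...]}."""
    maps = {}
    current = None
    for line in text.splitlines():
        path = PATH_RE.search(line)
        if path and current is not None:
            _, dev, dm_state, path_state, online = path.groups()
            healthy = (dm_state, path_state, online) == ("active", "ready", "running")
            maps[current].append((dev, healthy))
            continue
        header = MAP_RE.search(line)
        if header:
            current = header.group(1)
            maps.setdefault(current, [])
    return maps


def unhealthy_maps(maps, expected_paths=2):
    """Return dm names that lack the expected path count or have a non-healthy path."""
    return sorted(
        dm for dm, paths in maps.items()
        if len(paths) != expected_paths or not all(ok for _, ok in paths)
    )


def orphaned_devices(all_sd_devices, maps):
    """Return sd devices that belong to no multipath map (local disks show up here too)."""
    in_maps = {dev for paths in maps.values() for dev, _ in paths}
    return sorted(set(all_sd_devices) - in_maps)


# Illustrative sample only; WWIDs and sizes are made up.
SAMPLE = """\
3600a098038304437415d4b6a59684a52 dm-0 NETAPP,LUN C-Mode
size=10G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 3:0:0:0 sdb 8:16 active ready running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 4:0:0:0 sdc 8:32 active ready running
3600a098038304437415d4b6a59684a53 dm-1 NETAPP,LUN C-Mode
size=20G features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 alua' wp=rw
|-+- policy='service-time 0' prio=50 status=active
| `- 3:0:0:1 sdd 8:48 failed faulty running
`-+- policy='service-time 0' prio=10 status=enabled
  `- 4:0:0:1 sde 8:64 active ready running
"""

if __name__ == "__main__":
    maps = summarize_multipath(SAMPLE)
    print("unhealthy maps:", unhealthy_maps(maps))
    print("orphans:", orphaned_devices(["sda", "sdb", "sdc", "sdd", "sde", "sdf"], maps))
```

On a real node the text would come from running 'multipath -ll' and the device list from /sys/block. Running the same check before and after the upgrade would show exactly which maps lost paths.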

What we observed was that, after the upgrade, the trident-node pods were unable to start fully:

Image

The main reason was that the driver-registrar container could not start:

Image

It seems that, for an unknown reason, trident-main was unable to 'determine published paths for volumes':

Image

It then tried to remove those volumes from multipath and unmount their filesystems (active volumes of running pods):

Image

Image

In the logs it complains that it could not flush the multipath maps or unmount the devices. Technically it fails with 'map or partition in use', but the pods lost access to the block devices anyway.

Error from console of one of the nodes:

Image

One useful detail: on the trident-controller side there were no 'Unpublishing volume from the node' messages, so the volumes were not removed at the igroup / NetApp level. This looks more like a bug in the trident-node logic: it somehow decided that some volumes were not in use and had to be unmounted and removed from the OS.
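The controller-side check can be repeated on a saved log dump with a small helper. This is a minimal sketch: the function name and the sample log lines are hypothetical, and only the searched message text comes from the report above.

```python
def find_unpublish_events(log_text, marker="Unpublishing volume from the node"):
    """Return the log lines that contain the marker (case-insensitive match)."""
    needle = marker.lower()
    return [line for line in log_text.splitlines() if needle in line.lower()]


if __name__ == "__main__":
    # Hypothetical log excerpt, not taken from the affected cluster.
    sample_log = "\n".join([
        'level=info msg="Example controller message A"',
        'level=info msg="Example controller message B"',
    ])
    hits = find_unpublish_events(sample_log)
    print(f"{len(hits)} unpublish message(s) found")
```

An empty result for the upgrade window supports the conclusion that the controller never asked for the volumes to be unpublished.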

Environment

  • Trident version: upgrade from 24.10.1 to 25.02.1 (via operator)
  • Container runtime: docker://26.1.0
  • Kubernetes version: v1.30.6
  • Kubernetes orchestrator: rancher 2.10.3
  • OS: Flatcar Container Linux by Kinvolk 4081.2.0 (Oklo)
