Reboot loop with error 'WARNING: unrecognized segment type thin-pool' #833

Closed
ndbrew opened this issue Nov 17, 2023 · 3 comments · Fixed by #834

Comments

@ndbrew
Contributor

ndbrew commented Nov 17, 2023

Booting or rebooting a cluster node that has a disk containing thin-pool / thin logical volumes causes the boot process to fail, triggering a reboot loop.

Error Message

...
Failed to activate logical volumes, exit code 5
...
WARNING: unrecognized segment type thin-pool
WARNING: unrecognized segment type thin

(I'm running this in Proxmox, so I don't have a convenient way to copy/paste the full output.)

Environment Setup
Cluster Version: "v1.6.0-alpha.0-59-g0bd1bdd74"

  • Each node has 2 disks: one for boot, another for PV storage
  • drbd and dm-thin-pool kernel modules are enabled in the configuration:
machine:
  install:
    extensions:
      - image: "ghcr.io/siderolabs/drbd:9.2.5-v1.6.0-alpha.0-17-g0ba9f81"
  kernel:
    modules:
      - name: drbd
        parameters:
          - usermode_helper=disabled
      - name: drbd_transport_tcp
      - name: dm-thin-pool

As long as the system remains online, I can provision thin-pool volumes and everything works as expected. On reboot, the system fails to boot.

Possible Fix
From what I gather, this error occurs because the LVM2 build in Talos, which is used during boot to activate logical volumes on the local disks, is not compiled with thin-pool support.
https://listman.redhat.com/archives/linux-lvm/2013-July/022321.html

Setting https://github.com/siderolabs/pkgs/blob/main/lvm2/pkg.yaml#L26 to --with-thin=internal may fix the issue, but I'm not sure how this behaves if a user doesn't enable the dm-thin-pool module.
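
For anyone wanting to confirm how a given LVM2 binary was built, here is a minimal sketch. It assumes you can run `lvm` somewhere with a shell (for example a build container, not a Talos node) and that `lvm version` prints a `Configuration:` line with the configure arguments. `thin_mode` is a hypothetical helper, not part of Talos or LVM2.

```shell
#!/bin/sh
# Hypothetical helper (not part of Talos or LVM2): print the value of the
# --with-thin option found in a configure command line, or "unset" if the
# option does not appear. Quotes around individual arguments are stripped.
thin_mode() {
  mode=$(printf '%s\n' "$1" \
    | tr -d "'\"" \
    | tr ' ' '\n' \
    | sed -n 's/^--with-thin=//p' \
    | tail -n 1)
  printf '%s\n' "${mode:-unset}"
}
```

Where `lvm` is available, something like `thin_mode "$(lvm version | grep Configuration)"` should report the setting, and `lvm segtypes` lists the segment types that binary recognizes.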

@ndbrew
Contributor Author

ndbrew commented Nov 17, 2023

I'll create a PR for this change, but I don't have a local build chain set up to build this and all the downstream images that would need to be rebuilt to test it.

Also, any tips for debugging Talos boot/configuration issues without a shell?
It would be nice if there were a debug mode that let me poke around deeper in the system even when the cluster is in a bad state.

@ndbrew
Contributor Author

ndbrew commented Nov 18, 2023

I was able to build everything and get the installer working. Everything seems to be working with the above change.

@smira
Member

smira commented Nov 20, 2023

Thanks for digging into this!
