feat(ubuntu): install drbd-dkms from LINBIT PPA for Secure Boot hosts#39

Merged
lexfrei merged 2 commits into main from feat/ubuntu-secure-boot-drbd
May 6, 2026

Conversation

@lexfrei
Contributor

@lexfrei lexfrei commented May 6, 2026

Summary

Install drbd-dkms from the LINBIT PPA on Ubuntu LTS hosts (jammy / noble) as part of the prepare playbook, configure usermode_helper=disabled on the drbd module, and mask the host-side drbd.service. This unblocks non-Talos Cozystack installation on Ubuntu hosts with UEFI Secure Boot enabled. On those hosts the in-cluster DRBD module compile path used by piraeus-operator's loader fails with 'Key was rejected by service', because the freshly built .ko files are unsigned and kernel lockdown rejects them.

With drbd loaded on the host, the loader's existing host-detection logic (LINBIT/drbd docker/entry.sh:328) sees DRBD ≥ 9 with usermode_helper=disabled and exits cleanly without attempting to compile or insmod, so no operator-side change is required.

Changes

  • New variables in examples/ubuntu/prepare-ubuntu.yml:
    • cozystack_enable_drbd_dkms: true (opt-out toggle)
    • cozystack_drbd_ppa: ppa:linbit/linbit-drbd9-stack (override default in inventory for internal mirrors)
    • cozystack_drbd_supported_releases: [jammy, noble] (extend in inventory once LINBIT publishes for a new series)
  • New tasks (Ubuntu only, gated on the toggle and the supported-release list):
    • write /etc/modprobe.d/cozystack-drbd.conf with options drbd usermode_helper=disabled (BEFORE installing drbd-dkms so any package-side auto-modprobe respects the param)
    • add the LINBIT PPA
    • install drbd-dkms
    • mask host drbd.service
    • modprobe drbd (tolerated via ignore_errors: true on Secure Boot hosts pending MOK enrollment, mirroring the existing ZFS pattern)
    • persist via /etc/modules-load.d/cozystack-drbd.conf only when modprobe succeeded
    • symmetric cleanup of both drop-ins on opt-out / non-Ubuntu / unsupported-release
  • Operator-facing reminder task that fires when modprobe failed, pointing at the MOK enrollment step. Sibling warn tasks fire on Debian and on Ubuntu releases LINBIT does not publish for (Oracular 24.10, Plucky 25.04, future LTS pre-publication, etc.).
  • gnupg added to required packages — apt_repository PPA flow needs gpg since 24.04 dropped apt-key, and minimal cloud images may not ship it.
  • README and CLAUDE.md updated to reflect the new Secure Boot prerequisite, the new variables, and the drbd-dkms exception to the prior 'no DRBD host packages' rule.
  • 22 structural unit tests in tests/unit/playbooks/test_ubuntu_examples.py covering ordering, gates, content, opt-out, supported-release list, cleanup symmetry, and documentation drift.
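
The gating described in the list above can be summarized as a small decision function. This is only a sketch of the intended end state, not code from the playbook: `DrbdPlan` and `plan_drbd` are hypothetical names, and the rule that the modprobe.d drop-in stays while the modules-load.d drop-in is withheld after a failed modprobe is taken from the description in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrbdPlan:
    """Hypothetical summary of what the play should leave on the host."""
    install_dkms: bool        # add PPA, install drbd-dkms, mask drbd.service, modprobe
    keep_modprobe_conf: bool  # /etc/modprobe.d/cozystack-drbd.conf
    keep_modules_load: bool   # /etc/modules-load.d/cozystack-drbd.conf


def plan_drbd(enabled, distribution, release, supported, modprobe_failed):
    """Return the DRBD artifacts expected after the play, per the PR description."""
    # The three-clause gate: opt-out toggle, Ubuntu only, supported codename.
    active = enabled and distribution == "Ubuntu" and release in supported
    return DrbdPlan(
        install_dkms=active,
        # The module parameter must survive a failed modprobe so it applies
        # once the module loads after MOK enrollment.
        keep_modprobe_conf=active,
        # Boot-time loading is persisted only after a successful modprobe.
        keep_modules_load=active and not modprobe_failed,
    )
```

For a Noble host with Secure Boot and an unenrolled MOK key, `plan_drbd(True, "Ubuntu", "noble", ["jammy", "noble"], True)` keeps the modprobe.d drop-in but not the modules-load.d one.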

Bug fix carried in alongside

The pre-existing Load ZFS kernel module now and Enable multipathd service tasks used failed_when: false to tolerate failures while their downstream warn / persistence tasks gated on _var.failed. Ansible's failed_when: false rewrites the registered variable's .failed field to False, silently neutering those gates. Switched to ignore_errors: true (preserves the module's outcome on the registered variable) and applied the same shape to the new DRBD modprobe task. CLAUDE.md was updated to call out the distinction so this does not creep back in.
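
To make the difference concrete, here is a deliberately simplified Python model of the behavior described above. It is not Ansible's implementation; `registered_result` is a hypothetical helper that only mirrors the two facts stated in this PR: `failed_when` replaces the `.failed` value stored by `register:`, while `ignore_errors` leaves it untouched and merely lets the play continue.

```python
def registered_result(module_failed, failed_when=None, ignore_errors=False):
    """Simplified model: return (registered variable, whether the play continues)."""
    result = {"failed": module_failed}
    if failed_when is not None:
        # failed_when overrides the module's own verdict in the registered result.
        result["failed"] = failed_when
    # ignore_errors does not touch the result; it only keeps the play running.
    play_continues = (not result["failed"]) or ignore_errors
    return result, play_continues


# A modprobe that fails on a Secure Boot host before MOK enrollment:
old_style, _ = registered_result(module_failed=True, failed_when=False)
new_style, _ = registered_result(module_failed=True, ignore_errors=True)

# A downstream gate such as `when: _var.failed` only fires with the new style.
print(old_style["failed"], new_style["failed"])  # prints: False True
```

Both variants let the play continue, which is why the old form looked correct until a downstream task consulted `.failed`.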

Why on Ubuntu only

LINBIT does not publish a Debian PPA, and the RHEL/SUSE flow needs a different repo plus pre-signed kmods. Worth separate PRs; this PR keeps scope tight to what can be verified end-to-end on the LINBIT-published Ubuntu LTS series. Operators on Debian / unsupported Ubuntu releases get a clear warn task pointing at the manual path.

On Secure Boot

The MOK enrollment confirmation step (operator entering a password at the shim console on first boot after install) cannot be automated — that is by design of UEFI Secure Boot. The playbook detects the failure mode (modprobe returns Key was rejected by service) and emits a reminder; idempotent re-run after enrollment converges.

Test plan

  • ansible-lint passes (production profile, 0 failures, 0 warnings)
  • ansible-playbook examples/ubuntu/prepare-ubuntu.yml --syntax-check passes
  • Unit tests: 29 passing (pytest tests/unit/playbooks/test_ubuntu_examples.py)
  • Tested on a live cluster: Ubuntu 24.04 LTS with Secure Boot enabled, kernel 6.8.x, baremetal/VM with UEFI firmware. Workflow: run playbook → reboot → confirm MOK at shim → re-run playbook (expect changed=0) → verify cat /sys/module/drbd/parameters/usermode_helper prints disabled → deploy Cozystack → confirm linstor satellite Pods reach Ready without loader retry-loop.
  • Idempotency verified (second run: changed=0)
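
The sysfs verification step in the workflow above could also be scripted. The following is a minimal sketch, assuming only the sysfs path quoted in the test plan; `drbd_usermode_helper` and its `sysfs_root` parameter are hypothetical names introduced so the check can be pointed at a fake tree in tests.

```python
from pathlib import Path


def drbd_usermode_helper(sysfs_root="/sys"):
    """Return the loaded drbd module's usermode_helper value, or None if absent."""
    param = Path(sysfs_root) / "module" / "drbd" / "parameters" / "usermode_helper"
    try:
        # sysfs values end with a newline; strip it for comparison.
        return param.read_text().strip()
    except FileNotFoundError:
        # The module is not loaded (for example, MOK enrollment is still pending).
        return None
```

On a converged host the call is expected to return `"disabled"`; `None` means the module is not loaded at all.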

Related

Summary by CodeRabbit

  • Documentation
    • Expanded DRBD prerequisites and Ubuntu 26.04 guidance, kernel-header mappings, Secure Boot/MOK notes, opt-out/override variables, and example DKMS configuration/options.
  • Changelog
    • Added an Unreleased section summarizing DRBD-related install and gating notes and modprobe behavior changes.
  • Examples
    • Updated Ubuntu example playbook defaults, DRBD toggles, and module-loading/failure-handling behavior.
  • Tests
    • Added comprehensive tests validating Ubuntu DRBD/Secure Boot, modprobe handling, and release gating.

On hosts where UEFI Secure Boot is enabled, kernel lockdown rejects
the unsigned modules built by piraeus-operator's in-cluster compile
path with 'Key was rejected by service'. Pre-install drbd-dkms from
the LINBIT PPA so dkms+shim signs the module against a per-host MOK
key, and piraeus-operator's loader auto-detects host-loaded DRBD and
exits cleanly without attempting to insmod.

drbd-dkms hard-depends on drbd-utils (>= 9.28.0), so the userspace
binaries land on the host transitively. They are unused at runtime —
the satellite container ships its own copy. Mask host drbd.service
to prevent accidental `systemctl enable drbd` from racing the
satellite.

Tolerate the initial modprobe via ignore_errors: true (NOT
failed_when: false — the latter rewrites the registered variable's
.failed flag and silently breaks every downstream gate that consults
it). The dkms-generated MOK key is not enrolled until the operator
confirms it at the shim console on the next reboot. Gate persistence
in /etc/modules-load.d/ on modprobe success so
systemd-modules-load.service does not fail every boot before
enrollment. The same fix is applied to the pre-existing ZFS modprobe
and multipathd enable tasks, which carried the same bug.

Configure 'options drbd usermode_helper=disabled' via
/etc/modprobe.d/cozystack-drbd.conf BEFORE installing drbd-dkms —
piraeus-operator's loader explicitly die()s on a host-loaded module
without that param, so any package-side auto-modprobe must respect
it on first load.

Gate by Ubuntu release codename rather than version number. LINBIT's
PPA is keyed by release name (jammy, noble) and only publishes for
the LTS series they keep current. Version-based gates would let
interim releases (Oracular 24.10, Plucky 25.04) reach the PPA add
task and fail mid-playbook on a 404 Release file. The supported list
is materialized once via set_fact at the top of the play and exposed
as cozystack_drbd_supported_releases (default [jammy, noble]) so
operators can extend it from inventory once LINBIT publishes for a
new series. Debian + RHEL/SUSE are not automated; the playbook emits
a notice on Debian and on Ubuntu releases not in the supported list.
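
A short sketch of why the codename gate is safer than a version gate. The function names and the `22.04` threshold are illustrative assumptions, not code from the playbook; the release names and versions are the ones quoted above.

```python
SUPPORTED_RELEASES = ["jammy", "noble"]  # default of cozystack_drbd_supported_releases


def version_gate(version, minimum="22.04"):
    """Hypothetical version-based gate: admits anything at or above `minimum`."""
    def as_tuple(text):
        return tuple(int(part) for part in text.split("."))
    return as_tuple(version) >= as_tuple(minimum)


def codename_gate(release, supported=SUPPORTED_RELEASES):
    """Gate on the release codename, which is how the PPA is keyed."""
    return release in supported


# Oracular (24.10) passes a version check but is not published in the PPA.
print(version_gate("24.10"), codename_gate("oracular"))  # prints: True False
```

Extending the list from inventory, for example `codename_gate("resolute", SUPPORTED_RELEASES + ["resolute"])`, admits a new series without changing the gate itself.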

Add cozystack_enable_drbd_dkms toggle (default true) for Talos hosts
or operators who deliberately use the in-cluster compile path, and
cozystack_drbd_ppa for internal LINBIT mirrors. Both honor inventory
overrides — defaults live in task-level filters / set_fact, not
play-level vars (where they would outrank inventory).
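
The precedence argument can be illustrated with a toy model. This is not how Ansible resolves variables internally; the two helpers are hypothetical and capture only the point made above: a `| default(...)` filter applies when the variable is undefined, whereas a play-level `vars:` entry takes precedence over an inventory value. The mirror name is made up for the example.

```python
DEFAULT_PPA = "ppa:linbit/linbit-drbd9-stack"


def resolve_with_task_default(inventory, name, default):
    """Model of `{{ name | default(...) }}`: inventory wins, default fills the gap."""
    return inventory.get(name, default)


def resolve_with_play_vars(inventory, play_vars, name):
    """Model of a play-level `vars:` entry, which outranks inventory variables."""
    return play_vars.get(name, inventory.get(name))


# Hypothetical inventory override pointing at an internal mirror.
inventory = {"cozystack_drbd_ppa": "ppa:example/internal-mirror"}

print(resolve_with_task_default(inventory, "cozystack_drbd_ppa", DEFAULT_PPA))
# prints: ppa:example/internal-mirror
print(resolve_with_play_vars(inventory, {"cozystack_drbd_ppa": DEFAULT_PPA}, "cozystack_drbd_ppa"))
# prints: ppa:linbit/linbit-drbd9-stack
```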

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>
@coderabbitai

coderabbitai Bot commented May 6, 2026

📝 Walkthrough

Walkthrough

Adds Ubuntu DRBD/DKMS support and documentation: new playbook tasks and variables to optionally install drbd-dkms from LINBIT PPA, tolerant modprobe/ignore_errors patterns, gating by Ubuntu release list, kernel-headers guidance, Secure Boot/MOK notes, cleanup logic, expanded docs, and extensive unit tests for these behaviors.

Changes

DRBD DKMS Ubuntu Support

| Layer | File(s) | Summary |
| --- | --- | --- |
| Defaults & Docs | README.md, CHANGELOG.rst, CLAUDE.md | Introduce new documentation and changelog entries describing DRBD DKMS behavior, kernel-headers requirements per-OS, Ubuntu 26.04 notes, new variables cozystack_enable_drbd_dkms, cozystack_drbd_ppa, cozystack_drbd_supported_releases, and failed_when vs ignore_errors guidance. |
| Configuration & Prereqs | examples/ubuntu/prepare-ubuntu.yml | Add gnupg to prerequisites; set default cozystack_drbd_supported_releases (jammy, noble); add explanatory comments about in-cluster DKMS and Secure Boot considerations. |
| Feature Tasks (implementation) | examples/ubuntu/prepare-ubuntu.yml | Add tasks to add LINBIT PPA, install drbd-dkms (gated by distro & supported releases), write modprobe/drbd config drop-ins, mask host drbd.service, persist modules-load, and warnings for Debian/manual setups. |
| Error-tolerance / Behavioral changes | examples/ubuntu/prepare-ubuntu.yml | Change modprobe/load tasks for ZFS, multipathd, and DRBD from failed_when: false to ignore_errors: true, propagate registered results into downstream gates, and add debug/warning tasks for modprobe failures. |
| Cleanup & Conditional Removal | examples/ubuntu/prepare-ubuntu.yml | Add removal/cleanup tasks for modules-load and modprobe.d drop-ins when DRBD DKMS is not active, opt-out, unsupported release, or modprobe failed. |
| Tests | tests/unit/playbooks/test_ubuntu_examples.py | Add ~20+ unit tests validating modprobe tolerance, ignore_errors usage, release gating, variable defaults and overrides, cleanup behavior, documentation/CLAUDE alignment, and Debian-specific warnings. |

Sequence Diagram

sequenceDiagram
    actor User
    participant Playbook as Ansible<br/>(prepare-ubuntu.yml)
    participant Repo as APT<br/>Repository
    participant PKG as APT<br/>Package Manager
    participant DKMS as DKMS<br/>Builder
    participant Kernel as Kernel<br/>(modprobe)
    User->>Playbook: run prepare-ubuntu.yml
    Playbook->>Playbook: check cozystack_enable_drbd_dkms and release list
    alt DRBD DKMS enabled & supported
        Playbook->>Repo: add LINBIT PPA key (uses gnupg)
        Repo-->>Playbook: repo added
        Playbook->>PKG: install drbd-dkms
        PKG->>DKMS: trigger DKMS build on host
        DKMS->>DKMS: compile module against running kernel
        DKMS-->>PKG: module built/signed
        PKG-->>Playbook: installation result
        Playbook->>Kernel: modprobe drbd (ignore_errors: true)
        Kernel-->>Playbook: success or failure (tolerated)
        alt modprobe failed
            Playbook->>Playbook: set fact / warn, keep drop-in for diagnosis
        else modprobe succeeded
            Playbook->>Playbook: persist modules-load and cleanups as needed
        end
    else DRBD DKMS disabled or unsupported
        Playbook->>Playbook: skip install, remove drop-ins if inactive
    end
    Playbook-->>User: playbook completes

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 I found a PPA beneath the hill,
Modules build and sign with skill—
Ignore the crash, enroll the key,
Jammy and Noble nod with glee.
A tidy cleanup, tests in tow—hop, go, go!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: installing drbd-dkms from LINBIT PPA for Secure Boot hosts on Ubuntu. |
| Description check | ✅ Passed | The description comprehensively covers the summary, detailed changes, test plan, and related context, exceeding the basic template requirements. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for Ubuntu Secure Boot by pre-installing drbd-dkms from the LINBIT PPA, ensuring kernel modules can be signed via MOK. It also includes a critical fix for the tolerated-modprobe pattern, switching from failed_when: false to ignore_errors: true to prevent silencing downstream failure gates for ZFS and multipathd. The changes are accompanied by extensive documentation updates and new unit tests. Review feedback recommends removing hardcoded future years (e.g., '2026') from documentation to prevent it from becoming stale and suggests grouping related Ansible tasks into blocks to reduce conditional duplication.

Comment thread CHANGELOG.rst
default is in the task's ``| default(...)`` filter, not in
play-level ``vars:`` where it would outrank inventory).
- Automated only on Ubuntu releases LINBIT keeps current — Jammy
(22.04) and Noble (24.04) as of 2026. Interim non-LTS releases

medium

Hardcoding a future year like '2026' can make the documentation quickly become outdated. Consider rephrasing to be more evergreen, for example: '...as of this writing.' or simply stating the releases without a time-based qualifier.

Contributor Author

The "as of 2026" anchor is intentional. The supported-series list is time-dependent — LINBIT adds new Ubuntu LTS codenames as they become current and drops older ones. The temporal anchor signals to a reader who lands on this CHANGELOG in 2028 that the list reflects state at writing and they should check the LINBIT PPA detail page for the current set. Removing the year would actually make the line less informative for future readers, since they would have no way to tell whether the list is current or stale. Same rationale applies to the related lines in README.md and the playbook comment.

Comment thread README.md
- `cozystack_enable_drbd_dkms` (bool, default `true`): set `false` on Talos hosts (Talos ships signed DRBD modules in extensions) or where Secure Boot is disabled and you prefer the in-cluster compile path.
- `cozystack_drbd_ppa` (string, default `ppa:linbit/linbit-drbd9-stack`): point at a local mirror of the LINBIT archive.

The PPA-based path is automated only on Ubuntu releases LINBIT keeps current — Jammy (22.04 LTS) and Noble (24.04 LTS) as of 2026. Interim non-LTS releases (Oracular 24.10, Plucky 25.04, etc.) and the next LTS (Resolute 26.04) before LINBIT publishes for them are not in the LINBIT PPA, so the playbook skips the install and emits a notice on those hosts. The supported list is exposed as `cozystack_drbd_supported_releases` (default `[jammy, noble]`); operators can extend it from inventory once LINBIT publishes for a new series, without waiting for a collection release. Operators on unsupported releases must build and sign `drbd-dkms` manually, downgrade to a supported LTS, or disable Secure Boot. Debian is not automated either (no LINBIT Debian PPA). RHEL/SUSE need a separate flow (LINBIT-managed RPM repo + pre-signed kmods) and are out of scope for this collection.

medium

Similar to the changelog, the hardcoded year '2026' here may become stale. Phrasing this without a specific year would improve long-term maintainability of the documentation.

Contributor Author

Same as the CHANGELOG comment: the "as of 2026" anchor is intentional and load-bearing. It tells a future reader that the LINBIT PPA series list is time-dependent and the snapshot recorded here is from a specific point. The variables-table entry below this paragraph (cozystack_drbd_supported_releases) and the inline troubleshooting note also explicitly invite operators to override the list once LINBIT publishes for a new series — the temporal phrasing complements that.

Comment thread README.md
- **Ubuntu 26.04 LTS:** three changes to be aware of.
1. *Auto-applied by `examples/ubuntu/site.yml`*: `sudo-rs` ships as the default `/usr/bin/sudo` alternative and does not honour ansible's `become_method: sudo` privilege-escalation pseudo-tty — every `become: true` task hangs with `Timeout (12s) waiting for privilege escalation prompt`. The classical sudo binary is co-installed at `/usr/bin/sudo.ws`. `site.yml` imports `prepare-sudo.yml` first, which switches the `sudo` alternative back via `update-alternatives` using a `raw` command (so it works even when become is broken). The play is a no-op on releases without sudo-rs. If you bypass `site.yml` and call the prepare playbooks directly, run `prepare-sudo.yml` before any task with `become: true` on 26.04 hosts.
2. *Manual inventory setting on 26.04 hosts*: the playbook auto-skips `linux-modules-extra-*` on Ubuntu 26.04+ because the package no longer exists for kernel 7.x — `openvswitch` and `vport-geneve` are bundled into `linux-image-generic`. The auto-skip relies on `ansible_distribution_version`; on hosts where that fact is unreliable, set `cozystack_ubuntu_extra_packages: []` in inventory to skip the apt install explicitly.
3. *DRBD via `drbd-dkms` is not automated on releases LINBIT does not publish for*: LINBIT's PPA only ships drbd-dkms for the LTS series they keep current (Jammy 22.04 + Noble 24.04 as of 2026). Interim releases (Oracular 24.10, Plucky 25.04) and the next LTS (Resolute 26.04) before LINBIT publishes are skipped with a notice; on Secure Boot hosts the in-cluster compile path will fail with `Key was rejected by service`. Build and sign `drbd-dkms` manually, downgrade to a supported LTS, extend `cozystack_drbd_supported_releases` from inventory once LINBIT publishes for your release, or disable Secure Boot.

medium

This is another instance of the hardcoded year '2026'. Removing it would prevent the documentation from becoming outdated in the future.

Contributor Author

Same rationale as the other two "as of 2026" mentions: the LINBIT PPA's published-series set changes over time, so anchoring the claim to a calendar year helps a future reader identify the snapshot rather than assume the list is current. Removing the year would obscure that signal.

NOTE: LINBIT's PPA does not publish drbd-dkms for Ubuntu
{{ ansible_distribution_release }} ({{ ansible_distribution_version }});
only the LTS series LINBIT keeps current are supported
(jammy / noble as of 2026; extend

medium

This comment includes the year '2026', which could become outdated. To improve maintainability, consider removing the year or using a more general phrase like 'at the time of writing'.

Contributor Author

Comment in playbook source — same rationale as the README and CHANGELOG references. The "as of 2026" anchor is meaningful because the LINBIT PPA's supported-series list is time-dependent. Removing it would make the comment look perpetually current even after the underlying state has drifted.

Comment on lines +436 to +500
    - name: Configure DRBD module parameters
      ansible.builtin.copy:
        dest: /etc/modprobe.d/cozystack-drbd.conf
        mode: "0644"
        content: |
          options drbd usermode_helper=disabled
      when:
        - cozystack_enable_drbd_dkms | default(true) | bool
        - ansible_distribution == 'Ubuntu'
        - ansible_distribution_release in cozystack_drbd_supported_releases

    - name: Add LINBIT PPA for drbd-dkms
      ansible.builtin.apt_repository:
        repo: "{{ cozystack_drbd_ppa | default('ppa:linbit/linbit-drbd9-stack') }}"
        state: present
        update_cache: true
      when:
        - cozystack_enable_drbd_dkms | default(true) | bool
        - ansible_distribution == 'Ubuntu'
        - ansible_distribution_release in cozystack_drbd_supported_releases

    # drbd-dkms has a hard apt dependency on drbd-utils (>= 9.28.0)
    # and pulls it onto the host transitively. The host copy of
    # drbdadm/drbdmeta/drbdsetup is never invoked at runtime;
    # piraeus-operator's satellite container ships its own
    # drbd-utils and uses that one. The host's drbd.service ships
    # disabled by maintainer; we mask it below to prevent accidental
    # `systemctl enable drbd` from conflicting with the satellite.
    - name: Install drbd-dkms
      ansible.builtin.apt:
        name: drbd-dkms
        state: present
      when:
        - cozystack_enable_drbd_dkms | default(true) | bool
        - ansible_distribution == 'Ubuntu'
        - ansible_distribution_release in cozystack_drbd_supported_releases

    # Mask drbd.service so the host-side userspace cannot be enabled
    # by accident. piraeus-operator manages all DRBD resources via its
    # own satellite container; host-side drbd.service would race with
    # it.
    - name: Mask host drbd.service
      ansible.builtin.systemd:
        name: drbd.service
        masked: true
      when:
        - cozystack_enable_drbd_dkms | default(true) | bool
        - ansible_distribution == 'Ubuntu'
        - ansible_distribution_release in cozystack_drbd_supported_releases

    # ignore_errors (not failed_when: false) so the registered var
    # preserves the module's `failed` flag; `failed_when: false`
    # rewrites it to False and silently breaks every downstream gate
    # in this play. Tolerated outcome is the same; introspectable
    # state is not.
    - name: Load DRBD kernel module now
      community.general.modprobe:
        name: drbd
        params: usermode_helper=disabled
      register: _cozystack_drbd_modprobe
      when:
        - cozystack_enable_drbd_dkms | default(true) | bool
        - ansible_distribution == 'Ubuntu'
        - ansible_distribution_release in cozystack_drbd_supported_releases
      ignore_errors: true  # tolerated: Secure Boot before MOK enrollment

medium

To improve readability and reduce duplication, you can group these tasks that share the same when condition under a single block.

    - name: Manage DRBD via dkms from LINBIT PPA
      block:
        - name: Configure DRBD module parameters
          ansible.builtin.copy:
            dest: /etc/modprobe.d/cozystack-drbd.conf
            mode: "0644"
            content: |
              options drbd usermode_helper=disabled

        - name: Add LINBIT PPA for drbd-dkms
          ansible.builtin.apt_repository:
            repo: "{{ cozystack_drbd_ppa | default('ppa:linbit/linbit-drbd9-stack') }}"
            state: present
            update_cache: true

        - name: Install drbd-dkms
          ansible.builtin.apt:
            name: drbd-dkms
            state: present

        - name: Mask host drbd.service
          ansible.builtin.systemd:
            name: drbd.service
            masked: true

        - name: Load DRBD kernel module now
          community.general.modprobe:
            name: drbd
            params: usermode_helper=disabled
          register: _cozystack_drbd_modprobe
          ignore_errors: true  # tolerated: Secure Boot before MOK enrollment
      when:
        - cozystack_enable_drbd_dkms | default(true) | bool
        - ansible_distribution == 'Ubuntu'
        - ansible_distribution_release in cozystack_drbd_supported_releases

Contributor Author

Existing project convention is per-task when: (see the ZFS, multipathd, and KubeVirt blocks earlier in this same playbook — none of them use a block: wrapper). Switching the DRBD block to block: + when: would diverge from that pattern for cosmetic reasons. The structural unit tests in tests/unit/playbooks/test_ubuntu_examples.py (e.g. test_drbd_tasks_are_ubuntu_only_and_opt_outable) also assert each task individually carries the gate, so a refactor would need a corresponding test rewrite. Keeping the explicit per-task when: for consistency.
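
For readers unfamiliar with this style of structural test, the per-task assertion might look roughly like the sketch below. It is not the actual content of test_ubuntu_examples.py, which is not shown in this thread; `tasks_missing_gate` is a hypothetical helper, and the tasks are assumed to be already parsed into dictionaries (for example by a YAML loader).

```python
REQUIRED_GATE = [
    "cozystack_enable_drbd_dkms | default(true) | bool",
    "ansible_distribution == 'Ubuntu'",
    "ansible_distribution_release in cozystack_drbd_supported_releases",
]


def tasks_missing_gate(tasks, names):
    """Return the names of listed tasks whose `when:` lacks a required clause."""
    by_name = {task["name"]: task for task in tasks}
    missing = []
    for name in names:
        when = by_name[name].get("when", [])
        if isinstance(when, str):
            when = [when]  # a single condition may be written as a plain string
        if not all(clause in when for clause in REQUIRED_GATE):
            missing.append(name)
    return missing


# Illustrative input: the second task is deliberately missing two clauses.
tasks = [
    {"name": "Install drbd-dkms", "when": list(REQUIRED_GATE)},
    {"name": "Mask host drbd.service", "when": ["ansible_distribution == 'Ubuntu'"]},
]
print(tasks_missing_gate(tasks, ["Install drbd-dkms", "Mask host drbd.service"]))
# prints: ['Mask host drbd.service']
```

A check of this shape inspects each task's own `when:`, so moving the gate onto a `block:` would require rewriting it to walk the enclosing block as well.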


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@examples/ubuntu/prepare-ubuntu.yml`:
- Around line 477-484: The task "Mask host drbd.service" currently only masks
the unit; modify the ansible.builtin.systemd call for drbd.service to also stop
the running unit by adding state: stopped (while keeping masked: true) so the
host-side userspace is shut down immediately; keep the same task name and when
conditions and ensure the module targets "drbd.service" with both masked: true
and state: stopped.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6f6a7edc-f5a8-49f9-be06-7fc5b17b4598

📥 Commits

Reviewing files that changed from the base of the PR and between b799d9e and 428b512.

📒 Files selected for processing (5)
  • CHANGELOG.rst
  • CLAUDE.md
  • README.md
  • examples/ubuntu/prepare-ubuntu.yml
  • tests/unit/playbooks/test_ubuntu_examples.py

Comment thread examples/ubuntu/prepare-ubuntu.yml
Address review feedback from coderabbitai on
examples/ubuntu/prepare-ubuntu.yml:484: `masked: true` alone
prevents future starts but leaves an already-running drbd.service
alive on hosts that previously had drbd-utils installed and the
unit enabled. Add `state: stopped` and `enabled: false` so the
host-side userspace is shut down immediately, eliminating the race
window with piraeus-operator's satellite container.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Aleksei Sviridkin <f@lex.la>

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
examples/ubuntu/prepare-ubuntu.yml (1)

436-563: 💤 Low value

Consider wrapping the DRBD task group in a block: to DRY up gating.

The eight DRBD-related tasks (Configure module params, Add PPA, Install drbd-dkms, Mask service, Load module now, Persist at boot, Warn on failure, Remove modules-load drop-in) repeat the same three-clause when: (cozystack_enable_drbd_dkms + ansible_distribution == 'Ubuntu' + ansible_distribution_release in cozystack_drbd_supported_releases) up to five times. A block scope with the gating expressed once would make the play easier to audit and harder to drift on (e.g., adding a new release-gate clause currently means touching every task).

Note that the two cleanup tasks at lines 538-563 should remain outside the block — their when: is the negation of the gate.

♻️ Sketch (illustrative, not exhaustive)
-    - name: Configure DRBD module parameters
-      ansible.builtin.copy:
-        ...
-      when:
-        - cozystack_enable_drbd_dkms | default(true) | bool
-        - ansible_distribution == 'Ubuntu'
-        - ansible_distribution_release in cozystack_drbd_supported_releases
-
-    - name: Add LINBIT PPA for drbd-dkms
-      ...
-    # ... and so on for Install / Mask / Modprobe / Persist / Warn
+    - name: Configure DRBD via LINBIT PPA on supported Ubuntu releases
+      when:
+        - cozystack_enable_drbd_dkms | default(true) | bool
+        - ansible_distribution == 'Ubuntu'
+        - ansible_distribution_release in cozystack_drbd_supported_releases
+      block:
+        - name: Configure DRBD module parameters
+          ansible.builtin.copy:
+            ...
+        - name: Add LINBIT PPA for drbd-dkms
+          ...
+        # Install drbd-dkms / Mask service / Modprobe / Persist at boot / Warn
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@examples/ubuntu/prepare-ubuntu.yml` around lines 436 - 563, Group the seven
DRBD setup tasks ("Configure DRBD module parameters", "Add LINBIT PPA for
drbd-dkms", "Install drbd-dkms", "Mask host drbd.service", "Load DRBD kernel
module now", "Load DRBD kernel module at boot", "Warn if DRBD module failed to
load") into a single block: put them under a block: header and move the common
when: clause (the three-clause gate using cozystack_enable_drbd_dkms,
ansible_distribution == 'Ubuntu', and ansible_distribution_release in
cozystack_drbd_supported_releases) onto the block so each task no longer repeats
it; keep the community.general.modprobe task's register:
_cozystack_drbd_modprobe and ignore_errors: true inside the block so later tasks
can reference _cozystack_drbd_modprobe; leave the two cleanup tasks ("Remove
DRBD modules-load drop-in when not active" and "Remove DRBD modprobe.d drop-in
when not active") outside the block unchanged because their when: conditions are
the negation/other cases.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between 428b512 and edbd8ad.

📒 Files selected for processing (1)
  • examples/ubuntu/prepare-ubuntu.yml

Member

@kvaps kvaps left a comment

LGTM. Reviewed the playbook + the unit-test additions.

The mechanics are right end-to-end:

  • /etc/modprobe.d/cozystack-drbd.conf written before drbd-dkms install — package-side auto-modprobe (postinst, future dkms hooks) respects usermode_helper=disabled. This ordering is load-bearing because piraeus-operator's loader die()s on a host-loaded module without the param (LINBIT/drbd entry.sh LB_FAIL_IF_USERMODE_HELPER_NOT_DISABLED=yes).
  • Modules-load.d drop-in is only persisted when modprobe succeeded — prevents systemd-modules-load.service from failing on every boot pre-MOK-enrollment, while leaving the modprobe.d drop-in intact (param needed once the module loads after reboot+enrollment).
  • drbd.service masked — host-side userspace would otherwise race piraeus-operator's satellite container.
  • Symmetric cleanup on opt-out / non-Ubuntu / unsupported-release.
  • Honest warn-task fallbacks for Debian, unsupported Ubuntu releases, and pre-MOK-enrollment Secure Boot hosts.
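
The persist-only-on-success behaviour in the second bullet can be sketched roughly as below. It assumes both tasks run under the same gate as the modprobe task, so `_cozystack_drbd_modprobe` is always defined and never a skipped result. The warning text and module arguments are illustrative, not copied from the playbook:

```yaml
# Sketch only: assumes _cozystack_drbd_modprobe was registered by the
# "Load DRBD kernel module now" task with ignore_errors: true.
- name: Load DRBD kernel module at boot
  ansible.builtin.copy:
    dest: /etc/modules-load.d/cozystack-drbd.conf
    content: "drbd\n"
    mode: "0644"
  # Skipped before MOK enrollment, so systemd-modules-load.service
  # does not fail on every boot.
  when: _cozystack_drbd_modprobe is succeeded

- name: Warn if DRBD module failed to load
  ansible.builtin.debug:
    msg: >-
      modprobe drbd failed. On a Secure Boot host this is expected until
      the MOK key is enrolled; reboot, enroll the key, and re-run.
  when: _cozystack_drbd_modprobe is failed
```

The modprobe.d drop-in is deliberately not gated on the modprobe result: the `usermode_helper=disabled` option has to be in place for the first successful load after enrollment.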

The carried-along failed_when: false → ignore_errors: true fix on ZFS / multipathd is the right call: failed_when: false rewrites _var.failed to False and silently neuters every downstream _var.failed gate, which was masking real failures. The CLAUDE.md update calling that out is good defensive doc.
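
The difference between the two keywords can be sketched as follows. The variable name `_zfs_modprobe` and the task bodies are hypothetical stand-ins for the ZFS tasks, not the playbook's actual code:

```yaml
# Anti-pattern: failed_when: false forces the registered result to
# failed=false, so the gate on the warning task below never fires.
- name: Load ZFS kernel module (failure masked)
  community.general.modprobe:
    name: zfs
    state: present
  register: _zfs_modprobe  # hypothetical variable name
  failed_when: false

# Preferred: ignore_errors: true lets the play continue but keeps
# failed=true in the registered result, so downstream gates still work.
- name: Load ZFS kernel module (failure tolerated)
  community.general.modprobe:
    name: zfs
    state: present
  register: _zfs_modprobe
  ignore_errors: true

- name: Warn if ZFS module failed to load
  ansible.builtin.debug:
    msg: "modprobe zfs failed; check Secure Boot / MOK enrollment."
  when: _zfs_modprobe is failed
```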

Variable shape is reasonable: cozystack_drbd_supported_releases extensible from inventory means operators can add a new LTS release the moment LINBIT publishes for it without waiting on a collection release. cozystack_drbd_ppa overridable for local archive mirrors. Both right knobs.
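
As an illustration, an inventory override might look like the sketch below. The file path, the `nextlts` codename, and the mirror URL are placeholders, and it is an assumption that the playbook passes `cozystack_drbd_ppa` straight to `apt_repository`, which would let it take a full `deb` line as well as a `ppa:` shortcut:

```yaml
# inventory/group_vars/all.yml (hypothetical path)
cozystack_enable_drbd_dkms: true

# Extend the defaults once LINBIT publishes packages for a new series.
cozystack_drbd_supported_releases:
  - jammy
  - noble
  - nextlts  # placeholder codename, not a real release

# Default is ppa:linbit/linbit-drbd9-stack. The mirror URL is made up,
# and the mirror's signing key would have to be installed separately.
cozystack_drbd_ppa: "deb https://mirror.example.internal/linbit-drbd9-stack/ubuntu {{ ansible_distribution_release }} main"
```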

Two non-blocking observations:

  1. The "Tested on a live cluster" checkbox in the test plan is unchecked — that step (Ubuntu 24.04 + Secure Boot + MOK enrollment + Cozystack deploy) cannot really land in CI by design, but worth confirming on dev hardware before tagging a release that bumps this to default.
  2. gnupg added to required packages — correct call for 24.04+, just worth double-checking the apt_repository PPA add still works on minimal images that lack gpg-agent (the package is split on some derivatives). Probably fine on stock Canonical cloud images.

Ship it.

lexfrei added a commit to cozystack/website that referenced this pull request May 6, 2026
## Summary

Add a new prerequisite page covering Ubuntu LTS hosts with UEFI Secure
Boot enabled, link it from the generic Kubernetes installation guide and
from the getting-started requirements, and clean up two adjacent docs
that contradicted the new flow. On hosts where Secure Boot is enabled,
kernel lockdown rejects the unsigned modules built by piraeus-operator's
in-cluster compile path with `Key was rejected by service`. The fix is
to install `drbd-dkms` from the LINBIT PPA on the host before deploying
Cozystack so dkms+shim signs the module against a per-host MOK key. Once
DRBD is host-loaded with `usermode_helper=disabled`, piraeus-operator's
loader auto-detects it and skips compilation.

The page documents the MOK enrollment step honestly (one console
interaction per node, by design of UEFI Secure Boot) instead of
promising automation that cannot exist at this layer.

## Changes

- `content/en/docs/{v1.3,next}/install/kubernetes/ubuntu-secure-boot.md`
— new prerequisite page covering: why kernel lockdown rejects in-cluster
compiled modules, the recommended fix (PPA + drbd-dkms + modprobe.d
drop-in + drbd.service mask + MOK enrollment reboot), the manual
procedure step-by-step, what happens at deploy time (showing the
loader's host-detection short-circuit), alternatives (Talos, disabling
Secure Boot, manual signing), and troubleshooting.
- `content/en/docs/{v1.3,next}/install/kubernetes/generic.md` — warning
alert in Prerequisites linking to the new page.
- `content/en/docs/{v1.3,next}/getting-started/requirements.md` — reword
the `Secure Boot must be disabled` blanket statement into a more
accurate description that distinguishes Talos (works with SB enabled)
from non-Talos Ubuntu (needs the drbd-dkms workaround above) and links
to the new page.
- `content/en/docs/{v1.3,next}/install/providers/hetzner.md` — clarify
that the rescue-mode `installimage` flow used in this guide requires
Secure Boot disabled, while Talos itself supports SB via its UKI /
SecureBoot installation flow (just not via this guide). Resolves the
contradiction the new requirements.md wording would otherwise introduce.

The v1.3 copy of `ubuntu-secure-boot.md` ships only the manual path. The
forward-looking automation in `cozystack/ansible-cozystack#39` is
described in the `next` copy because the collection PR has not yet
landed in a tagged release.

Upstream references (LINBIT/drbd `entry.sh` line 332, piraeus-operator's
satellite daemonset env wiring) are pinned to specific commit hashes so
they do not silently drift.

## Test plan

- [x] Hugo build (content phase) parses without ref-shortcode errors.
- [x] `hugo list all` registers the new page in v1.3.
- [x] Manual proof-read of both copies for cross-references and version
paths.
- [ ] Local `hugo serve` smoke check (requires Node sandbox bypass for
postcss; deferred to CI preview).

## Related

- cozystack/cozystack#2568 — original report.
-
[cozystack/ansible-cozystack#39](cozystack/ansible-cozystack#39)
— companion automation PR.
@lexfrei lexfrei merged commit b647eaf into main May 6, 2026
7 checks passed