Skip to content

Flaky: test_headless_fallback_to_keyutils intermittently picks File backend in CI #20

@prodnull

Description

@prodnull

Observed

prmana-agent/tests/headless_storage.rs::test_headless_fallback_to_keyutils fails intermittently on GitHub Actions ubuntu-latest runners in the integration-docker.yml → Keyutils storage tests job.

Assertion:

expected keyutils fallback when D-Bus is absent, got File

That fires from prmana-agent/tests/headless_storage.rs:38. Same commit, back-to-back runs: one fail, next pass.

First reproduction

Hypothesis

StorageRouter::detect() does a write-probe against keyutils. The keyutils user keyring (@u) is available on the runner (keyctl show @u passes in the preceding step), but the probe write fails intermittently. Plausible causes:

  • Runner user-scoped keys quota (/proc/sys/kernel/keys/maxkeys, maxbytes) — test artefacts from prior runs still linked.
  • Concurrent kernel-level resource contention across runner reuse.
  • Race between the D-Bus env-var unset and the Secret Service probe fast-fail path.

Mitigation (separate PR)

CI step is retry-wrapped (3 attempts) so one-shot flakes do not fail green builds. Long-term fix requires root-causing the probe write intermittent failure — possibly by adding diagnostic output from StorageRouter::detect() on probe failure and capturing /proc/keys / /proc/sys/kernel/keys/* state at failure time.

Acceptance criteria

  • Root cause identified for the intermittent probe failure
  • Fix verified over 50+ consecutive CI runs without regression
  • Retry wrapper removed from .github/workflows/integration-docker.yml

References

  • prmana-agent/tests/headless_storage.rs
  • prmana-agent/src/storage/router.rs
  • .github/workflows/integration-docker.yml

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions