fix(serviceoffer-controller): align remote-signer config with v0.3.0 image #465
Merged
OisinKyne approved these changes on May 11, 2026
…image
Verified locally against ghcr.io/obolnetwork/remote-signer:v0.3.0:
- Main's KEYSTORE_PASSWORD env name is unrecognised; the binary exits
with Error: NoPassword on startup.
- Main's keystore dir /keystores conflicts with the image's default
/data/keystores (declared as a volume in the image config).
- Main's /health readiness probe returns HTTP 404; the binary only
serves /healthz, which returns {"status":"ok"}.
Together these mean any Agent CR with wallet.create=true on main has a
remote-signer that crash-loops or fails liveness, blocking the agent
from ever reaching Ready.
This is what the integration branch behind #452 was carrying. Pulling
it forward:
- Move keystore dir to /data/keystores (the image default), and pin
the on-disk filename to keystore.json so the Secret volume
projection no longer needs to thread the V3 UUID through; the V3
document carries the address internally so the cosmetic filename
doesn't matter.
- Add ensureCanonicalKeystoreKey migration helper: on reconcile of an
existing Secret with the wallet annotation, if data is keyed under
the old UUID-named JSON field, rewrite it as keystore.json
in-place. Refuses ambiguous Secrets with multiple legacy JSON keys.
- Switch env scheme to upstream's SIGNER__SECTION__KEY hierarchy
(SIGNER__SERVER__HOST, SIGNER__SERVER__PORT, SIGNER__KEYSTORE__DIR,
SIGNER__KEYSTORE__PASSWORD, SIGNER__LOGGING__FORMAT/LEVEL). Matches
the master agent's working config in hermes-obol-agent.
- Switch readiness and liveness probes from /health to /healthz.
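The migration rule in the list above can be sketched in plain Go. This is only an illustration under stated assumptions: `canonicalizeKeystoreKey` and `errAmbiguousKeystore` are hypothetical names, the Secret is modelled as a bare `map[string][]byte`, and legacy keys are assumed to be recognisable by a `.json` suffix. The real `ensureCanonicalKeystoreKey` in the controller may differ in signature and in how it detects legacy keys.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
	"strings"
)

// canonicalKeystoreKey is the Secret data key this PR standardises on.
const canonicalKeystoreKey = "keystore.json"

// errAmbiguousKeystore is a hypothetical sentinel for Secrets that hold more
// than one legacy JSON key and therefore need operator intervention.
var errAmbiguousKeystore = errors.New("ambiguous keystore secret: multiple legacy JSON keys")

// canonicalizeKeystoreKey is a hypothetical stand-in for the PR's
// ensureCanonicalKeystoreKey. It rewrites a single legacy "<uuid>.json" entry
// to "keystore.json" in place and reports whether the map was changed.
// Assumption: any key ending in ".json" other than the canonical one is legacy.
func canonicalizeKeystoreKey(data map[string][]byte) (bool, error) {
	if _, ok := data[canonicalKeystoreKey]; ok {
		return false, nil // already canonical, nothing to migrate
	}

	var legacy []string
	for k := range data {
		if strings.HasSuffix(k, ".json") {
			legacy = append(legacy, k)
		}
	}
	sort.Strings(legacy) // deterministic order for the error message

	switch len(legacy) {
	case 0:
		return false, nil // no keystore yet; creation is the caller's job
	case 1:
		// exactly one candidate, safe to migrate below
	default:
		return false, fmt.Errorf("%w: %v", errAmbiguousKeystore, legacy)
	}

	old := legacy[0]
	if !json.Valid(data[old]) {
		return false, fmt.Errorf("legacy key %q does not hold valid JSON", old)
	}

	// Same bytes, new key: the key material itself is not touched.
	data[canonicalKeystoreKey] = data[old]
	delete(data, old)
	return true, nil
}
```

Because the bytes are moved rather than regenerated, a migrated Secret keeps the same key material; only the data key changes.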
Adds 8 unit tests covering fresh keystore creation, reuse, legacy key
migration, ambiguity rejection, malformed data, and the canonical
Secret/Deployment shape (single keystore.json projected, password
read via env, never mounted).
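As a rough illustration of the env scheme and the "password read via env" shape, the sketch below builds the six variables named in the commit message. It is an assumption-laden sketch: `envVar` is a simplified stand-in for a Kubernetes container env entry, `envName` and `signerEnv` are hypothetical helpers, and the logging values `json` and `info` are placeholders not taken from the PR.

```go
package main

import (
	"strconv"
	"strings"
)

// envVar is a simplified, hypothetical stand-in for a container env entry.
// Exactly one of Value or FromSecretKey is expected to be set.
type envVar struct {
	Name          string
	Value         string // literal value
	FromSecretKey string // Secret data key to source the value from
}

// envName builds a name in the SIGNER__SECTION__KEY hierarchy.
func envName(section, key string) string {
	return "SIGNER__" + strings.ToUpper(section) + "__" + strings.ToUpper(key)
}

// signerEnv returns the env entries named in the commit message. The password
// is referenced by Secret key rather than inlined or mounted as a file.
func signerEnv(host string, port int, keystoreDir, passwordSecretKey string) []envVar {
	return []envVar{
		{Name: envName("server", "host"), Value: host},
		{Name: envName("server", "port"), Value: strconv.Itoa(port)},
		{Name: envName("keystore", "dir"), Value: keystoreDir},
		{Name: envName("keystore", "password"), FromSecretKey: passwordSecretKey},
		{Name: envName("logging", "format"), Value: "json"}, // placeholder value
		{Name: envName("logging", "level"), Value: "info"},  // placeholder value
	}
}
```

For example, `signerEnv("0.0.0.0", 9000, "/data/keystores", "password")` yields `SIGNER__KEYSTORE__DIR=/data/keystores` as a literal and `SIGNER__KEYSTORE__PASSWORD` as a Secret reference; the host, port, and Secret key used here are illustrative only.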
fce4ca0 to
4862fd7
Compare
Summary
The serviceoffer-controller on main currently speaks the wrong contract to the deployed ghcr.io/obolnetwork/remote-signer:v0.3.0 image. Any Agent CR with wallet.create=true has a signer pod that fails to start or fails its probes, blocking the agent from reaching Ready.
Local verification against the published image
I pulled ghcr.io/obolnetwork/remote-signer:v0.3.0, ran it with both env schemes, and probed both health paths. Results:
- Image default keystore dir (docker inspect): /data/keystores, not /keystores
- Main's env scheme (KEYSTORE_PASSWORD, KEYSTORE_PATH): Error: NoPassword
- Upstream env scheme (SIGNER__KEYSTORE__PASSWORD, SIGNER__KEYSTORE__DIR): runs healthy
- GET /health on a running signer: HTTP 404
- GET /healthz on a running signer: {"status":"ok"}
So every one of main's three signer-related constants is wrong against the deployed image.
Fix
Pulls forward the integration branch's S6 work (originally on integration/pr450-pr451-cloudflare-obol, dropped by the squash merges):
- Move the keystore dir to /data/keystores (image default).
- Pin the on-disk filename to keystore.json so the Secret's items projection no longer needs to thread the V3 UUID through. The V3 document carries the address internally, so the on-disk name is cosmetic.
- Add ensureCanonicalKeystoreKey: on reconcile of an existing Secret with the wallet annotation, if data is keyed under the old UUID-named JSON field, rewrite it as keystore.json in place. Refuses ambiguous Secrets that already have multiple legacy JSON keys (operator must intervene).
- Switch the env scheme to upstream's SIGNER__SECTION__KEY hierarchy: SIGNER__SERVER__HOST, SIGNER__SERVER__PORT, SIGNER__KEYSTORE__DIR, SIGNER__KEYSTORE__PASSWORD, SIGNER__LOGGING__FORMAT, SIGNER__LOGGING__LEVEL. Matches the master agent's known-good config in hermes-obol-agent.
- Switch readiness and liveness probes from /health to /healthz.
Tests
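Adds 8 unit tests in agent_wallet_test.go covering:
- fresh keystore creation
- reuse of an existing keystore
- legacy key migration to keystore.json
- ambiguity rejection
- malformed data
- the canonical Secret/Deployment shape (single keystore.json projected, password read via env, never mounted into the keystore dir)
Test plan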
- go test ./internal/serviceoffercontroller/...: all green
- docker run ghcr.io/obolnetwork/remote-signer:v0.3.0 with the new env scheme: runs healthy, /healthz returns 200
- kubectl apply an Agent CR with wallet.create: true and confirm the remote-signer pod reaches Ready 1/1 without restarts
- Confirm an existing Secret keyed under the legacy UUID-named field is rewritten to keystore.json on next reconcile without rotating key material
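The probe contract reported in this PR (/healthz answers 200 with {"status":"ok"}, /health answers 404) can be captured as a small in-process check. This is a sketch only: `newFakeSignerHandler` is a hypothetical stand-in that mimics the behaviour observed against the v0.3.0 image and is not the real binary, and `probe` is an illustrative helper rather than anything in the controller.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// newFakeSignerHandler mimics the probe behaviour reported for the v0.3.0
// image: only /healthz is served; every other path falls through to 404.
// Hypothetical stand-in for illustration, not the real remote-signer.
func newFakeSignerHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"status":"ok"}`)
	})
	return mux
}

// probe issues an in-process GET against the handler and returns the status
// code and the trimmed response body, roughly what a kubelet HTTP probe sees.
func probe(h http.Handler, path string) (int, string) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	h.ServeHTTP(rec, req)
	return rec.Code, strings.TrimSpace(rec.Body.String())
}
```

Calling `probe(newFakeSignerHandler(), "/health")` returns a 404 status, which is why a readiness probe pointed at the old path can never succeed against a server shaped like this.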