
fix: ability to change ports with systemd#354

Merged
jason-lynch merged 1 commit into main from
fix/PLAT-548/change-ports-on-systemd
Apr 20, 2026

Conversation

@jason-lynch
Member

Summary

Fixes a bug where changing the Patroni or Postgres port caused the update operation to fail on systemd. The update method was setting the connection info from the updated spec, so it pointed at the new Patroni port while Patroni was still listening on the old one.

With this change, we copy the connection info from the previous instance state so that we reach Patroni on its previously active port. We also add a restart operation to handle cases where the Postgres port has changed.

Changes

  • Get connection info from previous instance state in InstanceResource.Update with a fallback to the current spec (the previous behavior) when the previous state is malformed.
  • Restart Postgres if needed in InstanceResource
    • We need this to change the port on systemd. On Swarm, the container was already restarted at this point due to changing the port binding.
    • Note that this necessarily reverses our earlier decision to leave it up to the user to restart after updating. I think it's a net improvement to the user experience.
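The first change above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: `ConnectionInfo`, `InstanceState`, and `connectionInfoForUpdate` are hypothetical stand-ins, and the port numbers are made up.

```go
package main

import "fmt"

// ConnectionInfo is a hypothetical stand-in for the persisted connection details.
type ConnectionInfo struct {
	PatroniPort  int
	PostgresPort int
}

// InstanceState is a hypothetical stand-in for the previously persisted instance state.
type InstanceState struct {
	ConnectionInfo *ConnectionInfo
}

// connectionInfoForUpdate prefers the connection info recorded in the previous
// state, because that is where Patroni is still listening during the update.
// It falls back to the spec-derived values only when no previous info exists.
func connectionInfoForUpdate(previous *InstanceState, fromSpec func() ConnectionInfo) ConnectionInfo {
	if previous != nil && previous.ConnectionInfo != nil {
		return *previous.ConnectionInfo
	}
	return fromSpec()
}

func main() {
	// The updated spec asks for new ports (illustrative values).
	fromSpec := func() ConnectionInfo {
		return ConnectionInfo{PatroniPort: 8889, PostgresPort: 5433}
	}
	previous := &InstanceState{
		ConnectionInfo: &ConnectionInfo{PatroniPort: 8888, PostgresPort: 5432},
	}

	fmt.Println(connectionInfoForUpdate(previous, fromSpec).PatroniPort)         // previous port
	fmt.Println(connectionInfoForUpdate(&InstanceState{}, fromSpec).PatroniPort) // fallback to spec
}
```

Passing the spec-derived values as a function keeps the fallback lazy, so the sketch only computes them when the previous state cannot be used.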

Testing

The TestPortChange E2E test exercises this fix:

make test-e2e E2E_RUN=TestPortChange

Notes for Reviewers

PLAT-548

@coderabbitai

coderabbitai Bot commented Apr 20, 2026

📝 Walkthrough


Added a new end-to-end test validating database port changes across nodes with replication verification. Updated instance resource logic to preserve connection information during port updates and implemented a restart mechanism that checks for pending restarts after Patroni initialization.

Changes

| Cohort / File(s) | Summary |
|---|---|
| E2E Port Change Test: e2e/port_change_test.go | New parallel test that creates a database with two nodes bound to specific ports, updates the database spec to shift those ports, validates usability by connecting to both nodes, asserts port values when using systemd orchestrator, and verifies replication succeeds after port changes. |
| Instance Resource Port Update Logic: server/internal/database/instance_resource.go | InstanceResource.Update now reuses prior persisted ConnectionInfo to handle port changes, only recomputing from spec if previous connection info is nil. Added restartIfNeeded step in initializeInstance to query Patroni status, schedule restart if PendingRestart is set, and wait for Patroni readiness. |

Poem

🐰 Hops with glee o'er ports that change,
Connection info, reused and arranged,
When restarts call, the instance heeds,
Replication flows where the test succeeds!
Port shifts smoothly, no more dismay,
The rabbit's code saves the day! 🐇

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'fix: ability to change ports with systemd' directly and concisely summarizes the main fix in the changeset, following Conventional Commits format. |
| Description check | ✅ Passed | The description includes all required template sections: Summary, Changes, Testing, Checklist (with relevant items marked), and Notes for Reviewers with issue reference. |


@codacy-production

codacy-production Bot commented Apr 20, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues


🟢 Metrics 0 duplication

Metric Results
Duplication 0


@jason-lynch jason-lynch force-pushed the fix/PLAT-548/change-ports-on-systemd branch from aeb80fb to 6f15645 Compare April 20, 2026 14:29
@jason-lynch jason-lynch force-pushed the fix/PLAT-548/change-ports-on-systemd branch from 6f15645 to c80b7d5 Compare April 20, 2026 14:30
@jason-lynch
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 20, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
server/internal/database/instance_resource.go (2)

103-116: Fallback path can re-introduce the original bug.

When previous.ConnectionInfo is nil (e.g., an older persisted state from before ConnectionInfo existed, or any corrupted/partial state), this falls back to updateConnectionInfo(ctx, rc), which rebuilds the connection info from the new spec — exactly the behavior that caused the systemd port-change failure this PR is fixing. The per-PR-295 learning confirms updateConnectionInfo intentionally uses the desired spec values.

That fallback is acceptable as a last resort, but consider either:

  • Logging a warning when it fires so the re-emergence of the bug is observable, or
  • Only falling back when previous itself is the zero/empty resource, and otherwise erroring (stop-the-world) since we know the pre-existing instance had some active ports we can’t discover.

Also: resource.FromContext[*InstanceResource](rc, r.Identifier()) assumes previous is non-nil whenever err == nil. Worth a quick nil-guard at Line 112 (if previous != nil && previous.ConnectionInfo != nil) since a nil previous here would panic while dereferencing, masking the real problem.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/database/instance_resource.go` around lines 103 - 116, In
Update, avoid silently rebuilding connection info from the new spec when a
previous saved state is present but missing ConnectionInfo: add a nil-guard for
previous (check result of resource.FromContext[*InstanceResource](rc,
r.Identifier()) before dereferencing) and change the fallback so it either logs
a warning when updateConnectionInfo(ctx, rc) is used as last resort or returns
an error (use recordError) when previous is non‑zero but missing ConnectionInfo;
reference the Update method, the previous variable from resource.FromContext,
the ConnectionInfo field, updateConnectionInfo, and recordError to locate and
implement the nil check plus the conditional logging/error behavior.

350-366: LGTM on restartIfNeeded.

Status is fetched once, PendingRestart is nil-checked before dereference, ScheduleRestart is called exactly once (no retry/idempotency concerns per ScheduleRestart's contract in server/internal/patroni/client.go), and we re-wait for the running state. Errors are wrapped with useful context.

One small optional tightening: on a very fast Postgres restart, Patroni can still report state=running between the schedule request and the actual restart, so WaitForPatroniRunning at Line 361 may return before the restart has started. In practice the subsequent GetPrimaryInstanceID call works around it, but you could re-check PendingRestart == false (or poll until the postmaster_start_time advances) for a tighter guarantee. Optional — the current shape is fine for the stated use case.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/database/instance_resource.go` around lines 350 - 366, The
current restartIfNeeded flow can return before the actual restart has begun
because Patroni may briefly report state=running; after calling
client.ScheduleRestart and WaitForPatroniRunning, add a short polling loop that
re-fetches client.GetInstanceStatus (using the same ctx) and verifies
PendingRestart is either nil or false (or alternatively poll the instance's
postmaster_start_time until it advances) with a small timeout/retry interval;
update restartIfNeeded to only return nil once PendingRestart clears (or
postmaster_start_time advances) and otherwise return a wrapped error on timeout.
Use the existing functions GetInstanceStatus, ScheduleRestart,
WaitForPatroniRunning and keep the timeout/polling logic confined to
restartIfNeeded.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c46ce314-445b-4d92-b5fc-43b90def10d9

📥 Commits

Reviewing files that changed from the base of the PR and between 107121a and c80b7d5.

📒 Files selected for processing (2)
  • e2e/port_change_test.go
  • server/internal/database/instance_resource.go

@jason-lynch jason-lynch merged commit 3ec5b6e into main Apr 20, 2026
3 checks passed
@jason-lynch jason-lynch deleted the fix/PLAT-548/change-ports-on-systemd branch April 20, 2026 19:38