Add PostgreSQL documentation and tutorials #72
…d user manual
- Add Tutorial 5: Setting up orchestrator with PostgreSQL streaming replication (prerequisites, user setup, configuration, discovery, failover testing)
- Update database-providers.md: document PostgreSQL as a fully supported provider with all supported operations, differences from MySQL, and known limitations
- Update reference.md: add ProviderType config field to section 1.1
- Update user-manual.md: add PostgreSQL sections to chapters 2 (prerequisites), 3 (discovery mechanism), 4 (failure detection codes), and 5 (recovery flow)
- Update README.md: mention PostgreSQL support in features list and description

Closes #69
Pull request overview
This PR updates the project documentation to describe PostgreSQL support (including a new PostgreSQL streaming replication tutorial) and to surface a PostgreSQL “provider mode” in the README, user manual, database provider docs, and configuration reference.
Changes:
- Add “Tutorial 5” walkthrough for PostgreSQL streaming replication setup in `docs/tutorials.md`.
- Expand `docs/user-manual.md` and `docs/database-providers.md` with PostgreSQL discovery/failure-detection/recovery/provider-mode documentation.
- Update `README.md` and `docs/reference.md` to advertise/configure a `ProviderType` switch and PostgreSQL support.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| README.md | Advertises PostgreSQL support in the project description and features list. |
| docs/user-manual.md | Adds PostgreSQL prerequisites, discovery, failure detection codes, and recovery flow sections. |
| docs/tutorials.md | Adds a PostgreSQL streaming replication setup and failover tutorial. |
| docs/reference.md | Adds ProviderType to the configuration reference (General/Debug section). |
| docs/database-providers.md | Updates provider abstraction docs to describe PostgreSQL “mode”, config, and supported operations. |
| `orchestrator` is a MySQL high availability and replication management tool, runs as a service and provides command line access, HTTP API and Web interface. `orchestrator` supports: | ||
| `orchestrator` is a MySQL and PostgreSQL high availability and replication management tool, runs as a service and provides command line access, HTTP API and Web interface. `orchestrator` supports: |
README now states orchestrator is a “MySQL and PostgreSQL high availability and replication management tool”, but the current codebase does not expose any configuration/CLI mechanism to switch to the PostgreSQL provider (no ProviderType field in go/config/config.go, and the provider registry is initialized to MySQL only). This claim should be revised to match what’s actually available, or the missing provider-selection plumbing needs to be implemented first.
| `orchestrator` is a MySQL and PostgreSQL high availability and replication management tool, runs as a service and provides command line access, HTTP API and Web interface. `orchestrator` supports: | |
| `orchestrator` is a MySQL high availability and replication management tool, runs as a service and provides command line access, HTTP API and Web interface. `orchestrator` supports: |
| - **PostgreSQL support** -- discovery, failure detection, and automated failover for PostgreSQL streaming replication topologies | ||
| - Database provider abstraction for multi-database support |
The new feature bullet claims “PostgreSQL support — discovery, failure detection, and automated failover”, but there is no PostgreSQL discovery/failure-analysis/failover implementation present (e.g., no pg_stat_replication/pg_promote usage in go/, and analysis codes like DeadPrimary do not exist). Please either scope this bullet to what is actually implemented today, or land the corresponding code changes before advertising full PostgreSQL HA support.
| |-------|------|---------|-------------| | ||
| | `Debug` | bool | `false` | Set debug mode (similar to `--debug` option) | | ||
| | `EnableSyslog` | bool | `false` | Should logs be directed (in addition) to syslog daemon? | | ||
| | `ProviderType` | string | `"mysql"` | Database provider type: `"mysql"` (default) or `"postgresql"`. When set to `"postgresql"`, orchestrator uses PostgreSQL-specific discovery, failure detection, and recovery logic. See [Database Providers](database-providers.md) for details. | |
docs/reference.md states it is generated from go/config/config.go, but it documents a ProviderType field that does not exist in the Configuration struct. Either add/implement this config field (and wire it into provider selection), or remove it from the reference to keep the document consistent with the code-as-source-of-truth statement.
| | `ProviderType` | string | `"mysql"` | Database provider type: `"mysql"` (default) or `"postgresql"`. When set to `"postgresql"`, orchestrator uses PostgreSQL-specific discovery, failure detection, and recovery logic. See [Database Providers](database-providers.md) for details. | |
| MySQL is the default provider. PostgreSQL is fully supported for streaming | ||
| replication topologies, including discovery, failure detection, and automated | ||
| failover. The abstraction layer is designed to support additional providers in | ||
| the future. |
This doc claims PostgreSQL is “fully supported … including discovery, failure detection, and automated failover” and that setting ProviderType enables PostgreSQL mode. In the current codebase, there is no ProviderType config field and the provider registry is not wired into discovery/recovery (no call sites for inst.GetProvider() outside tests). Please adjust the documentation to the current capabilities (or merge the missing wiring/code first).
| ### Switching to PostgreSQL Mode | ||
|
|
||
| Add the following fields to your orchestrator configuration JSON: | ||
| Set `ProviderType` to `"postgresql"` in your orchestrator configuration: | ||
|
|
||
| ```json | ||
| { | ||
| "ProviderType": "postgresql", | ||
| "PostgreSQLTopologyUser": "orchestrator", | ||
| "PostgreSQLTopologyPassword": "secret" | ||
| "PostgreSQLTopologyPassword": "secret", | ||
| "PostgreSQLSSLMode": "require", | ||
| "DefaultInstancePort": 5432 | ||
| } | ||
| ``` | ||
|
|
||
| These credentials are used to connect to PostgreSQL topology instances for | ||
| discovery and replication management operations. | ||
| When `ProviderType` is set to `"postgresql"`, orchestrator automatically uses | ||
| the PostgreSQL provider for all topology operations: discovery, failure | ||
| analysis, and recovery. |
The “Switching to PostgreSQL Mode” section documents ProviderType as the activation mechanism, but go/config/config.go has no such field and nothing in the code switches the global provider based on config. This section should either describe the actual activation mechanism (if any), or be removed until provider selection is implemented.
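For illustration only, the missing provider-selection plumbing could look roughly like the sketch below. It is not taken from the codebase: the `Provider` interface, the two provider types, and `SelectProvider` are hypothetical names, and the real provider interface is certainly larger than a single `Name()` method. The point is that an empty value falls back to MySQL and an unrecognized value fails loudly instead of silently running in MySQL mode.

```go
package main

import (
	"fmt"
	"strings"
)

// Provider is a hypothetical stand-in for the project's provider interface.
type Provider interface {
	Name() string
}

type mysqlProvider struct{}

func (mysqlProvider) Name() string { return "mysql" }

type postgresqlProvider struct{}

func (postgresqlProvider) Name() string { return "postgresql" }

// SelectProvider maps a configured provider type to an implementation.
// An empty value falls back to MySQL; unknown values are reported as errors.
func SelectProvider(providerType string) (Provider, error) {
	switch strings.ToLower(strings.TrimSpace(providerType)) {
	case "", "mysql":
		return mysqlProvider{}, nil
	case "postgresql":
		return postgresqlProvider{}, nil
	default:
		return nil, fmt.Errorf("unknown ProviderType %q", providerType)
	}
}

func main() {
	for _, t := range []string{"", "postgresql", "oracle"} {
		p, err := SelectProvider(t)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", t, p.Name())
	}
}
```

A call like this would have to run once at startup, after the configuration is loaded and before any discovery or recovery code asks for the active provider.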
| ```json | ||
| { | ||
| "Debug": true, | ||
| "ListenAddress": ":3000", | ||
| "ProviderType": "postgresql", | ||
| "PostgreSQLTopologyUser": "orchestrator", | ||
| "PostgreSQLTopologyPassword": "orch_pass", | ||
| "PostgreSQLSSLMode": "require", | ||
| "BackendDB": "sqlite", | ||
| "SQLite3DataFile": "/tmp/orchestrator.sqlite3", | ||
| "DefaultInstancePort": 5432, | ||
| "InstancePollSeconds": 5, | ||
| "RecoverMasterClusterFilters": ["*"], | ||
| "RecoverIntermediateMasterClusterFilters": ["*"], | ||
| "FailureDetectionPeriodBlockMinutes": 60, | ||
| "RecoveryPeriodBlockSeconds": 3600 | ||
| } | ||
| ``` | ||
|
|
||
| **Key fields explained:** | ||
|
|
||
| | Field | Purpose | | ||
| |-------|---------| | ||
| | `ProviderType` | Set to `"postgresql"` to enable PostgreSQL mode. Default is `"mysql"`. | | ||
| | `PostgreSQLTopologyUser` / `Password` | Credentials orchestrator uses to connect to your PostgreSQL instances. | | ||
| | `PostgreSQLSSLMode` | SSL mode for PostgreSQL connections: `disable`, `require`, `verify-ca`, or `verify-full`. | | ||
| | `DefaultInstancePort` | Set to `5432` for PostgreSQL (default is 3306 for MySQL). | |
Tutorial config uses ProviderType: "postgresql" and suggests it enables PostgreSQL mode, but ProviderType is not a recognized config field in go/config/config.go and the global provider is never switched based on config. As written, users can follow this tutorial and still run in MySQL mode. Please update the tutorial to the actual enablement mechanism (or add the missing configuration + wiring in code).
| ### Step 6: Discover your PostgreSQL topology | ||
|
|
||
| Tell orchestrator about your PostgreSQL primary. Replace `pg-primary` with the actual hostname or IP: | ||
|
|
||
| ```bash | ||
| curl http://localhost:3000/api/discover/pg-primary/5432 | ||
| ``` | ||
|
|
||
| Expected output: | ||
|
|
||
| ```json | ||
| { | ||
| "Key": {"Hostname": "pg-primary", "Port": 5432}, | ||
| "Uptime": 1, | ||
| "FlavorName": "PostgreSQL", | ||
| "Version": "16.2", | ||
| "ReadOnly": false | ||
| } | ||
| ``` | ||
|
|
||
| Orchestrator connects to the primary, queries `pg_stat_replication` to discover connected standbys, and recursively probes each standby. | ||
|
|
The tutorial claims /api/discover/pg-primary/5432 will discover a PostgreSQL topology by querying pg_stat_replication, but the current Discover handler calls inst.ReadTopologyInstance, which is explicitly implemented as MySQL-only discovery. Unless PostgreSQL discovery has been implemented elsewhere, these steps (and the “expected output”) will not work as documented and should be revised/removed until supported.
| ### Step 9: Test graceful failover | ||
|
|
||
| To verify failover works, you can simulate a primary failure by stopping PostgreSQL on the primary: | ||
|
|
||
| ```bash | ||
| # On the primary host: | ||
| pg_ctl stop -D /var/lib/postgresql/16/main -m fast | ||
| ``` |
Step 9 is titled “Test graceful failover” but the procedure is to stop PostgreSQL on the primary, which is an unplanned failure scenario (failover), not a graceful/planned switchover. Consider renaming this step to avoid conflating planned takeover with failure recovery (especially since graceful takeover is called out elsewhere as not supported for PostgreSQL).
|
|
||
| The `pg_monitor` role grants read access to `pg_stat_replication`, `pg_stat_wal_receiver`, and other monitoring views that orchestrator needs for discovery. | ||
|
|
||
| > **Note:** If you are using PostgreSQL 9.6 (not recommended), you need to grant `SELECT` on the individual monitoring views instead of using `pg_monitor`. |
Tutorial requires PostgreSQL 12+ (for pg_promote()), but the note suggests supporting PostgreSQL 9.6 with extra grants. This is internally inconsistent; either remove the 9.6 note or relax/clarify the minimum version requirement and explicitly document which features won’t work on <12.
| > **Note:** If you are using PostgreSQL 9.6 (not recommended), you need to grant `SELECT` on the individual monitoring views instead of using `pg_monitor`. | |
| > **Note:** This tutorial assumes PostgreSQL 12+. | |
| > Earlier PostgreSQL versions may require different grants for monitoring views and do not support all features used later in this guide. |
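If the docs do end up listing which features are unavailable on older servers, a version gate along the following lines is one way to express it. This is a sketch, not existing code: `FeatureSupport` and `FeaturesFor` are hypothetical names. The thresholds reflect that the `pg_monitor` role exists from PostgreSQL 10 and `pg_promote()` from PostgreSQL 12, and the input is the integer reported by `SHOW server_version_num`.

```go
package main

import "fmt"

const (
	minVersionPgMonitor = 100000 // pg_monitor role is available from PostgreSQL 10
	minVersionPgPromote = 120000 // pg_promote() is available from PostgreSQL 12
)

// FeatureSupport is a hypothetical summary of what a server version allows.
type FeatureSupport struct {
	PgMonitorRole bool
	PgPromote     bool
}

// FeaturesFor takes the integer from "SHOW server_version_num"
// (for example 90624 for 9.6.24 or 160002 for 16.2).
func FeaturesFor(serverVersionNum int) FeatureSupport {
	return FeatureSupport{
		PgMonitorRole: serverVersionNum >= minVersionPgMonitor,
		PgPromote:     serverVersionNum >= minVersionPgPromote,
	}
}

func main() {
	for _, v := range []int{90624, 110005, 160002} {
		fmt.Printf("%d: %+v\n", v, FeaturesFor(v))
	}
}
```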
| instead of binlog file:position or GTIDs. Orchestrator converts LSN to an | ||
| int64 for internal use. |
The doc says “Orchestrator converts LSN to an int64 for internal use”, but the provider interface explicitly defines ReplicationStatus.Position as an opaque string (and provider_postgresql.go returns LSN strings). Suggest updating this to reflect the current design: LSN is treated as an opaque position string unless/until there is a dedicated typed representation.
| instead of binlog file:position or GTIDs. Orchestrator converts LSN to an | |
| int64 for internal use. | |
| instead of binlog file:position or GTIDs. Orchestrator currently treats the | |
| replication position as an opaque string (`ReplicationStatus.Position`) | |
| unless/until a dedicated typed representation is introduced. |
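For reference, a typed representation would not be hard to add later. The sketch below is hypothetical (`ParseLSN` is not a function in the codebase) and relies only on the textual LSN format, two hexadecimal numbers separated by a slash that form the high and low 32 bits of a 64-bit position. It also shows why comparing LSNs as opaque strings is unsafe for ordering.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ParseLSN converts a textual LSN such as "16/B374D848" into a 64-bit
// value: the part before the slash is the high 32 bits, the part after
// it the low 32 bits.
func ParseLSN(lsn string) (uint64, error) {
	hi, lo, ok := strings.Cut(lsn, "/")
	if !ok {
		return 0, fmt.Errorf("malformed LSN %q: missing '/'", lsn)
	}
	h, err := strconv.ParseUint(hi, 16, 32)
	if err != nil {
		return 0, fmt.Errorf("malformed LSN %q: %w", lsn, err)
	}
	l, err := strconv.ParseUint(lo, 16, 32)
	if err != nil {
		return 0, fmt.Errorf("malformed LSN %q: %w", lsn, err)
	}
	return h<<32 | l, nil
}

func main() {
	a, _ := ParseLSN("9/0")
	b, _ := ParseLSN("10/0")
	// String comparison orders these the wrong way round; numeric does not.
	fmt.Println("string order says 9/0 < 10/0:", "9/0" < "10/0")
	fmt.Println("numeric order says 9/0 < 10/0:", a < b)
}
```

Until something like this exists, any doc text about picking the standby with the "highest WAL LSN" should make clear how that comparison is actually performed.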
Code Review
This pull request introduces extensive documentation for PostgreSQL support, including configuration, discovery, failure detection, and recovery processes. It also adds a new setup tutorial and updates the user manual. The review feedback identifies several issues, including a contradiction regarding supported operations, inconsistent timeout values for promotion, a typo in the Go version requirement, and discrepancies between documented analysis codes and the source code.
| |---------------------|---------------------------------------------------------------| | ||
| ### Supported Operations | ||
|
|
||
| | Operation | PostgreSQL Implementation | |
| | **Discovery (standby)** | Queries `pg_stat_wal_receiver` for WAL receiver status and `pg_last_wal_replay_lsn()` for replay position. Extracts primary host/port from `conninfo`. | | ||
| | **Replication lag** | Computes `EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())` on standbys. | | ||
| | **Failure detection** | Analyzes reachability of primary and replication status of standbys. Produces `DeadPrimary`, `DeadPrimaryAndSomeStandbys`, `StandbyNotReplicating`, `AllStandbyNotReplicating`, and `UnreachablePrimary` analysis codes. | | ||
| | **Promotion** | Calls `pg_promote(true, 60)` on the selected standby and waits up to 30 seconds for it to exit recovery mode. | |
The SQL call pg_promote(true, 60) specifies a 60-second timeout, but the description states that orchestrator waits up to 30 seconds. These values should be consistent.
| | **Promotion** | Calls `pg_promote(true, 60)` on the selected standby and waits up to 30 seconds for it to exit recovery mode. | | |
| | **Promotion** | Calls `pg_promote(true, 60)` on the selected standby and waits up to 60 seconds for it to exit recovery mode. | |
| ### What you will need | ||
|
|
||
| - **PostgreSQL 12+** primary with one or more streaming replication standbys already configured | ||
| - Go 1.25+ installed (for building from source) |
|
|
||
| **PostgreSQL-specific analysis codes:** | ||
|
|
||
| | Analysis Code | Condition | |
| - Must not be downtimed | ||
| - If a candidate key is specified and the candidate is valid, it is preferred | ||
| - Otherwise, the standby with the lowest replication lag and highest WAL LSN (most up-to-date) is chosen | ||
| 4. **Promote the standby** by calling `pg_promote(true, 60)` on the selected standby. Orchestrator waits up to 30 seconds for the instance to exit recovery mode. |
The SQL call pg_promote(true, 60) specifies a 60-second timeout, but the text states that orchestrator waits up to 30 seconds. These should be consistent.
| 4. **Promote the standby** by calling `pg_promote(true, 60)` on the selected standby. Orchestrator waits up to 30 seconds for the instance to exit recovery mode. | |
| 4. **Promote the standby** by calling `pg_promote(true, 60)` on the selected standby. Orchestrator waits up to 60 seconds for the instance to exit recovery mode. |
Summary
- `docs/tutorials.md`: complete walkthrough for setting up orchestrator with PostgreSQL streaming replication (user creation, configuration, discovery, failover testing, expected output at each step)
- `docs/database-providers.md`: document PostgreSQL as a fully supported provider with all supported operations, configuration fields, differences from MySQL mode, and known limitations
- `docs/reference.md`: add `ProviderType` configuration field to section 1.1 (General/Debug)
- `docs/user-manual.md`: add PostgreSQL content to Chapter 2 (prerequisites), Chapter 3 (discovery mechanism using pg_stat_replication/pg_stat_wal_receiver), Chapter 4 (PostgreSQL-specific failure analysis codes), and Chapter 5 (recovery flow with pg_promote and ALTER SYSTEM)
- `README.md`: mention PostgreSQL support in the features list and update the project description

Closes #69
Test plan
- `go/config/config.go`
- `go/inst/analysis.go`