Documentation: operational aspects, hints and tips #252
Conversation
docs/deployment-raft.md
Outdated
| - For `SQLite`:
|   - `SQLite` is bundled with `orchestrator`.
|   - Make sure the `SQLite3DataFile` is writable to the `orchestrator` user.
Suggestion: writable by
Question: does orchestrator try to ensure this file has any specific permissions?
fixed. Answer: nope.
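Since `orchestrator` does not enforce any permissions on the file itself (per the answer above), ownership has to be set out of band. A minimal sketch, with a hypothetical file path; match it to whatever your `SQLite3DataFile` setting points at:

```shell
# Hypothetical path -- use the value of SQLite3DataFile from your config.
DATAFILE=/var/lib/orchestrator/orchestrator.sqlite3

# orchestrator does not set permissions itself, so make the file
# owned by and writable by the orchestrator user explicitly:
sudo chown orchestrator:orchestrator "$DATAFILE"
sudo chmod 640 "$DATAFILE"
```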
docs/deployment-raft.md
Outdated
| As suggested, you may want to put `orchestrator` service and `MySQL` service on same box. If using `SQLite` there's nothing else to do.
| - Consider adding a proxy on top of the service boxes; the proxy would redirect all traffic to the leader node. There is one and only one leader node, and the status check endpoint is `/api/leader-check`.
| - Clients may _only interact with the leader_. Setting up a proxy is one way to ensure that. See [proxy section](raft.md#proxy).
"may" ? Should this not be "must" ?
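For reference, the `/api/leader-check` endpoint quoted above lends itself to a standard HTTP health check in the proxy. A sketch of an HAProxy configuration, with placeholder hostnames; port `3000` is assumed to be the orchestrator HTTP port:

```
listen orchestrator
  bind 0.0.0.0:80
  mode http
  option httpchk GET /api/leader-check
  # Only the current raft leader returns HTTP 200 on this endpoint,
  # so the proxy routes traffic to exactly one node at any time:
  server orc1 orchestrator-node-1:3000 check port 3000
  server orc2 orchestrator-node-2:3000 check port 3000
  server orc3 orchestrator-node-3:3000 check port 3000
```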
docs/deployment-raft.md
Outdated
| - Run failure detection
| - Register their own health check
| Non-leader nodes may _NOT_:
"may" ? Should this not be "must" ?
| - Run arbitrary commands (e.g. `relocate`, `begin-downtime`)
| - Run recoveries per human request.
| - Serve client HTTP requests (but some endpoints, such as load-balancer and health checks, are valid).
Could the non-leaders redirect these requests to the leader? That might simplify the proxy setup, but maybe it assumes connectivity that may not work (e.g. it probably requires the non-leaders to know the public interface of the leader, and currently orchestrator is not necessarily aware of that).
| - Copy backend DB data:
|   - If `MySQL`, run backup/restore, either logical or physical.
|   - If `SQLite`, run `.dump` + restore, see [10. Converting An Entire Database To An ASCII Text File](https://sqlite.org/cli.html).
- Is it not possible to "bootstrap" a new, empty orchestrator node into a raft cluster just by talking to the cluster? While using the equivalent of dumping the sqlite or mysql db backend to the new node will work, it feels attractive to be able to pull this info down from the cluster directly.
- I may have missed this: but are access "credentials" required to join the cluster?
bootstrap: see #246 (comment)
credentials: no.
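To make the `.dump` + restore path quoted above concrete, here is a minimal sketch; file paths are placeholders, with the source being the leader's `SQLite3DataFile`:

```shell
# On the existing (leader) node: dump the backend DB to SQL text.
# Placeholder path -- use your actual SQLite3DataFile value.
sqlite3 /var/lib/orchestrator/orchestrator.db .dump > orchestrator-dump.sql

# On the new node: load the dump into a fresh backend file.
sqlite3 /var/lib/orchestrator/orchestrator.db < orchestrator-dump.sql
```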
docs/deployment-shared-backend.md
Outdated
| In a shared backend setup multiple `orchestrator` services will all speak to the same backend.
| - For **synchronous replication**, the advice is:
docs/deployment.md
Outdated
| - Only one service will be the leader at any given time.
| - The leader is the one polling for servers; doing database cleanup; checking for crash recovery scenarios etc.
| - You may choose to have all your `orchestrator` services load-balanced
| However how does `orchestrator` discover completely new topologies?
Suggestion: "However, how ..." ?? (comma is important here).
docs/deployment.md
Outdated
| - The (single) MySQL backend has the necessary state for managing concurrent operations.
| - `orchestrator` has "maintenance locks" which prevent destructive concurrent operations on the same instance. At worst an
|   operation will be rejected due to not being able to acquire maintenance lock.
| - You may ask `orchestrator` to _discover_ (probe) any single server in such topology, and from there on it will crawl its way across the entire topology.
suggestion: " ... in such a topology ..."
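As an illustration of the crawling behavior described above, a single probe is enough to seed discovery of the whole topology. Hostnames and the service address are placeholders:

```shell
# Probe any one server in the topology; orchestrator crawls
# masters and replicas from there:
orchestrator -c discover -i db-master-1.example.com:3306

# Or against a running service via its HTTP API:
curl -s "http://your.orchestrator.service:3000/api/discover/db-master-1.example.com/3306"
```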
| ```
| This setup comes from production environments. The cron entries get updated by `puppet` to reflect the appropriate `promotion_rule`. A server may have `prefer` at this time, and `prefer_not` in 5 minutes from now. Integrate your own service discovery method, your own scripting, to provide your up-to-date `promotion-rule`.
You can also use the http interface to achieve the same result. That avoids the need for direct access to the orchestrator database (for writes) so may be preferred.
Thanks for pointing this out. I'll be giving examples using orchestrator CLI, orchestrator-client (which is actually using the HTTP interface) and directly accessing the API.
actually, the very example you responded to used orchestrator-client...
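To make the suggestion above concrete: the same `promotion_rule` registration can go through the HTTP API directly, with no write access to the orchestrator database. Hostnames and port are placeholders, and `orchestrator-client` wraps this same endpoint:

```shell
# Register a candidate with a promotion rule via the HTTP API.
# Valid rules include prefer, neutral, prefer_not, must_not.
curl -s "http://your.orchestrator.service:3000/api/register-candidate/db-host.example.com/3306/prefer"

# Five minutes later, your service discovery may flip the same host:
curl -s "http://your.orchestrator.service:3000/api/register-candidate/db-host.example.com/3306/prefer_not"
```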
cc @github/database-infrastructure as interested party.
Documenting operational aspects: