This repository has been archived by the owner on Sep 4, 2021. It is now read-only.

docs: Add MySQL, update database and operating pages #2992

Merged
merged 1 commit into from
Jun 27, 2016
57 changes: 57 additions & 0 deletions docs/content/databases.html.md
@@ -0,0 +1,57 @@
---
title: Databases
layout: docs
---

# Databases

Flynn includes built-in database appliances that handle configuring and managing
highly available databases automatically. These appliances are designed to
provide the maximum amount of safety available from the database system while
providing as much availability as possible without compromising safety.

In some cases it is not possible to meet the strict guarantees of a 'CP' system
under the [CAP theorem](https://en.wikipedia.org/wiki/CAP_theorem) due to
limitations in the database software we are wrapping. This is noted specifically
in the Safety section of the documentation for the database in question.

## State Machine Design

The Flynn database appliances are designed with a few goals in mind:

1. Acknowledged writes must not be lost and must be consistent.
1. Network partitions must be tolerated without corrupting data. There should be
no potential for split-brain or other data-mangling failures.
1. When a failure occurs, the appliance should transition into an available
configuration without operator intervention if it can do so safely.

The appliance is a cluster of three or more database instances where:

- One member of the cluster, the _primary_, serves consistent reads and writes.
- The primary has synchronous replication to a single member called the _sync_.
Write transactions are not acknowledged to the client until they have been added
to the sync's transaction log.
- Replicating from the sync is a daisy chain of one or more _async_ instances,
which replicate changes asynchronously from their upstream link in the chain.
- If possible, the system automatically reconfigures itself after failures to
maximize uptime and never lose data.

In the face of an arbitrary failure or maintenance action, the cluster can
temporarily lose the ability to handle writes and consistent reads. Eventually
consistent reads are always available from the sync and async instances.

If the primary fails, the sync sees this and promotes itself to primary,
converting the async replicating from it to the new sync. Writes are not
accepted until the new sync has caught up. A variety of safety conditions are in
place so that a promotion will never cause writes to be lost or split brain to
occur.
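
The promotion step described above can be sketched as a small state transition. This is only an illustration under stated assumptions, not the appliance's implementation: `ClusterState`, `promote_after_primary_failure`, and `mark_sync_caught_up` are hypothetical names, and the real state machine applies additional safety checks that are not modeled here.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass
class ClusterState:
    """Hypothetical model of the cluster: one primary, one sync, a chain of asyncs."""
    primary: Optional[str]
    sync: Optional[str]
    asyncs: List[str] = field(default_factory=list)
    accepting_writes: bool = True


def promote_after_primary_failure(state: ClusterState) -> ClusterState:
    """The sync becomes primary and the first async in the chain becomes the new sync.

    Writes stay disabled until the new sync has caught up.
    """
    if state.sync is None:
        raise ValueError("no sync available to promote")
    new_sync = state.asyncs[0] if state.asyncs else None
    return ClusterState(
        primary=state.sync,
        sync=new_sync,
        asyncs=list(state.asyncs[1:]),
        accepting_writes=False,
    )


def mark_sync_caught_up(state: ClusterState) -> ClusterState:
    """Re-enable writes, but only if there is a sync to replicate them to (assumption)."""
    return replace(state, accepting_writes=state.sync is not None)
```

Modeling the change as a function from one state to the next keeps each transition easy to test in isolation, which is the main reason the sketch avoids mutating the state in place.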

The cluster state is maintained by the primary and stored in discoverd. The
discoverd DNS and HTTP APIs expose the current primary instance.

This design is heavily based on the prior work done by Joyent on the [Manatee
state machine](https://github.com/joyent/manatee-state-machine).

Flynn comes with a cluster configured with three instances by default. If an
instance fails, the scheduler will create a new instance and the cluster will be
reconfigured by the primary without operator intervention.
89 changes: 89 additions & 0 deletions docs/content/mysql.html.md
@@ -0,0 +1,89 @@
---
title: MySQL
layout: docs
---

# MySQL

The Flynn MySQL appliance provides MariaDB 10.1 in a highly available
configuration with automatic provisioning. It automatically fails over to
a synchronous replica with no loss of data if the primary server goes down.

## Usage

### Adding a database to an app

MariaDB comes ready to go as soon as you've installed Flynn. After you create an
app, you can provision a database for your app by running:

```text
flynn resource add mysql
```

This will provision a database on the MariaDB cluster and configure your
application to connect to it.

By default, MariaDB is not running in the Flynn cluster. The first time you
provision a database, MariaDB will be started and configured.

### Connecting to the database

Provisioning the database will add a few environment variables to your app
release. `MYSQL_HOST`, `MYSQL_USER`, `MYSQL_PWD`, and `MYSQL_DATABASE` provide
connection details for the database and are used automatically by many MySQL
clients.

Flynn will also create the `DATABASE_URL` environment variable, which is used
by some frameworks to configure database connections.
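
As a rough sketch of how an app might read these variables, the following uses only the Python standard library. The helper name `mysql_settings` is hypothetical, and the `DATABASE_URL` layout (`mysql://user:password@host:port/database`) is an assumption rather than something this page specifies, so check the actual value in your app's environment first.

```python
import os
from urllib.parse import urlparse


def mysql_settings(env=os.environ):
    """Collect connection settings from the environment (hypothetical helper).

    Prefers DATABASE_URL when present, otherwise falls back to the
    individual MYSQL_* variables named above.
    """
    url = env.get("DATABASE_URL")
    if url:
        # Assumed layout: mysql://user:password@host:port/database
        parsed = urlparse(url)
        return {
            "host": parsed.hostname,
            "port": parsed.port,
            "user": parsed.username,
            "password": parsed.password,
            "database": parsed.path.lstrip("/"),
        }
    return {
        "host": env["MYSQL_HOST"],
        "port": None,  # not provided by the MYSQL_* variables listed above
        "user": env["MYSQL_USER"],
        "password": env["MYSQL_PWD"],
        "database": env["MYSQL_DATABASE"],
    }
```

The resulting dictionary can then be passed to whichever MySQL client library the app uses; the key names here are illustrative and may need renaming to match that library's connect arguments.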

### Connecting to a console

To connect to a `mysql` console for the database, run `flynn mysql console`.
This does not require the MySQL client to be installed locally or
firewall/security changes, as it runs in a container on the Flynn cluster.

### Dumping and restoring

The Flynn CLI provides commands for exporting and restoring database dumps.

`flynn mysql dump` saves a complete copy of the database schema and data to a local file.

```text
$ flynn mysql dump -f latest.dump
60.34 MB 8.77 MB/s
```

The file can be used to restore the database with `flynn mysql restore`. It may
also be imported into a local MySQL database that is not managed by Flynn with
`mysql`:

```text
$ mysql -D mydb < latest.dump
```

`flynn mysql restore` loads a database dump from a local file into a Flynn MySQL
database. Any existing tables and database objects will be dropped before they
are recreated.

```text
$ flynn mysql restore -f latest.dump
62.29 MB / 62.29 MB [===================] 100.00 % 4.96 MB/s
```

The restore command may also be used to restore a dump from a MySQL database
that is not managed by Flynn. Use `mysqldump` to create the dump file:

```text
$ mysqldump mydb > mydb.dump
```

## Safety

This appliance is designed to provide full consistency and partition tolerance
for all operations that are committed to the binlog. However, the semi-sync
replication configuration is not as well tested as our Postgres appliance, so
we do not have full confidence in the system yet.

There is currently no support for tuning, and data transfer during recovery is
not optimized, so we do not recommend using the appliance for applications that
have high throughput or many records.
20 changes: 12 additions & 8 deletions docs/content/operating.html.md
@@ -24,15 +24,19 @@ necessary.

Each host should have a minimum of 1GB of memory, and inter-host network packets
should have a latency of less than 2ms. Deploying a single Flynn cluster across
higher latency WAN links is not recommended, as it can have a negative impact on
cluster consensus.
higher latency WAN links is not recommended, as it can have a significant impact
on the stability of cluster consensus.

## Storage

Flynn uses ZFS to store data. By default, a ZFS pool is created in a sparse file
on top of the existing filesystem at `/var/lib/flynn/volumes`. This is not
recommended for production for performance and reliability reasons. Before
starting the `flynn-host` daemon, you can create a ZFS pool named
on top of the existing filesystem at `/var/lib/flynn/volumes`. We don't
recommend keeping this configuration in production, as it is not as reliable as
dedicating whole disks to the ZFS pool.

### Custom ZFS pool

Before starting the `flynn-host` daemon, you can create a ZFS pool named
`flynn-default` and it will be used instead of a sparse file.

```text
@@ -44,10 +48,10 @@ off of the sparse file by first attaching your disk as a mirror, then detaching
the sparse file after it has been replicated to the new disk:

```text
# Attach /dev/sdb1 to the flynn-default ZFS pool
# Attach /dev/sdb1 (specify your disk instead of sdb1) to the flynn-default ZFS pool
$ sudo zpool attach flynn-default /var/lib/flynn/volumes/zfs/vdev/flynn-default-zpool.vdev /dev/sdb1

# Check the replication status
# Wait for the resilver to copy all data onto the newly added disk
$ sudo zpool status flynn-default
pool: flynn-default
state: ONLINE
@@ -195,7 +199,7 @@ Upstart manages the `flynn-host` daemon and stores the log at

The `flynn-host collect-debug-info` command will collect information about the
system it is run on along with recent logs from all apps and the `flynn-host`
daemon. By default it uploads these logs to [Github's
daemon. By default it uploads these logs to [GitHub's
Gist](https://gist.github.com) service, but they can also be saved to a local
tarball with the `--tarball` flag.

55 changes: 9 additions & 46 deletions docs/content/postgres.html.md
@@ -146,50 +146,13 @@ default:
| swedish\_stem | snowball stemmer for swedish language |
| turkish\_stem | snowball stemmer for turkish language |

## Safety

## Design

The Flynn Postgres appliance was designed with a few goals in mind:

1. Acknowledged writes must not be lost and must be consistent.
1. Network partitions must be tolerated without corrupting data. There should be
no potential for split-brain or other data-mangling failures.
1. When a failure occurs, the appliance should transition into an available
configuration without operator intervention if it can do so safely.

The resulting system can be described as a 'CP' system under [CAP
theorem](https://en.wikipedia.org/wiki/CAP_theorem). The appliance is a cluster
of three or more Postgres instances where:

- One member of the cluster, the _primary_, serves consistent reads and writes.
- The primary has synchronous replication to a single member called the _sync_.
Write transactions are not acknowledged to client until they have been added
to the sync's transaction log.
- Replicating from the sync is a daisy chain of one or more _async_ instances,
which replicate changes asynchronously from their upstream link in the chain.
- If possible, the system automatically reconfigures itself after failures to
maximize uptime and never lose data.

In the face of an arbitrary failure or maintenance action, the cluster can
temporarily lose the ability to handle writes and consistent reads. Eventually
consistent reads are always available from the sync and async instances.

If the primary fails, the sync sees this and promotes itself to primary,
converting the async replicating from it to the new sync. Writes are not
accepted until the new sync has caught up. A variety of safety conditions are in
place so that a promotion will never cause writes to be lost or split brain to
occur.

The built-in Postgres streaming write ahead log replication features are wrapped
in a cluster state machine to achieve this. The cluster state is maintained by
the primary and stored in discoverd. The discoverd DNS and HTTP APIs expose the
current primary instance.

This design is heavily based on the prior work done by Joyent on the [Manatee
state machine](https://github.com/joyent/manatee-state-machine).

Flynn comes with a cluster configured with three instances by default. If an
instance fails, the scheduler will create a new instance and the cluster will be
reconfigured by the primary without operator intervention. When a user runs
`flynn resource add postgres`, a new user and database is created on the default
cluster.
This appliance is designed to provide full consistency and partition tolerance
for all operations that are committed to the write-ahead log (WAL). Note that
this guarantee does not apply to advisory locks, as they are specific to the
server on which they are acquired and are not persisted to the WAL.

There is currently no support for tuning, and data transfer during recovery is
not optimized, so we do not recommend using the appliance for applications that
have high throughput or many records.
13 changes: 9 additions & 4 deletions docs/content/redis.html.md
@@ -5,10 +5,9 @@ layout: docs

# Redis

The Flynn Redis appliance provides Redis 2.8 in a single process configuration.
The data for this process is ephemeral and is intended for caching and
development usage.

The Flynn Redis appliance provides Redis 3.0 in a single process configuration.
The data stored in this process is ephemeral and is intended for caching and
development use.

## Usage

@@ -38,3 +37,9 @@ by some libraries to configure connections.
To connect to a console for the database, run `flynn redis redis-cli`. This does
not require the Redis client to be installed locally or firewall/security
changes, as it runs in a container on the Flynn cluster.

## Safety

No safety or availability guarantees are currently provided for the Redis
appliance. Data loss and inconsistency are likely. Any data stored should be
treated as ephemeral and only used for caching, development, and testing.
8 changes: 4 additions & 4 deletions docs/content/stability.md
@@ -34,7 +34,7 @@ Nightly updates will include all the bleeding edge changes that have just been
merged into Flynn. These changes have all passed code review and our CI system,
but may not be fully tested in "the real world".

The stable channel is currently released weekly and changes have had more time
The stable channel is currently released monthly and changes have had more time
to stabilize. We can't guarantee that these releases will be free of bugs or
unexpected behavior, but our standards are and will continue to be high. It's
important to us that users feel they can trust us and the systems we build, and
@@ -45,6 +45,6 @@ train system used by major web browsers.
Flynn currently has some [security considerations](/docs/security) that you
should take into account when evaluating it.

Currently we do not recommend using the built-in Postgres appliance for
databases with high write volume or a large amount of data as it is not
yet optimized for this.
Currently we do not recommend using the built-in database appliances for
databases with high write volume or a large amount of data as they are not yet
optimized for demanding use cases.
6 changes: 5 additions & 1 deletion docs/docs-nav.json
@@ -15,7 +15,11 @@
{ "title": "How To Deploy Python", "path": "/docs/how-to-deploy-python" },
{ "title": "How To Deploy Ruby", "path": "/docs/how-to-deploy-ruby" }
] },
{ "title": "Databases", "path": "/docs/postgres" },
{ "title": "Databases", "path": "/docs/databases", "children": [
{ "title": "PostgreSQL", "path": "/docs/postgres" },
{ "title": "MySQL", "path": "/docs/mysql" },
{ "title": "Redis", "path": "/docs/redis" },
] },
{ "title": "Stability", "path": "/docs/stability" },
{ "title": "Security", "path": "/docs/security" },
{ "title": "Contributing", "path": "/docs/contributing" },