Minimal Database Replication API

Mark Story edited this page May 16, 2022 · 1 revision

With a primary and replica database configuration, choosing which node a query runs on is currently tedious. Instead, we should offer a higher-level replication feature that lets application developers easily select which node an individual query runs on, and which node a model uses by default.

Configuration

The replication API starts in the configuration APIs. Connection configuration will include new keys:

$config = [
    'Datasources' => [
        'default' => [
            'driver' => MysqlDriver::class,
        ],
        'replica-1' => [
            'driver' => MysqlDriver::class,
            // Name of the primary connection this replica follows.
            'primary' => 'default',
        ],
    ],
];

The primary configuration key is new and defines the primary for a replica. This allows multiple replicas to be defined for a primary and will enable us to provide APIs for choosing a replica as well. By only defining the primary in each replica we avoid duplication in the configuration API.
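Because each replica only names its primary, the reverse lookup (primary to replicas) has to be derived from the configuration. The following is a minimal sketch of that derivation, assuming the configuration shape described above. `buildReplicaMap()` is a hypothetical helper used for illustration, not an existing framework function, and the driver values are placeholder strings.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: group replica connection names by their primary.
 *
 * @param array<string, array<string, mixed>> $datasources The 'Datasources' config.
 * @return array<string, list<string>> Map of primary name => replica names.
 */
function buildReplicaMap(array $datasources): array
{
    $map = [];
    foreach ($datasources as $name => $config) {
        // Connections without a 'primary' key are not replicas.
        if (!isset($config['primary'])) {
            continue;
        }
        $primary = $config['primary'];
        if ($primary === $name || !isset($datasources[$primary])) {
            throw new InvalidArgumentException(
                "Replica '{$name}' has an invalid primary '{$primary}'."
            );
        }
        $map[$primary][] = $name;
    }

    return $map;
}

$map = buildReplicaMap([
    'default' => ['driver' => 'MysqlDriver'],
    'replica-1' => ['driver' => 'MysqlDriver', 'primary' => 'default'],
    'replica-2' => ['driver' => 'MysqlDriver', 'primary' => 'default'],
]);
// $map is ['default' => ['replica-1', 'replica-2']]
```

A replica that names itself or an undefined connection as its primary is rejected early, so misconfiguration surfaces when the configuration is loaded rather than when a query runs.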

Choosing a Table's default node

A table can opt into using a replica by default by overriding defaultConnectionName(). That is already the existing convention for models.
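As an illustration, a table class that reads from a replica by default might look like the sketch below. It relies on the existing static `Table::defaultConnectionName()` hook in the CakePHP ORM. The `ReportsTable` class and the `replica-1` connection name are assumptions made for the example.

```php
<?php
declare(strict_types=1);

namespace App\Model\Table;

use Cake\ORM\Table;

// Hypothetical table that should read from a replica unless told otherwise.
class ReportsTable extends Table
{
    /**
     * Name of the connection this table uses by default.
     * 'replica-1' is assumed to be defined under 'Datasources'.
     */
    public static function defaultConnectionName(): string
    {
        return 'replica-1';
    }
}
```

Queries that must hit the primary, such as reads that immediately follow a write, would then opt out per query.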

Switching replicas for a single query

For any given find operation or query the node can be chosen using methods on the query builder.

// Switch to using the replica for the query.
$query->useReplica();

// Switch to using the primary for the query.
$query->usePrimary();

This is the most ergonomic API that I can think of. The internals are a bit more complicated, but I don't think it is egregious complexity for us to take on as a framework. I think the steps we can use for this are:

  1. Before a query is executed, the query's node selection is checked. If the query has a node selection, the selection is resolved to a connection name.
  2. Connection resolution calls either Table::chooseReplica($connection) or Table::choosePrimary($connection), depending on which mode was selected in the query. The default implementation will use new methods on Connection that provide the default behavior, backed by an internal replica lookup table that we will need to build as configuration is attached.
  3. If there are multiple replicas for a primary, a simple round robin will be used. If more precise logic is required, that will need to be handled in application logic.
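
Steps 1 and 2 could be sketched roughly as follows. This is an illustration under assumptions, not the proposed implementation: the `NODE_*` constants and `resolveConnectionName()` are hypothetical names, connections are represented by name only, and the two callables stand in for `Table::chooseReplica()` and `Table::choosePrimary()`.

```php
<?php
declare(strict_types=1);

// Hypothetical markers for the node a query asked for; null means no selection.
const NODE_REPLICA = 'replica';
const NODE_PRIMARY = 'primary';

/**
 * Resolve the connection name a query should run on.
 *
 * @param string|null $selection Node selected on the query, if any.
 * @param string $default The table's default connection name.
 * @param callable(string): string $chooseReplica Stand-in for Table::chooseReplica().
 * @param callable(string): string $choosePrimary Stand-in for Table::choosePrimary().
 */
function resolveConnectionName(
    ?string $selection,
    string $default,
    callable $chooseReplica,
    callable $choosePrimary
): string {
    return match ($selection) {
        NODE_REPLICA => $chooseReplica($default),
        NODE_PRIMARY => $choosePrimary($default),
        // No selection: the query keeps the table's default connection.
        default => $default,
    };
}

// Fixed answers stand in for the real replica lookup.
$toReplica = fn (string $name): string => 'replica-1';
$toPrimary = fn (string $name): string => 'default';

echo resolveConnectionName(NODE_REPLICA, 'default', $toReplica, $toPrimary), "\n"; // replica-1
echo resolveConnectionName(NODE_PRIMARY, 'replica-1', $toReplica, $toPrimary), "\n"; // default
echo resolveConnectionName(null, 'default', $toReplica, $toPrimary), "\n"; // default
```

Routing the choice through overridable table methods is what would let an application replace the round robin with its own selection logic.
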
// New connection APIs

// Will accept either a replica or primary name.
// The next replica in round robin will be used.
// If the connection is not in a replica set,
// then an exception would be raised.
$replica = ConnectionManager::getReplica($connectionName);

// Will accept a replica or primary name.
// The primary will be returned.
// If the connection is not in a replica set,
// then an exception would be raised.
$primary = ConnectionManager::getPrimary($connectionName);
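
To make the intended semantics concrete, here is a small standalone sketch of the round-robin lookup. `ReplicaSet` is a hypothetical stand-in for the proposed `ConnectionManager` methods. It returns connection names rather than connection objects, and the choice of `RuntimeException` is an assumption, since the proposal does not name an exception type.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for the proposed getReplica()/getPrimary() lookups.
 */
final class ReplicaSet
{
    /** @var array<string, list<string>> primary name => replica names */
    private array $replicas = [];

    /** @var array<string, string> replica name => primary name */
    private array $primaries = [];

    /** @var array<string, int> primary name => index of the next replica */
    private array $cursor = [];

    /** @param array<string, list<string>> $replicas primary name => replica names */
    public function __construct(array $replicas)
    {
        foreach ($replicas as $primary => $names) {
            if ($names === []) {
                continue; // A primary without replicas is not a replica set.
            }
            $this->replicas[$primary] = array_values($names);
            $this->cursor[$primary] = 0;
            foreach ($names as $name) {
                $this->primaries[$name] = $primary;
            }
        }
    }

    /** Accepts a replica or primary name and returns the primary name. */
    public function getPrimary(string $name): string
    {
        if (isset($this->replicas[$name])) {
            return $name;
        }
        if (isset($this->primaries[$name])) {
            return $this->primaries[$name];
        }
        throw new RuntimeException("Connection '{$name}' is not in a replica set.");
    }

    /** Accepts a replica or primary name and returns the next replica in rotation. */
    public function getReplica(string $name): string
    {
        $primary = $this->getPrimary($name);
        $names = $this->replicas[$primary];
        $index = $this->cursor[$primary];
        // Advance the cursor, wrapping around at the end of the list.
        $this->cursor[$primary] = ($index + 1) % count($names);

        return $names[$index];
    }
}

$set = new ReplicaSet(['default' => ['replica-1', 'replica-2']]);
echo $set->getReplica('default'), "\n";   // replica-1
echo $set->getReplica('replica-1'), "\n"; // replica-2
echo $set->getReplica('default'), "\n";   // replica-1
echo $set->getPrimary('replica-2'), "\n"; // default
```

The rotation cursor is kept per primary, so passing a replica name to `getReplica()` advances the same rotation as passing its primary.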