This directory defines the Taskcluster database:
- Stored Functions -- list of stored functions defined by this package
versions/
-- the migrations that create the most-recent database schematables.yml
-- tables in the DB (for internal use only!)access.yml
-- definitions for which services can access which tables
test/
-- tests for the contents of this directorysrc/
-- implementation of the JS interface to the DB
The database schema is handled by taskcluster-lib-postgres.
Each database version is defined in db/versions/####.yml
, numbered sequentially, as decribed in that library's documentation.
It's not permitted to change an existing version file (db/versions/*.yml
) [*].
Instead, any change to the DB must be made by adding a new version.
This allows deployments of Taskcluster to follow those changes smoothly.
[*] There are a few exceptions: fixing bugs in a version that has not yet been included in a Taskcluster release, and updating stored-function descriptions.
A version file has a migrationScript
which performs the change to the database.
This can use any Postgres functionality required to make the change.
In some cases, that's as simple as CREATE TABLE
or ALTER TABLE
, but can also involve temporary tables and complex data manipulation.
The script runs in a single transaction.
The following checklist summarizes what needs to be written to modify the database.
- new version file in
db/versions
that updates all impacted stored functions - any necessary updates in
db/tables.yml
anddb/access.yml
- new test script in
db/test/versions
- test forward migration
- test downgrade
- test migration after downgrade (ensuring downgrade doesn't leave stray Postgres resources around)
- for any new stored functions:
- tests for new functions in
db/test/fns
- tests for new functions in
This script should also update database permissions as necessary.
The username prefix is substituted for $db_user_prefix$
, so permissions can be managed with statements like
grant select on table newtable to $db_user_prefix$_someservice;
As a safety check, the upgrade machinery will confirm after an upgrade is complete that the permissions in the database match those in db/access.yml
.
Each version file should redefine any stored functions that are affected by the schema changes, and define any newly-required functions. Unchanged functions can be omitted. A function's signature (argument and return types) cannot change from version to version. Instead, define a new function with a different name.
For example, if get_widget(widgetId text) returns table (widgetWidth integer)
must be extended to also return a widget height, define a new get_widget_with_height
method.
This approach leaves the existing method in place so that older code can continue to use it.
When a method no longer makes sense (for example, when a feature is removed), redefine the method to return an empty table or default value, as appropriate.
For example, if support for widgets is removed, get_widget
should be redefined to always return an empty table.
Every version should have tests defined in db/tests/versions/
.
These tests should exercise all of the functionality of the migration script, and verify that everything is as expected.
These tests should be very thorough, as broken migration scripts cannot be easily fixed once they are included in a Taskcluster release.
Ensure that they cover every conceivable circumstance, especially if they modify existing data.
Tests typically take the form
test('...', async function() {
await helper.upgradeTo(16); // upgrade to the previous version
// insert some data
await helper.withDbClient(async client => {
..
});
await helper.upgradeTo(17); // upgrade to this version
// assert that things were migrated properly
await helper.withDbClient(async client => {
..
});
});
Tests for stored functions should be in db/tests/fns/<service-name>_test.js
.
These tests serve as unit tests for the stored functions, and also help to ensure that modifications to the stored functions do not unexpectedly change their behavior.
In most cases, existing tests for a stored function should continue to pass without modification even when a new DB version modifies the function implementation.
There are exceptions to this rule, but reviewers should carefully scrutinize any such changes to ensure they will not break compatibility.
The taskcluster-db
package exports an async setup
function which is intended to be used in services' main.js
:
const tcdb = require('taskcluster-db');
// ...
db: {
requires: ['cfg'],
setup: ({cfg}) => tcdb.setup({
readDbUrl: cfg.db.readDbUrl,
writeDbUrl: cfg.db.writeDbUrl,
serviceName: 'queue',
}),
}
The result is a taskcluster-lib-postgres Database instance all set up and ready to use. This uses the generated schema by default.
Similarly, the upgrade
method will upgrade a database to the current version and set up table permissions for per-service postgres users.
To upgrade to a specific version, pass toVersion: <number>
.
This functionality is typically used in tests, as in production deployments the deployers will run yarn db:upgrade
.
const tcdb = require('taskcluster-db');
setup('upgrade db', async function() {
await tcdb.upgrade({
adminDbUrl: process.env.TEST_DB_URL,
usernamePrefix: 'test',
});
});
Finally, to get the current Schema instance, call tcdb.schema({})
.
All of these functions take an optional useDbDirectory: true
option to indicate that they should read from the YAML files under db/
instead of using the serialized format.
This approach is slower, but is appropriate for testing.
To test this library, you will need a Postgres database, running the latest release of Postgres 11. The easiest and best way to do this is to use docker, as described in the Development Process docs.