Commit: Unify all alembic scripts into single directory, creating single history (#7411)

* Unify all alembic scripts into single directory, creating single history
* merge the alembic env script; will cause all metadatas to be marked
* fix scripts
* fix manifest, guard sqlite upgrades
* migration engine
* fix logging overrides
* allow overrides for ursula
* add readme
* fix alembic script location for ursula
* fix pg connection, used by internal
Showing 109 changed files with 531 additions and 929 deletions.
`python_modules/dagster/dagster/core/storage/alembic/README.md` (114 additions, 0 deletions)
# Storage migrations

We use alembic (https://alembic.sqlalchemy.org/en/latest/) to manage the schema migrations for our storage. This adds a directory of migration scripts to your repo and an alembic_version table to your database to keep track of which migration scripts have been applied.

## Adding a schema migration

Migrations are only required when you are altering the schema of an existing table (adding/removing a column, adding an index, etc).

To add a SQL schema migration, follow these steps:

1. Change the schema definition.
   - e.g. add a column `foo` to `RunsTable` in `dagster.core.storage.runs.schema`
1. Add an alembic migration script. You'll typically use a specific engine (e.g. sqlite) to create the script, but the migration will apply to all storage implementations.
   - `cd python_modules/dagster/dagster/core/storage/runs/sqlite/alembic; alembic revision -m 'add column foo'`
1. Fill in the upgrade/downgrade parts of the migration using alembic operations: `op.add_column('runs', db.Column('foo', db.String))`
1. Make sure that any storage-specific changes are guarded (e.g. if this should only apply to run storage, then do a `runs` table existence check).
1. Make sure that any dialect-specific changes are guarded (e.g. if this should only apply to MySQL, then wrap in a conditional).

Users should be prompted to manually run migrations using `dagster instance migrate`.
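The guard pattern in the last two steps boils down to an existence check before any `ALTER TABLE`. A minimal stdlib `sqlite3` sketch of the idea (the `runs` table shape and `foo` column are hypothetical, and `has_table`/`has_column` here are simplified stand-ins for the helpers in `dagster.core.storage.migration.utils`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id INTEGER PRIMARY KEY, run_id TEXT)")


def has_table(conn, name):
    # sqlite_master lists every table in a sqlite database
    row = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def has_column(conn, table, column):
    # PRAGMA table_info returns one row per column; index 1 is the column name
    return column in [r[1] for r in conn.execute("PRAGMA table_info(%s)" % table)]


# Guarded upgrade: only touch databases that actually contain a runs table,
# and never add the column twice if the migration is re-run.
if has_table(conn, "runs") and not has_column(conn, "runs", "foo"):
    conn.execute("ALTER TABLE runs ADD COLUMN foo TEXT")
```

In a real migration script the same checks wrap `op.add_column`, so the script is a no-op against, say, an event-log database that has no `runs` table.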
## Testing a schema migration

For schema migrations, we test migration behavior in sqlite / postgres / mysql.

### sqlite

Migration tests for sqlite can be found here: `python_modules/dagster/dagster_tests/general_tests/compat_tests/test_back_compat.py`

To add a new back-compat test for sqlite, follow these steps:

1. Switch code branches to master or some revision before you've added the schema change.
1. Change your dagster.yaml to use the default sqlite implementation for run/event_log storage.
1. Make sure your configured storage directory (e.g. `$DAGSTER_HOME/history`) is wiped.
1. Start dagit and execute a pipeline run, to ensure that both the run db and per-run event_log dbs are created.
1. Copy the runs.db and all per-run event log dbs to the back compat test directory:
   - `mkdir python_modules/dagster/dagster_tests/general_tests/compat_tests/<my_schema_change>/sqlite/history`
   - `cp $DAGSTER_HOME/history/runs.db* python_modules/dagster/dagster_tests/general_tests/compat_tests/<my_schema_change>/sqlite/history/`
   - `cp -R $DAGSTER_HOME/history/runs python_modules/dagster/dagster_tests/general_tests/compat_tests/<my_schema_change>/sqlite/history/`
1. Write your back compat test, loading your snapshot directory.
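The shape of such a test can be sketched with the stdlib alone: copy the committed snapshot into a scratch directory (so migrations never mutate the fixture), then inspect the `alembic_version` table of the copy. The snapshot contents below are fabricated inline for illustration; the revision id is just an example value:

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path

# Stand-in for the committed snapshot directory (in the real test this is
# .../compat_tests/<my_schema_change>/sqlite/history, checked into the repo).
snapshot = Path(tempfile.mkdtemp()) / "history"
snapshot.mkdir()
db = sqlite3.connect(str(snapshot / "runs.db"))
db.execute("CREATE TABLE alembic_version (version_num TEXT)")
db.execute("INSERT INTO alembic_version VALUES ('3b1e175a2be3')")
db.commit()
db.close()

# Copy the snapshot into a scratch location before doing anything to it,
# so the committed fixture stays pinned at the old schema revision.
workdir = Path(tempfile.mkdtemp()) / "history"
shutil.copytree(snapshot, workdir)

conn = sqlite3.connect(str(workdir / "runs.db"))
(version,) = conn.execute("SELECT version_num FROM alembic_version").fetchone()
# The copy still reports the old revision; the test would now run the
# migration against it and assert on the new schema.
```

The real tests layer dagster instance setup on top of this copy-then-migrate skeleton.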
### postgres

Migration tests for postgres can be found here: `python_modules/libraries/dagster-postgres/dagster_postgres_tests/compat_tests/test_back_compat.py`

To add a new back-compat test for postgres, follow these steps:

1. Switch code branches to master or some revision before you've added the schema change.
1. Change your dagster.yaml to use a wiped postgres storage configuration:
   ```
   event_log_storage:
     module: dagster_postgres.event_log
     class: PostgresEventLogStorage
     config:
       postgres_url: "postgresql://test:test@localhost:5432/test"
   run_storage:
     module: dagster_postgres.run_storage
     class: PostgresRunStorage
     config:
       postgres_url: "postgresql://test:test@localhost:5432/test"
   schedule_storage:
     module: dagster_postgres.schedule_storage
     class: PostgresScheduleStorage
     config:
       postgres_url: "postgresql://test:test@localhost:5432/test"
   ```
1. Wipe the instance, if you haven't already: `dagster run wipe`
1. Start dagit and execute a pipeline run, to ensure that both the run db and per-run event_log dbs are created.
1. Create a pg dump file:
   - `mkdir python_modules/libraries/dagster-postgres/dagster_postgres_tests/compat_tests/<my_schema_change>/postgres`
   - `pg_dump test > python_modules/libraries/dagster-postgres/dagster_postgres_tests/compat_tests/<my_schema_change>/postgres/pg_dump.txt`
1. Write your back compat test, loading your snapshot directory.
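On the test side, the committed dump has to be restored into a scratch postgres database before the migration is exercised. The restore itself is plain postgres tooling; a rough sketch, assuming a local server with the `test` database and credentials configured above:

```shell
# Recreate the target database from scratch, then load the committed dump.
dropdb --if-exists test
createdb test
psql test < python_modules/libraries/dagster-postgres/dagster_postgres_tests/compat_tests/<my_schema_change>/postgres/pg_dump.txt

# With dagster.yaml pointing at this database, run the migration under test.
dagster instance migrate
```

This requires a running postgres server, so it is a sketch of the sequence rather than something runnable in isolation.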
### mysql

Migration tests for mysql can be found here: `python_modules/libraries/dagster-mysql/dagster_mysql_tests/compat_tests/test_back_compat.py`

To add a new back-compat test for mysql, follow these steps:

1. Switch code branches to master or some revision before you've added the schema change.
1. Change your dagster.yaml to use a wiped mysql storage configuration:
   ```
   event_log_storage:
     module: dagster_mysql.event_log
     class: MySQLEventLogStorage
     config:
       mysql_url: "mysql+mysqlconnector://test:test@localhost:3306/test"
   run_storage:
     module: dagster_mysql.run_storage
     class: MySQLRunStorage
     config:
       mysql_url: "mysql+mysqlconnector://test:test@localhost:3306/test"
   schedule_storage:
     module: dagster_mysql.schedule_storage
     class: MySQLScheduleStorage
     config:
       mysql_url: "mysql+mysqlconnector://test:test@localhost:3306/test"
   ```
1. Wipe the instance, if you haven't already: `dagster run wipe`
1. Start dagit and execute a pipeline run, to ensure that both the run db and per-run event_log dbs are created.
1. Create a mysql dump file:
   - `mkdir python_modules/libraries/dagster-mysql/dagster_mysql_tests/compat_tests/<my_schema_change>/mysql`
   - `mysqldump -p test > python_modules/libraries/dagster-mysql/dagster_mysql_tests/compat_tests/<my_schema_change>/mysql/mysql_dump.sql`
1. Write your back compat test, loading your snapshot directory.
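As with postgres, the test needs the committed dump restored into a fresh database first. A sketch of the restore with standard mysql client tools, assuming the local server and `test` database configured above:

```shell
# Recreate the target database, then replay the committed dump into it.
mysql -u test -p -e "DROP DATABASE IF EXISTS test; CREATE DATABASE test;"
mysql -u test -p test < python_modules/libraries/dagster-mysql/dagster_mysql_tests/compat_tests/<my_schema_change>/mysql/mysql_dump.sql

# With dagster.yaml pointing at this database, run the migration under test.
dagster instance migrate
```

This requires a running mysql server, so treat it as the outline of the sequence rather than a runnable script.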
## Adding a data migration

Generally we do not want to force users to run data migrations, especially over the event log, which might be extremely large and therefore expensive to migrate.

For secondary index tables (e.g. tables derived from the event_log), you can write your own custom data migration script and mark the status of the migration in the secondary_indexes table. This allows you to write guards in your EventLogStorage class that optionally read from the event log or from the secondary index table, depending on the status of the migration.

See `EventLogStorage.has_secondary_index` and `EventLogStorage.enable_secondary_index` for more.

Users should be prompted to manually run data migrations using `dagster instance reindex`.
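The guard pattern above can be sketched in miniature. Everything here is illustrative (in-memory stand-ins for the event log, the derived table, and the secondary_indexes table) except the method names `has_secondary_index`/`enable_secondary_index`, which mirror the ones mentioned in the source:

```python
ASSET_KEY_INDEX = "asset_keys"  # hypothetical data-migration name


class EventLogStorage:
    def __init__(self):
        self._event_log = []           # source of truth (potentially huge)
        self._asset_index = set()      # derived secondary-index table
        self._enabled_indexes = set()  # rows in the secondary_indexes table

    def has_secondary_index(self, name):
        return name in self._enabled_indexes

    def enable_secondary_index(self, name):
        # Called once the backfill of the derived table has completed.
        self._enabled_indexes.add(name)

    def reindex(self):
        # The custom data migration (run via `dagster instance reindex`):
        # backfill the derived table from the event log, then mark it done.
        for event in self._event_log:
            if event.get("asset_key"):
                self._asset_index.add(event["asset_key"])
        self.enable_secondary_index(ASSET_KEY_INDEX)

    def all_asset_keys(self):
        # The guard: fast path through the index once the migration has run,
        # otherwise fall back to a full scan of the event log.
        if self.has_secondary_index(ASSET_KEY_INDEX):
            return self._asset_index
        return {e["asset_key"] for e in self._event_log if e.get("asset_key")}
```

The point of the guard is that reads stay correct whether or not the user has run the reindex yet; the migration only changes which code path serves them.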
`.../sqlite/alembic/versions/da7cd32b690d_.py` → `.../alembic/versions/001_initial_schedule.py` (2 additions, 2 deletions)
`...modules/dagster/dagster/core/storage/alembic/versions/004_add_snapshots_to_run_storage.py` (80 additions, 0 deletions)
```python
"""add snapshots to run storage

Revision ID: c63a27054f08
Revises: 3b1e175a2be3
Create Date: 2020-04-09 05:57:20.639458

"""
import sqlalchemy as sa
from alembic import op
from sqlalchemy.engine import reflection

from dagster.core.storage.migration.utils import has_column, has_table

# alembic magic breaks pylint
# pylint: disable=no-member

# revision identifiers, used by Alembic.
revision = "c63a27054f08"
down_revision = "3b1e175a2be3"
branch_labels = None
depends_on = None


def upgrade():
    bind = op.get_context().bind
    inspector = reflection.Inspector.from_engine(bind)

    if not has_table("runs"):
        return

    if not has_table("snapshots"):
        op.create_table(
            "snapshots",
            sa.Column("id", sa.Integer, primary_key=True, autoincrement=True, nullable=False),
            sa.Column("snapshot_id", sa.String(255), unique=True, nullable=False),
            sa.Column("snapshot_body", sa.LargeBinary, nullable=False),
            sa.Column("snapshot_type", sa.String(63), nullable=False),
        )

    if not has_column("runs", "snapshot_id"):
        if "sqlite" in inspector.dialect.dialect_description:
            # Sqlite does not support adding foreign keys to existing
            # tables, so we are forced to fallback on this witchcraft.
            # See https://alembic.sqlalchemy.org/en/latest/batch.html#dealing-with-referencing-foreign-keys
            # for additional context
            with op.batch_alter_table("runs") as batch_op:
                batch_op.execute("PRAGMA foreign_keys = OFF;")
                batch_op.add_column(
                    sa.Column(
                        "snapshot_id",
                        sa.String(255),
                        sa.ForeignKey(
                            "snapshots.snapshot_id", name="fk_runs_snapshot_id_snapshots_snapshot_id"
                        ),
                    ),
                )
            op.execute("PRAGMA foreign_keys = ON;")
        else:
            op.add_column(
                "runs",
                sa.Column("snapshot_id", sa.String(255), sa.ForeignKey("snapshots.snapshot_id")),
            )


def downgrade():
    bind = op.get_context().bind
    inspector = reflection.Inspector.from_engine(bind)

    if not has_table("runs"):
        return

    if has_column("runs", "snapshot_id"):
        if "sqlite" in inspector.dialect.dialect_description:
            with op.batch_alter_table("runs") as batch_op:
                batch_op.drop_column("snapshot_id")
        else:
            op.drop_column("runs", "snapshot_id")

    if has_table("snapshots"):
        op.drop_table("snapshots")
```