Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in SqlAlchemy migrations #4602

Merged
merged 2 commits into from
Dec 2, 2020

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Nov 30, 2020

Fixes #4590

SqlAlchemy: improve the alembic migration code

The code in aiida.backends.sqlalchemy.manager is used, among other
things, to control the database migrations. It can be used to get info
like the current revision of the database, current revision of the code
and to actually migrate the database. The migrations are performed
through alembic. This library has the somewhat weird approach of
loading the env.py file, which then actually calls the migrations.
This file somehow needs to get the configuration of the environment, for
example the connection to the database, and it does so by loading it
from the config that is imported. This config is actually built
dynamically in the SqlBackendManager.

The old implementation was correctly building up the configuration
object but was then calling the CLI commands of alembic to perform the
various operations, which was unnecessarily complex and inefficient. For
most operations, essentially everything but performing the actual
migrations, one can construct a MigrationContext directly, which
allows to inspect the state of the database, for example to get the
current database revision. For anything related to the revisions in the
code, e.g., to get the current head of the code, we can instantiate an
instance of the ScriptDirectory which allows one to inspect the
available revisions.

Through these changes, the env.py file is no longer loaded everytime,
even when it is not necessary, such as when simply retrieving the
current revision of the database.

SqlAlchemy: fix bug in Group extras migration with revision `0edcddc585917

This migration added the JSONB column extras to the DbGroup model.
Since this needs to be a non-nullable column, but existing group rows
would have a null value, the migration was implemented by first adding
the column as nullable, after which all rows had their value set to the
default, before making the column non-nullable.

This approach was working when the revision was run by itself in an
isolated way, which is how the unit test is run, and so that passed.
However, as soon as it was run with multiple revisions in a sequence the
migration would fail with the message that the db_dbgroup table has
pending event triggers as soon as the final instruction to change the
nullability of the column was executed. The cause for this was never
found.

As an alternative, the migration is adapted to instead create the column
directly as non-nullable, but also provide a server default. This will
also avoid the exception of existing rows with a null value, just as the
original approach, but now when we remove the server default in a second
operation, we no longer hit the problem of the pending trigger events.

@codecov
Copy link

codecov bot commented Nov 30, 2020

Codecov Report

Merging #4602 (40a8e99) into release/1.5.1 (6d8ec0b) will increase coverage by 0.02%.
The diff coverage is 88.10%.

Impacted file tree graph

@@                Coverage Diff                @@
##           release/1.5.1    #4602      +/-   ##
=================================================
+ Coverage          79.49%   79.50%   +0.02%     
=================================================
  Files                482      482              
  Lines              35329    35315      -14     
=================================================
- Hits               28080    28074       -6     
+ Misses              7249     7241       -8     
Flag Coverage Δ
django 73.67% <4.77%> (+0.03%) ⬆️
sqlalchemy 72.84% <88.10%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida/backends/sqlalchemy/manager.py 84.79% <83.34%> (+6.79%) ⬆️
aiida/backends/sqlalchemy/migrations/env.py 81.49% <93.75%> (+0.84%) ⬆️
...migrations/versions/0edcdd5a30f0_dbgroup_extras.py 100.00% <100.00%> (ø)
aiida/engine/daemon/runner.py 79.32% <0.00%> (-3.44%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6d8ec0b...40a8e99. Read the comment docs.

@sphuber
Copy link
Contributor Author

sphuber commented Dec 1, 2020

Thinking about this, I think we need to release this as v1.5.1 and so not merge into develop. @giovannipizzi it would still be good to review and then I can start to prepare the release/1.5.1 branch

@sphuber sphuber changed the base branch from develop to release/1.5.1 December 1, 2020 13:52
Copy link
Member

@giovannipizzi giovannipizzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems all very good to me - even if of course something might always slip, alembic is a big beast :-D

Anyway, I think there might be a simple explanation for the other issue we were seeing - I think PSQL has to build a table of constraints when you put them, and to avoid blocking it will do it "in the background" while you continue operation.
Of course, if you have to use that contraint check, it will fail with the error you see.
There are ways to tell PSQL "Please create all pending constraint tables and then continue", and one could put this between the transactions, but probably this is a better approach.

The code in `aiida.backends.sqlalchemy.manager` is used, among other
things, to control the database migrations. It can be used to get info
like the current revision of the database, current revision of the code
and to actually migrate the database. The migrations are performed
through `alembic`. This library has the somewhat weird approach of
loading the `env.py` file, which then actually calls the migrations.
This file somehow needs to get the configuration of the environment, for
example the connection to the database, and it does so by loading it
from the config that is imported. This config is actually built
dynamically in the `SqlBackendManager`.

The old implementation was correctly building up the configuration
object but was then calling the CLI commands of `alembic` to perform the
various operations, which was unnecessarily complex and inefficient. For
most operations, essentially everything but performing the actual
migrations, one can construct a `MigrationContext` directly, which
allows to inspect the state of the database, for example to get the
current database revision. For anything related to the revisions in the
code, e.g., to get the current head of the code, we can instantiate an
instance of the `ScriptDirectory` which allows one to inspect the
available revisions.

Through these changes, the `env.py` file is no longer loaded everytime,
even when it is not necessary, such as when simply retrieving the
current revision of the database.
…5a30f0`

This migration added the JSONB column `extras` to the `DbGroup` model.
Since this needs to be a non-nullable column, but existing group rows
would have a null value, the migration was implemented by first adding
the column as nullable, after which all rows had their value set to the
default, before making the column non-nullable.

This approach was working when the revision was run by itself in an
isolated way, which is how the unit test is run, and so that passed.
However, as soon as it was run with multiple revisions in a sequence the
migration would fail with the message that the `db_dbgroup` table has
pending event triggers as soon as the final instruction to change the
nullability of the column was executed. The cause for this was never
found.

As an alternative, the migration is adapted to instead create the column
directly as non-nullable, but also provide a server default. This will
also avoid the exception of existing rows with a null value, just as the
original approach, but now when we remove the server default in a second
operation, we no longer hit the problem of the pending trigger events.
@sphuber sphuber merged commit d604916 into aiidateam:release/1.5.1 Dec 2, 2020
@sphuber sphuber deleted the fix/4590/sqla-migrations branch December 2, 2020 11:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential bug in SqlAlchemy migrations
2 participants