Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not suppress sqlmigrate errors #209

Closed
roman-ivanyushchenko opened this issue Mar 5, 2022 · 6 comments · Fixed by #223
Closed

Do not suppress sqlmigrate errors #209

roman-ivanyushchenko opened this issue Mar 5, 2022 · 6 comments · Fixed by #223
Assignees

Comments

@roman-ivanyushchenko
Copy link

Now, if an error occurs while executing the sqlmigrate command, then as a result, the analyzer will be given an empty string instead of migrations SQL. This can cause a backwards incompatible migration to get into production and break it.

def get_sql(self, app_label, migration_name):
logger.info(
"Calling sqlmigrate command {} {}".format(app_label, migration_name)
)
dev_null = open(os.devnull, "w")
try:
sql_statement = call_command(
"sqlmigrate",
app_label,
migration_name,
database=self.database,
stdout=dev_null,
)
except (ValueError, ProgrammingError):
logger.warning(
(
"Error while executing sqlmigrate on (%s, %s). "
"Continuing execution with empty SQL."
),
app_label,
migration_name,
)
sql_statement = ""
return sql_statement.splitlines()

The suggestion is to add a strict check option to require the sqlmigrate command to be run for migration.

@David-Wobrock
Copy link
Collaborator

Hi @roman-ivanyushchenko

Thanks for opening the issue 👍

I think that we catch these exceptions because in some cases, Django is executing a query against the DB in order to generate the SQL statement.
This happens for some database constraints, where, in order to drop them, Django needs to know its current name the DB.
This works well when executing migrations in a sequential order, because the constraint was created before dropping it. However, when linting a migration, the migration creating the constraint wasn't necessarily executed, so the constraint might not exist.
In these cases, the sqlmigrate command fails, but we didn't want to stop the linter from working, but simply ignore it.

However, I understand the point that it could fail in these cases.
Did you encounter a case where the linter should have failed but instead let you put in production a "dangerous" migration?

@David-Wobrock David-Wobrock added the need_info Need more information before we can pursue the fix label Apr 24, 2022
@phillipuniverse
Copy link
Contributor

Did you encounter a case where the linter should have failed but instead let you put in production a "dangerous" migration?

@David-Wobrock I just ran across this problem and exactly this issue happened to me. In our case this let a RenameField sneak through.

This was the problematic migration file contents:

from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        ("someapp", "0306_auto_20220531_1429"),
    ]

    operations = [
        migrations.RenameField(
            model_name="amodel",
            old_name="old_column_name",
            new_name="new_column_name",
        ),
    ]

The problem for us was a kind of wonky migration chain that caused a missing state but perfectly fine SQL execution which is why we didn't catch it. This is the output from sqlmigrate someapp 0306_auto_20220531_1429:

 [2022-06-02T16:28:27.079077+00:00] | ERROR    | [service=backend env=localdev version=ce6e5f0a2a trace_id=0 span_id=0] [__main__] [manage.py:<module>:45] - Could not execute management command ['backend/manage.py', 'sqlmigrate', 'someapp', '0306_auto_20220531_1429']
Traceback (most recent call last):
  File "backend/manage.py", line 43, in <module>
    execute_from_command_line(sys.argv)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/sbackend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
    utility.execute()
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/core/management/__init__.py", line 375, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/core/management/base.py", line 323, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/core/management/commands/sqlmigrate.py", line 30, in execute
    return super().execute(*args, **options)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/core/management/base.py", line 364, in execute
    output = self.handle(*args, **options)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/core/management/commands/sqlmigrate.py", line 64, in handle
    sql_statements = executor.collect_sql(plan)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/db/migrations/executor.py", line 225, in collect_sql
    state = migration.apply(state, schema_editor, collect_sql=True)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/db/migrations/migration.py", line 124, in apply
    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/db/migrations/operations/fields.py", line 358, in database_forwards
    to_model = to_state.apps.get_model(app_label, self.model_name)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/utils/functional.py", line 80, in __get__
    res = instance.__dict__[self.name] = self.func(instance)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/db/migrations/state.py", line 210, in apps
    return StateApps(self.real_apps, self.models)
  File "/Users/phillip/Library/Caches/pypoetry/virtualenvs/backend-cpwchxt0-py3.7/lib/python3.7/site-packages/django/db/migrations/state.py", line 280, in __init__
    raise ValueError("\n".join(error.msg for error in errors))
ValueError: <function GenericRelation.contribute_to_class.<locals>.make_generic_foreign_order_accessors at 0x142815950> contains a lazy reference to auditlog.auditchangerecord, but app 'auditlog' isn't installed.

The output from the linter indicated there was an error but:

  1. It missed context so slightly esoteric to understand what the issue was
  2. It was marked as "valid", instead of something like a "warning"
  3. It was suppressed without a way to say "treat these as actual stop the world errors"
[2022-06-02T15:51:56.438644+00:00] | INFO     | [service=backend env=localdev version=ce6e5f0a2a trace_id=0 span_id=0] [django_migration_linter] [migration_linter.py:get_sql:307] - Calling sqlmigrate command anotherapp 0018_copy_values_from_expiration_minutes_to_expiration_seconds
(rfp_automation, 0018_copy_values_from_expiration_minutes_to_expiration_seconds)... OK
[2022-06-02T15:51:57.108473+00:00] | INFO     | [service=backend env=localdev version=ce6e5f0a2a trace_id=0 span_id=0] [django_migration_linter] [migration_linter.py:get_sql:307] - Calling sqlmigrate command someapp 0306_auto_20220531_1429
[2022-06-02T15:51:57.791838+00:00] | WARNING  | [service=backend env=localdev version=ce6e5f0a2a trace_id=0 span_id=0] [django_migration_linter] [migration_linter.py:get_sql:325] - Error while executing sqlmigrate on (someapp, 0306_auto_20220531_1429). Continuing execution with empty SQL.
(shipments, 0306_auto_20220531_1429)... OK
[2022-06-02T15:51:57.793985+00:00] | INFO     | [service=backend env=localdev version=ce6e5f0a2a trace_id=0 span_id=0] [django_migration_linter] [migration_linter.py:get_sql:307] - Calling sqlmigrate command someapp 0307_auto_20220530_1517
[2022-06-02T15:51:58.341416+00:00] | WARNING  | [service=backend env=localdev version=ce6e5f0a2a trace_id=0 span_id=0] [django_migration_linter] [migration_linter.py:get_sql:325] - Error while executing sqlmigrate on (someapp, 0307_auto_20220530_1517). Continuing execution with empty SQL.
(someapp, 0307_auto_20220530_1517)... OK
*** Summary ***
Valid migrations: 3/3
Erroneous migrations: 0/3
Migrations with warnings: 0/3
Ignored migrations: 0/3

The fix for us was to change one of other problematic migrations (the one that had the auditlog reference) to depend on a different migration. This issue is perhaps slightly exacerbated by the fact that we do not run our migrations from scratch regularly. Our app is ~4 years old and has literally thousands of migrations, it's not practical for us to run through all of them in a CI/CD process. And obviously if we are applying them against our existing environments, the migrations have already been executed there so state issues like this would not crop up.

Once I fixed the state issue, I got the error on the migration file that I expected:

[2022-06-02T16:33:04.216001+00:00] | INFO     | [service=backend-core env=localdev version=ce6e5f0a2a trace_id=0 span_id=0] [django_migration_linter] [migration_linter.py:get_sql:307] - Calling sqlmigrate command someapp 0307_auto_20220530_1517
(someapp, 0307_auto_20220530_1517)... ERR
	RENAMING columns (table: someapp_amodel, column: old_column_name)
*** Summary ***
Valid migrations: 2/3
Erroneous migrations: 1/3
Migrations with warnings: 0/3
Ignored migrations: 0/3

This happens for some database constraints, where, in order to drop them, Django needs to know its current name the DB.

This definitely clears up why the suppression was added in the first place, thanks for this context! Perhaps it would be beneficial to try to whitelist these specific cases and have everything else bubble up the exception and error the linter?

@David-Wobrock
Copy link
Collaborator

That's a very interesting use case! Thanks for sharing that :)

After a bit of reflection, this is definitely a case we want to avoid.
We want the linter to be reliable, and I'd prefer it to be over cautious and warn the user that the migration could be problematic, but it can't say for sure, than to silently consider the migration as valid.
I'll write a patch to change this, and this will probably mark the next major release for the linter :)

@David-Wobrock David-Wobrock self-assigned this Jul 10, 2022
@David-Wobrock David-Wobrock added enhancement and removed need_info Need more information before we can pursue the fix labels Jul 10, 2022
@David-Wobrock
Copy link
Collaborator

The next question would be, what should happen when the linter cannot analyse a migration?

  1. it crashes, meaning that the user has to either fix it, or manually ignore the migration
  2. it marks the migration as error, so that the user can either ignore the migration or ignore this type of error
  3. it marks the migration as warning, so that the user can escalate themselves with the 'warning as error' flag, or ignore the migration or the warning
  4. any other option... 🤔

Any opinion would be welcome :)

@phillipuniverse
Copy link
Contributor

@David-Wobrock my preference would be (1). (3) would probably be the closest to existing behavior.

You mentioned above about a false-negative case the could occur:

However, when linting a migration, the migration creating the constraint wasn't necessarily executed, so the constraint might not exist.
In these cases, the sqlmigrate command fails, but we didn't want to stop the linter from working, but simply ignore it.

How common is this exactly? And what would it look like to reproduce it? That could help inform the solution to not give false negatives. Or maybe the resolution to that edge case (like ignoring the migration) could simply be included in the error message.

@David-Wobrock
Copy link
Collaborator

First, just for the record (and so that I don't forget it :P) it's a known bug in Django: https://code.djangoproject.com/ticket/26624.

But in short, I have a model like:

from django.db import models


class MyModel(models.Model):
    x = models.IntegerField()
    y = models.IntegerField()

    class Meta:
        unique_together = ("x", "y")

and create a migration.
Then, if we change the unique_together and inverse the columns for instance (so that we need to drop and recreate the constraint) we have a new migration

    operations = [
        migrations.AlterUniqueTogether(
            name='mymodel',
            unique_together={('y', 'x')},
        ),
    ]

The sqlmigrate of this migration fails if the one creating the unique_together was not applied (the error is ValueError: Found wrong number (0) of constraints for app_remove_constraint_mymodel(x, y)). But if we run the first migration, then the sqlmigrate gives us:

BEGIN;
--
-- Alter unique_together for mymodel (1 constraint(s))
--
ALTER TABLE "app_remove_constraint_mymodel" DROP CONSTRAINT "app_remove_constraint_mymodel_x_y_af56469c_uniq";
ALTER TABLE "app_remove_constraint_mymodel" ADD CONSTRAINT "app_remove_constraint_mymodel_y_x_62eaf196_uniq" UNIQUE ("y", "x");
COMMIT;

So the issue is that, on a project where all migrations are linted but not applied (maybe on a CI build), there is a high change that at least one migration will fail like this I think.

I would actually also lean towards option (1), as it is the option where the linter is doing the least "magic".
So letting, by default, the error raise but with a specific message from the linter giving some indications what to do could be viable in my opinion 🤔 But giving an easy possibility to ignore this migration (and more or less fallback on the previous behaviour).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants