Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syncdata unable to use natural keys #1549

Open
MattFisher opened this issue Jul 3, 2020 · 4 comments
Open

syncdata unable to use natural keys #1549

MattFisher opened this issue Jul 3, 2020 · 4 comments

Comments

@MattFisher
Copy link
Contributor

MattFisher commented Jul 3, 2020

Version 3 seems unable to use natural keys in syncdata fixtures.

The traceback shows a long string of "During handling of the above exception, another exception occurred", but the important bits are:

django_extensions.management.commands.syncdata.SyncDataError: Problem installing fixture '.../myproject/fixtures/base.json': Traceback (most recent call last):
  File ".../python3.6/site-packages/django/core/serializers/base.py", line 288, in deserialize_m2m_values
    values.append(m2m_convert(pk))
  File ".../python3.6/site-packages/django/core/serializers/base.py", line 278, in m2m_convert
    return model._default_manager.db_manager(using).get_by_natural_key(*value).pk
  File ".../python3.6/site-packages/django/contrib/auth/models.py", line 88, in get_by_natural_key
    return self.get(name=name)
  File ".../python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File ".../python3.6/site-packages/django/db/models/query.py", line 417, in get
    self.model._meta.object_name
django.contrib.auth.models.Group.DoesNotExist: Group matching query does not exist.

...

Traceback (most recent call last):
  File ".../python3.6/site-packages/django_extensions/management/commands/syncdata.py", line 166, in syncdata
    objects = list(serializers.deserialize(format_, fixture))
  File ".../python3.6/site-packages/django/core/serializers/json.py", line 69, in Deserializer
    yield from PythonDeserializer(objects, **options)
  File ".../python3.6/site-packages/django/core/serializers/python.py", line 124, in Deserializer
    raise base.DeserializationError.WithData(e.original_exc, d['model'], d.get('pk'), e.pk)
django.core.serializers.base.DeserializationError: Group matching query does not exist.: (users.user:pk=2) field_value was '['teachers']'

I think this is because of the changes made here 0edbb6c#diff-ed4bf9a1cad19a0969a26b755518c4a9R164, in adding the remove_before flag.

In django's loaddata, it uses the construct:

objects = serializers.deserialize(
    ser_fmt, fixture, using=self.using, ignorenonexistent=self.ignore,
    handle_forward_references=True,
)

for obj in objects:
    # snip
    obj.save(using=self.using)

serializers.deserialize() returns a generator, and relies internally on all the previous objects already being saved before generating each one. The change in syncdata to materialise the objects into a list before saving them means natural key lookups can't work.

Minimal fixture demonstrating the problem:

[
{
  "model": "auth.group",
  "pk": 1,
  "fields": {
    "name": "teachers",
    "permissions": []
  }
},
{
  "model": "auth.user",
  "pk": 1,
  "fields": {
    "password": "",
    "last_login": null,
    "is_superuser": false,
    "username": "staff",
    "first_name": "",
    "last_name": "",
    "email": "",
    "is_staff": true,
    "is_active": true,
    "date_joined": "2017-05-01T03:49:00.823Z",
    "groups": [
      ["teachers"]
    ],
    "user_permissions": []
  }
}
]
@trbs
Copy link
Member

trbs commented Jul 3, 2020

Would restructuring the code to not use list(serializers.deserialize(format, fixture)) make a difference you think ?

Could you make a PR for at least a test case ?
Seems like an easy bug to get back even if we fix it this time around...

@MattFisher
Copy link
Contributor Author

MattFisher commented Jul 5, 2020

Created a PR #1551 with a failing test, but I'm not sure how to go about restructuring the code to allow both the --remove-before flag and natural keys.

I'm actually not clear on the use-case that --remove-before solves. I could write some test cases if someone could help me understand the expected behaviour? What would determine whether someone would want to remove existing-objects-not-in-the-fixture before or after creating+updating the objects in the fixture?

One alternative technique is just wiping the relevant tables before loading the fixture, rather than trying to identify which objects to keep and which to delete. I'm assuming the current method is for performance reasons, but since all the kept objects are iterated through and re-saved anyway, I don't know how much of slower it would be to just delete everything and add back what's in the fixture?

@trbs
Copy link
Member

trbs commented Jul 6, 2020

This was the originating PR; #1529

@cuchi could you please help here with a background information and @MattFisher's questions ?

Reading that PR and this issue it kinda sounds to me that there is a more fundamental problem (and therefor fix) for this in the code... Given that the original intend was to fix issues with some constraint conditions and that fix breaking natural keys...

Also looking at the code it would seem that a somewhat hacky solution could be to revert to the original code and just do a 2 phase approach for the --remove-before run... reading the fixture two times, the the first sweep removing objects and the second one just doing exactly the same as without --remove-before (guessing the second removal pass won't hurt then since it would be a noop ?)

But as said that feels even more hacky, where probably the best solution is to see why the entire process is troublesome in the case of fixtures with constraint on another existing object...

Anyways I hope @cuchi can help shed some light on this :-)

@MattFisher
Copy link
Contributor Author

MattFisher commented Jul 7, 2020

Thanks for pointing out that PR @trbs, I understand the requirement that prompted the addition of --remove-before now. I think the ability to cope with constraints should be default behaviour and not need its own flag.

I've added a failing test covering the scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants