Cannot use bulk_create to create multiple entities with a foreign key relationship #2162

Closed
psahgal opened this issue Apr 27, 2020 · 4 comments


@psahgal

psahgal commented Apr 27, 2020

While making some changes to our ETL pipeline, my team and I noticed some new errors. It looks like an issue was introduced in version 3.13.3. It's also possible we were relying on some undocumented behavior.

What we're trying to do is instantiate a set of entities before writing them to the database in a single transaction with the bulk_create method. One set of these entities has a foreign key to entities in the second set. On version 3.13.1, we don't see any errors thrown in our code. But on version 3.13.3, there is an error.

Here's a snippet of the example I put together reproducing the issue. You can find the full example here.

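from peewee import CharField, ForeignKeyField, Model, PostgresqlDatabase

# Connection details are omitted in this snippet; the database name is a placeholder.
database = PostgresqlDatabase('example')


class Child(Model):
    class Meta:
        database = database
    name = CharField()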


class Parent(Model):
    class Meta:
        database = database
    child = ForeignKeyField(Child)
    name = CharField()


entities = [Parent, Child]
database.drop_tables([Parent, Child])
database.create_tables([Parent, Child])


child1 = Child(name='Bob')
child2 = Child(name='Alice')
parent1 = Parent(child=child1, name='Martin')
parent2 = Parent(child=child2, name='Suzy')

with database.atomic():
    Child.bulk_create([child1, child2])
    Parent.bulk_create([parent1, parent2])

On 3.13.1, this works without an error. On 3.13.3, I get the following error:

Traceback (most recent call last):
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 3099, in execute_sql
    cursor.execute(sql, params or ())
psycopg2.errors.NotNullViolation: null value in column "child_id" violates not-null constraint
DETAIL:  Failing row contains (1, null, Martin).


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 33, in <module>
    Parent.bulk_create([parent1, parent2])
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 6324, in bulk_create
    res = cls.insert_many(accum, fields=fields).execute()
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 1886, in inner
    return method(self, database, *args, **kwargs)
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 1957, in execute
    return self._execute(database)
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 2707, in _execute
    return super(Insert, self)._execute(database)
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 2440, in _execute
    cursor = self.execute_returning(database)
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 2447, in execute_returning
    cursor = database.execute(self)
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 3112, in execute
    return self.execute_sql(sql, params, commit=commit)
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 3106, in execute_sql
    self.commit()
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 2873, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 183, in reraise
    raise value.with_traceback(tb)
  File "/Users/praneetsahgal/.pyenv/versions/3.8.2/lib/python3.8/site-packages/peewee.py", line 3099, in execute_sql
    cursor.execute(sql, params or ())
peewee.IntegrityError: null value in column "child_id" violates not-null constraint
DETAIL:  Failing row contains (1, null, Martin).

Could someone help investigate this error? Our project will be stuck on 3.13.1 until we can find a fix.

@coleifer
Owner

Peewee introduced an efficiency measure in bulk_create() to use the object-id descriptor to grab the associated PK of the related object. It seems it has had an unintended side-effect of not reflecting the new primary-key value upon saving the related object. I'll look into this.
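As a rough illustration of the side effect described above, here is a toy model in plain Python. It is not peewee's actual code: the `Child` and `Parent` classes and the two helper functions are hypothetical stand-ins, and the only assumption is that the foreign-key value is captured when the related object is assigned.

```python
class Child:
    def __init__(self, name):
        self.name = name
        self.id = None  # stays None until the row has been inserted


class Parent:
    def __init__(self, child, name):
        self.child = child
        self.name = name
        # Snapshot of the related object's key, taken at assignment time.
        self.child_id = child.id


def fk_from_snapshot(parent):
    """Read the stored key directly (cheap, but can be stale)."""
    return parent.child_id


def fk_from_related_object(parent):
    """Read the key from the related object at insert time."""
    return parent.child.id


child = Child('Bob')
parent = Parent(child, 'Martin')

# Simulate the bulk insert assigning an id to the child afterwards.
child.id = 1

snapshot_value = fk_from_snapshot(parent)    # still None: taken before the insert
live_value = fk_from_related_object(parent)  # 1: reflects the saved child
```

Reading the snapshot gives a null foreign key, which would match the `child_id` NOT NULL violation in the traceback, while reading through the related object picks up the id assigned after the insert.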

@coleifer
Owner

Fixed. Note that this would only have worked with postgres, thanks to postgres' ability to return autogenerated IDs after a bulk insert query. So this is kinda under "yeah, it works, but it's a little weird" category.

At any rate it should be working again on master. I've added a regression test which reproduced the issue and now passes.

@psahgal
Author

psahgal commented Apr 28, 2020

Thanks for the quick turnaround! I tried it out in the example I put together and everything seems to be working.

@youngquan

Note that this would only have worked with postgres

Sorry to bother you. I am using MySQL. What should I do to fix the same problem? Any clues?
