bulk_insert fails when trying to insert JSON fields when used with return_model=True #158

Closed
MontyD opened this issue Nov 5, 2021 · 1 comment

MontyD commented Nov 5, 2021

After upgrading to postgres-extra v2.0.3, bulk inserting models that contain JSON fields produces the following error:

    cards_created, ids_created, references_created, notes_created = upsert_cards(batch)
  File "/usr/src/app/card/bulk.py", line 100, in upsert_cards
    cards = manager.on_conflict(
  File "/usr/local/lib/python3.9/site-packages/psqlextra/query.py", line 170, in bulk_insert
    return [
  File "/usr/local/lib/python3.9/site-packages/psqlextra/query.py", line 171, in <listcomp>
    self._create_model_instance(dict(row, **obj), compiler.using)
  File "/usr/local/lib/python3.9/site-packages/psqlextra/query.py", line 418, in _create_model_instance
    converted_field_values[field.attname] = converter(
  File "/usr/local/lib/python3.9/site-packages/django/db/models/fields/json.py", line 83, in from_db_value
    return json.loads(value, cls=self.decoder)
  File "/usr/local/lib/python3.9/json/__init__.py", line 339, in loads
    raise TypeError(f'the JSON object must be str, bytes or bytearray, '
TypeError: the JSON object must be str, bytes or bytearray, not dict

This seems to be caused by the value for a JSON field being converted twice: it is already a Python dict when the field's converter tries to decode it as a JSON string. We have a JSON field defined on the model we're inserting: https://gitlab.developers.cam.ac.uk/uis/devops/iam/card-database/card-api/-/blob/master/card/models.py#L315.
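The failure at the bottom of the traceback can be reproduced with the standard library alone. The sketch below uses a hypothetical `from_db_value` helper as a simplified stand-in for a JSON field converter (it is not Django's actual method, which takes more arguments), and the sample values are made up for illustration:

```python
import json


def from_db_value(value):
    """Simplified, hypothetical stand-in for a JSON field converter.

    It assumes the value is a raw JSON string, as a database driver
    would normally return it.
    """
    if value is None:
        return value
    return json.loads(value)


# Illustrative values only: what the database would return versus
# what the caller originally passed in.
raw_from_db = '{"colour": "blue"}'
already_python = {"colour": "blue"}

# Decoding the raw string works as expected.
decoded = from_db_value(raw_from_db)

# Decoding a value that is already a dict raises the TypeError
# seen in the traceback above.
try:
    from_db_value(already_python)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

The second call fails because `json.loads` only accepts `str`, `bytes` or `bytearray`, which matches the last frame of the traceback.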

We're bulk inserting using return_model=True which seems to cause this problem: https://gitlab.developers.cam.ac.uk/uis/devops/iam/card-database/card-api/-/blob/master/card/bulk.py#L99.

Removing return_model=True allows the bulk insert to work, but unfortunately our bulk insert logic requires that we have access to the created models.

This is likely caused by the change made in 1fccf9a#diff-cc2bc273189507c98734a1ddaddcee3a8f206c7dee9de8ef1ceffa71f71ad0d6, since adding apply_converters=False to the call to _create_model_instance made in bulk_insert stops the error from occurring.

MontyD changed the title from "bulk_insert fails when trying to insert JSON fields and return_model=True" to "bulk_insert fails when trying to insert JSON fields when used with return_model=True" on Nov 5, 2021
Photonios added a commit that referenced this issue Nov 22, 2021
`return_model=True` would break any rows with JSONField as it
would try to convert from a JSON string to a Python object. However,
the original input data was already partially converted.

By forcing the database to return the values it inserted, we
get a raw JSON string and the converter applies normally.
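
The idea in this commit message can be sketched without a database. The traceback shows `bulk_insert` building model values with `dict(row, **obj)`, which lets the caller's already-converted input override whatever the database returned. The snippet below is only an illustration under assumptions: `convert`, the row contents and the field names are hypothetical, and it does not reproduce the library's actual fix.

```python
import json


def convert(value):
    """Hypothetical converter: decode a raw JSON string into a Python object."""
    return json.loads(value)


# Hypothetical input passed by the caller: the JSON field is already a dict.
user_obj = {"name": "card-1", "data": {"colour": "blue"}}

# Hypothetical row returned by the database: the JSON field is a raw string.
returned_row = {"id": 1, "name": "card-1", "data": '{"colour": "blue"}'}

# Merging as in the traceback: the caller's dict overrides the raw string,
# so a converter expecting a string would later fail on "data".
merged = dict(returned_row, **user_obj)

# Relying only on the returned values keeps "data" as a raw string,
# so the converter can be applied normally.
model_values = dict(returned_row)
model_values["data"] = convert(model_values["data"])

print(type(merged["data"]).__name__, type(model_values["data"]).__name__)
```

In both cases the final value is a dict, but only the second path reaches the converter with the string it expects.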
Photonios added a commit that referenced this issue Nov 23, 2021
Photonios added a commit that referenced this issue Nov 23, 2021
Photonios (Member) commented

This is fixed in v2.0.4rc2 or newer.
