Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NOT NULL constraint failed for non-nullable Charfields in xlsx #1485

Closed
vanschelven opened this issue Sep 14, 2022 · 8 comments · Fixed by #1637
Closed

NOT NULL constraint failed for non-nullable Charfields in xlsx #1485

vanschelven opened this issue Sep 14, 2022 · 8 comments · Fixed by #1637
Assignees
Labels

Comments

@vanschelven
Copy link
Contributor

When using the .xslx import, non-nullable CharFields will fail with an exception when their value is left empty. Presumably because the cell has a value of None in that case.

As seen on StackOverflow where a workaround was provided. IMHO this should simply be fixed instead.

To Reproduce
Steps to reproduce the behavior:

  1. Start the example app
  2. Edit a book, make the field "author_email" empty.
  3. Export all books as .xlsx
  4. Import this .xlsx
  5. sqlite3.IntegrityError: NOT NULL constraint failed: core_book.author_email

Versions (please complete the following information):

  • Django Import Export: [2.8.0]

Expected behavior
I expect no error. In particular: I expect that the output of an export is a valid input for an import.

@pokken-magic
Copy link

pokken-magic commented Oct 29, 2022

I wonder what the proper fix is for this--how do we know whether an empty cell in Excel is None or an empty string? I ran into this problem a lot when I was messing with CSV/Excel and for my work project wound up using only YAML and JSON because of this.

Is it fundamentally a problem with how empty strings are sent to Excel? Could we maybe have a standard of quoting all strings in Excel, and escaping quotes that are meant to be explicit (e.g. \") like json does?

I think I remember that the issue is that Excel saver interprets "" as an empty cell, which the importer interprets as None?

Alternatively, maybe we could explicitly write None to empty cells, kinda like the json 'null' keyword? That's a bit hinky though and it would prevent you from having a string of 'None' unless it were quoted, which brings us back to quoting?

The probelm I think is that we need to have a concept of None / null and empty cells are natural for that.

--

Maybe we could add a Charwidget/Textwidget that are intelligent and introspect the field parameters a little? Say,

  • on export, maintain current behavior
  • on import, if the field is Nullabe, treat empty cell as None
  • on import, if the field is not Nullable, treat empty cell as ""

I think that might be the most logical approach. It would leave you with some weird behavior if you had both nulls and blanks.

@matthewhegarty
Copy link
Contributor

For exporting, we have a related discussion here, namely that FloatWidget exports the value, whereas when you need localization support, it needs to export a formatted string. I'm reticent about changing in that PR because exporting as string not value could affect existing implementations.

Looking at Django Rest Framework, they have a coerce_to_string arg. Maybe that is an option here? It would give the user control on how exports are formatted for Excel.

There is a solution for this issue on import, so maybe it's just a case of improving the documentation to state the issues. Like you say, how do we know if an empty field is None or ''.

@pokken-magic
Copy link

pokken-magic commented Oct 29, 2022

After having a think, I believe it'd be a sensible default to treat an incoming empty field as "" if the field is not nullable and is a char/text field, and document that (for formats that do not have the concept of null, like csv and excel).

At least that way django import export would be internally consistent, where we export a null=True field with empty strings to output an empty cell in Excel or an empty cell in Csv (,,), and when it came back in it'd know that those things should be "" if the char/textfield does not allow nulls.

It's generally considered poor practice to mix nulls and empty strings in Django fields anyway, so that could be considered an edge case ("if you export both nulls and blanks, Excel can't distinguish between these").

Moving to a quoting strategy could work but it would be superbly tedious, whereas at least handling them consistently would be a step up.

--

edited to say I double rethought it and there are some risks associated with doing it this way which is that I think we'd be taking away the ability of json and yaml to support text fields that mix nulls and blanks. Don't know if we want to remove functionality.

bleh.

@matthewhegarty
Copy link
Contributor

Thanks for working through this issue.

I think I remember that the issue is that Excel saver interprets "" as an empty cell, which the importer interprets as None?

I tested and in this case the author_email is exported as empty string "". In fact, 'None' values are always forced to strings on export - this has been in the library from the start.

So in the case of Excel, we are exporting as empty string. In Excel, I couldn't find a way to explicitly set a cell as being an empty string. I tried setting the cell as type 'Text' and importing, but it is always interpreted as a 'None' (null) value by tablib when importing. So we have the situation where we always export as empty string, but always get back a None value.

To fix this (rather than having to rely on the users using the workaround) I think we would have to include the logic you describe here. Namely that the import process has to be more explicit when dealing with non-nullable fields.

@pokken-magic
Copy link

pokken-magic commented Oct 30, 2022

Huh, that's very interesting. I thought that the YAML / Json libraries would have nulls, but I think my memory of that must be because I made a nullable field for our internal stuff, and used json/Yaml which support nulls. Which was probably necessary because we have some of those nullable-blankable fields you shouldn't really have.

I think it would be fairly straightforward to have charfields and textfields check for nullability and interpret Nones as "" if the field is not-nullable. I'll try to pick that up if I get some time here soon, but if not someone's more than welcome to experiment.

I think the logic for that would need to live in Field.save, since the widgets don't seem to have knowledge of their django fields.

Interestingly, there is already an explicit setting in Fields for 'saves_null_values' that is checked, so we could just piggy back on that and say "if saves null values is false, then None should be interpreted as an empty string" -- and we might be able to add some autodetection to saves_null_values (maybe in Resource.import_field)

Do you think it should start from 3.x?

@matthewhegarty
Copy link
Contributor

If you export to JSON the empty field is rendered as empty string. If you try to import JSON with a null field it fails with the same issue as above (at least in this context). So there are probably other undiscovered bugs here.

I'll try to pick that up if I get some time here soon, but if not someone's more than welcome to experiment.

That would be great - feel free to assign it to yourself if you think you can pick it up. I think this is OK to go into v3 because it is a fix rather than explicitly changing behaviour. If it looks like it might have more impact to existing implementations we can always defer to v4.

The Field / Widget logic is maybe a bit tangled. We don't want CharField specific logic in Field, but also we don't want to couple the widgets module to Django Field implementations (as you say). If the correct setting of saves_null_values was set on the Field instance, that may well solve it. So that implies that the underlying Django model field would have to be inspected when the import_export Field is instantiated. I think that is mostly done here, and this method is passed the django field, so it could be fairly simple to add this logic, by using the django field null attribute to set saves_null_values.

The other issue is that users will often declare the Field themselves in the resource, so they would need to know the implications of this. They may well have used 'default' kwarg to workaround this issue already. Probably that's where the improved documentation could come in. They would maybe need to set saves_null_values correctly themselves.

For info, #611 is where saves_null_values was added.

@pokken-magic
Copy link

pokken-magic commented Nov 9, 2022

Yeah, I think we essentially need to inspect both Null and Blank setting on the django field and then set those values appropriately. Then when saving, we need to interpret that correctly. And we can allow users to explicitly override that at the field level (The way they can say saves_null_values, they could also have saves_blank_values)

My thinking is:

-if blank and null are true, empty cells are ""
-if blank is true and null is false, empty cells are ""
-if blank is false and null is false, empty cells are None, and will fail
-if blank is false and null is true, empty cells are None

I think the things we need to do are:

  1. update field constructor to take saves_blank_values with a default of False
  2. update field.save to know what to do based on saves null / saves blank
  3. update field detection to correctly inspect the django field for saves_blank and saves_null

The problem here is that this reduces a little bit of functionality, which is that dialects that have a clear distinction between "" and null (json, yaml) won't be able to have both blanks and nulls.

I can't think of any way to get around that except to move things even further up the chain to the importer, but the importer is not aware of the django field. so problematic there.

It might make some sense long term to think about the widget getting information about the blank/null status tbh. I'll have to think about this some more.

@matthewhegarty matthewhegarty self-assigned this Sep 22, 2023
matthewhegarty added a commit to matthewhegarty/django-import-export that referenced this issue Sep 22, 2023
@matthewhegarty matthewhegarty linked a pull request Sep 22, 2023 that will close this issue
matthewhegarty added a commit that referenced this issue Sep 24, 2023
* dynamic widget parameters for CharField fixes 'NOT NULL constraint' error in xlsx (#1485)

* added test json file
@matthewhegarty
Copy link
Contributor

Will be available in v4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants