Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #12990 -- Added JSONField model field. #11452

Closed
wants to merge 18 commits into from

Conversation

@laymonage
Copy link
Contributor

laymonage commented Jun 9, 2019

This pull request is closed. Please look at #12392 instead.


Ticket #12990, as part of the Google Summer of Code program.

Some points:

  • Currently supports storing and retrieving any valid JSON value (boolean, integer, float, string, object, array) on all supported database backends (SQLite, PostgreSQL, MySQL, MariaDB, Oracle).
    Note: Oracle only supports JSON object and array if IS JSON constraint is enabled.
  • Supports custom encoder and decoder (defaults to json.JSONEncoder and json.JSONDecoder, respectively). I had to cast the SELECT query to text for PostgreSQL to avoid having the value already decoded in from_db_value so it's possible to use a custom decoder (see psycopg2 docs).
  • Custom lookups and transforms from contrib.postgres.fields.JSONField are implemented as possibly supported for each backend.
  • Check constraints are applied.
    • MariaDB and SQLite have a JSON_VALID function which works well.
    • Oracle has IS JSON constraint, but from what I tried, it only returns true if the field's value is either a JSON object or array.
    • The constraint already comes with the JSON data types in PostgreSQL and MySQL. You can only insert valid JSON values for jsonb and json columns.
  • Oracle implementation uses the nclob data type. Oracle recommends using blob, but there are some downsides (see docs). Besides, an existing implementation of oracle-json-field uses clob. TextField also uses nclob, so I think this makes sense.
  • Needs JSON1 extension for SQLite. Most Linux distros already have it enabled by default in their SQLite and Python packages.
  • Supports introspection.

More details of this available on my blog.

@laymonage laymonage force-pushed the laymonage:ticket_12990 branch 2 times, most recently from 1f8b5c7 to 8110513 Jun 9, 2019
@laymonage

This comment was marked as outdated.

Copy link
Contributor Author

laymonage commented Jun 9, 2019

Any clue on which version of MariaDB is used on djangoci.com? The json data type is introduced in MariaDB 10.2.7.

@laymonage laymonage force-pushed the laymonage:ticket_12990 branch from 15b7856 to b992485 Jun 9, 2019
@carltongibson carltongibson self-requested a review Jun 10, 2019
@felixxm

This comment was marked as outdated.

Copy link
Member

felixxm commented Jun 10, 2019

Any clue on which version of MariaDB is used on djangoci.com? The json data type is introduced in MariaDB 10.2.7.

djangoci uses MariaDB 10.1.40. I can bump MariaDB version in the next few days. We need to remember that Django 3.0 supports MariaDB 10.1 and higher so a new db feature is required e.g. has_json_field.

django/db/models/fields/json.py Outdated Show resolved Hide resolved
django/db/models/fields/json.py Outdated Show resolved Hide resolved
tests/model_fields/models.py Outdated Show resolved Hide resolved
tests/model_fields/test_jsonfield.py Outdated Show resolved Hide resolved
@laymonage laymonage force-pushed the laymonage:ticket_12990 branch from b992485 to 9854ccf Jun 10, 2019
@laymonage

This comment was marked as outdated.

Copy link
Contributor Author

laymonage commented Jun 10, 2019

Any clue on which version of MariaDB is used on djangoci.com? The json data type is introduced in MariaDB 10.2.7.

djangoci uses MariaDB 10.1.40. I can bump MariaDB version in the next few days. We need to remember that Django 3.0 supports MariaDB 10.1 and higher so a new db feature is required e.g. has_json_field.

I see. I'll try to add it later.

@adamchainz

This comment was marked as outdated.

Copy link
Member

adamchainz commented Jun 10, 2019

djangoci uses MariaDB 10.1.40. I can bump MariaDB version in the next few days. We need to remember that Django 3.0 supports MariaDB 10.1 and higher so a new db feature is required e.g. has_json_field.

How about supports_json ? It's not really a separate data type on SQLite or MariaDB.

@felixxm

This comment has been minimized.

Copy link
Member

felixxm commented Jun 10, 2019

djangoci uses MariaDB 10.1.40. I can bump MariaDB version in the next few days. We need to remember that Django 3.0 supports MariaDB 10.1 and higher so a new db feature is required e.g. has_json_field.

How about supports_json ? It's not really a separate data type on SQLite or MariaDB.

It is also not a separate field on Oracle, but a feature flag will determine if backend has JSON field or not, so ... 🤔

@laymonage laymonage force-pushed the laymonage:ticket_12990 branch from 9854ccf to 71ad10c Jun 11, 2019
@felixxm

This comment was marked as outdated.

Copy link
Member

felixxm commented Jun 12, 2019

@laymonage I updated MariaDB to 10.2.24 on Jenkins.

@laymonage laymonage force-pushed the laymonage:ticket_12990 branch from 71ad10c to 886862c Jun 12, 2019
@laymonage

This comment has been minimized.

Copy link
Contributor Author

laymonage commented Jun 12, 2019

@laymonage I updated MariaDB to 10.2.24 on Jenkins.

Thanks! As expected, the tests have passed now.

I have added a supports_json feature (can be renamed if desired). Not sure if I should check the SQLite version, though. I don't think there's any way to check if the JSON1 extension is enabled (maybe we could try to do SELECT json('"test"'), but that's a bit hack-ish).
The JSON1 extension was introduced with the release of SQLite 3.9.0. However, since it's a loadable extension, it might work if it's loaded on older SQLite version(s). I haven't tried.

Also, I'm not sure if I should use check and extend the list returned by that method instead of raising a NotSupportedError. I've seen both examples in the existing codebase.

@laymonage laymonage force-pushed the laymonage:ticket_12990 branch 4 times, most recently from 7433b52 to 3f9c68d Jun 13, 2019
@laymonage

This comment has been minimized.

Copy link
Contributor Author

laymonage commented Jun 19, 2019

I've added a form field. It's pretty much the one in contrib.postgres, but I omitted the field value in the invalid JSON error message and changed the tests accordingly.

@adamchainz

This comment has been minimized.

Copy link
Member

adamchainz commented Jun 19, 2019

I have added a supports_json feature (can be renamed if desired). Not sure if I should check the SQLite version, though. I don't think there's any way to check if the JSON1 extension is enabled (maybe we could try to do SELECT json('"test"'), but that's a bit hack-ish).
The JSON1 extension was introduced with the release of SQLite 3.9.0. However, since it's a loadable extension, it might work if it's loaded on older SQLite version(s). I haven't tried.

I think trying the json function and catching the error isn't so bad, as long as it won't break any transactions.

The other option is to use PRAGMA compile_options and check if the extension is in there, however I am not sure if it's possible to load the json1 extension without it being built in at compile time...

sqlite> PRAGMA compile_options;
BUG_COMPATIBLE_20160819
COMPILER=clang-10.0.1
DEFAULT_CACHE_SIZE=2000
DEFAULT_CKPTFULLFSYNC
DEFAULT_JOURNAL_SIZE_LIMIT=32768
DEFAULT_PAGE_SIZE=4096
DEFAULT_SYNCHRONOUS=2
DEFAULT_WAL_SYNCHRONOUS=1
ENABLE_API_ARMOR
ENABLE_COLUMN_METADATA
ENABLE_DBSTAT_VTAB
ENABLE_FTS3
ENABLE_FTS3_PARENTHESIS
ENABLE_FTS3_TOKENIZER
ENABLE_FTS4
ENABLE_FTS5
ENABLE_JSON1
ENABLE_LOCKING_STYLE=1
ENABLE_PREUPDATE_HOOK
ENABLE_RTREE
ENABLE_SESSION
ENABLE_SNAPSHOT
ENABLE_SQLLOG
ENABLE_UNKNOWN_SQL_FUNCTION
ENABLE_UPDATE_DELETE_LIMIT
HAVE_ISNAN
MAX_LENGTH=2147483645
MAX_MMAP_SIZE=1073741824
MAX_VARIABLE_NUMBER=500000
OMIT_AUTORESET
OMIT_LOAD_EXTENSION
STMTJRNL_SPILL=131072
THREADSAFE=2
USE_URI
@laymonage

This comment has been minimized.

Copy link
Contributor Author

laymonage commented Jun 19, 2019

I think trying the json function and catching the error isn't so bad, as long as it won't break any transactions.

I'm not sure where and how to properly put it in Django's source code, though.


I tried compiling SQLite 3.28.0 without JSON1, compiling JSON1 separately, and loading it with the .load command.
SELECT JSON('"test"'); works, but ENABLE_JSON1 doesn't show up with PRAGMA compile_options (which is correct since I didn't build JSON1 along with SQLite).

On the other hand, I also tried loading JSON1 (compiled from SQLite 3.28.0 source code) on SQLite 3.8.7.1 (what's available on Debian Jessie). This SQLite version supports extension loading, but I got a segmentation fault when I tried to load JSON1. So, I guess it needs SQLite 3.9.0 and up.

By the way... JSON1 is also enabled by default if SQLite is compiled using make with the amalgamation and the given configurations.

@adamchainz

This comment has been minimized.

Copy link
Member

adamchainz commented Jun 19, 2019

I'm not sure where and how to properly put it in Django's source code, though.

You can use a @cached_property for the feature, for example https://github.com/django/django/blob/master/django/db/backends/mysql/features.py#L110

@laymonage

This comment has been minimized.

Copy link
Contributor Author

laymonage commented Jun 19, 2019

I'm not sure where and how to properly put it in Django's source code, though.

You can use a @cached_property for the feature, for example https://github.com/django/django/blob/master/django/db/backends/mysql/features.py#L110

Yeah, I've used it in my supports_json DB feature. What I mean is, should I do something like this?

try:
    with self.connection.cursor() as cursor:
        cursor.execute("SELECT JSON('\"test\"')
except DatabaseError:
    return False
else:
    return True
@adamchainz

This comment has been minimized.

Copy link
Member

adamchainz commented Jun 19, 2019

That looks like what I was thinking of, though you might need a transaction.atomic around it to prevent the error from breaking any current transaction - at least that's the way it works with some kinds of error on other databases, I'm no SQLite expert.

Copy link
Contributor

cansarigol left a comment

Looks nice 👍 I left some comments.

django/db/models/fields/mixins.py Outdated Show resolved Hide resolved
tests/model_fields/test_jsonfield.py Outdated Show resolved Hide resolved
@laymonage laymonage force-pushed the laymonage:ticket_12990 branch 4 times, most recently from 71305bd to 0590edb Jun 20, 2019
@laymonage

This comment has been minimized.

Copy link
Contributor Author

laymonage commented Nov 11, 2019

Thanks for the tip @m-aciek. That reminds me to look at MySQL docs again, and here's what I found:

MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation. Strings in other character sets are converted to utf8mb4 as necessary.

Because utf8mb4_bin is a binary collation, comparison of JSON values is case-sensitive.

Therefore, I think using REGEXP_LIKE won't solve the problem.

I originally applied CaseInsensitiveMixin to iregex as well, but @felixxm said that it shouldn't be necessary. I didn't really know why CaseInsensitiveMixin was needed for case-insensitive comparisons of JSON values, I just knew that it didn't work without it. So I just went along 😳

Anyway, as expected, after I reapplied CaseInsensitiveMixin, the test passes on the CI. 🎉
I guess it's a good idea to add a docstring to CaseInsensitiveMixin.

Still... I don't know why test_iregex works without CaseInsensitiveMixin on my machine 🤷‍♂

@laymonage

This comment has been minimized.

Copy link
Contributor Author

laymonage commented Nov 12, 2019

@felixxm If you're aiming to simplify Oracle implementation, we can try using the dot notation syntax to access JSON values, instead of using JSON_VALUE() and JSON_EXTRACT(). However, that requires the use of table aliases in the query. I think it would be better if Django has built-in functionality for this because otherwise, I don't really know how to do that without making some dirty hacks...

laymonage and others added 18 commits Jun 9, 2019
Minor edits.
Added documentation and tests for the system check.
…transform(lookup='value__baz__contained_by') on MySQL, MariaDB, and SQLite.

We need to also swap params not only sides of expressions.
…transform(lookup='value__baz__contained_by') on MySQL, MariaDB, and SQLite.

We didn't handle how the lhs and rhs can be instances of KeyTransform/KeyTextTransform
It's not possible in PostgreSQL, and it's already represented by value__baz__has_keys
It's a subclass of KeyTransform
…te and removed unused models.
The logic should belong to the KeyTransform itself
@laymonage laymonage force-pushed the laymonage:ticket_12990 branch from 8f4716e to 7f7e1a1 Jan 28, 2020
@stuaxo stuaxo mentioned this pull request Jan 29, 2020
Copy link
Member

carltongibson left a comment

Hi @laymonage. Time to move this forward. 🙂

  • Can you squash as of 7f7e1a1, and open a fresh PR at that point.
  • Any changes after 7f7e1a1 to be in separate commits for now (to help us continue the review). We'll squash down again later on.

I will then ask for testing from the community. Folks can comment with any regressions/issues on the fresh PR, and we can get those resolved.

I left a few smaller remarks here but I want to look more at the docs and test coverage. I know @felixxm has more comments too. But my overall impression was that it makes sense (and it works on Windows—where those SQLite instructions are necessary—so... 🙂) Good job!

docs/ref/forms/fields.txt Show resolved Hide resolved
docs/ref/models/fields.txt Show resolved Hide resolved
docs/ref/models/fields.txt Show resolved Hide resolved
docs/releases/3.1.txt Show resolved Hide resolved
django/db/backends/mysql/base.py Show resolved Hide resolved
tests/model_fields/test_jsonfield.py Show resolved Hide resolved
@laymonage

This comment has been minimized.

Copy link
Contributor Author

laymonage commented Jan 30, 2020

Closing this in favor of #12392.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

You can’t perform that action at this time.