django.contrib.postgres.fields.ArrayField #2485

mjtamlyn · 2014-03-26T17:20:22Z

This is a first draft of Array fields. The basic field definition is there, with the required functionality to handle arrays of almost any type. I've also written the lookups/transforms specific to array fields.

Work still to do:

Docs
Form fields (naive and admin) and data cleaning
Handling dimensions

The last of these is a particularly interesting case. Postgres has a "casual relationship" with the definition of an array field. You can create integer[], integer[][], integer[3][4] etc, but postgres docs state that this is basically just documentation as it is not enforced at all. We have a couple of options here:

1. Force single dimensional, unbounded arrays always. This would be pretty boring.
2. Allow max_size=4 and do python side only validation. We'd still pass the correct [4] to postgres, but it won't enforce integrity.
3. Allow a complex dimensions flag to be passed allowing for any option. I think this isn't needed as if you want a 2-dimensional array you could do ArrayField(ArrayField(IntegerField())). This also makes the code path much easier as all the functions which delegate to the base_field don't have to worry about its dimensions.

In the absence of strong opinion otherwise, I'm going to do option 2.
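
For illustration, here is a minimal sketch of what the max_size approach could look like, written as plain Python so it runs without Django. The names array_db_type and validate_max_size are hypothetical and not part of the patch; only the '%s[%s]' format string comes from the diff under review.

```python
def array_db_type(base_db_type, size=None):
    """Build the column type string, e.g. 'integer[4]' or 'integer[]'."""
    # Postgres accepts the declared size but does not enforce it.
    return '%s[%s]' % (base_db_type, size if size is not None else '')


def validate_max_size(value, max_size=None):
    """Python-side check, since the database will not reject longer arrays."""
    if max_size is not None and len(value) > max_size:
        raise ValueError(
            'Array has %d items, more than the allowed %d.' % (len(value), max_size)
        )
    return value
```

The point of the split is that the size is passed to the database purely as documentation, while the length check that actually protects the data runs in Python.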

Other notes for reviewers:

Related fields are banned. For M2M this is quite obviously necessary, but I've done so for FKs as well as they currently do not support referential integrity, which is what Django FKs try to enforce. Otherwise just use an integer.
Postgres uses 1-based indexing, but I'm converting this in the lookups from 0-based indexing. If someone is used to writing a lot of raw pg queries directly, this will be confusing, but to a normal python user we expect 0-based indexing everywhere.
At present I have not implemented contained_by, which is contains with the arguments reversed. It's basically a "is subset" operator. Thinking about it as I'm writing this, I think it does have use cases so I should add it in.
String based lookups (__iexact, startswith etc) continue to be accepted, even though they are largely useless. contains has been overloaded with a more sensible implementation. This is on the principle that date based fields accept them, and the query is functional (casts everything to text). Personally, I would like fields to only support the lookups which make sense on them now that is easily done, but this is a backwards incompatible change. I may open it up as a ticket when working on refactoring __year etc into transforms.
The approach for handling test models is copied from gis. As Anssi said on IRC, it might be nice if runtests didn't need to know about this, but it'll do for now.
There's a bit of hackery with the deconstruct method which means the __init__ accepts two formats for the base field. I wonder whether this could be avoided if there is a suitable hook in migrations.writer to allow me to pass a string containing the correct field definition for the base_field from deconstruct. This would make the migration files look less weird. @andrewgodwin is this sensible? Also should I have explicit tests that migrations work, and if so what would that look like?
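
To make the indexing note above concrete, this is a rough sketch of the 0-based to 1-based conversion. The helper names are hypothetical and the real lookups may handle more cases (negative indexes are simply rejected here).

```python
def to_postgres_index(index):
    """Convert a 0-based Python index to a 1-based Postgres subscript."""
    if index < 0:
        raise ValueError('Negative indexes are not handled in this sketch.')
    return index + 1


def to_postgres_slice(start, end):
    """Convert a Python half-open slice [start:end] to Postgres bounds.

    Postgres slice bounds are 1-based and inclusive at both ends, so only
    the lower bound needs shifting.
    """
    return start + 1, end
```

So a Python-style field[0:2] would correspond to the Postgres subscript [1:2] under this scheme.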

alex · 2014-03-26T17:22:28Z

django/contrib/postgres/fields/array.py

+
+def index_transform_factory(index, base_field):
+
+    class IndexTransform(Transform):


Please don't create new classes dynamically for each query like this. IndexTransform should be factored out and take some parameter to its constructor (and then offer a __call__ or something); same with SliceTransform.
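
One possible shape for this, as a sketch only: a single IndexTransform class plus a small callable that carries the parameters. The Transform class below is a stand-in so the snippet runs without Django, and IndexTransformFactory is a hypothetical name for this example.

```python
class Transform(object):
    """Stand-in for Django's Transform base class, only for this sketch."""

    def __init__(self, lhs):
        self.lhs = lhs


class IndexTransform(Transform):
    """A single class that takes the index and base field as parameters."""

    def __init__(self, index, base_field, lhs):
        super(IndexTransform, self).__init__(lhs)
        self.index = index
        self.base_field = base_field


class IndexTransformFactory(object):
    """Callable that stores the parameters and builds transforms on demand."""

    def __init__(self, index, base_field):
        self.index = index
        self.base_field = base_field

    def __call__(self, lhs):
        return IndexTransform(self.index, self.base_field, lhs)


factory = IndexTransformFactory(index=1, base_field='integer')
first = factory('col_a')
second = factory('col_b')
# Both transforms are instances of the same class object.
assert type(first) is type(second)
```

The class is defined once at import time, and each query only pays for creating an instance.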

alex · 2014-03-26T17:23:21Z

I'd prefer these to live in the django.db.backends.postgresql_psycopg2 namespace rather than in contrib, but for the most part this looks awesome -- thanks for working on this!

andrewgodwin · 2014-03-26T17:48:30Z

Option 2 for dimensions looks good.

As for deconstruction, what extra control would you like? I'd rather this stuff was more achievable from inside fields themselves. Looking over the diff, it looks like you'd want the ability to pass out whole field instances? That should work...

And for testing things with migrations, it's enough to just add migrations into a test app, and they'll get run at test time. If you want to explicitly test individual migration operations, you'll need something like I have in the "migrations" tests, where you swap in different values of MIGRATION_MODULES for certain tests and run the migrate command (or the machinery underlying it) directly.

mjtamlyn · 2014-03-26T20:03:30Z

Thanks Andrew, I hadn't realised that deconstruction was recursive. I've added a test that MigrationWriter.serialize does what I expect it to, in addition to the deconstruct/reconstruct test. I think that should be sufficient.
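
For readers unfamiliar with the idea, here is a toy model of recursive deconstruction. It is deliberately not Django's API: the classes are stand-ins, deconstruct() returns three values here rather than the four Django's fields return, and serialize is a hypothetical helper, not MigrationWriter.serialize.

```python
class Field(object):
    """Toy field: deconstruct() returns (path, args, kwargs)."""
    path = 'models.Field'

    def deconstruct(self):
        return self.path, [], {}


class IntegerField(Field):
    path = 'models.IntegerField'


class ArrayField(Field):
    path = 'ArrayField'

    def __init__(self, base_field):
        self.base_field = base_field

    def deconstruct(self):
        # Hand back the base field instance itself; the serializer recurses.
        return self.path, [], {'base_field': self.base_field}


def serialize(value):
    """Toy serializer: recurse into anything that can be deconstructed."""
    if hasattr(value, 'deconstruct'):
        path, args, kwargs = value.deconstruct()
        parts = [serialize(arg) for arg in args]
        parts += ['%s=%s' % (key, serialize(val))
                  for key, val in sorted(kwargs.items())]
        return '%s(%s)' % (path, ', '.join(parts))
    return repr(value)
```

Because the serializer recurses, the field can return a whole field instance from deconstruct() and nested arrays need no special handling.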

mjtamlyn · 2014-04-21T16:41:30Z

Most of the forms code is now present. The js in the admin needs improving, and the admin integration needs some tests. I need to look at how we've tested similar things in other areas to know exactly what to write here.

SimpleArrayField and SplitArrayField can be reviewed pretty well already though.

dbrgn · 2014-05-04T00:11:54Z

django/contrib/postgres/fields/array.py

+            vals = json.loads(value)
+            value = []
+            for val in vals:
+                value.append(self.base_field.to_python(val))


Why not a list comprehension here? (And on line 106)

value = [self.base_field.to_python(val) for val in vals]

Or even the faster map(self.base_field.to_python, vals), but that's more arguable.
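
A small standalone comparison of the three forms, with a hypothetical to_python function standing in for self.base_field.to_python. Note that map() is lazy on Python 3, so it needs wrapping in list() to be a drop-in replacement.

```python
def to_python(val):
    """Stand-in for self.base_field.to_python; here it just coerces to int."""
    return int(val)


vals = ['1', '2', '3']

# Explicit loop, as in the diff.
looped = []
for val in vals:
    looped.append(to_python(val))

# List comprehension, as suggested.
comprehended = [to_python(val) for val in vals]

# map() returns a lazy iterator on Python 3, so wrap it in list()
# if a real list is needed.
mapped = list(map(to_python, vals))
```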

BertrandBordage · 2014-05-04T22:45:16Z

Great work :) I really can't wait to see it in django!
My review was only formal, I didn't dig to understand how it really works.

BertrandBordage · 2014-05-09T16:58:44Z

tests/postgres_tests/test_array.py

+            NullableIntegerArrayModel.objects.create(field=[2, 3]),
+            NullableIntegerArrayModel.objects.create(field=[20, 30, 40]),
+            NullableIntegerArrayModel.objects.create(field=None),
+        ]


Why not use bulk_create here? I may be a bit obsessed with performance, but I like it when tests are also fast ;)

self.objs = NullableIntegerArrayModel.objects.bulk_create([
    NullableIntegerArrayModel(field=[1]),
    NullableIntegerArrayModel(field=[2]),
    NullableIntegerArrayModel(field=[2, 3]),
    NullableIntegerArrayModel(field=[20, 30, 40]),
    NullableIntegerArrayModel(field=None),
])

Bulk create bypasses some logic so I'd rather stick to the "safe" option.

mjtamlyn · 2014-05-16T14:56:20Z

Ok, so I have removed the admin functionality for now. To do this nicely, I will likely need to do a more thorough review of how JavaScript widgets in the admin are built. However, the model field, form fields and documentation are ready for review. I think this is a complete enough patch for initial inclusion.

apollo13 · 2014-05-16T15:36:45Z

django/contrib/postgres/fields/array.py

+        self.base_field.set_attributes_from_name(name)
+
+    @property
+    def definition(self):


Is this needed somewhere?

apollo13 · 2014-05-16T15:58:05Z

django/contrib/postgres/fields/array.py

+        return '%s[%s]' % (self.base_field.db_type(connection), size)
+
+    def get_prep_value(self, value):
+        if isinstance(value, list):


Is list sufficient here, or should this be for every iterable?
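
A middle ground, sketched here as a free function rather than the actual method: accept lists and tuples but not arbitrary iterables, since strings are iterable too and would be split into characters. The prep_item argument is a hypothetical stand-in for the base field's own get_prep_value.

```python
def get_prep_value(value, prep_item):
    """Prepare each item of a list or tuple; pass other values through."""
    if isinstance(value, (list, tuple)):
        return [prep_item(item) for item in value]
    return value
```

With this check a tuple such as (1, 2) is prepared item by item, while None and plain strings are passed through unchanged.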

oinopion · 2014-05-16T18:14:24Z

django/contrib/postgres/forms/array.py

+        return self.widget.is_hidden
+
+    def value_from_datadict(self, data, files, name):
+        regex = re.compile(name + '_([0-9]+).*')


Are we sure that name does not have to be escaped? I guess it should be a valid Python identifier and thus be safe, but maybe it's worth leaving a comment here?
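
To show what could go wrong without escaping, here is a short standalone sketch using re.escape. The index_pattern helper and the field name 'a.b' are made up for the example; whether such a name can actually reach this code is exactly the open question.

```python
import re


def index_pattern(name, escape=True):
    """Build the pattern that matches '<name>_<index>' keys."""
    prefix = re.escape(name) if escape else name
    return re.compile(prefix + '_([0-9]+).*')


# Hypothetical field name containing a regex metacharacter.
name = 'a.b'

# Unescaped: '.' matches any character, so an unrelated key matches too.
assert index_pattern(name, escape=False).match('axb_0') is not None

# Escaped: only the literal name matches.
assert index_pattern(name).match('axb_0') is None
assert index_pattern(name).match('a.b_0').group(1) == '0'
```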

mjtamlyn · 2014-05-22T08:54:17Z

Committed in 6041626

pauloxnet · 2014-10-02T15:28:44Z

What about basic admin functionality for array fields?

mjtamlyn mentioned this pull request Apr 1, 2014

Django 1.7 compatibility djangonauts/django-hstore#34

Merged

mjtamlyn added 3 commits May 15, 2014 09:23

Add shell of postgres app and test handling.

1e2bf30

First draft of array fields.

e6ab418

Use recursive deconstruction.

230a822

mjtamlyn added 11 commits May 15, 2014 09:23

Add SplitArrayField (mainly for admin).

7f71b97

Fix prepare_value for SimpleArrayField.

e55e15c

Stop using MultiValueField and MultiWidget.

5d7cc5d

They don't play nice with flexible sizes.

Add basics of admin integration.

c11e244

Missing: - Tests - Fully working js

Add reference document for django.contrib.postgres.fields.ArrayField.

12dc24f

Various performance and style tweaks.

93cab1c

Fix internal docs link, formalise code snippets.

4808a0c

Remove the admin code for now.

1af4ba4

It needs a better way of handling JS widgets in the admin as a whole before it is easy to write. In particular there are serious issues involving DateTimePicker when used in an array.

Add a test for nested array fields with different delimiters.

9d5eae7

This will be a documented pattern so having a test for it is useful.

Add docs for SimpleArrayField.

488e778

Add docs for SplitArrayField.

272997c

mjtamlyn changed the title ~~django.contrib.postgres.fields.ArrayField - WIP~~ django.contrib.postgres.fields.ArrayField May 16, 2014

Remove admin related code for now.

33b8532

definition -> description

e216ee7

mjtamlyn added 4 commits May 16, 2014 18:58

Fix typo.

2c4c5b2

Py3 errors.

cf12673

Avoid using regexes where they're not needed.

133616d

Allow passing tuples by the programmer.

effa9da

mjtamlyn added 3 commits May 17, 2014 12:12

Fixed #22648 -- Transform.output_type should respect overridden custom_lookup and custom_transform.

253ef84

Previously, class lookups from the output_type would be used, but any changes to custom_lookup or custom_transform would be ignored.

Add some more tests for multidimensional arrays.

3a3bf9d

Also fix slicing as much as it can be fixed.

Simplify SplitArrayWidget's data loading.

5d14375

If we aren't including the variable size one, we don't need to search like this.

mjtamlyn closed this May 22, 2014
