Only include columns in insert statements if needed #367

MaienM · 2017-02-21T15:30:18Z

Columns that have no specific value set (are undefined) previously were explicitly set to default. However, this causes problems if you do not have permission to set a column, whereas if these columns are not specified at all it works fine, and they will still ends up with their default value.

This changes that, by only including columns in the insert statement if they have a value set. If the insert statement is for multiple rows, and not all rows have the same columns, all columns that are present in at
least one row will be included. Rows that don't have that column specified will revert to the old behaviour, being set to default explicitly.

An example: say you have a table foo with the columns name and secret. You're allowed to read all of them, but you're only allowed to set name. Previously, if you call the mutation createFoo with only a value for name, the query will look something like:

INSERT INTO "foo" ("name", "secret") VALUES ("test", default);

This will cause an error, as you're not allowed to set the column secret. After this change, the query will look something like this:

INSERT INTO "foo" ("name") VALUES ("test");

This will work fine.

Columns that have no specific value set (are undefined) previously were excplicitly set to `default`. However, this causes problems if you do not have permission to set this column, whereas not specifing the column at all works fine (and still ends up with the default value). This changes that, by only including columns in the insert statement if they have a value set. If the insert statement is for multiple rows, and not all rows have the same columns, all columns that are present in at least one row will be included. Rows that don't have that column specified will revert to the old behavior, being set to `default` explicitly.

benjie · 2017-02-21T17:42:12Z

This looks good to me 👍 Could you add a test that fails on master and passes on this branch for this code please, to avoid regressions?

MaienM · 2017-02-21T17:43:24Z

Sure, I'll get on that.

calebmer

This is looking pretty good 👍

I’m glad you submit this PR, because I was actually thinking about this in a discussion I was having 😊. This is required for our support for column GRANTs in inserts, right?

Just one comment. I’m not super excited about the time complexity of iterating over all the values and then all of the values fields, however I’m fine with merging it for now with the intention of optimizing when it becomes a problem.

Anyway, I’ll be happy to merge this.

(fun fact: as I was writing this review I saw you added tests 😉)

calebmer · 2017-02-21T19:49:49Z

src/postgres/inventory/collection/PgCollection.ts

+          map.forEach((value, key) => {
+            if (typeof value !== 'undefined') {
+              fields.set(key, this.type.fields.get(key))
+            }


My one concern is that this is O(vn) where v is the number of values and n is the maximum number of fields in one of the values. Is there a more efficient algorithm for this? Can we do some other work while iterating? Think about these questions only for a bit, if you can’t think of a good answer I’m comfortable with merging this as an initial implementation.

Ideally I think we could take a similar approach here as we are taking in #342, although this is a good first implementation. If you could put a TODO comment here that reads something like: TODO: Use field information from GraphQL instead of iterating through values we can address it when it becomes a problem 👍

I hadn't really considered that yet, but I can see how that could become an issue. I unfortunately do not see a way to avoid that with this approach.

Another approach to this problem would be to detect which columns can be inserted on through introspection instead. This only has to be done at startup/watch updates, and does not involve extra looping, so this would effectively be O(1). We would go back to sometimes specifying default for columns that we could leave out completely, but as long as you have the rights to set those columns this is fine. I am not (yet) familiar enough with the project to achieve this though, unfortunately.

Another benefit to detecting these columns through introspection would be that this could also be reflected in the GraphQL schema/docs. We could simply leave columns that cannot be set out of the Input objects, thereby also preventing this from being a runtime issue.

This won't work for my schema; we use multiple roles and those roles have different column access grants. Unless you enumerated all of the roles with access, and applied those learnings at run-time; but that seems complex.

I failed to consider using multiple roles. That will indeed complicate things significantly, to the point where it may not be worth it at all. I think @calebmer's proposal to, if the performance of this code ever becomes an issue, use the field information from GraphQL is probably best then. For now I'll just add a TODO comment about it.

calebmer · 2017-02-21T19:51:50Z

src/postgres/inventory/collection/__tests__/PgCollection-test.js

+    collection1.create(context, value3),
+  ])
+
+  const query = client.query.mock.calls[0][0].text


Could you either check the entire text of the query (vs. using toMatch) with either an inline string literal, or a snapshot? expect(client.query.mock.calls[0][0].text).toMatchSnapshot(). Snapshot is probably the easiest of course. In the future I’d like to be snapshotting more of the SQL queries we generate.

Could you also add a sanity check here: expect(client.query.mock.calls.length).toBe(1) just so that we can be sure that no other queries are being run?

I had that in there initially for that exact reason, but I seem to have removed it accidentally. Whoops! Will add it back.

MaienM · 2017-02-21T20:12:56Z

This is indeed needed for INSERT statements on tables with column-level GRANTS. This is what inspired me to do this, as I have a few columns that are managed on the database level that I do not want editable (mostly created_at and updated_at timestamps).

Also added a sanity check to make sure exactly one (1) query has been executed

benjie · 2017-03-05T14:30:46Z

Awesome, this resolves an issue I've had on another project also (again: updatedAt and createdAt). Thanks! 👍

calebmer · 2017-03-26T19:34:50Z

Released in 3.1.0 🎉

…raphile#367) * Only include required columns in insert statments Columns that have no specific value set (are undefined) previously were excplicitly set to `default`. However, this causes problems if you do not have permission to set this column, whereas not specifing the column at all works fine (and still ends up with the default value). This changes that, by only including columns in the insert statement if they have a value set. If the insert statement is for multiple rows, and not all rows have the same columns, all columns that are present in at least one row will be included. Rows that don't have that column specified will revert to the old behavior, being set to `default` explicitly. * Added a test case * Use snapshots for the test Also added a sanity check to make sure exactly one (1) query has been executed * Added a TODO comment about optimisation

MaienM and others added 2 commits February 21, 2017 20:43

Added a test case

266cc21

Merge branch 'master' into insert-change

02d5eb8

calebmer requested changes Feb 21, 2017

View reviewed changes

MaienM and others added 3 commits February 21, 2017 21:17

Use snapshots for the test

466ad25

Also added a sanity check to make sure exactly one (1) query has been executed

Added a TODO comment about optimisation

f2e2531

Merge branch 'master' into insert-change

a6ed784

calebmer approved these changes Mar 5, 2017

View reviewed changes

calebmer merged commit 61fe455 into graphile:master Mar 5, 2017

calebmer mentioned this pull request Mar 5, 2017

Fix undefined values to be assumed as null values #382

Closed

MaienM deleted the insert-change branch March 5, 2017 17:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Only include columns in insert statements if needed #367

Only include columns in insert statements if needed #367

MaienM commented Feb 21, 2017

benjie commented Feb 21, 2017

MaienM commented Feb 21, 2017

calebmer left a comment

calebmer Feb 21, 2017

MaienM Feb 21, 2017

MaienM Feb 21, 2017

benjie Feb 21, 2017

MaienM Feb 22, 2017

calebmer Feb 21, 2017

MaienM Feb 21, 2017

MaienM commented Feb 21, 2017

benjie commented Mar 5, 2017

calebmer commented Mar 26, 2017

Only include columns in insert statements if needed #367

Only include columns in insert statements if needed #367

Conversation

MaienM commented Feb 21, 2017

benjie commented Feb 21, 2017

MaienM commented Feb 21, 2017

calebmer left a comment

Choose a reason for hiding this comment

calebmer Feb 21, 2017

Choose a reason for hiding this comment

MaienM Feb 21, 2017

Choose a reason for hiding this comment

MaienM Feb 21, 2017

Choose a reason for hiding this comment

benjie Feb 21, 2017

Choose a reason for hiding this comment

MaienM Feb 22, 2017

Choose a reason for hiding this comment

calebmer Feb 21, 2017

Choose a reason for hiding this comment

MaienM Feb 21, 2017

Choose a reason for hiding this comment

MaienM commented Feb 21, 2017

benjie commented Mar 5, 2017

calebmer commented Mar 26, 2017