Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only include columns in insert statements if needed #367

Merged
merged 6 commits into from
Mar 5, 2017

Conversation

MaienM
Copy link

@MaienM MaienM commented Feb 21, 2017

Columns that have no specific value set (are undefined) previously were explicitly set to default. However, this causes problems if you do not have permission to set a column, whereas if these columns are not specified at all it works fine, and they will still ends up with their default value.

This changes that, by only including columns in the insert statement if they have a value set. If the insert statement is for multiple rows, and not all rows have the same columns, all columns that are present in at
least one row will be included. Rows that don't have that column specified will revert to the old behaviour, being set to default explicitly.

An example: say you have a table foo with the columns name and secret. You're allowed to read all of them, but you're only allowed to set name. Previously, if you call the mutation createFoo with only a value for name, the query will look something like:

INSERT INTO "foo" ("name", "secret") VALUES ("test", default);

This will cause an error, as you're not allowed to set the column secret. After this change, the query will look something like this:

INSERT INTO "foo" ("name") VALUES ("test");

This will work fine.

Columns that have no specific value set (are undefined) previously
were excplicitly set to `default`. However, this causes problems if you
do not have permission to set this column, whereas not specifing the
column at all works fine (and still ends up with the default value).

This changes that, by only including columns in the insert statement if
they have a value set. If the insert statement is for multiple rows, and
not all rows have the same columns, all columns that are present in at
least one row will be included. Rows that don't have that column
specified will revert to the old behavior, being set to `default`
explicitly.
@benjie
Copy link
Member

benjie commented Feb 21, 2017

This looks good to me 👍 Could you add a test that fails on master and passes on this branch for this code please, to avoid regressions?

@MaienM
Copy link
Author

MaienM commented Feb 21, 2017

Sure, I'll get on that.

Copy link
Collaborator

@calebmer calebmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking pretty good 👍

I’m glad you submit this PR, because I was actually thinking about this in a discussion I was having 😊. This is required for our support for column GRANTs in inserts, right?

Just one comment. I’m not super excited about the time complexity of iterating over all the values and then all of the values fields, however I’m fine with merging it for now with the intention of optimizing when it becomes a problem.

Anyway, I’ll be happy to merge this.

(fun fact: as I was writing this review I saw you added tests 😉)

map.forEach((value, key) => {
if (typeof value !== 'undefined') {
fields.set(key, this.type.fields.get(key))
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My one concern is that this is O(vn) where v is the number of values and n is the maximum number of fields in one of the values. Is there a more efficient algorithm for this? Can we do some other work while iterating? Think about these questions only for a bit, if you can’t think of a good answer I’m comfortable with merging this as an initial implementation.

Ideally I think we could take a similar approach here as we are taking in #342, although this is a good first implementation. If you could put a TODO comment here that reads something like: TODO: Use field information from GraphQL instead of iterating through values we can address it when it becomes a problem 👍

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't really considered that yet, but I can see how that could become an issue. I unfortunately do not see a way to avoid that with this approach.

Another approach to this problem would be to detect which columns can be inserted on through introspection instead. This only has to be done at startup/watch updates, and does not involve extra looping, so this would effectively be O(1). We would go back to sometimes specifying default for columns that we could leave out completely, but as long as you have the rights to set those columns this is fine. I am not (yet) familiar enough with the project to achieve this though, unfortunately.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another benefit to detecting these columns through introspection would be that this could also be reflected in the GraphQL schema/docs. We could simply leave columns that cannot be set out of the Input objects, thereby also preventing this from being a runtime issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work for my schema; we use multiple roles and those roles have different column access grants. Unless you enumerated all of the roles with access, and applied those learnings at run-time; but that seems complex.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I failed to consider using multiple roles. That will indeed complicate things significantly, to the point where it may not be worth it at all. I think @calebmer's proposal to, if the performance of this code ever becomes an issue, use the field information from GraphQL is probably best then. For now I'll just add a TODO comment about it.

collection1.create(context, value3),
])

const query = client.query.mock.calls[0][0].text
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you either check the entire text of the query (vs. using toMatch) with either an inline string literal, or a snapshot? expect(client.query.mock.calls[0][0].text).toMatchSnapshot(). Snapshot is probably the easiest of course. In the future I’d like to be snapshotting more of the SQL queries we generate.

Could you also add a sanity check here: expect(client.query.mock.calls.length).toBe(1) just so that we can be sure that no other queries are being run?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had that in there initially for that exact reason, but I seem to have removed it accidentally. Whoops! Will add it back.

@MaienM
Copy link
Author

MaienM commented Feb 21, 2017

This is indeed needed for INSERT statements on tables with column-level GRANTS. This is what inspired me to do this, as I have a few columns that are managed on the database level that I do not want editable (mostly created_at and updated_at timestamps).

MaienM and others added 3 commits February 21, 2017 21:17
Also added a sanity check to make sure exactly one (1) query has been
executed
@calebmer calebmer merged commit 61fe455 into graphile:master Mar 5, 2017
@benjie
Copy link
Member

benjie commented Mar 5, 2017

Awesome, this resolves an issue I've had on another project also (again: updatedAt and createdAt). Thanks! 👍

@MaienM MaienM deleted the insert-change branch March 5, 2017 17:52
@calebmer
Copy link
Collaborator

Released in 3.1.0 🎉

Belline pushed a commit to Belline/postgraphql that referenced this pull request Dec 18, 2017
…raphile#367)

* Only include required columns in insert statments

Columns that have no specific value set (are undefined) previously
were excplicitly set to `default`. However, this causes problems if you
do not have permission to set this column, whereas not specifing the
column at all works fine (and still ends up with the default value).

This changes that, by only including columns in the insert statement if
they have a value set. If the insert statement is for multiple rows, and
not all rows have the same columns, all columns that are present in at
least one row will be included. Rows that don't have that column
specified will revert to the old behavior, being set to `default`
explicitly.

* Added a test case

* Use snapshots for the test

Also added a sanity check to make sure exactly one (1) query has been
executed

* Added a TODO comment about optimisation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants