SQL queries must escape table/column names using reserved keywords #936

Closed
cowwoc opened this Issue Sep 17, 2014 · 24 comments

Comments

Projects
None yet
3 participants
@cowwoc
Contributor

cowwoc commented Sep 17, 2014

Repro steps:

  1. Create a table called "user" under Postgresql
  2. Attempt to issue queries against QUser. Query will fail with syntax error.

Expected behavior: QueryDSL should escape any table or column name if they are reserved.

See the following resources for the list of reserved keywords:

https://github.com/jgornick/reservedwordsearch/tree/master/words
https://www.drupal.org/node/141051

The first link is better in that you can drop support for older databases.

@Shredder121

This comment has been minimized.

Show comment
Hide comment
@Shredder121

Shredder121 Sep 17, 2014

Member

I think the SQLTemplates#requiresQuotes() should be extended to include not only special characters, but also the reserved keywords.

this on one hand will impose a performance penalty, but if we would solve it with let's say a CharMatcher for the characters and a HashSet for the keywords, the penalty is almost negligible.

What do you think of this idea? Do you think a hierarchical approach is a good one, so that subclasses can register custom keywords (if there are any?) or do you foresee any problems?

Member

Shredder121 commented Sep 17, 2014

I think the SQLTemplates#requiresQuotes() should be extended to include not only special characters, but also the reserved keywords.

this on one hand will impose a performance penalty, but if we would solve it with let's say a CharMatcher for the characters and a HashSet for the keywords, the penalty is almost negligible.

What do you think of this idea? Do you think a hierarchical approach is a good one, so that subclasses can register custom keywords (if there are any?) or do you foresee any problems?

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 17, 2014

Contributor

@Shredder121 Agreed on the negligible performance impact, but who is the question directed at? :)

Contributor

cowwoc commented Sep 17, 2014

@Shredder121 Agreed on the negligible performance impact, but who is the question directed at? :)

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 17, 2014

Contributor

I also think that the best place to define keywords is in SQLTemplates subclasses. For example, user is a reserved word under Postgresql but not under H2 (this is in spite of the fact that ANSI 1999 defines it as a reserved word).

Contributor

cowwoc commented Sep 17, 2014

I also think that the best place to define keywords is in SQLTemplates subclasses. For example, user is a reserved word under Postgresql but not under H2 (this is in spite of the fact that ANSI 1999 defines it as a reserved word).

@Shredder121

This comment has been minimized.

Show comment
Hide comment
@Shredder121

Shredder121 Sep 17, 2014

Member

The question is not directed at anyone in particular, but more in a general direction towards the community, and of course @timowest.

But indeed, I was also thinking of database vendors/implementors that have different escaping rules, and you have a point.
I think that leaving it to the subclasses is fine, and encourages isolation (and it will be easier to add new keywords).

Member

Shredder121 commented Sep 17, 2014

The question is not directed at anyone in particular, but more in a general direction towards the community, and of course @timowest.

But indeed, I was also thinking of database vendors/implementors that have different escaping rules, and you have a point.
I think that leaving it to the subclasses is fine, and encourages isolation (and it will be easier to add new keywords).

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 18, 2014

Member

@Shredder121 Sounds good. Quoting depends also if the name appears isolated or after a dot. At least for MySQL, that could also be taken into account.

Member

timowest commented Sep 18, 2014

@Shredder121 Sounds good. Quoting depends also if the name appears isolated or after a dot. At least for MySQL, that could also be taken into account.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 19, 2014

Contributor

Qualified names (names after a dot) appear to work the same under PostgreSQL although I couldn't find any reference to this behavior in their documentation. MySQL mentions this explicitly at http://dev.mysql.com/doc/refman/5.7/en/reserved-words.html:

A word that follows a period in a qualified name must be an identifier, so it need not be quoted even if it is reserved

I've recently converted my application from H2 to PostgreSQL and now the code won't run because of this issue. I've temporarily renamed the table from User to UserAccount but I'm nearing the point where I'd like to push these changes to production. Ideally, I'd like to remove this workaround before doing so.

Do you think you'll get around to this issue within the next week, or should I try taking a crack at it?

Contributor

cowwoc commented Sep 19, 2014

Qualified names (names after a dot) appear to work the same under PostgreSQL although I couldn't find any reference to this behavior in their documentation. MySQL mentions this explicitly at http://dev.mysql.com/doc/refman/5.7/en/reserved-words.html:

A word that follows a period in a qualified name must be an identifier, so it need not be quoted even if it is reserved

I've recently converted my application from H2 to PostgreSQL and now the code won't run because of this issue. I've temporarily renamed the table from User to UserAccount but I'm nearing the point where I'd like to push these changes to production. Ideally, I'd like to remove this workaround before doing so.

Do you think you'll get around to this issue within the next week, or should I try taking a crack at it?

@Shredder121

This comment has been minimized.

Show comment
Hide comment
@Shredder121

Shredder121 Sep 19, 2014

Member

The quoting of reserved keywords won't be a problem, the only problem I don't know the solution to yet is the fact that quoting isn't always necessary.
The SQLTemplates doesn't know about the state the serializer is in, so it can't decide whether or not to quote the reserved keyword.

Member

Shredder121 commented Sep 19, 2014

The quoting of reserved keywords won't be a problem, the only problem I don't know the solution to yet is the fact that quoting isn't always necessary.
The SQLTemplates doesn't know about the state the serializer is in, so it can't decide whether or not to quote the reserved keyword.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 19, 2014

Contributor

@Shredder121 Do you have a concrete example? It sounds like you'd need to push the relevant state into SQLTemplates.quoteIdentifier() each time so it has the necessary information to make a decision.

Contributor

cowwoc commented Sep 19, 2014

@Shredder121 Do you have a concrete example? It sounds like you'd need to push the relevant state into SQLTemplates.quoteIdentifier() each time so it has the necessary information to make a decision.

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 19, 2014

Member

quoteIdentifier should be maybe changed to take an extra argument that tells if a dot precedes the name.

@cowwoc Feel free to start a PR. There are some other build issues I have to fix first. Some trouble now with Maven repositories that are not available anymore.

Member

timowest commented Sep 19, 2014

quoteIdentifier should be maybe changed to take an extra argument that tells if a dot precedes the name.

@cowwoc Feel free to start a PR. There are some other build issues I have to fix first. Some trouble now with Maven repositories that are not available anymore.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 19, 2014

Contributor

@timowest I assume you're talking about CUBRID? I can't create a PR until I can build the SQL project, which I can't do until the CUBRID maven repository is replaced.

Please let me know when this is resolved and I'll take a second look.

Contributor

cowwoc commented Sep 19, 2014

@timowest I assume you're talking about CUBRID? I can't create a PR until I can build the SQL project, which I can't do until the CUBRID maven repository is replaced.

Please let me know when this is resolved and I'll take a second look.

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 27, 2014

Member

Maybe the optimization of handling names after a dot could be postponed. The set of reserved words could be maybe passed via constructors argument to SQLTemplates and the subclasses could provide their own variants.

Sorry for sketching, if someone is already working on something. Just thinking out loud.

Member

timowest commented Sep 27, 2014

Maybe the optimization of handling names after a dot could be postponed. The set of reserved words could be maybe passed via constructors argument to SQLTemplates and the subclasses could provide their own variants.

Sorry for sketching, if someone is already working on something. Just thinking out loud.

@Shredder121

This comment has been minimized.

Show comment
Hide comment
@Shredder121

Shredder121 Sep 27, 2014

Member

I was actually working on something yes.
It's not bad to think out loud, I think that the reserved words as constructor arguments can work.

I also think that the handling of identifiers after a dot can be postponed.

Member

Shredder121 commented Sep 27, 2014

I was actually working on something yes.
It's not bad to think out loud, I think that the reserved words as constructor arguments can work.

I also think that the handling of identifiers after a dot can be postponed.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 28, 2014

Contributor

I also think that the handling of identifiers after a dot can be postponed.

Are you proposing that in the meantime they get quoted even when not strictly necessary? So, the queries might look uglier than necessary but from a functional point of view there should be no difference?

Contributor

cowwoc commented Sep 28, 2014

I also think that the handling of identifiers after a dot can be postponed.

Are you proposing that in the meantime they get quoted even when not strictly necessary? So, the queries might look uglier than necessary but from a functional point of view there should be no difference?

@timowest

This comment has been minimized.

Show comment
Hide comment
@timowest

timowest Sep 28, 2014

Member

Are you proposing that in the meantime they get quoted even when not strictly necessary? So, the queries might look uglier than necessary but from a functional point of view there should be no difference?

Yes, that was my proposition. Correctness of the query has priority over beauty. My point was mostly that a PR that generally quotes keywords would be ok for me. But one that would take the dot into account would be of course better.

Member

timowest commented Sep 28, 2014

Are you proposing that in the meantime they get quoted even when not strictly necessary? So, the queries might look uglier than necessary but from a functional point of view there should be no difference?

Yes, that was my proposition. Correctness of the query has priority over beauty. My point was mostly that a PR that generally quotes keywords would be ok for me. But one that would take the dot into account would be of course better.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Sep 29, 2014

Contributor

Sounds good to me.

Contributor

cowwoc commented Sep 29, 2014

Sounds good to me.

@Shredder121

This comment has been minimized.

Show comment
Hide comment
@Shredder121

Shredder121 Oct 1, 2014

Member

I was wondering which reserved word set we would use for the SQLTemplates.
I added the SQL 2008 reserved words list, but maybe we want to use a combination of more standards?

Member

Shredder121 commented Oct 1, 2014

I was wondering which reserved word set we would use for the SQLTemplates.
I added the SQL 2008 reserved words list, but maybe we want to use a combination of more standards?

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@Shredder121

This comment has been minimized.

Show comment
Hide comment
@Shredder121

Shredder121 Oct 1, 2014

Member

Yes, thanks, I agree and that is also why I used that, but maybe we want an intersection of the last x standards? At least I think that using the correct SQLTemplates subclass is vital as end user, and we now have a big list of reserved words as it is, but maybe we want to support other reserved words as well?

Member

Shredder121 commented Oct 1, 2014

Yes, thanks, I agree and that is also why I used that, but maybe we want an intersection of the last x standards? At least I think that using the correct SQLTemplates subclass is vital as end user, and we now have a big list of reserved words as it is, but maybe we want to support other reserved words as well?

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 2, 2014

Contributor

@Shredder121 Sorry, I don't understand. What "other reserved words" are you referring to?

Contributor

cowwoc commented Oct 2, 2014

@Shredder121 Sorry, I don't understand. What "other reserved words" are you referring to?

@Shredder121

This comment has been minimized.

Show comment
Hide comment
@Shredder121

Shredder121 Oct 2, 2014

Member

I take my words back, I thought that there would be a lot of reserved words that wouldn't be supported if we only chose the SQL 2008 list of reserved words, but this is not true.
So nevermind that.

Anyway, could you maybe test if the changes from quote_reserved_keywords work for you already?

Member

Shredder121 commented Oct 2, 2014

I take my words back, I thought that there would be a lot of reserved words that wouldn't be supported if we only chose the SQL 2008 list of reserved words, but this is not true.
So nevermind that.

Anyway, could you maybe test if the changes from quote_reserved_keywords work for you already?

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 20, 2014

Contributor

I'm going to try testing this tonight. Assuming things go well, please consider merging this into the upcoming release.

Contributor

cowwoc commented Oct 20, 2014

I'm going to try testing this tonight. Assuming things go well, please consider merging this into the upcoming release.

@cowwoc

This comment has been minimized.

Show comment
Hide comment
@cowwoc

cowwoc Oct 21, 2014

Contributor

Tested. Looks good to me.

Contributor

cowwoc commented Oct 21, 2014

Tested. Looks good to me.

@Shredder121

This comment has been minimized.

Show comment
Hide comment
@Shredder121

Shredder121 Oct 21, 2014

Member

Yes, I think that the basic support is good to merge, so I made a pull request.

Member

Shredder121 commented Oct 21, 2014

Yes, I think that the basic support is good to merge, so I made a pull request.

@timowest timowest closed this in #1009 Oct 21, 2014

@timowest timowest added this to the 3.5.1 milestone Oct 21, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment