Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.
Sign uppeewee should not gen SQL it knows is invalid #964
Comments
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
coleifer
Jun 6, 2016
Owner
How would peewee do this in a performant, reliable manner?
Furthermore, if the error gets handled by the database, why is that so much worse than having it raised by Peewee?
|
How would peewee do this in a performant, reliable manner? Furthermore, if the error gets handled by the database, why is that so much worse than having it raised by Peewee? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
keredson
Jun 6, 2016
Contributor
How would peewee do this in a performant, reliable manner?
so in https://github.com/coleifer/peewee/blob/master/peewee.py#L1485 AliasMap offers a dictionary lookup which has a side effect of adding the object to the dict if it's not already there. this is useful for generating the aliases in the first place in the from/join clauses, but the where clause should not be triggering new table aliases. in my fix i added a second lookup method that didn't have the side effect, only called by _parse_field.
Furthermore, if the error gets handled by the database, why is that so much worse than having it raised by Peewee?
i'm not writing the SQL, peewee is. and peewee knows it's writing SQL that should never work. and errors like no such column: t2.name are opaque. in this simple example it's obvious, but outside of that it's not. Something like UnreferencedTableException: Table 'Person' isn't being selected from, therefore condition 'Person.name==X' isn't a valid constraint. is more useful.
I'm not saying this is some major make or break thing here. It's just a minor tweak to make it a little better.
so in https://github.com/coleifer/peewee/blob/master/peewee.py#L1485
i'm not writing the SQL, peewee is. and peewee knows it's writing SQL that should never work. and errors like I'm not saying this is some major make or break thing here. It's just a minor tweak to make it a little better. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
coleifer
Jun 6, 2016
Owner
but the where clause should not be triggering new table aliases
Thats a pretty good idea, my concern is if it actually holds true in all cases. If you had a correlated subquery in the where clause, for instance? Not sure but there may be other edge cases?
Thats a pretty good idea, my concern is if it actually holds true in all cases. If you had a correlated subquery in the where clause, for instance? Not sure but there may be other edge cases? |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
keredson
Jun 6, 2016
Contributor
i think so, but it's possible i don't understand the full scope of what _parse_field does. simple inner queries work at least:
Post.select().where(Post.title << Post.select(Post.title)).get()
('SELECT "t1"."id", "t1"."title", "t1"."author_id" FROM "post" AS t1 WHERE ("t1"."title" IN (SELECT "t2"."title" FROM "post" AS t2)) LIMIT 1 OFFSET 0', [])
as does:
post1 = Post.alias()
post2 = Post.alias()
post1.select().where(post1.title << post2.select(post2.title).where(post1.id==post2.id)).get()
('SELECT "t1"."id", "t1"."title", "t1"."author_id" FROM "post" AS t1 WHERE ("t1"."title" IN (SELECT "t2"."title" FROM "post" AS t2 WHERE ("t1"."id" = "t2"."id"))) LIMIT 1 OFFSET 0', [])
|
i think so, but it's possible i don't understand the full scope of what
as does:
|
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
keredson
Jun 6, 2016
Contributor
i found in from_() it was relying on the side effect from selecting the fields. but that was an easy fix.
also a bunch of unit test failures along the lines of:
AssertionError:
'SELECT 1 FROM (SELECT "t1"."alpha" FROM "alpha" AS t1 WHERE ("t1"."alpha" < ?) UNION SELECT "t2"."beta" FROM "beta" AS t2 WHERE ("t2"."beta" < ?) UNION SELECT "t4"."beta" FROM "beta" AS t4 WHERE ("t4"."beta" > ?)) AS cq'
!=
'SELECT 1 FROM (SELECT "t1"."alpha" FROM "alpha" AS t1 WHERE ("t1"."alpha" < ?) UNION SELECT "t2"."beta" FROM "beta" AS t2 WHERE ("t2"."beta" < ?) UNION SELECT "t3"."beta" FROM "beta" AS t3 WHERE ("t3"."beta" > ?)) AS cq'sigh..... :)
anyway, since you're not opposed i'll go through all these and make a dedicated PR.
|
i found in also a bunch of unit test failures along the lines of: AssertionError:
'SELECT 1 FROM (SELECT "t1"."alpha" FROM "alpha" AS t1 WHERE ("t1"."alpha" < ?) UNION SELECT "t2"."beta" FROM "beta" AS t2 WHERE ("t2"."beta" < ?) UNION SELECT "t4"."beta" FROM "beta" AS t4 WHERE ("t4"."beta" > ?)) AS cq'
!=
'SELECT 1 FROM (SELECT "t1"."alpha" FROM "alpha" AS t1 WHERE ("t1"."alpha" < ?) UNION SELECT "t2"."beta" FROM "beta" AS t2 WHERE ("t2"."beta" < ?) UNION SELECT "t3"."beta" FROM "beta" AS t3 WHERE ("t3"."beta" > ?)) AS cq'sigh..... :) anyway, since you're not opposed i'll go through all these and make a dedicated PR. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
coleifer
Jun 6, 2016
Owner
Hold off on fixing the tests .. maybe just a diff of the actual code changes? I'd like to think about it before you spend too much effort on a clean patch.
|
Hold off on fixing the tests .. maybe just a diff of the actual code changes? I'd like to think about it before you spend too much effort on a clean patch. |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
keredson
Jun 6, 2016
Contributor
can do.
derek@derek-hvst:~/projects/peewee2$ git diff
diff --git a/peewee.py b/peewee.py
index 070328d..1fb7144 100644
--- a/peewee.py
+++ b/peewee.py
@@ -1416,6 +1416,11 @@ class AliasMap(object):
self.add(obj)
return self._alias_map[obj]
+ def lookup(self, obj):
+ if obj in self._alias_map:
+ return self._alias_map[obj]
+ raise ProgrammingError('%s is not a part of this query' % obj)
+
def __contains__(self, obj):
return obj in self._alias_map
@@ -1566,7 +1571,7 @@ class QueryCompiler(object):
def _parse_field(self, node, alias_map, conv):
if alias_map:
sql = '.'.join((
- self.quote(alias_map[node.model_class]),
+ self.quote(alias_map.lookup(node.model_class)),
self.quote(node.db_column)))
else:
sql = self.quote(node.db_column)
@@ -1767,6 +1772,7 @@ class QueryCompiler(object):
if query._from is None:
clauses.append(model.as_entity().alias(alias_map[model]))
else:
+ [alias_map[x] for x in query._from] # seed the alias map
clauses.append(CommaClause(*query._from))
if query._windows is not None:|
can do. derek@derek-hvst:~/projects/peewee2$ git diff
diff --git a/peewee.py b/peewee.py
index 070328d..1fb7144 100644
--- a/peewee.py
+++ b/peewee.py
@@ -1416,6 +1416,11 @@ class AliasMap(object):
self.add(obj)
return self._alias_map[obj]
+ def lookup(self, obj):
+ if obj in self._alias_map:
+ return self._alias_map[obj]
+ raise ProgrammingError('%s is not a part of this query' % obj)
+
def __contains__(self, obj):
return obj in self._alias_map
@@ -1566,7 +1571,7 @@ class QueryCompiler(object):
def _parse_field(self, node, alias_map, conv):
if alias_map:
sql = '.'.join((
- self.quote(alias_map[node.model_class]),
+ self.quote(alias_map.lookup(node.model_class)),
self.quote(node.db_column)))
else:
sql = self.quote(node.db_column)
@@ -1767,6 +1772,7 @@ class QueryCompiler(object):
if query._from is None:
clauses.append(model.as_entity().alias(alias_map[model]))
else:
+ [alias_map[x] for x in query._from] # seed the alias map
clauses.append(CommaClause(*query._from))
if query._windows is not None: |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
coleifer
Jun 7, 2016
Owner
Ahh, I see what you were talking about -- I wasn't clear on how you were going to tell, in _parse_field(), that you were in a place where new aliases shouldn't be added. And the way you've worked around is to pre-seed the alias map with the sources in _from. If you were wanting to take this approach, you'd probably want to look also at _joins.
I'm dubious of this approach because, to be really sure, you'd need to recursively check _from, _joins, _where, _select, to find any possibly valid sources that could be referenced.
I think I'll pass on this, but if you want it in your implementation, you can subclass Database and override the compiler_class attribute with a QueryCompiler subclass that has the above logic.
|
Ahh, I see what you were talking about -- I wasn't clear on how you were going to tell, in I'm dubious of this approach because, to be really sure, you'd need to recursively check I think I'll pass on this, but if you want it in your implementation, you can subclass |
coleifer
closed this
Jun 7, 2016
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
keredson
Jun 7, 2016
Contributor
I'm dubious of this approach because, to be really sure, you'd need to recursively check _from, _joins, _where, _select, to find any possibly valid sources that could be referenced.
it would seem to be that _from and _join would be the only two places. _where and _select can't define new table aliases. inside the inner query they can obviously, but that would still be inside a from/join.
it would seem to be that |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
coleifer
Jun 7, 2016
Owner
I was just thinking something like:
Post.select(Post.content, Tag.select(fn.GROUP_CONCAT(' ', Tag.tag)).where(Tag.post == Post.id))|
I was just thinking something like: Post.select(Post.content, Tag.select(fn.GROUP_CONCAT(' ', Tag.tag)).where(Tag.post == Post.id)) |
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
keredson
Jun 7, 2016
Contributor
hmmm. looks ok:
SELECT "t1"."title", (SELECT GROUP_CONCAT(?, "t2"."tag") FROM "tag" AS t2 WHERE ("t2"."post_id" = "t1"."id")) FROM "post" AS t1'alias still gets set/added inside the from.
|
hmmm. looks ok: SELECT "t1"."title", (SELECT GROUP_CONCAT(?, "t2"."tag") FROM "tag" AS t2 WHERE ("t2"."post_id" = "t1"."id")) FROM "post" AS t1'alias still gets set/added inside the from. |
added a commit
to keredson/peewee
that referenced
this issue
Jun 7, 2016
This comment has been minimized.
Show comment
Hide comment
This comment has been minimized.
|
looks like this is going to be fixed via #983 (comment) |
keredson commentedJun 6, 2016
generates:
which is then executed by the database, which throws an error, as
"t2"."name"doesn't exist.(IMO) it should throw some sort of unreferenced table exception, rather than relying on the database to throw the error, since it already knows that it'll never work.
this code example is admittedly contrived (and obvious what the bug is), but for more complicated queries an error thrown earlier makes the API easier to use (as the error is more explicit about the root cause). and the check is cheap.
i have a fix for this locally. it's intertwined with some other stuff, but i'll be happy to pull it into a PR if it's agreed this is the correct behavior.