-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
warn when postgresql distinct columns used on other dialect #4002
Comments
Michael Bayer (@zzzeek) wrote: Implement get_unique_constraints, get_check_constraints for Oracle Pull-request: zzzeek/sqlalchemy#326 Fixes: #4002 → f8b4f72 |
Changes by Michael Bayer (@zzzeek):
|
Changes by Michael Bayer (@zzzeek):
|
Changes by Michael Bayer (@zzzeek):
|
this just came up again in #4687. it is strange that we just ignore the ON here. We should generate some kind of token and produce SQL that won't run on non-PG databases, or make this a StrSQLCompiler thing, e.g. it renders "ON ", and default compiler raises. temporarily making this 1.4 since it is bugging me but I might blow this out again. |
I'll try this one. The plan is to just warn or to produce non runnable sql? Then if we want it to warn we would have a default implementation that warns if distinct on is used, and the postgres dialect would overrid this method. |
so this has been like this for many years, so what we do is to not disrupt whoever might be relying upon this by just emitting a warning. If it were day one then maybe it should have raised but there's no urgency to make it do that right now.
It's not that trivial to change some of the big "overridden" methods like these because we have to worry about all the external dialects that might be using this. Also, while the compiler can always choose to break a coarse grained template method into a series of smaller ones, and the downstream consuming classes can change gradually, we can't go back in the other direction, that is, if some step uses methods A,B and C, we can't change the compiler to combine steps A, B, C into a single method D without still calling A, B, C because the consuming classes still use that. as always, I'm cautious about making methods more fine grained because they are the primary source of performance overhead in cPython. However, looking at how this works right now, it's hard to see other avenues that are not worse. It seems like there should be a "render_distinct()" method and then a "render_distinct_on()" method , each are called based on if "select._distinct" / "select._distinct_on" is not None, and the latter one raises unless its on Postgresql. but this still does concern me. The SQL Server and Sybase dialects use get_select_precolumns also to render the "TOP" keyword, and it's not clear if other dialects have a more complex interaction going on in terms of the "get_select_precolumns()" thing. if we leave get_select_precolumns() in place and then have that method call onto "get_distinct()" / "get_distinct_on()", this retains backwards compatibility. It also adds the most method calls. but we'd have to say that these methods only occur if DISTINCT / DISTINCT ON is present which is not the majority of queries so it probably is OK. the other way to go would be having a dialect level flag, "supports_distinct_on", and the compiler just looks at that in conjunction with "select._distinct_on" being present.
well whichver way, the addition of the new methods is how we'd get there so we start with the most minimal approach which is the warning. |
I agree regarding the method.
I was suggesting it because with the current implementation, if the plan was to emit some sql that would not run in other db, it was the best option. That being said, if the plan is to just warn or raise while compiling, then I think a dialect flag is better. |
Oh I think my comment up there maybe was trying to say there'd be a DistinctOn object or something like that. that might work too, but I don't care either way.
what bothers me about that is that a dialect has to not only implement the rendering for "distinct on" but also has to have the flag that's just for this one little check, and also that this method is inconsistent vs. how we handle other things that aren't supported, like returning, CTEs, multiple table UPDATE, DELETE FROM, sequence increments, empty set expressions (I just went through compiler.py to find all the examples of raising for not-supported SQL elements). having a second way to do something that doesn't match how it's done for half a dozen other similar things makes new dialects more difficult to build and adds complexity. |
Makes sense. How do we solve it? |
I think we make get_select_precolumns like this:
then we change for example SQL Server to be like this:
that is, we change all the dialects to make use of compiler's version of this method. the only dialect doing something weird here is MySQL with some ancient deprecated thing that isn't emitting a warning (that's a bug). so what is not working here, but I'm willing to shrug it off, is if a downstream dialect has a get_select_precolumns() method, and they don't change theirs, they will not get the warning here until they change to use the method in the slightly newer way. I'm kind of OK with that. |
ok. why the two new methods ( |
OK.... so get_select_precolumns raises if distinct_on is present ? ok i think part of it was it wasn't clear what "distinct" could mean and you'll notice MySQL dialect has an alternative interpretation of it, which is ancient and we should take out, but it would have to emit a deprecation warning first |
Ok. I'll raise a warning there, and keep the function for now. It should not be hard to do if isinstance(select._distinct, util.string_types):
# warn here
return select._distinct.upper() + " "
return SQLCompiler.get_select_precolumns(...) Also I've noticed that you use |
I have no idea, that is code from 2007 at the latest. super() probably didn't work that intuitively in python 2.3 or something. |
ok. I'll keep it. if ain't broke etc |
I would change it probably because it's non-idiomatic as it is :) |
maybe a general style update would be better done in its own issue though |
you mean like all the places where super() should be used but isnt? OK. with "major version" changes like these, as opposed to point-release single issue fixes, I tend to take more liberties with getting things cleaned up as I go but it is not critical either way. |
I meant in general. Expecially since v2 is py3 only. A lot of code could be cleaned up. another example is |
oh yes the "class Foo(object):" thing will be changed using automated tools for the py3-only conversion. that's a consistent pattern. this thing though is not really consistent, and there are some places where super() is not used because I needed to control exactly what superclass is being hit, so that will not be as easy to get using an automated process. I think when we find something like this FooClass.do_thing(self, ..) call that isn't idiomatic anywhere and we are changing the function anyway, we just fix that part for now. |
Federico Caselli has proposed a fix for this issue in the master branch: Disallow |
…h-4002 Change-Id: I5702fc90e901c6b5fd8fd9893fac44d169e52f74
…h-4002 Change-Id: Icb028babf2c747660b379bd96bbc43817e205054
I forgot that I still had to change the implementation from raise to warn. |
* Update nova from branch 'master' - Merge "remove DISTINCT ON SQL instruction that does nothing on MySQL" - remove DISTINCT ON SQL instruction that does nothing on MySQL The SQLAlchemy ORM call ``.distinct('host')`` indicates that an expression such as "DISTINCT ON host" should be rendered. However, this syntax is only available on PostgreSQL. When run on any other backend, the expression delivers SQL "DISTINCT" and the additional expressions are ignored. An upcoming version of SQLAlchemy will begin to warn when "DISTINCT ON" is called for on a backend that does not support it, as the semantics of "DISTINCT ON <col>" are quite different from "DISTINCT". As Openstack targets MySQL primarily with SQLite used for unit tests, it should already be guaranteed that this query only needs to be rendering "DISTINCT" in order to pass current tests and use cases, since that's all that's being rendered. Change-Id: I267c6f772d514b442c8c3356c2babc1fe98a8b97 References: sqlalchemy/sqlalchemy#4002
The SQLAlchemy ORM call ``.distinct('host')`` indicates that an expression such as "DISTINCT ON host" should be rendered. However, this syntax is only available on PostgreSQL. When run on any other backend, the expression delivers SQL "DISTINCT" and the additional expressions are ignored. An upcoming version of SQLAlchemy will begin to warn when "DISTINCT ON" is called for on a backend that does not support it, as the semantics of "DISTINCT ON <col>" are quite different from "DISTINCT". As Openstack targets MySQL primarily with SQLite used for unit tests, it should already be guaranteed that this query only needs to be rendering "DISTINCT" in order to pass current tests and use cases, since that's all that's being rendered. Change-Id: I267c6f772d514b442c8c3356c2babc1fe98a8b97 References: sqlalchemy/sqlalchemy#4002
…h-4002 Change-Id: I1093cd696f2d94713a12d828749e16ae5ba61d84
…h-4002 Change-Id: Ie32bbd14edc60637b9d221aa978b99103c2fe7cd
…h-4002 Change-Id: I89608ab4eaeaff122576ca74f952efceb7fa7e30
…h-4002 Change-Id: I300de0d48620f47862fe61be964ad25c7d51d5ce
…h-4002 Change-Id: I52dd35818da96cfac1561209ded1b1970e03e110
it's not fixed on subquery. when we are calling a query with |
Could you provide an example? I'm not sure I understand the problem |
Hello @CaselIT , I also faced with this issue.
And when I run query with .subquery() I have next result
|
Could you open a new issue? That seems a bug that's not related to this issue that just added a warning. Thanks |
Thanks |
This function doesn't need to do anything special that the base function doesn't already do. The only extra thing it's doing is some special handling if self._distinct is a string. This seems to be some ancient, deprecated behavior in the MySQL dialect which was probably copied in unkowningly. Because we override this function, we don't emit the warning that we are silently ignoring DISTINCT ON, which the DistinctOnTest expects. Since we don't have any existing applications, we don't need any compatibility deprecation like MySQL. See SQLAlchemy commit 07d6d211f23f1d9d1d69fd54e8054bccd515bc8c and sqlalchemy/sqlalchemy#4002 for more details about the upstream change.
This function doesn't need to do anything special that the base function doesn't already do. The only extra thing it's doing is some special handling if self._distinct is a string. This seems to be some ancient, deprecated behavior in the MySQL dialect which was probably copied in unkowningly. Because we override this function, we don't emit the warning that we are silently ignoring DISTINCT ON, which the DistinctOnTest expects. Since we don't have any existing applications, we don't need any compatibility deprecation like MySQL. See SQLAlchemy commit 07d6d211f23f1d9d1d69fd54e8054bccd515bc8c and sqlalchemy/sqlalchemy#4002 for more details about the upstream change.
Migrated issue, originally created by Michael Bayer (@zzzeek)
e.g. query.distinct()
also add doc note.
The text was updated successfully, but these errors were encountered: