Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refs #26112 -- Fixed aggregate GIS test on Oracle #6041

Closed
wants to merge 1 commit into from

Conversation

shaib
Copy link
Member

@shaib shaib commented Jan 25, 2016

This fixes the test -- but is it the right fix?

Is it OK in this case that Sum still aggregates over (single) records? This aggregation necessitates the unintuitive and expensive (causes a database round-trip per row later) defer() call.

Should the Area function be modified to return a FloatField on Oracle?

@shaib
Copy link
Member Author

shaib commented Jan 25, 2016

buildbot, test on oracle.

@shaib
Copy link
Member Author

shaib commented Jan 25, 2016

3rd time was a charm.

@yellowcap
Copy link
Contributor

Its true that the sum is not necessary in the test case, as there is no additional grouping. But I came across this bug when trying to do sum over grouped values, using something along the lines of

 Country.objects.all().values('continent').annotate(area_sum=Sum(Area('mpoly')))

To cover these situations, the Area function should work in combination with Sum using its standard output field. So I think the test should pass as it was and the Area or Sum functions need to be fixed.

I am not familiar with oracle, does the problem arise because Sum is called on an expression that only returns one item?

@shaib
Copy link
Member Author

shaib commented Jan 26, 2016

There are two problems: The easy one is that on Oracle, unlike other backends, Area() returns a Decimal instead of a float. I thought a little more about it, and I think we should not change that -- Oracle allows higher precision, it is not our place to bring it down to a lower common denominator. So for that, I now think that the solution in the PR is correct.

Per the Sum, I was for some reason under the impression that Area() here returns a value for each of the polygons in mpoly and it needs Sum to aggregate them (did I mention it was late at night?). Summing over one value is not an issue -- the issue is grouping over mpoly (when you call annotate() with an aggregate function like Sum, Django adds grouping over all the non-aggregate columns; the SQL standard requires that every column selected from a SELECT with a GROUP BY is either an aggregate or in the GROUP BY). Assuming now that Area() returns one value for the whole polygon array, again, I think the PR's solution is correct (we have similar cases involving grouping over text fields, where Oracle presents the same problem).

@yellowcap
Copy link
Contributor

Your intuition about Area() was correct, it does return one value per polygon in mpoly, and the Sum function aggregates them. The test is probably not explicit enough to show that, as it does the aggregate only over one polygon, which is a bit artificial.

For the test to be more explicit, one could also do the following:

    for c in Country.objects.all():
        CountryWebMercator.objects.create(name='First', mpoly=c.mpoly)
        CountryWebMercator.objects.create(name='Second', mpoly=c.mpoly)
    # This should not crash
    CountryWebMercator.objects.values('name').annotate(area_sum=Sum(Area('mpoly')))

I suppose that the query above should also work in oracle as is, without having to override the output_field? If that's the case and it fails the Sum or Area should be changed, including keeping track of Decimals.

@shaib sorry for insisting, I guess I still don't understand why the query fails on oracle, hence my question.

@shaib
Copy link
Member Author

shaib commented Jan 26, 2016

@yellowcap no need to apologize, asking questions is good.

mpoly is a MultiPolygonField. It can hold multiple polygons in one row. My intuition yesterday was that in this case, Area returns an array in one row, and Sum would some over that. I don't believe that to be the case anymore; Area returns one number (per database row), even if there are many polygons in mpoly on that row.

W.r.t. Decimals: Your query would indeed work on Oracle as it is, without overriding the output_field (and mine did when I only added defer('mpoly')). But then, the test fails on computing result - c.mpoly.area, because c.mpoly.area is a float and not a Decimal, and these types don't mix.

Actually, when I wrote my last comment, I didn't realize the float in question came from the area property of the field value. This makes me reconsider; I would appreciate the opinion of people who are more familiar than me with the GIS functionality -- is there more value in exposing the full abilities of Oracle here, or in making Area() compatible with MultiPolygonField.area?

@timgraham
Copy link
Member

@jtiai, can you advise on this?

@jtiai
Copy link

jtiai commented Feb 5, 2016

At least in Oracle area from multipolygon should result a single, combined area from all polygons in geometry. As a resulting values I think it's precise enough to return floats. In extreme cases when very big polygons would be used it might cause losing a precision but I guess in normal activity that won't be a problem.

@timgraham timgraham changed the title Fixed #26112 -- Fixed aggregate GIS test on Oracle Refs #26112 -- Fixed aggregate GIS test on Oracle Feb 5, 2016
@shaib shaib force-pushed the ticket_26112 branch 3 times, most recently from 14fb955 to 8acace5 Compare February 6, 2016 18:22
@shaib
Copy link
Member Author

shaib commented Feb 6, 2016

@claudep @jtiai following advice from @jtiai I changed the AreaField conversion so it turns Decimals into floats -- this makes the areas returned from database "play well" with areas calculated on the Python side. Do you agree? Do you think it makes sense to do the same to DistanceField?

While at it, shouldn't we document these field types?

@shaib
Copy link
Member Author

shaib commented Feb 6, 2016

buildbot, test on Oracle.

@shaib
Copy link
Member Author

shaib commented Feb 6, 2016

There is no failure in the default test, Jenkins got confused and thought the py2.7/mysql-gis test was cancelled but in fact it was successful.

@@ -505,6 +509,10 @@ Miscellaneous
argument of the ``render_options()`` method is also removed, making
``selected_choices`` the first argument.

* On Oracle/GIS, the :class:`django.contrib.gis.db.models.functions.Area`
aggregate function used to return a ``decimal.Decimal`` value (of square
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"used to" -> "now returns float instead of decimal.Decimal."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Changed to:
now returns float instead of decimal.Decimal (it is still wrapped in a measure of square meters).

@timgraham
Copy link
Member

Does the test change sufficiently cover the change? At least a comment in the test seems warranted.

@shaib
Copy link
Member Author

shaib commented Feb 8, 2016

Comment added.

Made sure the test doesn't try to aggregate over MultiPolygonField and made
AreaField turn decimals into floats on the way from the DB.

Thanks Daniel Wiesmann, Jani Tiainen and Tim Graham  for review and discussion.
@timgraham
Copy link
Member

merged in bb51dc9, thanks!

@timgraham timgraham closed this Feb 9, 2016
@shaib
Copy link
Member Author

shaib commented Feb 9, 2016

Thanks for committing, I was sort of waiting for input from the geodjango gurus about making DistacneField also return float instead of Decimal.

@claudep
Copy link
Member

claudep commented Feb 9, 2016

Sorry Shai for being unresponsive on this one. I admit that Oracle stuff doesn't thrill me. Making DistanceField a float might add some consistency to the backend.

@shaib
Copy link
Member Author

shaib commented Feb 9, 2016

@claudep your time is your own and it's certainly your choice where to spend it, no need to apologize there. Thanks for taking a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants