Set function COSTs #104
Costs are based on a set of 5M polygons from a UK OSM extract, based on comparing runtimes with those of bigint addition, subtraction, modulo, division, squaring, and square root. The tests were done with PostGIS 2.1.8, so 2.2.0 and later functions are not yet costed. Also adds comments where costs are guessed. ref https://lists.osgeo.org/pipermail/postgis-devel/2016-May/025808.html
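As a rough illustration of the approach described above (a sketch under assumptions, not the script used for this PR), a COST can be estimated as the per-row runtime of a function divided by the per-row runtime of a cheap baseline operator. `estimate_cost` is a hypothetical helper and the timings are made-up placeholders, not measurements.

```python
def estimate_cost(func_seconds, baseline_seconds, rows):
    """Estimate a COST as the ratio of per-row runtimes.

    func_seconds:     total runtime of the function under test over `rows` rows
    baseline_seconds: total runtime of a cheap baseline operator
                      (e.g. bigint addition) over the same number of rows
    """
    if rows <= 0 or baseline_seconds <= 0:
        raise ValueError("rows and baseline_seconds must be positive")
    per_row_func = func_seconds / rows
    per_row_base = baseline_seconds / rows
    return per_row_func / per_row_base


# Hypothetical timings for illustration only, NOT values from this PR.
ROWS = 5_000_000
cost = estimate_cost(func_seconds=10.0, baseline_seconds=0.5, rows=ROWS)
print(round(cost))
```

In practice the fixed per-row overhead of the scan itself would probably need to be subtracted from both timings before taking the ratio; the sketch leaves that out.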
I'm going through these now. Some look suspicious, e.g. ST_AsEWKT(geometry): does it really cost 750? If it does, that would suggest a bug or some really old code on our side. Also ST_PointOnSurface: does it really cost 2500? I would expect that to be less than ST_Intersection.
I think I got all of them except for some of the "guessed" comments you put in for some of the existing functions. I went through them one by one because I'm inept at dealing with conflicts with @pramsey's latest commit. There were some that I thought should have costs on them that I didn't see, but I'll revisit those later. I also changed ST_DistanceSpheroid and ST_Distance, as I'm pretty sure ST_DistanceSpheroid is 5-20 times costlier than ST_Distance, though we may still want to up both of them. Details here: https://trac.osgeo.org/postgis/ticket/3557
I just double-checked with a different dataset. I noticed that I missed ST_AsText somehow, but it has a similar cost.
I haven't set the ST_Intersection cost; two-geometry functions are another round of testing. I expect it will be expensive.
Those are two-geometry functions; I didn't touch their costs.
ST_PointOnSurface internally uses ST_Intersection, intersecting
On the other hand ST_PointOnSurface is guaranteed to always (except
Should COST define min, max or avg cost, btw?
I don't think it really matters, though I think average would be best. E.g. given a condition on ST_FuncA and ST_FuncB, if ST_FuncA is costlier the planner will process ST_FuncB first and skip ST_FuncA whenever ST_FuncB already decides the result. For parallelism it matters inasmuch as it affects whether parallelization kicks in or not. The use of costs there is, as pramsey noted, pretty flaky, so a general rule of thumb is the best we can do, along with making sure we don't give a function a higher cost than one that is actually more expensive. That said, I think the relationship functions are the most important to get right (at least hierarchy-wise).
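A toy model of the ordering described above, as a sketch only: it sorts AND-ed conditions by cost and stops at the first one that fails. The real planner weighs more than this, and `Predicate` and `evaluate_and` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Predicate:
    name: str
    cost: float                      # the function's COST setting
    test: Callable[[object], bool]   # the actual check


def evaluate_and(predicates: List[Predicate], row: object) -> Tuple[bool, List[str]]:
    """Evaluate an AND of predicates cheapest-first, stopping at the first False.

    Returns the overall result and the names of the predicates actually run.
    """
    evaluated = []
    for pred in sorted(predicates, key=lambda p: p.cost):
        evaluated.append(pred.name)
        if not pred.test(row):
            return False, evaluated
    return True, evaluated


# Illustrative costs only; ST_FuncA and ST_FuncB are the placeholder names
# used in the comment above.
preds = [
    Predicate("ST_FuncA", 1000, lambda row: True),
    Predicate("ST_FuncB", 10, lambda row: False),
]
result, order = evaluate_and(preds, row=None)
print(result, order)
```

Here the cheaper ST_FuncB runs first and fails, so the costlier ST_FuncA is never called. This is why the relative ordering of costs matters more than their absolute values.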
@pnorman how exactly was the cost measured? The mail thread suggests it is in units of cpu_operator_cost, but cpu_operator_cost itself is 0.0025 * seq_page_cost in the default config. Given the recent move to revert this change back to 1, was this multiplier taken into account? Did you try using a more complicated operator than bigint addition as the base?
Yes. This does not matter, as cpu_operator_cost cancels out in the math.
I have not been involved in any revert.
See the mail thread for the operators/functions used.
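One way to see why the multiplier cancels, assuming the planner charges each call its COST times cpu_operator_cost and that the baseline operator has a COST of 1. The symbols $t_f$, $t_{op}$ and $C_f$ (measured runtimes of the function and the baseline operator, and the function's COST) are introduced here for illustration:

```latex
\frac{t_f}{t_{op}}
  \approx \frac{C_f \cdot \mathrm{cpu\_operator\_cost}}{1 \cdot \mathrm{cpu\_operator\_cost}}
  = C_f
```

Under that assumption, the measured runtime ratio gives the COST directly, whatever value cpu_operator_cost is set to.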
Work in progress, not yet complete
ref https://lists.osgeo.org/pipermail/postgis-devel/2016-May/025808.html
Notes from implementing
_postgis_deprecate has a cost of 100. This means deprecating a function like ST_length_spheroid makes it 20% slower, and a quick function like ST_force_2d becomes 20x slower.
postgis.sql.in
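The slowdown figures in the note about `_postgis_deprecate` can be reproduced with simple arithmetic. This is a sketch: the base costs of 500 and 5 are back-calculated assumptions chosen to match the quoted "20%" and "20x", not values taken from the PR, and `slowdown_factor` is a hypothetical helper.

```python
DEPRECATE_COST = 100  # cost of _postgis_deprecate, as stated in the note


def slowdown_factor(base_cost, wrapper_cost=DEPRECATE_COST):
    """Ratio of (function + deprecation wrapper) cost to the function alone."""
    if base_cost <= 0:
        raise ValueError("base_cost must be positive")
    return (base_cost + wrapper_cost) / base_cost


# Assumed base costs, for illustration only.
expensive = slowdown_factor(500)  # a costly function, e.g. one like ST_length_spheroid
cheap = slowdown_factor(5)        # a quick function, e.g. one like ST_force_2d

print(f"expensive function: {expensive:.2f}x of original cost")
print(f"cheap function: {cheap:.2f}x of original cost")
```

With these assumed base costs the expensive function ends up at 1.2 times its original cost (20% slower), while the cheap one ends up at 21 times, i.e. 20 times its original cost added on top.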