Costing #556

Closed
wants to merge 3 commits into from
Algunenano
Member

A big set of cost changes.

  • After running some benchmarks on a PG12 + PostGIS 3.0, 4 CPU, 16GB server and comparing them with PG11 + PostGIS 2.5 (CARTO forks in both, which included costs before they were introduced upstream), we've seen multiple benchmarks where performance degraded. The cause turned out to be that the cost of some functions has increased quite a bit, triggering PostgreSQL parallelism when it isn't the best option.

I've adjusted the functions involved and reduced _COST_LOW and _COST_MEDIUM to values that make small queries less prone to parallelism. I'm talking about queries in the 10-40 ms range, where a parallel plan isn't worth it.
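As a rough illustration of why a higher declared cost pushes the planner toward parallel plans: PostgreSQL charges each function call `procost * cpu_operator_cost`, so the estimated per-row cost grows linearly with the declared cost. The sketch below is a deliberately simplified model, not the planner's actual algorithm. It uses PostgreSQL's default settings (`cpu_operator_cost = 0.0025`, `parallel_setup_cost = 1000`, `parallel_tuple_cost = 0.1`); the row count, worker count and function costs are made-up values for illustration only, and the helper names are hypothetical.

```python
# Simplified, hypothetical model of serial vs. parallel plan cost.
# Constants are PostgreSQL defaults; everything else is an assumption.
CPU_OPERATOR_COST = 0.0025
PARALLEL_SETUP_COST = 1000.0
PARALLEL_TUPLE_COST = 0.1


def serial_cost(rows, procost):
    """Estimated cost of calling a function once per row in a serial plan."""
    return rows * procost * CPU_OPERATOR_COST


def parallel_cost(rows, procost, workers):
    """Assumes the function work splits evenly across workers, plus a fixed
    setup charge and a per-row charge for passing rows to the leader."""
    return (PARALLEL_SETUP_COST
            + rows * PARALLEL_TUPLE_COST
            + serial_cost(rows, procost) / workers)


def prefers_parallel(rows, procost, workers=2):
    """True when the simplified parallel estimate beats the serial one."""
    return parallel_cost(rows, procost, workers) < serial_cost(rows, procost)


if __name__ == "__main__":
    for procost in (100, 500, 1000, 10000):
        print(procost, prefers_parallel(rows=1000, procost=procost))
```

In this model, a 1000-row query only tips over to a parallel plan once the function cost is large enough to outweigh the fixed setup charge, which is the effect described above: an overestimated function cost makes small queries look worth parallelizing.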

In addition, I've reviewed (I think all) geometry functions. I've left untouched the ones that operate over multiple geometries (ST_Intersects and so on), since I didn't see a good way to assign a cost to those. For those that take a single geometry, I've run them, mostly with huge geometries, and compared them against each other.

The references I've taken are:

  • The function does a read over the serialized geometry, for example ST_SRID: _COST_DEFAULT, that is cost 1.
  • The function does a fast operation over the geometry after deserializing it, for example ST_RemoveRepeatedPoints: _COST_LOW, that is cost 50.
  • The function does a slower operation over the geometry (5-10x slower than ST_RemoveRepeatedPoints), for example ST_AsText: _COST_MEDIUM, that is cost 500.
  • The function does an even slower operation (more than 10x slower than ST_RemoveRepeatedPoints), for example ST_Transform or ST_IsValid: _COST_HIGH, that is cost 10000.
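The rules above can be sketched as a small classifier. This is only an illustration of the tiers, not code from the PR: the function name and parameters are hypothetical, the slowdown is measured relative to ST_RemoveRepeatedPoints as the baseline, and the handling of the 1-5x range (which the list does not cover) and of exactly 10x is an assumption.

```python
# Cost tiers as listed above.
COST_DEFAULT = 1
COST_LOW = 50
COST_MEDIUM = 500
COST_HIGH = 10000


def pick_cost(reads_serialized_only, slowdown_vs_baseline=1.0):
    """Map a single-geometry function to a cost tier.

    reads_serialized_only: True if the function only reads the serialized
        geometry (like ST_SRID) without deserializing it.
    slowdown_vs_baseline: measured runtime divided by the runtime of
        ST_RemoveRepeatedPoints on the same (large) geometry.
    """
    if reads_serialized_only:
        return COST_DEFAULT
    if slowdown_vs_baseline > 10:
        return COST_HIGH
    if slowdown_vs_baseline >= 5:
        return COST_MEDIUM
    # Assumption: anything under 5x the baseline counts as a fast operation.
    return COST_LOW


if __name__ == "__main__":
    print(pick_cost(True))          # a function like ST_SRID
    print(pick_cost(False, 1.0))    # the baseline itself
    print(pick_cost(False, 7.0))    # a function in the 5-10x range
    print(pick_cost(False, 25.0))   # a function well over 10x
```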

There are multiple changes based on those rules, and I've also added costs to functions that didn't have one. I haven't touched geographies all that much (just here and there).

Since there are many changes, I don't plan to backport this to 3.0; I'll only apply it to 3.1+.

Reference: https://trac.osgeo.org/postgis/ticket/4490

@Algunenano Algunenano requested a review from pramsey April 29, 2020 15:49
Member

@pramsey pramsey left a comment

You've done a more thorough job than I did, looks good.

@Algunenano
Member Author

Well, it took me 8 months to get around to it. So there is that 😄

@strk strk closed this in b28cfb9 Apr 30, 2020