Speed up selection of objects in TopoGeometry layers #50906

Merged
merged 1 commit into from
Jan 24, 2023

Conversation

strk
Contributor

@strk strk commented Nov 16, 2022

This is aimed at using spatial indexes on primitive tables. Only affects non-hierarchical TopoGeometry layers access.
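A rough sketch of what "using spatial indexes on primitive tables" can look like for an areal, non-hierarchical layer. This is not the code from this PR: the function name and parameters are hypothetical, and the table and column names (`face.mbr`, `relation.topogeo_id`, `relation.element_type`) are the standard PostGIS topology schema. The bounding box is tested against the face primitives, which have their own spatial index, instead of against each feature's computed geometry.

```cpp
#include <string>

// Hypothetical helper (not the QGIS provider code): builds a bounding-box
// filter for an areal, non-hierarchical TopoGeometry layer.
// Assumed inputs: topology schema name, layer id from topology.layer,
// TopoGeometry column name, and an SQL expression yielding the view bbox.
std::string topoGeomBboxFilter( const std::string &topoSchema,
                                int layerId,
                                const std::string &tgeomColumn,
                                const std::string &bboxExpr )
{
  // 1. Find faces whose MBR overlaps the bbox (can use the index on face.mbr).
  // 2. Find TopoGeometry ids that reference those faces (element_type 3 = face).
  // 3. Keep only rows whose TopoGeometry id is in that set.
  return "id(\"" + tgeomColumn + "\") IN ("
         " SELECT r.topogeo_id"
         " FROM \"" + topoSchema + "\".relation r"
         " JOIN \"" + topoSchema + "\".face f ON r.element_id = f.face_id"
         " WHERE r.layer_id = " + std::to_string( layerId ) +
         " AND r.element_type = 3"
         " AND f.mbr && " + bboxExpr + " )";
}
```

The result is meant to be appended to a WHERE clause; puntal and lineal layers would need the analogous lookup against the node and edge tables.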

@github-actions github-actions bot added this to the 3.30.0 milestone Nov 16, 2022
@strk
Contributor Author

strk commented Nov 16, 2022

This needs to use MATERIALIZED conditionally, based on the PostgreSQL version. It was tested against PostgreSQL 13 and benefits from an index on id(topogeom), which is not provided by default as of PostGIS 3.3.
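A minimal sketch of the version check mentioned above, assuming the server version is available as PostgreSQL's integer `server_version_num`. The helper name is hypothetical. `AS MATERIALIZED` is accepted from PostgreSQL 12 onward; older servers always materialize CTEs, so plain `AS` gives the same behaviour there.

```cpp
#include <string>

// Hypothetical helper: returns the text that opens a CTE definition.
// serverVersionNum is assumed to be server_version_num (e.g. 130004 for 13.4).
std::string cteOpen( const std::string &name, int serverVersionNum )
{
  // The MATERIALIZED keyword is only understood by PostgreSQL 12 and later.
  const bool supportsMaterialized = serverVersionNum >= 120000;
  return name + ( supportsMaterialized ? " AS MATERIALIZED (" : " AS (" );
}
```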

@github-actions

github-actions bot commented Dec 1, 2022

The QGIS project highly values your contribution and would love to see this work merged! Unfortunately this PR has not had any activity in the last 14 days and is being automatically marked as "stale". If you think this pull request should be merged, please check

  • that all unit tests are passing

  • that all comments by reviewers have been addressed

  • that there is enough information for reviewers, in particular

    • link to any issues which this pull request fixes

    • add a description of workflows which this pull request fixes

    • add screenshots if applicable

  • that you have written unit tests where possible

In case you should have any uncertainty, please leave a comment and we will be happy to help you proceed with this pull request.

If there is no further activity on this pull request, it will be closed in a week.

@github-actions github-actions bot added the stale label Dec 1, 2022
@strk
Contributor Author

strk commented Dec 6, 2022

@pigreco did you try this patch ? If you did, can you tell us if and how it changes the user experience ?

@github-actions github-actions bot removed the stale label Dec 6, 2022
@pigreco
Sponsor Contributor

pigreco commented Dec 6, 2022

@pigreco did you try this patch ? If you did, can you tell us if and how it changes the user experience ?

Hi Sandro,
I downloaded this build (https://github.com/qgis/QGIS/actions/runs/3483926687)
and opened the uso_suolo project: displaying the layer uso_suolo.tgeom is 10 times slower :-(
I'm definitely doing something wrong.

@strk
Contributor Author

strk commented Dec 6, 2022

10 times slower with the changes in this PR is great information to have. You didn't necessarily do anything wrong; it may be the code itself. See if/how things change if you add an index on id(tgeom) on uso_suolo.
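For readers who want to try the suggested index, here is a hedged sketch that only builds the SQL statement. The helper names are hypothetical, and the schema name used in the usage note is an assumption (the thread names only the table `uso_suolo` and the column `tgeom`). The doubled parentheses mark the target as an expression index on the `id` field of the TopoGeometry column.

```cpp
#include <string>

// Quote an SQL identifier, doubling any embedded double quote.
std::string quoteIdent( const std::string &ident )
{
  std::string out = "\"";
  for ( char c : ident )
  {
    if ( c == '"' )
      out += '"';
    out += c;
  }
  out += '"';
  return out;
}

// Hypothetical helper: builds a CREATE INDEX statement on id(<tgeom column>).
std::string createTopoGeomIdIndexSql( const std::string &schema,
                                      const std::string &table,
                                      const std::string &tgeomColumn )
{
  return "CREATE INDEX ON " + quoteIdent( schema ) + "." + quoteIdent( table ) +
         " ((id(" + quoteIdent( tgeomColumn ) + ")))";
}
```

Calling `createTopoGeomIdIndexSql( "public", "uso_suolo", "tgeom" )` yields `CREATE INDEX ON "public"."uso_suolo" ((id("tgeom")))`, which can be run from psql or any SQL console.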

@strk
Contributor Author

strk commented Dec 6, 2022

Oh, of course the expected speedup is ONLY when having only a small portion of the whole dataset visible in the map view. Viewing the WHOLE map will indeed be slower in any case, with this patch, I'm afraid.

@pigreco
Sponsor Contributor

pigreco commented Dec 6, 2022

See if/how things change if you add an index on id(tgeom) on uso_suolo

BINGOOOOOOOOO!!!

With the index created:

whole layer view:

no patch: 9991 ms: uso_suolo_tgeom_0b662af6_57ad_4217_81e8_51573a4806f2
patched: 5673 ms: uso_suolo_tgeom_0b662af6_57ad_4217_81e8_51573a4806f2

small portions:

no patch: 5224 ms: uso_suolo_tgeom_0b662af6_57ad_4217_81e8_51573a4806f2
patched: 320 ms: uso_suolo_tgeom_0b662af6_57ad_4217_81e8_51573a4806f2

2022-12-06_21h46_52.mp4

@pigreco
Sponsor Contributor

pigreco commented Dec 18, 2022

Please let's not make this work useless, what does it take to move forward?

@strk
Contributor Author

strk commented Dec 21, 2022 via email

@github-actions

github-actions bot commented Jan 5, 2023

@github-actions github-actions bot added the stale label Jan 5, 2023
@pigreco
Sponsor Contributor

pigreco commented Jan 5, 2023

Why don't any other developers want to review?

@github-actions github-actions bot removed the stale label Jan 5, 2023
@strk
Contributor Author

strk commented Jan 12, 2023

QGIS is a large project, too large for anyone to be able to review every single spot in it. This PR is for a well-defined spot: the PostgreSQL provider. Even narrower: the topology support in the PostgreSQL provider. I'm probably the only developer who cares about it, and this makes it hard for anyone else to review the work. This is my hypothesis.

That said, the PR is marked as a work in progress (WIP) because a conditional needs to be added to not use the optimized code where it would slow down rather than speed up operations (presence of index). Are you able to help with that work ? I'm not finding the time to work on it at the moment.

Of course it would still be nice if the bot did not keep working toward shooting the (partial) work down by closing the ticket, but that's a battle I seem to be losing over time...

@pigreco
Sponsor Contributor

pigreco commented Jan 12, 2023

Are you able to help with that work ?

I'm not a developer and I can't do the review, so this work is doomed to oblivion.
I don't like this aspect of QGIS.

Thank you for the time you have dedicated to us

@nyalldawson
Collaborator

@pigreco

I don't like this aspect of QGIS.

I'm not sure exactly what COULD change here -- this PR is clearly marked and described as not ready for merge, and the original author has stated that they can't resolve the outstanding issues due to time pressure.

Given that we can't merge a half-complete work, and that there's no paid QGIS staff who can just pick up this work and complete it, what would you see as the solution here?

@pigreco
Sponsor Contributor

pigreco commented Jan 13, 2023

what would you see as the solution here?

Thank you very much for this message and for the final question.

In this PR, as in all, a review by another developer is needed in order to merge the new feature.
This is an excellent rule, but it poses serious limits: few developers can do the reviews, so there is a risk of wasting hours of work (as in this case).

I don't have a ready solution to this problem, but I experience it as a problem.

A likely solution would be to greatly expand the pool of developers who can do PR reviews.

@agiudiceandrea
Contributor

In this PR, as in all, a review by another developer is needed in order to merge the new feature.

@pigreco, while it also seems to me that the review process needs more working time / love / funds / help / ..., this PR is tagged as "WIP" in the title by the author, i.e. it is a work in progress, unfinished, so it is not ready to be reviewed. Usually (and obviously) reviewers don't review a PR until it is ready to be reviewed.

elems AS (

-- TODO: skip if layer cannot be composed by nodes
SELECT ARRAY[n.node_id,1] te
Contributor

@lnicola lnicola Jan 13, 2023

Nit: you might end up rewriting this anyway, but returning two columns here is probably easier for readers, and for Postgres too.

Contributor Author

Thanks, ARRAY construction removed. For the record, readers might have recognized that I was (uselessly) building a "TopoElement" object -- https://postgis.net/docs/topoelement.html

@strk
Contributor Author

strk commented Jan 13, 2023

I've gathered all my forces and pushed what should be the last bit. Would ask @pigreco to please test again with and without the index, at high and low zoom levels, between QGIS with and without this patch. If everything looks fine I guess next step would be for PostGIS to automatically create those indices (which would have more chances to be implemented if you can create a ticket on the PostGIS trac, crosslinking it to this PR :)

@strk
Contributor Author

strk commented Jan 13, 2023

I have to say I'm a bit surprised by the 10x slower report when the index is not present, so it may be worth further inspecting the query sent to the database to understand where the time is spent. I've improved the query a little bit too, and although I've implemented code to ONLY use it when the index is present, I disabled that for now to allow for tests.

Here's what I see on my system (PostgreSQL 13) for the table containing all Italian municipalities (~8000) defined against a topology with 27741 faces (one municipality, Savona, was further split by lines):

BBOX: st_makeenvelope(347910.26990525069413707,4890245.42507795430719852,745029.62383155268616974,5112265.28829615749418736,32632)

Old query:
Seq Scan on com01012022_wgs84 (cost=0.00..9132.17 rows=1 width=36) (actual time=3.662..11590.442 rows=3669 loops=1)
Planning Time: 0.143 ms
Execution Time: 11590.902 ms

New query, no index:
Hash Join (cost=2816.05..10354.33 rows=988 width=36) (actual time=16.015..4496.000 rows=3669 loops=1)
Planning Time: 0.724 ms
Execution Time: 4496.447 ms

New query, indexed:
Nested Loop (cost=2820.51..4125.22 rows=988 width=36) (actual time=26.620..4473.872 rows=3669 loops=1)
Planning Time: 0.900 ms
Execution Time: 4474.340 ms

All in all I'm thinking this work needs more time to be really tested in multiple conditions.
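To put the three EXPLAIN ANALYZE timings in this comment side by side, here is a tiny worked calculation. The function name is hypothetical; the inputs are the execution times quoted above.

```cpp
// Ratio of baseline time to new time; values above 1 mean the new query is faster.
double speedup( double baselineMs, double newMs )
{
  return baselineMs / newMs;
}
```

With the figures above, `speedup( 11590.902, 4496.447 )` is roughly 2.58 for the new query without the index and `speedup( 11590.902, 4474.340 )` is roughly 2.59 with it, so on this dataset and bounding box the index makes almost no difference.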

@pigreco
Sponsor Contributor

pigreco commented Jan 14, 2023

@strk

Would ask @pigreco to please test again with and without the index, at high and low zoom levels, between QGIS with and without this patch.

All in all I'm thinking this work needs more time to be really tested in multiple conditions.

Do I have to do the tests you described or not?
which build should I download for testing anyway?

@strk
Contributor Author

strk commented Jan 16, 2023

Do I have to do the tests you described or not? which build should I download for testing anyway?

Yes please: a build of 3cfc9e6 (I've no idea where to find such a build), compared against a build of the ancestor commit in common with the master branch (eb168a5) or against the master branch...

@strk
Contributor Author

strk commented Jan 17, 2023

@pigreco the "details" link did not give me any detail, I've now restarted the job, let's hope it will succeed this time.

@strk
Contributor Author

strk commented Jan 18, 2023

@pigreco the re-run of CI succeeded so the link should now be there.

@pigreco
Sponsor Contributor

pigreco commented Jan 19, 2023

I did the required tests. I used:

  • QGIS 3.28.2 Florence
  • the patched build
  • with and without index

with index

version   whole area   test_area
3.28      ~ 10000 ms   ~ 5000 ms
patch     ~ 5000 ms    ~ 130 ms

without index

version   whole area   test_area
3.28      ~ 10000 ms   ~ 5000 ms
patch     ~ 5000 ms    ~ 2500 ms

2023-01-19_18h53_38.mp4

@strk
Contributor Author

strk commented Jan 20, 2023

Great, thank you for testing, @pigreco. So we basically don't need the special handling for presence or absence of the index, which should simplify the code. It's worth testing with lines and with points too, if you can :)

@strk strk changed the title WIP: Use ad-hoc WHERE clause for TopoGeometry layers Use ad-hoc WHERE clause for TopoGeometry layers Jan 20, 2023
@strk
Contributor Author

strk commented Jan 20, 2023

I've simplified the code by removing the check for presence of an index (given the new code is faster even without an index) and rebased against current master. Removed the WIP indication so this is ready for review too now.

int layerLevel;
enum TopoFeatureType
{
Puntal = 1,
Contributor

Maybe Point, Line, Area, Mixed, or something like that?
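For context on the enum under review, here is a sketch under stated assumptions. Only `Puntal = 1` is visible in the diff excerpt; the remaining enumerators below are guesses that follow the PostGIS TopoGeometry `type` codes (1 puntal, 2 lineal, 3 areal, 4 collection). The sketch uses `enum class` rather than the plain `enum` shown in the diff, and the function is hypothetical. It illustrates how such a type could drive the "skip if layer cannot be composed by nodes" TODO seen earlier in the query.

```cpp
#include <vector>

// Assumed enumerators mirroring the PostGIS TopoGeometry "type" field.
// Only "Puntal = 1" appears in the reviewed diff; the rest are assumptions.
enum class TopoFeatureType
{
  Puntal = 1,
  Lineal = 2,
  Areal = 3,
  Mixed = 4
};

// PostGIS topology element types: 1 = node, 2 = edge, 3 = face.
// Returns which primitive tables are worth scanning for a given layer type.
std::vector<int> candidateElementTypes( TopoFeatureType t )
{
  switch ( t )
  {
    case TopoFeatureType::Puntal: return { 1 };
    case TopoFeatureType::Lineal: return { 2 };
    case TopoFeatureType::Areal:  return { 3 };
    case TopoFeatureType::Mixed:  return { 1, 2, 3 };
  }
  return {};
}
```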

@strk
Contributor Author

strk commented Jan 21, 2023 via email

@lnicola
Contributor

lnicola commented Jan 21, 2023

Ouch. All right, sorry for that.

@strk
Contributor Author

strk commented Jan 21, 2023 via email

This is aimed at using spatial indexes on primitive tables.
Only affects non-hierarchical TopoGeometry layers access.

Thanks Laurențiu Nicola (grayshade) for excellent feedback and
Salvatore Fiandaca (pigreco) for testing !
@strk
Contributor Author

strk commented Jan 21, 2023

Pushed the naming change, for consistency.

@strk strk changed the title Use ad-hoc WHERE clause for TopoGeometry layers Speed up selection of objects in TopoGeometry layers Jan 23, 2023
@strk strk enabled auto-merge (rebase) January 23, 2023 22:44
@strk strk requested review from lnicola, ccrook, jef-n and pcav and removed request for lnicola January 23, 2023 22:44
@strk strk added the Optimization label Jan 23, 2023
@strk
Contributor Author

strk commented Jan 24, 2023

Any chance of getting this PR approved before feature freeze? It's a very localized change that only affects users of PostGIS TopoGeometry layers (very few). Two of the few users have tested it and found it to help greatly :)

Contributor

@lnicola lnicola left a comment

Looks good to me, but I'm not really familiar with the code so don't read too much into this :-).

@strk
Contributor Author

strk commented Jan 24, 2023

Looks good to me, but I'm not really familiar with the code so don't read too much into this :-).

You're very kind, Laurențiu, but I'm afraid it takes an approval from someone with a specific level of authorization for this PR to be mergeable. I'm not familiar with the internals of GitHub bureaucracy and cannot find right away the list of that group of people. @nyalldawson do you have a pointer?

@agiudiceandrea
Contributor

Hi @strk, this PR has been ready for review for only 4 days. Although there are PRs that have been waiting for merge for weeks or months, I would suggest you send a message to the developer mailing list.

Contributor

@rouault rouault left a comment

Approving, with the caveat that I'm not familiar with that part of the code.

Note: "grep -ri topogeom tests/" shows very few tests (apparently just testing detection of PostGIS topology layers but not much more, unless I'm missing something). That could be something to develop to make changes in that part of the code feel less risky.

@strk strk merged commit c513bad into qgis:master Jan 24, 2023
@pcav
Member

pcav commented Jan 25, 2023

Looks useful to me, thanks @strk

@strk strk deleted the faster-topogeom-selection branch January 25, 2023 16:13
Labels
Optimization, PostGIS data provider

7 participants