New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QGEP Visual performance #308
Comments
One thing would be to log the queries to see what exactly is requested, and then use http://stackoverflow.com/questions/722221/how-to-log-postgresql-queries I think there is also another level to improve on, which is purely perceptional. If the image wouldn't be cleared (to white) but left in place until the new image is rendered, the perceived impression by the user would be much better. This is something that would be around 2-3 days of development on QGIS core. |
Hi, two processors seems really low to me, especially if it is a windows server shared by other applications or services. Is QGIS multithreaded enabled on client side? For DB analysing, a great tool to use is POWA http://dalibo.github.io/powa/. It will advice you the missing indexes and most consuming queries quickly. By experience, rendering is really impacted by complex symbology and labeling too. You can benchmark your map with a copy using only simple styles and no labeling. In the end, you'll find many bottlenecks, for different use cases. If you server is overloaded by parallel clients, add CPU. If it's not enough, think of a load balancing tool to serve read only layers on a pool of replicated postgresql. |
What about a fading transition :) that would be nice for time manager ! |
Thanks for all the hints @haubourg
What I was thinking was to do this on a per-layer basis. Only replace the layer in the composited image once data is present (or complete). This way it wouldn't be impacted too much by a single slow (WMS) layer. |
@haubourg could you tag it with qwat contrib? |
fyi: There is a PR to QGIS master (QGIS 3) has kind of a pre-rendering where one has the impression of infinite panning. But it is not yet in master, because it is still being discussed. See qgis/QGIS#4777 and e-mail thread starting at https://lists.osgeo.org/pipermail/qgis-developer/2017-July/049261.html Note that I have tested it on Linux (self compiled) with cadastral data and it works quite well. Haven't tested it on Windows. I hope that this makes it into master, as I think it is quite an improvement. |
I'm starting to dive into QGEP with monitoring tools to seek for unoptimized things, I will keep posting here and declare issues when confirmed : test conditions:
Database optimization
We have a That query is creating temp blocks I/O :
SymbologyA lot of NOT IN exceptions are set in ruled based symbology. I'm wondering if this is not slowing things a bit: LabelingLabeling in Pully's dataset adds more than 7 seconds of 100% cpu for computing for large scale. This is quite heavy (I here the computer's fan on each redraw). Transaction GroupsTransaction groups may deactivate multithreaded renderer, but I still see threads opened in htop. . Relation reference widgetsWhen testing edition on the small demo dataset, I face 100 % CPU for the relation widget, for instance for creating a maintenance event, and create a video file association in it. Testing Pully's dataset with customized project reveals a lot higher CPU consumption on any operation. Indicator / qgep demo / qgep pullynb of reaches / 17 209 / 27 897 One hypothesis is that some operations do not scale correctly. |
Nice findings. When you remove the distinct, can you check if there are duplicated reaches (comparing the obj_id)? The obj_id is used as PK when editing the view, so it must be unique. |
I was sidetracked during the tests, Let's keep on this morning. Concerning the Select Distinct ON, we need to work on subqueries to ensure they won't generate duplicates. I'll ask here if I need help understanding the data model (but that's only ~14 % of speed improvements, I'll first keep testing other hypothesis before doing fixes) |
@ponceta @m-kuhn
I couldn't reproduce the crashes Pully encounters when digitizing too fast. So we might need a code sprint in pully to narrow it down. Remote testing is not sufficient here. |
Thanks @haubourg, let's skip the last point + crash for now, I think I isolated that. |
Some more investigation about performances. -- spatial index (it was missing in my database, we should check it exists in QGEP datamodel !)
CREATE INDEX IF NOT EXISTS od_reach_progression_geometry_idx ON qgep.od_reach USING GIST (progression_geometry);
-- indexes on reach points FK (used in SQL JOINS)
CREATE INDEX IF NOT EXISTS od_reach_fk_reach_point_to_idx ON qgep.od_reach (fk_reach_point_to);
CREATE INDEX IF NOT EXISTS od_reach_fk_reach_point_from_idx ON qgep.od_reach (fk_reach_point_from);
CREATE INDEX IF NOT EXISTS od_reach_fk_pipe_profile_idx ON qgep.od_reach (fk_pipe_profile); Other indexes already exists for primary keys ( In my database (and perhaps in yours ?) these indexes are NOT used, see:
You can see the big difference between execution time and query plans ! Some comments:
Here are the SQL test scripts
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
SELECT *
FROM "qgep"."vw_qgep_reach"
WHERE "progression_geometry" && st_makeenvelope(1728327.03249295568093657,8240789.26074041239917278,1728608.10987572139129043,8240958.16933418624103069,3949)
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
-- SELECT * FROM (
WITH active_maintenance_event AS (
SELECT me.obj_id,
me.identifier,
me.active_zone,
mews.fk_wastewater_structure
FROM qgep.od_maintenance_event me
LEFT JOIN qgep.re_maintenance_event_wastewater_structure mews ON mews.fk_maintenance_event::text = me.obj_id::text
WHERE me.active_zone IS NOT NULL
)
SELECT DISTINCT ON (re.obj_id) re.obj_id,
re.clear_height,
CASE
WHEN pp.height_width_ratio IS NOT NULL THEN round(re.clear_height::numeric * pp.height_width_ratio)::smallint::integer
ELSE re.clear_height
END AS width,
re.coefficient_of_friction,
re.elevation_determination,
re.horizontal_positioning,
re.inside_coating,
re.length_effective,
CASE
WHEN rp_from.level > 0::numeric AND rp_to.level > 0::numeric THEN round((rp_from.level - rp_to.level) / re.length_effective * 1000::numeric, 1)
ELSE NULL::numeric
END AS slope_per_mill,
re.material,
re.progression_geometry,
re.reliner_material,
re.reliner_nominal_size,
re.relining_construction,
re.relining_kind,
re.ring_stiffness,
re.slope_building_plan,
re.wall_roughness,
re.fk_pipe_profile,
ch.bedding_encasement,
ch.function_hierarchic,
ch.connection_type,
ch.function_hydraulic,
ch.jetting_interval,
ch.pipe_length,
ch.usage_current,
ch.usage_planned,
ws.obj_id AS ws_obj_id,
ws.accessibility,
ws.contract_section,
ws.financing,
ws.gross_costs,
ws.inspection_interval,
ws.location_name,
ws.records,
ws.renovation_necessity,
ws.replacement_value,
ws.rv_base_year,
ws.rv_construction_type,
ws.status,
ws.structure_condition,
ws.subsidies,
ws.year_of_construction,
ws.year_of_replacement,
ws.fk_owner,
ws.fk_operator,
ne.identifier,
ne.remark,
ne.last_modification,
ne.fk_dataowner,
ne.fk_provider,
ne.fk_wastewater_structure,
rp_from.obj_id AS rp_from_obj_id,
rp_from.elevation_accuracy AS rp_from_elevation_accuracy,
rp_from.identifier AS rp_from_identifier,
rp_from.level AS rp_from_level,
rp_from.outlet_shape AS rp_from_outlet_shape,
rp_from.position_of_connection AS rp_from_position_of_connection,
rp_from.remark AS rp_from_remark,
rp_from.last_modification AS rp_from_last_modification,
rp_from.fk_dataowner AS rp_from_fk_dataowner,
rp_from.fk_provider AS rp_from_fk_provider,
rp_from.fk_wastewater_networkelement AS rp_from_fk_wastewater_networkelement,
rp_to.obj_id AS rp_to_obj_id,
rp_to.elevation_accuracy AS rp_to_elevation_accuracy,
rp_to.identifier AS rp_to_identifier,
rp_to.level AS rp_to_level,
rp_to.outlet_shape AS rp_to_outlet_shape,
rp_to.position_of_connection AS rp_to_position_of_connection,
rp_to.remark AS rp_to_remark,
rp_to.last_modification AS rp_to_last_modification,
rp_to.fk_dataowner AS rp_to_fk_dataowner,
rp_to.fk_provider AS rp_to_fk_provider,
rp_to.fk_wastewater_networkelement AS rp_to_fk_wastewater_networkelement,
am.obj_id AS me_obj_id,
am.active_zone AS me_active_zone,
am.identifier AS me_identifier
FROM qgep.od_reach re
LEFT JOIN qgep.od_wastewater_networkelement ne ON ne.obj_id::text = re.obj_id::text
LEFT JOIN qgep.od_reach_point rp_from ON rp_from.obj_id::text = re.fk_reach_point_from::text
LEFT JOIN qgep.od_reach_point rp_to ON rp_to.obj_id::text = re.fk_reach_point_to::text
LEFT JOIN qgep.od_wastewater_structure ws ON ne.fk_wastewater_structure::text = ws.obj_id::text
LEFT JOIN qgep.od_channel ch ON ch.obj_id::text = ws.obj_id::text
LEFT JOIN qgep.od_pipe_profile pp ON re.fk_pipe_profile::text = pp.obj_id::text
LEFT JOIN active_maintenance_event am ON am.fk_wastewater_structure::text = ch.obj_id::text
-- ) foo
WHERE "progression_geometry" && st_makeenvelope(1728327.03249295568093657,8240789.26074041239917278,1728608.10987572139129043,8240958.16933418624103069,3949)
;
|
NB : please change the |
By the way, one "easy" way to test the performance QGIS side would be to create materialized view based on NB: I proposed this read-only project to a user which complained about performances, and in his opinion, it is ok to have a quick read-only project for the day-to-day use, and the original QGEP project for editing. -- qgep.vw_qgep_wastewater_structure_materialized
DROP MATERIALIZED VIEW IF EXISTS qgep.vw_qgep_reach_materialized;
CREATE MATERIALIZED VIEW qgep.vw_qgep_reach_materialized AS
SELECT *
FROM qgep.vw_qgep_reach;
CREATE INDEX IF NOT EXISTS vw_qgep_reach_mat_obj_id_idx ON qgep.vw_qgep_reach_materialized (obj_id);
CREATE INDEX IF NOT EXISTS vw_qgep_reach_mat_geom_idx ON qgep.vw_qgep_reach_materialized USING GIST (progression_geometry);
-- qgep.vw_qgep_wastewater_structure
DROP MATERIALIZED VIEW IF EXISTS qgep.vw_qgep_wastewater_structure_materialized;
CREATE MATERIALIZED VIEW qgep.vw_qgep_wastewater_structure_materialized AS
SELECT *
FROM qgep.vw_qgep_wastewater_structure;
CREATE INDEX IF NOT EXISTS vw_qgep_wastewater_structure_mat_obj_id_idx ON qgep.vw_qgep_wastewater_structure_materialized (obj_id);
CREATE INDEX IF NOT EXISTS vw_qgep_wastewater_structure_mat_geom_idx ON qgep.vw_qgep_wastewater_structure_materialized USING GIST (situation_geometry);
Of course we should instead improve performance, but this quick-and-dirty solution could let the users wait a bit. |
How do you keep this materialized view up-to-date? A cron job? |
which you can run (one solution among these)
|
Yes, this sounds like a complete list of possibilities. Although the last one will probably need to be triggered by about 10 other tables which are involved. Which one would you prefer for a production system? |
For a production system, I would still improve the performance of the original views ;-) (working on it, perhaps remove the WITH clause in vw_qgep_reach will let PostgreSQL compute a better query plan ?) Meanwhile, I would:
This would avoid to have a refresh triggered for each edit on the "10 tables" Or we can just add a new button (or logic) in QGEP plugin to run this REFRESH, as it is done for the networkx graph |
NB: asked a question here in pgsql-performance mailing list: |
(Spotted your question in the mailing list) |
Thanks @ladayaroslav for the interest. I already tried to change them ( even into 100), but the query plan stays the same... Is this the right way to change them for the current session ?
|
Yes. |
Hi, David Rowley just answered the question in the pgsql-performance thread I opened. I will try this ASAP and report |
Nice find! |
Tested with the sub-select without the DISTINCT ON clause, and it turns out well I have not tested it yet with the view, but I am pretty sure the query plan will be improved too. As @haubourg wrote, this DISTINCT ON seems not usefull, as there must be not duplicates without it (I have not tested with multiple covers for the same structure...) |
Yay, sounds good.
In theory I agree, in practice data exists where duplicates are present (not sure for what reasons, I think it was with the demo data). The effect of this is, that the view can no longer be edited because the column is no longer unique. The basic requirement for any solution will be, that it is able to produce a unique (and stable) obj_id. |
@mdouchin big applause for digging on step further here ! @m-kuhn I usually tend to think that duplicates in data sets are a work for the data administrator and hardproofing the tools so that they still work with bad quality in the database is finally not something good since the administrator thinks data integrity is correct. |
I do not even understand where these duplicates come from, so that needs to be assessed first (I was quite surprised by them). Once we know what's going on we can decide what can be done against it (constraints, helpers for manual cleaning, ...). |
Oh, concerning labeling ressource consumption, I just realize default settings for PAL engine where quite hungry: @ponceta : could you check what are your settings in and if downgrading to more reasonable values (say 8 tries for points and 10 for lines) improves things? |
In fact, labeling PAL settings are stored in the project file. |
Not at all |
I just dropped the
To test this, please manually run these two scripts.
If everyone is happy with these changes we can proceed to merge QGEP/datamodel#45 |
On the total rendering time, getting rid of both these distinct select is a real improvement. I get around half of the rendering time. Lowering PAL engine values makes a small improvement, can be a way to improve rendering when editing (low label placement importance). |
+1 ! |
Just as a sidenote for complete information, I had to rewrite the queries so they use a subselect instead of windowed functions to avoid having them return multiple rows for the same wastewater structure. |
commit f2e941f5465eca1b30d4d16b88e7fbfe078ac57c Author: Matthias Kuhn <matthias@opengis.ch> Date: Tue Sep 12 13:41:27 2017 +0200 vw_qgep_wastewater_structure: Hide wastewater node information if not unique commit 13bfa7b0100324f57978210c123b599ec0e51a43 Author: Matthias Kuhn <matthias@opengis.ch> Date: Tue Sep 12 13:17:53 2017 +0200 Improve visual performance of reach and wastewater structure References QGEP#308 commit f1cb81cd03211a7184f57f4013c6df2fde4c4ee5 Author: Matthias Kuhn <matthias@opengis.ch> Date: Tue Sep 12 16:57:46 2017 +0200 [travis] Install postgis extension commit 51b3e6588fe4b7150b054ac694f54d56d89f9804 Author: Maxime Trolliet <maxime.trolliet@prilly.ch> Date: Thu Jul 27 10:56:36 2017 +0200 Add oid prefix for Prilly
commit f2e941f5465eca1b30d4d16b88e7fbfe078ac57c Author: Matthias Kuhn <matthias@opengis.ch> Date: Tue Sep 12 13:41:27 2017 +0200 vw_qgep_wastewater_structure: Hide wastewater node information if not unique commit 13bfa7b0100324f57978210c123b599ec0e51a43 Author: Matthias Kuhn <matthias@opengis.ch> Date: Tue Sep 12 13:17:53 2017 +0200 Improve visual performance of reach and wastewater structure References #308 commit f1cb81cd03211a7184f57f4013c6df2fde4c4ee5 Author: Matthias Kuhn <matthias@opengis.ch> Date: Tue Sep 12 16:57:46 2017 +0200 [travis] Install postgis extension commit 51b3e6588fe4b7150b054ac694f54d56d89f9804 Author: Maxime Trolliet <maxime.trolliet@prilly.ch> Date: Thu Jul 27 10:56:36 2017 +0200 Add oid prefix for Prilly
This pull request here cuts rendering time (in edit mode) almost to half: |
I made this change a while ago (maybe 2.14 era?) - in my testing there was only a minimal performance impact at the benefit of vastly improved map labeling (more labels, nicer placement). Most of the cost of labeling comes from elsewhere (preprocessing the geometries) vs solving the label problem. |
Dropping "distinct ON" and adjusting PAL engine values has done the job! |
2018-03-21T17:35:10 1 Actualisation du canevas : 1047 ms |
As we saw with @m-kuhn, visual performance on rendering in QGIS can be slowed due to informations queried in the database for
It would be great, to set some specs to run in an optimal way QGEP using the QGEP's proposed symbology.
@m-kuhn do you know some other tricks to monitor what takes time to render (except the canvas loading time)
I'm talking about server recommanded CPU and RAM in particular but of course other specs could be add like PG version and so on (which is in the doc)
The text was updated successfully, but these errors were encountered: