Feature creation very slow when using joins #36167

Closed
Alexandre27 opened this issue May 4, 2020 · 14 comments · Fixed by #36866
Labels
Bug: Either a bug report, or a bug fix. Let's hope for the latter!
Digitizing: Related to feature digitizing map tools or functionality
High Priority
Regression: Something which used to work, but doesn't anymore

Comments

@Alexandre27

Describe the bug
When creating features on a layer that has vector joins, QGIS takes more than 40 seconds (with the feature form enabled) or 10 seconds (with the form hidden) on a point layer with 15k features from a PostgreSQL source on localhost. This makes editing in QGIS unusable when the layer has a large number of features (60k in a real dataset).

How to Reproduce
Create a PostGIS database using the following commands:

CREATE EXTENSION postgis;

CREATE TABLE point_layer (
    ogc_fid integer PRIMARY KEY,
    geom public.geometry(GeometryZ,3763),
    height double precision
);

CREATE INDEX point_layer_geom_idx ON point_layer USING gist (geom);

CREATE TABLE attribute_layer (
    id SERIAL PRIMARY KEY,
    fid integer REFERENCES point_layer(ogc_fid),
    some_value text
);

INSERT INTO point_layer(ogc_fid, geom, height)
SELECT idx, ST_SetSRID(ST_MakePoint(0, idx, 0), 3763), 0 FROM generate_series(0, 15000) as s(idx);

Create a feature on point_layer using the attached QGIS project (which assumes that the database is running on localhost, port 5432, with user postgis and password postgis). The join is already set up in this project.

This should take a long time to complete.

QGIS and OS versions

QGIS version: 3.10.5-A Coruña
QGIS code branch: Release 3.10
Compiled against Qt: 5.9.5
Running against Qt: 5.9.5
Compiled against GDAL/OGR: 2.2.3
Running against GDAL/OGR: 2.2.3
Compiled against GEOS: 3.6.2-CAPI-1.10.2
Running against GEOS: 3.6.2-CAPI-1.10.2 4d2925d6
Compiled against SQLite: 3.22.0
Running against SQLite: 3.22.0
PostgreSQL Client Version: 10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)
SpatiaLite Version: 4.3.0a
QWT Version: 6.1.3
QScintilla2 Version: 2.10.2
PROJ.4 Version: 493
OS Version: Ubuntu 18.04.4 LTS
Active python plugins: processing; db_manager; MetaSearch

QGIS version: 3.12.2-București
QGIS code revision: 8a1fb33
Compiled against Qt: 5.9.5
Running against Qt: 5.9.5
Compiled against GDAL/OGR: 2.2.3
Running against GDAL/OGR: 2.2.3
Compiled against GEOS: 3.6.2-CAPI-1.10.2
Running against GEOS: 3.7.1-CAPI-1.11.1 27a5e771
Compiled against SQLite: 3.22.0
Running against SQLite: 3.22.0
PostgreSQL Client Version: 10.12 (Ubuntu 10.12-0ubuntu0.18.04.1)
SpatiaLite Version: 4.3.0a
QWT Version: 6.1.3
QScintilla2 Version: 2.10.2
PROJ.4 Version: 493
OS Version: Ubuntu 18.04.4 LTS
Active python plugins: processing; db_manager; MetaSearch

Additional context

While trying to find out what the problem was, I noticed that QGIS issues many repeated queries during a single feature creation. This seems to be the cause of the delay, since editing becomes completely unusable when connecting to a remote PostgreSQL database.

Here are the queries made during a single feature creation with the form enabled.

This problem has already been reported on Stack Exchange and the qgis-users mailing list.
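
The repeated queries mentioned above can be quantified from a server-side statement log. What follows is only a minimal sketch, assuming the PostgreSQL server runs with log_statement = 'all' and that each statement fits on one log line containing "LOG:  statement: ..."; the text before that marker depends on log_line_prefix. The function names count_repeated_statements and repeated are hypothetical helpers, not part of QGIS or PostgreSQL.

```python
import re
from collections import Counter

# Matches the "statement: <sql>" part written when log_statement = 'all'.
# Whatever precedes "LOG:" depends on log_line_prefix and is ignored here.
STATEMENT_RE = re.compile(r"LOG:\s+statement:\s+(.*)$")


def count_repeated_statements(log_lines):
    """Return a Counter mapping each normalized SQL statement to its count."""
    counts = Counter()
    for line in log_lines:
        match = STATEMENT_RE.search(line.rstrip("\n"))
        if match:
            # Collapse runs of whitespace so that spacing differences
            # do not split one statement into several entries.
            sql = " ".join(match.group(1).split())
            counts[sql] += 1
    return counts


def repeated(counts, threshold=2):
    """Return (statement, count) pairs seen at least `threshold` times."""
    return [(sql, n) for sql, n in counts.most_common() if n >= threshold]
```

Feeding it the lines of the log captured while creating one feature (for example via open(path) on the log file) gives a quick list of the statements that are issued more than once. Multi-line statements are not handled by this sketch.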

Alexandre27 added the Bug label May 4, 2020
@gioman
Contributor

gioman commented May 4, 2020

@Alexandre27 is the join created in QGIS or at the DB level?
Can you please attach a dump (with data) of the two tables?

gioman added the Feedback and Digitizing labels May 4, 2020
@nulopes

nulopes commented May 4, 2020

@gioman I'm working on this with @Alexandre27. The join is created in QGIS.

The script that we sent above creates some sample data with this command:

INSERT INTO point_layer(ogc_fid, geom, height)
SELECT idx, ST_SetSRID(ST_MakePoint(0, idx, 0), 3763), 0 FROM generate_series(0, 15000) as s(idx);

The specific data does not seem to be important, just the amount of data (for us 15k seems to make the problem obvious).

We don't need any data on the attribute_layer to see the slowness happen. If you wish we can still send you a full dump.

@gioman
Contributor

gioman commented May 4, 2020

> @gioman I'm working on this with @Alexandre27. The join is created in QGIS.

@Alexandre27 @nulopes hello,

> If you wish we can still send you a full dump.

If it's not too much trouble, yes, I would prefer to have the dump of both tables. Another question: does the problem occur only with PostGIS data? What about file-based data? Do you see the same issue there?

@Alexandre27
Author

@gioman hello,

Attached is the full db dump. We haven't tested the joins with other data sources.

gioman removed the Feedback label May 5, 2020
@gioman
Contributor

gioman commented May 5, 2020

> Attached is the full db dump. We haven't tested the joins with other data sources.

@Alexandre27 hello,

I tested on master on Ubuntu 18.04 and I see the same issue.

Any idea whether this regressed at some point in the past (i.e., do you know whether this was not an issue in previous releases)?

@nulopes

nulopes commented May 5, 2020

@gioman hello,

> Any idea whether this regressed at some point in the past (i.e., do you know whether this was not an issue in previous releases)?

We tested with 3.4 and 3.8 and had the same results. Should we go even further back?

@gioman
Contributor

gioman commented May 5, 2020

> We tested with 3.4 and 3.8 and had the same results. Should we go even further back?

Testing the last LTR of the 2.* series (2.18) would also be very useful. What we need to understand is whether this is a regression, as a regression would get more attention during bug-fixing campaigns.

@nulopes

nulopes commented May 5, 2020

> Testing the last LTR of the 2.* series (2.18) would also be very useful. What we need to understand is whether this is a regression, as a regression would get more attention during bug-fixing campaigns.

I've now tested with 2.18 and it's instantaneous, so this seems to be a regression.

gioman added the Regression and High Priority labels May 5, 2020
@gioman
Contributor

gioman commented May 5, 2020

> I've now tested with 2.18 and it's instantaneous, so this seems to be a regression.

@nulopes thanks, hopefully it will be fixed in the next round of bug fixing.

@nulopes

nulopes commented May 5, 2020

@gioman thanks. If I can help any further (even if it involves some debugging), please let me know.

@gioman
Contributor

gioman commented May 5, 2020

> @gioman thanks. If I can help any further (even if it involves some debugging), please let me know.

@nulopes thanks! The project is open in every sense, so you are most welcome to jump in with debugging, patches, requests for commercial support for targeted bug fixes (see the relevant page on the qgis.org site), etc.

elpaso self-assigned this May 29, 2020
@elpaso
Contributor

elpaso commented May 29, 2020

I've been able to reproduce the issue. It's not a trivial fix, though.

@nulopes

nulopes commented Jun 1, 2020

@elpaso I had the feeling it wouldn't be. I have already set up an environment and compiled QGIS from source, but with such a large project I don't know where to start. Do you think I can help and lay some groundwork for the fix? I haven't programmed in C++ for 2 years now, but I'd be glad to help.

@elpaso
Contributor

elpaso commented Jun 1, 2020

Thank you @nulopes, but I've already spent some time on this issue. If you are willing to help, I would recommend starting with something easier: there are plenty of bugs in the issue queue, so just pick one that you feel comfortable with, assign it to yourself, and give it a try.

elpaso added a commit to elpaso/QGIS that referenced this issue Jun 1, 2020
Unique constraint validation was suboptimal in several places;
this PR addresses a few of the critical paths, all in QgsVectorLayerUtils:

- in createFeatures: on-demand creation of the cached values
- in validateAttribute: don't check for uniqueness if the value is NULL
  and a NOT NULL constraint was violated
- in valueExists: search source layers for values in joined fields

Fixes qgis#36167
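
The first two points of the commit message can be illustrated with a toy model. This is not the QGIS implementation (which is C++ in QgsVectorLayerUtils); it is only a sketch of the general idea, assuming a layer can be represented as a list of dictionaries. The names UniqueValueCache and validate_attribute are hypothetical, and the third point (looking up joined fields in their source layers) is not modeled here.

```python
from typing import Any, Dict, List, Set


class UniqueValueCache:
    """Builds the set of existing values for a field on first use only."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self._rows = rows
        self._values: Dict[str, Set[Any]] = {}
        self.scans = 0  # counts full scans, only to make the effect visible

    def values(self, field: str) -> Set[Any]:
        # On-demand: the full scan runs the first time a field is requested
        # and is never repeated for that field.
        if field not in self._values:
            self.scans += 1
            self._values[field] = {row.get(field) for row in self._rows}
        return self._values[field]


def validate_attribute(
    value: Any,
    field: str,
    cache: UniqueValueCache,
    not_null: bool = False,
    unique: bool = False,
) -> List[str]:
    """Return a list of constraint violations for one attribute value."""
    errors: List[str] = []
    if not_null and value is None:
        errors.append(f"{field}: value is NULL")
        # The value is already invalid, so the (potentially expensive)
        # uniqueness lookup is skipped entirely.
        return errors
    if unique and value in cache.values(field):
        errors.append(f"{field}: value {value!r} already exists")
    return errors
```

In this model, validating a NULL value against a NOT NULL constraint never touches the data, and validating any number of values for the same field costs a single scan instead of one scan per check.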

4 participants