Skip to content
Permalink
Browse files

Fix #11965 - improve performance of selectedFeatures()

The slowness of merge / split features tools was caused by the change in the logic
in selectedFeatures(): instead of fetching individual features one by one by ID,
the whole layer is traversed. Such approach makes sense when many features are selected,
but with few features there is considerable delay when dealing with big layers.

The implementation is not ideal, but for some common cases the performance is much better.

Merging of features now does not request selected features when not necessary.

When rendering, avoid using layer's extent() method that may force expensive
calculation of layer's extent.
  • Loading branch information
wonder-sk committed Feb 17, 2015
1 parent ffd7f8a commit 255cbd2c2a5d4836b02765ace9050856649216b1
Showing with 34 additions and 14 deletions.
  1. +8 −6 src/app/qgisapp.cpp
  2. +1 −2 src/core/qgsmaprendererjob.cpp
  3. +25 −6 src/core/qgsvectorlayer.cpp
@@ -5911,6 +5911,7 @@ void QgisApp::mergeSelectedFeatures()
}

//get initial selection (may be altered by attribute merge dialog later)
QgsFeatureIds featureIds = vl->selectedFeaturesIds();
QgsFeatureList featureList = vl->selectedFeatures(); //get QList<QgsFeature>
bool canceled;
QgsGeometry* unionGeom = unionGeometries( vl, featureList, canceled );
@@ -5939,20 +5940,21 @@ void QgisApp::mergeSelectedFeatures()
return;
}

QgsFeatureList featureListAfter = vl->selectedFeatures();
QgsFeatureIds featureIdsAfter = vl->selectedFeaturesIds();

if ( featureListAfter.size() < 2 )
if ( featureIdsAfter.size() < 2 )
{
QMessageBox::information( 0, tr( "Not enough features selected" ), tr( "The merge tool requires at least two selected features" ) );
delete unionGeom;
return;
}

//if the user changed the feature selection in the merge dialog, we need to repeat the union and check the type
if ( featureList.size() != featureListAfter.size() )
if ( featureIds.size() != featureIdsAfter.size() )
{
delete unionGeom;
bool canceled;
QgsFeatureList featureListAfter = vl->selectedFeatures();
unionGeom = unionGeometries( vl, featureListAfter, canceled );
if ( !unionGeom )
{
@@ -5978,10 +5980,10 @@ void QgisApp::mergeSelectedFeatures()
newFeature.setGeometry( unionGeom );
newFeature.setAttributes( d.mergedAttributes() );

QgsFeatureList::const_iterator feature_it = featureListAfter.constBegin();
for ( ; feature_it != featureListAfter.constEnd(); ++feature_it )
QgsFeatureIds::const_iterator feature_it = featureIdsAfter.constBegin();
for ( ; feature_it != featureIdsAfter.constEnd(); ++feature_it )
{
vl->deleteFeature( feature_it->id() );
vl->deleteFeature( *feature_it );
}

vl->addFeature( newFeature, false );
@@ -166,12 +166,11 @@ LayerRenderJobs QgsMapRendererJob::prepareJobs( QPainter* painter, QgsPalLabelin
continue;
}

QgsDebugMsg( QString( "layer %1: minscale:%2 maxscale:%3 scaledepvis:%4 extent:%5 blendmode:%6" )
QgsDebugMsg( QString( "layer %1: minscale:%2 maxscale:%3 scaledepvis:%4 blendmode:%5" )
.arg( ml->name() )
.arg( ml->minimumScale() )
.arg( ml->maximumScale() )
.arg( ml->hasScaleBasedVisibility() )
.arg( ml->extent().toString() )
.arg( ml->blendMode() )
);

@@ -2327,24 +2327,43 @@ const QgsFeatureIds& QgsVectorLayer::selectedFeaturesIds() const
QgsFeatureList QgsVectorLayer::selectedFeatures()
{
QgsFeatureList features;

QgsFeatureIterator it = selectedFeaturesIterator();

QgsFeature f;
while ( it.nextFeature( f ) )

This comment has been minimized.

Copy link
@m-kuhn

m-kuhn Feb 17, 2015

Member

Wouldn't it be even better to fix the FilterFids code for affected providers?
That would give a better performance and fix similar effects in other areas of the code/plugins. Evenmore as the current threshold of 8 is arbitrary (and I couldn't think of a good way to get an appropriate guess)

This comment has been minimized.

Copy link
@wonder-sk

wonder-sk Feb 17, 2015

Author Member

That would be better, but that would require more non-trivial changes for which I feel is too late before the release.

Deciding on a good threshold is difficult - even within one provider... any ideas to make it less arbitrary are welcome :-)

This comment has been minimized.

Copy link
@m-kuhn

m-kuhn Feb 17, 2015

Member

My best shot is: Dismiss the threshold and fix the providers ;)

This comment has been minimized.

Copy link
@wonder-sk

wonder-sk Feb 18, 2015

Author Member

How would you fix the providers without introducing some arbitrary thresholds? If many features are selected, I believe it would be still more efficient to do sequential read rather than executing query that picks individual FIDs "WHERE fid IN (id1, id2, id3, ...)".

This comment has been minimized.

Copy link
@m-kuhn

m-kuhn Feb 18, 2015

Member

I'd completely go for the IN (1, 2, 3) approach.

There are too many variables that influence the outcome (network latency, network throughput (up and down may be different), server CPU, client CPU, disk seek time, dataset size)

The most important thing is that a selection between 1 and say 10000 features works fine (I guess that's what you're using 99%) and in this range the IN approach should work fine. Uploading should not take too long and the server should have an index on the id.

The sequential selection may become faster with bigger selections. Delay is probably mainly a function of upload speed and dataset size (for real databases). But estimating these properly is reading tea leaves at best.

This comment has been minimized.

Copy link
@nyalldawson

nyalldawson Feb 19, 2015

Collaborator

why not use a % threshold rather than a fixed value? Eg, if filtering to 95+% of the rows, it's probably better to fetch them all and do the filtering ourselves, but for < 95% use an IN(... ).

This comment has been minimized.

Copy link
@wonder-sk

wonder-sk Feb 19, 2015

Author Member

Well that would require using feature count which may trigger slow counting of features.
The % threshold may need to be tweaked per provider anyway...

This comment has been minimized.

Copy link
@m-kuhn

m-kuhn Feb 19, 2015

Member

I agree this needs to be tweaked per provider.

  • Postgres et al, should not have problems with giving a feature count immediately, so a percentage approach should be fine. And these will be the ones that profit most from changes in this area.
  • For CSV et al. we'll have to resort to sequential read in every case anyway.
  • For OGR I do not know the details.

But I think this is all really advanced magic and just using IN in every case for now should be a very suitable way of dealing with this with a very simple logic.


if ( mSelectedFeatureIds.count() <= 8 )
{
// for small amount of selected features, fetch them directly
// because request with FilterFids would go iterate over the whole layer
foreach ( int fid, mSelectedFeatureIds )
{
getFeatures( QgsFeatureRequest( fid ) ).nextFeature( f );
features << f;
}
}
else
{
features.push_back( f );
QgsFeatureIterator it = selectedFeaturesIterator();

while ( it.nextFeature( f ) )
{
features.push_back( f );
}
}

return features;
}

QgsFeatureIterator QgsVectorLayer::selectedFeaturesIterator( QgsFeatureRequest request )
{
if ( mSelectedFeatureIds.count() == 0 )
return QgsFeatureIterator();

if ( geometryType() == QGis::NoGeometry )
request.setFlags( QgsFeatureRequest::NoGeometry );

request.setFilterFids( mSelectedFeatureIds );
if ( mSelectedFeatureIds.count() == 1 )
request.setFilterFid( *mSelectedFeatureIds.constBegin() );
else
request.setFilterFids( mSelectedFeatureIds );

return getFeatures( request );
}

0 comments on commit 255cbd2

Please sign in to comment.
You can’t perform that action at this time.