Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QgsQueryResultModel #40750

Merged
merged 13 commits into from Jan 14, 2021
Merged

Conversation

elpaso
Copy link
Contributor

@elpaso elpaso commented Dec 24, 2020

A model for QgsAbstractDatabaseProviderConnection::QueryResult

@elpaso elpaso added API API improvement only, no visible user interface changes PyQGIS Related to the PyQGIS API labels Dec 24, 2020
@github-actions github-actions bot added this to the 3.18.0 milestone Dec 24, 2020
@m-kuhn
Copy link
Member

m-kuhn commented Dec 24, 2020

Nice.

When is "canFetchMore" and "fetchMore" called?

What I'm mostly interested in knowing is if this keeps a long-living iterator around, and if this could deplete the connection pool and freeze if multiple models are open?

@elpaso
Copy link
Contributor Author

elpaso commented Dec 24, 2020

Nice.

When is "canFetchMore" and "fetchMore" called?

I suppose it will be called by the view https://doc.qt.io/qt-5/qabstractitemmodel.html#canFetchMore

What I'm mostly interested in knowing is if this keeps a long-living iterator around, and if this could deplete the connection pool and freeze if multiple models are open?

That's implementation dependent: OGR/spatialite use the connections pool and PG also does. I'm not sure about the internals of MSSQL though.

Maybe it would be worthwhile to close the connections and release the resources when iteration is completed though, I haven't done that. Anyway, it can be done in a future PR, it's not directly related to this model but to the other PR #40726.

@m-kuhn
Copy link
Member

m-kuhn commented Dec 24, 2020

Isn't it related to this PR, since this approach here demands a long living iterator in the member variables?

So it should either directly consume the complete iterator on instantiation or accept a request instead that it will send in a series of execSql calls with different limit/offset parameters for paging?

@elpaso elpaso added the Prototype For comment only, do not merge. Avoid using this tag and open a QEP instead! label Dec 24, 2020
@elpaso
Copy link
Contributor Author

elpaso commented Dec 24, 2020

Isn't it related to this PR, since this approach here demands a long living iterator in the member variables?

So it should either directly consume the complete iterator on instantiation or accept a request instead that it will send in a series of execSql calls with different limit/offset parameters for paging?

What I'm trying to do here is a model that behaves like this: https://doc.qt.io/qt-5/qtwidgets-itemviews-fetchmore-example.html

So, no: no paging in client code, but I'm trying to use what QT offers.

I said it's not directly related because it's QgsQueryResult internal implementation how the iterator works (it's not even public API).

Fetching all rows would defeat the whole purpose of this model and freeze the GUI if we have to fetch all rows before using it.

After chatting with @m-kuhn we think that a better approach would require fetching all rows in a separate thread and delete the iterator when done (releasing the resources), to avoid occupying too many connections in long-living iterators.

@elpaso elpaso added the NOT FOR MERGE Don't merge! label Dec 24, 2020
@elpaso elpaso force-pushed the connections-api-queryresult-model branch 4 times, most recently from 4133d05 to d9f602c Compare December 28, 2020 08:07
@elpaso elpaso removed NOT FOR MERGE Don't merge! Prototype For comment only, do not merge. Avoid using this tag and open a QEP instead! labels Dec 28, 2020
@elpaso
Copy link
Contributor Author

elpaso commented Dec 28, 2020

@m-kuhn I've implemented the threaded fetcher in the model and fixed a few other issues. The iterator is now thread-safe.

{
mWorker = new QgsQueryResultFetcher( &mQueryResult );
mWorker->moveToThread( &mWorkerThread );
connect( &mWorkerThread, &QThread::finished, mWorker, &QObject::deleteLater );
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete later won't null the pointer, so this could cause a crash in the destructor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Would it be ok to move worker's deleteLater at the end of the destructor after waiting for the thread to quit?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change the raw mWorker pointer member to be a QPointer instead -- that should help


virtual QVariantList nextRowPrivate() = 0;
virtual bool hasNextRowPrivate() const = 0;
mutable QList<QList<QVariant>> mRows;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're storing all the fetched results in a list in memory, doesn't this negate the benefit of having an iterator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memory-wise, yes. There is still an advantage in using the iterator that is that the fetching is lazy.

This is the only way I've found to make it work from multiple threads and with rows() without re-running the query every time, which we don't want to happen because the query might have side-effects.

Maybe the root issue is that this implementation started with rows(), then I added the iterator later. So, if you call rows() multiple times you still fetch the rows only once, additionally you can also call rows() and use the iterator at the same time from multiple threads without issues. at() is also supported.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nyalldawson forgot to mention: in order for the model to work, we need to store the rows anyway, we could store them in the model itself, but IMO storing them in the (private and shared) iterator has the advantages mentioned above.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elpaso Good point regarding the model requiring the storage of all rows, but I think ideally it should be model's responsibility to store the list of rows instead of the iterator's. Then the iterator could be a "pure iterator", as memory-friendly as possible and could be used in other contexts (ie outside of the model) without inheriting the storage requirement of the model.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And to add to that -- we could make the QueryResult::rows() method still iterate over all rows, but remove the mRows member and just build a list in the rows() method internally and return that. So rows will still be slow to call and subject to potential memory limitations, but nextRow() will be as light as possible..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we cannot run rows() two times without re-running the query (with possible side-effects), this is why I was storing the results inside the class.

But I'm ok with turning it into a real iterator over the rows, but we'll need to drop rows() or just make it return an empty set when called the second time.

@elpaso elpaso force-pushed the connections-api-queryresult-model branch from 65d543c to f58385b Compare December 29, 2020 08:41
@elpaso elpaso force-pushed the connections-api-queryresult-model branch from f58385b to f926915 Compare January 6, 2021 08:50
Store rows in the model, remove rewind and random access from
query result class.

This implementation is way simpler but the (forward) iterator
can be used only once.

I'm still convinced that the old implementation was better and more
flexible but I need to move forward.

Client code will need to store the results from the iterator.
@elpaso elpaso force-pushed the connections-api-queryresult-model branch from f926915 to 337c7c3 Compare January 13, 2021 10:48
@elpaso elpaso merged commit a0d847f into qgis:master Jan 14, 2021
@elpaso elpaso deleted the connections-api-queryresult-model branch January 14, 2021 10:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API API improvement only, no visible user interface changes PyQGIS Related to the PyQGIS API
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants