Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve deep relations with PostgreSQL - transaction groups, notify, savepoints #97

Open
3 of 6 tasks
haubourg opened this issue Jun 22, 2017 · 9 comments
Open
3 of 6 tasks

Comments

@haubourg
Copy link
Member

haubourg commented Jun 22, 2017

QGIS Enhancement: Improve deep relations with PostgreSQL - transaction groups, notify, savepoints

Date 2017/06/22

Authors Régis Haubourg (@haubourg) / Vincent Mora (@vmora)

Contact regis.haubourg@oslandia.com

maintainer @vmora

Version QGIS 3.X

Summary

This QEP as been granted by QGIS.org 2017.

Since version 2.14, QGIS offers the not-so-well-known ability to handle transaction groups,
which means it can instantly evaluate triggers on database side, and refresh all layers in the
same transaction. This is a big win for usability, but some drawbacks glitches remain, such as
the lack of the undo/redo edit buffer, a very raw way of saving (ie quitting edition session) or
having the legend cluttered by so many edit symbols (a pen symbol).

Current proposal is to achieve the following targets:

  • Add a flag to allow dirtying edit buffer from the API when calling transaction.executeSQL('aSQLquery'). Currently, if nothing has been modified in QGIS layer, and we call a stored procedure (like SELECT cut_pipe_on_xy('pipe_id', 'x', 'y'); ), QGIS isn't aware that the dataset has changed, so the "save" action isn't available and users can only close edit session, which execute a ROLLBACK, and then obviously looses the current work.

  • Restore an undo /redo feature by taking advantage of PostgreSQL. If possible we will try to
    take advantage of PostgreSQL named Savepoints.

  • Allow to have some layers not switching to edit mode in QGIS, even if they belong to the
    same connection. These layers will still benefit from the instant refresh, but won't clutter the
    legend with the edit pen symbol everywhere, nor risk to load QGIS snapping cache for nothing.

Note : this is already possible in the Project properties / Identify / ReadOnly column :-) .

  • take advantage of PostgreSQL NOTIFY signals to trigger behavior in QGIS when something changes in the database ( see here ) . A first implementation proposal is to allow a map canvas refresh, but we can imagine really dynamic applications driven by the database events by converting NOTIFY messages into QGIS signals.

Proposed Solutions

Dirty Edit Buffer on ExecuteSQL

WIP

UNDO/REDO on SavePoints

A first PR is drafted here: qgis/QGIS#4765

NOTIFY / LISTEN

Notes:
NOTIFY / LISTEN is a rough PostgreSQL mechanism.
It is not possible to list all channels that could be emitted by in a database. Thus, users will have to type by hand the channel name and message name they want to listen to.
A notify message is raised using the SQL command:
NOTIFY aChannelName, 'aMessage';
where the message text is optional, or
SELECT pg_notify('fo' || 'o', 'pay' || 'load');
which is more convenient when using variables

UI modifications proposed are

  • add specific NOTIFY text box to "subscribe" a PostgreSQL connection in DB connection dialog

dialogpostgresconnectionmockup

  • add an option in layer properties rendering options dialog

A new checkbox to allow refreshing a layer when a signal is emitted. It has a line edit widget to add an optional filter for the message carried by the NOTIFY. The layer will be refreshed only if the NOTIFY message matches.

image

Example(s)

Any operational application where data can be modified somewhere and where QGIS needs to be notified without having to constantly refresh layers.

Affected Files

ui/qgsvectorlayerpropertiesbase.ui
ui/qgspgnewconnectionbase.ui
..

Performance Implications

NOTIFY/ LISTEN requires QGIS to have a permanent connection to PostgreSQL opened to keep listening the NOTIFY messages. Users must be aware that this will consume some more available connections in the connection pool.
Also using a connection pooler like pg_pool should be tested to be sure connections are not closed unexpectedly from the pooler side.

Further Considerations/Improvements

(optional)

Backwards Compatibility

This work is only intended to QGIS 3 branch and is not supposed to break compatibility

Issue Tracking ID(s)

(optional)

Votes

(required)

@haubourg
Copy link
Member Author

@m-kuhn @3nids Hi guys, we have an opportunity to improve transaction groups maykbe here. Is there anything missing currently for some thick database applications you know of?

@m-kuhn
Copy link
Member

m-kuhn commented Jun 23, 2017

Nice! This has been on my wishlist for a long time. I had a quick look and it sounds really good.

I was especially happy to see that signalling more than just a simple "data changed" signal (dynamic applications driven by the database events) is being considered. Have you put some more thoughts into that already? In particular, did you think of a particular "signal signature convention" (something like SELECT pg_notify('my_layer_event', '{"action": "add", "primary_key": {"my_object_id": 7} }'); ) or some other way to propagate in more details what happened on db side?

@haubourg
Copy link
Member Author

Well, by now we thought about just implementing something very raw but using some structured text inside the message could be powerful for sure. I just have to check there is no limitation to the text formatting and length in the payload part of a NOTIFY, but that would be possible for sure.
Your particular example implies we start exposing sort of a flexible API between QGIS and PostgreSQL isn't it?
That is really interesting, do you have in mind sort of a dynamic db-driven Meta-project (ouch, long sentence)?
@vmora any opinion ?

@rduivenvoorde
Copy link
Contributor

Mmm, maybe silly question, but having a database with 1.6 biljon measurements and flowing in new ones every couple of minutes; does this mean the datasource could be update in QGIS every time new measurements come in?

Does this mean that the full (sub)query will be called? Or is there some way to dynamically maybe update the subquery (which is based both on extent and on a time-range)?

@haubourg
Copy link
Member Author

Hi Richard,
NOTIFY can be raised by a trigger on each INSERT, and in your case it will probably not be a good idea indeed. You can do anything you want in trigger functions or stored procedure, so it will be up to you to see if the use case is smart. Did I answer the question correctly?

@rduivenvoorde
Copy link
Contributor

@haubourg thanks for answereing, but not sure :-) We normally show just 50000 features at a time (so a small subset of the full dataset), say the measurements of the last 10 hours (showing them using the timemanager).
So say you show last 10 hours of measurements, it would be cool if I could trigger getting the (newly inserted) last 10 minutes measurements without triggering the (slow) 50000 features query again.
I was hoping that the trigger could be used to 'silently' update QGIS-datasource every now and then.
But probably I'm hoping for too much magic :-)

@vmora
Copy link

vmora commented Jun 25, 2017

@rduivenvoorde to the best of my knowledge, a complete fetch is done on e.g. pan/zoom, even if some features are already there.

Your request seems similar (at the filter level) to having qgis make a smarter query, or two queries: one for added + and one for "removed". It's out of the scope of what we initially planned, but an interesting idea nonetheless.

@haubourg
Copy link
Member Author

@rduivenvoorde Ah ok, I understand now, and it reminds me the topics about caching data in QGIS to avoid fetching all the dataset, with all the hard questions about invalidating cache :)

There are only two hard things in Computer Science: cache invalidation and naming things.
-- Phil Karlton

NOTIFY could eventually be used as a system to notify a cache invalidation maybe, now you have to figure out how to send that message from PG. If you have tons of data being loaded every minute, a trigger on inserts or even update is probably not a good idea. Maybe you internal update procedure could explicitly cast the notify signal.

Just wondering loudly, is it possible to create custom providers to test that ? I mean a DB provider caching data, loading only new data (how to filter new data then? using a sequencial id?) and caching it ? Notify signal would be the trigger for a new fetch for the whole dataset. In any case this is very specific to that particular database workflow.

@haubourg
Copy link
Member Author

haubourg commented Nov 9, 2017

Hi all,
last point remaining is to be able to dirty edit buffer when transaction.executeSQL('aSQLquery') is run.
Currently transaction will fail if the SQL call returns a tuple.
We changed our stored procedure to return no tuple, using this tip: https://stackoverflow.com/questions/1343954/how-do-i-write-a-postgres-stored-procedure-that-doesnt-return-anything
But it still fails. @pblottiere will launch a PR to remove that limitation in QGIS and then we will be done with the Grant App'. I considering it's more a bug fix than a feature, opinions about that and feature freeze?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants