-
-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Projects in PostgreSQL #118
Comments
+1 I would like to see a web service style endpoint added in the future as well. |
Huge +1 from me too. Also storing projects in SQLite would be nice. |
+1 I actually was thinking at this improvement recently. I'd like to see it for GPKG too (I mean in core, not as a plugin), so I'm happy to see an API to abstract project persistence. |
Interesting - thanks! However - don't forget that there are also SERVICE names instead of the hostname/port/db combo regarding the connections to Postgis. Some organizations rely heavily on these SERVICE definitions and need them to be supported where ever there are connections to PostgreSQL DBs. |
+1 .. Would be nice to see that wherever it is possible (postgis, sqlite, gepackage) |
+1 thanks @wonder-sk |
Does it seem a little odd putting a serialized dump of all settings into a single field? Is it too complex to consider one field per project key? I know that would be vastly more complex to implement. |
Interesting. I suppose the list of metadata (last modified timestamp / username) is not exhaustive and we'll want to expand it in the future. In this case will the schema be altered by future QGIS versions? Did you consider to use a jsonb column for metadata? |
@andreasneumann good point. Looking into postgres docs they already have the URI string defined supporting SERVICE and even more features like connection through unix sockets or multiple hosts. Seems like it is best to adopt what they support, so if I read it correctly, for SERVICE one would use URI like this: @tomchadwin We really want to dump everything into a single field - the project structure is complex and it changes over time, so we really just want to use it as a simple binary storage. (that's how we store layer styles as well) Also it makes it easier to handle other backends like this as well. @m-kuhn Haven't thought about the metadata much really and I am not even completely convinced we need those for the table, but it seemed like a good idea as one can get similar metadata from filesystem. Using jsonb could be a good option to keep it extensible for the future... |
+1
Thanks
Il 19 marzo 2018 10:23:15 GMT-04:00, Andreas Neumann <notifications@github.com> ha scritto:
…Interesting - thanks!
However - don't forget that there are also SERVICE names instead of the
hostname/port/db combo regarding the connections to Postgis.
Some organizations rely heavily on these SERVICE definitions and need
them to be supported where ever there are connections to PostgreSQL
DBs.
--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
#118 (comment)
--
Sorry for being short
|
@wonder-sk - regarding SERVICE defintions: The SERVICE definition may also contain the dbname, or even the user name. Typical usage is to have service configurations with a combination of "host/port/dbname" and separate user and pw. Password may be specified in a .pgpass file. See also https://www.postgresql.org/docs/current/static/libpq-pgservice.html or https://www.endpoint.com/blog/2016/10/26/postgres-connection-service-file |
I have just realized I have forgotten to add a small Q&A section I had prepared - now in the QEP... @andreasneumann right. For the record, the URI syntax I was referring to is documented here: https://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING |
I think this is a very interesting issue; it deserves great attention and much caution.
I suggest considering both options but favouring the latter. |
@cbertelli thanks for your ideas, out of budget here, you are welcome to support further work |
@cbertelli The complexity of the QGIS project file (XML) is really high - it goes far beyond just having a list of layers with styles... there is layer tree, extra project properties, extra map canvases, layer form definitions, attribute constraints, 3D view settings, print layouts, server configuration and much more. Writing all this information into some extra tables would be an overkill. At the same time the format would be still pretty much "proprietary" - most of that will always be specific to QGIS and there is no standard for such thing that would cover everything we need. If anyone is interested in parsing data out of QGIS project files, there is C++/Python API for it - I can imagine some plugins reading/writing projects in other (standard?) formats, but that would be always at the expense of loosing some project information (just like when using SLD/SE styles). |
Leaving aside budget considerations, there is also a consideration that such a structure would suggest a stability in the project file which is not existent. We also don't want to promise such a stability because this would no longer allow to continue developing QGIS at the current pace. The QGIS project file is open, but not an open standard. If we go somewhere in the direction of such meta-information, this will be a subset of configuration options, based on a standard which is to be developed in a process indendent from QGIS (if it doesn't already exist anyway). |
This is already off-topic, but for those interested in interoperability with other GIS applications: there is OGC OWS Context (http://www.owscontext.org/) which is meant to store information about a bunch of layers and their styling in a standard format. @doublebyte1 has implemented support for it in QGPKG plugin. It also uses XML/JSON to encode the project - see https://github.com/pka/qgpkg/blob/master/owc_geopackage_extension.md |
@wonder-sk Nice proposal. Another point to note is that we have the optional QGZ zipped format available. On another thread, we agreed that switching it as the default file format would be good to avoid users to have to be aware of ancillary qgd database and any future additional ressources stored along the project. That would mean it could be stored "inside" the project container. I suppose we want to rely on GDAL VSI virtual file system http://www.gdal.org/gdal_virtual_file_systems.html . @rouault do you think extending the VSI to file hosted in data bases could provide the abstraction required to seamlessly access any files in PG, GPKG, etc DB? |
There's already a similar thing used internally by the Oracle GeoRaster driver. It is /vsiocilob/user,password,server,table_name,rowid |
For zipped/non-zipped project format, this should be transparent from the point of view of the storage backend - it is just some binary data that will be passed to/from QgsProject. In terms of support for auxiliary storage (.qgd), I would say that it should simply always get embedded into the project ZIP file if it is needed. Not sure how we need GDAL VSI for access to the additional resources - all that should be handled by QgsProject I guess... |
Hum... It would be indeed very simple to store the project content into a single column, but IMHO it would be good to have the project into JSONB (or XML, but I would prefer JSONB) to be able to query the content easily. I reckon JSONB is for PostgreSQL. For Sqlite, we could use http://sqlite.org/json1.html |
|
@mdouchin we can't really store projects in JSONB as the projects are only stored in XML or in ZIP (containing project XML + other resources). Just like .qgs project files should not be accessed directly with third party XML parsers because the file format may change at any time, the same applies also when project files are written to the database: they should be treated as arbitrary binary data and only accessed through QgsProject API (because that is the only safe way to do it). Regarding to whether keep also format type in the |
@luipir The support for project reading/writing is not going to be directly related to data providers: we want to support also arbitrary project file backends independent from providers (e.g. imagine a plugin adding support for storage of projects in Dropbox / Nextcloud). Also, data providers cannot be created in python as far as I know (no support in QgsProviderRegistry). Not sure I understand why we should go through VSI if we can access data directly with more flexibility (e.g. through QgsPostgresConn for postgres) |
@wonder-sk sorry, I was biased by the saveSLD feature. You're right there is no relation with dataProvider. |
First one to use the new interfaces to write an Autodesk Vault plugin gets a massive thumbs up from me! I'm guessing we'll see a git project plugin at since stage too. Proposal looks good to me, but I'm still a bit unclear how the auxiliary storage would work with it. Can you clarify that? Just to put an end to the "don't use XML blob" arguments: it's simply not possible until qgis 4.0 at the minimum -- XML reading and writing is baked into the stable API in hundreds of QGIS classes. We CANNOT change this. |
One more question: will there be a common interface for querying available projects from a backend? |
@nyalldawson I would say that if auxiliary storage is needed with the project, the sqlite database file should be saved as a zipped project with the .qgd file in the zip file. That seems like the only reasonable solution to me. Creating project-specific tables with aux data inside postgres seems just too complicated and fragile. Often only admins are allowed to create new tables in shared databases.
Yes: there should be virtual methods to read/write project, list projects, remove project, rename project (?) and for display of backend specific GUI to pick project name for load/save. |
The issue here is that I can only see our reliance on auxiliary tables growing more with time, and this limitation will definitely hurt. At the same time, I'm also against trying to recreate auxiliary tables on the server. Just thinking aloud, but could a big blob of the zipped data just be thrown in as a column in the database table? |
Just thinking aloud, but could a big blob of the zipped data just be
thrown in as a column in the database table?
Just a blob in the end. I think this is the best method really. We just
create a qgz file local (in memory) and then save into the database. Like
it has been said before we don't really want people to be messing around
with the raw project files as there is no spec to follow
…On Tue, Mar 20, 2018 at 7:38 AM, Nyall Dawson ***@***.***> wrote:
I would say that if auxiliary storage is needed with the project, the
sqlite database file should be saved as a zipped project with the .qgd file
in the zip file.
The issue here is that I can only see our reliance on auxiliary tables
growing more with time, and this limitation will definitely hurt.
At the same time, I'm also against trying to recreate auxiliary tables on
the server.
Just thinking aloud, but could a big blob of the zipped data just be
thrown in as a column in the database table?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#118 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAXS3I7V0Y37J59MpnOTi0SKZakffStCks5tgCVIgaJpZM4SwOUV>
.
|
Using a database as if it was a filesystem has some side effects. Having PostgreSQL parse XML could help avoid security issues. Storing the project as a blog saves a lot of space, but has some drawbacks. |
@cbertelli are you talking about locks ? |
XML is a structured data inside PostgreSQL, the DBMS checks it before storage and you can also query it with xpath. You cannot write "format C:\ /s" inside it as a "joke" for your buddy. If you want to implement security checks you don't need to use huge and obscure regular expressions to find out threats. |
Hi, How do you deal with user concurrency ?
Any conflict management mechanism ? Like hashes of the project loaded and saved so as to warn the user if he is going to overwrite a different project than he initially loaded ? Or versioning the projects ? What would be the mechanisms to clean the projects DB ? |
Also :
|
@vpicavet The project storage interface does not deal with user concurrency at all - it is left for backends to deal with it. Some implementations may do versioning if they feel like doing that. For PostgreSQL implementation I do not plan any conflict management or versioning: simply the last save wins - just as if people were storing projects on a shared network volume. If anyone has a need for versioning, there could be a more complicated "PostgreSQL with versioning" storage implementation. Cleaning of projects DB - the storage implementations should support removal of projects as well. In GUI this could be triggered from browser panel or through custom GUI when loading/saving projects for the given implementation. Location of the table: Where to get user names: Private information - agreed it is an important topic, but I think this is fairly independent from this QEP - the storage of projects in PostgreSQL is not really different from storage of project files on a shared network volume where the same risks apply. |
Well, I think for common users a filesystem is by default local, not shared, whereas a database is shared with other users by default. As for the table location, if you can choose the schema it's indeed ok. As for user names, you would stick with PostgreSQL user then. There are often shared user connections and cases where the user name would better be chosen from the OS or QGIS itself, but this probably does not matter that much. |
@vpicavet There are secure alternatives for credentials storage: the QGIS authentication database, storing credentials in a shared project is normally not a good idea. |
Well, I mainly see file storage on shared network drives out there. I agree with Martin that the PG storage will not change the situation here. We have the service file definition or the authentication system to deal with externalized credentials and Auth. |
I'm referring to MapInfo project format with that comment, not XML -- you can insert any kind of system command inside a mapinfo workspace and it's executed silently when loading the project! (This also applies to tab files loaded within mapinfo... shockingly insecure stuff!) Sorry, the discussion got sidetracked there. I'm -0 on using a postgres XML data type to store the qgis project XML, simply because this would apply a validity check on the XML. This constraint isn't currently present, and I suspect it may break plugins (and possibly even core code). As others have stated, better to treat qgis XML as its own proprietary format and not rely on parsing it. The format changes too frequently for any parser to be stable (and indeed - even when qgis server itself used to use this approach it was extremely fragile). |
+10 ! |
Yes although maybe we should just store the qgz format as blob and be done with. If you are not meant to parse it from the database directly anyway we can store the other project data in a single file like the file system. @wonder-sk can I suggest putting in a "version" column which is just an incrementing id that gets changed on save. This can be used in future to compare the local version to the server version to see if someone else has written to that record and if they have you can then see what the user id and date time is.. We can use date and time for this however it allows for change the database timezone (lets say you have it in AWS and move the region) without breaking the version check. |
I think we could generalise this a bit, if QgsProjectStorage had some virtual methods like Currently the file-based projects have a nice feature where the modified timestamp is checked before save -- this could be extended to use |
I'd be +1 on jsonb here. I think this could then work nicely alongside #115 and we could add the other project metadata there too - e.g. keywords/subject/title/abstract/etc. |
@NathanW2 I would really like to stay away from adding any extra column like "version" at the project storage level:
Working on the implementation for a bit, this seems like the only reasonable solution. Otherwise we get this explosion of combinations: 1. xml or zip, 2. with or without aux db, 3. file system or custom storage. If we allow all options, we have 2 x 2 x 2 = 8 possible cases that may need different handling in different cases. I am just a simple person - my head may explode when trying to deal with more than few combinations. @nyalldawson if I understand correctly your suggestion regarding #115, you would like to have a cache of some pieces of project metadata like title/keywords to be able to query those quickly without having to open the whole project? I can imagine having optional support for that in project storage interface at some point. I am just not looking forward to the day when someone will want to cache also map thumbnails :-) |
@wonder-sk I'm ok if you don't want to implement it first go just maybe do what Nyall had and have a virtual method to get the version hash and a method to compare the version which can then just be what ever the project storage thinks is the best method to work that out. We will have a bit of time left for 3.2 so I would say do whatever you think is best to get it merged and then we can test it out. |
IMHO versioning can also be done by the backend, via rules or triggers, and should be left for the future. |
About concurrency: would it be an idea to force single write-use of a project and have a 'locked' column. So the first user 'locks' the project (username written in locked column)? |
@rduivenvoorde this kind of locking would be tricky to get right. Problems I see:
Nyall had a good point - when saving a project, QGIS checks whether the timestamp has not changed since the timestamp when it was loaded - if it did, it shows a warning: "The loaded project file on disk was meanwhile changed. Do you want to overwrite the changes?". It seems to me that this is good enough if we extend it to custom project storage as well. |
agreed with @wonder-sk here. Locking tools are good ideas at start and end-up in nightmares so often. |
I understand and I admit all my suggestions may become constraints and slow down development, but I've learnt to think that standards are opportunities for freedom more that constraints. @doublebyte1's ideas (thanks @wonder-sk for pointing them out) and development are really promising. Would it be feasible to split this proposal in two parts, one that build upon existing standards, the other that stores what is not yet stable or not useful for the user? |
@cbertelli I am not sure if splitting the proposal would really help anything. In the current form it covers 1. project storage interface, 2. implementation for PostgreSQL. Anyone is free to implement a backend of their choice - either storing QGIS project format directly or using a different format that's understood by other applications as well. |
I have a question on this feature I see already implemented. Does QGIS Server also supports this and how? |
Hello I just tried this option from windows 7 with postgres 9.4 and Qgis Bon 3.2 is not working I have activated the option to save projects in the postgres connection Maybe some additional configuration is required? Please help this option is more than interesting for me |
Please file a ticket under here: |
I have noticed that all users of a database must alter their connection properties to enable saving of QGIS projects in PostgreSQL. Seems this should only be done once by the DBO without every user having to alter their connection details...? |
QGIS Enhancement: Projects in PostgreSQL
Date 2018/03/19
Author Martin Dobias (@wonder-sk)
Contact martin dot dobias at lutraconsulting dot co dot uk
maintainer @wonder-sk
Version QGIS 3.2
Summary
The idea is to introduce the possibility to save/load projects inside PostgreSQL databases in addition to saving/loading projects as .qgs files. This would be done by introducing a QGIS-specific table to the database (in a similar fashion like in-database storage of layer styles works already).
There is already QGPKG python plugin that does a similar thing - load/save QGIS projects in GeoPackage files. This QEP only focuses on PostgreSQL, however it paves the road for native support for other project backends like GeoPackage.
Proposed Solution
The “Project” menu should get two more menu items: “Load from…” and “Save to…” - both of them will then list of available project storage backends (e.g. PostgreSQL, GeoPackage). Clicking these items would open backend-specific GUI to load/save project. In case of PostgreSQL it will allow user to pick connection, schema and project name.
In addition to that, there should be support in Browser panel: project files stored in database should show up in the tree and dragging a project into QGIS would load it. In the future there may be support for drag&drop of project files between different project storage backends - but that is not going to be available in the initial implementation.
When working with a project that is already stored in database, there should be no difference in the workflow - clicking “Save” action would overwrite the existing version in the backend storage.
Database Storage
We will introduce a new table
qgis_projects
which would be only created the first time user wants to save a QGIS project in a particular schema. The table will have the following columns:Design
We will add
QgsProjectStorage
class - an interface that abstracts listing of project files, reading, writing and their removal. Individual storage backends would implement this interface to provide the project read/write support.Another new class,
QgsProjectStorageRegistry
, will keep a registry of project storages. It will be a singleton accessible fromQgsApplication
just like other singletons.QgsProject
class will support backend-specific file names. If file name starts with “abc:” where “abc” is one of the known storage backends, the initial phase of project reading and the final phase of project writing is delegated to the storage backends. If there is no such match, project will be stored as a .qgs file in a path as usual.Proposed project filename syntax for PostgreSQL:
postgres://[user[:pass]@]host[:port]/?dbname=X&schema=Y&project=Z
Q & A
Q: How about paths to layers or other files if they are not from the database?
All references to files will be saved as absolute paths - if a project is used by multiple users, these paths should be on a shared drive so that the projects can be successfully loaded by everyone.
Q: Will plugins be able to hook into this functionality?
Yes - anyone could implement a custom QgsProjectStorage and register it in QgsProjectStorageRegistry singleton.
Affected Files
QgsProject
Performance Implications
None
Backwards Compatibility
N/A
Votes
(required)
The text was updated successfully, but these errors were encountered: