New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OGR provider: only enable journal_mode = wal when editing. requires GDAL >= 3.4.2 (fixes #23991) #47098
Conversation
I'd love to see this backported. Maybe after it lives on 3.24 for a month or so. |
Agree, especially if this means that the GPKG open in read only mode will stop to be marked as changed on the file system. |
95a062a
to
99aecd8
Compare
…DAL >= 3.4.2 (fixes qgis#23991) This requires the NOLOCK open option of the GPKG driver added per OSGeo/gdal#5207 For earlier GDAL version, previous behaviour is kept. With GDAL >= 3.4.2, when creating a QgsOgrProvider object, we first open it in update mode without forcing WAL, to get the appropriate capabilities, close it, and re-open it in read-only mode with the NOLOCK=YES open option. This option will only be honoured if the file wasn't already in WAL mode. When editing a layer, the file is re-opened in update mode and with enabling WAL to avoid blocking between readers and writers. When closing a file, journal mode is attempted to be reset to DELETE as before. I've verified that test_provider_ogr_gpkg.py and test_provider_ogr.py pass locally with GDAL master + OSGeo/gdal#5207
99aecd8
to
f280714
Compare
This would fix #23991, thank you so much in advance. |
Thank you SO very much for handling this! 👍 🎉 Looking forward to the next releases. FYI @paleolimbot |
When a file is opened as NOLOCK=yes, does that really mean no locking? Isn't it then the case that reading from the geopackage is unsound if some other program has it open? I think an underlying issue is likely a difference of opinion as to the nature of geopackage files. qgis seems to view them as private data files within the qgis world, as opposed to databases that can be accessed by others, or at least that's how it seems reading the discussions. sqlite is an odd case in databases, as it can be both private storage for a program (perhaps the more common case) as well as functioning like a regular database. |
yes
see bottom of comment of OSGeo/gdal#5207 (comment) |
@rouault tested with QGIS 3.24.0 and GDAL 3.4.2 on Windows:
|
The NOLOCK=YES open option is mostly an internal detail. Users shouldn't need to set it generally. The provider is supposed to do the right thing, and set it automatically when the layer is in read-only mode, whatever how it was added to the project |
It seems to me it is not the case. Adding a GPKG layer (not opened by any other process) with drag and drop or using the Add Vector Layer window and leaving the NOLOCK option as |
Has your QGIS version being rebuilt against GDAL 3.4.2 headers (and not just using it at runtime) ? |
Actually not. I'm using the GDAL 3.4.2 runtime with QGIS 3.24.0 built against GDAL 3.4.1. So it should is this the issue. Then I will try again with a QGIS version built against GDAL 3.4.2. Thanks for the clarification. I apologise for the noise... |
Hi The behaviour seems correct to me when a GPKG file is dragged and dropped in QGIS or added to a map using the Browser panel (in such cases the whal and shm files are not created and the modify date of the gpkg file is not changed). If the GPKG file is added to a map using the Vector Layer window, then the behaviour is always the same (whal and shm files are not created and the modify date of the gpkg file is not changed) however the user sets the NOLOCK option ( |
@agiudiceandrea I was mentioned in your comment, but you probably meant @rouault ? |
Please, excuse me! I meant to tag Even Rouault. |
This option is automatically set by the OGR provider. No need for the user to mess with it. Refs qgis#47098 (comment)
it shouldn't be exposed (it is currently, as it is done automatically since QGIS auto-discovers open options from GDAL driver metadata). Fixed per #47992 |
This option is automatically set by the OGR provider. No need for the user to mess with it. Refs #47098 (comment)
This option is automatically set by the OGR provider. No need for the user to mess with it. Refs #47098 (comment)
This option is automatically set by the OGR provider. No need for the user to mess with it. Refs #47098 (comment)
This option is automatically set by the OGR provider. No need for the user to mess with it. Refs #47098 (comment)
This option is automatically set by the OGR provider. No need for the user to mess with it. Refs qgis/QGIS#47098 (comment)
Fixes qgis#48012 Basically, for offline editing, we ask QgsOgrProvider to adopt the behaviour it had before qgis#47098, that is use SQLite3 journal_mode = WAL even in read-only mode. Sorry, this "fix" is horrible, but I couldn't get with something better after many hours of investigation. The root cause seems to be that QgsOfflineEditing creates layers using direct OGR API or sqlite3 API, and also uses QgsOgrProviderUtils which has a GDAL dataset reuse mechanism. And all that does not place nicely at all. Maybe the root root cause is that the OGR GPKG driver establishes the list of layers at its opening, and if creates a new one behind its back, which is prone to happen with the dataset caching metchanism, the GDALDataset::GetLayerByName() method doesn't see the newly created layer. With that in mind, I tried to add calls to QgsOgrProviderUtils::invalidateCachedDatasets() but to no avail. Also QgsOgrProviderUtils::invalidateCachedLastModifiedDate() which is used by QgsOgrProvider::createEmptyLayer(), but to no avail. So basically I don't understand why offline editing worked before with a GPKG backend. I'd say it is pure luck. Note: I also saw that qgsofflineediting.cpp at line 315 calls offlineLayer->dataProvider()->invalidateConnections() but that method is only implemented in the Spatialite provider. I tried to implement it in the OGR one, but to no avail too.
Fixes qgis#48012 Basically, for offline editing, we ask QgsOgrProvider to adopt the behaviour it had before qgis#47098, that is use SQLite3 journal_mode = WAL even in read-only mode. Sorry, this "fix" is horrible, but I couldn't get with something better after many hours of investigation. The root cause seems to be that QgsOfflineEditing creates layers using direct OGR API or sqlite3 API, and also uses QgsOgrProvider which has a GDAL dataset reuse mechanism. And all that does not place nicely at all. Maybe the root root cause is that the OGR GPKG driver establishes the list of layers at its opening, and if creates a new one behind its back, which is prone to happen with the dataset caching metchanism, the GDALDataset::GetLayerByName() method doesn't see the newly created layer. With that in mind, I tried to add calls to QgsOgrProviderUtils::invalidateCachedDatasets() but to no avail. Also QgsOgrProviderUtils::invalidateCachedLastModifiedDate() which is used by QgsOgrProvider::createEmptyLayer(), but to no avail. So basically I don't understand why offline editing worked before with a GPKG backend. I'd say it is pure luck. Note: I also saw that qgsofflineediting.cpp at line 315 calls offlineLayer->dataProvider()->invalidateConnections() but that method is only implemented in the Spatialite provider. I tried to implement it in the OGR one, but to no avail too.
Basically, for offline editing, we ask QgsOgrProvider to adopt the behaviour it had before qgis#47098, that is use SQLite3 journal_mode = WAL even in read-only mode. Sorry, this "fix" is horrible, but I couldn't get with something better after many hours of investigation. The root cause seems to be that QgsOfflineEditing creates layers using direct OGR API or sqlite3 API, and also uses QgsOgrProvider which has a GDAL dataset reuse mechanism. And all that does not place nicely at all. Maybe the root root cause is that the OGR GPKG driver establishes the list of layers at its opening, and if creates a new one behind its back, which is prone to happen with the dataset caching metchanism, the GDALDataset::GetLayerByName() method doesn't see the newly created layer. With that in mind, I tried to add calls to QgsOgrProviderUtils::invalidateCachedDatasets() but to no avail. Also QgsOgrProviderUtils::invalidateCachedLastModifiedDate() which is used by QgsOgrProvider::createEmptyLayer(), but to no avail. So basically I don't understand why offline editing worked before with a GPKG backend. I'd say it is pure luck. Note: I also saw that qgsofflineediting.cpp at line 315 calls offlineLayer->dataProvider()->invalidateConnections() but that method is only implemented in the Spatialite provider. I tried to implement it in the OGR one, but to no avail too.
Fixes #48012 Basically, for offline editing, we ask QgsOgrProvider to adopt the behaviour it had before #47098, that is use SQLite3 journal_mode = WAL even in read-only mode. Sorry, this "fix" is horrible, but I couldn't get with something better after many hours of investigation. The root cause seems to be that QgsOfflineEditing creates layers using direct OGR API or sqlite3 API, and also uses QgsOgrProvider which has a GDAL dataset reuse mechanism. And all that does not place nicely at all. Maybe the root root cause is that the OGR GPKG driver establishes the list of layers at its opening, and if creates a new one behind its back, which is prone to happen with the dataset caching metchanism, the GDALDataset::GetLayerByName() method doesn't see the newly created layer. With that in mind, I tried to add calls to QgsOgrProviderUtils::invalidateCachedDatasets() but to no avail. Also QgsOgrProviderUtils::invalidateCachedLastModifiedDate() which is used by QgsOgrProvider::createEmptyLayer(), but to no avail. So basically I don't understand why offline editing worked before with a GPKG backend. I'd say it is pure luck. Note: I also saw that qgsofflineediting.cpp at line 315 calls offlineLayer->dataProvider()->invalidateConnections() but that method is only implemented in the Spatialite provider. I tried to implement it in the OGR one, but to no avail too.
Fixes #48012 Basically, for offline editing, we ask QgsOgrProvider to adopt the behaviour it had before #47098, that is use SQLite3 journal_mode = WAL even in read-only mode. Sorry, this "fix" is horrible, but I couldn't get with something better after many hours of investigation. The root cause seems to be that QgsOfflineEditing creates layers using direct OGR API or sqlite3 API, and also uses QgsOgrProvider which has a GDAL dataset reuse mechanism. And all that does not place nicely at all. Maybe the root root cause is that the OGR GPKG driver establishes the list of layers at its opening, and if creates a new one behind its back, which is prone to happen with the dataset caching metchanism, the GDALDataset::GetLayerByName() method doesn't see the newly created layer. With that in mind, I tried to add calls to QgsOgrProviderUtils::invalidateCachedDatasets() but to no avail. Also QgsOgrProviderUtils::invalidateCachedLastModifiedDate() which is used by QgsOgrProvider::createEmptyLayer(), but to no avail. So basically I don't understand why offline editing worked before with a GPKG backend. I'd say it is pure luck. Note: I also saw that qgsofflineediting.cpp at line 315 calls offlineLayer->dataProvider()->invalidateConnections() but that method is only implemented in the Spatialite provider. I tried to implement it in the OGR one, but to no avail too.
Fixes #48012 Basically, for offline editing, we ask QgsOgrProvider to adopt the behaviour it had before #47098, that is use SQLite3 journal_mode = WAL even in read-only mode. Sorry, this "fix" is horrible, but I couldn't get with something better after many hours of investigation. The root cause seems to be that QgsOfflineEditing creates layers using direct OGR API or sqlite3 API, and also uses QgsOgrProvider which has a GDAL dataset reuse mechanism. And all that does not place nicely at all. Maybe the root root cause is that the OGR GPKG driver establishes the list of layers at its opening, and if creates a new one behind its back, which is prone to happen with the dataset caching metchanism, the GDALDataset::GetLayerByName() method doesn't see the newly created layer. With that in mind, I tried to add calls to QgsOgrProviderUtils::invalidateCachedDatasets() but to no avail. Also QgsOgrProviderUtils::invalidateCachedLastModifiedDate() which is used by QgsOgrProvider::createEmptyLayer(), but to no avail. So basically I don't understand why offline editing worked before with a GPKG backend. I'd say it is pure luck. Note: I also saw that qgsofflineediting.cpp at line 315 calls offlineLayer->dataProvider()->invalidateConnections() but that method is only implemented in the Spatialite provider. I tried to implement it in the OGR one, but to no avail too.
Fixes #48012 Basically, for offline editing, we ask QgsOgrProvider to adopt the behaviour it had before #47098, that is use SQLite3 journal_mode = WAL even in read-only mode. Sorry, this "fix" is horrible, but I couldn't get with something better after many hours of investigation. The root cause seems to be that QgsOfflineEditing creates layers using direct OGR API or sqlite3 API, and also uses QgsOgrProvider which has a GDAL dataset reuse mechanism. And all that does not place nicely at all. Maybe the root root cause is that the OGR GPKG driver establishes the list of layers at its opening, and if creates a new one behind its back, which is prone to happen with the dataset caching metchanism, the GDALDataset::GetLayerByName() method doesn't see the newly created layer. With that in mind, I tried to add calls to QgsOgrProviderUtils::invalidateCachedDatasets() but to no avail. Also QgsOgrProviderUtils::invalidateCachedLastModifiedDate() which is used by QgsOgrProvider::createEmptyLayer(), but to no avail. So basically I don't understand why offline editing worked before with a GPKG backend. I'd say it is pure luck. Note: I also saw that qgsofflineediting.cpp at line 315 calls offlineLayer->dataProvider()->invalidateConnections() but that method is only implemented in the Spatialite provider. I tried to implement it in the OGR one, but to no avail too.
Fixes #48012 Basically, for offline editing, we ask QgsOgrProvider to adopt the behaviour it had before #47098, that is use SQLite3 journal_mode = WAL even in read-only mode. Sorry, this "fix" is horrible, but I couldn't get with something better after many hours of investigation. The root cause seems to be that QgsOfflineEditing creates layers using direct OGR API or sqlite3 API, and also uses QgsOgrProvider which has a GDAL dataset reuse mechanism. And all that does not place nicely at all. Maybe the root root cause is that the OGR GPKG driver establishes the list of layers at its opening, and if creates a new one behind its back, which is prone to happen with the dataset caching metchanism, the GDALDataset::GetLayerByName() method doesn't see the newly created layer. With that in mind, I tried to add calls to QgsOgrProviderUtils::invalidateCachedDatasets() but to no avail. Also QgsOgrProviderUtils::invalidateCachedLastModifiedDate() which is used by QgsOgrProvider::createEmptyLayer(), but to no avail. So basically I don't understand why offline editing worked before with a GPKG backend. I'd say it is pure luck. Note: I also saw that qgsofflineediting.cpp at line 315 calls offlineLayer->dataProvider()->invalidateConnections() but that method is only implemented in the Spatialite provider. I tried to implement it in the OGR one, but to no avail too.
hopefully #48119 should fix network issues, but only compile-tested here |
Thanks a lot again! I will try the MingW64 build and report back ASAP. |
This requires the NOLOCK open option of the GPKG driver added per OSGeo/gdal#5207
For earlier GDAL version, previous behaviour is kept.
With GDAL >= 3.4.2, when creating a QgsOgrProvider object, we first open
it in update mode without forcing WAL, to get the appropriate
capabilities, close it, and re-open it in read-only mode with the
NOLOCK=YES open option. This option will only be honoured if the file
wasn't already in WAL mode.
When editing a layer, the file is re-opened in update mode and with
enabling WAL to avoid blocking between readers and writers.
When closing a file, journal mode is attempted to be reset to DELETE as
before.
I've verified that test_provider_ogr_gpkg.py and test_provider_ogr.py
pass locally with GDAL master + OSGeo/gdal#5207
Could be worth backporting to LTR, but perhaps would require some additional testing from others