"Internal Proj Error: [...] database disk image is malformed" when multiprocessing since pyproj 2.3 #426
Comments
If you don't mind trying PROJ 6.2, you could give this a go: #412
Also might be interested in #386
On 04/09/2019 16:13, Alan D. Snow wrote:
> If you don't mind trying PROJ 6.2, you could give this a go: #412
For now I have pinned our packages to <2.3, since proj4 >=6.2 is not yet
packaged on conda-forge, and I cannot easily package a patched version
of pyproj either.
But this does look very useful and probably fixes this issue. I will see
if I can somehow manage to get a conda environment with it and proj 6.2
to test in.
Side note: on conda-forge it is renamed to
Even with Proj 6.2+, we are encountering this bug, or a very similar one, where using Proj from multiple processes seems to corrupt the database file. Here:
as one can observe for instance in https://travis-ci.org/PyPSA/pypsa-eur/jobs/639710706?utm_medium=notification&utm_source=email . We are not calling pyproj ourselves, but call into
Thanks for any help, though.
@coroa, you are correct.
Thanks for the fast answer! I managed to find a solution in the meantime. For future reference, for other people who find this issue because of the "database disk image is malformed" error in conjunction with multiprocessing (and maybe also for @TimoRoth): in my case, making sure that gdal was imported only after forking into multiple processes fixed the issue. Importing gdal creates a gdal context, which, I suspect, also holds an sqlite database handle to the proj.db database, and that handle gets corrupted when multiple processes use it. Another working alternative is to use
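A minimal sketch of the "import after forking" pattern described above, in case it helps others. This is not the code from our project: `worker` and `run_in_workers` are hypothetical names, the standard-library `json` module stands in for gdal so the snippet runs anywhere, and the `"fork"` start method is assumed (it is not available on Windows).

```python
import importlib
import multiprocessing as mp
import os


def worker(module_name):
    # The import happens inside the worker, i.e. after the process has been
    # forked, so any handle the module opens on import belongs to this
    # process only. In the real setup module_name would be the gdal module;
    # here it is just a stand-in.
    module = importlib.import_module(module_name)
    return os.getpid(), module.__name__


def run_in_workers(module_name, n_tasks=4, processes=2):
    # Hypothetical driver. The parent must NOT import the module itself
    # before creating the pool, otherwise the children inherit its state.
    ctx = mp.get_context("fork")  # assumption: fork-based multiprocessing
    with ctx.Pool(processes=processes) as pool:
        return pool.map(worker, [module_name] * n_tasks)
```

Calling `run_in_workers("json")` from the main script returns one `(pid, module name)` pair per task. The point is only where the import statement sits, not the pool setup.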
Doesn't this need reporting to gdal? The
After some further analysis, this is caused by gdal using the proj C API itself to create non-autoclosing proj contexts. In the long run, a way to globally control the proj behaviour would be ideal: an env var that forces it to always autoclose the db, or at least a change of the default from false to true.
Code Sample, a copy-pastable example if possible
Unfortunately, it's not possible to produce a minimal example: this only happens in the full setup of our project, but it is 100% reproducible there.
See for example: https://travis-ci.org/OGGM/OGGM-Anaconda/jobs/580670196#L1406
I tried triggering this by just calling pyproj.Proj() in a lot of parallel processes, but it was not impressed by that and worked fine.
Problem description
Ever since pyproj 2.3
pyproj.exceptions.CRSError: Invalid projection: +init=epsg:4326 +type=crs: (Internal Proj Error: proj_create: SQLite error on SELECT auth_name FROM authority_list: database disk image is malformed)
occurs when trying to do pyproj.Proj("+init=EPSG:4326", preserve_units=True) in our concurrent multiprocessing setup.
Turning off multiprocessing and running things sequentially works around the issue.
Downgrading pyproj to <2.3 also fixes it. Mind that I did not downgrade the underlying proj4 binary library, so downgrading pyproj alone is enough to stop this from happening.
Environment Information
Installation method
Conda environment information (if you installed with conda):
Environment (conda list):
Details about conda and system (conda info):