BUG: sqlite read error with ProcessPoolExecutor #933

Closed
@nialov

Description

This bug seems similar to (or exactly the same as?) #426.

Code Sample, a copy-pastable example if possible

I've created a Poetry environment and Python scripts that reproduce the bug on my system. Because the failure depends on processes running in parallel, it might not reproduce on every system.

git clone https://github.com/nialov/pyproj-multiprocessing-bug-hunt.git
cd pyproj-multiprocessing-bug-hunt
# Need poetry installed on system
poetry install
# Script with parallel processes and which tries to reproduce bug
poetry run python script_parallel.py
# Sanity check script with sequential processing, which doesn't error
poetry run python script_sequential.py

Problem description

pyproj 3.2.0 raises an error when its SQLite database is read from parallel processes created with Python's concurrent.futures.ProcessPoolExecutor. I assume any other way of creating parallel processes in Python will reproduce this as well.

This bug occurred with pyproj 3.2.0 and is not present with pyproj 3.1.0.
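For readers who don't want to clone the repository, here is a minimal sketch of the kind of script that triggers the problem. It is not the actual script_parallel.py. The helper names `make_crs_wkt` and `run_in_pool`, the number of tasks, and the worker count are all made up for illustration. It only assumes that constructing `pyproj.CRS("EPSG:3067")` inside worker processes is enough to hit the SQLite read.

```python
from concurrent.futures import ProcessPoolExecutor


def make_crs_wkt(code):
    """Build a CRS inside the worker and return its WKT string."""
    # Imported lazily so run_in_pool can be used without pyproj installed.
    from pyproj import CRS

    return CRS(code).to_wkt()


def run_in_pool(fn, inputs, max_workers=4, executor_cls=ProcessPoolExecutor):
    """Submit fn(item) for every item and return (item, result, error) tuples.

    Hypothetical helper: errors are collected instead of raised so that
    every failing worker is reported, not just the first one.
    """
    outcomes = []
    with executor_cls(max_workers=max_workers) as executor:
        futures = [(item, executor.submit(fn, item)) for item in inputs]
        for item, future in futures:
            try:
                outcomes.append((item, future.result(), None))
            except Exception as exc:
                outcomes.append((item, None, exc))
    return outcomes


if __name__ == "__main__":
    # The task count is arbitrary; more tasks make a race more likely to show.
    for code, wkt, error in run_in_pool(make_crs_wkt, ["EPSG:3067"] * 8):
        print(code, "ok" if error is None else f"failed: {error!r}")
```

Passing `executor_cls=ThreadPoolExecutor`, or simply calling `make_crs_wkt` in a plain loop, gives a sequential-style comparison along the lines of the sanity check script.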

Error message:

➜ pr python script_parallel.py
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/usr/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/nialov/.cache/pypoetry/virtualenvs/pyproj-multiprocessing-bug-hunt-ovaqiMDF-py3.8/lib/python3.8/site-packages/pyproj/crs/crs.py", line 326, in __init__
    self._local.crs = _CRS(self.srs)
  File "pyproj/_crs.pyx", line 2347, in pyproj._crs._CRS.__init__
pyproj.exceptions.CRSError: Invalid projection: EPSG:3067: (Internal Proj Error: proj_create: SQLite error on SELECT auth_name FROM authority_list: database disk image is malformed)
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "script_parallel.py", line 13, in <module>
    print(process.result())
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
pyproj.exceptions.CRSError: Invalid projection: EPSG:3067: (Internal Proj Error: proj_create: SQLite error on SELECT auth_name FROM authority_list: database disk image is malformed)

Expected Output

CRS creation should work in parallel. I added a sequential example script, script_sequential.py, as a sanity check.

Environment Information

➜ pr pyproj -v
pyproj info:
    pyproj: 3.2.0
      PROJ: 8.1.1
  data dir: /home/nialov/.cache/pypoetry/virtualenvs/pyproj-multiprocessing-bug-hunt-ovaqiMDF-py3.8/lib/python3.8/site-packages/pyproj/proj_dir/share/proj
user_data_dir: /home/nialov/.local/share/proj

System:
    python: 3.8.10 (default, Jun  2 2021, 10:49:15)  [GCC 10.3.0]
executable: /home/nialov/.cache/pypoetry/virtualenvs/pyproj-multiprocessing-bug-hunt-ovaqiMDF-py3.8/bin/python
   machine: Linux-4.19.84-microsoft-standard-x86_64-with-glibc2.32

Python deps:
   certifi: 2021.05.30
       pip: 21.1.3
setuptools: 57.4.0
    Cython: None

Installation method

Installed from PyPI onto Ubuntu 20.10.

Labels

proj (Bug or issue related to PROJ)
