Sporadic error with multi-threaded overview generation #10245

Closed
matsamentet opened this issue Jun 19, 2024 · 2 comments · Fixed by #10246

matsamentet commented Jun 19, 2024

What is the bug?

Running BuildOverviews on a GeoPackage dataset with the config option GDAL_NUM_THREADS=ALL_CPUS and bilinear resampling results in sporadic errors. The errors seem to happen around 50% of the time. Simpler resampling methods such as average and nearest do not appear to cause the error.

Steps to reproduce the issue

Run a script that repeats the BuildOverviews() call on the same source file (attached as MassiveGeopackage_output.zip) multiple times. The files are stored on a spinning hard drive.

from osgeo import gdal
import os
import shutil

srcfile = r'F:\05_Annet\2024-06-19-gdaladdo-bug\data\MassiveGeopackage_output.gpkg'
copiedfile = r'F:\05_Annet\2024-06-19-gdaladdo-bug\out\test-overviews.gpkg'

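for i in range(20):
    print(f'Attempt {i}')

    # Work on a fresh copy so every attempt starts without overviews
    shutil.copy(srcfile, copiedfile)
    gdal.SetConfigOption('GDAL_NUM_THREADS', 'ALL_CPUS')

    # Open in update mode and build the 2x and 4x overviews
    gpkg_ds = gdal.OpenEx(copiedfile, gdal.OF_UPDATE)
    gpkg_ds.BuildOverviews(resampling='bilinear', overviewlist=[2, 4])
    gpkg_ds = None  # close the dataset so the copy can be deleted

    os.unlink(copiedfile)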

This results in the following output:

Attempt 1
ERROR 1: sqlite3_exec(COMMIT) failed: database is locked
ERROR 1: F:\05_Annet\2024-06-19-gdaladdo-bug\out\test-overviews.gpkg - zoom_level=1, band 1: IReadBlock failed at X offset 0, Y offset 0
Attempt 2
Attempt 3
ERROR 1: sqlite3_exec(COMMIT) failed: database is locked
ERROR 1: F:\05_Annet\2024-06-19-gdaladdo-bug\out\test-overviews.gpkg - zoom_level=1, band 1: IReadBlock failed at X offset 0, Y offset 0
Attempt 4
Attempt 5
ERROR 1: sqlite3_exec(COMMIT) failed: database is locked
ERROR 1: F:\05_Annet\2024-06-19-gdaladdo-bug\out\test-overviews.gpkg - zoom_level=1, band 1: IReadBlock failed at X offset 0, Y offset 0
Attempt 6
Attempt 7
Attempt 8
Attempt 9
Attempt 10
Attempt 11
ERROR 1: sqlite3_exec(COMMIT) failed: database is locked
ERROR 1: F:\05_Annet\2024-06-19-gdaladdo-bug\out\test-overviews.gpkg - zoom_level=1, band 1: IReadBlock failed at X offset 0, Y offset 0
Attempt 12
Attempt 13
Attempt 14
Attempt 15
Attempt 16
Attempt 17
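
To put a number on how often the error shows up, the console output can be tallied per attempt. The following is only a small sketch: `failed_attempts` is a hypothetical helper, and it assumes each `ERROR` line belongs to the most recent `Attempt N` line printed before it, as in the log above.

```python
import re


def failed_attempts(log_text):
    """Return (attempts_seen, attempts_with_errors) parsed from the console log."""
    attempts = []
    failed = set()
    current = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = re.fullmatch(r"Attempt (\d+)", line)
        if match:
            # A new attempt starts; later ERROR lines are attributed to it.
            current = int(match.group(1))
            attempts.append(current)
        elif line.startswith("ERROR") and current is not None:
            failed.add(current)
    return attempts, sorted(failed)
```

Applied to the output above, this would flag attempts 1, 3, 5 and 11.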

Versions and provenance

Windows 11
Python 3.12
GDAL 3.8.2 wheel from https://github.com/cgohlke/geospatial-wheels

Additional context

No response

@rouault rouault self-assigned this Jun 19, 2024
rouault added a commit to rouault/gdal that referenced this issue Jun 19, 2024
jratike80 (Collaborator) commented

Of course there should be no errors or crashes, but I wonder how much sense there is in creating overviews in multiple threads. GeoPackage is an SQLite database and only one process at a time can write. But maybe it is slower to create the overviews than to save them into the database. Fortunately that should be rather easy to test.
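
One way such a test could look is sketched below. It is only an outline: `time_runs` and `build_overviews_on_copy` are hypothetical helper names, the temporary file name is made up, and the copy of the source file is included in the timed region, which adds roughly the same constant to every setting.

```python
import os
import shutil
import tempfile
import time


def time_runs(run, settings, repeats=3):
    """Call run(setting) `repeats` times per setting; return the best wall time for each."""
    best = {}
    for setting in settings:
        times = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(setting)
            times.append(time.perf_counter() - start)
        best[setting] = min(times)
    return best


def build_overviews_on_copy(src_path, num_threads):
    """Build bilinear 2x/4x overviews on a throwaway copy of src_path."""
    from osgeo import gdal  # imported lazily so time_runs is usable without GDAL

    gdal.UseExceptions()
    tmp_dir = tempfile.mkdtemp()
    copy_path = os.path.join(tmp_dir, "timing-copy.gpkg")  # hypothetical file name
    try:
        shutil.copy(src_path, copy_path)
        gdal.SetConfigOption("GDAL_NUM_THREADS", num_threads)
        ds = gdal.OpenEx(copy_path, gdal.OF_UPDATE)
        ds.BuildOverviews(resampling="bilinear", overviewlist=[2, 4])
        ds = None  # close the dataset before the directory is removed
    finally:
        gdal.SetConfigOption("GDAL_NUM_THREADS", None)  # restore the default
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

With the reporter's file this could be called as `time_runs(lambda n: build_overviews_on_copy(srcfile, n), ["1", "ALL_CPUS"])`, comparing the single-threaded and multi-threaded timings side by side.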

rouault (Member) commented Jun 19, 2024

> but I wonder how much sense there is in creating overviews in multiple threads.

Multithreading in overview generation only affects the computation part (although for the GeoTIFF driver it can also cause multi-threaded writes within the driver). This can help a bit, depending on the resampling kernel. The issue here was that one of the computation worker threads accidentally queried the NBITS metadata item on the overview band, which on the first call can trigger a SQLite request, whereas the I/O thread could be reading or writing GeoPackage blobs at the same time; hence concurrent use of the same SQLite handle. Fixed per #10246
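
The pattern described here can be illustrated with a toy model in pure Python. This is not GDAL code and not necessarily what #10246 does: `LazyBand`, `resample_chunk` and `build` are hypothetical names, and the "handle" is just a list recording which thread touched it. The sketch shows how a lazily loaded metadata value ends up being fetched from a worker thread, and how fetching it once on the calling thread before the workers start keeps the shared handle out of their reach.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class LazyBand:
    """Toy stand-in for a band whose metadata is loaded lazily via a shared handle."""

    def __init__(self):
        self._nbits = None
        self.handle_users = []  # ids of threads that touched the shared handle

    def _query_handle(self):
        # Stands in for a SQLite request on the dataset's single connection.
        self.handle_users.append(threading.get_ident())
        return 8

    def nbits(self):
        if self._nbits is None:  # first call goes to the shared handle
            self._nbits = self._query_handle()
        return self._nbits


def resample_chunk(band, chunk):
    # Computation worker: clamps values to the range implied by the bit depth.
    max_value = (1 << band.nbits()) - 1
    return [min(value, max_value) for value in chunk]


def build(band, chunks, prefetch):
    if prefetch:
        band.nbits()  # warm the cache on the calling thread before workers start
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda chunk: resample_chunk(band, chunk), chunks))
```

With `prefetch=False` the first `nbits()` call happens inside a worker, so the shared handle is touched off the calling thread; with `prefetch=True` only the calling thread ever touches it.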

rouault added a commit to rouault/gdal that referenced this issue Jun 20, 2024