Description
Windows 10
Python 3.13.5
libvips 8.17
I am using pyvips very successfully to stitch large images, but now I am hitting a limit. On Windows there seems to be a maximum number of simultaneously open files per process; it can be raised to 8192, but no further:
import win32file
win32file._setmaxstdio(8192)
I have more than 14,000 single images (each 2048x2048 px) with roughly 10% overlap. I correct every single image with pyvips (vignetting, scaling, rotation, ...), construct one single large image from them with the merge() command, and finally extract my region of interest from the large image with an affine() transformation (a stripped-down sketch of that part follows the snippet below). I am writing the output image to a zarr array:
import zarr

# im_shape is the (height, width) of the stitched vips image
block_size = 2**14
z = zarr.zeros(shape=[im_shape[0], im_shape[1]],
               chunks=(block_size, block_size),
               dtype='uint8',
               store=r'D:\temp\z',
               compressor=zarr.Blosc(cname='zstd', clevel=3, shuffle=zarr.Blosc.SHUFFLE))

# render the stitched image block by block and write each block into the zarr array
for y in range(0, im_shape[0], block_size):
    for x in range(0, im_shape[1], block_size):
        w = min(x + block_size, im_shape[1]) - x
        h = min(y + block_size, im_shape[0]) - y
        chunk = vips_large_image.crop(x, y, w, h)
        z[y:y + h, x:x + w] = chunk.numpy()
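To make the stitching part concrete, the per-tile correction and merge step looks roughly like this two-tile sketch; the file paths, the flat-field image and all numeric parameters below are placeholders, not my real values:

import pyvips

def load_and_correct(path, flat, scale, angle):
    # placeholder for my real per-tile correction (vignetting via a
    # flat-field image, scaling, rotation); every tile keeps its source
    # file attached to the pipeline
    tile = pyvips.Image.new_from_file(path)
    tile = (tile / flat * 255).cast('uchar')
    tile = tile.resize(scale)
    tile = tile.rotate(angle)
    return tile

flat = pyvips.Image.new_from_file(r'D:\flat.tif')              # placeholder path
ref = load_and_correct(r'D:\tile_0001.tif', flat, 1.0, 0.1)    # placeholder values
sec = load_and_correct(r'D:\tile_0002.tif', flat, 1.0, 0.1)

# join two overlapping tiles; dx, dy are the measured offsets of sec relative to ref
mosaic = ref.merge(sec, 'horizontal', -1843, 0)

# finally: extract the region of interest from the full mosaic
vips_large_image = mosaic.affine([1, 0, 0, 1], odx=-100, ody=-50)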
While the chunk loop above is running, I track the number of open files for the process, and this number increases continuously until it reaches 8192, at which point processing ends with an error:
print(len(psutil.Process().open_files()))
439
531
621
...
...
...
8021
8097
8178
_______________________________________________________
Error: unable to write to memory
D:\some_file.tif: unable to open for read
system error: Too many open files
I have tried calling invalidate() on the chunks after writing them, with no effect. I suppose that only controls caching, not whether a file stays open.
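If the open handles are held by the libvips operation cache rather than by my own pipeline, I assume these would be the knobs to look at; a minimal sketch (I have not verified whether this releases descriptors that are already open):

import pyvips

# shrink the libvips operation cache; whether this also closes file
# descriptors that are already open is an assumption on my part
pyvips.cache_set_max_files(100)   # cap the number of files the cache keeps track of
pyvips.cache_set_max(0)           # or disable operation caching entirely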
I have thought about multiprocessing, since every process would get its own limit of 8192 open files. Could that maybe also increase performance? For some reason, CPU usage never surpasses 20% on my system during processing; or might there be a different bottleneck aside from processing power? A rough sketch of what I have in mind is below.
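Here build_pipeline() stands in for my correction/merge code; as far as I know a vips image cannot be shared between processes, so every worker would have to rebuild the full pipeline from the source files and would then get its own 8192-file budget. Each task writes whole, non-overlapping zarr chunks, so the workers should not interfere with each other:

import multiprocessing as mp
import zarr

block_size = 2**14

def write_rows(task):
    y_start, y_stop = task
    # build_pipeline() is a placeholder for my correction/merge code;
    # rebuilding it per worker gives every process its own file limit
    im = build_pipeline()
    im_shape = (im.height, im.width)
    z = zarr.open(r'D:\temp\z', mode='r+')
    for y in range(y_start, y_stop, block_size):
        for x in range(0, im_shape[1], block_size):
            w = min(x + block_size, im_shape[1]) - x
            h = min(y + block_size, im_shape[0]) - y
            z[y:y + h, x:x + w] = im.crop(x, y, w, h).numpy()

if __name__ == '__main__':
    height = ...  # total output height (im_shape[0])
    step = 4 * block_size  # rows of chunks handled per task
    tasks = [(y, min(y + step, height)) for y in range(0, height, step)]
    with mp.Pool(processes=4) as pool:
        pool.map(write_rows, tasks)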
Or is there a more elegant and straightforward solution?