Deleting temporary files #608
Comments
Thanks @plvoit for taking the time to create this issue and providing the information. You are referring to the files which are created like this:
tmpfile = tempfile.NamedTemporaryFile(mode="w+b").name
ogr_src = gdal_create_dataset(
    "ESRI Shapefile", os.path.join("/vsimem", tmpfile), gdal_type=gdal.OF_VECTOR
)
Those files are created inside the [...] I'm not sure when the user doesn't need that file any more. We could assume it is safe to delete them once the object goes out of scope. If you are up to this, I'll gladly accept a pull request adding that functionality. But as you've also asked about getting back the filenames, there is already machinery for that. You can use the GDAL Dataset to retrieve the filename and delete it manually:
tmp_filename = src.ds.GetDescription()
print(tmp_filename)
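A minimal sketch of the manual clean-up step, assuming the string returned by `src.ds.GetDescription()` is a path on disk and that the dataset is no longer in use when it is deleted. The helper name `remove_tmp_dataset` is hypothetical and not part of wradlib; a directory created with `tempfile.mkdtemp()` stands in for the real path so the example runs without GDAL.

```python
import os
import shutil
import tempfile


def remove_tmp_dataset(path):
    """Delete a temporary dataset path, whether it is a file or a folder.

    Returns True if something was removed, False if the path did not exist.
    Hypothetical helper, not part of wradlib.
    """
    if os.path.isdir(path):
        shutil.rmtree(path)
        return True
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


# Stand-in for `tmp_filename = src.ds.GetDescription()` from the comment above.
tmp_filename = tempfile.mkdtemp()
with open(os.path.join(tmp_filename, "layer.shp"), "wb") as fh:
    fh.write(b"")  # empty placeholder file, not a valid shapefile

removed = remove_tmp_dataset(tmp_filename)
```

The helper handles both cases because this issue reports temporary folders while the snippet above speaks of files.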
Hello kmuehlbauer,
Good work @plvoit, hope to see you around more!
Thank you, glad I could contribute a little!
@pvoit I'd recommend asking this question over at https://openradar.discourse.group. You'll reach a much wider audience there. I'll have to think about this a bit and will let you know over there. In general it should also be possible with the power of GDAL.
Functions like VectorSource and ZonalDataPoly create temporary files which don't get deleted after a script has completed. With large multiprocessing jobs this can crash the system because it fills up the tmp-directory.
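One possible workaround, sketched here under assumptions: the temporary names quoted in this thread come from `tempfile.NamedTemporaryFile(...)`, which uses the module-level default `tempfile.tempdir`. Pointing that default at a throwaway folder for the duration of a job means everything written there disappears when the folder is removed. The context manager `scratch_tempdir` is a hypothetical name, and `os.makedirs` only stands in for whatever a library would write at that name.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def scratch_tempdir():
    """Route tempfile's default directory to a throwaway folder, then delete it."""
    previous = tempfile.tempdir  # may be None if not initialised yet
    with tempfile.TemporaryDirectory(prefix="scratch_") as scratch:
        tempfile.tempdir = scratch
        try:
            yield scratch
        finally:
            tempfile.tempdir = previous  # restore the earlier default
    # leaving the inner with-block removes the folder and its contents


with scratch_tempdir() as scratch:
    # Same call pattern as quoted in this thread: only the name is kept.
    tmpfile = tempfile.NamedTemporaryFile(mode="w+b").name
    os.makedirs(tmpfile)  # stand-in for a folder a library creates at that name
    created_inside = os.path.dirname(tmpfile) == scratch

scratch_removed = not os.path.exists(scratch)
```

`tempfile.tempdir` is set per process, so in a multiprocessing job each worker would presumably need to enter the context itself.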
MCVE Code Sample
If one follows this tutorial, several temporary folders get created:
https://docs.wradlib.org/en/stable/notebooks/zonalstats/wradlib_zonalstats_quickstart.html
Expected Output
Delete unnecessary tmp-files or at least return tmp-file names to the user
Problem Description
When running a large job which processes many polygons and rasters, the storage of temporary files can cause the system to crash.
It would be nice if a user had the option to remove the files which get created when, e.g., reading shapefiles with VectorSource and similar functions. For this it would be necessary to know the names of the temporary directories which get created.
The name gets created in gdal.py in the method _check_src and stored in the variable tmpfile.
If these tmpfile names were stored somewhere and returned to the user, one could manually delete these files when they are no longer needed.
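The idea of storing the names and handing them back could look roughly like the following. This is only a sketch: `TempPathRegistry` and its methods are hypothetical and do not exist in wradlib, and `os.makedirs` stands in for the folder that would be written at a generated name.

```python
import os
import shutil
import tempfile


class TempPathRegistry:
    """Hypothetical helper: hands out temporary names and remembers them."""

    def __init__(self):
        self.paths = []

    def new_path(self):
        # Same pattern as quoted in this thread: keep only the name.
        path = tempfile.NamedTemporaryFile(mode="w+b").name
        self.paths.append(path)
        return path

    def cleanup(self):
        """Remove every recorded path that still exists; return how many."""
        removed = 0
        for path in self.paths:
            if os.path.isdir(path):
                shutil.rmtree(path)
                removed += 1
            elif os.path.exists(path):
                os.remove(path)
                removed += 1
        self.paths.clear()
        return removed


registry = TempPathRegistry()
first = registry.new_path()
second = registry.new_path()
os.makedirs(first)  # stand-in for a folder written at that name
recorded = list(registry.paths)  # names a user could inspect or delete
removed_count = registry.cleanup()
```

Keeping the list public lets a user delete selectively, while `cleanup()` covers the case where everything can go at once.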
Version
Output of wrl.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:45:29)
[GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.15.0-56-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: ('en_GB', 'UTF-8')
libhdf5: 1.10.6
libnetcdf: 4.8.0
xarray: 0.20.2
pandas: 1.3.4
numpy: 1.21.4
scipy: 1.7.3
netCDF4: 1.5.7
pydap: None
h5netcdf: 0.11.0
h5py: 3.3.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: 1.2.8
cfgrib: 0.9.8.5
iris: None
bottleneck: None
dask: 2022.04.0
distributed: 2022.4.0
matplotlib: 3.5.1
cartopy: 0.20.0
seaborn: 0.11.2
numbagg: None
fsspec: 2022.11.0
cupy: None
pint: None
sparse: None
setuptools: 65.5.1
pip: 22.3.1
conda: None
pytest: 6.2.5
IPython: 8.7.0
sphinx: 5.3.0
wradlib: 1.18.0