error short read /dev/urandom: Success fatal exits GDAL commands #9024

Closed
mrwormhole opened this issue Jan 4, 2024 · 1 comment

mrwormhole commented Jan 4, 2024

Expected behavior and actual behavior.

We have been using GDAL for the last 2 years as a Docker image alongside our server, which is written in Go. The server invokes gdallocationinfo / gdalinfo / gdalwarp / gdal_translate / gdal_calc.py as CLI applications to process TIFFs stored in GCP Cloud Storage (an object bucket) through the /vsigs/ virtual file system, and this has worked without problems.

Now that we are handling higher traffic and running more GDAL commands over longer periods, we increasingly see "error short read /dev/urandom: Success", where errno is 0 (Success). Very rarely we also see "error short read /dev/urandom: No such file or directory", where errno is 2 (ENOENT). These errors make GDAL exit fatally and TIFF creation fail, and we don't know how to trace the problem any further in the GDAL source code. What we do know is that it occurs occasionally in gdallocationinfo, gdalinfo and gdalwarp.
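
One plausible explanation for the trailing "Success" (an assumption, not something confirmed in this report) is that read() returned fewer bytes than requested without actually failing, so errno was never set, and the error path printed strerror(errno) anyway. The sketch below reproduces that situation with a pipe instead of /dev/urandom. `errno_after_short_read` is a hypothetical helper written for illustration; it is not GDAL or json-c code.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo helper (not from GDAL or json-c).
 * Puts `have` bytes into a pipe, then asks read() for `want` bytes.
 * Stores read()'s return value in *nread_out and returns the errno value
 * observed right after the read, or -1 if the setup itself failed. */
int errno_after_short_read(size_t have, size_t want, ssize_t *nread_out)
{
    int fds[2];
    char buf[64] = {0};

    if (have > sizeof buf || want > sizeof buf || pipe(fds) != 0)
        return -1;

    ssize_t written = write(fds[1], buf, have);
    close(fds[1]); /* closing the write end makes the reader see EOF afterwards */
    if (written < 0 || (size_t)written != have) {
        close(fds[0]);
        return -1;
    }

    errno = 0;
    *nread_out = read(fds[0], buf, want); /* short read when have < want */
    int saved = errno;                    /* save before close() can change it */
    close(fds[0]);
    return saved;
}
```

Calling `errno_after_short_read(2, 4, &n)` leaves `n` at 2 and returns 0: the read is short, yet no error was recorded. With glibc, strerror(0) is "Success", which would match the message above. Since errno is only meaningful right after a call that reports failure, a leftover value from an earlier, unrelated call might also explain the rare ENOENT variant.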

We have looked at the GDAL source but couldn't determine why GDAL obtains a random number this way: Linux kernels released after 2015 support getrandom(), and reading a device file to obtain random numbers is considered bad practice and error-prone on Linux. Furthermore, the json-c library has already changed this code upstream, which would benefit us if GDAL either stopped calling exit(1) or used getrandom() on Linux, where it has been the de facto standard since 2015. Please see the notes below.
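
A minimal sketch of what seeding through getrandom() could look like, assuming Linux with glibc 2.25 or later (which provides <sys/random.h>). `fill_random` is a hypothetical helper name chosen for illustration; it is not a function from GDAL or json-c, and the upstream json-c implementation may differ.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Hypothetical helper (not from GDAL or json-c).
 * Fills buf with len random bytes using getrandom(2).
 * Retries on EINTR and on partial results.
 * Returns 0 on success, -1 on failure with errno set by getrandom(). */
int fill_random(void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = getrandom(p, len, 0); /* flags 0: use the urandom source */
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal, try again */
            return -1;    /* e.g. ENOSYS on kernels without getrandom() */
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller could use it as `int seed; if (fill_random(&seed, sizeof seed) != 0) { /* fall back to another seed source */ }`. No file descriptor is opened, so there is no device file to be missing and nothing to close on the error path.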

Here are the changes that are out of sync between GDAL and json-c:

https://github.com/OSGeo/gdal/blame/master/ogr/ogrsf_frmts/geojson/libjson/random_seed.c#L186 (this is where the stderr output from the GDAL commands comes from)
https://github.com/json-c/json-c/blob/master/random_seed.c#L222 (these are the upstream changes that GDAL hasn't picked up yet)

Proposed Solution

1- Use getrandom() on Linux and keep get_dev_random_seed for other operating systems such as macOS and other Unix systems
Or
2- Remove the exit(1) fatal exits from get_dev_random_seed and close the file before reporting a short read, since a short read is not really an error; a fatal exit leaves the file open and leaks memory
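
Option 2 could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual GDAL or json-c code: `read_seed_from_file` and its return convention are hypothetical. It keeps reading until the whole int is filled, closes the descriptor on every path, and reports failure to the caller instead of calling exit(1).

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical replacement sketch (not from GDAL or json-c).
 * Reads sizeof(int) bytes from `path` into *seed_out.
 * Returns 0 on success, -1 on failure. Never calls exit(). */
int read_seed_from_file(const char *path, int *seed_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1; /* caller decides how to react, e.g. fall back to time() */

    unsigned char *p = (unsigned char *)seed_out;
    size_t left = sizeof(*seed_out);

    while (left > 0) {
        ssize_t n = read(fd, p, left);
        if (n < 0 && errno == EINTR)
            continue; /* interrupted by a signal, retry */
        if (n <= 0)
            break;    /* real error, or EOF before enough bytes arrived */
        p += n;       /* short read: keep what we got and ask for the rest */
        left -= (size_t)n;
    }

    close(fd); /* closed on success and on failure alike */
    return left == 0 ? 0 : -1;
}
```

For example, `read_seed_from_file("/dev/urandom", &seed)` would be expected to return 0 on a normal Linux system, while a missing path or a file that hits EOF early (such as /dev/null) returns -1 and leaves the process running.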

Steps to reproduce the problem.

Not deterministically reproducible locally; it happens intermittently (when the entropy runs out).

Operating system

Ubuntu 22.04 (same as the container's OS)

GDAL version and provenance

ubuntu-small-3.8.1 amd64 as a docker image

Collaboration

We are happy to submit a patch or otherwise help with this. Please let us know how you would like to proceed.

@rouault rouault self-assigned this Jan 4, 2024
@rouault rouault closed this as completed in 54064b6 Jan 4, 2024
rouault added a commit that referenced this issue Jan 4, 2024
Internal libjson: resync random_seed.c with upstream, and use getrandom() implementation when available (fixes #9024)
rouault added a commit that referenced this issue Jan 4, 2024

mrwormhole (Author) commented

thanks a lot 👍 @rouault

rouault added a commit that referenced this issue Jan 4, 2024
[Backport release/3.8] Internal libjson: resync random_seed.c with upstream, and use getrandom() implementation when available (fixes #9024)
ralphraul pushed a commit to 1SpatialGroupLtd/gdal that referenced this issue Mar 11, 2024