gdal2tiles: Fix max cache setting #2112
Conversation
I am not familiar with your testing infrastructure. We could add a test for this by using more than 20 threads. Could anyone point me in the right direction or take a stab at this? (If so, how should I validate the test fails before this fix?) |
I'm not sure how to write a test for that. Hard to check behaviour without instrumentation. gdal2tiles tests are in autotest/pyscripts/test_gdal2tiles.py |
The only easy way I can think of is to write a test with 30 threads, verify that it fills up RAM, and make sure it just runs with this fix. |
As much as I like tests to go with pull requests, honestly this seems a bit complicated here, and I'd be concerned about portability issues in doing that check across all the OSes... |
That said, are you sure your change works...? gdal.SetCacheMax() only has effect in the current process. It should not affect spawned worker processes... |
I am pretty sure it works. Since I am not experienced in this, I can only speculate, but since the threads are spawned from the same Python script, I think GDAL is re-used and only initialized once. Some debugging tests: I added [debug snippet elided]. With the fix: [results elided]. Without the fix: [results elided] |
Is that on Linux? If so, according to https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods, multiprocessing uses the "fork" method by default, and thus GDAL is indeed only initialized once. But on Mac & Windows, I bet this would fail. So perhaps the fix is to have both the original code & your change! |
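The fork-vs-spawn distinction above can be demonstrated without GDAL at all. The sketch below uses a module-level variable as a hypothetical stand-in for GDAL's in-process cache limit: under the "fork" start method (the Linux default) a worker inherits it, whereas under "spawn" (Windows, and macOS on recent Python versions) the worker re-imports the module and loses it, while an environment variable survives either way.

```python
import multiprocessing
import os

# Hypothetical stand-in for GDAL's in-process cache limit: a module-level
# value set once in the parent process. "fork"ed workers inherit it;
# "spawn"ed workers re-import the module and lose it.
_cache_max = None

def set_cache_max(value):
    global _cache_max
    _cache_max = value

def worker(_):
    # The environment variable survives both start methods, which is why
    # keeping the os.environ["GDAL_CACHEMAX"] assignment still matters.
    return _cache_max, os.environ.get("GDAL_CACHEMAX")

os.environ["GDAL_CACHEMAX"] = "200"
set_cache_max(200)

ctx = multiprocessing.get_context("fork")  # inherit parent state, as on Linux
with ctx.Pool(2) as pool:
    results = pool.map(worker, range(2))
print(results)  # forked workers see both the in-process value and the env var
```

With the "spawn" context instead, the first element of each tuple would come back as None, matching the failure mode predicted for Mac & Windows.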
Linux indeed. Would you be able to replicate the experiment above on a different platform? |
If I had time... but I'm not scaling very well... |
We can leave in both as it won't hurt and link the docs you mentioned? |
Yes, a comment explaining this weird setup would be needed. I guess the os.environ[] has to be done before gdal.SetCacheMax(). Once you're satisfied with a fix, maybe @jratike80 would be willing to have a try on Windows (no obligation Jukka...) |
I hope I did this right on Windows by commenting out either the original line or the suggested one. The original code with [results elided]. The new line with [results elided] |
Is that number the output of this? You need to log inside the worker threads. |
The first result seems very weird to me, as it is higher than the set value. The second one does seem correct, but it is not divided by the number of processes. Thank you for trying it out already. I will provide a better example soon if I can. |
I am not a programmer, so I need rather explicit advice. I searched for the text "ReadRaster Extent", found one occurrence in a comment and one in the code, and added your snippet here:
It is around line 939. |
That is the correct place. (It does indeed need the verbose flag, like this.) |
I checked the numbers and wrote them with thousand separators, should be easier to read now. |
Oh yes, totally my bad. It seems we indeed need both solutions for this to work on all platforms. I will improve the PR and check back with both. Thanks a lot. |
@jratike80 Could you check again? Should be the same behaviour in both cases now. |
Is any more work expected on this PR from my side, @rouault ? Thanks! |
Looks good to me. Let's give @jratike80 a chance to perhaps have another try when he has some time. |
Merging this as it looks reasonable, and I want to pull it into the 3.0 branch as well for tomorrow's release. |
Ok thanks! Will it be backported as well? |
Yes, it was backported in time for GDAL 3.0.3 and 2.4.4 RC1. |
Since GDAL_CACHEMAX is only read once, setting it here no longer has any effect for future calls (even multiprocessed ones). By setting the cache directly using SetCacheMax, we can fix the cache per process.

What does this PR do?
Fixes gdal2tiles max cache setting
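The combined approach discussed in the conversation could be sketched as follows. This is a hedged illustration, not the PR's exact code: limit_gdal_cache is a hypothetical helper name, and "gdal" / "nb_processes" are assumed to come from gdal2tiles' scope.

```python
import os

# Hedged sketch of the combined fix: split the cache budget per worker and
# set it through both mechanisms so every platform is covered.
def limit_gdal_cache(gdal, nb_processes):
    # Divide the configured cache so the total across all worker
    # processes stays within the original limit.
    per_process_bytes = max(1, gdal.GetCacheMax() // nb_processes)
    # Keep the environment variable for platforms where multiprocessing
    # "spawn"s fresh interpreters (Windows, macOS): each new process reads
    # GDAL_CACHEMAX when GDAL initializes. Note that GDAL interprets small
    # GDAL_CACHEMAX values as megabytes rather than bytes, so real code
    # must be careful with units; this sketch glosses over that.
    os.environ["GDAL_CACHEMAX"] = str(per_process_bytes)
    # SetCacheMax() covers the current process and, on Linux, "fork"ed
    # workers that inherit the already-initialized GDAL state.
    gdal.SetCacheMax(per_process_bytes)
```

As noted in the discussion, the environment variable should be set before (or together with) the SetCacheMax call, since GDAL reads GDAL_CACHEMAX only once at initialization.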
What are related issues/pull requests?
N/A
Tasklist
Environment
Discovery of this was done on Ubuntu Azure machines.
F4, 4 threads to tile: RAM 4/8 GB
F32, 20 threads to tile: SIGKILL
F32, 16 threads to tile: 58/64 GB
This shows that the default GDAL_CACHEMAX (5% of RAM) is applied to every process individually instead of being split across processes.