
gdal2tiles: Fix max cache setting #2112

Merged: 1 commit merged into OSGeo:master from EvertEt:patch-1 on Jan 7, 2020

Conversation

@EvertEt (Contributor) commented Dec 16, 2019

Since GDAL_CACHEMAX is only read once, setting the environment variable here no longer has any effect on later calls (even multiprocessed ones).
By setting the cache directly with gdal.SetCacheMax, we can set the cache limit per process.
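
A sketch of the change, using the two lines quoted later in this thread (within gdal2tiles, where os, math, and gdal are already imported; nb_processes is the number of worker processes):

    # Original: writes the divided budget to the environment, which only
    # matters if GDAL has not yet read GDAL_CACHEMAX.
    # os.environ['GDAL_CACHEMAX'] = '%d' % int(gdal.GetCacheMax() / nb_processes)

    # Replacement: sets the block cache limit directly in the current process,
    # even though GDAL is already initialized.
    gdal.SetCacheMax(max(1024 * 1024, math.floor(gdal.GetCacheMax() / nb_processes)))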

What does this PR do?

Fixes gdal2tiles max cache setting

What are related issues/pull requests?

N/A

Tasklist

  • Add test case(s)
  • Review
  • Adjust for comments
  • All CI builds and checks have passed

Environment

This was discovered on Ubuntu Azure machines:

F4, 4 threads to tile: RAM 4/8 GB
F32, 20 threads to tile: SIGKILL
F32, 16 threads to tile: RAM 58/64 GB

This shows that the default GDAL_CACHEMAX of 5% is being used in full by every process instead of being split across processes.

@EvertEt (Contributor, Author) commented Dec 16, 2019

I am not familiar with your testing infrastructure. We could add a test for this by using more than 20 threads. Could anyone point me in the right direction or take a stab at this? (If so, how should I validate that the test fails before this fix?)

@rouault (Member) commented Dec 16, 2019

I'm not sure how to write a test for that. It is hard to check this behaviour without instrumentation. The gdal2tiles tests are in autotest/pyscripts/test_gdal2tiles.py.

@EvertEt (Contributor, Author) commented Dec 16, 2019

The only easy way I can think of is to write a test with 30 threads, verify it fills up RAM, and make sure it just runs with this fix.

@rouault (Member) commented Dec 16, 2019

verify it fills up RAM, and make sure it just runs with this fix.

As much as I like tests to go with pull requests, this honestly seems a bit complicated here, and I'd be concerned about portability issues in doing that check across all the OSes...

@rouault (Member) commented Dec 17, 2019

That said, are you sure your change works...? gdal.SetCacheMax() only has effect in the current process. It should not affect spawned worker processes...

@EvertEt (Contributor, Author) commented Dec 18, 2019

I am pretty sure it works. Since I am not experienced in this I can only speculate, but since the worker processes are spawned from the same Python script, I think GDAL is re-used and only initialized once.

Some debugging tests: I added print(gdal.GetCacheMax()) after the "ReadRaster Extent" log in create_base_tile. Before calling gdal2tiles.py with subprocess.check_call, I set GDAL_CACHEMAX to 31104 and nb_processes to 32.

With the fix:
GDAL_CACHEMAX: 3.26149e+10 bytes (31104 MB)
gdal.GetCacheMax(): 1019215872
3.26149e+10 / 1019215872 = 32

Without the fix:
GDAL_CACHEMAX: 3.25992e+10 bytes (31089 MB)
gdal.GetCacheMax(): 32599179264
3.25992e+10 / 32599179264 = 1

@rouault (Member) commented Dec 18, 2019

I think GDAL is re-used and only initialized once.

Is that on Linux? If so, according to https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods, multiprocessing uses the "fork" start method by default, and thus GDAL is indeed only initialized once. But on Mac & Windows, I bet this would fail. So perhaps the fix is to have both the original code and your change!
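
A minimal sketch of the difference (a hypothetical standalone script, not part of gdal2tiles or this PR; report_cache is an illustrative name):

    import multiprocessing
    from osgeo import gdal

    def report_cache(_):
        # Under "fork", the child inherits the parent's in-memory setting.
        # Under "spawn" (the Windows default), GDAL initializes fresh in the
        # child and reads GDAL_CACHEMAX from the environment instead.
        return gdal.GetCacheMax()

    if __name__ == '__main__':
        gdal.SetCacheMax(64 * 1024 * 1024)  # 64 MB, set in the parent only
        with multiprocessing.Pool(2) as pool:
            print(pool.map(report_cache, range(2)))

With "fork", both workers should report the parent's 64 MB; with "spawn", they would report whatever GDAL_CACHEMAX (or the 5% default) gives instead.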

@EvertEt (Contributor, Author) commented Dec 18, 2019

Linux indeed. Would you be able to replicate the experiment above on a different platform?

@rouault (Member) commented Dec 18, 2019

Would you be able to replicate the experiment above on a different platform?

If I had time... but I'm not scaling very well...

@EvertEt (Contributor, Author) commented Dec 18, 2019

We can leave both in, as it won't hurt, and link the docs you mentioned?

@rouault (Member) commented Dec 18, 2019

We can leave both in, as it won't hurt, and link the docs you mentioned?

Yes, a comment explaining this weird setup would be needed. I guess the os.environ[] assignment has to be done before gdal.SetCacheMax().

Once you're satisfied with a fix, maybe @jratike80 would be willing to have a try on Windows (no obligation Jukka...)
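
A sketch of the combined setup being discussed, in the suggested order (variable names follow the gdal2tiles snippet quoted below; this is an illustration, not necessarily the exact merged code):

    # Set the environment variable first: workers started with "spawn" pick up
    # the divided budget when GDAL initializes in the child process. Note that
    # gdal.GetCacheMax() here still returns the undivided parent value.
    os.environ['GDAL_CACHEMAX'] = '%d' % int(gdal.GetCacheMax() / nb_processes)
    # Then set it directly: workers started with "fork" inherit an already
    # initialized GDAL, where only SetCacheMax() still has an effect.
    gdal.SetCacheMax(max(1024 * 1024, math.floor(gdal.GetCacheMax() / nb_processes)))

Doing it in the other order would divide the budget twice, since gdal.GetCacheMax() would already return the reduced value.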

@jratike80 (Collaborator) commented Dec 18, 2019

I hope I did this right on Windows, by commenting out either the original line or the suggested one:

    # Make sure that all processes do not consume more than GDAL_CACHEMAX
    os.environ['GDAL_CACHEMAX'] = '%d' % int(gdal.GetCacheMax() / nb_processes)
    # gdal.SetCacheMax(max(1024 * 1024, math.floor(gdal.GetCacheMax() / nb_processes)))

The original code with os.environ, run with SET GDAL_CACHEMAX=31104 and
gdal2tiles --processes=4 file1.tif delete -v
produces 8 153 726 976.

The new line with gdal.SetCacheMax() gives 32 614 907 904.

@EvertEt (Contributor, Author) commented Dec 18, 2019

Is the number the output of this? Because you need to log inside the worker processes:

I added print(gdal.GetCacheMax()) after the ReadRaster Extent log in create_base_tile

The first result seems very weird to me, as it is higher than the set value. The second one does seem correct, but it is not divided by the number of processes.
You should not need the verbose log.

Thank you for trying it out already. I will provide a better example soon if I can.

@jratike80 (Collaborator) commented:

I am not a programmer, so I need rather explicit advice. I searched for the text "ReadRaster Extent", found one occurrence in a comment and one in code, and added your snippet here:

    if options.verbose:
        print("\tReadRaster Extent: ",
              (rx, ry, rxsize, rysize), (wx, wy, wxsize, wysize))
        print(gdal.GetCacheMax())

It is around line 939.

@EvertEt (Contributor, Author) commented Dec 18, 2019

That is the correct place. (It does indeed need the verbose flag this way.)
I will investigate further why you are getting these values, since the original seems super wrong (higher) and the other one should be divided (expected, given the above).

@jratike80 (Collaborator) commented:

I checked the numbers and rewrote them with thousand separators; they should be easier to read now.

@EvertEt (Contributor, Author) commented Dec 18, 2019

Oh yes, totally my bad. It seems we do need both solutions for this to work on all platforms. I will improve the PR and check back with both. Thanks a lot!

@EvertEt (Contributor, Author) commented Dec 21, 2019

@jratike80 Could you check again? It should now behave the same in both cases.

@EvertEt (Contributor, Author) commented Jan 6, 2020

Is any more work expected on this PR from my side, @rouault? Thanks!

@rouault (Member) commented Jan 6, 2020

Is any more work expected on this PR from my side, @rouault?

Looks good to me. Let's give @jratike80 a chance to have another try when he has some time.

@rouault (Member) commented Jan 7, 2020

Merging this, as it looks reasonable and I want to pull it into the 3.0 branch as well for tomorrow's release.

rouault merged commit 870ec8f into OSGeo:master on Jan 7, 2020
EvertEt deleted the patch-1 branch on January 7, 2020 at 18:55
@EvertEt (Contributor, Author) commented Jan 8, 2020

OK, thanks! Will it be backported as well?

@rouault (Member) commented Jan 8, 2020

Yes, it was backported in time for GDAL 3.0.3 and 2.4.4 RC1.
