gc: rename module and remove (de)committed memory distinction #496

leandro-lucarella-sociomantic · 2013-05-30T11:14:39Z

The distinction is only useful for Windows, Posix has no standard way to
do this, even when apparently you can emulate it in Linux using
PROT_NONE while mmap()ping (but Linux don't commit memory until you
write on it, so is pointless to use it in Linux).

This distinction makes the code a little more complex, because is spread
in several parts of the GC, and it doesn't have an evident positive
impact (having memory mapped but not committed is a very transient state
anyway, except on extremely rare situations, the GC will commit all the
memory eventually, as you keep allocating, it could take more or less
time, but eventually you'll have all your memory committed just before
the first time the collector runs).

About the rename:

In the future this file will hold more OS-specific functions needed by
the GC, and this file looks like a good place to put them, so all
OS-specific functions are grouped in a single module.

Also, having the prefix gc on every file, when all this is inside a gc
package, doesn't seem to make a lot of sense, thus it will be removed
progressively.

In the future this file will hold more OS-specific functions needed by the GC, and this file looks like a good place to put them, so all OS-specific functions are grouped in a single module. Also, having the prefix gc on every file, when all this is inside a gc package, doesn't seem to make a lot of sense, thus it will be removed progressively.

leandro-lucarella-sociomantic · 2013-05-30T11:16:22Z

I've done some performance check in Linux with the tests in benchmark/gcbench and there is no evident performance impact, either for good or bad. If anyone else can do performance tests on other operating systems (specially Windows), it will be highly appreciated.

The distinction is only useful for Windows, Posix has no standard way to do this, even when apparently you can emulate it in Linux using PROT_NONE while mmap()ping (but Linux don't commit memory until you write on it, so is pointless to use it in Linux). This distinction makes the code a little more complex, because is spread in several parts of the GC, and it doesn't have an evident positive impact (having memory mapped but not committed is a very transient state anyway, except on extremely rare situations, the GC will commit all the memory eventually, as you keep allocating, it could take more or less time, but eventually you'll have all your memory committed just before the first time the collector runs).

leandro-lucarella-sociomantic · 2013-05-30T11:21:49Z

(fixed unittest)

MartinNowak · 2013-05-30T15:35:18Z

So we never have many pools with uncommited memory?
Then this change seems reasonable to me otherwise it
could trigger an out of memory situation.
http://msdn.microsoft.com/en-us/library/ms810627.aspx

leandro-lucarella-sociomantic · 2013-05-30T15:44:39Z

No, just one pool, maybe you can have another pool with some uncommitted space if an allocation failed because it can't find a big enough chunk free/uncommitted in the current pools, but normally everything should be committed eventually in the GC, otherwise you have a serious fragmentation problem, which is a problem by itself.

MartinNowak · 2013-05-30T15:51:11Z

Then let's do this, I will review this in detail when I'm back home.

leandro-lucarella-sociomantic · 2013-05-30T16:17:08Z

Remember, before pulling this it would be good if somebody can run at least the benckmarks in benchmark/gcbench in Windows (and any other stuff you have to test) to verify there is no performance impact in Windows. Also looking at the real memory usage of applications could help a lot to verify there is no increase in memory usage. I think it should be safe, but it's better to be careful :)

rainers · 2013-05-30T16:58:44Z

I've run testgc3 from the test-suite on win32 with avgtime:

exe with patch:

Total time (ms): 77225.8
Repetitions : 10
Sample mode : 7460 (3 ocurrences)
Median time : 7480.43
Avg time : 7722.58
Std dev. : 701.738
Minimum : 7454.74
Maximum : 9825.61
95% conf.int. : [6347.2, 9097.96] e = 1375.38
99% conf.int. : [5915.03, 9530.14] e = 1807.56
EstimatedAvg95%: [7287.65, 8157.52] e = 434.934
EstimatedAvg99%: [7150.98, 8294.18] e = 571.6

exe without patch:

Total time (ms): 83800.4
Repetitions : 10
Sample mode : 7420 (2 ocurrences)
Median time : 8303.8
Avg time : 8380.04
Std dev. : 877.004
Minimum : 7425.66
Maximum : 9774.73
95% conf.int. : [6661.15, 10098.9] e = 1718.9
99% conf.int. : [6121.03, 10639.1] e = 2259.01
EstimatedAvg95%: [7836.48, 8923.6] e = 543.562
EstimatedAvg99%: [7665.68, 9094.4] e = 714.362

Repeating the runs produced Total time (ms): 77955.7 and 80084.3, so measurements vary quite a bit, but a net improvement seems to exist.

I could not try the parser tests as they are currently blocked by bug 9044/6461.

leandro-lucarella-sociomantic · 2013-05-31T10:55:42Z

src/gc/gcx.d

-            ncommitted += tocommit;
-
-            return tocommit > n;
+            searchStart = npages;


I'm not very familiar with this searchStart optimization, so if anybody can verify this bit of code is still correct, it will be appreciated!

leandro-lucarella-sociomantic · 2013-05-31T11:08:06Z

And nice program this avgtime. I ran all the benchmarks and these is the summary of the results:

--- avgtime.huge_single.d.2.063.log 2013-05-31 12:37:28.674837751 +0200
+++ avgtime.huge_single.d.git.log   2013-05-31 12:42:33.120600823 +0200
-Avg time       : 3.7239
-Std dev.       : 0.413316
+Avg time       : 4.3469
+Std dev.       : 0.522962
--- avgtime.rand_large.d.2.063.log  2013-05-31 12:37:42.631101841 +0200
+++ avgtime.rand_large.d.git.log    2013-05-31 12:42:47.028864209 +0200
-Avg time       : 1201.35
-Std dev.       : 69.8966
+Avg time       : 1195.81
+Std dev.       : 56.8985
--- avgtime.rand_small.d.2.063.log  2013-05-31 12:37:55.899352924 +0200
+++ avgtime.rand_small.d.git.log    2013-05-31 12:43:00.193113502 +0200
-Avg time       : 1137.72
-Std dev.       : 25.0925
+Avg time       : 1122.44
+Std dev.       : 18.6431
--- avgtime.tree1.d.2.063.log   2013-05-31 12:42:11.508191567 +0200
+++ avgtime.tree1.d.git.log 2013-05-31 12:47:16.073960787 +0200
-Avg time       : 23182
-Std dev.       : 181.616
+Avg time       : 23187.8
+Std dev.       : 183.165
--- avgtime.tree2.d.2.063.log   2013-05-31 12:42:16.972295037 +0200
+++ avgtime.tree2.d.git.log 2013-05-31 12:47:21.506063722 +0200
-Avg time       : 419.456
-Std dev.       : 4.29148
+Avg time       : 416.213
+Std dev.       : 1.55321

The only case where the unpatched version is worse is huge_single, which I guess it could be one of those corner cases where there is a lot of memory reserved but not committed. The test only does this:

import std.stdio, core.memory;

void main(string[] args) {
    enum mul = 1000;
    auto ptr = GC.malloc(mul * 1_048_576, GC.BlkAttr.NO_SCAN);

    GC.collect();
}

I have no idea why there is any difference, because all that memory should be even committed, but seems to be a pretty unrealistic case.

This is how I generated these diffs:

druntime/benchmark/gcbench:cdgc %=$ for f in *.d; do echo $f; with-dmd2 rdmd --force --build-only $f; avgtime -h -p -d -r10 ./`basename $f .d` > /tmp/avgtime.$f.2.063.log; done
huge_single.d
rand_large.d
rand_small.d
tree1.d
tree2.d

druntime/benchmark/gcbench:cdgc %=$ for f in *.d; do echo $f; with-dmd2-git rdmd --force --build-only $f; avgtime -h -p -d -r10 ./`basename $f .d` > /tmp/avgtime.$f.git.log; done
huge_single.d
rand_large.d
rand_small.d
tree1.d
tree2.d

for f in avgtime.*063.log; do diff -u $f `basename $f .2.063.log`.git.log | grep '+++ \|--- \|Avg \|Std '; done

leandro-lucarella-sociomantic · 2013-05-31T11:33:44Z

I noticed my desktop had very high memory usage while doing this tests, so I freed some memory and repeated them, also adding -release -O -inline while compiling and using nice -n-10 to run the test to minimize the impact of other tasks in the tests. The results are very different and make more sense :)

--- avgtime.huge_single.d.2.063.log 2013-05-31 13:24:25.585416424 +0200
+++ avgtime.huge_single.d.git.log   2013-05-31 13:18:27.846272607 +0200
-Avg time       : 0.958
-Std dev.       : 0.0843955
+Avg time       : 0.9093
+Std dev.       : 0.031493
--- avgtime.rand_large.d.2.063.log  2013-05-31 13:24:39.345690778 +0200
+++ avgtime.rand_large.d.git.log    2013-05-31 13:18:41.366543012 +0200
-Avg time       : 1168.21
-Std dev.       : 50.7714
+Avg time       : 1152.12
+Std dev.       : 49.1627
--- avgtime.rand_small.d.2.063.log  2013-05-31 13:24:51.925941565 +0200
+++ avgtime.rand_small.d.git.log    2013-05-31 13:18:53.966794978 +0200
-Avg time       : 1063.46
-Std dev.       : 21.5651
+Avg time       : 1067.1
+Std dev.       : 13.2134
--- avgtime.tree1.d.2.063.log   2013-05-31 13:29:35.739592805 +0200
+++ avgtime.tree1.d.git.log 2013-05-31 13:23:37.724461937 +0200
-Avg time       : 25720.7
-Std dev.       : 129.007
+Avg time       : 25715.2
+Std dev.       : 149.692
--- avgtime.tree2.d.2.063.log   2013-05-31 13:29:43.647750086 +0200
+++ avgtime.tree2.d.git.log 2013-05-31 13:23:45.608619195 +0200
-Avg time       : 613.644
-Std dev.       : 7.81523
+Avg time       : 611.554
+Std dev.       : 4.12177

Now the only test where the new version is worse is rand_small and the difference is minimal :)

MartinNowak · 2013-05-31T13:11:12Z

Looks OK to me.
Everytime I look at the GC I want to rewrite the whole thing, what a mess.

gc: rename module and remove (de)committed memory distinction

leandro-lucarella-sociomantic · 2013-05-31T13:20:47Z

Indeed, and part of the plan is to clean the mess a little bit.

Thanks.

leandro-lucarella-sociomantic reviewed May 31, 2013
View reviewed changes

MartinNowak added a commit that referenced this pull request May 31, 2013

Merge pull request #496 from leandro-lucarella-sociomantic/cdgc

faec6c7

gc: rename module and remove (de)committed memory distinction

MartinNowak merged commit faec6c7 into dlang:master May 31, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

gc: rename module and remove (de)committed memory distinction #496

gc: rename module and remove (de)committed memory distinction #496

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

MartinNowak commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

MartinNowak commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

rainers commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic May 31, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 31, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 31, 2013

Uh oh!

MartinNowak commented May 31, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 31, 2013

Uh oh!

Uh oh!

Uh oh!

gc: rename module and remove (de)committed memory distinction #496

gc: rename module and remove (de)committed memory distinction #496

Uh oh!

Conversation

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

MartinNowak commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

MartinNowak commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 30, 2013

Uh oh!

rainers commented May 30, 2013

Uh oh!

leandro-lucarella-sociomantic May 31, 2013

Choose a reason for hiding this comment

Uh oh!

leandro-lucarella-sociomantic commented May 31, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 31, 2013

Uh oh!

MartinNowak commented May 31, 2013

Uh oh!

leandro-lucarella-sociomantic commented May 31, 2013

Uh oh!

Uh oh!