This repository was archived by the owner on Oct 12, 2022. It is now read-only.

Fix cache inconsistency when free'ing from GC #797

Merged
merged 2 commits into dlang:master from etcimon:patch-1
May 22, 2014

Conversation

@etcimon (Contributor) commented May 21, 2014

After a pointer is freed, its size and info are not removed from the cache, so if the memory is later allocated for another purpose, any subsequent `realloc` calls will refer to the previous size.

This can cause memory corruption in an application that uses `GC.free`, `delete`, or `realloc(p, 0)`, or a complete crash if a big allocation's size is cached and the page pool was reduced with `minimize`: the call to `realloc` will try to resize in place rather than fetch a new allocation, and it will try to copy bytes from/to an invalid memory region, causing an Access Violation. vibe-d/vibe.d#470

I also made the `USE_CACHE` checks consistent and had them evaluated at compile time.
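For context, a minimal sketch of the failure mode described above (the sizes and the assumption that the second allocation reuses the freed address are illustrative, not taken from the report):

```d
import core.memory : GC;

void main()
{
    auto p = GC.malloc(4096);
    auto size = GC.sizeOf(p);  // a size query like this can populate the cache
    GC.free(p);                // the cached entry for p is not invalidated

    auto q = GC.malloc(16);    // may be served from the same address as p
    // If q happens to equal p, the stale cache entry makes the GC believe the
    // block is still 4096 bytes, so realloc may resize "in place" and copy
    // past the 16-byte block instead of fetching a new allocation.
    q = GC.realloc(q, 8192);
}
```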

After a pointer is freed, it is not removed from the cache, so if it is allocated for another purpose, any subsequent `realloc` calls will refer to the previous size.

This can cause memory corruption with an application that uses `GC.free`, `delete` or `realloc(p, 0)`, or a complete crash if a big allocation's size is cached and the page pool was reduced with `minimize`, i.e. the call to `realloc` will try to resize in place rather than allocate with `malloc`.

Verify USE_CACHE at compile-time

Fix alignment
```d
if (p == cached_info_key) {
    cached_info_key = cached_info_key.init;
    cached_info_val = cached_info_val.init;
}
```

Member

The cache fields are members of Gcx, so it's `gcx.cached_info_key`.
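Adjusted per that comment, the hunk would presumably read:

```d
if (p == gcx.cached_info_key)
{
    gcx.cached_info_key = gcx.cached_info_key.init;
    gcx.cached_info_val = gcx.cached_info_val.init;
}
```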

@MartinNowak (Member)

Thanks a lot for taking this down.

Testing .editorconfig again...

Indentation should be fine now...
@etcimon (Contributor, Author) commented May 22, 2014

Thanks a lot for taking this down.

Np! I'm also planning to add sampling to the GC right now. I was surprised to see it's an easy optimization that could be added on top of this.

Basically, when marking, you take 1 in X of the references and send them to a specific array that represents the pool they refer to. Then, the next time you're going to collect, you test them individually, and if they're mostly still there, you skip marking/freeing for that particular pool during collection. You can force collection on certain pools every 1 in X collections to even out the average lifetime of the references.

You're going to want a lower acceptable chance of being wrong for big allocations, but basically you're using probabilities to avoid putting a lot of useless load on the processor, especially when you're in a part of the application that's just allocating a lot (sampling will determine that the software is not in a phase of removing data).

http://en.wikipedia.org/wiki/Bayes_factor
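A rough sketch of that idea (names, types, and the threshold are illustrative only, not part of this patch or of druntime):

```d
// Illustrative only: sample some references per pool while marking, then
// before the next collection check how many still hold their old values;
// if most do, skip collecting that pool this cycle.
struct SampledRef
{
    void** slot;    // address of the sampled reference
    void*  oldVal;  // value it held when it was sampled
}

bool poolNeedsCollection(SampledRef[] samples, double changedThreshold = 0.10)
{
    if (samples.length == 0)
        return true;                  // no data: fall back to collecting

    size_t changed = 0;
    foreach (s; samples)
        if (*s.slot != s.oldVal)
            ++changed;

    // Collect only when enough of the sampled references changed to suggest
    // a meaningful amount of garbage in this pool.
    return cast(double) changed / samples.length > changedThreshold;
}
```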

@MartinNowak (Member)

s/taking/tracking/ :).

@MartinNowak (Member)

Auto-merge toggled on

@Safety0ff (Contributor)

I'm in favor of removing the size cache (the info cache can stay).
When I did some tests I noticed that the hit rate was very poor whereas the info cache was acceptable. The only time it's really useful is when you have a sequence of GC calls which all deal with the same memory block.

@etcimon (Contributor, Author) commented May 22, 2014

When I did some tests I noticed that the hit rate was very poor whereas the info cache was acceptable.

Did you test with an appender?

@etcimon (Contributor, Author) commented May 22, 2014

@MartinNowak Have you considered setting `GC.BlkAttr.NO_SCAN` on `appender!string` in phobos? Or when doing `new string` in general? I've never seen an immutable char array carry pointers, and it seems like it would save most of the application memory from being scanned.
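For reference, asking the GC for a block it won't scan looks roughly like this (an illustrative sketch; whether `Appender` or `new string` should do this is exactly the question above):

```d
import core.memory : GC;

void main()
{
    // Plain character data never contains pointers, so the block can be
    // allocated with NO_SCAN and the collector will skip it while marking.
    auto buf = (cast(char*) GC.malloc(1024, GC.BlkAttr.NO_SCAN))[0 .. 1024];
    buf[0 .. 5] = "hello";
}
```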

@Safety0ff (Contributor)

Did you test with an appender?

I actually used the phobos unittest suite as my test. The result was a 1.4% hit rate (120k queries). I don't have the numbers anymore (except the overall one), but only 1 to 3 unittests gave decent hit rates.
The test suite, however, might not accurately reflect how real code behaves, but I never got around to doing more tests.

The info cache had a hit rate of 51%.

@etcimon (Contributor, Author) commented May 22, 2014

I actually used the phobos unittest suite as my test.

It wouldn't represent normal use. I think most applications need to interact with I/O streams, which rely on the appender. When appending, you're often going to call extend about 10-20 times in a row.

MartinNowak added a commit that referenced this pull request May 22, 2014
Fix cache inconsistency when free'ing from GC
@MartinNowak MartinNowak merged commit a3f4378 into dlang:master May 22, 2014
@Safety0ff (Contributor)

When appending, you're often going to call extend about 10-20 times in a row.

With the current implementation, the cache only helps in the early-return case (when you can't extend because the size is greater than the page size).

I want to remove redundant pool searches. Currently the worst-case path in `extend` is two `findPool` calls, which are logarithmic in the number of pools.
If caching really provides a benefit, then perhaps a cache in `findPool` could replace both the info and size caches, since finding the size/info given a pool is done in constant time.
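For illustration, a single-entry cache in front of the pool lookup might look roughly like this (`Pool`, `findPoolImpl`, and the field names are stand-ins, not druntime's actual code):

```d
// Illustrative stand-ins for druntime's pool type and lookup.
struct Pool
{
    void* baseAddr;
    void* topAddr;
}

Pool* findPoolImpl(void* p);  // placeholder for the real (logarithmic) search

// Single-entry cache in front of the pool lookup.
Pool* cachedPool;
void* cachedBase, cachedTop;

Pool* findPoolCached(void* p)
{
    // Hit: the pointer falls inside the range of the last pool we found.
    if (cachedPool !is null && p >= cachedBase && p < cachedTop)
        return cachedPool;

    // Miss: do the usual search and remember the result.
    auto pool = findPoolImpl(p);
    if (pool !is null)
    {
        cachedPool = pool;
        cachedBase = pool.baseAddr;
        cachedTop  = pool.topAddr;
    }
    return pool;
}
```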

@etcimon (Contributor, Author) commented May 22, 2014

I think the best way to remove redundant pool searches is sampling for the `mark()` step. With sampling, you can mark pools to be skipped entirely during collection. For that, you need a little Bayesian probabilistic statistics (machine learning), but the payoff is a >90% faster collection when it's not skipped altogether.

Second, the easiest way to make it faster is to adopt the use of NO_SCAN from a higher level, using the type information. If you're not recursing uselessly, you won't call `findPool`, and the savings are exponential in the size of the application.
http://forum.dlang.org/thread/lll9dr$2ll7$1@digitalmars.com

@Safety0ff (Contributor)

I mean caching `findPool` for calls done outside of marking (I'd expect it would be subject to thrashing in `mark`).

@etcimon (Contributor, Author) commented May 22, 2014

I mean caching `findPool` for calls done outside of marking (I'd expect it would be subject to thrashing in `mark`).

Oh, yes, of course. I don't think it has a significant cost either way, and removing the cache removes complexity, which makes bugs easier to catch ;)

@etcimon etcimon deleted the patch-1 branch May 22, 2014 17:49
@MartinNowak (Member)

NO_SCAN for appender would be great. I have to think a bit more about the sampling, sounds interesting.

@etcimon (Contributor, Author) commented May 22, 2014

In the forums, it was said that NO_SCAN was already implemented via the typeinfo. I'm not sure if it's used with in-scope declarations here: https://github.com/D-Programming-Language/phobos/blob/58112802ed34206c21d39a6c359be2445fd46804/std/array.d#L2289 ?

@MartinNowak (Member)

Ah, Appender is already using NO_SCAN (here and here).
Can you explain the sampling in a little more detail or provide some reading?

@Safety0ff (Contributor)

@etcimon could you please create a test case for this pull request to avoid regression?

@etcimon (Contributor, Author) commented May 23, 2014

Can you explain the sampling in a little more detail or provide some reading?

Ah, yes, it took a while to find it again because my textbooks were in French.

http://books.google.ca/books?id=b-XFrpBQ7d0C&lpg=PA119&pg=PA154#v=onepage&q&f=false

Pages 154 to 183 are relevant; there are a lot of types of probability distributions, but these pages talk about the important ones. At first, a good sampling algorithm can do a great job assuming a binomial, hypergeometric, Bernoulli, or Poisson distribution; they're pretty much a law of nature anyway.

Basically, we'd keep a sample of, say, n = 30 pointers found in the heap that point into a pool, out of a total of N = 400 (a sample size chosen according to the distribution). Then we'd test the probability that >10% of them have been deleted, based on how many of the 30 fail `*(sample.p) == sample.pOldVal`. If the null hypothesis, e.g. H0: "the probability that >10% of the pointers in population N were changed is <25%", is rejected (this could even be determined with a sample of 3-4 pointers for an N = 400 population if we can be flexible), we run a collection on the pool. We can even verify these statistics afterwards and re-adjust the model (standard deviation and mean) from there, to be even more precise and need an even smaller sample the next time.

Ideally the distribution would change dynamically with the Bayes factor (never tried it, but it looks interesting ;), and so would the standard deviation and other parameters. This doesn't need to be complete for the method to save CPU cycles, though; it'll do what it has to do: delay collection for pools that don't need it right away.
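As a purely numerical illustration of that kind of test (the n = 30 and 10% figures mirror the example above; none of this code is from the patch):

```d
import std.math : pow;
import std.stdio : writefln;

// Binomial probability of seeing exactly k "changed" pointers out of n
// samples when each pointer changes with probability p.
double binomialPmf(size_t n, size_t k, double p)
{
    double c = 1;                        // n choose k, built up iteratively
    foreach (i; 0 .. k)
        c = c * (n - i) / (i + 1);
    return c * pow(p, cast(int) k) * pow(1 - p, cast(int) (n - k));
}

void main()
{
    // With n = 30 sampled pointers, suppose at most 1 came back changed.
    // Under the hypothesis that at least 10% of the pointers changed, how
    // likely is such a low count? A small value would argue for rejecting
    // that hypothesis and skipping collection of this pool for now.
    double tail = binomialPmf(30, 0, 0.10) + binomialPmf(30, 1, 0.10);
    writefln("P(at most 1 changed | p = 0.10, n = 30) = %.4f", tail);
}
```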

@MartinNowak (Member)

Interesting idea, but first we'd need to figure out how to save CPU cycles using that, because as it stands you'd still need to scan the pool for pointers to other pools, right? Not sure if simply excluding a pool from being marked saves a lot of cycles. Scanning the whole pool at once, instead of selectively, will be a little faster though.

@etcimon (Contributor, Author) commented May 23, 2014

as it stands you'd still need to scan the pool for pointers to other pools,

Well, the way I understand it is that you're not scanning the pool for pointers if it's set as NO_SCAN. These pools would only contain basic types, like a `char[40]` pointed to by a `char*` from another pool (which would not be in a NO_SCAN pool).

edit: oh I thought you were talking about the NO_SCAN idea from the other thread. I'm tired..

Yes, the way I see it, this would mostly save time if the pools are probably unchanged between collections, as in not many deletions. It would add the ability to skip collection most of the time (80% of collections are done for 0.01% of the data? I don't know), fitting the model to individual pools would help with being more precise, and maybe we could even be more permissive about the importance of some of them (small allocations, meh...)

@MartinNowak (Member)

So that would be during a phase when your program allocates a lot of memory without freeing anything. Maybe a coarser but simpler heuristic to not trigger collection during those phases could have the same effect, for example using the relative amount of memory reclaimed.

@etcimon (Contributor, Author) commented May 23, 2014

A relative amount implies you compare allocations vs. deallocations; how would you go about guessing it in advance? And you're still freeing things, just not enough to trigger collection.

@MartinNowak (Member)

You can't know it in advance, but you could adapt thresholds after a collection or do some more sophisticated prediction for the next collection. The same applies to keeping track of still alive pointers to a certain pool though.
