SI-7149 Use a WeakHashSet for type uniqueness #2605

Merged
merged 3 commits into from Jun 4, 2013

7 participants

@JamesIry

We have an implementation of WeakHashSet in the compiler but it never cleans up its WeakReferences. The first commit in this PR replaces the implementation with one that does clean up. Along the way it modifies the interface to follow the standard scala.collection.mutable.Set interface so that it can eventually be moved into the main library should we choose to do so.

Currently type uniqueness is done via a HashSet[Type], but
that means the Types live through an entire compile session, even
ones that are used once. The result is a huge amount of unnecessarily
retained memory. This commit uses a WeakHashSet instead so that Types
and their WeakReferences are cleaned up when no longer in use.

Finally, perRunCaches is also updated to use the WeakHashSet instead of a HashSet of WeakReferences because the former was never cleaning up unused WeakReferences.

@gkossakowski gkossakowski was assigned May 28, 2013
@JamesIry

Cleaned up several typos in comments.

@gkossakowski
The Scala Programming Language member

Currently type uniqueness is done via a HashSet of WeakReferences, but
that means the WeakReferences live through an entire compile session.
Another commit uses the WeakHashSet instead so that the WeakReferences are
cleaned up when no longer in use.

I thought we were not using any form of WeakReferences in uniques set thus we were accumulating those types for the entire compiler run. If you look at the diff in 4eb4aa5 you'll see that there are no WeakReferences involved there. Am I missing something?

@JamesIry

You're right, my commit message is wrong.

@gkossakowski
The Scala Programming Language member

@JamesIry: could you update the commit and PR message?

James Iry added some commits May 17, 2013
James Iry SI-7150 Replace scala.reflect.internal.WeakHashSet
Replaces scala.reflect.internal.WeakHashSet with a version that
* extends the mutable.Set trait
* doesn't leak WeakReferences
* is unit tested
ad63f36
James Iry SI-7149 Use a WeakHashSet for type uniqueness
Currently type uniqueness is done via a HashSet[Type], but
that means the Types live through an entire compile session, even
ones that are used once. The result is a huge amount of unnecessarily
retained memory. This commit uses a WeakHashSet instead so that Types
and their WeakReferences are cleaned up when no longer in use.
746d85b
James Iry Modify perRunCaches to not leak WeakReferences
perRunCaches was using a HashMap of WeakReferences which meant it would
accumulate WeakReferences over time. This commit uses a WeakHashSet
instead so that the references are cleaned up.
b7e0fd2
@JamesIry

Cleaned up the commit message and added a comment to the WeakHashSet's hash avalanche mechanism.

@gkossakowski
The Scala Programming Language member

LGTM.

Great work!

I did memory profiling of this change. I'm comparing b88801f (parent commit of this PR) and 29babb5 (last commit in this PR). I used YourKit and custom probe to collect memory snapshot after different phases. See YourKit documentation for detailed explanation of probes.

I compiled Scala library with probes enabled like this:

ant -Djvm.opts="-agentpath:/Applications/YourKit.app/bin/mac/libyjpagent.jnilib=builtinprobes=none,probeclasspath=/Users/grek/scala/scala-master/probe-classes,probe=MemoryProbe" quick.lib

Now the numbers (for compiling the Scala library)

Total retained memory after phase (in mb):
=========================================
Phase       Before      After   Lost
typer       173         136     37 (21%)
erasure     264         178     86 (32%)
cleanup     314         225     89 (28%)

Number of retained :: objects after each phase (in k):
=====================================================
Phase       Before      After   Lost
typer       808         502     306 (37%)
erasure     1432        766     666 (46%)
cleanup     1737        1044    693 (39%)

Number of retained TypeHistory objects after each phase (in k):
==============================================================
Phase       Before      After   Lost
typer       202         123     79  (39%)
erasure     473         259     214 (45%)
cleanup     605         378     227 (37%)


Number of retained ClassArgsTypeRef objects after each phase (in k):
===================================================================
Phase       Before      After   Lost
typer       226         88      138 (61%)
erasure     396         78      318 (80%)
cleanup     419         88      331 (78%)

Great work, again!

@gkossakowski gkossakowski merged commit 803d451 into scala:master Jun 4, 2013

1 check passed

Details default pr-checkin-per-commit Took 81 min.
@xeno-by
The Scala Programming Language member

You got to be kidding me!!

@viktorklang

FANTASTIC

@ihji

Wow, the benchmark result is TOTALLY awesome. Great work!

@fractal

excellent job!

@jrudolph

👍

@etorreborre etorreborre commented on the diff Jun 7, 2013
...reflect/scala/reflect/internal/util/WeakHashSet.scala
-/** A bare-bones implementation of a mutable `Set` that uses weak references
- * to hold the elements.
+/**
+ * A HashSet where the elements are stored weakly. Elements in this set are elligible for GC if no other
@etorreborre
etorreborre added a line comment Jun 7, 2013

typo: eligible

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@etorreborre etorreborre commented on the diff Jun 7, 2013
...reflect/scala/reflect/internal/util/WeakHashSet.scala
*
- * This implementation offers only add/remove/test operations,
- * therefore it does not fulfill the contract of Scala collection sets.
+ * This Set implementation cannot hold null. Any attempt to put a null in it will result in a NullPointerException
+ *
+ * This set implmeentation is not in general thread safe without external concurrency control. However it behaves
@etorreborre
etorreborre added a line comment Jun 7, 2013

typo: implementation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment