Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Newer
Older
100644 190 lines (151 sloc) 9.214 kB
df8a3d6 some internal documentation about weak dictionaries
duchier authored
1 -*- outline -*-
2
3 WEAK DICTIONARIES, WEAK POINTERS, AND FINALIZATION
4
5 Denys Duchier
6
7
8
9 Weak dictionaries serve a double purpose: to implement weak references
10 and to provide support for finalization. The view of finalization
11 adopted here was primarily influenced by the 1993 article "Guardians
12 in a Generation-Based Garbage Collector" (R. Kent Dybvig, Carl
13 Bruggeman, David Eby).
14
15 * Weak Dictionaries
16
17 A weak dictionary is a bit like a dictionary and a bit like a port.
18 Like a dictionary it associates keys to values and like a port it is
19 associated with a stream (the finalization stream). Unlike
20 dictionaries, the existence of an association key->value does not
21 cause the value to be retained. If the value becomes inaccessible
22 except through a weak dictionary, then this value becomes eligible for
23 finalization. When the GC notices that an entry in a weak dictionary
24 has become eligible for finalization it removes it from the table and
25 puts the pair key#value on the weak dictionary's finalization stream.
26
27 A weak dictionary is created as follows:
28
29 {WeakDictionary.new FinalizationStream WeakDict}
30
31 WeakDict is the weak dictionary and FinalizationStream is its
32 associated finalization stream. It is up to the programmer to
33 appropriately process entries that appear on the finalization stream:
34 module Finalize provides some abstractions for this purpose, in
35 particular a "guardian" abstraction. Sometimes a weak dictionary is
36 only desired in order to implement weak references; in that case, the
37 finalization stream is not needed, and, as an optimization, the
38 programmer can explicitly "close" it using WeadDictionary.close.
39
40 It is important to understand that, even if the key is still
41 reachable, this does not cause the value to be retained. The value is
42 retained only if it is reachable without going through a weak
43 dictionary.
44
45 ** Weak Pointers
46
47 Weak pointers are simply keys into weak dictionaries. If the value
48 corresponding to a key has been scheduled for finalization, then the
49 key no longer appears in the dictionary.
50
51 * The Contractual Aspect of Weak Dictionaries
52
53 An important aspect of a weak dictionary (this became apparent during
54 fruitful discussions with Seif and Per) is that it has a contract: it
55 is committed to sending to its finalization stream every one of its
56 entries as the value becomes unreachable. This means that it must
57 fulfill this contract also in the event that the weak dictionary
58 itself becomes inaccessible. What this means is that an inaccessible
59 weak dictionary with a non-closed stream must be kept alive as long as
60 it still has entries.
61
62 For example, suppose that you are developing a graphics manipulation
63 package and that you have created a new OZ_Extension to encapsulate
64 pointers to very large bitmaps. Each time you create an instance of
65 your extension, you also record it for finalization, normally with a
66 guardian created especially for this new abstract datatype: suppose
67 the guardian becomes unreachable; you would still like the memory
68 occupied by your huge bitmaps to be released once they become
69 unreachable. This would not happen if the weak dictionary in which
70 they are registered was allowed to disappear before the bitmaps have
71 been scheduled for finalization.
72
73 An obvious implication for the implementation is that the GC must be
74 able to locate all weak dictionaries regardless of whether they are
75 reachable or not. For this reason all weak dictionaries are recorded
76 in a list stored in the global variable `weakList' of the emulator.
77
78 * Garbage Collection
79
80 The way we can determine whether an entry is eligible for finalization
81 or not is by noticing whether it is reached by GC when started from
82 its usual set of roots. Thus the way to process weak dictionaries
83 during GC is to first let GC proceed as usual (except that it does not
84 recursively process the entries of weak dictionaries) and then to look
85 inside each weak dictionary to find out which entries have been
86 reached and which not; only at this point can we decide whether an
87 entry should be finalized or kept.
88
89 ** AM::gCollect in g-collect.cc
90
91 The main GC code contains the following stuff:
92
93 gCollectWeakDictionariesInit();
94 ...
95 gCollectWeakDictionariesPreserve();
96 cacStack.gCollectRecurse();
97 ...
98 gCollectWeakDictionariesContent();
99 weakReviveStack.recurse();
100 cacStack.gCollectRecurse();
101
102 these are actually documented in the source, but we will go through
103 them here again.
104
105 *** gCollectWeakDictionariesInit()
106
107 This is called at the very beginning of GC. The current value of
108 `weakList' is saved into `previousWeakList' and `weakList' is reset to
109 0. We will be able to iterate through `previousWeakList' during GC
110 whenever we need to perform an additional bit of processing on all
111 weak dictionaries that were still live at the start of GC.
112
113 Why do we reset `weakList' to 0? because when we gCollect a weak
114 dictionary in the normal fashion we record it again in `weaklist',
115 however we do not recursively process its entries: we will take care
116 of that in a second phase after the main GC phase has been completed
117 (all reachable memory has been traversed/marked). All weak
118 dictionaries live after GC will be recorded in `weakList'.
119
120 *** gCollectWeakDictionariesPreserve()
121
122 As explained above, each weak dictionary has a contract. An otherwise
123 unreachable dictionary that has not yet fully fulfilled its contract
124 must be preserved: it must be preserved as long as it might need to
125 forward some entry to its finalization stream. This is what the
126 procedure `g...Preserve()' does: it goes through `previousWeakList',
127 find each weak dictionary that needs to be preserved, and gCollect's
128 it.
129
130 A weak dictionary may contain a (finalization) stream, but, in order
131 not to break the invariants of the GC, garbage collection is not
132 recursive. Instead the contents of objects reached, are actually
133 collected in a second phase where `gCollectRecurse()' is called. This
134 is why now that we have possibly collected additional weak
135 dictionaries, it is important to also cause the recursive phase to
136 happen so that the stream are also traversed by GC. This is why we
137 have a call to `cacStack.gCollectRecurse()'.
138
139 *** gCollectWeakDictionariesContent()
140
141 All weak dictionaries that must be preserved have been traversed and
142 are recorded in `weakList'. Now we can process their entries to find
143 out which one are eligible for finalization. This is done by invoking
144 the `weakGC()' method od each weak dictionary in `weakList'.
145
146 **** weakGC()
147
148 It takes care of 2 things: (1) finding out the entries that should be
149 finalized, (2) creating the updated table on the new heap to replace
150 the old table on the old heap.
151
152 Each entry in the old table is examined: if the value is GC-marked,
153 the the value is still live and the entry added to the new table. If
154 the value is not GC-marked and the weak dictionary has a finalization
155 stream (i.e. has not been explicitly closed), then the pair key#value
156 is pushed on the `weakReviveStack' and added to the list named `list'.
157 Note that we initialize `list' with a future that will serve as the
158 new tail of the stream.
159
160 Note that neither key nor value has been gCollect'ed yet (well, it is
161 possible that the key has already been gCollect'ed if it is reachable
162 from somewhere else, but the value certainly has not since it is not
163 GC-marked). We don't want to gCollect them right away because this
164 would affect GC-marks and would therefore perturb the process that
165 decides whether to finalize an entry now or keep it around. Thus, in
166 order to gCollect them later, we push them onto the weakReviveStack.
167
168 During weakGC(), we accumulate the `list' of pairs `key#value' that
169 must be sent for finalization (this list ends with a future which is
170 to serve as the new tail of the finalization stream). When we are
171 done, we push `(stream,list)' on the `weakStack'. `stream' is the
172 pre-GC tail of the finalization stream (a future).
173
174 After we have weakGC'ed all preserved weak dictionaries, we process
175 the weakReviveStack: for each pair `key#value' we gCollect both the
176 `key' and the `value'. Notice that this means that a value that has
177 become eligible for finalization is preserved for (at least) one more
178 round so that the thread processing the finalization stream may apply
179 to it the actual finalization procedure. Thus a finalizable value
180 doesn't immediately disappear: rather it becomes reachable again, but
181 only from its finalization stream(s). However, it is no longer
182 recorded in any weak dictionary (unless the finalization procedure
183 decides to register it again).
184
185 At this point, for each entry `(stream,list)' on the `weakStack',
186 `list' is a list of `key#value' where both args have now been revived
187 (traversed by GC) and ending in a future which is the new tail of the
188 finalization stream post-GC. `stream' is the gCollect'ed old tail of
189 the finalization stream. We simply bind `stream' to `list'.
Something went wrong with that request. Please try again.