@@ -117,7 +117,58 @@ There may well be none in a stable release.
=item *
- XXX
+ Large hashes no longer allocate their keys from the shared string table.
+
+ The same internal datatype (C<PVHV>) is used for all of:
+
+ =over 4
+
+ =item *
+
+ Symbol tables
+
+ =item *
+
+ Objects (by default)
+
+ =item *
+
+ Associative arrays
+
+ =back
+
+ The shared string table was originally added to improve performance for blessed
+ hashes used as objects, because every object instance has the same keys, so it
+ is an optimisation to share memory between them. It also makes sense for symbol
+ tables, where derived classes will have the same keys (typically method names),
+ and the OP trees built for method calls can also share memory. The shared
+ string table behaves roughly like a cache for hash keys.
+
+ But for hashes actually used as associative arrays - mapping keys to values -
+ typically the keys are not re-used in other hashes. For example, "seen" hashes
+ are keyed by object IDs (or addresses), and logically these keys won't repeat
+ in other hashes.
+
+ Storing these "used just once" keys in the shared string table increases CPU
+ and RAM use for no gain. For such keys the shared string table behaves as a
+ cache with a 0% hit rate. Storing all the keys there increases the total size
+ of the shared string table, as well as increasing the number of times it is
+ resized as it grows. B<Worse> - in any environment that has "copy on write"
+ memory for child processes (such as a pre-forking server), the memory pages used
+ for the shared string table rapidly need to be copied as the child process
+ manipulates hashes. Hence if most of the shared string table is such keys that
+ are used only in one place, there is no benefit from re-use within the perl
+ interpreter, but a high cost due to more pages for the OS to copy.
+
+ The perl interpreter now disables shared hash keys for "large" hashes (that are
+ neither objects nor symbol tables). "Large" is a heuristic - currently the
+ heuristic is that sharing is disabled when adding a key to a hash triggers
+ allocation of more storage, and the hash has more than 42 keys.
+
+ This B<might> cause slightly increased memory usage for programs that create
+ (unblessed) data structures that contain multiple large hashes that share the
+ same keys. But generally our testing suggests that for the specific cases
+ described it is a win, and other code is unaffected.
=back