-
Notifications
You must be signed in to change notification settings - Fork 7
/
component_report.dita
299 lines (297 loc) · 13 KB
/
component_report.dita
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE reference PUBLIC "-//OASIS//DTD DITA Reference//EN" "reference.dtd" >
<!--
Copyright (c) 2008, 2020 SAP AG and others.
All rights reserved. This program and the accompanying materials
are made available under the terms of the Eclipse Public License v1.0
which accompanies this distribution, and is available at
http://www.eclipse.org/legal/epl-v10.html
Contributors:
SAP AG - initial API and implementation
IBM Corporation - minor updates
-->
<reference id="ref_inspections_component_report" xml:lang="en-us">
<title>Component Report</title>
<shortdesc>Analyze a component for possible memory waste and
other inefficiencies.</shortdesc>
<prolog>
<copyright>
<copyryear year=""></copyryear>
<copyrholder>
Copyright (c) 2008, 2020 SAP AG and others.
All rights reserved. This program and the accompanying materials
are made available under the terms of the Eclipse Public License v1.0
which accompanies this distribution, and is available at
http://www.eclipse.org/legal/epl-v10.html
</copyrholder>
</copyright>
</prolog>
<refbody>
<section id="introduction">
<title>Introduction</title>
<p>A heap dump contains millions of objects. But which of those
belong to your component? And what conclusions can you draw from
them? This is where the Component Report can help.</p>
<p>
Before starting, one has to decide what constitutes a component.
Typically, a component is either a set of classes in a
<b>common root package</b>
or a set of classes loaded by the same
<b>class loader</b>
.
</p>
<p>
Using this root set of objects, the component report calculates a
customized retained set. This retained set includes all objects kept
alive by the root set. Additionally, it assumes that all objects
that have become
<i>finalizable</i>
actually have been finalized and that also all soft references have
been cleared.
</p>
</section>
<section id="run">
<title>Executing the Component Report</title>
<p> To run the report for a common root package, select the component
report from the tool bar and provide a regular expression to match
the package:</p>
<image href="component_report_package.png">
<alt>Regular expression to match common root package to be
used for the component report.</alt>
</image>
<p> Alternatively, one can group the class histogram by class loader
and then right-click the appropriate class loader and select the
component report:</p>
<image href="component_report_classloader.png">
<alt>Group histogram by class loader.</alt>
</image>
</section>
<section id="overview">
<title>Overview</title>
<p>The component report is rendered as HTML. It is stored in a ZIP
file next to the heap dump file.</p>
<image href="component_report_overview.png">
<alt>Overview section of the component report.</alt>
</image>
<p>
<ol outputclass="arrows">
<li>Details about the size, the number of classes, the
number of objects and the number of different class loaders.</li>
<li>The pie chart shows the size of the component relative to
the total heap size.
See <xref href="../tipsandtricks.dita#ref_tips__piechartlinks">Pie Chart Links</xref>
for links from pie charts.
</li>
<li>
The
<xref href="top_consumers.dita" scope="local">Top Consumers</xref>
section lists the biggest object, classes, class loader and
packages which are retained by the component. It provides a good
overview of what is actually kept alive by the component.
</li>
<li>
<xref href="retained_set.dita" scope="local">Retained Set
</xref>
displays all objects grouped by classes which are retained.
</li>
</ol>
</p>
</section>
<section id="strings">
<title>Duplicate Strings</title>
<p>
Duplicate Strings are a prime example for memory waste: multiple
char arrays with identical content. To find the duplicates, the
report
<xref href="group_by_value.dita" scope="local">groups</xref>
the char arrays by their value. It lists all char arrays with 10 or
more instances with identical content.
</p>
<p>
The content of the char arrays typically gives away ideas how to
reduce the duplicates:
<ul>
<li>
Sometimes the duplicate strings are used as
<b>keys or values in hash maps</b>
. For example, when reading heap dumps, MAT itself used to read
the char constant denoting the type of an attribute into memory.
It turned out that the heap was littered with many 'L's for
references, 'B's for bytes, and 'Z's for booleans, etc. By
replacing the
<codeph>char</codeph>
with an
<codeph>int</codeph>
, MAT could save some of the precious memory. Alternatively,
Enumerations could do the same trick.
</li>
<li>
When reading
<b>XML documents</b>
, fragments like
<codeph>UTF-8</codeph>
, tag names or tag content remains in memory. Again, think about
using Enumerations for the repetitive content.
</li>
<li>
Another option is
<xref format="html" scope="external"
href="https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#intern--">interning</xref>
the String. This adds the string to a pool of strings which is
maintained privately by the class
<codeph>String</codeph>
. For each unique string, the pool will keep on instance alive.
However, if you are interning, make sure do it
<b>responsibly</b>
: A big pool of strings will have maintenance costs and one cannot
rely on interned strings being garbage collected.
</li>
</ul>
</p>
</section>
<section id="emptycol">
<title>Empty Collections</title>
<p>Even if collections are empty, they usually consume memory
through their internal object array. Imagine a tree structure where
every node eagerly creates array lists to hold its children, but
only a few nodes actually possess children.</p>
<p>
One remedy is the lazy initialization of the collections: create the
collection only when it is actually needed. To find out who is
responsible for the empty collections, use the
<xref href="immediate_dominators.dita" scope="local">immediate
dominators</xref>
command.
</p>
</section>
<section id="colfillratio">
<title>Collection Fill Ratio</title>
<p>Just like empty ones, collections with only a few elements
also take up a lot of memory. Again, the backing array of the
collection is the main culprit. The examination of the fill ratios
using a heap dump from a production system gives hints to what
initial capacity to use.</p>
</section>
<section id="softref">
<title>Soft Reference Statistics</title>
<p>
Soft references are cleared by the virtual machine in response to
memory demand. Usually, soft references are used to implement
caches: keep the objects around while there is sufficient memory,
clear the objects if free memory becomes low.
<ul>
<li>Usually objects are cached, because they are expensive
to re-create. Across a whole application, soft referenced objects
might carry very different costs. However, the virtual machine
cannot know this and clears the objects on some least recently
used algorithm. From the outside, this is very unpredictable and
difficult to fine tune.</li>
<li>
Furthermore, soft references can impose a
<i>stop-the-world</i>
phase during garbage collection. Oversimplified, the GC marks the
object graph behind the soft references while the virtual machine
is stopped.
</li>
</ul>
The report shows:
<ul>
<li><wintitle>Comment</wintitle>
An example is the following:
<lines><systemoutput>A total of 217,035 java.lang.ref.SoftReference objects have been found, which softly reference 38,874 objects.
77,745 objects totalling 20.8 MB are retained (kept alive) only via soft references.
No objects totalling 0 B are softly referenced and also strongly retained (kept alive) via soft references.
</systemoutput></lines>
</li>
<li><wintitle>Histogram of Soft References</wintitle>
These are the reference objects which are instances of a type or subclass of <codeph>java.lang.ref.SoftReference</codeph></li>
<li><wintitle>Histogram of Softly Referenced</wintitle>
These are the objects which the referent fields point to.</li>
<li><wintitle>Only Softly Retained</wintitle>
These are all the objects retained the objects in the <wintitle>Only Softly Retained</wintitle>
table. All these objects could be freed in the next garbage collection cycle
if the VM was short of memory.</li>
<li><wintitle>Referents strongly retained by soft reference</wintitle>
These could indicate a possible memory leak, as the referent field can
never be cleared while there is a strong path from the soft references
to the referenced objects.</li>
<li>If there are objects in the <wintitle>Referents strongly retained by soft reference</wintitle>
table then the <xref href="reference_leak.dita"><cmdname>Reference Leak</cmdname></xref>
is run to examine the possible leaks in more detail.
<note>
This might not show the same problems because the initial report
<wintitle>Referents strongly retained by soft reference</wintitle> is done for
all references of a particular type whereas the <cmdname>Reference Leak</cmdname>
query operates on individual references.</note></li>
</ul>
</p>
</section>
<section>
<title>Weak Reference Statistics</title>
<p>
Weak references are cleared by the virtual machine when the
object referred to by the referent is no longer strongly reachable
or softly reachable
via another path.
Usually, weak references are used to retain extra data associated
with an object in a <codeph>java.util.WeakHashMap</codeph>
or to maintain a canonical mapping as the canonical object can
be retrieved if it is in use anywhere, but if no longer in use
then the mapping table will not keep it alive.
</p>
<p>The report follows the format of the <xref href="#ref_inspections_component_report__softref">Soft Reference</xref>
section above. An example of the comment is
<lines><systemoutput>A total of 620 java.lang.ref.WeakReference objects have been found, which weakly reference 436 objects.
No objects totalling 0 B are retained (kept alive) only via weak references.
Possible Memory Leak 301 objects totalling 7.1 KB are weakly referenced and also strongly retained (kept alive) via weak references.
</systemoutput></lines>
</p>
</section>
<section id="finalizer">
<title>Finalizer Statistics</title>
<p>
Objects which implement the
<codeph>finalize</codeph>
method are included in the component report, because those objects
can have serious implications for the memory of a Java Virtual
Machine:
<ul>
<li>
Whenever an object with finalizer is created, a corresponding
<codeph>java.lang.ref.Finalizer</codeph>
object is created. If the object is only reachable via its
finalizer, it is placed in the queue of the finalizer thread and
processed. Only then the next garbage collection will actually
free the memory. Therefore it takes at least two garbage
collections until the memory is freed.
</li>
<li>When using Sun's current virtual machine implementation,
the finalizer thread is a single thread processing the finalizer
objects sequentially. One blocking finalizer queue therefore can
easily keep alive big chunks of memory (all those other objects
ready to be finalized).</li>
<li>
Depending on the actual algorithm, finalizer may require a
<i>stop-the-world</i>
pause during garbage collections. This, of course, can have
serious implications for the responsiveness of the whole
application.
</li>
<li>Last not least, the time of execution of the finalizer is
up to the VM and therefore unpredictable.</li>
</ul>
</p>
</section>
<section id="mapcollision">
<title>Map Collision Ratios</title>
<p>This sections analyzes the collision ratios of hash maps. Maps
place the values in different buckets based on the hash code of the
keys. If the hash code points to the same bucket, the elements
inside the bucket are typically compared linearly.</p>
<p>High collision ratios can indicate sub-optimal hash codes.
This is not a memory problem (a better hash code does not save
space) but rather performance problem because of the linear access
inside the buckets.</p>
</section>
</refbody>
</reference>