* Writing better performing .NET and Mono applications
<center>
Miguel de Icaza (miguel@novell.com)<br>
Ben Maurer (bmaurer@users.sourceforge.net)
</center>
The following document contains a few hints on how to improve
the performance of your Mono/.NET applications.

These are just guidelines, and you should still profile your
code to find the actual performance problems in your
application. It is never wise to make a change in the hope of
improving the performance of your code without first
measuring. In general, these guidelines should serve as ideas
to help you figure out `how can I make this method run faster?'.
It is up to you to figure out `which method is running slowly'.
** Using the Mono profiler
So, how does one measure which methods are running slowly? A profiler
helps with this task. Mono includes a profiler that is built
into the runtime. You can invoke this profiler on your program
by running with the --profile flag:
<pre class="shell">
mono --profile program.exe
</pre>
The above will instruct Mono to instrument your application
for profiling. The default Mono profiler will record the time
spent in each routine, the number of times the routine was called,
the memory consumed by each method broken down by invoker, and
the total amount of memory consumed.
It does this by asking the JIT to insert a call to the profiler
every time a method is entered or left. The profiler measures the
time elapsed between the beginning and the end of the
call. The profiler is also notified of allocations.
When the program has finished executing, the profiler prints the
data in a human-readable format. It looks like:
<pre class="shell">
Total time spent compiling 227 methods (sec): 0.07154
Slowest method to compile (sec): 0.01893: System.Console::.cctor()
Time(ms) Count P/call(ms) Method name
########################
91.681 1 91.681 .DebugOne::Main()
Callers (with count) that contribute at least for 1%:
1 100 % .DebugOne::Main(object,intptr,intptr)
...
Total number of calls: 3741
...
Allocation profiler
Total mem Method
########################
406 KB .DebugOne::Main()
406 KB 1000 System.Int32[]
Callers (with count) that contribute at least for 1%:
1 100 % .DebugOne::Main(object,intptr,intptr)
Total memory allocated: 448 KB
</pre>
At the top, it shows each method that is called. The data is sorted
by the total time that the program spent within the method. Then
it shows how many times the method was called, and the average time
per call.
Below this, it shows the top callers of the method. This is very useful
data. If you find, for example, that the method Data::Computate () takes
a very long time to run, you can look to see if any of the calls can be
avoided.
Two warnings must be given about the method data. First,
the profiler has an overhead associated with it. As such,
a method with a high number of calls may show up as consuming
lots of time, when in reality it does not consume much time
at all. If you see a method that has a very high number of
calls, you may be able to ignore it. However, do consider
removing calls where possible, as that will sometimes help
performance. This problem is often seen with the use
of built-in collection types.
Secondly, due to the nature of the profiler, recursive calls
show extremely large times (because the profiler double-counts
when the method calls itself). One easy way to spot this problem:
if a method is shown as taking more time than the Main
method, it is very likely recursive and triggering this behavior.
Below the method data, allocation data is shown. This shows
how much memory each method allocates. The number beside
the method is the total amount of memory. Below that, it
is broken down into types. Then, the caller data is given. This
data is again useful when you want to figure out how to eliminate calls.
You might want to keep a close eye on the memory consumption
and on the method invocation counts. A lot of the
performance gains in MCS, for example, came from reducing its
memory usage rather than from changes in the execution path.
** Profiling without JIT instrumentation
You might also be interested in using mono --aot to generate
precompiled code, and then using a system like `oprofile' to
profile your programs.
** Memory Management in the .NET/Mono world.
Since Mono and .NET offer automatic garbage collection,
programmers are freed from having to track and dispose of the
objects they consume (except for IDisposable-like classes). This
is a great productivity gain, but if you create thousands of
objects, you will make the garbage collector do more work,
and it might slow down your application.
Remember, each time you allocate an object, the GC is forced
to find space for it. Each object has an 8-byte overhead
(4 bytes to record its type, plus 4 for a sync block). If
the GC finds that it is running out of room, it will scan every
object for pointers, looking for unreferenced objects. Every
extra object you allocate is extra work the GC must later do to free it.
Mono uses the Boehm GC, which is a conservative collector;
this might lead to some memory fragmentation, and unlike
generational GC systems, it has to scan the entire allocated
memory pool.
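As a minimal sketch of how allocation pressure adds up (the class and
method names here are illustrative, not part of any library), compare
building a string by repeated concatenation, which allocates a fresh
string on every iteration, with reusing a single growing buffer:

```csharp
using System;
using System.Text;

public class AllocationDemo {
	// Concatenating strings in a loop creates a new string object
	// on every iteration, giving the GC extra work to do.
	public static string BuildSlow (string[] parts)
	{
		string result = "";
		foreach (string p in parts)
			result += p;
		return result;
	}

	// Reusing one StringBuilder grows a single buffer instead,
	// so far fewer intermediate objects are created.
	public static string BuildFast (string[] parts)
	{
		StringBuilder sb = new StringBuilder ();
		foreach (string p in parts)
			sb.Append (p);
		return sb.ToString ();
	}

	static void Main ()
	{
		string[] parts = { "a", "b", "c" };
		// Both produce the same result; only the allocation
		// behavior differs.
		Console.WriteLine (BuildSlow (parts) == BuildFast (parts));
	}
}
```

Both versions compute the same value; the difference only matters when
the loop runs many times or over large inputs, which is exactly where
the profiler's allocation data will point you.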
*** Boxing
The .NET framework provides a rich hierarchy of object types.
Each object not only has value information, but also type
information associated with it. This type information makes
many types of programs easier to write. It also has a cost
associated with it. The type information takes up space.
In order to reduce the cost of type information, almost every
object-oriented language has the concept of `primitives'.
They usually map to types such as integers and booleans. These
types do not have any type information associated with them.
However, the language must also be able to treat primitives
as first-class values, on a par with objects. Languages
handle this issue in different ways. Some choose to make a
special class for each primitive, and force the user to do an
operation such as:
<pre class="shell">
// This is Java
list.add (new Integer (1));
System.out.println (list.get (1).intValue ());
</pre>
The C# design team was not satisfied with this type
of construct. They added a notion of `boxing' to the language.
Boxing performs the same thing as Java's <code>new Integer (1)</code>.
The user is not forced to write the extra code. However,
behind the scenes the <em>same thing</em> is being done
by the runtime. Each time a primitive is cast to an object,
a new object is allocated.
You must be careful when casting a primitive to an object.
Note that because it is an implicit conversion, you will
not see it in your code. For example, boxing is happening here:
<pre class="shell">
ArrayList foo = new ArrayList ();
foo.Add (1);
</pre>
In high performance code, this operation can be very costly.
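A small sketch of the hidden allocations (the class name is
illustrative): every int added to an ArrayList is boxed into a fresh
heap object, while a plain int[] stores the values inline with no
per-element allocation:

```csharp
using System;
using System.Collections;

public class BoxingDemo {
	static void Main ()
	{
		// Each Add (i) implicitly boxes the int: one new heap
		// object per element.
		ArrayList list = new ArrayList ();
		for (int i = 0; i < 1000; i++)
			list.Add (i);

		// Getting the value back requires an unboxing cast.
		int first = (int) list[0];

		// A plain int[] stores the values inline: no boxing,
		// no per-element allocation.
		int[] array = new int[1000];
		for (int i = 0; i < 1000; i++)
			array[i] = i;

		Console.WriteLine (first == array[0]);
	}
}
```

When the element type is known up front and the collection sits in a
hot path, a typed array avoids all thousand boxing allocations the
ArrayList version performs.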
*** Using structs instead of classes for small objects
For small objects, you might want to consider using value
types (structs) instead of objects (classes).

However, you must be careful not to use the struct
as an object; in that case it will actually be more costly.
As a rule of thumb, only use structs if you have a small
number of fields (totaling less than 32 bytes) and
need to pass the item `by value'. You should not box the object.
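As a sketch (the Point type here is an example, not a framework type),
a small struct passed by value stays off the GC heap entirely, but
assigning it to an object boxes it and allocates:

```csharp
using System;

// A small value type: 8 bytes of fields, stored inline
// (on the stack or inside its container), not on the GC heap.
public struct Point {
	public int X, Y;
	public Point (int x, int y) { X = x; Y = y; }
}

public class StructDemo {
	public static int ManhattanDistance (Point a, Point b)
	{
		// a and b are copied by value; no heap allocation occurs.
		return Math.Abs (a.X - b.X) + Math.Abs (a.Y - b.Y);
	}

	static void Main ()
	{
		Point p = new Point (1, 2);
		Point q = new Point (4, 6);
		Console.WriteLine (ManhattanDistance (p, q));

		// Careful: treating the struct as an object boxes it,
		// allocating on the heap and defeating the purpose.
		object boxed = p;
		Console.WriteLine (boxed is Point);
	}
}
```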
*** Assisting the Garbage Collector
Although the garbage collector will do the right thing in
terms of releasing and finalizing objects on time, you can
assist it by clearing the fields that
point to objects you no longer need. This means that some objects
might become eligible for collection earlier than they otherwise
would, which can help reduce memory consumption and reduce the work
that the GC has to do.
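As a small sketch (the Cache class is illustrative), clearing a field
once its data is no longer needed lets the collector reclaim the
referenced memory even while the containing object stays alive:

```csharp
using System;

public class Cache {
	byte[] buffer = new byte[1 << 20]; // 1 MB working buffer

	public void Process ()
	{
		// ... use buffer ...

		// Clearing the field once the data is no longer needed
		// makes the megabyte eligible for collection on the GC's
		// next run, even though this Cache instance is still live.
		buffer = null;
	}
}

public class CacheDemo {
	static void Main ()
	{
		Cache c = new Cache ();
		c.Process ();
		Console.WriteLine ("done");
	}
}
```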
** Common problems with <tt>foreach</tt>
The <tt>foreach</tt> C# statement handles various kinds of
different constructs (about seven different code patterns are
generated). Typically foreach generates more efficient code
than manually constructed loops, and also ensures that objects
which implement IDisposable are properly released.
But foreach sometimes generates code that performs badly under
stress. Foreach performs badly when it is used in tight
loops and its use leads to the creation of many enumerators.
Although technically obtaining an enumerator for some objects
like ArrayList is more efficient than using the ArrayList
indexer, the pressure introduced by the extra memory
requirements and the demands on the garbage collector makes it
less efficient overall.
There is no straightforward rule on when to use foreach and
when to use a manual loop. The best thing to do is to always
use foreach, and only when a profile shows a problem, replace
foreach with for loops.
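The trade-off above can be sketched as follows (the method names are
illustrative): both loops compute the same result, but the foreach
version allocates an enumerator object per call, which matters if the
method sits inside a hot loop:

```csharp
using System;
using System.Collections;

public class LoopDemo {
	// foreach on an ArrayList allocates an enumerator object for
	// each loop; in a tight inner loop that adds GC pressure.
	public static int SumForeach (ArrayList list)
	{
		int sum = 0;
		foreach (int i in list)   // also unboxes each element
			sum += i;
		return sum;
	}

	// A manual for loop uses the indexer and allocates nothing
	// beyond the unboxing the ArrayList itself forces.
	public static int SumFor (ArrayList list)
	{
		int sum = 0;
		for (int i = 0; i < list.Count; i++)
			sum += (int) list[i];
		return sum;
	}

	static void Main ()
	{
		ArrayList list = new ArrayList ();
		for (int i = 1; i <= 4; i++)
			list.Add (i);
		Console.WriteLine (SumForeach (list) == SumFor (list));
	}
}
```

As the section says, start with the foreach version for clarity, and
only switch to the manual loop when the profiler shows the enumerator
allocations actually matter.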