Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Newer
Older
100644 432 lines (344 sloc) 14.019 kB
a85bd0c Initial commit
Jeff Muizelaar authored
1 ## statprof.py
9ae3cc6 @bos Track line numbers properly.
authored
2 ## Copyright (C) 2012 Bryan O'Sullivan <bos@serpentine.com>
a85bd0c Initial commit
Jeff Muizelaar authored
3 ## Copyright (C) 2011 Alex Fraser <alex at phatcore dot com>
4 ## Copyright (C) 2004,2005 Andy Wingo <wingo at pobox dot com>
5 ## Copyright (C) 2001 Rob Browning <rlb at defaultvalue dot org>
6
7 ## This library is free software; you can redistribute it and/or
8 ## modify it under the terms of the GNU Lesser General Public
9 ## License as published by the Free Software Foundation; either
10 ## version 2.1 of the License, or (at your option) any later version.
11 ##
12 ## This library is distributed in the hope that it will be useful,
13 ## but WITHOUT ANY WARRANTY; without even the implied warranty of
14 ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
15 ## Lesser General Public License for more details.
16 ##
17 ## You should have received a copy of the GNU Lesser General Public
18 ## License along with this program; if not, contact:
19 ##
20 ## Free Software Foundation Voice: +1-617-542-5942
21 ## 59 Temple Place - Suite 330 Fax: +1-617-542-2652
22 ## Boston, MA 02111-1307, USA gnu@gnu.org
23
24 """
25 statprof is intended to be a fairly simple statistical profiler for
26 python. It was ported directly from a statistical profiler for guile,
27 also named statprof, available from guile-lib [0].
28
29 [0] http://wingolog.org/software/guile-lib/statprof/
30
31 To start profiling, call statprof.start():
32 >>> start()
33
34 Then run whatever it is that you want to profile, for example:
35 >>> import test.pystone; test.pystone.pystones()
36
37 Then stop the profiling and print out the results:
38 >>> stop()
39 >>> display()
456e86a @bos Kill off trailing whitespace
authored
40 % cumulative self
41 time seconds seconds name
a85bd0c Initial commit
Jeff Muizelaar authored
42 26.72 1.40 0.37 pystone.py:79:Proc0
43 13.79 0.56 0.19 pystone.py:133:Proc1
44 13.79 0.19 0.19 pystone.py:208:Proc8
45 10.34 0.16 0.14 pystone.py:229:Func2
46 6.90 0.10 0.10 pystone.py:45:__init__
47 4.31 0.16 0.06 pystone.py:53:copy
48 ...
49
69b84b2 @bos Improve docs
authored
50 All of the numerical data is statistically approximate. In the
51 following column descriptions, and in all of statprof, "time" refers
52 to execution time (both user and system), not wall clock time.
a85bd0c Initial commit
Jeff Muizelaar authored
53
54 % time
55 The percent of the time spent inside the procedure itself (not
56 counting children).
57
58 cumulative seconds
59 The total number of seconds spent in the procedure, including
60 children.
61
62 self seconds
63 The total number of seconds spent in the procedure itself (not
64 counting children).
65
66 name
67 The name of the procedure.
68
69 By default statprof keeps the data collected from previous runs. If you
70 want to clear the collected data, call reset():
71 >>> reset()
72
fee07fa @bos Increase the default sampling frequency to 1000 Hz.
authored
73 reset() can also be used to change the sampling frequency from the
74 default of 1000 Hz. For example, to tell statprof to sample 50 times a
75 second:
a85bd0c Initial commit
Jeff Muizelaar authored
76 >>> reset(50)
77
78 This means that statprof will sample the call stack after every 1/50 of
79 a second of user + system time spent running on behalf of the python
80 process. When your process is idle (for example, blocking in a read(),
81 as is the case at the listener), the clock does not advance. For this
82 reason statprof is not currently not suitable for profiling io-bound
83 operations.
84
85 The profiler uses the hash of the code object itself to identify the
86 procedures, so it won't confuse different procedures with the same name.
87 They will show up as two different rows in the output.
88
89 Right now the profiler is quite simplistic. I cannot provide
90 call-graphs or other higher level information. What you see in the
91 table is pretty much all there is. Patches are welcome :-)
92
93
94 Threading
95 ---------
96
97 Because signals only get delivered to the main thread in Python,
98 statprof only profiles the main thread. However because the time
99 reporting function uses per-process timers, the results can be
100 significantly off if other threads' work patterns are not similar to the
101 main thread's work patterns.
102 """
103 from __future__ import division
104
105 import os
9ae3cc6 @bos Track line numbers properly.
authored
106 import signal
a85bd0c Initial commit
Jeff Muizelaar authored
107
02ddd9e @DurhamG Add a format option to group results by function.
DurhamG authored
108 from collections import defaultdict
7972fe8 @cyberdelia add profile contextmanager
cyberdelia authored
109 from contextlib import contextmanager
110
a85bd0c Initial commit
Jeff Muizelaar authored
111
7972fe8 @cyberdelia add profile contextmanager
cyberdelia authored
112 __all__ = ['start', 'stop', 'reset', 'display', 'profile']
a85bd0c Initial commit
Jeff Muizelaar authored
113
114
115 ###########################################################################
116 ## Utils
117
118 def clock():
119 times = os.times()
120 return times[0] + times[1]
121
122
123 ###########################################################################
124 ## Collection data structures
125
126 class ProfileState(object):
127 def __init__(self, frequency=None):
128 self.reset(frequency)
129
130 def reset(self, frequency=None):
131 # total so far
132 self.accumulated_time = 0.0
133 # start_time when timer is active
134 self.last_start_time = None
135 # total count of sampler calls
136 self.sample_count = 0
137 # a float
138 if frequency:
f825408 @cyberdelia strict pep8 compliance
cyberdelia authored
139 self.sample_interval = 1.0 / frequency
a85bd0c Initial commit
Jeff Muizelaar authored
140 elif not hasattr(self, 'sample_interval'):
fee07fa @bos Increase the default sampling frequency to 1000 Hz.
authored
141 # default to 1000 Hz
f825408 @cyberdelia strict pep8 compliance
cyberdelia authored
142 self.sample_interval = 1.0 / 1000.0
a85bd0c Initial commit
Jeff Muizelaar authored
143 else:
144 # leave the frequency as it was
145 pass
146 self.remaining_prof_time = None
147 # for user start/stop nesting
148 self.profile_level = 0
149 # whether to catch apply-frame
150 self.count_calls = False
151 # gc time between start() and stop()
152 self.gc_time_taken = 0
153
154 def accumulate_time(self, stop_time):
155 self.accumulated_time += stop_time - self.last_start_time
156
157 state = ProfileState()
158
f825408 @cyberdelia strict pep8 compliance
cyberdelia authored
159
9ae3cc6 @bos Track line numbers properly.
authored
160 class CodeKey(object):
161 cache = {}
162
163 __slots__ = ('filename', 'lineno', 'name')
164
165 def __init__(self, frame):
166 code = frame.f_code
a85bd0c Initial commit
Jeff Muizelaar authored
167 self.filename = code.co_filename
9ae3cc6 @bos Track line numbers properly.
authored
168 self.lineno = frame.f_lineno
169 self.name = code.co_name
170
171 def __eq__(self, other):
172 try:
173 return (self.lineno == other.lineno and
174 self.filename == other.filename)
175 except:
176 return False
177
178 def __hash__(self):
179 return hash((self.lineno, self.filename))
180
181 @classmethod
182 def get(cls, frame):
183 k = (frame.f_code.co_filename, frame.f_lineno)
184 try:
185 return cls.cache[k]
186 except KeyError:
187 v = cls(frame)
188 cls.cache[k] = v
189 return v
190
f825408 @cyberdelia strict pep8 compliance
cyberdelia authored
191
9ae3cc6 @bos Track line numbers properly.
authored
192 class CallData(object):
193 all_calls = {}
194
195 __slots__ = ('key', 'call_count', 'cum_sample_count', 'self_sample_count')
196
197 def __init__(self, key):
198 self.key = key
a85bd0c Initial commit
Jeff Muizelaar authored
199 self.call_count = 0
200 self.cum_sample_count = 0
201 self.self_sample_count = 0
202
9ae3cc6 @bos Track line numbers properly.
authored
203 @classmethod
204 def get(cls, key):
205 try:
206 return cls.all_calls[key]
207 except KeyError:
208 v = CallData(key)
209 cls.all_calls[key] = v
210 return v
a85bd0c Initial commit
Jeff Muizelaar authored
211
212
213 ###########################################################################
214 ## SIGPROF handler
215
216 def sample_stack_procs(frame):
217 state.sample_count += 1
9ae3cc6 @bos Track line numbers properly.
authored
218 key = CodeKey.get(frame)
219 CallData.get(key).self_sample_count += 1
a85bd0c Initial commit
Jeff Muizelaar authored
220
9ae3cc6 @bos Track line numbers properly.
authored
221 keys_seen = set()
a85bd0c Initial commit
Jeff Muizelaar authored
222 while frame:
9ae3cc6 @bos Track line numbers properly.
authored
223 key = CodeKey.get(frame)
224 keys_seen.add(key)
a85bd0c Initial commit
Jeff Muizelaar authored
225 frame = frame.f_back
9ae3cc6 @bos Track line numbers properly.
authored
226 for key in keys_seen:
227 CallData.get(key).cum_sample_count += 1
a85bd0c Initial commit
Jeff Muizelaar authored
228
f825408 @cyberdelia strict pep8 compliance
cyberdelia authored
229
a85bd0c Initial commit
Jeff Muizelaar authored
230 def profile_signal_handler(signum, frame):
231 if state.profile_level > 0:
232 state.accumulate_time(clock())
233 sample_stack_procs(frame)
234 signal.setitimer(signal.ITIMER_PROF,
235 state.sample_interval, 0.0)
236 state.last_start_time = clock()
237
238
239 ###########################################################################
240 ## Profiling API
241
242 def is_active():
243 return state.profile_level > 0
244
f825408 @cyberdelia strict pep8 compliance
cyberdelia authored
245
a85bd0c Initial commit
Jeff Muizelaar authored
246 def start():
69b84b2 @bos Improve docs
authored
247 '''Install the profiling signal handler, and start profiling.'''
a85bd0c Initial commit
Jeff Muizelaar authored
248 state.profile_level += 1
249 if state.profile_level == 1:
250 state.last_start_time = clock()
251 rpt = state.remaining_prof_time
252 state.remaining_prof_time = None
253 signal.signal(signal.SIGPROF, profile_signal_handler)
254 signal.setitimer(signal.ITIMER_PROF,
255 rpt or state.sample_interval, 0.0)
f825408 @cyberdelia strict pep8 compliance
cyberdelia authored
256 state.gc_time_taken = 0 # dunno
257
456e86a @bos Kill off trailing whitespace
authored
258
a85bd0c Initial commit
Jeff Muizelaar authored
259 def stop():
69b84b2 @bos Improve docs
authored
260 '''Stop profiling, and uninstall the profiling signal handler.'''
a85bd0c Initial commit
Jeff Muizelaar authored
261 state.profile_level -= 1
262 if state.profile_level == 0:
263 state.accumulate_time(clock())
264 state.last_start_time = None
265 rpt = signal.setitimer(signal.ITIMER_PROF, 0.0, 0.0)
266 signal.signal(signal.SIGPROF, signal.SIG_IGN)
267 state.remaining_prof_time = rpt[0]
f825408 @cyberdelia strict pep8 compliance
cyberdelia authored
268 state.gc_time_taken = 0 # dunno
269
456e86a @bos Kill off trailing whitespace
authored
270
a85bd0c Initial commit
Jeff Muizelaar authored
271 def reset(frequency=None):
69b84b2 @bos Improve docs
authored
272 '''Clear out the state of the profiler. Do not call while the
273 profiler is running.
274
275 The optional frequency argument specifies the number of samples to
276 collect per second.'''
a85bd0c Initial commit
Jeff Muizelaar authored
277 assert state.profile_level == 0, "Can't reset() while statprof is running"
9ae3cc6 @bos Track line numbers properly.
authored
278 CallData.all_calls.clear()
279 CodeKey.cache.clear()
a85bd0c Initial commit
Jeff Muizelaar authored
280 state.reset(frequency)
456e86a @bos Kill off trailing whitespace
authored
281
a85bd0c Initial commit
Jeff Muizelaar authored
282
7972fe8 @cyberdelia add profile contextmanager
cyberdelia authored
283 @contextmanager
284 def profile():
285 start()
286 try:
287 yield
288 finally:
289 stop()
290 display()
291
292
a85bd0c Initial commit
Jeff Muizelaar authored
293 ###########################################################################
294 ## Reporting API
295
296 class CallStats(object):
297 def __init__(self, call_data):
298 self_samples = call_data.self_sample_count
299 cum_samples = call_data.cum_sample_count
300 nsamples = state.sample_count
301 secs_per_sample = state.accumulated_time / nsamples
9ae3cc6 @bos Track line numbers properly.
authored
302 basename = os.path.basename(call_data.key.filename)
a85bd0c Initial commit
Jeff Muizelaar authored
303
02ddd9e @DurhamG Add a format option to group results by function.
DurhamG authored
304 self.lineno = call_data.key.lineno
305 self.filepath = call_data.key.filename
306 self.filename = basename
307 self.function = call_data.key.name
308 self.name = '%s:%d:%s' % (self.filename, self.lineno, self.function)
a85bd0c Initial commit
Jeff Muizelaar authored
309 self.pcnt_time_in_proc = self_samples / nsamples * 100
310 self.cum_secs_in_proc = cum_samples * secs_per_sample
311 self.self_secs_in_proc = self_samples * secs_per_sample
312 self.num_calls = None
313 self.self_secs_per_call = None
314 self.cum_secs_per_call = None
315
6015918 @bos Add support for display to non-stdout
authored
316 def display(self, fp):
317 print >> fp, ('%6.2f %9.2f %9.2f %s' % (self.pcnt_time_in_proc,
318 self.cum_secs_in_proc,
319 self.self_secs_in_proc,
320 self.name))
a85bd0c Initial commit
Jeff Muizelaar authored
321
322
02ddd9e @DurhamG Add a format option to group results by function.
DurhamG authored
323 class DisplayFormats:
324 ByLine = 0
325 ByMethod = 1
326
327
328 def display(fp=None, format=0):
69b84b2 @bos Improve docs
authored
329 '''Print statistics, either to stdout or the given file object.'''
330
6015918 @bos Add support for display to non-stdout
authored
331 if fp is None:
e9f5e2e @jedbrown Need to import sys to use sys.stdout
jedbrown authored
332 import sys
6015918 @bos Add support for display to non-stdout
authored
333 fp = sys.stdout
a85bd0c Initial commit
Jeff Muizelaar authored
334 if state.sample_count == 0:
6015918 @bos Add support for display to non-stdout
authored
335 print >> fp, ('No samples recorded.')
a85bd0c Initial commit
Jeff Muizelaar authored
336 return
337
02ddd9e @DurhamG Add a format option to group results by function.
DurhamG authored
338 if format == DisplayFormats.ByLine:
339 display_by_line(fp)
340 elif format == DisplayFormats.ByMethod:
341 display_by_method(fp)
342 else:
343 raise Exception("Invalid display format")
344
345 print >> fp, ('---')
346 print >> fp, ('Sample count: %d' % state.sample_count)
347 print >> fp, ('Total time: %f seconds' % state.accumulated_time)
348
349
350 def display_by_line(fp):
351 '''Print the profiler data with each sample line represented
352 as one row in a table. Sorted by self-time per line.'''
9ae3cc6 @bos Track line numbers properly.
authored
353 l = [CallStats(x) for x in CallData.all_calls.itervalues()]
a85bd0c Initial commit
Jeff Muizelaar authored
354 l.sort(reverse=True, key=lambda x: x.self_secs_in_proc)
355
6015918 @bos Add support for display to non-stdout
authored
356 print >> fp, ('%5.5s %10.10s %7.7s %-8.8s' %
357 ('% ', 'cumulative', 'self', ''))
456e86a @bos Kill off trailing whitespace
authored
358 print >> fp, ('%5.5s %9.9s %8.8s %-8.8s' %
6015918 @bos Add support for display to non-stdout
authored
359 ("time", "seconds", "seconds", "name"))
a85bd0c Initial commit
Jeff Muizelaar authored
360
361 for x in l:
6015918 @bos Add support for display to non-stdout
authored
362 x.display(fp)
a85bd0c Initial commit
Jeff Muizelaar authored
363
02ddd9e @DurhamG Add a format option to group results by function.
DurhamG authored
364 def get_line_source(filename, lineno):
365 '''Gets the line text for the line in the file.'''
366 lineno -= 1
367 fp = None
368 try:
369 fp = open(filename)
370 for i, line in enumerate(fp):
371 if i == lineno:
372 return line
373 except Exception:
374 pass
375 finally:
376 if fp:
377 fp.close()
378
379 return ""
380
381 def display_by_method(fp):
382 '''Print the profiler data with each sample function represented
383 as one row in a table. Important lines within that function are
384 output as nested rows. Sorted by self-time per line.'''
385 print >> fp, ('%5.5s %10.10s %7.7s %-8.8s' %
386 ('% ', 'cumulative', 'self', ''))
387 print >> fp, ('%5.5s %9.9s %8.8s %-8.8s' %
388 ("time", "seconds", "seconds", "name"))
389
390 calldata = [CallStats(x) for x in CallData.all_calls.itervalues()]
391
392 grouped = defaultdict(list)
393 for call in calldata:
394 grouped[call.filename + ":" + call.function].append(call)
395
396 # compute sums for each function
397 functiondata = []
398 for fname, samples in grouped.iteritems():
399 total_cum_sec = 0
400 total_self_sec = 0
401 total_percent = 0
402 for sample in samples:
403 total_cum_sec += sample.cum_secs_in_proc
404 total_self_sec += sample.self_secs_in_proc
405 total_percent += sample.pcnt_time_in_proc
406 functiondata.append((fname,
407 total_cum_sec,
408 total_self_sec,
409 total_percent,
410 samples))
411
412 # sort by total self sec
413 functiondata.sort(reverse=True, key=lambda x: x[2])
414
415 for function in functiondata:
416 print >> fp, ('%6.2f %9.2f %9.2f %s' % (function[3], # total percent
417 function[1], # total cum sec
418 function[2], # total self sec
419 function[0])) # file:function
420 function[4].sort(reverse=True, key=lambda i: i.self_secs_in_proc)
421 for call in function[4]:
422 # only show line numbers for significant locations ( > 1% time spent)
423 if call.pcnt_time_in_proc > 1:
424 source = get_line_source(call.filepath, call.lineno).strip()
425 if len(source) > 25:
426 source = source[:20] + "..."
427
428 print >> fp, ('%33.0f%% %6.2f line %s: %s' % (call.pcnt_time_in_proc,
429 call.self_secs_in_proc,
430 call.lineno,
431 source))
Something went wrong with that request. Please try again.