7c0c5a7 @tmm1 Update README to markdown
authored Dec 22, 2010
# perftools.rb

google-perftools for ruby code
(c) 2010 Aman Gupta (tmm1)
http://www.ruby-lang.org/en/LICENSE.txt

## Usage (in a webapp)

Use [rack-perftools_profiler](https://github.com/bhb/rack-perftools_profiler):

    require 'rack/perftools_profiler'
    config.middleware.use ::Rack::PerftoolsProfiler, :default_printer => 'gif'

Add `profile=true` to the query string to profile a request:

    curl -o 10_requests_to_homepage.gif "http://localhost:3000/homepage?profile=true&times=10"

## Usage (from Ruby)

Run the profiler with a block:

    require 'perftools'
    PerfTools::CpuProfiler.start("/tmp/add_numbers_profile") do
      5_000_000.times{ 1+2+3+4+5 }
    end

Start and stop the profiler manually:

    require 'perftools'
    PerfTools::CpuProfiler.start("/tmp/add_numbers_profile")
    5_000_000.times{ 1+2+3+4+5 }
    PerfTools::CpuProfiler.stop

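When starting and stopping manually, an exception in the profiled code can skip the `stop` call. The sketch below wraps the two calls so that `stop` always runs. `profile_to` and `RecordingProfiler` are hypothetical names used only for illustration; the stand-in class exists so the sketch runs without the gem installed, and in real use you would pass `PerfTools::CpuProfiler` instead.

```ruby
# Sketch: stop the profiler even if the profiled code raises.
# `profiler` is any object that responds to start(path) and stop.
# In real use this would be PerfTools::CpuProfiler (after require 'perftools').
def profile_to(path, profiler)
  profiler.start(path)
  begin
    yield
  ensure
    profiler.stop
  end
end

# Stand-in that only records calls, so the sketch runs without perftools.rb.
class RecordingProfiler
  attr_reader :events

  def initialize
    @events = []
  end

  def start(path)
    @events << [:start, path]
  end

  def stop
    @events << [:stop]
  end
end

fake = RecordingProfiler.new
begin
  profile_to('/tmp/add_numbers_profile', fake) { raise 'boom' }
rescue RuntimeError
  # The error still propagates to the caller, but stop has already run.
end
p fake.events
```

With the gem installed, the call would look like `profile_to('/tmp/add_numbers_profile', PerfTools::CpuProfiler) { ... }`.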
## Usage (externally)

Profile an existing ruby application without modifying it:

    $ CPUPROFILE=/tmp/my_app_profile \
      RUBYOPT="-r`gem which perftools | tail -1`" \
      ruby my_app.rb

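The same launch can be done from a Ruby script by passing the variables to the child process only. This is a rough sketch: `external_profile_env` is a hypothetical helper, and `/path/to/perftools.bundle` is a placeholder for whatever `gem which perftools | tail -1` prints on your machine. The demonstration at the end does not load perftools at all; it only shows that a variable passed this way is visible to the child.

```ruby
require 'rbconfig'

# Build the environment used by the shell command above.
# `perftools_lib` is the path printed by `gem which perftools`; it is passed
# in rather than looked up so the sketch runs without the gem installed.
def external_profile_env(profile_path, perftools_lib)
  {
    'CPUPROFILE' => profile_path,
    'RUBYOPT'    => "-r#{perftools_lib}"
  }
end

# Placeholder library path (assumption, replace with the real one).
env = external_profile_env('/tmp/my_app_profile', '/path/to/perftools.bundle')

# Real use would be:
#   system(env, RbConfig.ruby, 'my_app.rb')

# Demonstration without the gem: the child process sees CPUPROFILE.
child_env = { 'CPUPROFILE' => env['CPUPROFILE'] }
seen = IO.popen([child_env, RbConfig.ruby, '-e', 'print ENV["CPUPROFILE"]'], &:read)
puts seen
```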
## Profiler Modes

The profiler can be run in one of several modes, set via an environment
variable before the library is loaded:

* `CPUPROFILE_REALTIME=1`

    Use walltime instead of cputime profiling. This will capture all time spent in a method, even if it does not involve the CPU.

    For example, `sleep()` is not expensive in terms of cputime, but very expensive in walltime. walltime will also show functions spending a lot of time in network i/o.

* `CPUPROFILE_OBJECTS=1`

    Profile object allocations instead of cpu/wall time. Each sample represents one object created inside that function.

* `CPUPROFILE_METHODS=1`

    Profile method calls. Each sample represents one method call made inside that function.

The sampling frequency of the profiler can be adjusted to collect more
(for better profile detail) or fewer samples (for lower overhead):

* `CPUPROFILE_FREQUENCY=500`

    The default sampling frequency is 100 times a second. The valid range is 1-4000.

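Because these variables have to be in place before the library is loaded, it can help to collect them in one spot. The following is a minimal sketch, not part of perftools.rb: `profiler_env`, `MODE_VARIABLES` and the `:cputime`/`:walltime` mode names are made up for this example; only the environment variable names and the 1-4000 range come from the list above.

```ruby
# Hypothetical mapping from a mode name to the variable documented above.
MODE_VARIABLES = {
  cputime:  nil,                     # default mode, no extra variable
  walltime: 'CPUPROFILE_REALTIME',
  objects:  'CPUPROFILE_OBJECTS',
  methods:  'CPUPROFILE_METHODS'
}.freeze

FREQUENCY_RANGE = (1..4000).freeze

# Returns a hash of environment variables for the requested mode.
def profiler_env(mode: :cputime, frequency: nil)
  raise ArgumentError, "unknown mode: #{mode.inspect}" unless MODE_VARIABLES.key?(mode)

  env = {}
  variable = MODE_VARIABLES[mode]
  env[variable] = '1' if variable

  if frequency
    unless FREQUENCY_RANGE.cover?(frequency)
      raise ArgumentError, "frequency must be within #{FREQUENCY_RANGE}"
    end
    env['CPUPROFILE_FREQUENCY'] = frequency.to_s
  end

  env
end

settings = profiler_env(mode: :walltime, frequency: 500)
p settings

# In real use, apply the settings first and load the library afterwards:
#   ENV.update(settings)
#   require 'perftools'
```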
## Reporting

    pprof.rb --text /tmp/add_numbers_profile

    pprof.rb --pdf /tmp/add_numbers_profile > /tmp/add_numbers_profile.pdf

    pprof.rb --gif /tmp/add_numbers_profile > /tmp/add_numbers_profile.gif

    pprof.rb --callgrind /tmp/add_numbers_profile > /tmp/add_numbers_profile.grind
    kcachegrind /tmp/add_numbers_profile.grind

    pprof.rb --gif --focus=Integer /tmp/add_numbers_profile > /tmp/add_numbers_custom.gif

    pprof.rb --text --ignore=Gem /tmp/my_app_profile


For more options, see the [pprof documentation](http://google-perftools.googlecode.com/svn/trunk/doc/cpuprofile.html#pprof).

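If reports are generated from a script, the command lines above can be assembled as argument arrays, which avoids shell quoting issues with `--focus` and `--ignore` patterns. This is only a sketch: `pprof_command` is a hypothetical helper, and it covers just the four output formats and two filters shown above.

```ruby
# Output formats that appear in the commands above.
PPROF_FORMATS = %i[text pdf gif callgrind].freeze

# Build an argument array for pprof.rb (hypothetical helper).
def pprof_command(profile, format: :text, focus: nil, ignore: nil)
  unless PPROF_FORMATS.include?(format)
    raise ArgumentError, "unsupported format: #{format.inspect}"
  end

  command = ['pprof.rb', "--#{format}"]
  command << "--focus=#{focus}" if focus
  command << "--ignore=#{ignore}" if ignore
  command << profile
end

gif = pprof_command('/tmp/add_numbers_profile', format: :gif, focus: 'Integer')
p gif

# Real use needs pprof.rb on the PATH; the :out option redirects stdout
# to a file, like `>` in the shell:
#   system(*gif, out: '/tmp/add_numbers_custom.gif')
```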
### Examples

#### pprof.rb --text

    Total: 1735 samples
        1487  85.7%  85.7%     1735 100.0% Integer#times
         248  14.3% 100.0%      248  14.3% Fixnum#+

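Each row of the text report has six columns: samples in the function itself, that count as a percentage, the running cumulative percentage, samples in the function plus its callees, that count as a percentage, and the function name. The sketch below parses the report above into those fields. `parse_pprof_text` and `PprofRow` are names invented for this example, and the conversion to seconds assumes the profile was taken in cputime mode at the default 100 samples per second, which the report itself does not state.

```ruby
PprofRow = Struct.new(:self_samples, :self_pct, :cumulative_pct,
                      :total_samples, :total_pct, :name)

ROW_PATTERN = /\A(\d+)\s+([\d.]+)%\s+([\d.]+)%\s+(\d+)\s+([\d.]+)%\s+(.+)\z/

# Returns [total_samples, rows] for a `pprof.rb --text` report.
def parse_pprof_text(text)
  total = nil
  rows = []
  text.each_line do |raw|
    line = raw.strip
    if (m = line.match(/\ATotal:\s+(\d+)\s+samples/))
      total = m[1].to_i
    elsif (m = line.match(ROW_PATTERN))
      rows << PprofRow.new(m[1].to_i, m[2].to_f, m[3].to_f,
                           m[4].to_i, m[5].to_f, m[6])
    end
  end
  [total, rows]
end

report = <<~TEXT
  Total: 1735 samples
      1487  85.7%  85.7%     1735 100.0% Integer#times
       248  14.3% 100.0%      248  14.3% Fixnum#+
TEXT

total, rows = parse_pprof_text(report)
hottest = rows.max_by(&:self_samples)
puts "#{hottest.name}: #{hottest.self_samples} of #{total} samples"

# Assumption: cputime mode at the default 100 samples per second, so each
# sample stands for roughly 1/100 s of CPU time.
approx_cpu_seconds = total / 100.0
puts "about #{approx_cpu_seconds} s of CPU time"
```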
#### pprof.rb --gif

* Simple [require 'rubygems'](http://perftools-rb.rubyforge.org/examples/rubygems.gif) profile

* Comparing redis-rb [with](http://perftools-rb.rubyforge.org/examples/redis-rb.gif) and [without](http://perftools-rb.rubyforge.org/examples/redis-rb-notimeout.gif) SystemTimer based socket timeouts

* [Sinatra](http://perftools-rb.rubyforge.org/examples/sinatra.gif) vs. [Merb](http://perftools-rb.rubyforge.org/examples/merb.gif) vs. [Rails](http://perftools-rb.rubyforge.org/examples/rails.gif)

* C-level profile of EventMachine + epoll + Ruby threads [before](http://perftools-rb.rubyforge.org/examples/eventmachine-epoll+nothreads.gif) and [after](http://perftools-rb.rubyforge.org/examples/eventmachine-epoll+threads.gif) a [6 line EM bugfix](http://timetobleed.com/6-line-eventmachine-bugfix-2x-faster-gc-1300-requestssec/)

* C-level profile of a [ruby/rails vm](http://perftools-rb.rubyforge.org/examples/ruby_interpreter.gif)
    * 12% time spent in re_match_exec because of excessive calls to rb_str_sub_bang by Date.parse


## Installation

Just install the gem, which will download, patch and compile google-perftools for you:

    sudo gem install perftools.rb

Or build your own gem:

    git clone git://github.com/tmm1/perftools.rb
    cd perftools.rb
    gem build perftools.rb.gemspec
    gem install perftools.rb


You'll also need graphviz to generate call graphs using dot, and ghostscript (which provides `ps2pdf`) for PDF output:

    brew install graphviz ghostscript            # osx
    sudo apt-get install graphviz ghostscript    # debian/ubuntu

## Advantages over ruby-prof

* Sampling profiler

    * perftools samples your process using setitimer() so it can be used in production with minimal overhead.


## Profiling the Ruby VM and C extensions

To profile C code, download and build an unpatched perftools (libunwind or ./configure --enable-frame-pointers required on x86_64).

Download:

    wget http://google-perftools.googlecode.com/files/google-perftools-1.6.tar.gz
    tar zxvf google-perftools-1.6.tar.gz
    cd google-perftools-1.6

Compile:

    ./configure --prefix=/opt
    make
    sudo make install

Profile:

    export LD_PRELOAD=/opt/lib/libprofiler.so                 # for linux
    export DYLD_INSERT_LIBRARIES=/opt/lib/libprofiler.dylib   # for osx
    CPUPROFILE=/tmp/ruby_interpreter.profile ruby -e' 5_000_000.times{ "hello world" } '

Report:

    pprof `which ruby` --text /tmp/ruby_interpreter.profile


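Exporting `LD_PRELOAD` or `DYLD_INSERT_LIBRARIES` affects every command started from that shell afterwards. One alternative is to set the variable only for the process being profiled. The sketch below picks the variable by platform; `preload_env` is a hypothetical helper, and the `/opt/lib` default simply mirrors the `--prefix=/opt` used in the compile step above.

```ruby
require 'rbconfig'

# Build the environment for a C-level profile of a single child process.
# host_os is a parameter so the platform choice can be checked on any machine.
def preload_env(profile_path, lib_dir: '/opt/lib', host_os: RbConfig::CONFIG['host_os'])
  env = { 'CPUPROFILE' => profile_path }
  if host_os =~ /darwin/
    env['DYLD_INSERT_LIBRARIES'] = File.join(lib_dir, 'libprofiler.dylib')
  else
    env['LD_PRELOAD'] = File.join(lib_dir, 'libprofiler.so')
  end
  env
end

linux = preload_env('/tmp/ruby_interpreter.profile', host_os: 'linux-gnu')
osx   = preload_env('/tmp/ruby_interpreter.profile', host_os: 'darwin10')
p linux
p osx

# Real use (requires the libprofiler build from the steps above):
#   system(preload_env('/tmp/ruby_interpreter.profile'),
#          RbConfig.ruby, '-e', '5_000_000.times{ "hello world" }')
```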
## TODO

* Add support for heap profiling to find memory leaks (PerfTools::HeapProfiler)
* Allow both C and Ruby profiling
* Add setter for the sampling interval


## Resources

* [GoRuCo 2009 Lightning Talk on perftools.rb](http://goruco2009.confreaks.com/30-may-2009-18-35-rejectconf-various-presenters.html) @ 21:52

* [Ilya Grigorik's introduction to perftools.rb](http://www.igvita.com/2009/06/13/profiling-ruby-with-googles-perftools/)

* [Google Perftools](http://code.google.com/p/google-perftools/)

* [Analyzing profiles and interpreting different output formats](http://google-perftools.googlecode.com/svn/trunk/doc/cpuprofile.html#pprof)