= RFuzz HTTP Destroyer

RFuzz is the start of a Ruby based HTTP thrasher, destroyer, fuzzer, and client
based on the Mongrel project's HTTP parser and the statistical analysis of
being very mean to a web server.

At the moment it has a working and fairly extensive HTTP 1.1 client and some
basic statistics math borrowed from the Mongrel project.

== RubyForge Project

The project is hosted at:

  http://rubyforge.org/projects/rfuzz/

Where you can file bugs and other things, as well as download gems manually.


== Motivation

The motivation for RFuzz comes from little scripts I've written during Mongrel
development to "fuzz" or attack the Mongrel code.

RFuzz will simply use the built-in ultra-correct HTTP client and a Ruby DSL to
let you write scripts that exploit servers, thrash them with random data, or
simply run test suites.

It may also perform analysis of performance data and work as a simple load or
pen testing tool. This is only a secondary goal though, since there are plenty
of good tools for that.

== Installing

You can install RFuzz by simply using RubyGems:

  sudo gem install rfuzz

It doesn't support Windows unless you have build tools that can compile
modules against Ruby. No, you don't get this with Ruby One Click.


== RFuzz HTTP Client

RFuzz also comes from not being satisfied with the stock net/http library.
While this library is good for high-level HTTP access to resources, it is much
too abstract and protective to be used in a fuzzing tool.

In a tool such as RFuzz you need the following features in an HTTP
client library:

1. No protection from exceptions, so you can analyze exactly what's happening.
2. Ability to "throttle" the client to simulate different kinds of request loads.
3. No threading or additional overhead to test the impact of threads, but thread safe.
4. Ability to encode the majority of the request as data elements for loading.
5. A fast and exact HTTP parser to validate that the server's response is correct.
6. Tracking of cookies between requests to keep session data going.

RFuzz::HttpClient supports all of these features already, with cookies being
the weakest right now.

=== Using The Client

The client is designed so that you create an RFuzz::HttpClient object once with
all the common parameters and the host you want to talk with, and then you call
a series of methods on the client object that match the HTTP methods GET, POST,
PUT, DELETE, and HEAD. You can add more methods if you like (see the documentation).

Here's a simple example:

  require 'rfuzz/client'

  cl = RFuzz::HttpClient.new("www.google.com", 80, :query => {"q" => "zed shaw"})

  resp = cl.get("/search")
  resp.http_body.grep(/zed/)
  => ["<html><head><meta HTTP-EQUIV=\"content-type\" CONTENT=\"text/html;
  charset=ISO-8859-1\"><title>zed shaw - Google Search</title><style><!--\n"]

  resp = cl.get("/search", :query => {"q" => "frank"})
  resp.http_body.grep(/frank/)
  => ["<html><head><meta HTTP-EQUIV=\"content-type\" CONTENT=\"text/html;
  charset=ISO-8859-1\"><title>frank - Google Search</title><style><!--\n"]

Notice that we created the client with a default :query that searches for my
name (Zed Shaw), so the first request only needed cl.get("/search"). In the
second request we set :query to something else (a search for "frank") and it
overrides the default parameters. This makes it possible to set common
parameters, cookies, and headers in blocks of requests to reduce repetition.
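
The override behavior is essentially hash merging: per-request options win,
and client-wide defaults fill in the rest. A minimal sketch of the idea
(illustrative only, not rfuzz's actual internals; merge_defaults is a
made-up helper name):

```ruby
# Illustrative: per-request options override client-wide defaults,
# the way :query behaves in the example above.
def merge_defaults(defaults, options)
  defaults.merge(options)   # keys in options win; missing keys fall back
end

client_defaults = { "q" => "zed shaw", "hl" => "en" }
merge_defaults(client_defaults, { "q" => "frank" })
# => {"q"=>"frank", "hl"=>"en"}
```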

=== Client Limitations

The client handles chunked encoding inside the parser, but the code for it is
still quite nasty. I'll be attacking that and cleaning it up very soon. Even
so, it's able to efficiently parse chunked encodings without many problems
(though it could be better).

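For reference, here's a toy decoder showing what parsing chunked encoding
involves: each chunk is a hex size line, CRLF, that many bytes of payload, and
a trailing CRLF, with a zero-size chunk ending the body. This is a simplified
illustration (it ignores trailers and chunk extensions), not the parser rfuzz
actually uses:

```ruby
# Toy HTTP/1.1 chunked-body decoder, for illustration only.
def decode_chunked(data)
  body = ""
  until data.empty?
    size_line, data = data.split("\r\n", 2)   # hex chunk-size line
    size = size_line.to_i(16)
    break if size == 0                        # zero-size chunk ends the body
    body << data[0, size]                     # take the chunk's payload
    data = data[(size + 2)..-1] || ""         # skip payload plus its CRLF
  end
  body
end

decode_chunked("5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n")
# => "hello world"
```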
It also can't parse cookies properly yet, so the above example kind of works,
but the cookie isn't returned right.

== Randomness Generator

RFuzz features a RandomGenerator class that uses the ArcFour random number
generation algorithm to generate lots of random garbage very fast in various
formats. RFuzz will use this to send the garbage it needs to the application
in an attempt to find forms that can't handle nastiness, badly implemented
servers, etc. It's amazing how many bugs you can actually find by sending
junk to an application.

The types of randomness you can generate are:

* words -- RFuzz includes a simple word list, but you can add your own.
* base64 -- Arrays of base64 encoded junk.
* byte_array -- Arrays of just junk.
* uris -- Arrays of URIs composed of words strung together with /.
* ints -- Random integers (with an allowed maximum).
* floats -- Random floats.
* headers, queries -- Hashes of key=value pairs where the keys and values can be any of the above.

The ArcFour fuzzrnd random generator is in a C extension, so it's small and fast.
A big advantage of fuzzrnd is that it generates the same stream of random bytes
for the same input seed. This lets you set a seed, and then if you find an
error you can replay the same attack while still using random data.
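
That replay property is easy to see in a pure-Ruby sketch of the ArcFour (RC4)
keystream. This is only an illustration of the algorithm, not the actual
fuzzrnd C extension:

```ruby
# Minimal ArcFour (RC4) keystream generator: identical seeds always
# produce identical byte streams, so a failing fuzz run can be replayed.
class ArcFour
  def initialize(seed)
    @s = (0..255).to_a
    key = seed.bytes
    j = 0
    256.times do |i|                     # key-scheduling algorithm
      j = (j + @s[i] + key[i % key.size]) % 256
      @s[i], @s[j] = @s[j], @s[i]
    end
    @i = @j = 0
  end

  def next_byte                          # pseudo-random generation step
    @i = (@i + 1) % 256
    @j = (@j + @s[@i]) % 256
    @s[@i], @s[@j] = @s[@j], @s[@i]
    @s[(@s[@i] + @s[@j]) % 256]
  end

  def bytes(n)
    Array.new(n) { next_byte }
  end
end

rng = ArcFour.new("my seed")
rng.bytes(4)   # the same seed always yields the same four bytes
```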

An example of using RandomGenerator is:

  g = RFuzz::RandomGenerator.new(open("resources/words.txt").read.split("\n"))
  h = g.headers(2,4,type=:ints)
  => [{1398667391=>2615968266, 465122870=>2683411899, 2100652296=>4131806743,
  158954822=>2544978312}, {3126281447=>2247028995, 269763016=>1444943723,
  2401569363=>1661839605, 2811294153=>400252371}]

As you can see, this produces 2 hashes consisting of 4 key=value pairs with
integers in them. You can quickly replace type=:ints with type=:words and,
using the included dictionary of words, get:

  => [{"Europeanizes"=>"Byronize's", "royalization's"=>"Americanizer's",
  "celiorrhea"=>"unliteralized", "unvictimized"=>"doctrinize"},
  {"pouder"=>"unchloridized", "chattelize"=>"unmodernize",
  "uncrystallizability"=>"uncenter", "Egyptianization's"=>"ostracization's"}]

= Fuzzing Sessions And Statistics

The main way you'll use RFuzz is through the RFuzz::Session class, which
performs RFuzz runs and stores the results in various .csv files for analysis
later. RFuzz takes the stance that it shouldn't be used for analyzing the
data; rather, it should generate information that you can put through a
better tool. Examples of such tools are R, gnuplot, ploticus, or a spreadsheet.

The Session class is initialized in a similar fashion to the HttpClient, except
you can't set the :notifier (it's used to collect statistics about the requests).
Once you have a Session object you call its Session#run method to do a run
over a set of samples, putting your tests inside a block.

When a run is done it saves the results to two CSV files so you can analyze them.

Here's a small sample of how Session is used:

  require 'rfuzz/session'
  include RFuzz
  s = Session.new :host => "localhost", :port => 3000
  s.run 5, :save_as => ["runs.csv","counts.csv"] do |c,r|
    uris = r.uris(50,r.num(30))
    uris.each do |u|
      s.count_errors(:words) do
        resp = c.get(u)
        s.count resp.http_status
      end
    end
  end

If you run this (with a server at localhost:3000) you'll find two
files in the current directory: runs.csv and counts.csv. These files
might look like this:

  -- runs.csv --
  run,name,sum,sumsq,n,mean,sd,min,max
  0,request,0.517807,0.010310748693,50.0,0.01035614,0.0100491312529583,0.001729,0.074479
  1,request,0.48696,0.010552774434,50.0,0.0097392,0.0108892135376889,0.001667,0.081887
  2,request,0.322049,0.004898592637,50.0,0.00644098,0.00759199560893725,0.000806,0.057761
  3,request,0.271233,0.004324191489,50.0,0.00542466,0.00763028964494234,0.000828,0.057182
  4,request,0.27697,0.001659079814,50.0,0.0055394,0.00159611899203497,0.000791,0.010722

  -- counts.csv --
  run,404,200
  0,46,4
  1,41,9
  2,48,2
  3,42,8
  4,49,1

You can then easily load these two files into any tool you want to analyze
the results.
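
Ruby's standard csv library is enough to pull the run statistics back in for a
quick sanity check before handing them to a heavier tool. A small sketch (the
data is inlined here; in practice you'd use CSV.read("runs.csv", :headers => true)):

```ruby
require 'csv'

# Inlined copy of the runs.csv header and first two rows from above.
data = <<RUNS
run,name,sum,sumsq,n,mean,sd,min,max
0,request,0.517807,0.010310748693,50.0,0.01035614,0.0100491312529583,0.001729,0.074479
1,request,0.48696,0.010552774434,50.0,0.0097392,0.0108892135376889,0.001667,0.081887
RUNS

rows  = CSV.parse(data, :headers => true)
means = rows.map { |row| row["mean"].to_f }
puts "average per-run mean: %.8f" % (means.sum / means.size)
```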

=== Counts vs. Samples vs. Runs

Something many people get wrong, and which RFuzz tries to implicitly enforce,
is that doing just one run isn't as useful as doing a set of runs. You might
not be familiar with the terminology, so let's cover that first.

* count -- A simple count of some variable during a run.
* sample -- The result of taking a measurement during a run.
* run -- A test that you perform and then collect counts and samples for.

In the above sample script, we are doing the following:

* 5 runs.
* Each run does GET requests for up to 50 randomly selected URIs.
* Errors and HTTP status codes are counted.
* Stats are gathered on the request timing (Session does this automatically).

If you were to structure this as a data structure it would look like this:

  [
    ["run", "name", "sum", "sumsq", "n", "mean", "sd", "min", "max"],
    [0, :request, 0.605363, 0.0149, 50.0, 0.0121, 0.0124, 0.00851, 0.095579],
    [1, :request, 0.520827, 0.0116, 50.0, 0.0104, 0.0112, 0.00189, 0.088004],
    ...
  ]

Taking a look at this, we have run 0, run 1, ... and then each "row" has a
set of statistics we've gathered on the HTTP request (shown as "name"). These
statistics are actually generated from the random 50 URI requests we built
with this piece of code:

  uris = r.uris(50,r.num(30))

This means that each row holds the statistics collected as each request is made
to the 50 randomly generated URIs. If I were to write this out it'd be:

1. Generate 50 random URIs.
2. Request URIs 1-50, recording how long each one takes.
3. Average (with standard deviation) the times for each request.
4. Store this as one "run".
5. Repeat until all the runs are done.

By doing this you cut down on the amount of information you need to analyze
to figure out if a server is behaving correctly. Instead of wading through
tons of data about each request, you just analyze the "meta-statistics" about
the runs.
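
The per-run collapse is ordinary summary statistics. Here's a sketch of how
one runs.csv row (sum, sumsq, n, mean, sd, min, max) can be computed from a
run's raw request timings; this mirrors the CSV layout above, not rfuzz's
internal code:

```ruby
# Collapse one run's request timings into a single summary row.
# sd is the sample standard deviation, derived from sum and sumsq.
def summarize(times)
  n     = times.size.to_f
  sum   = times.sum
  sumsq = times.map { |t| t * t }.sum
  mean  = sum / n
  sd    = Math.sqrt((sumsq - sum * sum / n) / (n - 1))
  { "sum" => sum, "sumsq" => sumsq, "n" => n, "mean" => mean,
    "sd" => sd, "min" => times.min, "max" => times.max }
end

summarize([1.0, 2.0, 3.0])
# => mean 2.0, sd 1.0, min 1.0, max 3.0
```

Each run of 50 requests shrinks to one such row, which is exactly the
"meta-statistics" view described above.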

=== Sample Runs Reduce Error

The reason for doing a series of runs and analyzing their standard deviation (sd)
and means is that it reduces the chance that one long run was just done at the
wrong time or in the wrong situation. If you only ever ran a test once with the
same settings, you might not find out until later that there was some
confounding element which made the test invalid.


== Source Code

You can also view http://www.zedshaw.com/projects/rfuzz/coverage/ for the rcov
generated coverage report, which is also a decent source browser.