
AJAX & JSON support? #50

Closed
sparticvs opened this Issue Aug 2, 2011 · 95 comments

7 participants

@sparticvs

Not sure if it exists, but can Arachni follow AJAX (and, by extension, jQuery-driven requests) to locate pages and additional information, and also read JSON to determine the locations of further potentially exploitable areas?

@Zapotek Zapotek was assigned Aug 2, 2011
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 2, 2011

Unfortunately no, it can't interpret JS...not yet at least.

@Zapotek Zapotek closed this Aug 2, 2011
@sparticvs

Well in that case, this is a feature request :-D

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 4, 2011

Don't worry, this will be implemented as soon as possible.
I've been waiting for Johnson to be updated to work on 1.9.2 so it's going to take a while.

@Zapotek Zapotek reopened this Aug 4, 2011
@sparticvs

Alright, sounds good. Keep up the fantastic work :-D

@sparticvs sparticvs closed this Aug 4, 2011
@sparticvs sparticvs reopened this Aug 4, 2011
@sparticvs

(stupid close button)

@raesene
raesene commented Aug 26, 2011

have you thought about using the ruby racer for this? https://github.com/cowboyd/therubyracer

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2011

I didn't even know about it, this is great.
If/when I manage to hook it up to Taka[1] great things will happen. :D

Thanks very much for the suggestion.

[1] https://github.com/tenderlove/taka

@sparticvs

Yay. I like great things :-D

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 29, 2011

I played with it for a couple of days and the lib works great; Taka also provides some basic DOM support, although I'll need to implement the DOM Event specs myself -- which is the tricky part and a major pain.
So it'll take some time but it's certainly do-able.

Right now I'm working on the Arachni Grid, once I'm done it's DOM's turn.

@sparticvs

Ooooo.....what's Arachni Grid?

@virilefool

Just wanted to second this request. This would be super-useful.

Thanks for the fantastic work!

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Nov 18, 2011

There has been some good news on that front.
A university student has chosen adding AJAX support to Arachni as his final-year project.

Let's all hope he does well. :)

@virilefool

Out of sheer curiosity, any sort of ETA? No idea how university projects work in other countries. :)

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Dec 1, 2011

No idea, but it won't be any time soon, that's for sure.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Jun 21, 2012

I just got a request for an update on this so here it is:

The student's advisor encouraged him to pursue another project, for reasons I can't recall now, and so he did.

Once I was done with the High Performance Grid I was supposed to work on this myself but it got bumped in favor of a full test suite (which is coming along nicely I might add :) ) because at this point it is imperative that we have one.

So once the v0.4.1 milestone[1] is finished then I'll have to either re-write the WebUI in Rails or add JS support or both.

The problem with JS/DOM/AJAX support is that it will require a massive amount of time to implement and since I'm the only dev it makes sense to give preference to fixing old things than implementing grand new ones.
This sucks, but look at what has happened now with having to implement the RSpec suite: I've got lots of issues backed up and haven't released a new version in a long time, but it absolutely had to be done.

There is some good news though: I've got a couple of people interested in helping out with rewriting the new WebUI, which should free me up to make some progress on the JS front.
More volunteers would mean more time for me to work on JS integration so if some of you would be interested in lending a hand as well that'd be great.

Bottom line, depending on how many people help out, JS integration should be coming in v0.4.2 or v0.4.3 -- barring any unforeseen circumstances.

[1] https://github.com/Arachni/arachni/issues?milestone=4

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Jul 2, 2012

I just read this: http://blog.watchfire.com/wfblog/2012/07/announcing-xss-analyzer.html
I don't like companies marketing common sense as genius so let's let them know we're coming.

The new version will indeed include some very basic JS/AJAX functionality, it will be optional, extremely unstable and preliminary but it will get the ball rolling.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Jul 3, 2012

So much for being dramatic, TheRubyRacer is overhauling their Ruby->JS conversion system.
This goes back to the v0.4.2 milestone, argh.

@tbif
tbif commented Jul 6, 2012

This might be a little overkill/performance-killing, but one idea could be integrating Arachni with watir-webdriver/headless and using an existing browser's JavaScript engine to do all the work. Arachni could be set up as a proxy and all requests made from within the browser could be captured. At my work we're using HP WebInspect/AMP, and from what I gather they just launch 20 or so instances of Firefox in the background and let Firefox do all the work.

http://watirwebdriver.com/

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Jul 6, 2012

That has crossed my mind but I'd rather do things properly and write my own DOM.
I know it's going to be a herculean task but it will pay off hugely -- and if I fail and end up looking like an idiot then I can revert to WATIR (and just delete these comments and make it look like it was my idea from the get go).

The problem with these big commercial products is that they are pressured by dozens of different factors and just have to settle with whatever gets the job done most of the time.
If they fuck up they can't just say:

Oops, our bad, we'll try something else next time.

Whereas I can and it has sort of paid off thus far.

Now from a S/W engineering standpoint, that's either:
1. Freaking crazy and a damn embarrassment for them -- spawning a couple dozen browsers in the background is hugely inefficient, resource intensive and inelegant. Downright atrocious if you ask me.
2. The hard, bold and right choice -- I'm trying to come up with something to justify this but I just circle back to 1.

Thankfully, a kind web developer and fellow security enthusiast has recently joined the team (I won't say his name until he gets off his ass and pushes the code to the repo, ya hear?) and has bravely taken it upon himself to write a kick-ass new WebUI, which should leave me free to work on the DOM/JS/AJAX implementation.

To sum up:
I'd rather take a stab at it myself but if I mess up then I'll go with WATIR -- either case this will happen so it's a win-win.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Jul 13, 2013

Work on v0.5 has begun and this task's turn is coming up soon. I just found out that WATIR now supports PhantomJS and a week ago I came up with a solution which will allow me to use a more conventional way of testing (instead of writing my own DOM) without sacrificing performance.

The way I've got this figured out is:

  • Crawler stays quick and dumb, just like now.
  • Pages are audited just like now.
    • While the responses of the normal audit are being harvested, a new thread is spawned which is used to communicate with PhantomJS via WATIR and have it evaluate the JS/AJAX stuff on the page.
    • If there are any new, dynamic, elements found, they are queued to be audited as well.

So, I'm planning on using the high latency of the normal audit to hide the latency of the JS/AJAX analysis. And since pages are audited in series and take some time, just one instance of PhantomJS will be enough as its workload will be negligible.
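The latency-hiding scheme described above can be sketched in plain Ruby. Note that the helper names below are hypothetical stand-ins, not actual Arachni APIs: the browser analysis runs on a background thread while the slow HTTP harvesting happens, so its cost is mostly hidden.

```ruby
require 'thread'

# Hypothetical stand-in for the PhantomJS-via-WATIR analysis.
def analyze_dom(page)
  sleep 0.1
  ['#ajax-form'] # pretend the browser discovered one new dynamic element
end

# Hypothetical stand-in for the normal audit's response harvesting.
def harvest_http_responses(page)
  sleep 0.2 # the slow part, whose latency hides the browser's
end

def audit_page(page)
  results  = Queue.new
  analyzer = Thread.new { results << analyze_dom(page) }
  harvest_http_responses(page) # browser analysis runs in parallel with this
  analyzer.join
  results.pop # new, dynamic elements to be queued for audit
end
```

Since the HTTP harvesting dominates, the browser thread finishes well before it is needed, which is why a single PhantomJS instance can suffice.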

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Jul 21, 2013

Some performance optimizations to make things smoother:

  • Pre-load page resources -- have static CSS and JavaScript files available ahead of time.
    • Crawl-time pre-load.
      • The crawler will have most likely visited these resources so they can be stored in the pre-load cache during the crawl.
    • Audit-time pre-load -- Can be in addition to the crawl-time one and also act as a fail-safe.
      • Queue requests for them along with the ones for the normal audit, but put them at the top of the queue.
      • Once all of them are available start the DOM/JS/AJAX analysis.
    • Once PhantomJS tries to load these during the DOM/JS/AJAX analysis, serve them from the pre-load cache.
  • Cache static resources like CSS and JS files for the duration of the scan -- careful with this one, needs more research.
  • Disable graphics altogether -- make this optional or just allow users to do it via the regular exclusion filters.
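The pre-load cache idea above could look roughly like this (hypothetical class, not Arachni code): static CSS/JS bodies seen during the crawl are stored, then served to PhantomJS during DOM analysis without a network round-trip.

```ruby
# Hypothetical sketch of the crawl/audit-time pre-load cache described above.
class PreloadCache
  def initialize
    @cache = {}
  end

  # Called during the crawl: keep only static CSS/JS resource bodies.
  def store(url, body)
    @cache[url] = body if url =~ /\.(css|js)\z/
  end

  # Called when PhantomJS requests a resource during DOM analysis;
  # nil means "not cached, fall through to a real HTTP request".
  def fetch(url)
    @cache[url]
  end
end
```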
@Zapotek Zapotek added a commit that referenced this issue Jul 28, 2013
@Zapotek Zapotek Arachni::Browser: Added #trigger_events
[Issue #50]
8842102
@Zapotek Zapotek added a commit that referenced this issue Jul 28, 2013
@Zapotek Zapotek Arachni::Browser#trigger_events: Updated specs
[Issue #50]
6379b43
@Zapotek Zapotek added a commit that referenced this issue Jul 29, 2013
@Zapotek Zapotek Arachni::Browser#cookies: Normalize paths
[Issue #50]
bbf6fd3
@Zapotek Zapotek added a commit that referenced this issue Jul 29, 2013
@Zapotek Zapotek Added Browser#wait_for_pending_requests
[Issue #50]

Enables the browser to wait for HTTP requests (like AJAX etc.) to complete
before moving on.
f8b7795
@Zapotek Zapotek added a commit that referenced this issue Jul 30, 2013
@Zapotek Zapotek Browser: Updated to capture page snapshots on changes
[Issue #50]

Every time an event is triggered (or a JS link is clicked), a snapshot of
the changed page is captured so that it can be re-analyzed and audited from
that state.
ef94ce6
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Jul 30, 2013

You'll all be glad to know that today I ran the first successful audit with full-browser coverage -- AJAX audit/discovery and everything. So, things are really coming along.

@Zapotek Zapotek added a commit that referenced this issue Jul 30, 2013
@Zapotek Zapotek Page: Added #has_javascript?
[Issue #50]
e24e1ec
@Zapotek Zapotek added a commit that referenced this issue Jul 30, 2013
@Zapotek Zapotek Added RPC::Server::Browser and RPC::Client::Browser
[Issue #50]

RPC::Server::Browser allows off-loading the overhead of DOM/JS/AJAX analysis to
a forked process and performing the analysis in parallel with the operations of
the Framework -- the audit in this case.
4d1ca25
@Zapotek Zapotek added a commit that referenced this issue Jul 31, 2013
@Zapotek Zapotek Page#dom_body=: Overrides the response body when parsing
[Issue #50]

Also represents the calculated DOM body of the page instead of the original
response #body.
4cf58f2
@Zapotek Zapotek added a commit that referenced this issue Jul 31, 2013
@Zapotek Zapotek Browser: Updated to use Page#dom_body
[Issue #50]

Also updated #goto to capture the initial state of the page.
6357854
@treadie
treadie commented Jul 31, 2013

Awesome progress, I appreciate your efforts

@enricostano

Cool stuff! Any plan of a tutorial covering this? Thanks!

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Jul 31, 2013

I've only pushed the supporting libraries thus far but in a couple of days (or even today maybe) I'll push the updated framework and some instructions on how to run the code from the v0.5 branch so you can start testing it.

@Zapotek Zapotek added a commit that referenced this issue Aug 1, 2013
@Zapotek Zapotek Browser#load: Restores Page state by replaying its #transitions
[Issue #50]

Also added #explore_deep_and_flush which traverses through the entire DOM event
tree taking snapshots of each state.
f730947
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 2, 2013

@enricostano Instructions: https://github.com/Arachni/arachni/tree/v0.5#how-to-run-the-code

It goes without saying that the v0.5 dev codebase is un-optimized, unstable and buggy.

@Zapotek Zapotek added a commit that referenced this issue Aug 5, 2013
@Zapotek Zapotek Browser: Fixed bug causing failure under HTTPS
[Issue #50]
1302c67
@enricostano

thanks @Zapotek ! 😃

@treadie
treadie commented Aug 7, 2013

I tried the v0.5 branch out and it seemed to work well, good work. However, I did run into a few issues. (Environment was built using your scripts + instructions.)

  1. I had this same issue (#362), but that was on the first box I tried it on; I moved boxes and have not seen it since.

  2. It got stuck in a loop when it received a 401 response; this is just a snippet of what I was seeing. The request resulting in the 401 would have been a JSON req. If you need more info, let me know.

 [*] Auditing: [HTTP: 200] https://192.168.61.3/users/login
 [~] Identified as: apache
 [~] DOM depth: 1 (Limit: 10)
 [~] Starting DOM/JS/AJAX analysis in the background.
 [*] Harvesting HTTP responses...
 [~] Depending on server responsiveness and network conditions this may take a while.
 [~] DOM/JS/AJAX analysis resulted in:
 [~]   * 1 page variations
 [~]   * 0 new paths

 [*] Auditing: [HTTP: 401] https://192.168.61.3/
 [~] Identified as: apache
 [~] DOM depth: 1 (Limit: 10)
 [~] Starting DOM/JS/AJAX analysis in the background.
 [*] Harvesting HTTP responses...
 [~] Depending on server responsiveness and network conditions this may take a while.
 [~] DOM/JS/AJAX analysis resulted in:
 [~]   * 1 page variations
 [~]   * 0 new paths

 [*] Auditing: [HTTP: 200] https://192.168.61.3/users/login
 [~] Identified as: apache
 [~] DOM depth: 1 (Limit: 10)
 [~] Starting DOM/JS/AJAX analysis in the background.
 [*] Harvesting HTTP responses...
 [~] Depending on server responsiveness and network conditions this may take a while.
 [~] DOM/JS/AJAX analysis resulted in:
 [~]   * 1 page variations
 [~]   * 0 new paths

 [*] Auditing: [HTTP: 401] https://192.168.61.3/
 [~] Identified as: apache
 [~] DOM depth: 1 (Limit: 10)
 [~] Starting DOM/JS/AJAX analysis in the background.
 [*] Harvesting HTTP responses...
 [~] Depending on server responsiveness and network conditions this may take a while.
 [~] DOM/JS/AJAX analysis resulted in:
 [~]   * 1 page variations
 [~]   * 0 new paths

  3. I was unable to send it through a proxy using the --proxy argument. My cmd was simply ./arachni http://target.com --proxy="192.168.0.1:8080". No auth was needed on the proxy.
[*] Initialising...
 [*] Waiting for plugins to settle...
 [-] HTTP: #<Ethon::Errors::InvalidOption: The option: proxy_username is invalid.
Please try proxyuserpwd instead of proxy_username.>
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/bundler/gems/ethon-465cc54e7752/lib/ethon/easy.rb:234:in `block in set_attributes'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/bundler/gems/ethon-465cc54e7752/lib/ethon/easy.rb:231:in `each_pair'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/bundler/gems/ethon-465cc54e7752/lib/ethon/easy.rb:231:in `set_attributes'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/bundler/gems/ethon-465cc54e7752/lib/ethon/easy/http/actionable.rb:79:in `setup'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/bundler/gems/ethon-465cc54e7752/lib/ethon/easy/http/get.rb:17:in `setup'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/bundler/gems/ethon-465cc54e7752/lib/ethon/easy/http.rb:39:in `http_request'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/easy_factory.rb:51:in `get'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/addable.rb:19:in `add'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/memoizable.rb:40:in `add'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/cacheable.rb:8:in `add'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/block_connection.rb:30:in `add'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/stubbable.rb:21:in `add'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/before.rb:26:in `add'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/runnable.rb:18:in `block (2 levels) in run'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/runnable.rb:17:in `map'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/runnable.rb:17:in `block in run'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/runnable.rb:15:in `loop'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/runnable.rb:15:in `run'
 [-] HTTP: /home/Ubuntu/.rvm/gems/ruby-1.9.3-p194/gems/typhoeus-0.6.3/lib/typhoeus/hydra/memoizable.rb:50:in `run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/typhoeus/hydra.rb:24:in `block in run'
 [-] HTTP: <internal:prelude>:10:in `synchronize'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/typhoeus/hydra.rb:50:in `synchronize'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/typhoeus/hydra.rb:24:in `run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/http/client.rb:553:in `hydra_run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/http/client.rb:156:in `block in run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/utilities.rb:430:in `call'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/utilities.rb:430:in `exception_jail'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/http/client.rb:154:in `run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/http/client.rb:507:in `method_missing'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/spider.rb:143:in `run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/framework.rb:792:in `audit'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/framework.rb:200:in `block in run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/utilities.rb:430:in `call'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/utilities.rb:430:in `exception_jail'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/framework.rb:200:in `run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/ui/cli/cli.rb:101:in `block in run'
 [-] HTTP: 
 [-] HTTP: Parent:
 [-] HTTP: Arachni::HTTP::Client
 [-] HTTP: 
 [-] HTTP: Block:
 [-] HTTP: #<Proc:0x851f204@/home/Ubuntu/arachni/lib/arachni/http/client.rb:154>
 [-] HTTP: 
 [-] HTTP: Caller:
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/utilities.rb:430:in `exception_jail'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/http/client.rb:154:in `run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/http/client.rb:507:in `method_missing'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/spider.rb:143:in `run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/framework.rb:792:in `audit'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/framework.rb:200:in `block in run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/utilities.rb:430:in `call'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/utilities.rb:430:in `exception_jail'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/framework.rb:200:in `run'
 [-] HTTP: /home/Ubuntu/arachni/lib/arachni/ui/cli/cli.rb:101:in `block in run'
 [-] HTTP: --------------------------------------------------------------------------------


@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 7, 2013
  1. Turns out there is a fix, I'll update Arachni to use it.
  2. Haven't really thrown any real weirdness at the browser yet and those kinds of loops are to be expected at this point. Will try to sort it out today.
  3. The proxy thing should have been resolved now by 72cf543.
@treadie
treadie commented Aug 7, 2013

Damn.. you fixed No. 3 only 7 minutes prior to me asking about it.. I should learn to wait a little longer before I complain. :) Cheers

Update: maybe I should learn how to read the time better..

@treadie
treadie commented Aug 7, 2013

Still proxy issues.. similar but different.

HTTP: #<Ethon::Errors::InvalidOption: The option: proxy_type is invalid. Please try proxytype instead of proxy_type.>

I just made the change myself in request.rb and it's now all good. You might want to do the same :)

@Zapotek Zapotek added a commit that referenced this issue Aug 7, 2013
@Zapotek Zapotek HTTP::Request#to_typhoeus: proxy_type =>proxytype
[Issue #50]
37a8fd7
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 7, 2013

2060203 should take care of the interpreter crash.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 7, 2013

@treadie 06b2501 may have fixed the inf loop bug, could you try it and let me know? Cheers

@Zapotek Zapotek added a commit that referenced this issue Aug 7, 2013
@Zapotek Zapotek Browser#load: Uses the Page#cookiejar
[Issue #50]
3b6bff7
@treadie
treadie commented Aug 7, 2013

Hmm.. halfway there. It was stuck doing this for more than 30 min (LAN connection and only scanning a login page):

 [*] Auditing: [HTTP: 401] https://192.168.61.3/
 [~] Identified as: apache
 [~] DOM depth: 2 (Limit: 10)
 [~] Starting DOM/JS/AJAX analysis in the background.
 [*] Mixed Resource: Checking...
 [~] Backup files: Backing out, couldn't extract filename from: https://192.168.61.3/
 [~] Session fixation: No login-check has been set, cannot continue.
 [*] CSRF: Looking for CSRF candidates...
 [*] CSRF: Simulating logged-out user.
 [*] Harvesting HTTP responses...
 [~] Depending on server responsiveness and network conditions this may take a while.
 [*] CSRF: Found 0 context irrelevant forms.
 [*] CSRF: Found 1 CSRF candidates.
 [~] CSRF: Skipping already audited form 'login_form' at 'https://192.168.61.3/'
 [~] DOM/JS/AJAX analysis resulted in:
 [~]   * 1 page variations
 [~]   * 0 new paths

 [*] Auditing: [HTTP: 401] https://192.168.61.3/
 [~] Identified as: apache
 [~] DOM depth: 2 (Limit: 10)
 [~] Starting DOM/JS/AJAX analysis in the background.
 [*] Mixed Resource: Checking...
 [~] Backup files: Backing out, couldn't extract filename from: https://192.168.61.3/
 [~] Session fixation: No login-check has been set, cannot continue.
 [*] CSRF: Looking for CSRF candidates...
 [*] CSRF: Simulating logged-out user.
 [*] Harvesting HTTP responses...
 [~] Depending on server responsiveness and network conditions this may take a while.
 [*] CSRF: Found 0 context irrelevant forms.
 [*] CSRF: Found 1 CSRF candidates.
 [~] CSRF: Skipping already audited form 'login_form' at 'https://192.168.61.3/'
 [~] DOM/JS/AJAX analysis resulted in:
 [~]   * 1 page variations
 [~]   * 0 new paths

I also noticed upon exit that all the query strings that have an Arachni payload in them are getting added as separate pages for the site. This is making the list of pages on the site huge!

At this point in time I can't really get you any more details; as it is, I'm sitting in my car doing this as I only have a few spare minutes.

Cheers

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 7, 2013

I'm afraid I am going to need more info when you get a chance or, ideally, access to the webapp. Or, if you could, write up a small Sinatra webapp that simulates that behavior so that I'll be able to reproduce it and debug it.

@treadie
treadie commented Aug 8, 2013

OK, I got time to investigate this afternoon. I think I have figured out the issue, and it's the application causing it, not a bug in Arachni.

Essentially, the server was setting the time in the body of every response. I'm guessing this resulted in a different page variation as far as Arachni is concerned. So I don't think it would have been an infinite loop; however, it would have gone on for ages until it reached its depth limit of 10.

A trick for young players, I guess. That said, if you don't think this would be the cause of the issue, let me know and I will investigate further.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 8, 2013

Hm, I'll try to reproduce it, see if I can find a way around this. Thanks for looking into it man.

@Zapotek Zapotek added a commit that referenced this issue Aug 8, 2013
@Zapotek Zapotek Added BrowserCluster
[Issue #50]

Maintains a pool of `Arachni::Browser` instances and distributes the analysis
workload of multiple resources.
020350e
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 9, 2013

@treadie Now that I got the BrowserCluster to a working point I'll see what's going on with your issue. Maybe I should ignore text content and only see if nodes change... or something.

@treadie
treadie commented Aug 14, 2013

@Zapotek need me to look any further into this? I had a few other issues as well. I was scanning a relatively big & slow site the other day; I was getting a fair few timed-out requests, and then blam... it just says Killed in the terminal. No warning, no cleanup, just dead. Here is a snippet (including the cmd run, at the bottom). It happened around 3 or 4 times before I gave up.

 [-] HTTP: Request timed-out! -- ID# 33634
 [*] Blind SQL injection (timing attack): Analyzing response #33634...
 [-] HTTP: Request timed-out! -- ID# 33635
 [*] Blind SQL injection (timing attack): Analyzing response #33635...
 [-] HTTP: Request timed-out! -- ID# 33639
 [*] Blind SQL injection (timing attack): Analyzing response #33639...
 [-] HTTP: Request timed-out! -- ID# 33636
 [*] Blind SQL injection (timing attack): Analyzing response #33636...
 [-] HTTP: Request timed-out! -- ID# 33637
 [*] Blind SQL injection (timing attack): Analyzing response #33637...
 [-] HTTP: Request timed-out! -- ID# 33638
 [*] Blind SQL injection (timing attack): Analyzing response #33638...
Killed
blue@blue:~/arachni/bin$ ./arachni --http-req-limit="4" --user-agent="Mozilla/5.0 (Windows NT 6.2; WOW64; rv:22.0) Gecko/20100101 Firefox/22.0" --report="html:outfile=/home/blue/Desktop/arachni-site.com.html" https://site.com
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 14, 2013

Try running dmesg and inspecting the output; the process was killed by the OS, probably because it ate all the memory somehow -- I'm guessing the browsers.

@treadie
treadie commented Aug 14, 2013

Yeah, good call, I didn't even think of that, but you're right.

[116230.957950] [18815]  1000 18815   119758     6518   0       0             0 firefox
[116230.957952] [20049]  1000 20049   190801   150777   0       0             0 ruby
[116230.957954] [20061]  1000 20061    41715      133   0       0             0 phantomjs
[116230.957955] [20081]  1000 20081    41715      136   0       0             0 phantomjs
[116230.957957] [20101]  1000 20101    41715      136   0       0             0 phantomjs
[116230.957959] [20121]  1000 20121    41715      127   0       0             0 phantomjs
[116230.957961] [20141]  1000 20141    50005     1568   0       0             0 phantomjs
[116230.957963] Out of memory: Kill process 20049 (ruby) score 343 or sacrifice child
[116230.957968] Killed process 20049 (ruby) total-vm:763204kB, anon-rss:603108kB, file-rss:0kB

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 14, 2013

There's an idea to add a TTL, in the form of pages analyzed, for each browser to avoid this situation. That will most likely take care of the issue.

Will let you know once I implement it.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 14, 2013

I'm talking about avoiding/resetting memory leaks above, I'll most likely have to also find a way to limit the memory allowance for browser processes in general.
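The per-browser TTL idea could look roughly like this (a hypothetical sketch; the class and the MAX_PAGES cut-off are illustrative, not real Arachni settings): after a fixed number of pages, the worker respawns its browser process, resetting any leaked memory.

```ruby
# Hypothetical sketch of a browser worker with a page-count TTL.
class BrowserWorker
  MAX_PAGES = 50 # assumed cut-off, not an actual Arachni option
  attr_reader :respawns

  def initialize
    @pages_analyzed = 0
    @respawns       = 0
  end

  def analyze(page)
    respawn if @pages_analyzed >= MAX_PAGES
    @pages_analyzed += 1
    # ... drive the PhantomJS process against `page` here ...
  end

  private

  def respawn
    @respawns += 1       # real code would kill and re-fork PhantomJS
    @pages_analyzed = 0  # fresh process, fresh memory footprint
  end
end
```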

@Zapotek Zapotek added a commit that referenced this issue Aug 20, 2013
@Zapotek Zapotek Framework: Now lazy-loads the BrowserCluster
[Issue #50]
f20da77
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 22, 2013

@treadie Sorry, I was working on all the stuff above so it took me some time to look into the inf pages thing caused by printing the time.

That's indeed the culprit and it looks like ignoring the text of the nodes is the way to go.
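The node-only comparison can be illustrated with a small stdlib-only sketch (hypothetical, not the actual Arachni implementation) that fingerprints a page by its element structure, so responses differing only in text, e.g. a printed timestamp, hash identically.

```ruby
require 'rexml/document'
require 'digest'

# Hypothetical sketch: fingerprint a page by element names only, ignoring
# text nodes, so a timestamp printed in every response body no longer
# produces an endless stream of "new" page variations.
# (Real code would also account for attributes, event handlers, etc.)
def dom_fingerprint(html)
  names = []
  REXML::Document.new(html).each_recursive { |element| names << element.name }
  Digest::SHA1.hexdigest(names.join(','))
end
```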

@Zapotek Zapotek added a commit that referenced this issue Aug 23, 2013
@Zapotek Zapotek BrowserCluster: Now ignores text and paragraph nodes in page dedup
[Issue #50]

Also added Arachni::Page::DOM to simplify handling DOM snapshots.
c5ad611
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 23, 2013

@treadie c5ad611 should take care of the issue; if not, there are further optimizations that can be made.

I'll now move to the TTL feature for worker Browsers in order to fix the memory leak.

@treadie
treadie commented Aug 23, 2013

Yep, that's fixed it. Cheers.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 25, 2013

I think I just fixed the RAM issue with 89c3c07, could you give it a try and let me know? Hopefully, I won't have to implement the TTL, or will at least be able to enforce a bigger one.

@Zapotek Zapotek added a commit that referenced this issue Aug 25, 2013
@Zapotek Zapotek Browser#close => Browser#shutdown
[Issue #50]

Re-appropriated #close to just close the page instead of kill everything.
8bf8d59
@treadie
treadie commented Aug 25, 2013

I no longer have access to the app that was triggering the issue; however, I do have another AJAX-heavy app I can run it against. I know this isn't ideal, but it's the best I can do at the moment. I will let you know how it goes.

Also, I was doing a bit of testing on the weekend, using just the crawler & trainer against WIVET, to check out its coverage. I was continually getting the following 3 errors: http://pastebin.com/nXdCGBnP . I was using this release: c5ad611

Cheers

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 25, 2013

No worries, let me know how it works. As for WIVET, thank you for reminding me; it's time to evaluate it using a standard-ish benchmark.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 25, 2013

Damn, I broke something, don't bother testing it yet.

@Zapotek Zapotek added a commit that referenced this issue Aug 25, 2013
@Zapotek Zapotek Undoing 89c3c07
[Issue #50]
d091b72
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

@treadie I'm going to make the distribution of the page analysis more granular, so it's best to wait till then to test it further; if you find a bug in the current code I won't be able to track it against my work-in-progress codebase.

@treadie
treadie commented Aug 26, 2013

NP, just let me know.

@treadie
treadie commented Aug 26, 2013

Just noticed this if you need another site to test/crawl/scan/whatever http://testhtml5.vulnweb.com

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

Yeah, I just found out about that one yesterday; it's what prompted me to update the distribution style, since it was taking too long to analyze.

@Zapotek Zapotek added a commit that referenced this issue Aug 26, 2013
@Zapotek Zapotek BrowserCluster, Browser: Updated to distribute element/event pairs in…
…stead of just pages

[Issue #50]
b7dcded
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

OK, the thing is decently fast now. Before, whole pages were distributed across the cluster, which was fine if you had enough regular elements to audit to make progress while the browsers analyzed the page looking for snapshots with new elements. But for sites like the ones you mentioned, which are just one page full of JS, it was taking a long time with nothing to do but wait for the browsers.

Now, element/event pairs are distributed, which means that analysis of even a single page can be spread in the most granular way possible across the browser workers. One worker could be firing :onclick and another :onhover on the same element at the same time, which is cool, heh. :)

I also upped the number of browsers from 5 to 10 for some extra schnell, just to see what happens.
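
The distribution change described above can be sketched roughly like this (a hypothetical illustration, not Arachni's actual Ruby implementation; `buildJobs` and `distribute` are made-up names): each (element, event) pair becomes its own job, so several browser workers can analyze a single page concurrently.

```javascript
// Hypothetical sketch of element/event-pair distribution across a
// browser cluster. Not Arachni's real code; names are illustrative.

// Expand a page into fine-grained (element, event) jobs.
function buildJobs(page) {
  const jobs = [];
  for (const element of page.elements) {
    for (const event of element.events) {
      jobs.push({ page: page.url, element: element.id, event });
    }
  }
  return jobs;
}

// Round-robin the jobs across the worker pool, so one worker can be
// firing :onclick while another fires :onhover on the same element.
function distribute(jobs, workerCount) {
  const queues = Array.from({ length: workerCount }, () => []);
  jobs.forEach((job, i) => queues[i % workerCount].push(job));
  return queues;
}
```

With two workers and a single element that listens for `onclick` and `onhover`, each worker ends up with one of the two events, instead of one worker owning the whole page.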

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

@treadie You can test it now btw.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

Getting some errors from a few WIVET cases: elements disappearing from the page cache, probably removed by a triggered event. Looking into it.

Getting 81% coverage right now, which is not half-bad.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

85% after I fixed a bug in the crawler.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

88% after some path extractor updates. :)

@treadie
treadie commented Aug 26, 2013

Go for the 90's!! I will test as soon as I get the chance. Nice Work!

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

Just did, 92%, although I won't push this yet, need to add a few tests for the bugfix.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

Consider overriding:

  • setTimeout -- This way Arachni will know that it has to wait a bit because something interesting might happen; use good judgment though, as there needs to be a cut-off at some point.
  • EventTarget.addEventListener -- This way we'll know exactly which events the DOM anticipates and won't have to throw the book at it; analysis will end up being much, MUCH faster.

References:

  1. override-addEventListener.html
  2. overriding a global function in javascript
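
The two overrides suggested above can be sketched as follows (hypothetical names; the actual Arachni script differs). `observed` is an illustrative accumulator, and the `globalThis` fallback just lets the same sketch run outside a browser:

```javascript
// Sketch of the two suggested overrides. Works in a browser or Node >= 15.
const g = typeof window !== 'undefined' ? window : globalThis;

// Everything the page registers gets recorded here (hypothetical name).
const observed = { timeouts: [], listeners: [] };

// 1. setTimeout: record the requested delay so the crawler knows how long
//    to wait (with a sensible cut-off) before taking a DOM snapshot.
const nativeSetTimeout = g.setTimeout;
g.setTimeout = function (callback, delay, ...args) {
  observed.timeouts.push(delay);
  return nativeSetTimeout(callback, delay, ...args);
};

// 2. addEventListener: record which events each target actually listens
//    for, so only those need to be fired during analysis.
const nativeAddEventListener = EventTarget.prototype.addEventListener;
EventTarget.prototype.addEventListener = function (type, listener, options) {
  observed.listeners.push({ target: this, type });
  return nativeAddEventListener.call(this, type, listener, options);
};
```

After this runs, a page script calling `button.addEventListener('click', fn)` leaves a record in `observed.listeners`, so the analyzer can fire only `click` on that element instead of the whole event catalog.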
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 26, 2013

I forgot to explain how:

Before a page is loaded by the Browser, manipulate its HTML code in order to add a preloaded, custom JS script which will override and keep track of all the stuff I mentioned above.
Then, grab that data via Watir#execute_script with a call to the interface of that custom script.
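
A minimal sketch of that preloaded-script idea follows. The `_tracker` name and its `flush()` interface are made up for illustration (Arachni's real injected script differs); the Ruby side would then read the data with something like Watir's `execute_script('return _tracker.flush();')`:

```javascript
// Hypothetical preloaded tracker, injected into the page's HTML before it
// loads. It overrides addEventListener and exposes a query interface for
// the Ruby side. Names (_tracker, flush) are illustrative only.
const _tracker = (function () {
  const events = [];

  const native = EventTarget.prototype.addEventListener;
  EventTarget.prototype.addEventListener = function (type, listener, options) {
    events.push(type); // remember which events the page anticipates
    return native.call(this, type, listener, options);
  };

  return {
    // Called from the Ruby side via execute_script: return the recorded
    // event types and clear the buffer for the next query.
    flush() {
      const snapshot = events.slice();
      events.length = 0;
      return snapshot;
    }
  };
})();
```

Because the override is in place before any page script runs, every listener the page registers is captured, and each `flush()` hands the browser-cluster worker a fresh batch.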

@Zapotek Zapotek added a commit that referenced this issue Aug 27, 2013
@Zapotek Zapotek Browser: Added EVENTS_PER_ELEMENT hash
[Issue #50]
fbd5673
@Zapotek Zapotek added a commit that referenced this issue Aug 27, 2013
@Zapotek Zapotek Page: has_javascript? => has_script?
[Issue #50]

Also, updated to take into account events and javascript: in href and action
attributes.
cec81e1
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 27, 2013

92%-er is pushed, covering another relatively simple case now before getting into the JS prototype override weirdness.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 27, 2013

94% with e0d2a7e, one more to go -- the last 2 cases need SWF so they won't be supported.

@treadie
treadie commented Aug 29, 2013

I'm using the e0d2a7e release; however, it results in even more of the errors I was getting earlier, and only 24% coverage. I was getting 84% before the e0d2a7e release. I will play a bit more to figure out why this is happening.

Also I have noticed that occasionally the PhantomJS user agent slips into a few of the requests instead of the Arachni or user defined one.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 29, 2013

I've been running that benchmark every now and then since I pushed and each time it hits 94%.
Could you create a gist with the errors? Also, which WIVET version are you using?

As for the UA, yeah, I've noticed; I forgot to sort it out though. Thanks for reminding me.

@treadie
treadie commented Aug 29, 2013

So, with --http-req-limit=1 set, all errors are gone and coverage looks good (91% at the moment, but the crawl is still running).

WIVET v3 (from the OWASP BWA live CD) -- this could also be the cause of the errors, but I may have to look into it a little more. I think the liveCD/VMware combo is just getting smashed by the crawler and can't respond quickly enough.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 29, 2013

Even so, could you provide the errors? I'm curious now...

@treadie
treadie commented Aug 29, 2013

See this paste http://pastebin.com/LGWyfjXa of the whole run. The captcha module was just to stop all the other modules running without affecting the crawl.

@Zapotek Zapotek added a commit that referenced this issue Aug 29, 2013
@Zapotek Zapotek Browser: Synchronize retrieval of responses
[Issue #50]
c65602e
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 29, 2013

My bad, had synchronized response storage but forgot to sync the retrieval, should work now.
Also, you should run it like this:

-m trainer -gp -e 100\.php

You need at least one audit module for case 1_12c3b (the trainer is your best shot); exclude the 100.php logout link, audit only links and forms to keep the stats tight, and use only one session. Arachni's coverage is a complementary operation that uses both the crawl and the audit.

If you see an error from auditor.rb#log don't worry, fixing it now.

@treadie
treadie commented Aug 30, 2013

Grabbed the latest from Git this morning and got similar errors: http://pastebin.com/A5zxmcVg . Also, with --http-req-limit=1 I still don't get the errors. I still haven't had a good chance to test things properly, and might not get a chance till the weekend. I will keep you updated.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 30, 2013

This is a simple race condition but it becomes a bitch when you can't actually reproduce it. I'll see what I can do.

@treadie
treadie commented Aug 30, 2013

Let me know if there is anything you need or want me to test etc.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 30, 2013

That's a good idea, I could push some debugging code to print out what's going on in your env. Give me 10mins.

@treadie
treadie commented Aug 30, 2013

no problem.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 30, 2013

Meantime, do you mind contacting me at tasos dot laskos @t gmail?

@Zapotek Zapotek added a commit that referenced this issue Aug 30, 2013
@Zapotek Zapotek Browser#clear_responses: Synchronize operation
[Issue #50]
89f96bc
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 30, 2013

JS event interception was implemented by 01300b3; now only events with listeners are triggered during DOM analysis, resulting in greater performance and a decreased workload.

@treadie This will definitely make a difference in the scan times you've been seeing.

@Zapotek Zapotek added a commit that referenced this issue Aug 30, 2013
@Zapotek Zapotek Browser: Added JS prototype overrides for timers
[Issue #50]
7117d9d
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 30, 2013

Hit 96% WIVET coverage, rest 4% needs SWF support which won't be implemented.

@treadie
treadie commented Aug 30, 2013

Just did the latest pull. Awesome performance increase, and my previous issues are gone now also. Only 94% for me though.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 30, 2013

Yeah I just noticed there's an intermittent bug...swings back and forth between 94% and 96%.

@Zapotek Zapotek added a commit that referenced this issue Aug 30, 2013
@Zapotek Zapotek BrowserCluster now updated the Framework sitemap
[Issue #50]
6233a28
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 30, 2013

@treadie I need some more info when you find some time please:

  • Do you get a lot of 94%s with your setup?
  • Do you only get 94%s?
  • Which case fails?
  • Is the failing case the same every time?

Ignore the last 2 cases that need SWF.

Thanks in advance.

@treadie
treadie commented Aug 31, 2013

Pulled the latest code; after running for i in $(seq 1 20); do bin/arachni -m trainer -gp -i wivet -e 100\.php -e offscanpages http://192.168.61.19/wivet --user-agent=test-$i 2>&1 | tee -a ~/output.log; done

  • all 94% except the last which was 92% for some reason
  • as above, I have not seen a 96% yet
  • "unattached js function document.location" is missed every time (in the last it also missed "link created thru xhr response")
  • yes (with the exception of the 20th)

There is the occasional error though; I will email you the whole log output, but this is the first line of the errors:
[-] Error Message => ''undefined' is not a function {evaluating 'arguments[0].events()')'
and
[-] Error Message => 'Element does not exist'

Something else odd: I ran this through twice with the exact same results. Not sure why the 20th scan dropped 2% each time.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 31, 2013

Thanks a lot, man. Though I don't have an "unattached js function document.location" case; what's its URI?

@Zapotek Zapotek added a commit that referenced this issue Aug 31, 2013
@Zapotek Zapotek Browser: Overrode UA in #handle_request
[Issue #50]
bf7b262
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 31, 2013

Also, there's the possibility that it's WIVET's problem: I refreshed a finished scan's stats and it jumped from 94% to 96%. And the fact that you never got a 96% on a lower-spec system, even after multiple runs, leads me to believe that WIVET can't always cope under stress and may have a couple of race conditions of its own somewhere in there.

I'm assuming that the failed cases pass when attempted on their own, right?

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 31, 2013

@treadie Also, do you mind trying one more thing?
Open arachni/lib/arachni/browser_cluster.rb and set the :pool_size to 1 in DEFAULT_OPTIONS.

If you get 96% then we'll know WIVET is messing up when stressed, I'll try to see if there are any session issues on Arachni's end in the interim.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Aug 31, 2013

The session handling is fine, although when using 1 browser the errors go away. I was assuming the errors were benign because I was still getting full coverage, but there may be something more to them. Looking into it.

@Zapotek Zapotek added a commit that referenced this issue Aug 31, 2013
@Zapotek Zapotek Browser: Updated to rescue but print exceptions
[Issue #50]
9bc14cc
@Zapotek Zapotek added a commit that referenced this issue Aug 31, 2013
@Zapotek Zapotek Browser#{preloade, cache}: Save responses
[Issue #50]
1f2d573
@Zapotek Zapotek added a commit that referenced this issue Aug 31, 2013
@Zapotek Zapotek Browser: Improved error handling
[Issue #50]
afd7ade
@treadie
treadie commented Aug 31, 2013

link to the "unattached js function document.location" is /wivet/innerpages/9_26dd2e.php and it does work when requesting it directly.

:pool_size set to 1 still results in a 94% scan, missing the same link.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Sep 1, 2013

You sure it's WIVET v3? These are the v3 cases: http://www.webguvenligi.org/wivet/

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Sep 1, 2013

You should get 96% now. :)

@treadie
treadie commented Sep 1, 2013

Haha, cheers; you shouldn't get any more inconsistent results either! I can confirm 96% also. Well done!

@Zapotek Zapotek added a commit that referenced this issue Sep 1, 2013
@Zapotek Zapotek script path extractor: Updated specs
[Issue #50]
bbe2451
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Sep 1, 2013

@treadie You shouldn't be getting any exceptions now; maybe some errors about timeouts and such, but everything should be nicely handled.

@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Sep 1, 2013

96% on 24 consecutive scans -- was going for 50 but the 25th hit a port number conflict and died, heh.

@Zapotek Zapotek added a commit that referenced this issue Sep 20, 2013
@Zapotek Zapotek Browser: Increased Watir->Browser com timeout
[Issue #50]
3877318
@Zapotek Zapotek added a commit that referenced this issue Sep 23, 2013
@Zapotek Zapotek BrowserCluster#wait_till_service_ready: " => '
[Issue #50]
467b8ae
@Zapotek Zapotek added a commit that referenced this issue Sep 25, 2013
@Zapotek Zapotek Browser#trigger_event: Improved error handling
[Issue #50]
5e59517
@Zapotek Zapotek added a commit that referenced this issue Sep 25, 2013
@Zapotek Zapotek Browser: Lowered WATIR_COM_TIMEOUT to 36000
[Issue #50]
40c43c7
@Zapotek
Arachni - Web Application Security Scanner Framework member
Zapotek commented Dec 17, 2013

Closing this to get it off my list since prototype functionality is ready.

@Zapotek Zapotek closed this Dec 17, 2013