
GSoC2012_PluginACT


Executive Summary


This page is a log of the work carried out in the Enhanced AJAX Integration in ZAProxy project of the OWASP organization during the Google Summer of Code 2012.


In this GSoC project, we made the following contributions:

  • Developed a plugin that fully integrates a widely known spider called Crawljax into ZAP. The release of the plugin add-on is available to download on the zap-extensions site. Its official getting started guide, with information on installation and use, is on the Getting Started help page.
  • Enhanced the Crawljax spidering capabilities. Before this GSoC project, Crawljax got a score of 10% in the WIVET benchmarking project; we improved its score from 10% up to 72%, as shown in this video and also in the testing wiki page. The improvements have been sent to the Crawljax development team and are also available on GitHub.
  • Added support for setting local proxies in crawljax when using the Chrome and HTMLUnit web browsers.
  • Added support in ZAP to allow extensions to append their custom icons to nodes of the Sites tree, as specified in this thread.
  • Researched different ways of including extension-specific dependencies in ZAP and established a standard one in the Ant build file of ZAP, which we documented in the wiki.



Introduction


The OWASP Zed Attack Proxy (ZAP) is a penetration testing tool for finding vulnerabilities in web applications. It is widely used by the security community, and it was recently elected tool of the year by a widely known security blog.

Asynchronous JavaScript and XML (AJAX) is a group of interrelated web development techniques used on the client side to create asynchronous web applications. It has gained a lot of popularity since the advent of Web 2.0.

One of the main features of ZAP is a crawler that inspects all pages of the targeted site and generates a map of the site. This map is later used to find vulnerabilities in each of the pages. Unfortunately, ZAP does not support crawling dynamically generated AJAX links.

On the other hand, there is an OWASP tool called AJAX Crawling Tool (ACT) that performs this task. My objective is to develop a plugin for ZAP that makes it easier for users to scan AJAX-based web pages. This plugin will call ACT to accomplish this, and the resulting information will be properly integrated with the ZAP interface and features.




Project goals


After discussing the idea with the pertinent OWASP project leaders, we have set the following goals:

  • Develop a plugin for ZAP to improve its integration with ACT.
  • Improve the ACT command line invocation capabilities.
  • Design a system to mark the HTTP requests made by ACT, and implement this system in ACT so that ZAP can recognize which URLs come from ACT.

These were the main goals. Nevertheless, there are a few requirements that we will need to comply with:
  • The plugin must be easy for a developer with no security training to use.
  • The code will need to be very clean to facilitate its comprehension.
  • Both tools must be able to be updated separately without breaking the integration.
  • The information provided by ACT will be completely integrated with the ZAP user interface: links will appear in the Sites tree, marked as spidered, and will be shown in the History tab.


Timeline


The project is planned to be completed in 4 phases to accomplish the goals of Section 2. Its timeline is shown below.

  • March-April: Setting up the Eclipse environment for developing ZAP, cloning the repository, compiling it, getting in touch with the development team and potential mentors, discussing the ideas with them and writing the proposal (already done).
  • April-May: Getting to know the code, defining the requirements and specifications of the project, and starting to write the prototype and the modifications to ACT.
  • May-June: Develop the functional specifications and the resulting prototype. Get feedback from the mentor and community.
  • June-August: Performing the needed modifications, refining the code, and carrying out the needed tests and documentation.


Weekly Progress Log


#13 Week (08-14 to 08-20)

  • Soft "pencils down" date.
  • Cleaning up code.
  • Writing documentation such as the getting started page and some code notes.
  • Performing final tests.
  • Planning future work.

#12 Week (08-06 to 08-13)

  • Added a gear button in the AJAX Spider panel that opens the Options panel.
  • Set up a BodgeIt local testing environment & tested the AJAX Spider (works well). Made a few screenshots to put in the wiki.
  • Fixed a log4j logging level issue in Crawljax.
  • Sent all the enhancements made so far to the crawljax development team.
  • Added new parameters in the Options panel to choose as many threads & browser windows as wanted, which speeds up the process dramatically (see the sketch after this list).
  • Wrote the AJAX Spider help documentation.
  • Translated Messages.properties into Spanish and removed some unnecessary labels from it.
  • Tested a new OWASP penetration testing tool for web applications called Xelenium.
  • Tested all the previous changes and committed them into trunk.
  • Released the alpha 1 version of AJAX Spider in the zap-extensions site.
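For reference, here is a minimal sketch of how the new thread & browser-window options could be passed to Crawljax. It assumes the Crawljax 2.x ThreadConfiguration API; the exact setter names in the committed code may differ.

```java
import com.crawljax.core.configuration.CrawljaxConfiguration;
import com.crawljax.core.configuration.ThreadConfiguration;

// Sketch: crawl with several parallel browsers/threads. The counts would
// come from the new Options panel parameters mentioned above.
public class ParallelOptionsSketch {
    public static void apply(CrawljaxConfiguration config, int browsers, int threads) {
        ThreadConfiguration threadConfig = new ThreadConfiguration();
        threadConfig.setNumberBrowsers(browsers); // browser windows to open
        threadConfig.setNumberThreads(threads);   // crawler threads driving them
        config.setThreadConfiguration(threadConfig);
    }
}
```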

#11 Week (07-31 to 08-05)

  • Changed the way zaproxy extensions handle custom icons, according to the specifications described in the dev. group thread.
  • Created a standard way of including dependencies in ZAP extensions and documented it in the wiki.
  • Committed the ZAP core changes to trunk and branch 1.4.
  • Added support in Crawljax for client proxy configuration for HtmlUnit (see the sketch after this list).
  • Added a 'stop' button in the AJAX Spider panel extension to stop Crawljax.
  • Added a checkbox in the Options panel to switch to HtmlUnit.
  • Cloned the Crawljax codebase to a GitHub repo to facilitate tracking my changes, to allow other people to contribute, and to learn about Git.
  • Tested the extension with Java 7 because of this.
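A minimal sketch of the proxy support added to Crawljax, assuming its 2.x ProxyConfiguration API; the setter names are my best recollection rather than a verified signature.

```java
import com.crawljax.core.configuration.CrawljaxConfiguration;
import com.crawljax.core.configuration.ProxyConfiguration;

// Sketch: route the crawler's browser (Chrome or HtmlUnit) through ZAP's
// local proxy, here localhost:8080 as in the ACT invocation shown later.
public class ProxySetupSketch {
    public static void useZapProxy(CrawljaxConfiguration config) {
        ProxyConfiguration proxy = new ProxyConfiguration();
        proxy.setHostname("localhost");
        proxy.setPort(8080);
        config.setProxyConfiguration(proxy);
    }
}
```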

#10 Week (07-24 to 07-30)

  • Changed the dependencies packing mechanism so that they are included in the plugin jar by using jarjar.
  • Merged the needed zaproxy resources and features into trunk and branch 1.4.
  • Working on allowing crawljax to support other web browsers such as IE and htmlunit.
  • Working on the extension-specific icons functionalities.

#09 Week (07-17 to 07-23)

  • Fixed the bug that prevented ZAP from accessing the plugin's icons.
  • Created a new spider panel according to https://groups.google.com/forum/?fromgroups#!topic/zaproxy-develop/T5xdooTz1Zg
  • Fixed some exceptions in crawljax.
  • Created new methods in HistoryReference to allow plugins to use their own icons in the Sites tree. It can be used by instantiating HistoryReference as follows: HistoryReference(session,"/org/path/to/icon", msg, clearWhenManual?); (see the sketch after this list).
  • Working on including the JBroFuzz code in ZAP.
  • Also working on some issues with my SVN client that prevented me from properly merging my branch.
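A sketch of the custom-icon usage based on the constructor form noted above; the parameter meanings are inferred from this log, and the final API in ZAP may differ.

```java
import org.parosproxy.paros.model.HistoryReference;
import org.parosproxy.paros.model.Session;
import org.parosproxy.paros.network.HttpMessage;

// Sketch: register a crawled message in the Sites tree with the plugin's
// own icon. The last flag is assumed to control whether the custom icon
// is cleared when the URL is later visited manually.
public class CustomIconSketch {
    HistoryReference addAjaxNode(Session session, HttpMessage msg) throws Exception {
        return new HistoryReference(session, "/resource/icon/16/spiderAJAX.png", msg, true);
    }
}
```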

#08 Week (07-10 to 07-16)

  • Working on profiling the plugin and detecting the memory issues that were crashing zaproxy when running the plugin.
  • Created a method that checks if chromedriver is available in the system. It works on any platform.
  • Fixed a bug that prevented the plugin from reading some labels in the plugin messages file.
  • Created a new dialog that alerts users if chromedriver is not available and shows where to download it.
  • Fixed the bug about extensions being loaded twice by removing zap.jar from the lib folder and locating it outside of the plugin folder.
  • Started writing the new panel interface according to the new specifications described in the dev. group https://groups.google.com/forum/?fromgroups#!topic/zaproxy-develop/T5xdooTz1Zg
  • Working on fixing a bug that does not allow zaproxy to read the icons stored in the plugin jar file.
  • Finished implementing the Spidering Filter feature to ignore specific URLs.
  • Implemented a filter for the new proxy of the plugin to ignore specific URLs.
  • Not a very productive week: I had my thesis defense and much university-related work. Fortunately, I finished it, and from now on I can work 100% of the time on ZAP.

#07 Week (07-03 to 07-09)

  • Added compatibility in the plugin for Chrome and IE.
  • Added new checkboxes in the Options panel to choose the desired web browser.
    • Firefox is the default; if you select more than one, Firefox is used. When opening the panel again you see the last selection.
  • Tested the new version of the Selenium drivers and replaced the old ones with those (in both crawljax and the plugin).
  • Crawljax did not have support for configuring a proxy in Chrome; I added support for it.
  • Could not test the proxy with IE (I do not have any Windows box right now).
  • Making a filter to ignore specific URL patterns in Crawljax (still working on it; Crawljax has some known issues https://github.com/crawljax/crawljax/issues/58).
  • Added the thc202 patch to the Spanish translation of ZAP and merged it to the trunk.
  • Changed the messages.properties location to the plugin class path.
  • Changed the icon images location to the plugin class path (also modified the build.xml to include them in the jar file).

#06 Week (06-25 to 07-02)

  • Got in touch with the crawljax developers to plan the integration of the current improvements.
  • Finished implementing the logic of the ajax proxy.
  • Replaced the User-Agent check with an instantiation of the HistoryReference class with a specific constant for the AJAX plugin.
  • Added a new feature to the plugin and to crawljax to allow deeper but slower analysis.
  • Added a new option in the Options panel to set normal or deeper analysis.
  • Added a new option in the Options panel to modify the configuration of the AJAX proxy.
  • Cleaning and commenting the code to carry out the first commit of the plugin.
  • Set up a public testing environment on a public server to allow people to test the plugin on wivet: http://caos.uab.es/~gruiz/test/wivet/
  • Committed both zap-exts and zaproxy to their corresponding branches.
  • Created a brief howto about compiling and running the plugin at

#05 Week (06-18 to 06-24)

  • Got in touch with ZAP developers to solve the spider specific icon issue of the plugin prototype.
  • Improved the results of ACT in WIVET up to 73% of the site so far.
  • Working on being able to directly instantiate the ACT and Crawljax classes from zaproxy instead of performing a system call.
  • Fixed some translation issues in ZAP (I have to post the diff to the list to validate them).
  • Blogged & made a video about the current status at http://t.co/LJotsbBZ
  • Created a new branch in the zaproxy repository called 'spiderAjax' and committed the changes.
  • I still have to commit the plugin, but many libraries and dependencies have changed and there are no branches in that repository; I don't want to commit and risk breaking something.
  • Improved the Spanish translation of ZAP http://goo.gl/gDX2H and posted it to the list http://goo.gl/lVrxQ to have it reviewed.

#04 Week (06-11 to 06-17)

  • Working on improving the crawljax spidering capabilities.
  • In the beginning it was only capable of crawling 5 links of wivet. With the new improvements it spiders 37 pages.
  • Got in touch with the developers to have those improvements included.
  • The state machine of Crawljax does not handle the dynamic DOM states of wivet well; working on fixing it.
  • Working hard on improving ACT crawling capabilities, current results available at: GSoC2012_PluginACT#Current_Results
  • Stuck on the ZAP plugin, working on showing a specific icon for ACT-crawled links and also on showing the results in their own tab.

#03 Week (06-04 to 06-10)

  • Checked out the ACT main line from
  • Went over the ACT code
  • Made some changes to the pom.xml file to fix some dependency issues.
  • Understanding and testing the ACT code.
  • Read Ali Mesbah, Arie van Deursen, and Stefan Lenselink (2011). Crawling Ajax-based Web Applications through Dynamic Analysis of User Interface State Changes. ACM Transactions on the Web (TWEB).

#02 Week (05-28 to 06-03)

  • Added an AJAX history tab for the plugin in the user interface.
  • Decided we will test ACT on Wivet
  • Set up a testing environment with wivet https://github.com/bedirhan/wivet
  • Started testing ACT on wivet
  • Working on improving the ACT crawling capabilities.
  • Read documentation about Maven
  • Tested Crawljax in wivet
  • Identified links that it is not able to crawl and started working on them.

#01 Week (05-21 to 05-27)

  • Developing the UI of the plugin
  • Added the button+icon in the attack menu
  • Added an options panel to let users review and change the configuration.
  • Analyzed code regarding external applications.
  • Developed in the prototype code that instantiates the InvokeAppWorker class and executes ACT as an external application. Maybe we could extend the method to have more flexibility?

#00 Week (Before Coding Period)

  • Started working on the plugin to integrate ACT and ZAProxy:
  • Started coding the user-interface part of the ZAP plugin.
  • Set up my development environment and checked out the ZAProxy code.
  • Started testing the current [available extensions](https://github.com/zaproxy/zap-extensions/).
  • Got in touch with the ZAProxy development community to get some feedback regarding some design decisions.
  • Made the GSoC proposal.


AJAX Spider Plugin Code


The code of the plugin, along with its Javadoc, can be found in the zap-extensions repo, and it contains the following classes (a minimal skeleton sketch follows the list):

  • AjaxProxyParam: Contains the set of attributes and methods needed to store the configuration of the local proxy used by the spider ajax plugin.
  • PopupMenuAjaxSite: creates the action in the attack menu to launch the crawler.
  • ChromeAlertDialog: alert window shown if the chrome driver is not available when Chrome is selected as the default web browser.
  • ExtensionAjax: Main class of the plugin, extends ExtensionAdaptor and instantiates the rest of the plugin's classes.
  • ProxyAjax: This class manages the ajax spider proxy server and contains all the needed methods to update, start and stop it.
  • Messages: It defines the default (English) variants of all of the internationalised messages of the plugin.
  • Messages_es_ES: It is the Spanish translation of the messages of the plugin.
  • SpiderFilter: It is called before the crawling; it checks the candidate URLs and discards those excluded by the filter and the scope.
  • SpiderPanel: This class creates the Spider AJAX Panel where the found URLs are displayed. It has a button to stop the crawler and another one to open the option window.
  • OptionsAjaxSpider: It contains the set of information to display in the Option Window that will be later used by the local spider proxy and the crawler.
  • SpiderThread: This class instantiates crawljax and performs the spidering.
  • PopupMenuAjax: This class creates the action in the attack menu to launch the crawler.
  • lib/: This folder contains the set of dependencies of the plugin, which will be included in the final jar file.
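To show how these classes hang together, here is a minimal sketch of the plugin's entry point, assuming the standard ZAP ExtensionAdaptor lifecycle; the registrations in the comments are illustrative rather than the exact committed code.

```java
import org.parosproxy.paros.extension.ExtensionAdaptor;
import org.parosproxy.paros.extension.ExtensionHook;

// Sketch: main class of the plugin. ZAP discovers it in the plugin jar,
// instantiates it, and calls hook() so it can register its UI and proxy.
public class ExtensionAjax extends ExtensionAdaptor {

    public ExtensionAjax() {
        super("ExtensionAjax");
    }

    @Override
    public void hook(ExtensionHook extensionHook) {
        super.hook(extensionHook);
        // Illustrative registrations of the classes listed above:
        // extensionHook.getHookView().addStatusPanel(spiderPanel);
        // extensionHook.getHookMenu().addPopupMenuItem(popupMenuAjaxSite);
        // ProxyAjax and OptionsAjaxSpider would be wired up here as well.
    }
}
```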



Running the Prototype


Requirements: Firefox 6-12

Running the new zaproxy branch (already contains the plugin)

  • Check out the spiderAjax branch
  • Set the output folder to bin, select the libs, and select the main class
  • Run the new configuration

Compiling the plugin

  • Checkout https://github.com/zaproxy/zaproxy/tree/spiderAjax
    • Window->Open Perspective->Other->SVN Repository
    • Right-click in the repository tab->New->Repository Location
    • Write in the URL field: https://github.com/zaproxy/zaproxy/tree/spiderAjax
    • Check out the spiderAjax folder as a new Java project called "zap-exts"
  • Run the build.xml file with Ant.
  • Copy ${workspace}/zap-exts/build/zap-exts/zap-ext-spiderAjax-alpha-1.jar to ${workspace}/zaproxy/src/plugin/
  • Copy ${workspace}/zap-exts/build/zap-exts/zap-ext-spiderAjax-alpha-1.jar to ${workspace}/zaproxy/bin/plugin/

Potential tests

  • Visit http://caos.uab.es/~gruiz/test/wivet with Firefox
  • In the Sites tab, right-click on "wivet" and launch Attack->Spider AJAX Site
  • After it finishes, go to Tools->Options->Spider AJAX options and mark "crawl in depth". Launch the AJAX spider again and see the results.


TODO List


  • Add proxy configuration in IE.
  • The tag scanning is slow in Crawljax. I tried to put in many tags and Crawljax crashes. Furthermore, when a page has a huge table, Crawljax takes a lot of time to scan all the tags. We could dynamically scan each page, parse which tags have links, and set a new clickElement(x) configuration depending on the tags of that specific page that have links (instead of the current clickAllElements() or clickDefaultElements()). We can do this either in our plugin or in the Crawljax code; see the sketch after this list.
  • Improve the crawling capabilities (currently 72%-74% of wivet)
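As a sketch of the idea from the first TODO item, assuming the Crawljax 2.x CrawlSpecification API: instead of clicking every element, the specification could be built from the tags that actually carry links on the target. The tag list below is illustrative, not the computed per-page list.

```java
import com.crawljax.core.CrawljaxController;
import com.crawljax.core.configuration.CrawlSpecification;
import com.crawljax.core.configuration.CrawljaxConfiguration;

// Sketch: click only selected tags instead of clickAllElements(),
// which gets very slow on pages with huge tables.
public class SelectiveCrawlSketch {
    public static void main(String[] args) throws Exception {
        CrawlSpecification crawler = new CrawlSpecification("http://caos.uab.es/~gruiz/test/wivet/");
        // In the envisioned version, this list would be computed per page
        // by parsing which tags actually contain links.
        crawler.click("a");
        crawler.click("td");
        crawler.click("div");

        CrawljaxConfiguration config = new CrawljaxConfiguration();
        config.setCrawlSpecification(crawler);
        new CrawljaxController(config).run();
    }
}
```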



Design Decisions


  • ACT will be called as an external application in ZAP.
  • ACT (act13b.jar) will be included in the extension package; therefore, it should be available in the plugin folder of ZAP. If not found, users will be asked for its location.
  • The plugin will be run in the attack menu as 'Spider AJAX Site' (i.e. http://goo.gl/Psv8x).
  • Crawled links will appear in the Sites menu with the new icon (i.e. http://goo.gl/tHCIM).
  • Show new links in a new "AJAX Spider" tab; alternatively, the existing spider should also be changed to use the History tab.

Other Notes


  • Run ACT as external application in ZAP:
    /usr/bin/java   -jar /Users/guifre/act13b.jar -u %url% -b firefox -p localhost:8080
  • Current action icon (just the current spider icon colorized to red with GIMP) in /resource/icon/16/spiderAJAX.png; backup: http://goo.gl/j5zKz
  • Main problems of Crawljax when spidering wivet:
    There is a page that creates new random links, and Crawljax enters an infinite loop spidering it (see the sketch after this snippet):
    <a href="2_2.php?<?php echo (rand()%1000)?>=<?php echo (rand()%1000)?>">click me</a>
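One way to break this loop, sketched below in plain Java under the assumption that a spider filter (such as the plugin's SpiderFilter) sees each candidate URL before it is crawled, is to track base paths with the query string stripped:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: treat self-referencing links that differ only in a random
// query string as a single page, so each base path is crawled once.
public class RandomLinkFilterSketch {
    private final Set<String> seenPaths = new HashSet<String>();

    public boolean shouldCrawl(String url) {
        int q = url.indexOf('?');
        String base = (q >= 0) ? url.substring(0, q) : url;
        return seenPaths.add(base); // false once this path was already crawled
    }
}
```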



WIVET Benchmarking


Current Results

The following results have been achieved by targeting the root of the wivet framework with ACT.

0% of WIVET

Current ACT


10% of WIVET

Modified the CandidateElementExtractor class to spider links in frame tags.


30% of WIVET

Modified getFramesCandidates to analyze the code of pages in frame tags.


70% of WIVET

Modified the crawling specification to spider the following HTML tags:

a, td, span, div, tr, ol, li, radio, non, refresh, xhr, relative, link, self, form, select, input, option

Current results: 73% of WIVET
Added a sleep method when crawling meta tags to support these tags (a sketch follows below).
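A sketch of the sleep idea, assuming access to the underlying Selenium WebDriver; where exactly Crawljax exposes this hook is not shown here.

```java
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

// Sketch: after loading a page, wait long enough for a
// <meta http-equiv="refresh" content="...;url=..."> redirect to fire.
public class MetaRefreshWaitSketch {
    static void waitForMetaRefresh(WebDriver driver) throws InterruptedException {
        List<WebElement> metas = driver.findElements(By.cssSelector("meta[http-equiv='refresh']"));
        if (!metas.isEmpty()) {
            Thread.sleep(3000); // give the refresh time to navigate
        }
    }
}
```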



Issues Working on

1. 1_1.php: a complex JavaScript function executed after a certain timeout

<script>
window.onload = function(){
    // what what
    setTimeout(showLink, 3000);
}
function showLink(){
    var container = document.getElementById("container");
    var alink = document.createElement("a");
    alink.href = "../innerpages/1_1.php";
    alink.innerHTML = "click me";
    container.appendChild(alink);
}
</script>

2. 2_2.php: self-referencing link with a random query string

<a href="2_2.php?<?php echo (rand()%1000)?>=<?php echo (rand()%1000)?>">click me</a>

3. 8_1.php: link in an HTML comment

<!-- my comment with full link: http://aopcgr.uab.es:10001/innerpages/8_1.php -->

4. 8_2.php: relative link in an HTML comment

<!-- my comment with relative: innerpages/8_2.php -->

5. 9_2.php: span with onmouseout triggering window.location.href

6. 9_6.php: p with onmouseout triggering window.location.href

7. 9_10.php, 9_14.php, 9_18.php, 9_22.php: Selenium does not handle onmouseout events

<li onmouseout="genericGo(22)">click here</li>

<tr onmouseout="genericGo(18)">

<td onmouseout="genericGo(14)">click here</td>

<div onmouseout="genericGo(10)">click here</div>

<p onmouseout="genericGo(6)">click here</p>

<span onmouseout="go2()">click here</span>

8. 12_4.php: non-referred link, pattern 1

9. 16_2.php: 302 redirection link in the response body (not able to crawl $hiddenUrl)

function redirect($url, $hiddenUrl)
{
    header("Location: $url");
    echo 'This page has moved to <a href="'.$hiddenUrl.'">HERE :)</a>';
    exit();
}
redirect('../innerpages/16_1.php', '../innerpages/16_2.php');

10. 17_2.php: a complex AJAX request that requires a certain time between one request and the next to be executed.

11. 18_1.php: non-referred link, pattern 2







