This is the repository for jÄk, a prototype I created during my master's thesis.
jÄk - jet Änother krawler.

jÄk (or jAEk, pronounced "Jack") is a web application crawler and scanner that uses dynamic JavaScript code analysis. jÄk installs hooks in JavaScript APIs to detect the registration of event handlers, the use of network communication APIs, and dynamically generated URLs and user forms. It then builds and maintains a navigation graph to crawl and test web applications. For more details on the internals, please have a look at the thesis and the paper listed under "Papers and further reading" below.

Requirements

jÄk is written in Python 3 and is based on PyQt5 (versions 5.3 to 5.4). To store data, jÄk uses MongoDB via the pymongo 3.x bindings. jÄk also requires Cython. Please install the required packages using pip or the package manager of your distribution, or follow the respective documentation.
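For example, a minimal setup on a Debian-like system might look like the sketch below. The package names are assumptions and may differ on your distribution; PyQt5 in the 5.3 to 5.4 range is usually easiest to install from distribution packages rather than pip:

pip3 install pymongo cython
sudo apt-get install python3-pyqt5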

Running jÄk

The current version of jÄk does not offer a command-line interface. To run jÄk, you will have to write some Python code and get familiar with jÄk's classes and libraries. The entry point to start using jÄk is crawler/example.py.

1. Configuration Objects

1.1 Users

jÄk can use user credentials and perform user logins. The URL of the login page and the credentials can be configured via the utils.user.User object. For example:

user = User("Wordpress", 0, "http://localhost:8080/wp-login.php", login_data = {"log": "admin", "pwd": "admin"}, session="1")

Parameters:

  1. Name of the MongoDB database (it can be an arbitrary name)
  2. (Deprecated) Privilege level of the user (0 is fine)
  3. URL of the login page containing the HTML login form
  4. Login data for the user login, e.g., log and pwd are the names of the form's input fields
  5. Session identifier; if you want to use the credentials in parallel runs of jÄk with the same database, use a value greater than 1 (see the sketch after this list)
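For instance, two parallel runs sharing the same database might use two User objects that differ only in their session value. This is a sketch; the exact session semantics are an assumption based on the parameter description above:

from utils.user import User

# Same database and credentials, distinct sessions for two parallel runs.
user_run1 = User("Wordpress", 0, "http://localhost:8080/wp-login.php", login_data={"log": "admin", "pwd": "admin"}, session="1")
user_run2 = User("Wordpress", 0, "http://localhost:8080/wp-login.php", login_data={"log": "admin", "pwd": "admin"}, session="2")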

1.2 Crawler and Attacker Configuration

url = "http://localhost/
[...]
crawler_config = CrawlConfig("jÄk", url, max_depth=3, max_click_depth=3, crawl_speed=CrawlSpeed.Fast)
attack_config = AttackConfig(url)

where:

  • max_depth is the maximum depth of the web application's link tree;
  • max_click_depth is the maximum depth of click events that are fired;
  • crawl_speed specifies how long the crawler waits after loading a page or triggering an event (see the example after this list). These are the possible values:
    • CrawlSpeed.Slow:
      • wait after loading: 1 sec.
      • wait after event: 2 sec.
    • CrawlSpeed.Medium:
      • wait after loading: 0.3 sec.
      • wait after event: 1 sec.
    • CrawlSpeed.Fast:
      • wait after loading: 0.1 sec.
      • wait after event: 0.5 sec.
    • CrawlSpeed.Speed_of_Lightning:
      • wait after loading: 0.01 sec.
      • wait after event: 0.1 sec.
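For example, a more patient configuration for a slow test application might look like this (a sketch using the same classes as above; the depth values are arbitrary):

crawler_config = CrawlConfig("jÄk", url, max_depth=5, max_click_depth=2, crawl_speed=CrawlSpeed.Slow)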

1.3 Database

database_manager = DatabaseManager(user, dropping=True)

where user is the User instance created in Section 1.1.

2. Setting up the Crawler

To run the crawler, use:

crawler = Crawler(crawl_config=crawler_config, database_manager=database_manager)
crawler.crawl(user)

You can also set up an HTTP proxy between the crawler and the web application (e.g., localhost:8082):

crawler = Crawler(crawl_config=crawler_config, database_manager=database_manager, proxy="localhost", port=8082)
crawler.crawl(user)
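Putting the pieces together, a minimal driver script could look like the sketch below. The import paths other than utils.user.User are assumptions; crawler/example.py remains the authoritative reference for the actual module layout:

# Minimal end-to-end sketch assembling the snippets from the sections above.
# NOTE: the import paths below (except utils.user.User) are assumptions;
# check crawler/example.py for the actual module layout.
from utils.user import User
from crawler import Crawler, CrawlConfig, CrawlSpeed, DatabaseManager  # assumed paths

url = "http://localhost/"

# Credentials and login page (Section 1.1).
user = User("Wordpress", 0, "http://localhost:8080/wp-login.php", login_data={"log": "admin", "pwd": "admin"}, session="1")

# Crawler configuration (Section 1.2).
crawler_config = CrawlConfig("jÄk", url, max_depth=3, max_click_depth=3, crawl_speed=CrawlSpeed.Fast)

# Database setup (Section 1.3).
database_manager = DatabaseManager(user, dropping=True)

# Crawl the target application (Section 2).
crawler = Crawler(crawl_config=crawler_config, database_manager=database_manager)
crawler.crawl(user)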

Papers and further reading

  • C. Tschürtz. Improving Crawling with JavaScript Function Hooking (German original: Verbesserung von Webcrawling durch JavaScript Funktion Hooking). Master's thesis.
  • G. Pellegrino, C. Tschürtz, E. Bodden, and C. Rossow. jÄk: Using Dynamic Analysis to Crawl and Test Modern Web Applications. In Proceedings of the International Symposium on Research in Attacks, Intrusions and Defenses (RAID 2015).

Contacts

  • C. Tschürtz [constantin dot tschuertz (at) gmail dot com]
  • G. Pellegrino [gpellegrino (at) cispa dot saarland]

License

jÄk is released under the GNU General Public License version 3 or later (see LICENSE.txt).