Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
Branch: master
Fetching contributors…

Cannot retrieve contributors at this time

40 lines (23 sloc) 1.778 kB
LinkCrawl, this is an ongoing progress to create a single script that can be used for all matter of pentesting/research needs.
Some functionality that will be added over time is:
* Extract all links from a page (done)
* Extract all links with same origin of the page (done)
* Extract all text content from a given class (done)
* Recursive searching
* Run custom recursive searches for specific content on various sites
* Graphviz visualization of links/content
As an example of what linkcrawl should do, there is a enum_pastie script, which is an example how you can use linkcrawl to enumerate pastie.org to find interesting pasties. An ongoing effort is to minimse the needed code for custom crawlers.
Try some of the following queries: "123456 qwerty", "DB_PASSWORD", "phpmyadmin", "connect(", "exploit"
For example:
tony@enigma:~/2code/linkcrawl$ ./enum_pastie.py
[+] LinkCrawl: Enumerate pastie.org, enter query: 123456 qwerty
[+] Enter output file name: passwords
[+] Got links, dumping content to file, delimiter is: EOF EOF EOF
[+] Working: .............................................................................
[+] Enumeration done, output is in: passwords.html
After that, open the passwords.html in a web browser and enjoy :) More stuff will be added later on as i have time.
This script requires LXML (python-lxml) library. Make sure you install it before using the script.
Don't expect an extensive GUI or CLI implementation, this is mostly a header, so you can prototype your custom searchers/parsers/crawlers faster, CLI parameters will be minimal at best, i like to use this with ipython.
Usage is simple : ./linkcrawl.py [OPTION] {,http://}some.web.site
Options:
-o: print only those that have the same origin as the requested site
Jump to Line
Something went wrong with that request. Please try again.