HTCRAWL

Htcrawl is a Node.js module for the recursive crawling of single page applications (SPA) using JavaScript.
It uses headless Chrome to load and analyze web applications, and it is built on top of Puppeteer, from which it inherits all its functionality.

With htcrawl you can roll your own DOM-XSS scanner in fewer than 60 lines of JavaScript (see domdig).

More information is available at htcrawl.org.

SAMPLE USAGE

const htcrawl = require('htcrawl');

// await is only valid inside an async function in a CommonJS script
(async () => {
  const crawler = await htcrawl.launch("https://htcrawl.org");

  // Print out the url of ajax calls
  crawler.on("xhr", e => {
    console.log("XHR to " + e.params.request.url);
  });

  // Start crawling!
  await crawler.start();
})();
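
Instead of printing each request, the same "xhr" event can feed a small collector. The sketch below is only an illustration: createXhrCollector is a hypothetical helper, not part of htcrawl, and it assumes that events carry the URL at e.params.request.url, as in the sample above.

```javascript
// Hypothetical helper (not part of htcrawl): collects the distinct URLs
// seen in "xhr" events. It assumes each event has the shape used in the
// sample above, i.e. e.params.request.url.
function createXhrCollector() {
  const urls = new Set();
  return {
    // Pass this function to crawler.on("xhr", ...)
    handler(e) {
      urls.add(e.params.request.url);
    },
    // Distinct URLs, in the order they were first seen
    list() {
      return [...urls];
    },
  };
}

// Intended wiring, assuming `crawler` comes from htcrawl.launch():
//   const collector = createXhrCollector();
//   crawler.on("xhr", collector.handler);
//   await crawler.start();
//   console.log(collector.list());
```

Keeping the collector separate from the crawler means it can be exercised with plain objects, without launching a browser.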

DOCUMENTATION

API documentation can be found at https://htcrawl.org/api/.

LICENSE

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

ABOUT

Written by Filippo Cavallarin. This project is a descendant of Htcap (https://github.com/fcavallarin/htcap | https://htcap.org).
