Very naive crawler implemented in clojure
Naive Crawler

A very naive crawler written a while ago in Clojure. It takes an initial URL, saves that page's content to disk, then recursively follows every link on the page that belongs to the same domain, saving each page as it goes, until there is nothing left... not even memory ;)
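The recursive fetch-save-follow loop described above could be sketched roughly as follows. This is an illustrative sketch only, not the actual code in `src/naive_crawler`; the names `fetch`, `extract-links`, `save-page!`, and `crawl` are hypothetical, and the regex-based link extraction is deliberately crude:

```clojure
(ns naive-crawler.sketch
  (:require [clojure.string :as str])
  (:import (java.net URL)))

(defn fetch [url]
  ;; Download the page body as a string. No error handling -- naive on purpose.
  (slurp url))

(defn same-domain? [base-url link]
  ;; Compare hosts so we only follow links on the starting domain.
  (= (.getHost (URL. base-url)) (.getHost (URL. link))))

(defn extract-links [base-url html]
  ;; Pull absolute href values out of the HTML with a crude regex,
  ;; keeping only links on the same domain as the starting URL.
  (->> (re-seq #"href=\"(https?://[^\"]+)\"" html)
       (map second)
       (filter (partial same-domain? base-url))))

(defn save-page! [url html]
  ;; Persist the page under a filename derived from the URL.
  (spit (str (str/replace url #"[^a-zA-Z0-9]" "_") ".html") html))

(defn crawl
  "Recursively fetch, save, and follow same-domain links until none remain.
   The `seen` set prevents revisiting a URL, but nothing bounds the recursion
   depth or the in-memory state -- hence the 'not even memory' caveat."
  ([url] (crawl url #{}))
  ([url seen]
   (if (contains? seen url)
     seen
     (let [html (fetch url)]
       (save-page! url html)
       (reduce (fn [seen' link] (crawl link seen'))
               (conj seen url)
               (extract-links url html))))))
```

The recursion is unbounded by design: on any reasonably large site it will eventually blow the stack or exhaust memory, which is exactly the failure mode the description jokes about.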

For a real web crawler, take a look at Bixo, which uses Java and Hadoop. If you intend to build your own, a nice option is to use clj-sys/works, and maybe Neo4j.