A visually stimulating web crawler. Displays a live-updating graph of all domains visited.
Requires Leiningen.
Build with lein uberjar or lein bin, then run:
./target/webvis <URL> [-h --help] [-d --depth] [-b --blacklist] [-w --workers] [-c --concurrency]
The crawler begins crawling at the URL provided.
If a depth is specified, it will crawl no deeper than that depth, where the root domain is at depth 0. Depth counts domains crawled from the root domain, not URLs crawled from the root URL; at depth 1, for example, the crawler visits the root domain and the domains it links to directly, but no further.
The number of worker threads and the maximum number of concurrent requests can also be set. Usually only 1 worker is needed, and a lower request concurrency is preferable so the crawler won't overload any servers. The defaults are 1 worker and a maximum of 2 concurrent requests.
Blacklisted domains will not be crawled.
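For example, to crawl from http://example.com to a depth of 3 with one worker, two concurrent requests, and facebook.com blacklisted (assuming each flag takes its value as the following argument):
./target/webvis http://example.com -d 3 -w 1 -c 2 -b facebook.com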
To create a spider:
create-spider: [max-concurrent-reqs blacklist] [max-concurrent-reqs]
(def spider (create-spider 2 ["facebook.com" "yahoo.com"]))
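Per the second arity above, the blacklist is optional, so a spider can also be created with only a request limit:
(def spider (create-spider 2))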
The spider can then begin crawling with:
build-web: [spider url worker-count max-depth]
(build-web spider "http://example.com" 1 4)
A max depth of -1 will cause the spider to crawl forever.
To stop the spider from crawling:
(freeze! spider)
This removes all workers. The spider will resume crawling once another worker is added:
(spawn-worker spider)
To kill a spider, rendering it forever unusable:
;; eek!
(kill! spider)
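Putting it all together, a full session might look like this (a minimal sketch using only the functions documented above):
;; create a spider allowing 2 concurrent requests, with facebook.com blacklisted
(def spider (create-spider 2 ["facebook.com"]))
;; crawl from example.com with 1 worker, to a maximum depth of 3
(build-web spider "http://example.com" 1 3)
;; pause by removing all workers, then resume with a fresh one
(freeze! spider)
(spawn-worker spider)
;; permanently shut the spider down
(kill! spider)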
Copyright © 2015 FIXME
Distributed under the Eclipse Public License v1.0 (https://www.eclipse.org/legal/epl-v10.html)