clj-scraper

A web scraper for personal enjoyment and experiments with core.async. Supports two websites for your scraping pleasure.

Requirements

Leiningen
JDK >= 1.6

Building

$ lein uberjar

Usage

$ java -jar target/scraper-0.3.1-standalone.jar

Options

-c, --cache [dir]           cache files directory
-o, --output [dir]          downloaded images directory
-w, --workers [num]         number of download workers
-d, --debug                 display debug info
-s, --source [ngo|vrotmne]  handle of website to scrape
-S, --skip [num]            skip first num posts of LJ
-L, --list-only             save image urls, but don't download
-x, --exit-on-exist         exit the process if downloaded file exists
-h, --help                  print this help

Examples

$ java -jar target/scraper-0.3.1-standalone.jar -w 20 -s ngo
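
Two further invocations, sketched from the option list above. The ./cache and ./images paths are placeholders chosen for illustration, and the behavior described in the comments is only what the help text suggests; it has not been verified against the code.

```shell
# Record image URLs from the vrotmne source without downloading them, with debug output
$ java -jar target/scraper-0.3.1-standalone.jar -s vrotmne -L -d

# Download from ngo with 10 workers into explicit cache and output directories
# (placeholder paths), exiting once an already-downloaded file is found
$ java -jar target/scraper-0.3.1-standalone.jar -s ngo -w 10 -c ./cache -o ./images -x
```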

License

Distributed under the Eclipse Public License, the same as Clojure.
