Web scraper for Scala

Chafe

Chafe is a web scraping library for Scala. It provides a DSL for fetching web pages, following links, extracting content, filling in forms and more.

Example

import org.chafed._
for {
  githubProject <- UserAgent GET("https://github.com/ofrasergreen/chafed")
  treeBrowser <- githubProject $(".tree-browser")
  readmePage <- treeBrowser click("README.md")
  readme <- readmePage click$("#raw-url")
} println(readme)

This uses Scala's for-comprehension to compose a sequence of actions that:

  1. Fetch Chafe's GitHub project page.
  2. Extract the HTML of the project tree browser, using a CSS selector to find the element with the tree-browser class.
  3. Follow the link containing the text "README.md".
  4. Follow the link to the "RAW" content, using a CSS selector to find the element with the raw-url ID.
  5. Print that content.
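
For readers new to for-comprehensions, the sketch below shows how a chain like this is rewritten by the compiler. It does not use Chafe's API: the `follow` function and the `links` table are hypothetical stand-ins built on the standard library's `Option`, chosen only because each step either yields the input for the next step or stops the chain. How Chafe's own action types behave is not stated here.

```scala
// Stand-in steps built on Option; none of these names come from Chafe.
object ForComprehensionSketch {
  // Hypothetical table standing in for pages and the links they contain.
  val links: Map[String, String] = Map(
    "project"    -> "tree",
    "tree"       -> "readmePage",
    "readmePage" -> "raw README text"
  )

  // Each step may fail (None) or produce the input of the next step.
  def follow(from: String): Option[String] = links.get(from)

  // Same shape as the example above: generators, no `yield`, one side effect.
  def viaFor(start: String, sink: String => Unit): Unit =
    for {
      tree <- follow(start)
      page <- follow(tree)
      raw  <- follow(page)
    } sink(raw)

  // Without `yield`, the compiler rewrites the loop into nested foreach calls.
  def viaForeach(start: String, sink: String => Unit): Unit =
    follow(start).foreach { tree =>
      follow(tree).foreach { page =>
        follow(page).foreach { raw =>
          sink(raw)
        }
      }
    }

  def main(args: Array[String]): Unit = {
    viaFor("project", s => println(s))
    viaForeach("project", s => println(s))
    viaFor("missing", s => println(s)) // first step fails, so nothing is printed
  }
}
```

If any step produces nothing, the later steps and the final action are skipped, which is why the chain can be written without explicit error checks between steps.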

See the samples directory for more examples.

Building and Installation

To use Chafe in your own sbt project, add the following to your build.sbt:

libraryDependencies += "org.chafed" %% "chafed" % "0.2"
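
For context, a minimal build.sbt containing that line might look like the sketch below. The project name and the Scala version are placeholders, not values taken from Chafe. The `%%` operator makes sbt append the Scala binary version to the artifact name, so `scalaVersion` has to be a version for which chafed 0.2 was actually published; this README does not say which versions those are.

```scala
// build.sbt (sketch; name and scalaVersion are assumed placeholders)
name := "my-scraper"

// Must match a Scala version that chafed 0.2 was published for.
scalaVersion := "2.10.7"

// With the setting above, %% resolves the artifact as "chafed_2.10".
libraryDependencies += "org.chafed" %% "chafed" % "0.2"
```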

Use sbt to build from source:

$ sbt clean update package

The finished jar will be under the target directory (sbt usually places it in a scala-<version> subdirectory).