Chafe is a web scraping library for Scala. It provides a DSL for fetching web pages, following links, extracting content, filling in forms and more.
import org.chafed._
for {
githubProject <- UserAgent GET("https://github.com/ofrasergreen/chafed")
treeBrowser <- githubProject $(".tree-browser")
readmePage <- treeBrowser click("README.md")
readme <- readmePage click$("#raw-url")
} println(readme)
This uses a Scala for-comprehension to compose a sequence of actions:
- Fetch Chafe's GitHub project page.
- Extract the HTML for the project tree browser by using a CSS selector to find the element with the tree-browser class.
- Follow the link containing the text "README.md".
- Follow the link to the "RAW" content by using a CSS selector to find the element with the raw-url ID.
- Print its content.
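To illustrate how a for-comprehension chains steps like these, here is a minimal sketch in plain Scala. It does not use Chafe: `fetch`, `extractLink`, `fetchReadme` and the `pages` map are hypothetical stand-ins, and `Option` is only assumed as the result type for illustration. Chafe's actual types may differ.

```scala
// Hypothetical stand-ins for scraping steps; these are NOT Chafe's API.
object ForComprehensionSketch {
  // A fake "site" mapping URLs to page text, so the sketch needs no network.
  val pages: Map[String, String] = Map(
    "http://example.test/"       -> "link:/readme",
    "http://example.test/readme" -> "Hello from the README"
  )

  // Each step returns an Option, so a failed step short-circuits the rest.
  def fetch(url: String): Option[String] = pages.get(url)

  def extractLink(page: String): Option[String] =
    if (page.startsWith("link:")) Some(page.stripPrefix("link:")) else None

  // Steps composed with a for-comprehension.
  def fetchReadme(start: String): Option[String] =
    for {
      home <- fetch(start)
      link <- extractLink(home)
      body <- fetch(start.stripSuffix("/") + link)
    } yield body

  // Roughly what the compiler generates for the comprehension above.
  def fetchReadmeDesugared(start: String): Option[String] =
    fetch(start).flatMap { home =>
      extractLink(home).flatMap { link =>
        fetch(start.stripSuffix("/") + link).map(body => body)
      }
    }
}
```

Both functions return the same result for any input. With `yield`, the compiler rewrites the comprehension into `flatMap` and `map` calls. Without `yield`, as in the Chafe example above, it uses nested `foreach` calls instead, and the body (`println(readme)`) runs only if every step produces a value.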
See samples for more examples.
To use Chafe in your own sbt project, add the following to your build.sbt:
libraryDependencies += "org.chafed" %% "chafed" % "0.2"
Use sbt to build from source:
$ sbt clean update package
The finished jar will be under the target directory.