rdbl.erl - Erlang readability library

This is an Erlang library that extracts the main readable content from HTML pages and removes the surrounding junk. It is based on ideas from readability.js by arc90.

Installation

cd src/ && make

Examples

1> rdbl:simplify_url("http://www.somesite.at/internet/") -> simplified page text as string()

2> rdbl:simplify_url("http://www.somesite.at/internet/", "out.html") -> ok

3> rdbl:simplify_file("input.html", "out.html") -> ok

4> rdbl:simplify_page(HtmlPageText) -> PageTextSimplified

See other examples in rdbl.erl.
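
As a minimal sketch of how the calls above might be combined in your own code, the module below reads a file, passes its text through a simplifier fun, and writes the result. The module name `rdbl_example` and both function names are hypothetical and not part of the library. The sketch also assumes that `rdbl:simplify_page/1` accepts the page as a string and returns something `file:write_file/2` can write; check rdbl.erl before relying on this.

```erlang
-module(rdbl_example).
-export([simplify_with/3, simplify_file_via_page/2]).

%% Hypothetical helper, not part of rdbl.
%% Reads InFile, applies Simplify to its contents (as a string),
%% and writes whatever Simplify returns to OutFile.
simplify_with(Simplify, InFile, OutFile) when is_function(Simplify, 1) ->
    case file:read_file(InFile) of
        {ok, Bin} ->
            Simplified = Simplify(binary_to_list(Bin)),
            file:write_file(OutFile, Simplified);
        {error, Reason} ->
            {error, {read_failed, InFile, Reason}}
    end.

%% Assumption: rdbl:simplify_page/1 takes the page text as a string
%% and returns iodata, as example 4 above suggests.
simplify_file_via_page(InFile, OutFile) ->
    simplify_with(fun rdbl:simplify_page/1, InFile, OutFile).
```

With rdbl compiled and on the code path, calling `rdbl_example:simplify_file_via_page("input.html", "out.html")` from the shell should behave much like example 3, while also returning a tagged error tuple if the input file cannot be read. Passing the simplifier as a fun keeps the file handling testable without the library.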

Dependencies

The library uses the mochiweb HTML parser to parse HTML content (included). Only the following mochiweb files are needed: mochinum.erl, mochiutf8.erl, mochiweb_charref.erl, mochiweb_html.erl.
