An R package for assembling data frames from HTML tables (fka htmltable)

htmltab: Hassle-free HTML tables in R

HTML tables are a valuable data source, but extracting these data and recasting them into a useful format can be tedious. htmltab is a package for extracting structured information from HTML tables. It is similar to readHTMLTable() from the XML package but provides two major advantages. First, the function automatically expands row and column spans in the header and body cells. Second, users have more control over which rows are identified as header and body rows and therefore end up in the R table. Additionally, the function preprocesses the table code and removes unneeded parts, which reduces the need for post-processing.

Installation

The package is available from CRAN and GitHub. For the stable release version, install from CRAN:

install.packages("htmltab")

For the development version, install from my GitHub repo. You can do this directly from inside R:

install.packages("devtools")
devtools::install_github("crubba/htmltab")

Usage

To see htmltab in action, take a look at the case studies in the package vignette or this blog post, or consult the package manual.
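As a quick illustration of the span expansion described in the introduction, here is a minimal sketch that runs without network access. The table contents and the object names (`html`, `doc`, `scores`) are made up for this example, and parsing the string first with `XML::htmlParse(asText = TRUE)` is just one convenient way to feed an in-memory document to `htmltab()`; in practice you would more likely pass a URL or a file name as `doc`.

```r
library(XML)
library(htmltab)

# Hypothetical table: the "Red" cell spans two body rows via rowspan
html <- '
<html><body>
<table>
  <tr><th>Team</th><th>Player</th><th>Points</th></tr>
  <tr><td rowspan="2">Red</td><td>Ann</td><td>10</td></tr>
  <tr><td>Bob</td><td>7</td></tr>
  <tr><td>Blue</td><td>Cy</td><td>12</td></tr>
</table>
</body></html>'

# Parse the string into an HTML document object
doc <- htmlParse(html, asText = TRUE)

# which = 1 selects the first table in the document
scores <- htmltab(doc = doc, which = 1)

# The spanned "Red" cell should be repeated for both of its rows
print(scores)
```

The `which` argument also accepts an XPath string, which can be more robust than a position when a page contains many tables.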

Report issues

If you run into problems with htmltab, I would like to hear about them so I can improve the project. Please report issues on my GitHub repo.