tabulizer: Bindings for Tabula PDF Table Extractor Library #42

@leeper

    1. What does this package do? This package wraps the Tabula Java library, which can (very accurately) extract tables from PDF documents. It also implements some lower-level utilities for working with PDF documents (metadata and text extraction, image conversion, split/merge). It should be useful for extracting scientific data, especially tabular data, from PDFs such as scientific articles or agency reports.
    1. Paste the full DESCRIPTION file inside a code block below.
Package: tabulizer
Type: Package
Title: Bindings for Tabula PDF Table Extractor Library
Version: 0.1.11
Date: 2016-05-07
Authors@R: c(person("Thomas J.", "Leeper", role = c("aut", "cre"),
                    email = "thosjleeper@gmail.com"))
Maintainer: Thomas J. Leeper <thosjleeper@gmail.com>
Description: Bindings for the Tabula <http://tabula.technology/> Java library, which can extract tables from PDF documents.
License: MIT + file LICENSE
URL: https://github.com/leeper/tabulizer
BugReports: https://github.com/leeper/tabulizer/issues
Imports:
    graphics,
    grDevices,
    utils,
    tools,
    tabulizerjars,
    rJava,
    png
Suggests:
    testthat,
    knitr
RoxygenNote: 5.0.1

    1. URL for the package (the development repository, not a stylized html page) https://github.com/leeper/tabulizer
    1. What data source(s) does it work with (if applicable)? n/a
    1. Who is the target audience? Data scientists stuck with other people's data
    1. Are there other R packages that accomplish the same thing? If so, what is different about yours? The closest is pdftools, which is a libpoppler wrapper. tabulizer has some overlap with it, but the core functionality, table extraction, is not supported by pdftools.
    1. Check the box next to each policy below, confirming that you agree. These are mandatory.
    • This package does not violate the Terms of Service of any service it interacts with.
    • The repository has continuous integration with Travis CI and/or another service
    • The package contains a vignette
    • The package contains a reasonably complete README with devtools install instructions
    • The package contains unit tests
    • The package only exports functions to the NAMESPACE that are intended for end users
    1. Do you agree to follow the rOpenSci packaging guidelines? These aren't mandatory, but we strongly suggest you follow them. If you disagree with anything, please explain.
    • Are there any package dependencies not on CRAN? Yes: some of the package's files live in a dependent package (tabulizerjars), which can be put on CRAN once everything is ready to release.
    • Do you intend for this package to go on CRAN?
    • Does the package have a CRAN accepted license?
    • Did devtools::check() produce any errors or warnings? If so paste them below.
    1. Please add explanations below for any exceptions to the above:
    1. If this is a resubmission following rejection, please explain the change in circumstances.
