Access Abbyy Cloud OCR from R
Easily OCR images, barcodes, forms, and documents with machine-readable zones (e.g., passports) right from R. Get the results in a wide variety of formats, from plain text files to detailed XML with information such as bounding boxes.
To get the latest version on CRAN:
install.packages("abbyyR")
To get the current development version from GitHub:
# install.packages("devtools")
devtools::install_github("soodoku/abbyyR", build_vignettes = TRUE)
To get acquainted with some of the important functions, read the vignettes:
# Overview of the package
vignette("introduction", package = "abbyyR")
# some functions are used along with output
vignette("example", package = "abbyyR")
# how to scrape text from a folder of images
vignette("wiscads", package = "abbyyR")
The quality of the final output depends on factors such as the complexity of the layout, the image resolution, and the font face. To measure the quality of the OCR, you can compute the edit distance between the output and a "gold standard" coded sample using recognize. For quick edit-distance-based search and replace to fix messy data, you can use turbo search and replace.
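To make the edit-distance idea concrete, here is a minimal sketch in base R. It does not use recognize or abbyyR; it relies only on utils::adist, and the function name ocr_error_rate and the sample strings are made up for illustration. It assumes the OCR output and the gold-standard text are aligned element by element and that no gold-standard string is empty.

```r
# Per-document character error rate: Levenshtein distance between the OCR
# output and the gold-standard text, divided by the gold-standard length.
# ocr_error_rate is a hypothetical helper, not part of abbyyR or recognize.
ocr_error_rate <- function(ocr_text, gold_text) {
  stopifnot(length(ocr_text) == length(gold_text))
  # utils::adist returns a 1 x 1 matrix for a single pair; convert to a number
  dists <- mapply(
    function(o, g) as.numeric(utils::adist(o, g)),
    ocr_text, gold_text,
    USE.NAMES = FALSE
  )
  dists / nchar(gold_text)
}

# Made-up example data: two short snippets with typical OCR confusions
gold <- c("The quick brown fox", "Invoice No. 10452")
ocr  <- c("The qu1ck brown f0x", "Invoice No. 1O452")

rates <- ocr_error_rate(ocr, gold)
print(round(rates, 3))
```

Lower values mean the OCR output is closer to the gold standard; a value of 0 means an exact match. Normalizing by the gold-standard length makes scores comparable across documents of different sizes.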
Scripts are released under the MIT License.
Contributor Code of Conduct
The project welcomes contributions from everyone! In fact, it depends on it. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.