R Client for the Abbyy Cloud OCR


Access Abbyy Cloud OCR from R


Easily OCR images, barcodes, forms, and documents with machine-readable zones (e.g., passports) right from R. Get the results in a wide variety of formats, from plain text files to detailed XML files with information such as bounding boxes.

The package provides access to the Abbyy Cloud OCR SDK API. Details about results of calls to the API can be found here.


To get the latest version on CRAN:

install.packages("abbyyR")


To get the current development version from GitHub:

# install.packages("devtools")
devtools::install_github("soodoku/abbyyR", build_vignettes = TRUE)

Using abbyyR

To get acquainted with some of the important functions, read the vignettes:

# Overview of the package
vignette("introduction", package = "abbyyR")
# Examples of selected functions along with their output
vignette("example", package = "abbyyR")
# How to scrape text from a folder of images
vignette("wiscads", package = "abbyyR")

The quality of the final output varies with the complexity of the layout, the resolution of the image, the font face, and so on. To measure the quality of the OCR output, you can compute the edit distance to a hand-coded "gold standard" sample using recognize. To run a quick edit-distance-based search and replace to fix messy data, you can use turbo search and replace.
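The edit-distance idea above can be sketched in a few lines of base R. This is only an illustration, not how the recognize tool works: the helper name `ocr_error_rate` is hypothetical, and the sketch assumes the OCR output and the gold-standard text are already loaded as character vectors of equal length. It uses `adist()` from the `utils` package, which computes a generalized Levenshtein distance.

```r
# Hypothetical helper (not part of abbyyR): edit distance between OCR output
# and a hand-coded reference, normalized by the length of the reference.
ocr_error_rate <- function(ocr_text, gold_text) {
  stopifnot(is.character(ocr_text), is.character(gold_text),
            length(ocr_text) == length(gold_text))

  # adist() returns a matrix of distances; compare each pair element-wise.
  dists <- mapply(function(o, g) adist(o, g)[1, 1],
                  ocr_text, gold_text, USE.NAMES = FALSE)

  # Divide by the reference length; guard against empty reference strings.
  dists / pmax(nchar(gold_text), 1)
}

# Made-up example strings, for illustration only.
ocr  <- "Tbe quick brovvn fox"
gold <- "The quick brown fox"
rate <- ocr_error_rate(ocr, gold)
print(rate)
```

A rate of 0 means the OCR output matches the reference exactly; larger values mean more character-level edits are needed per reference character.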


Scripts are released under the MIT License.

Contributor Code of Conduct

The project welcomes contributions from everyone! In fact, it depends on them. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.