Skip to content
This repository has been archived by the owner on Dec 16, 2021. It is now read-only.
/ slab Public archive

Tutorial for building an OCR web application

Notifications You must be signed in to change notification settings

jeremyevans/slab

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

= Slab Tutorial

This tutorial is a walkthrough of building a simple OCR web application.
First, install the necessary gems:

  gem install -g Gemfile

Second, make sure tesseract (OCR program) is installed.  How you do this
depends on the operating system you are using.

Then, start the tutorial by going to the first commit:

  git checkout 0

After reviewing the README and changes in that commit, you can move to
the next commit:

  git checkout 1

= 0

Welcome to the Slab tutorial.  This tutorial is going to walk you through
building a simple ruby web application that does OCR (optical character
recognition) on uploaded files.

= 1

When designing a website, the most important thing to consider is data
storage.  In most cases, the data the application stores is much more
valuable than the application itself, so starting with the data design
is recommended.

We'll be using PostgreSQL for Slab's data storage.  We first add a
Rakefile and a rake task for creating the user and database for Slab.

= 2

We'll be using the ruby Sequel library for accessing PostgreSQL.
We first define the table used for data storage via a Sequel migration.

= 3

We'll be using the ruby Roda web framework.  We'll start with a simple
hello world in Roda.

= 4

We'll expand on our hello world to use a layout and have a page for
uploading a document image.

= 5

We'll start storing uploaded files, without OCRing them.  To do that,
we'll add a Sequel model.

= 6

Currently, we aren't showing any of the uploaded documents.  Let's start
doing that.

= 7

Let's actually start doing OCR on uploaded documents.  Unfortunately,
the OCR process takes a long time, so you don't want to do it while
handling a request.  Instead, we'll handle it using a background job
system, using PostgreSQL's LISTEN/NOTIFY.

= 8

Now that we can get the text of uploaded files, lets try adding a
search engine, so you can search for text and see the uploaded image.

= 9

Let's speed up our full text search using a full text index.

= 10

Next we'll add support for uploading document images using a
command line program instead of the web interface.

= 11

Now that we can upload a bunch of documents images at once, let's
make the OCR process multithreaded.

= 12

In order to make development easier, we'll add support for reloading
the application and the related models.

= 13

Next we'll start using the Bootstrap CSS framework so things look a
little less ugly. 

= 14

When we originally started displaying the uploaded images, we weren't
correctly setting the Content-Type when serving the image, and we
weren't handling image types that aren't viewable by the browser, such
as .tif documents.  Let's handle those cases now.

= 15

Our OCR program is so useful that other people want to use it, but they
only want to see documents they uploaded.  So let's go multiuser and add
an authentication framework.

= 16

When we went multiuser, our command line uploader broke, since it tries
to post without including the CSRF token.  Let's reallow command line
uploads using a token for authentication.

= 17

The last part of our OCR web application is also the most complex. We'll
be adding a websocket to our page that lists uploaded document images,
so that the page gets updated in real time as documents are uploaded.
We'll use the Roda websocket plugin in the backend, and we'll write the
front end in ruby using the Opal ruby-to-javascript compiler, using the
opal-browser library to make things easier.

About

Tutorial for building an OCR web application

Resources

Stars

Watchers

Forks

Packages

No packages published