Description
===========

Docker container for running OCRopus commands. This bundles up:

    - CUDA and PyTorch
    - A Python 2.7 Conda installation
    - The major modules from OCRopus3

PyTorch is currently at version 0.3.1, since OCRopus hasn't been converted yet
to PyTorch 0.4.

You need to set a Jupyter password (see `jupyter_config/README`) and
a VNC password (see `vnc_config/README`).

To clone this repository, use the `--recursive` flag:

    git clone --recursive git@github.com:tmbdev/ocropus3-docker.git
    cd ocropus3-docker
    ./build

The `ocropus3-docker` repository builds a Docker container that you can use to run OCRopus3 on any platform that can run Docker.

- `./build` -- build the container
- `./ocropy` -- run the container

The container automatically starts a VNC server for graphical output. Inside the container is a complete OCRopus3 installation.

# Docker Container

Some notes on the Docker container:

- the PyTorch version is 0.3.1; OCRopus hasn't been ported to 0.4 yet
- the Python installation is in `/opt/conda`, separate from the regular Ubuntu installation
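
Because the container has two Python installations, it can be useful to confirm which one a script is running under. The following is a minimal sketch, not part of OCRopus3; the helper name `uses_conda` is hypothetical, and the only fact taken from the notes above is the `/opt/conda` prefix.

```python
import sys

CONDA_PREFIX = "/opt/conda"  # install location stated in the notes above


def uses_conda(executable, prefix=CONDA_PREFIX):
    """Return True if the interpreter path lies under the Conda prefix."""
    # Compare against "prefix/" so that e.g. /opt/conda2 does not match.
    return executable.startswith(prefix.rstrip("/") + "/")


if __name__ == "__main__":
    # Inside the container this is expected to report the Conda interpreter;
    # on the host it will usually report something else.
    print(sys.executable, uses_conda(sys.executable))
```

Running this with the system `python` and with the Conda `python` should give different answers if the two installations are really separate.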

    !head Dockerfile; echo ...; tail Dockerfile

    FROM nvidia/cuda:9.0-base
    #FROM nvidia/cuda:9.1-base
    #FROM nvidia/cuda:9.2-devel-ubuntu18.04
    MAINTAINER Tom Breuel <tmbdev@gmail.com>

    ENV DEBIAN_FRONTEND noninteractive
    ENV DEBCONF_NONINTERACTIVE_SEEN true

    RUN apt-get -y update

    ...
    ADD scripts/* /usr/local/bin/

    RUN true \
        && echo ". /opt/conda/etc/profile.d/conda.sh" >> $HOME/.bashrc \
        && echo "conda activate base" >> $HOME/.bashrc \
        && chown -R $USER.$USER $HOME
    RUN echo 'user ALL=(ALL:ALL) NOPASSWD:ALL' >> /etc/sudoers

    USER $UID
    ENTRYPOINT ["runcmd"]


    !./build > log 2>&1
    !head log; echo ...; tail log

    Sending build context to Docker daemon  150.4MB
    Step 1/67 : FROM nvidia/cuda:9.0-base
     ---> f3a8582463d4
    Step 2/67 : MAINTAINER Tom Breuel <tmbdev@gmail.com>
     ---> Using cache
     ---> 488754540ac4
    ...
     ---> Using cache
     ---> e4114a076054
    Step 66/67 : USER $UID
     ---> Using cache
     ---> 10c9fd428517
    Step 67/67 : ENTRYPOINT ["runcmd"]
     ---> Using cache
     ---> 1ccdca692521
    Successfully built 1ccdca692521
    Successfully tagged ocropy:latest
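
Since the build output is redirected to `log`, a script can check whether the build succeeded instead of reading the file by eye. The sketch below assumes the classic Docker builder log format shown above (`Step N/M`, `---> Using cache`, `Successfully built`, `Successfully tagged`); other builders print different output. The function name `summarize_build_log` is hypothetical.

```python
import re


def summarize_build_log(text):
    """Extract step counts, cache hits, image id and tag from a build log."""
    steps = re.findall(r"^Step (\d+)/(\d+) :", text, flags=re.M)
    cached = re.findall(r"^ *---> Using cache$", text, flags=re.M)
    built = re.search(r"^Successfully built ([0-9a-f]+)$", text, flags=re.M)
    tagged = re.search(r"^Successfully tagged (\S+)$", text, flags=re.M)
    return {
        "total_steps": int(steps[-1][1]) if steps else None,
        "steps_seen": len(steps),       # fewer than total if the log is excerpted
        "cached": len(cached),
        "image_id": built.group(1) if built else None,
        "tag": tagged.group(1) if tagged else None,
    }


# Excerpt copied from the tail of the log above.
SAMPLE = """Step 66/67 : USER $UID
 ---> Using cache
 ---> 10c9fd428517
Step 67/67 : ENTRYPOINT ["runcmd"]
 ---> Using cache
 ---> 1ccdca692521
Successfully built 1ccdca692521
Successfully tagged ocropy:latest
"""

summary = summarize_build_log(SAMPLE)
print(summary["tag"], summary["image_id"])
```

A missing `image_id` in the result is a reasonable signal that the build did not complete.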


Segmentation
============

Train a model with:

    ./ocropy ocroseg-train -d http://storage.googleapis.com/lpr-ocr/uw3-framed-lines.tgz

Models will be saved in the current directory.

You can view the training progress by connecting using VNC:

    xtightvncviewer :99
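
If you connect from another machine or need to forward a port, it helps to know which TCP port display `:99` corresponds to. By the usual VNC convention, display `N` listens on port `5900 + N`. The sketch below only does that arithmetic; whether the container actually publishes the port to the host depends on how `./ocropy` starts it, which is an assumption to check. The helper name `vnc_port` is hypothetical.

```python
VNC_BASE_PORT = 5900  # conventional base port for VNC displays


def vnc_port(display):
    """Map a display string such as ':99' or 'host:99' to (host, tcp_port)."""
    host, _, number = display.rpartition(":")
    return (host or "localhost", VNC_BASE_PORT + int(number))


print(vnc_port(":99"))  # display used by the container in this document
```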

    !./ocropy ocroseg-train --maxtrain 10 -d http://storage.googleapis.com/lpr-ocr/uw3-framed-lines.tgz

    + ocroseg-train --maxtrain 10 -d http://storage.googleapis.com/lpr-ocr/uw3-framed-lines.tgz
    raw sample:
    __key__ 'A001BIN'
    __source__ 'http://storage.googleapis.com/lpr-ocr/uw3-framed-lines.tgz'
    lines.png float32 (3300, 2592)
    png float32 (3300, 2592)

    preprocessed sample:
    __key__ <type 'list'> ['A00BBIN']
    __source__ <type 'list'> ['http://storage.googleapis.com/lpr-ocr/uw3-framed-lines.tgz
    input float32 (1, 3300, 2592, 1)
    mask float32 (1, 3300, 2592, 1)
    output float32 (1, 3300, 2592, 1)

    ntrain 0
    model:
    Sequential(
      (0): Conv2d(1, 10, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): BatchNorm2d(10, eps=1e-05, momentum=0.1, affine=True)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
      (4): Conv2d(10, 20, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (5): BatchNorm2d(20, eps=1e-05, momentum=0.1, affine=True)
      (6): ReLU()
      (7): MaxPool2d(kernel_size=(2, 2), stride=(2, 2), dilation=(1, 1), ceil_mode=False)
      (8): Conv2d
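
To see what the first layers of the printed model do to a 3300 x 2592 page, the spatial size can be traced by hand with the standard convolution and pooling output-size formula. This is a sketch covering only layers (0) to (7), since the listing is cut off after that. It assumes the pooling layers use zero padding, which the printout does not show. The helper names are hypothetical and no PyTorch installation is needed.

```python
def out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one axis for a conv or pooling layer (floor mode)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv_params(in_ch, out_ch, kernel):
    """Weights plus biases of a square 2D convolution."""
    return in_ch * out_ch * kernel * kernel + out_ch


# (name, kernel, stride, padding) for the size-changing layers printed above.
# Pool padding of 0 is an assumption; BatchNorm and ReLU keep the size as is.
LAYERS = [
    ("conv0", 3, 1, 1),
    ("pool3", 2, 2, 0),
    ("conv4", 3, 1, 1),
    ("pool7", 2, 2, 0),
]


def trace(height, width):
    """Return the (name, height, width) after each listed layer."""
    shapes = []
    for name, kernel, stride, padding in LAYERS:
        height = out_size(height, kernel, stride, padding)
        width = out_size(width, kernel, stride, padding)
        shapes.append((name, height, width))
    return shapes


for row in trace(3300, 2592):
    print(row)
print(conv_params(1, 10, 3), conv_params(10, 20, 3))
```

The 3x3 convolutions with padding 1 keep the size unchanged, while each 2x2 pooling step halves it, so after layer (7) the feature maps are a quarter of the input resolution along each axis.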

Line Recognition
================

Train a model with:

    ./ocropy ocroline-train -d http://storage.googleapis.com/lpr-ocr/uw3-dew-training.tgz \
            -t http://storage.googleapis.com/lpr-ocr/uw3-dew-testing.tgz

Models will be saved in the current directory.

You can view the training progress by connecting using VNC:

    xtightvncviewer :99

# Kubernetes

To run `ocropus3` on Kubernetes, you need to do the following:

- log into Google Cloud and set up `Config.sh` according to your project
- start up a Kubernetes cluster (`ku init`)
- submit your training job(s) (`kubectl apply -f ocroline-train.yaml`)

On GKE (Google Kubernetes Engine), you...

- write a job description in a YAML file
- use gs:// or http://storage.googleapis.com for your input shards
- save your models periodically to a Google storage bucket
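
The two URL forms mentioned above refer to the same object: `gs://lpr-ocr/uw3-framed-lines.tgz` and `http://storage.googleapis.com/lpr-ocr/uw3-framed-lines.tgz` both appear in this document. A small sketch of the conversion follows; the helper name `gs_to_http` is hypothetical, and the HTTP form is only readable without credentials if the object is public.

```python
def gs_to_http(url, scheme="http"):
    """Rewrite gs://bucket/path as scheme://storage.googleapis.com/bucket/path."""
    prefix = "gs://"
    if not url.startswith(prefix):
        raise ValueError("not a gs:// URL: %r" % url)
    return "%s://storage.googleapis.com/%s" % (scheme, url[len(prefix):])


print(gs_to_http("gs://lpr-ocr/uw3-framed-lines.tgz"))
```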

    !cat Config.sh

    cluster=tmblearn
    zone=us-central1-f
    project=research-191823
    image=gcr.io/$project/ocropy
    cpu_machine=n1-standard-8
    cpu_nodes=3
    gpu_machine=n1-standard-16
    gpu_nodes=2
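
Note that `image` is derived from `project`, so changing the project also changes the image name used in the job files. The sketch below reads such a file from Python and expands `$name` references in order. It handles only plain `key=value` lines like the ones shown, not quoting or other shell syntax, and `parse_config` is a hypothetical helper rather than part of `ku`.

```python
from string import Template


def parse_config(text):
    """Parse simple key=value lines, expanding $name from earlier keys."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and anything that is not key=value
        key, _, raw = line.partition("=")
        values[key.strip()] = Template(raw.strip()).safe_substitute(values)
    return values


# Contents of Config.sh as shown above.
CONFIG = """cluster=tmblearn
zone=us-central1-f
project=research-191823
image=gcr.io/$project/ocropy
cpu_machine=n1-standard-8
cpu_nodes=3
gpu_machine=n1-standard-16
gpu_nodes=2
"""

config = parse_config(CONFIG)
print(config["image"])
```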


    !ku help

    init -- initialize the cluster
    daemonset -- start the NVIDIA daemonset
    status -- cluster status
    pods -- node list
    stats -- node stats
    build -- build the cloud image
    kill -- kill the cluster
    connect -- connect to a cluster
    forward -- connect to a cluster
    help -- display this help


    !ku status

    NAME      LOCATION       MASTER_VERSION  MASTER_IP       MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
    tmblearn  us-central1-f  1.9.7-gke.1     104.198.252.27  n1-standard-8  1.9.7-gke.1   5          RUNNING
    NAME          MACHINE_TYPE    DISK_SIZE_GB  NODE_VERSION
    default-pool  n1-standard-8   100           1.9.7-gke.1
    p100          n1-standard-16  100           1.9.7-gke.1


    !ku stats

    NAME      LOCATION       MASTER_VERSION  MASTER_IP       MACHINE_TYPE   NODE_VERSION  NUM_NODES  STATUS
    tmblearn  us-central1-f  1.9.7-gke.1     104.198.252.27  n1-standard-8  1.9.7-gke.1   5          RUNNING

    NAME         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
    kubernetes   ClusterIP   10.27.240.1   <none>        443/TCP   1d

          1 ocroline-train Running
          1 ocroseg-train Running


    !cat ocroseg-train.yaml

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: ocroseg-train
    spec:
      template:
        spec:
          containers:
          - name: ocroseg-train
            image: gcr.io/research-191823/ocropy
            workingDir: "/tmp"
            command: ["/usr/local/bin/runcmd"]
            args:
            - ocroseg-train
            - "-d"
            - "http://storage.googleapis.com/lpr-ocr/uw3-framed-lines.tgz"
            - "-o"
            - "ocroseg"
            resources:
              requests:
              limits:
                nvidia.com/gpu: "1"
                cpu: 12
                memory: "48000Mi"
          nodeSelector:
            cloud.google.com/gke-accelerator: nvidia-tesla-p100
          restartPolicy: Never
      backoffLimit: 4
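
When several training jobs differ only in name and arguments, the manifest can be generated rather than copied. The sketch below builds the same structure as `ocroseg-train.yaml` as a Python dictionary and serializes it to JSON, which `kubectl apply -f` accepts alongside YAML. The function `make_job` and its defaults are hypothetical, resource quantities are written as strings, and the empty `requests:` key from the file above is left out.

```python
import json


def make_job(name, image, args, gpus=1, cpu=12, memory="48000Mi",
             accelerator="nvidia-tesla-p100"):
    """Build a batch/v1 Job manifest mirroring ocroseg-train.yaml."""
    container = {
        "name": name,
        "image": image,
        "workingDir": "/tmp",
        "command": ["/usr/local/bin/runcmd"],
        "args": list(args),
        "resources": {
            "limits": {
                "nvidia.com/gpu": str(gpus),
                "cpu": str(cpu),
                "memory": memory,
            }
        },
    }
    pod_spec = {
        "containers": [container],
        "nodeSelector": {"cloud.google.com/gke-accelerator": accelerator},
        "restartPolicy": "Never",
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {"template": {"spec": pod_spec}, "backoffLimit": 4},
    }


job = make_job(
    "ocroseg-train",
    "gcr.io/research-191823/ocropy",
    ["ocroseg-train", "-d",
     "http://storage.googleapis.com/lpr-ocr/uw3-framed-lines.tgz",
     "-o", "ocroseg"],
)
print(json.dumps(job, indent=2, sort_keys=True))
```

Writing the printed JSON to a file such as `ocroseg-train.json` (a name chosen here for illustration) and passing it to `kubectl apply -f` should be equivalent to applying the YAML version.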


    !gsutil ls gs://lpr-ocr/ | grep uw3

    gs://lpr-ocr/_uw3-patches.tgz
    gs://lpr-ocr/uw3-dew-testing.tgz
    gs://lpr-ocr/uw3-dew-training.tgz
    gs://lpr-ocr/uw3-framed-lines-test.tgz
    gs://lpr-ocr/uw3-framed-lines-train.tgz
    gs://lpr-ocr/uw3-framed-lines.tgz
    gs://lpr-ocr/uw3-framed-zones.tgz
    gs://lpr-ocr/uw3-lines-dew.tgz
    gs://lpr-ocr/uw3-lines.tgz
    gs://lpr-ocr/uw3-pages-test.tgz
    gs://lpr-ocr/uw3-pages-train.tgz
    gs://lpr-ocr/uw3-zones.tgz
    gs://lpr-ocr/_uw3-patches/
    gs://lpr-ocr/uw3-lines-old/


    !ku connect ocroline ls | head

    connecting to: ocroline-train-7fshb
    Miniconda2-latest-Linux-x86_64.sh  ol-000000665-010886.pt
    ol-000000005-239618.pt             ol-000000670-007518.pt
    ol-000000010-182026.pt             ol-000000675-007790.pt
    ol-000000015-119865.pt             ol-000000680-008311.pt
    ol-000000020-094170.pt             ol-000000685-007108.pt
    ol-000000025-079694.pt             ol-000000690-010020.pt
    ol-000000030-048406.pt             ol-000000695-006404.pt
    ol-000000035-058577.pt             ol-000000700-007747.pt
    ol-000000040-039189.pt             ol-000000705-007729.pt
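
With this many checkpoints in one directory, a script can pick which ones to copy out of the pod. The sketch below parses names of the form `ol-<first>-<second>.pt`. This document does not say what the two numeric fields mean; the code assumes the first is a training progress counter and the second is an error-like score where lower is better, so verify both before relying on it. All helper names are hypothetical.

```python
import re

# prefix-<first>-<second>.pt; the meaning of both fields is an assumption.
CHECKPOINT = re.compile(r"^(\w+)-(\d+)-(\d+)\.pt$")


def parse_checkpoints(names):
    """Return sorted (first, second, name) tuples for names that match."""
    found = []
    for name in names:
        match = CHECKPOINT.match(name)
        if match:
            found.append((int(match.group(2)), int(match.group(3)), name))
    return sorted(found)


def latest(names):
    """Checkpoint with the largest first field (assumed progress counter)."""
    return parse_checkpoints(names)[-1][2]


def lowest_score(names):
    """Checkpoint with the smallest second field (assumed error score)."""
    return min(parse_checkpoints(names), key=lambda item: item[1])[2]


# A few names taken from the listing above.
NAMES = [
    "Miniconda2-latest-Linux-x86_64.sh",
    "ol-000000005-239618.pt",
    "ol-000000665-010886.pt",
    "ol-000000695-006404.pt",
    "ol-000000705-007729.pt",
]

print(latest(NAMES))
print(lowest_score(NAMES))
```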
