<a href="https://colab.research.google.com/github/IMG-PRCSNG/ctpn-with-nvcaffe/blob/master/ctpn_with_nvcaffe.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CTPN

CTPN is a deep learning based text detection algorithm for detecting text in natural images.

CTPN stands for `Connectionist Text Proposal Network`. Its architecture combines CNN and LSTM layers to propose where text lines are (localisation) and to assign a confidence score to each proposal.

From the CNN layers, we get rich feature maps that give the LSTM layers the information they need to predict whether a given sequence of fixed-width boxes belongs to a text line.

There is also a vertical anchor mechanism to account for text lines that span multiple vertical sequences, which improves localisation accuracy.
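
To make the vertical anchor idea concrete, here is a minimal sketch of how such anchors could be generated. The fixed width of 16 pixels and the ten heights running from 11 to roughly 273 pixels (each height divided by 0.7 to get the next) follow the description in the paper as I read it. The function name `vertical_anchors` and the centring convention are assumptions made for illustration; this is not the CTPN code.

```python
def vertical_anchors(center_x, center_y, width=16, min_height=11.0,
                     count=10, ratio=0.7):
    """Return `count` boxes (x1, y1, x2, y2) that share one width and centre.

    Only the height changes from one anchor to the next, which is what
    makes the anchors "vertical".
    """
    anchors = []
    height = min_height
    for _ in range(count):
        anchors.append((
            center_x - width / 2,
            center_y - height / 2,
            center_x + width / 2,
            center_y + height / 2,
        ))
        height /= ratio  # the next anchor is taller by a factor of 1 / ratio
    return anchors


anchors = vertical_anchors(center_x=8, center_y=8)
heights = [round(y2 - y1) for _, y1, _, y2 in anchors]
print(heights)
```

Every anchor keeps the same horizontal extent, so the network only has to decide how tall the text is at each horizontal step and where it sits vertically.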

Non-maximum suppression is applied as a post-processing step to filter out redundant, overlapping proposals.
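
As a rough illustration of that post-processing step, a generic greedy non-maximum suppression can be sketched as below. It keeps the highest-scoring box and drops any lower-scoring box whose overlap (intersection over union) with an already kept box exceeds a threshold. The function names and the threshold value of 0.3 are assumptions for this sketch; the CTPN repository ships its own implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Return the indices of the boxes to keep, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 16, 20), (2, 0, 18, 20), (40, 0, 56, 20)]
scores = [0.9, 0.8, 0.7]
keep = non_max_suppression(boxes, scores)
print(keep)
```

Here the second box overlaps heavily with the first and has a lower score, so it is suppressed, while the third box does not overlap with anything and survives.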

 Links:

 - [Paper](https://arxiv.org/abs/1609.03605)
 - [Code](https://github.com/tianzhi0549/CTPN) 

# NVCaffe

[Caffe](https://github.com/BVLC/caffe) is a Deep Learning Framework written in `C++` by folks at Berkeley AI Research / Berkeley Vision and Learning Center. The framework lets you chain multiple layers together to form a Deep Neural Network.

[NVCaffe](https://github.com/nvidia/caffe) is a fork maintained by NVIDIA. It is fully compatible with BVLC Caffe and offers additional features such as:
- Support for `FP16` in inference
- Support for the latest `CUDA` (10+) and `CUDNN` (7+)
- Optimized multi-GPU training with `NCCL`
- Optimized memory management
- Experimental support for TensorRT layer
- and much more

# Running CTPN with NVCaffe

This project attempts to run the Text Detection Algorithm CTPN with the latest version of NVCaffe.

To showcase this, we are going to run it on a `Google Colab VM` with GPU support.

Hit `Connect / Reconnect` on the toolbar (top-right) to connect to one now.


*Note: If you are familiar with Docker and would like to try this on your own machine, you can directly use the GitHub [package](https://github.com/IMG-PRCSNG/CTPN/packages/300024). It comes with NVCaffe and CTPN pre-installed. Usage instructions: coming soon.*

## Requirements

To run CTPN with the latest version of NVCaffe, we need to:

- Have access to a GPU (Google Colab provides free GPU time for experiments)

- Port the following custom layers to NVCaffe
  - Reverse
  - Transpose
  - Lstm

- Compile and execute the CTPN module


The layer porting is done and is maintained in this [repo](https://github.com/IMG-PRCSNG/caffe).
A pre-compiled version for `Ubuntu 18.04 + CUDA 10.1 + CUDNN 7.6` was extracted from the Docker image in that repo and is shipped in this repo to make it simple to test in non-Docker environments like Colab.


*Note: If you are familiar with docker, you can use the pre-compiled NVCaffe (compiled for K, M, P, V, T series GPUs) from [here](https://github.com/IMG-PRCSNG/caffe/packages/299482).* 

*Check this [`Dockerfile`](https://github.com/IMG-PRCSNG/caffe/blob/caffe-0.17/Dockerfile) for contents and this [`Dockerfile.ctpn`](https://github.com/IMG-PRCSNG/CTPN/blob/master/Dockerfile.ctpn) for example usage*

## Warning

The following scripts are intended to be run in Colab or other disposable environments. They make changes under the root directory `/` that are not easily reversible. Read what is inside the files first, and use them in your own environments at your own discretion.

## Setup

A broad overview of the steps required to run CTPN with NVCaffe is as follows:

1. Clone this [repo](https://github.com/IMG-PRCSNG/ctpn-with-nvcaffe.git), which has helper scripts to set up the Colab environment
2. Initialise LFS
3. Install the system, python and NVCaffe dependencies
4. Copy the model to the correct folder
5. Run the bundled demo to check if everything is working

Before we begin, let's switch our working directory to `/content`.

In [None]:
%cd /content

### Cloning the Repo

This repo consists of

- A setup script that uses Ansible to install the dependencies for running Caffe and CTPN on an Ubuntu 18.04 machine
- A pre-compiled tarball of NVCaffe compatible with the environment set up by the scripts in this repo
- CTPN model from this [repo](https://github.com/tianzhi0549/CTPN)

In [None]:
# Run this once.
!git clone https://github.com/IMG-PRCSNG/ctpn-with-nvcaffe 
%cd /content/ctpn-with-nvcaffe

### Ansible

From the [wikipage](https://en.wikipedia.org/wiki/Ansible_(software)), 

*Ansible is an open-source software provisioning, configuration management, and application-deployment tool enabling infrastructure as code.*

We will be using Ansible playbooks to manage the dependencies for this project.

The main advantages of Ansible are:
- It is easy to learn and use.
- Playbooks are written in YAML, which is more human-readable than bash scripts.
- Most Ansible modules are idempotent: you can run them over and over, and Ansible will ensure the desired state is present.
- It can manage multiple environments using only OpenSSH and Python.

To install Ansible:


In [None]:
# Install Ansible with pip
!pip install ansible

### Git LFS

Git LFS stands for Git Large File Storage. It is useful for, well, storing large files. The repository itself only tracks a small pointer for each large file, and the actual content is fetched only when you check out a commit that needs it.

This is particularly useful in large repos where there are multiple versions of DL models in VCS. Pulling / pushing all the versions will consume significant time, storage and bandwidth.

More info: [Git LFS Website](https://git-lfs.github.com/)

[`install-lfs.yml`](https://github.com/IMG-PRCSNG/ctpn-with-nvcaffe/blob/master/install-lfs.yml) consists of instructions to install and initialise LFS

We can run it as


In [None]:
!ansible-playbook install-lfs.yml --extra-vars=hosts=localhost

We can now check out the LFS files present in this particular commit with

In [None]:
!git lfs fetch && git lfs checkout
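
If the fetch did not go through, the large files stay behind as small text pointers, which only shows up later as confusing load errors. The sketch below is one way to check for that. It relies on the pointer format defined by Git LFS, where a pointer file begins with a `version https://git-lfs.github.com/spec/v1` line. The helper names `is_lfs_pointer` and `unresolved_lfs_files` are made up for this example.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(data):
    """Return True if `data` (the first bytes of a file) looks like an LFS pointer."""
    return data.startswith(LFS_POINTER_PREFIX)


def unresolved_lfs_files(root="."):
    """List files under `root` that still contain a pointer instead of real content."""
    unresolved = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        with open(path, "rb") as handle:
            if is_lfs_pointer(handle.read(len(LFS_POINTER_PREFIX))):
                unresolved.append(str(path))
    return sorted(unresolved)
```

Calling `unresolved_lfs_files("/content/ctpn-with-nvcaffe")` after the checkout should return an empty list; any path it reports is a file whose real content has not been downloaded yet.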

### Dependencies

We have a lot of system and python dependencies for running Caffe and a few python dependencies for compiling and running CTPN

To set that up, we are going to run the [`install-dependencies.yml`](https://github.com/IMG-PRCSNG/ctpn-with-nvcaffe/blob/master/install-dependencies.yml) playbook which will

- Install system dependencies
- Clone CTPN repo
- Compile CTPN
- Copy Model to the path expected.

We can run that with

In [None]:
!ansible-playbook install-dependencies.yml --extra-vars=hosts=localhost

### Optional: All in One Script

There is a helper script which will do all the above. You can run it with `bash setup.sh`

## Drive Sync

Mounting your Google Drive folder lets you easily sync files between your Google Drive and the Colab VM runtime. This comes in handy if you want to persist certain files beyond the lifecycle of a Colab runtime and save time when restarting a session.

This can be used for non-code files which don't usually reside in your VCS, such as input and output images, model checkpoints, logs and config files.

To mount your Google Drive, run the cell below and follow the instructions. This will sync files from the VM to Google Drive.

Optionally, you can set up Google Drive sync on your personal computer to sync these files and manipulate them directly from there. A very good use case is configuration files: you can quickly edit the configs and restart the process consuming them without having to push and pull manually.


Have a look at this [blog post](https://dev.to/kriyeng/8-tips-for-google-colab-notebooks-to-take-advantage-of-their-free-of-charge-12gb-ram-gpu-be4) for more tips on using Colab Notebooks for ML / DL experiments.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

# Now, anything you add to /content/gdrive/ will be synced to GDrive,
# and will also be synced to your PC if you have set up GDrive <-> PC sync
# as well.
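
The demo further down writes its results to a `drive-sync` folder on Drive, so that folder has to exist first. A small sketch of creating it from Python, assuming the mount point used in the cell above; the helper name `ensure_output_dir` is hypothetical:

```python
import os


def ensure_output_dir(path):
    """Create `path` (and any missing parents) if needed and return it."""
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return path


# In Colab, after drive.mount('/content/gdrive') has succeeded:
# ensure_output_dir("/content/gdrive/My Drive/drive-sync")
```

Because `exist_ok=True` is set, re-running the cell after a runtime restart is harmless.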

# Demo

Since we have no control over the launch-time environment variables of the Google Colab VM once a notebook has started, writing `import caffe` directly will fail: the module was installed in a non-default location under `/usr/local`, so it won't be found.

(I haven't found a solution yet to relaunch the Notebook Server with new Environment variables)

Hence, we can
- set the environment variables
- execute a script to save the outputs
- view it later.

Here, I am writing the outputs to the `drive-sync` folder I created earlier on Google Drive, so that I can access them even after the Colab VM is terminated.

In [None]:
!cd ctpn && \
  GLOG_minloglevel=2 \
    GLOG_logtostderr=1 \
    LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH \
    PYTHONPATH=/usr/local/python:$PYTHONPATH \
    python tools/demo_save.py --output_dir "/content/gdrive/My Drive/drive-sync"
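
The same launch can be sketched from Python with `subprocess`, which makes it explicit that the variables are set for the child process only and not for the notebook kernel. The paths, variables and script arguments are the ones from the cell above; the helper names `caffe_env` and `run_demo` are assumptions for this example.

```python
import os
import subprocess


def caffe_env(prefix="/usr/local", base=None):
    """Copy an environment and prepend the NVCaffe library and Python paths."""
    env = dict(os.environ if base is None else base)

    def prepend(name, path):
        current = env.get(name)
        env[name] = path if not current else path + os.pathsep + current

    prepend("LD_LIBRARY_PATH", f"{prefix}/lib")
    prepend("PYTHONPATH", f"{prefix}/python")
    env["GLOG_minloglevel"] = "2"  # hide INFO and WARNING messages from glog
    env["GLOG_logtostderr"] = "1"
    return env


def run_demo(output_dir, cwd="ctpn"):
    """Run the bundled demo script in a child process that can import caffe."""
    command = ["python", "tools/demo_save.py", "--output_dir", output_dir]
    return subprocess.run(command, cwd=cwd, env=caffe_env(), check=True)
```

`run_demo("/content/gdrive/My Drive/drive-sync")` would then be equivalent to the shell cell, and `check=True` raises an exception if the script exits with an error.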

We can view the output images in the following ways:

- Navigating to our Google Drive folder
- If we have enabled drive sync, we can view these files directly on our computer
- Or through the following snippet


In [None]:
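%matplotlib inline
import cv2
from matplotlib import pyplot as plt

FOLDER = "/content/gdrive/My Drive/drive-sync"
for i in range(1, 4):
  path = f'{FOLDER}/output_img_{i}.jpg'
  img = cv2.imread(path)
  if img is None:  # cv2.imread returns None instead of raising when a file cannot be read
    print(f'Could not read {path}')
    continue
  # OpenCV loads images as BGR, while matplotlib expects RGB
  plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
  plt.show()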

# Future Work

- Now that CTPN works with NVCaffe, we can
  - Reduce memory footprint with FP16 computation
  - Measure and reduce time taken by the GPU part by wrapping the TensorRT compatible layers with the NVCaffe TRT Layer.
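
To give a feel for the FP16 point, the sketch below uses only Python's `struct` module to compare the storage needed for the same values in 32-bit and 16-bit floating point, and the rounding error this introduces. It does not touch NVCaffe; the sample values are arbitrary.

```python
import struct


def packed_size(values, fmt):
    """Number of bytes needed to store `values` with the given struct format."""
    return len(struct.pack(f"<{len(values)}{fmt}", *values))


weights = [0.1, -1.5, 3.14159, 1e-4]

fp32_bytes = packed_size(weights, "f")  # "f" is IEEE 754 single precision
fp16_bytes = packed_size(weights, "e")  # "e" is IEEE 754 half precision

# Round-trip through FP16 to see how much precision is lost.
roundtrip = struct.unpack("<4e", struct.pack("<4e", *weights))
max_error = max(abs(a - b) for a, b in zip(weights, roundtrip))

print(fp32_bytes, fp16_bytes, max_error)
```

Storage is halved, at the cost of a small rounding error per value. Whether that error is acceptable for CTPN's detections is something that would have to be measured.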


- Since most of the layers in the CTPN project are compatible with `TensorRT`, we can create a `TRT plan` file which we can serve with [Triton Inference Server](https://github.com/NVIDIA/triton-inference-server)

- For the CPU part, we can look at `multiprocessing` and `asyncio` to handle the I/O-bound pre-processing and the CPU-bound post-processing in parallel.

- Expose the Text detection capabilities over an API.
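
For the CPU-side idea in the list above, a minimal sketch of how `asyncio` could hand work to executor pools might look like this. `load_image` and `postprocess` are placeholders invented for the example, not CTPN functions. For genuinely CPU-bound post-processing, a `concurrent.futures.ProcessPoolExecutor` could be passed as `cpu_pool` instead of the thread pool used here.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def load_image(path):
    """Placeholder for I/O-bound work such as reading and resizing an image."""
    return f"pixels:{path}"


def postprocess(raw):
    """Placeholder for CPU-bound work such as NMS and text-line construction."""
    return raw.upper()


async def detect_all(paths, io_pool=None, cpu_pool=None):
    """Process all paths concurrently; results come back in input order."""
    loop = asyncio.get_running_loop()

    async def detect_one(path):
        image = await loop.run_in_executor(io_pool, load_image, path)
        return await loop.run_in_executor(cpu_pool, postprocess, image)

    return await asyncio.gather(*(detect_one(p) for p in paths))


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as io_pool, \
            ThreadPoolExecutor(max_workers=2) as cpu_pool:
        print(asyncio.run(detect_all(["img_1.jpg", "img_2.jpg"], io_pool, cpu_pool)))
```

This runs as a standalone script. Inside a notebook an event loop is already running, so there you would write `await detect_all(...)` in a cell instead of calling `asyncio.run`.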