# 📖 👆🏻 Printed Links Detection Using TensorFlow 2 Object Detection API

![Links Detector Cover](https://raw.githubusercontent.com/trekhleb/links-detector/master/articles/printed_links_detection/assets/01-banner.png)

## 📃 TL;DR

- Short problem statement
- Short solution
- What this article will be about
- Links to the demo
- GIF image or video of how the app works

_In this article we will try to solve the problem of making printed links (e.g. in a book) clickable via your smartphone camera._

We will use the [TensorFlow 2 Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection) to train a custom object detector model that finds the positions of links in an image of text.

The subsequent recognition of the link text will be done with [Tesseract](https://tesseract.projectnaptha.com/). The recognition part is not covered in this article, but you can find the complete code example in the [links-detector repository](https://github.com/trekhleb/links-detector/blob/master/src/hooks/useLinksDetector.ts).

> You may 🚀 [**Launch Links Detector demo**](https://trekhleb.github.io/links-detector/) from your smartphone to see the final result.

## 🤷🏻‍♂️ The Problem

So you read a book or a magazine and see links like `https://tensorflow.org/` or `https://some-url.com/which/may/be/longer?and_with_params=true`, but you can't click on them since they are printed. To visit these links you need to type them character by character into the browser's address bar, which may be pretty annoying and error-prone.

![Printed Links](https://raw.githubusercontent.com/trekhleb/links-detector/master/articles/printed_links_detection/assets/02-printed-links.jpg)

## 💡 Possible Solution

Similarly to QR-code detection, we may try to "teach" the smartphone to _detect_ and _recognize_ printed links for us and also to make them _clickable_. This way you'll need just one click instead of multiple keystrokes. Your operational complexity goes from `O(N)` to `O(1)`.

![Links Detector Demo](https://raw.githubusercontent.com/trekhleb/links-detector/master/articles/printed_links_detection/assets/03-links-detector-demo.gif)

## 🧩 Solution Breakdown

- Technical details on how the solution may be achieved
- Why TensorFlow, why Object Detection API
- Problems that need to be solved (custom objects, well-known models don't work, small objects, real-time)
- Serverless, working on Mobile solution (why no backend? why lightweight model?)
- Possible models review, why MobileNet
- The issue with the Dataset (there is none)
- I'm just learning and don't have much experience, so I wanted to experiment

The task may be split into two parts:

1. Links **detection** (the position of the links)
2. Links **recognition** (the text of the links)

## 📝 Creating the Dataset Manually

- Making pictures of the book
- What tools to use to add bounding boxes
- How to convert to protobuf
- Issues with custom dataset (fonts, colors, bolds, underlined, etc.)
- Train/test split approach
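
One possible way to do the train/test split is a seeded shuffle of the image file names. This is only a minimal sketch: the 80/20 ratio, the seed, the `split_dataset` helper and the file names are assumptions made for illustration, not the values used for the actual dataset.

```python
import random


def split_dataset(filenames, test_fraction=0.2, seed=42):
    """Shuffle file names deterministically and split them into train and test lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    # Sort first so the result does not depend on the order of the input.
    shuffled = sorted(filenames)
    random.Random(seed).shuffle(shuffled)
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical file names, for illustration only.
images = [f"IMG_{i:04d}.jpg" for i in range(10)]
train_files, test_files = split_dataset(images)
print(len(train_files), "train images,", len(test_files), "test images")
```

Fixing the seed keeps the split reproducible between runs, so the test images never leak into the training set after a re-run.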

### 🌅 Preprocessing the data

- Data preprocessing: resize, crop square, color adjustment
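
The "crop square" step can be reduced to computing the largest centered square inside the photo. The sketch below only calculates the crop box. The `center_square_box` helper and the photo size are hypothetical, and the actual cropping and resizing would be done by an image library of your choice.

```python
def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square in an image."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


# Hypothetical landscape photo of 4032x3024 pixels.
box = center_square_box(4032, 3024)
print(box)
```

The returned tuple uses the `(left, upper, right, lower)` order that Pillow's `Image.crop()` accepts, so it could be passed there directly if Pillow is the library in use.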

### 🔖 Labeling the dataset

- How to use LabelImg
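
LabelImg can save annotations as Pascal VOC XML files. Below is a rough sketch of how such a file may be read and how the pixel boxes may be scaled to the `[0, 1]` range that the Object Detection API works with. The sample XML, the `http` label and the `read_boxes` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A hypothetical annotation in the Pascal VOC layout, shortened for illustration.
SAMPLE_XML = """
<annotation>
  <filename>IMG_0001.jpg</filename>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object>
    <name>http</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>600</xmax><ymax>100</ymax></bndbox>
  </object>
</annotation>
"""


def read_boxes(xml_text):
    """Return a list of (label, ymin, xmin, ymax, xmax) with coordinates scaled to [0, 1]."""
    root = ET.fromstring(xml_text.strip())
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        boxes.append((
            obj.findtext("name"),
            int(box.findtext("ymin")) / height,
            int(box.findtext("xmin")) / width,
            int(box.findtext("ymax")) / height,
            int(box.findtext("xmax")) / width,
        ))
    return boxes


for label, ymin, xmin, ymax, xmax in read_boxes(SAMPLE_XML):
    print(label, ymin, xmin, ymax, xmax)
```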

### 🗜 Exporting the dataset

- Protobuf (the way of storing the dataset)

## 📚 Generating the Dataset Automatically (?)

- Automated way of generating the dataset
- Train/test split approach

## 📖 Exploring the Dataset

- Preview images with detection boxes
- Number of images (why is this enough)
- Do we need to preprocess the images

## 🛠 Installing Object Detection API 

- What is object detection API
- Why it will simplify our lives
- How it may be used

```bash
git clone --depth 1 https://github.com/tensorflow/models
```

```text
Cloning into 'models'...
remote: Enumerating objects: 2305, done.
remote: Counting objects: 100% (2305/2305), done.
remote: Compressing objects: 100% (2000/2000), done.
remote: Total 2305 (delta 562), reused 953 (delta 282), pack-reused 0
Receiving objects: 100% (2305/2305), 30.60 MiB | 31.94 MiB/s, done.
Resolving deltas: 100% (562/562), done.
```


## ⬇️ Downloading Pre-Trained Model

- Detection Model Zoo review
- Which models we could possibly use
- Why I've picked the MobileNet model
- Diagram of the model architecture

## 🏄🏻‍♂️ Trying the Model (Inference)

- Show that the model works for general-purpose classes
- Show that the model doesn't work for custom objects (links)
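
Detection models from the Object Detection API return, among other things, `detection_boxes` (normalized `[ymin, xmin, ymax, xmax]`) and `detection_scores`. A minimal sketch of the post-processing might look like the one below. It assumes the boxes and scores were already converted to plain Python lists; the `0.5` threshold, the helper names and the sample values are illustrative assumptions.

```python
def filter_detections(boxes, scores, min_score=0.5):
    """Keep (box, score) pairs whose score reaches min_score."""
    return [(box, score) for box, score in zip(boxes, scores) if score >= min_score]


def to_pixels(box, width, height):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to integer pixel coordinates."""
    ymin, xmin, ymax, xmax = box
    return round(ymin * height), round(xmin * width), round(ymax * height), round(xmax * width)


# Made-up detections for a hypothetical 1000x500 image.
boxes = [(0.1, 0.1, 0.2, 0.6), (0.5, 0.5, 0.6, 0.9)]
scores = [0.9, 0.3]

for box, score in filter_detections(boxes, scores):
    print(to_pixels(box, width=1000, height=500), score)
```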

## 📈 Setting Up TensorBoard

- Why do we need it (for debugging)
- What we will monitor

## 👨‍🎓 Transfer Learning

- What is transfer learning
- Why don't we train the model from scratch
- Allows us to use a small dataset

### ⚙️ Configuring the Detection Pipeline

- Performance issues: batch size
- Starting not from scratch: checkpoints
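
Both settings live in the text-based `pipeline.config` file that comes with a pre-trained model. One simple (if a bit blunt) way to adjust them is a plain text substitution, sketched below. The config fragment is shortened and hypothetical, and so are the `update_config` helper, the batch size of `16` and the checkpoint path.

```python
import re

# A shortened, hypothetical fragment of a pipeline.config file.
SAMPLE_CONFIG = '''
train_config {
  batch_size: 128
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
  fine_tune_checkpoint_type: "classification"
}
'''


def update_config(text, batch_size, checkpoint_path):
    """Return the config text with a new batch size and fine-tune checkpoint."""
    # Only the first batch_size is replaced; it is assumed to belong to train_config.
    text = re.sub(r'batch_size: \d+', lambda _: f'batch_size: {batch_size}', text, count=1)
    text = re.sub(
        r'fine_tune_checkpoint: ".*?"',
        lambda _: f'fine_tune_checkpoint: "{checkpoint_path}"',
        text,
        count=1,
    )
    # Restore the whole detection model rather than only the classification backbone.
    text = re.sub(
        r'fine_tune_checkpoint_type: ".*?"',
        lambda _: 'fine_tune_checkpoint_type: "detection"',
        text,
        count=1,
    )
    return text


# The checkpoint path below is a placeholder, not a real location.
print(update_config(SAMPLE_CONFIG, batch_size=16, checkpoint_path="path/to/checkpoint/ckpt-0"))
```

A smaller batch size lowers memory usage during training, and pointing `fine_tune_checkpoint` at the downloaded checkpoint is what lets the training start from the pre-trained weights instead of from scratch.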

### 🏋🏻‍♂️ Model Training

- Error-prone: saving checkpoints
- How many epochs
- Monitoring the performance while training

### 🚀 Evaluating the Model

- Checking how accurate our model is on test dataset
- Are we good with performance, should we save the model?
- It is not a general-purpose model anymore; does it recognize our custom objects?
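
Detection accuracy is commonly judged by how well a predicted box overlaps the labeled one, measured as intersection over union (IoU). Here is a small sketch of that calculation, assuming boxes in the `(ymin, xmin, ymax, xmax)` order. The `iou` helper and the sample boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (ymin, xmin, ymax, xmax)."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    # The overlap is empty when the boxes do not intersect.
    intersection = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Made-up labeled and predicted boxes.
labeled = (0.10, 0.10, 0.20, 0.60)
predicted = (0.11, 0.12, 0.21, 0.58)
print(f"IoU: {iou(labeled, predicted):.2f}")
```

An IoU of `1.0` means a perfect match and `0.0` means no overlap at all. A prediction is typically counted as correct when the IoU exceeds some threshold, with `0.5` being a common choice.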

## 🗜 Exporting the Model

- Saving the model to the file for further re-use
- Show the list of files, what the model looks like on disk
- What the size of the model is

## 🚀 Evaluating the Exported Model

- Example of how to use the trained model

## 🗜 Converting the Model for Web

- What formats are suitable for the web
- A few words about TensorFlow.js
- Show the list of exported files, what the model looks like on disk
- What the size of the model is
- Why it is split into chunks and how they are connected (via model.json)
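
The chunks are tied together by the `weightsManifest` section of `model.json`, which lists the binary shard files that hold the weights. A minimal sketch of reading that list is shown below. The sample JSON is a shortened, made-up example and `list_weight_shards` is a hypothetical helper.

```python
import json

# A shortened, made-up example of what model.json may contain.
SAMPLE_MODEL_JSON = json.dumps({
    "format": "graph-model",
    "weightsManifest": [
        {"paths": ["group1-shard1of2.bin", "group1-shard2of2.bin"], "weights": []}
    ],
})


def list_weight_shards(model_json_text):
    """Return the weight shard file names referenced by a model.json document."""
    manifest = json.loads(model_json_text).get("weightsManifest", [])
    return [path for group in manifest for path in group["paths"]]


for shard in list_weight_shards(SAMPLE_MODEL_JSON):
    print(shard)
```

Splitting the weights into several smaller files lets the browser download and cache them separately.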

```bash
pip install tensorflowjs --quiet
```

## 🤔 Conclusions

- I'm just an amateur
- Links to demo app
- Issues and limitations of this approach
- Links to my ML repositories that readers might like