# Train OCR Text Detector: Quick Example

To get training data, download the latest version of the OCR datasets from [https://nomeroff.net.ua/datasets/](https://nomeroff.net.ua/datasets/). Unpack the archive inside **./datasets/ocr** and rename the extracted folder (to `eu` in the example below).
For example:
```bash
cd ./datasets/ocr
wget https://nomeroff.net.ua/datasets/autoriaNumberplateOcrEu-2019-02-19.zip
unzip autoriaNumberplateOcrEu-2019-02-19.zip
mv autoriaNumberplateOcrEu-2019-02-19 eu
```
Alternatively, use your own dataset.
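
Before training, it can help to confirm that the dataset folder contains the splits the trainer reports on. The sketch below is only an illustration: the split names `train`, `val` and `test` are taken from the training log further down, and the helper `missing_splits` is a hypothetical name, not part of NomeroffNet.

```python
from pathlib import Path

# Assumed split names, taken from the "train", "val" and "test" labels
# that appear in the training log; adjust them if your dataset differs.
EXPECTED_SPLITS = ("train", "val", "test")


def missing_splits(dataset_dir, splits=EXPECTED_SPLITS):
    """Return the names of split sub-directories absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in splits if not (root / name).is_dir()]
```

Calling `missing_splits("./datasets/ocr/eu")` returns an empty list when all three folders are present, and the names of the absent ones otherwise.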

In [20]:
import os
import sys
import warnings
warnings.filterwarnings('ignore')

import keras
keras.backend.clear_session()

# Change this to the root of your nomeroff-net checkout
NOMEROFF_NET_DIR = os.path.abspath('../')

DATASET_NAME = "eu"
VERSION = "2"
MODE = "gpu"
PATH_TO_DATASET = os.path.join(NOMEROFF_NET_DIR, "datasets/ocr/", DATASET_NAME)
RESULT_MODEL_PATH = os.path.join(NOMEROFF_NET_DIR, "models/", 'anpr_ocr_{}_{}-{}.h5'.format(DATASET_NAME, VERSION, MODE))

FROZEN_MODEL_PATH = os.path.join(NOMEROFF_NET_DIR, "models/", 'anpr_ocr_{}_{}-{}.pb'.format(DATASET_NAME, VERSION, MODE))

sys.path.append(NOMEROFF_NET_DIR)

from NomeroffNet.Base import OCR, convert_keras_to_freeze_pb
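
The two model paths above follow the same naming pattern and differ only in extension. A minimal sketch of that pattern as a reusable helper is shown here; `model_path` is a hypothetical function and `/opt/nomeroff-net` is a placeholder directory, neither comes from the library.

```python
import os


def model_path(base_dir, dataset_name, version, mode, ext):
    """Build a path following the anpr_ocr_<dataset>_<version>-<mode>.<ext> pattern."""
    filename = "anpr_ocr_{}_{}-{}.{}".format(dataset_name, version, mode, ext)
    return os.path.join(base_dir, "models", filename)


# Placeholder base directory, used only for illustration.
print(model_path("/opt/nomeroff-net", "eu", "2", "gpu", "h5"))
print(model_path("/opt/nomeroff-net", "eu", "2", "gpu", "pb"))
```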

In [21]:
class eu(OCR):
    def __init__(self):
        OCR.__init__(self)
        # Only needed when using an already trained model;
        # during training the alphabet is generated automatically.
        self.letters = ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z"]
        
        self.EPOCHS = 1
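
The comment in the class says the alphabet is generated automatically during training. How the library does this is not shown here, so the following is only a sketch of one plausible approach: collect every distinct character from the plate labels and sort them. `build_alphabet` is a hypothetical helper.

```python
def build_alphabet(plate_texts):
    """Return the sorted list of distinct characters found in the plate labels."""
    return sorted(set("".join(plate_texts)))


# Two plate strings taken from this notebook's output, used as sample labels.
print(build_alphabet(["FZH655", "36939BB"]))
```

With ASCII labels, sorting places digits before uppercase letters, which matches the ordering of `self.letters` above.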

In [22]:
ocrTextDetector = eu()
model = ocrTextDetector.prepare(PATH_TO_DATASET, aug_count=0)

GET ALPHABET
Max plate length in "val": 9
Max plate length in "train": 9
Max plate length in "test": 8
Max plate length in train, test and val do match
Letters in train, val and test do match
Letters: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

EXPLAIN DATA TRANSFORMATIONS
Text generator output (data which will be fed into the neutral network):
1) the_input (image)
2) the_labels (plate number): FZH655 is encoded as [15, 35, 17, 6, 5, 5, 37, 37, 37]
3) input_length (width of image that is fed to the loss function): 30 == 128 / 4 - 2
4) label_length (length of plate number): 6
START BUILD DATA
DATA PREPARED
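
The encoding described in the log can be reproduced with a few lines of Python. This is a sketch under assumptions inferred from the log output rather than from the library source: the padding value is taken to be `len(letters) + 1` (37 for the 36-character alphabet), and the constants 4 and 2 in the input-length formula are read as a downsampling factor and a number of discarded steps. The function names are hypothetical.

```python
LETTERS = list("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def encode_label(text, letters=LETTERS, max_len=9):
    """Map each character to its alphabet index and pad up to max_len."""
    pad = len(letters) + 1  # assumption inferred from the log: 37
    codes = [letters.index(ch) for ch in text]
    return codes + [pad] * (max_len - len(codes))


def input_length_for_width(image_width, downsample=4, dropped_steps=2):
    """Reproduce the log's formula: width / 4 - 2."""
    return image_width // downsample - dropped_steps


print(encode_label("FZH655"))
print(input_length_for_width(128))
```

For the plate `FZH655` and an image width of 128, this yields the same label vector and input length that the log reports.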


In [None]:
#model = ocrTextDetector.load(RESULT_MODEL_PATH)
#RESULT_MODEL_PATH

In [None]:
model = ocrTextDetector.train(mode=MODE, is_random=1)

In [24]:
ocrTextDetector.test(verbose=True)


RUN TEST

Predicted: 		 36959BB
True: 			 36939BB

Predicted: 		 WL6773O
True: 			 WWL67370

Predicted: 		 KR9X19A
True: 			 KR9X194

Predicted: 		 EZG2959
True: 			 FZG29591

Predicted: 		 7586O
True: 			 75860

Predicted: 		 DC8685BH
True: 			 DC685BH

Predicted: 		 BO145A
True: 			 B0146A

Predicted: 		 7DO04295
True: 			 TDO04295

Predicted: 		 51316KB
True: 			 51316K

Predicted: 		 61257P
True: 			 61257PE

Predicted: 		 7586O
True: 			 75860

Predicted: 		 LZA595896
True: 			 LZA59596

Predicted: 		 97116CK
True: 			 97116CX

Predicted: 		 PGCN671E
True: 			 PGN671FE

Predicted: 		 KI44317
True: 			 TKI44317

Predicted: 		 JA88459
True: 			 RJA88459

Predicted: 		 13160C
True: 			 13160CH

Predicted: 		 56618TB
True: 			 56808TB

Predicted: 		 L0US04127
True: 			 LOS04127

Predicted: 		 S019886
True: 			 SO149886

Predicted: 		 L04624
True: 			 SL04624

Predicted: 		 OGBX332
True: 			 OGBX335

Predicted: 		 AJ267P
True: 			 VJ267P

Predicted: 		 760JRT
True: 			 76PJR7

Predict
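
The predicted and true strings printed by the test can be summarized numerically. The sketch below is not part of NomeroffNet; it assumes you have collected the pairs into a list yourself and computes exact-match accuracy plus a character error rate based on Levenshtein edit distance.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def summarize(pairs):
    """Return (exact-match accuracy, character error rate) for (pred, true) pairs."""
    exact = sum(1 for pred, true in pairs if pred == true)
    edits = sum(levenshtein(pred, true) for pred, true in pairs)
    chars = sum(len(true) for _, true in pairs)
    return exact / len(pairs), edits / chars


# A few (predicted, true) pairs copied from the test output above.
pairs = [("36959BB", "36939BB"), ("DC8685BH", "DC685BH"), ("51316KB", "51316K")]
accuracy, cer = summarize(pairs)
print("exact match: {:.2f}, character error rate: {:.2f}".format(accuracy, cer))
```

Character error rate is often more informative than exact match for plates, since a single confused character such as `O` versus `0` fails the whole string.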

In [10]:
ocrTextDetector.save(RESULT_MODEL_PATH, verbose=True)

SAVED TO /mnt/data/var/www/html2/js/nomeroff-net/models/anpr_ocr_ru_3-cpu.h5


In [13]:
RESULT_MODEL_PATH

'/mnt/data/var/www/html2/js/nomeroff-net/models/anpr_ocr_eu_2-cpu.h5'

In [23]:
model = ocrTextDetector.load(RESULT_MODEL_PATH)

### Convert Keras OCR .h5 model to .pb graph

In [25]:
import keras
keras.backend.clear_session()
model = ocrTextDetector.load(RESULT_MODEL_PATH)
convert_keras_to_freeze_pb(model, FROZEN_MODEL_PATH)

OUTPUT: softmax_eu/truediv
INPUT: the_input_eu
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
the_input_eu (InputLayer)       (None, 128, 64, 1)   0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 128, 64, 16)  160         the_input_eu[0][0]               
__________________________________________________________________________________________________
max1 (MaxPooling2D)             (None, 64, 32, 16)   0           conv1[0][0]                      
__________________________________________________________________________________________________
conv2 (Conv2D)                  (None, 64, 32, 16)   2320        max1[0][0]                       
______________________________________________________________