How to do CTC Decode on text-recognition output? #113

Nosferath · 2019-05-09T18:52:20Z

I am using the text-recognition-0012 model to perform OCR, but I have been unable to decode its output.
This is the class I use to generate the model output:

import cv2
from openvino.inference_engine import IENetwork, IEPlugin

class TextRecognizer:
    def __init__(self):
        model_xml = 'models/text-recognition-0012.xml'
        model_bin = 'models/text-recognition-0012.bin'
        
        plugin = IEPlugin(device='CPU')
        net = IENetwork(model=model_xml, weights=model_bin)
        plugin.add_cpu_extension("/home/claudio/inference_engine_samples_build/intel64/Release/lib/libcpu_extension.so")
        
        supported_layers = plugin.get_supported_layers(net)
        not_supported_layers = [l for l in net.layers.keys() if l not in supported_layers]
        if len(not_supported_layers) != 0:
            print('Unsupported layers:', not_supported_layers)
        
        print("Preparing input blobs")
        self.input_blob = next(iter(net.inputs))
        self.out_blob = next(iter(net.outputs))
        
        print("Loading model to the plugin")
        self.exec_net = plugin.load(network=net)
        del net
        
    def process(self, img_source):
        img = img_source.copy()
        input_width = 120
        input_height = 32
        img_height = img.shape[0]
        img_width = img.shape[1]
        # rw = img_width/float(input_width)
        # rh = img_height/float(input_height)
        #img = cv2.resize(img, (input_width, input_height))
        plot_gray(img)
        blob = cv2.dnn.blobFromImage(img, 1.0, (input_width, input_height))
        tt.tic()
        
        res = self.exec_net.infer()
        res = res[self.out_blob]
        tt.toc("Infer")
        #print(res.shape)
        #print(res)
        return res

And this is the function I wrote for CTC decoding:

import numpy as np

symbols = "0123456789abcdefghijklmnopqrstuvwxyz#"  # the last symbol '#' is treated as the CTC blank

def ctc_decoder(data, alphabet):
    # Greedy decode: take the best symbol per time step, drop blanks,
    # and merge consecutive repeats that are not separated by a blank.
    result = ""
    prev_pad = False
    for i in range(data.shape[0]):
        symbol = alphabet[np.argmax(data[i])]
        if symbol != alphabet[-1]:
            if len(result) == 0 or prev_pad or symbol != result[-1]:
                prev_pad = False
                result = result + symbol
        else:
            prev_pad = True
    return result

The text detection demo outputs the expected text, but with my decoder I only get nonsense. How do I properly use and decode the model?


Nosferath · 2019-05-09T21:22:20Z

Solved. I was calling infer() without its input argument, so the preprocessed blob was never passed to the network. The corrected line:

res = self.exec_net.infer({self.input_blob: blob})
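For anyone landing here later, the greedy CTC rule itself can be sketched compactly. This is only an illustration, not the demo's implementation: the function name `ctc_greedy_decode` is hypothetical, it assumes the per-time-step argmax indices have already been computed, and it assumes the blank is the last symbol of the alphabet, as in the `symbols` string above.

```python
from itertools import groupby

# Same alphabet as in the question; the trailing '#' is assumed to be the CTC blank.
SYMBOLS = "0123456789abcdefghijklmnopqrstuvwxyz#"

def ctc_greedy_decode(indices, alphabet=SYMBOLS):
    """Collapse a sequence of per-time-step class indices into text.

    Hypothetical helper: merges consecutive repeated indices first,
    then drops the blank (assumed to be the last alphabet entry).
    """
    blank = len(alphabet) - 1
    # groupby yields one key per run of equal consecutive indices.
    collapsed = (int(idx) for idx, _ in groupby(indices))
    return "".join(alphabet[i] for i in collapsed if i != blank)
```

If the model output is a NumPy array with one score vector per time step and a batch size of 1 (an assumption about the output layout), the indices could be obtained with something like `res.reshape(res.shape[0], -1).argmax(axis=1)` before calling the helper.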

Nosferath closed this as completed May 9, 2019
