Using calamari models in OCR4all #16

alexander-winkler · 2019-07-16T05:57:32Z

Hello!
I have trained a calamari model with the calamari-train command (v0.3.5). Since I'd like to use OCR4all to keep track of the project, I tried copying the model into the OCR4all models directory:

project/
└── 0
    ├── 0.ckpt.data-00000-of-00001
    ├── 0.ckpt.index
    ├── 0.ckpt.json
    ├── 0.ckpt.meta
    ├── checkpoint

During the recognition process I get the following error message:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/json_format.py", line 547, in _ConvertFieldValuePair
    self.ConvertMessage(value, sub_message)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/json_format.py", line 452, in ConvertMessage
    self._ConvertFieldValuePair(value, message)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/json_format.py", line 552, in _ConvertFieldValuePair
    raise ParseError('Failed to parse {0} field: {1}'.format(name, e))
google.protobuf.json_format.ParseError: Failed to parse network field: Failed to parse backend field: Message type "BackendParams" has no field named "shuffleBufferSize".
 Available Fields(except extensions):

Is there a way to use externally trained models in OCR4all?
Thanks in advance!


ChWick · 2019-07-18T07:02:31Z

OCR4all currently uses an older version of Calamari (v0.3.3), which is why models trained with the newer version (v0.3.5) cannot be used with OCR4all. You have three options:

1. Downgrade calamari to v0.3.3.
2. Wait for OCR4all to upgrade Calamari to v0.3.5.
3. "Hack your model" by removing the shuffleBufferSize field from your 0.ckpt.json file.
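
The third option can be sketched in Python as follows. This is only an illustration, not part of calamari or OCR4all: the helper names strip_field and patch_checkpoint are hypothetical, and because the exact nesting of the field inside 0.ckpt.json is not shown in this thread (the error message only suggests it sits somewhere under network/backend), the sketch removes every key with that name wherever it occurs. It keeps a .bak copy of the original file.

```python
import json
from pathlib import Path


def strip_field(node, name):
    """Recursively delete every key called `name`; return how many were removed."""
    removed = 0
    if isinstance(node, dict):
        if name in node:
            del node[name]
            removed += 1
        for value in node.values():
            removed += strip_field(value, name)
    elif isinstance(node, list):
        for item in node:
            removed += strip_field(item, name)
    return removed


def patch_checkpoint(json_path, name="shuffleBufferSize"):
    """Remove `name` from a checkpoint JSON file, keeping a .bak copy (hypothetical helper)."""
    path = Path(json_path)
    original = path.read_text(encoding="utf-8")
    data = json.loads(original)
    removed = strip_field(data, name)
    if removed:
        # Keep the untouched file next to the patched one.
        Path(str(path) + ".bak").write_text(original, encoding="utf-8")
        path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return removed
```

Calling patch_checkpoint("0.ckpt.json") returns the number of removed fields; a return value of 0 means the file was left unchanged and this workaround does not apply.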

Nesbi · 2019-07-18T07:11:28Z

Note that only the stable version of OCR4all uses calamari v0.3.3.
The nightly and dev versions use the newest calamari version.
We are currently working on a new stable version that will include the new calamari version.

It is possible to pull the nightly or dev version from Docker Hub:
docker pull ls6uniwue/ocr4all:<tag_name>

However, we'd recommend waiting for the next stable release, which is planned to arrive shortly.

Nesbi · 2019-09-11T12:01:20Z

The stable release includes this fix

alexander-winkler · 2021-05-07T15:18:10Z

Hello! Strangely, I am running into the same (or a similar) error again with a model trained in calamari (calamari-train v0.3.5) that I'd like to use in the current stable version. The above-mentioned 'hack' is not possible, as my *.ckpt.json has no shuffleBufferSize field.

Traceback (most recent call last):
  File "/usr/local/bin/calamari-predict", line 33, in <module>
    sys.exit(load_entry_point('calamari-ocr==1.0.5', 'console_scripts', 'calamari-predict')())
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/scripts/predict.py", line 197, in main
    run(args)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/scripts/predict.py", line 95, in run
    ctc_decoder_params=ctc_decoder_params)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/predictor.py", line 228, in __init__
    data_preproc=data_preproc, processes=processes) for cp in checkpoints]
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/predictor.py", line 228, in <listcomp>
    data_preproc=data_preproc, processes=processes) for cp in checkpoints]
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/predictor.py", line 102, in __init__
    ckpt = Checkpoint(checkpoint, auto_update=self.auto_update_checkpoints)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/checkpoint.py", line 27, in __init__
    self.update_checkpoint()
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/checkpoint.py", line 43, in update_checkpoint
    .format(self.version, Checkpoint.VERSION, __version__))
Exception: Downgrading of models is not supported (3 to 2). Please upgrade your Calamari instance (currently installed: 1.0.5)

Any help would be highly appreciated!

maxnth · 2021-05-07T15:59:00Z

Calamari >1.0 uses TensorFlow 2.x (previous versions used TensorFlow 1.x), and the weights couldn't be converted (a more detailed explanation can be found here).
Sadly, this means that models trained with e.g. calamari 0.3.5 can't be used with current calamari versions, and therefore not with current OCR4all versions either.
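
Before copying a model into OCR4all, it may help to check which checkpoint format version it declares. The sketch below is illustrative only: it assumes the *.ckpt.json file stores that number under a top-level "version" key, which is not confirmed in this thread, and the function names are hypothetical. The comparison mirrors the rule stated in the error message above, where a version 3 model could not be loaded by an installation that supports version 2.

```python
import json


def checkpoint_version(json_path):
    """Return the declared checkpoint version, or None if the key is absent.

    Assumption: the version is stored under a top-level "version" key.
    """
    with open(json_path, encoding="utf-8") as f:
        meta = json.load(f)
    return meta.get("version")


def needs_downgrade(model_version, supported_version):
    """True if the model is newer than what the installed calamari supports."""
    return model_version > supported_version
```

With the numbers from the error message, needs_downgrade(3, 2) is True, which corresponds to the "Downgrading of models is not supported" exception.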

maxnth · 2021-05-09T16:10:15Z

Addendum:
To help ease the transition from older calamari / OCR4all versions to newer ones, we currently offer to retrain models with an up-to-date calamari version on our GPU server.

Nesbi closed this as completed Sep 11, 2019
