Using calamari models in OCR4all #16

Closed
alexander-winkler opened this issue Jul 16, 2019 · 6 comments

Comments

@alexander-winkler

Hello!
I have trained a Calamari model using the calamari-train command (v0.3.5). Because I'd like to use OCR4all to keep track of the project, I tried copying the model into the OCR4all models directory:

project/
└── 0
    ├── 0.ckpt.data-00000-of-00001
    ├── 0.ckpt.index
    ├── 0.ckpt.json
    ├── 0.ckpt.meta
    ├── checkpoint

During the recognition process I get the following error message:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/json_format.py", line 547, in _ConvertFieldValuePair
    self.ConvertMessage(value, sub_message)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/json_format.py", line 452, in ConvertMessage
    self._ConvertFieldValuePair(value, message)
  File "/usr/local/lib/python3.6/dist-packages/google/protobuf/json_format.py", line 552, in _ConvertFieldValuePair
    raise ParseError('Failed to parse {0} field: {1}'.format(name, e))
google.protobuf.json_format.ParseError: Failed to parse network field: Failed to parse backend field: Message type "BackendParams" has no field named "shuffleBufferSize".
 Available Fields(except extensions): 

Is there a way to use externally trained models in OCR4all?
Thanks in advance!

@ChWick

ChWick commented Jul 18, 2019

OCR4all currently uses an older version of Calamari (v0.3.3), which is why models trained with the newer version (v0.3.5) cannot be used with OCR4all. You have three options:

  • Downgrade calamari to v0.3.3
  • Wait for OCR4all to upgrade Calamari to v0.3.5
  • "Hack your model" by removing the shuffleBufferSize field in your 0.ckpt.json file
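
The third option can be sketched in Python. This is a minimal sketch, not an official Calamari or OCR4all tool: the helper names `strip_field` and `patch_checkpoint` are hypothetical, and the recursive search is an assumption made so that the exact nesting of the field (the error message suggests network → backend) does not need to be known. It writes a `.bak` copy of the JSON file before changing it.

```python
import json
from pathlib import Path


def strip_field(node, field_name):
    """Recursively delete every key called field_name; return how many were removed."""
    removed = 0
    if isinstance(node, dict):
        if field_name in node:
            del node[field_name]
            removed += 1
        for value in node.values():
            removed += strip_field(value, field_name)
    elif isinstance(node, list):
        for item in node:
            removed += strip_field(item, field_name)
    return removed


def patch_checkpoint(json_path, field_name="shuffleBufferSize"):
    """Remove field_name from a checkpoint JSON file (hypothetical helper).

    The original file content is saved next to it with a .bak suffix
    before the patched version is written.
    """
    path = Path(json_path)
    original_text = path.read_text(encoding="utf-8")
    params = json.loads(original_text)
    removed = strip_field(params, field_name)
    if removed:
        backup = path.with_name(path.name + ".bak")
        backup.write_text(original_text, encoding="utf-8")
        path.write_text(json.dumps(params, indent=2), encoding="utf-8")
    return removed
```

Calling `patch_checkpoint("0.ckpt.json")` returns the number of removed entries; a return value of 0 means the file was left untouched. Whether the patched model then loads in the older Calamari version is not guaranteed by this sketch.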

@Nesbi
Contributor

Nesbi commented Jul 18, 2019

Note that only the stable version of OCR4all uses Calamari v0.3.3.
The nightly and dev versions use the newest Calamari version.
We are currently working on a new stable version, which will include the new Calamari version.

You can pull the nightly or dev version from Docker Hub:
docker pull ls6uniwue/ocr4all:<tag_name>

However, we recommend waiting for the next stable release, which is planned to arrive shortly.

@Nesbi
Contributor

Nesbi commented Sep 11, 2019

The stable release includes this fix.

Nesbi closed this as completed Sep 11, 2019

@alexander-winkler
Author

Hello! Strangely, I have run into the same or a similar error again with a model trained in Calamari (calamari-train v0.3.5) that I'd like to use in the current stable version. The 'hack' mentioned above is not possible, as my *.ckpt.json has no shuffleBufferSize field.

Traceback (most recent call last):
  File "/usr/local/bin/calamari-predict", line 33, in <module>
    sys.exit(load_entry_point('calamari-ocr==1.0.5', 'console_scripts', 'calamari-predict')())
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/scripts/predict.py", line 197, in main
    run(args)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/scripts/predict.py", line 95, in run
    ctc_decoder_params=ctc_decoder_params)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/predictor.py", line 228, in __init__
    data_preproc=data_preproc, processes=processes) for cp in checkpoints]
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/predictor.py", line 228, in <listcomp>
    data_preproc=data_preproc, processes=processes) for cp in checkpoints]
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/predictor.py", line 102, in __init__
    ckpt = Checkpoint(checkpoint, auto_update=self.auto_update_checkpoints)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/checkpoint.py", line 27, in __init__
    self.update_checkpoint()
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/checkpoint.py", line 43, in update_checkpoint
    .format(self.version, Checkpoint.VERSION, __version__))
Exception: Downgrading of models is not supported (3 to 2). Please upgrade your Calamari instance (currently installed: 1.0.5)

Any help would be highly appreciated!

@maxnth
Member

maxnth commented May 7, 2021

Calamari >1.0 uses TensorFlow 2.x (previous versions used TensorFlow 1.x), and the weights could not be converted (a more detailed explanation can be found here).
Unfortunately, this means that models trained with, for example, Calamari 0.3.5 cannot be used with current Calamari versions, and therefore not with current OCR4all versions either.
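
The exception in the traceback above compares a version number stored with the model against the one the installed Calamari supports. A minimal sketch for inspecting that number before copying a model into OCR4all follows. It assumes the checkpoint JSON stores it under a top-level `"version"` key, which this thread does not confirm; the helper names `checkpoint_version` and `needs_downgrade` are hypothetical.

```python
import json
from pathlib import Path


def checkpoint_version(json_path):
    """Return the version stored in a checkpoint JSON file.

    Assumption: the value lives under a top-level "version" key.
    Falls back to 0 if the key is absent.
    """
    params = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return params.get("version", 0)


def needs_downgrade(model_version, supported_version):
    """True if the model is newer than what the installed Calamari supports,
    which is the situation the "Downgrading of models is not supported" error reports."""
    return model_version > supported_version
```

With the numbers from the error message, `needs_downgrade(3, 2)` is true. A false result says nothing about whether TensorFlow 1.x weights can be loaded by a TensorFlow 2.x based version.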

@maxnth
Member

maxnth commented May 9, 2021

Addendum:
To help ease the transition from older Calamari / OCR4all versions to newer ones, we currently offer to retrain models with an up-to-date Calamari version on our GPU server.
