Prediction step using very deep neural networks feature of calamari #67

Closed
Tailor2019 opened this issue Feb 14, 2019 · 12 comments

@Tailor2019

Hi,
I installed calamari-0.2.4 and tried to test it on this simple example, https://user-images.githubusercontent.com/33478216/46499779-a909b480-c829-11e8-87f2-d4a34d84ab69.png, by running:
calamari-predict --checkpoint calamari_models/default/ModernEnglish.ckpt --files data.png

It returns this error:
Found 1 files in the dataset
Traceback (most recent call last):
  File "/home/pc/my_calamari_env/bin/calamari-predict", line 11, in <module>
    load_entry_point('calamari-ocr==0.2.4', 'console_scripts', 'calamari-predict')()
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/scripts/predict.py", line 151, in main
    run(args)
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/scripts/predict.py", line 61, in run
    predictor = MultiPredictor(checkpoints=args.checkpoint, batch_size=args.batch_size, processes=args.processes)
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/ocr/predictor.py", line 202, in __init__
    self.predictors = [Predictor(cp, batch_size=batch_size, processes=processes) for cp in checkpoints]
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/ocr/predictor.py", line 202, in <listcomp>
    self.predictors = [Predictor(cp, batch_size=batch_size, processes=processes) for cp in checkpoints]
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/ocr/predictor.py", line 100, in __init__
    ckpt = Checkpoint(checkpoint, auto_update=self.auto_update_checkpoints)
  File "/home/pc/my_calamari_env/lib/python3.5/site-packages/calamari_ocr-0.2.4-py3.5.egg/calamari_ocr/ocr/checkpoint.py", line 20, in __init__
    self.json = json.load(f)
  File "/usr/lib/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 7 column 1 (char 6)

Thanks for your help :)

@Tailor2019
Author

[Screenshots: cap1, cap2]
The first screenshot shows the input image and the pretrained model I used for testing Calamari_ocr.
The second one shows the directory where Calamari_ocr is installed (I used the virtual environment "my_calamari_env" for the installation).
This is urgent; I hope you can reply as soon as possible.

@ChWick
Member

ChWick commented Feb 14, 2019

Unfortunately, I cannot reproduce this on my machine.

The error occurs when loading the ckpt.json file. I guess there is something wrong with that file (possibly empty, please paste the file content). Please check if a model from https://github.com/Calamari-OCR/calamari_models/tree/master/antiqua_modern is working.
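A quick way to test the "possibly empty" guess is a small script like the one below. It is only a sketch: `describe_json_file` is a hypothetical helper (not part of Calamari), and the path at the bottom is an assumption that should be replaced with the actual location of the checkpoint file.

```python
import json
import os


def describe_json_file(path):
    """Return a short status string for a JSON file: missing, empty, invalid, or ok."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    try:
        with open(path, "r", encoding="utf-8") as f:
            json.load(f)
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        return "invalid: {}".format(exc)
    return "ok"


if __name__ == "__main__":
    # Hypothetical path; replace it with the location of your own checkpoint file.
    print(describe_json_file("calamari_models/default/ModernEnglish.ckpt.json"))
```

Anything other than "ok" would point at the file on disk rather than at the prediction code.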

@Tailor2019
Author

This is the content of the ckpt.json: https://github.com/Calamari-OCR/calamari_models/blob/53c8523aa31d14a79b26d9126ee68f1781beaa61/default/ModernEnglish.ckpt.json
I have tried all the models in
OCR/calamari_models/tree/master/antiqua_modern
but each test returns the same error that I reported before.
I hope you can help me resolve this problem as soon as possible.
Likewise, version 0.2.3 returns the same error.

@ChWick
Member

ChWick commented Feb 15, 2019

With the older files, too, I can predict successfully and cannot reproduce your error:

calamari-predict --checkpoint ~/Downloads/ModernEnglish.ckpt.json --files ~/Downloads/aaa.png 
Found 1 files in the dataset
Upgrading from version 0
2019-02-15 10:06:27.476984: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Renaming B to cnn_lstm/B.
Renaming B/Adam to cnn_lstm/B/Adam.
Renaming B/Adam_1 to cnn_lstm/B/Adam_1.
Renaming Minimum/ExponentialMovingAverage to cnn_lstm/Minimum/ExponentialMovingAverage.
Renaming W to cnn_lstm/W.
Renaming W/Adam to cnn_lstm/W/Adam.
Renaming W/Adam_1 to cnn_lstm/W/Adam_1.
Renaming beta1_power to cnn_lstm/beta1_power.
Renaming beta2_power to cnn_lstm/beta2_power.
Renaming conv2d/bias to cnn_lstm/conv2d/bias.
Renaming conv2d/bias/Adam to cnn_lstm/conv2d/bias/Adam.
Renaming conv2d/bias/Adam_1 to cnn_lstm/conv2d/bias/Adam_1.
Renaming conv2d/kernel to cnn_lstm/conv2d/kernel.
Renaming conv2d/kernel/Adam to cnn_lstm/conv2d/kernel/Adam.
Renaming conv2d/kernel/Adam_1 to cnn_lstm/conv2d/kernel/Adam_1.
Renaming conv2d_1/bias to cnn_lstm/conv2d_1/bias.
Renaming conv2d_1/bias/Adam to cnn_lstm/conv2d_1/bias/Adam.
Renaming conv2d_1/bias/Adam_1 to cnn_lstm/conv2d_1/bias/Adam_1.
Renaming conv2d_1/kernel to cnn_lstm/conv2d_1/kernel.
Renaming conv2d_1/kernel/Adam to cnn_lstm/conv2d_1/kernel/Adam.
Renaming conv2d_1/kernel/Adam_1 to cnn_lstm/conv2d_1/kernel/Adam_1.
Renaming cudnn_lstm/opaque_kernel to cnn_lstm/cudnn_lstm/opaque_kernel.
Renaming cudnn_lstm/opaque_kernel/Adam to cnn_lstm/cudnn_lstm/opaque_kernel/Adam.
Renaming cudnn_lstm/opaque_kernel/Adam_1 to cnn_lstm/cudnn_lstm/opaque_kernel/Adam_1.
Renaming cudnn_lstm/stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/cudnn_compatible_lstm_cell/bias to cnn_lstm/cudnn_lstm/stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/cudnn_compatible_lstm_cell/bias.
Renaming cudnn_lstm/stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/cudnn_compatible_lstm_cell/kernel to cnn_lstm/cudnn_lstm/stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/cudnn_compatible_lstm_cell/kernel.
Renaming cudnn_lstm/stack_bidirectional_rnn/cell_0/bidirectional_rnn/fw/cudnn_compatible_lstm_cell/bias to cnn_lstm/cudnn_lstm/stack_bidirectional_rnn/cell_0/bidirectional_rnn/fw/cudnn_compatible_lstm_cell/bias.
Renaming cudnn_lstm/stack_bidirectional_rnn/cell_0/bidirectional_rnn/fw/cudnn_compatible_lstm_cell/kernel to cnn_lstm/cudnn_lstm/stack_bidirectional_rnn/cell_0/bidirectional_rnn/fw/cudnn_compatible_lstm_cell/kernel.
Successfully upgraded checkpoint version to 1
Using CUDNN compatible LSTM backend on CPU
Loading Dataset: 100%|████████████████████████████| 1/1 [00:00<00:00, 46.45it/s]
Data Preprocessing: 100%|█████████████████████████| 1/1 [00:00<00:00, 70.74it/s]
Prediction: 100%|█████████████████████████████████| 1/1 [00:00<00:00,  6.85it/s]
Prediction of 1 models took 0.1836857795715332s
Average sentence confidence: 0.00%
All files written

What operating system and Python version are you using? (I tested successfully on Ubuntu with Python 3.5 and 3.6.)
Please check whether the following dummy code runs in your venv:

import json
json.load(open("PATH_TO_ckpt.json", 'r'))
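If that dummy code fails, it can help to see what the parser actually tripped over. The following is a rough sketch under assumptions: `explain_json_failure` and its `context` parameter are hypothetical names introduced here, and the file is assumed to be UTF-8 (undecodable bytes are replaced rather than raising).

```python
import json


def explain_json_failure(path, context=20):
    """Try to parse `path`; on failure, return the error plus the raw text around it."""
    with open(path, "rb") as f:
        raw = f.read()
    # Decode leniently so that a bad byte does not hide the JSON error itself.
    text = raw.decode("utf-8", errors="replace")
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        start = max(exc.pos - context, 0)
        snippet = text[start:exc.pos + context]
        return "{} | first bytes: {!r} | near error: {!r}".format(
            exc, raw[:context], snippet)
    return "parsed without error"
```

Calling `print(explain_json_failure("PATH_TO_ckpt.json"))` prints the decoder message together with the first bytes of the file, which makes leading blank lines, a byte order mark, or unexpected non-JSON text visible.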

@Tailor2019
Author

The first command passes successfully, but the second returns the same error that causes my problem.
I'm using Python 3.5.2 on Ubuntu 16.04.5 LTS (lenovo tty1).

@ChWick
Member

ChWick commented Feb 15, 2019

I am very glad that the very same error occurred. It shows that this is not an issue with Calamari but rather with your operating system or file system, since only Python standard-library calls are involved. It might also be an encoding issue (UTF-8 as the default is required).

Can you paste the output of

import sys
print(sys.getdefaultencoding())  # should be utf-8

and

print(open("PATH_TO_ckpt.json", 'r').read())
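One caveat, offered as a sketch rather than a diagnosis: in the Python versions discussed here, `sys.getdefaultencoding()` reports "utf-8" on every Python 3 installation, while `open()` called without an `encoding` argument uses the locale's preferred encoding. The snippet below prints both and reads the file with an explicit encoding; `read_text_utf8` is a hypothetical helper name.

```python
import locale
import sys


def read_text_utf8(path):
    """Read a file as UTF-8 regardless of the locale's preferred encoding."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    print("default string encoding:", sys.getdefaultencoding())
    print("encoding used by open() without an encoding argument:",
          locale.getpreferredencoding(False))
```

If the second value is not UTF-8, reading the checkpoint with `read_text_utf8("PATH_TO_ckpt.json")` and passing the result to `json.loads` would show whether the locale is involved.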

You might also consider upgrading your Python version: try 3.6 (custom repo required) or a newer minor release of 3.5, e.g.:

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get update
sudo apt-get install python3.6

@Tailor2019
Author

After installing Calamari_ocr with Python 3.6, this error occurred:
tensorflow/core/platform/cpu_feature_guard.cc:37] The TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine.
Aborted (core dumped)

@ChWick
Member

ChWick commented Feb 17, 2019

The TensorFlow library was compiled to use AVX instructions, but these aren't available on your machine.

This means that your computer's CPU does not support the current official build of TensorFlow because it lacks the AVX instructions; in other words, your CPU is too old. There are two options: install an older TensorFlow version (pip install tensorflow==1.7) or compile TensorFlow yourself.
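Whether the CPU advertises AVX can be checked before reinstalling anything. This is a minimal sketch that assumes an x86 Linux machine, where /proc/cpuinfo contains a "flags" line; `cpu_flags` and `supports_avx` are hypothetical helper names.

```python
import os


def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx(cpuinfo_text):
    """Return True if the exact flag 'avx' is present (avx2 alone does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    # Linux only: /proc/cpuinfo does not exist on other platforms.
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo", "r") as f:
            print("AVX available:", supports_avx(f.read()))
```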

What was the output of the commands above? Did the JSON file load properly? Can you open it successfully in a browser (e.g. Firefox)?

@Tailor2019
Author

Thanks for your help.
Yes, the JSON file is loaded properly in the terminal.

@ChWick
Member

ChWick commented Feb 18, 2019

So the file is printed properly, but json.load (or json.loads) does not work on the file's content. Therefore, the string/content must be corrupted. Since you did not answer all my questions, I assume that the encoding ('utf8') is correct. The error message

json.decoder.JSONDecodeError: Expecting value: line 7 column 1 (char 6)
also says that the file is corrupted.
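The position in that message can be read fairly literally: character offset 6 landing on line 7, column 1 means the six characters before it are all newlines, and the parser found something other than a JSON value right after them. The sketch below reproduces the same message from made-up content; the `<not json>` text is purely hypothetical and says nothing about what the real file contains.

```python
import json

# Hypothetical content: six newlines followed by text that is not JSON.
content = "\n" * 6 + "<not json>"

try:
    json.loads(content)
except json.JSONDecodeError as exc:
    message = str(exc)
    print(message)  # Expecting value: line 7 column 1 (char 6)
    print(exc.lineno, exc.colno, exc.pos)  # 7 1 6
```

Comparing this with the first bytes of the file that is actually being opened would show whether it starts the same way.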

In conclusion, either the file really is corrupted or something is wrong with your Python environment. Since I assume that Firefox loads the file correctly and that the printed string is valid JSON (unfortunately, you did not send me the content or the file as I requested), it must be the latter, which has nothing to do with Calamari.

@ChWick ChWick closed this as completed Feb 18, 2019
@ChWick
Member

ChWick commented Feb 21, 2019

@Tailor2019 Recently, another user had a similar error caused by additional hidden files in the models directory. Please also check that only that single model is loaded and not a second (hidden) one.
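To look for such stray files, one could list every .json file in the models directory, hidden ones included, and try to parse each. This is only a sketch: `check_json_files` is a hypothetical helper, and it assumes the hidden files in question end in ".json".

```python
import json
import os


def check_json_files(directory):
    """Map each *.json file name in `directory` (hidden ones included) to 'ok' or an error text."""
    results = {}
    # os.listdir also returns names that start with a dot.
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".json"):
            continue
        path = os.path.join(directory, name)
        try:
            with open(path, "r", encoding="utf-8") as f:
                json.load(f)
            results[name] = "ok"
        except (OSError, ValueError) as exc:
            # JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
            results[name] = "error: {}".format(exc)
    return results
```

For example, `print(check_json_files("calamari_models/default"))` would show any dot-prefixed entry next to the expected checkpoint, along with whether it parses.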

@Tailor2019
Author

Hi,
I'm trying to use all the models, ModernEnglish or antiqua_modern:
- The output of
import sys
print(sys.getdefaultencoding())
is "utf-8".
- The instruction print(open("PATH_TO_ckpt.json", 'r').read()) shows the content of the ckpt file in the terminal, and it is not empty.
I still have the same problem. What is the solution, please?
Thanks a lot
