Prediction API Error #270

Mageswaran1989 · 2021-07-01T12:22:55Z

I used the CLI to train on the SROIE2019 dataset (the original images are preprocessed into line images) with:

calamari-train \
--device.gpus 0 \
--trainer.gen SplitTrain \
--trainer.gen.validation_split_ratio=0.2  \
--trainer.output_dir /data/model_output \
--trainer.epochs 25 \
--early_stopping.frequency=1 \
--early_stopping.n_to_go=3 \
--train.images /data/*.jpg

Training went smoothly; the logs are attached:
train.log

After training, I tried to load the model as mentioned here, but I get the following error:

>>> predictor = Predictor.from_checkpoint(params=PredictorParams(), checkpoint='/data/model_output/best.ckpt')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/calamari_ocr/ocr/predict/predictor.py", line 31, in from_checkpoint
    keras.models.load_model(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/saving/save.py", line 206, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/saving/hdf5_format.py", line 182, in load_model_from_hdf5
    model_config = json_utils.decode(model_config.decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'

I also tried loading a pretrained model from antiqua_historical and got the same error:

>>> predictor = Predictor.from_checkpoint(params=PredictorParams(), checkpoint='/data/model_output/antiqua_historical/0.ckpt')
/usr/local/lib/python3.8/dist-packages/paiargparse/dataclass_json_overrides.py:78: RuntimeWarning: `NoneType` object value of non-optional type tfaip_commit_hash detected when decoding CalamariScenarioParams.
  warnings.warn(f"`NoneType` object {warning}.", RuntimeWarning)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/calamari_ocr/ocr/predict/predictor.py", line 26, in from_checkpoint
    ckpt = SavedCalamariModel(checkpoint, auto_update=auto_update_checkpoints)
  File "/usr/local/lib/python3.8/dist-packages/calamari_ocr/ocr/savedmodel/saved_model.py", line 31, in __init__
    self.update_checkpoint()
  File "/usr/local/lib/python3.8/dist-packages/calamari_ocr/ocr/savedmodel/saved_model.py", line 56, in update_checkpoint
    self._single_upgrade()
  File "/usr/local/lib/python3.8/dist-packages/calamari_ocr/ocr/savedmodel/saved_model.py", line 88, in _single_upgrade
    update_model(self.dict, self.ckpt_path)
  File "/usr/local/lib/python3.8/dist-packages/calamari_ocr/ocr/savedmodel/migrations/version3_4to5.py", line 22, in update_model
    pred_model.load_weights(path + ".h5")
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/engine/training.py", line 2234, in load_weights
    hdf5_format.load_weights_from_hdf5_group(f, self.layers)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/keras/saving/hdf5_format.py", line 662, in load_weights_from_hdf5_group
    original_keras_version = f.attrs['keras_version'].decode('utf8')
AttributeError: 'str' object has no attribute 'decode'


ChWick · 2021-07-01T12:29:05Z

This might be related to TensorFlow. Can you post your TensorFlow and h5py versions? (TensorFlow should be >= 2.3.0.) I guess there is a mismatch.

Mageswaran1989 · 2021-07-01T12:36:56Z

>>> tf.__version__
'2.4.2'

I am using docker to build the model with only calamari installed.

best.ckpt.zip

ChWick · 2021-07-01T12:55:47Z

I just set up a fresh venv (current calamari master) with Python 3.8:

virtualenv -p python3.8 venv
git clone https://github.com/Calamari-OCR/calamari.git
source venv/bin/activate
pip install -U pip
pip install -e calamari

I had no problems loading your provided model with the predict script (calamari-predict), using tensorflow==2.4.2 and h5py==2.10.0. Maybe your h5py version is already 3.x?
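
A quick way to confirm the suspected mismatch is to print the installed versions from inside the same environment. The sketch below assumes the root cause is that h5py 3.x returns string attributes as `str` where 2.x returned `bytes`, so the `.decode(...)` calls in the TensorFlow 2.4 HDF5 loader fail. The helper names (`installed_version`, `h5py_is_pre_3`, `as_text`) are made up for illustration and are not part of Calamari or TensorFlow.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def h5py_is_pre_3(version: str) -> bool:
    """True if the major component of the version string is below 3."""
    return int(version.split(".")[0]) < 3


def as_text(value) -> str:
    """Decode bytes (h5py 2.x style) and pass str (h5py 3.x style) through.

    Calling .decode() unconditionally on a str raises the AttributeError
    seen in the tracebacks above.
    """
    return value.decode("utf8") if isinstance(value, bytes) else value


if __name__ == "__main__":
    # Report what is actually installed in this environment.
    for name in ("tensorflow", "h5py"):
        print(name, installed_version(name))
```

If the reported h5py version starts with 3, pinning it back (for example `pip install "h5py<3"`) is one possible workaround under this assumption.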

Mageswaran1989 · 2021-07-01T13:00:31Z

Yup, but I am installing only calamari using pip, with Python 3.8.5 (the default installed version):

h5py 3.3.0

Mageswaran1989 · 2021-07-01T13:01:24Z

Dockerfile

FROM nvidia/cuda:11.1-cudnn8-runtime-ubuntu20.04 as runtime-image

ARG DEBIAN_FRONTEND=noninteractive
RUN ln -snf /usr/share/zoneinfo/$CONTAINER_TIMEZONE /etc/localtime && echo $CONTAINER_TIMEZONE > /etc/timezone
RUN mkdir -p /usr/share/man/man1/

RUN --mount=type=cache,target=/var/cache/apt \
    apt-get update && \
    apt-get install --no-install-recommends --no-install-suggests -y \
    build-essential \
    curl \
    ca-certificates p11-kit \
    python3-dev \
    python3-distutils \
    python3-venv \
    openjdk-11-jre-headless \
    tesseract-ocr \
    libtesseract-dev \
    libpq-dev \
    python3-pip \
    libgl1-mesa-glx &&\
    apt clean && rm -rf /var/lib/apt/lists/*

RUN pip3 install calamari-ocr
COPY ops/docker/ocr/requirements.txt /
RUN --mount=type=cache,target=/root/.cache/pip3 pip3 install --no-cache-dir -r /requirements.txt
RUN ln -s /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcusolver.so.11 /usr/local/cuda-11.1/targets/x86_64-linux/lib/libcusolver.so.10
ENV LD_LIBRARY_PATH=/usr/local/cuda-11.1/lib64/:$LD_LIBRARY_PATH

ChWick · 2021-07-01T13:31:18Z

I guess the pip version is outdated (<21.x). Older pip versions do not resolve version conflicts automatically, which is why a newer h5py version gets installed. When I run pip install calamari-ocr in a venv, I get h5py==2.10.0.

Please try to insert a

pip3 install -U pip setuptools

in your Dockerfile before you install calamari-ocr.
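
To see whether the image's pip is old enough to cause this, the version can be checked before installing anything. This is a minimal sketch that assumes the dependency resolver that enforces version constraints became the default in pip 20.3; the function names are hypothetical and it only handles plain numeric release versions such as "20.0.2".

```python
def major_minor(version: str) -> tuple:
    """Turn a plain release string like '20.0.2' into (20, 0)."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def uses_new_resolver(pip_version: str) -> bool:
    """Assumption: pip >= 20.3 resolves conflicting requirements by default."""
    return major_minor(pip_version) >= (20, 3)


if __name__ == "__main__":
    from importlib import metadata

    try:
        current = metadata.version("pip")
        print("pip", current, "new resolver:", uses_new_resolver(current))
    except metadata.PackageNotFoundError:
        print("pip is not installed in this environment")
```

With an older pip, a constraint such as TensorFlow's h5py pin can be silently ignored, which would match the h5py 3.3.0 reported above.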

Mageswaran1989 · 2021-07-01T13:44:15Z

Thanks a lot @ChWick

It's working with h5py==2.10.0 and the predictions are also quite good :)

However, keeping the predictor alive runs into an error:

>>> raw_predictor = predictor.raw().__enter()__  # you can also wrap the following lines in a `with`-block
  File "<stdin>", line 1
    raw_predictor = predictor.raw().__enter()__  # you can also wrap the following lines in a `with`-block
                                             ^
SyntaxError: invalid syntax

If you are OK with it, I can put together a small example that trains on SROIE2019 and runs predictions on whole images, using CRAFT for text detection and Calamari for OCR.

ChWick · 2021-07-01T13:47:11Z

Oops, this is a typo in the docs. It should be:

raw_predictor = predictor.raw().__enter__()
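
For context, `__enter__()` is the method a `with` statement calls on a context manager, so calling it by hand keeps the object open until `__exit__` is called explicitly. The sketch below shows the pattern with a hypothetical `DemoResource` class standing in for the predictor; it is not a Calamari class, and `contextlib.ExitStack` is offered only as one tidy way to hold such an object open across calls.

```python
from contextlib import ExitStack


class DemoResource:
    """Hypothetical stand-in for a long-lived predictor (not a Calamari class)."""

    def __init__(self):
        self.open = False

    def __enter__(self):
        self.open = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self.open = False
        return False  # do not swallow exceptions

    def predict(self, text: str) -> str:
        if not self.open:
            raise RuntimeError("resource is closed")
        return text.upper()


# Option 1: a with-block; cleanup happens automatically at the end.
with DemoResource() as res:
    first = res.predict("abc")

# Option 2: keep the resource alive across calls and close it later.
stack = ExitStack()
kept = stack.enter_context(DemoResource())  # calls __enter__()
second = kept.predict("def")
stack.close()  # calls __exit__()
```

The second form is the equivalent of calling `__enter__()` manually, with the advantage that `stack.close()` makes the matching cleanup explicit.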

Mageswaran1989 · 2021-07-01T13:48:31Z

>>> raw_predictor = predictor.raw().__enter__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Predictor' object has no attribute 'raw'

ChWick · 2021-07-01T14:14:28Z

Ah, this particular interface is rather new and not yet included in the current pip release; it's only in master and will be included in the next release.
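
Code that has to run against both the pip release and master could probe for the method before using it. A minimal sketch, using hypothetical stand-in classes rather than the real `Predictor`:

```python
def has_raw_interface(predictor) -> bool:
    """True if the object exposes a callable `raw` attribute."""
    return callable(getattr(predictor, "raw", None))


class ReleasedPredictor:
    """Stand-in for a predictor without the newer interface."""


class MasterPredictor:
    """Stand-in for a predictor that provides `raw()`."""

    def raw(self):
        return self


old_ok = has_raw_interface(ReleasedPredictor())
new_ok = has_raw_interface(MasterPredictor())
```

When the check fails, the caller can fall back to whatever prediction path the installed release supports.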

ChWick · 2021-07-01T14:18:24Z

I can draft a quick minor release if you rely on this feature, though! Just let me know.

Mageswaran1989 · 2021-07-01T14:59:45Z

I hope there is not much difference between the two methods; for now I am good with the working version.

It would be good to have it in the pip version in general :) if it is not much work. Thanks again.

ChWick closed this as completed Sep 7, 2021
