ValueError: zero-size array to reduction operation maximum which has no identity #64

Closed
alexander-winkler opened this issue Jun 24, 2020 · 5 comments

Comments

@alexander-winkler

Hello! During recognition the following error is thrown and recognition effectively stops, while the status still reads "Status: ERROR: The process is still running" (OCR4all ver 0.3.0, LAREX ver 0.3.1):

Process ForkProcess-3:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/datasets/input_dataset.py", line 99, in run
    out = self.apply_single(*data)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/datasets/input_dataset.py", line 119, in apply_single
    line, params = self.params.data_processor.apply([line], 1, False)[0]
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/data_processing/data_preprocessor.py", line 19, in apply
    processes=processes, progress_bar=progress_bar, max_tasks_per_child=max_tasks_per_child)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/utils/multiprocessing.py", line 32, in parallel_map
    out = list(map(f, d))
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/data_processing/data_preprocessor.py", line 50, in _apply_single
    data, params = proc._apply_single(data)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/data_processing/data_preprocessor.py", line 50, in _apply_single
    data, params = proc._apply_single(data)
  File "/usr/local/lib/python3.6/dist-packages/calamari_ocr-1.0.5-py3.6.egg/calamari_ocr/ocr/data_processing/center_normalizer.py", line 15, in _apply_single
    out, params = self.normalize(data, cval=np.amax(data))
  File "<__array_function__ internals>", line 6, in amax
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 2668, in amax
    keepdims=keepdims, initial=initial, where=where)
  File "/usr/local/lib/python3.6/dist-packages/numpy/core/fromnumeric.py", line 90, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: zero-size array to reduction operation maximum which has no identity

I have attached the PAGE XML and image files for the page where the error occurs. I couldn't find anything suspicious in them.

0125 nrm
0125 bin
0125.txt

@maxnth maxnth added the bug label Jun 24, 2020
@maxnth
Member

maxnth commented Jun 24, 2020

Hello, the problem is caused by the following faulty text line in the PAGE XML, which for some reason has only two coordinate points.

<TextLine id="l6">
    <Coords points="911,399 911,459"/>
    <TextEquiv>
        <Unicode/>
    </TextEquiv>
    <TextEquiv index="1">
        <Unicode>VANAMTatione fieri dicam Au-</Unicode>
    </TextEquiv>
</TextLine>
This can also be seen on the canvas
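
A two-point line like `l6` has identical x values (911), so its bounding box is zero pixels wide, which would explain the zero-size array in the traceback. As a rough way to find such lines before recognition, a PAGE XML file could be scanned with a standard-library sketch like the one below. This is not part of OCR4all or Calamari: the function names are hypothetical, the `l5` line in the sample is invented for contrast, and tags are matched by local name on the assumption that the namespace varies between PAGE schema versions.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace: '{http://...}TextLine' -> 'TextLine'."""
    return tag.rsplit("}", 1)[-1]


def parse_points(points):
    """Turn a PAGE 'points' attribute ('x1,y1 x2,y2 ...') into (x, y) tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def find_degenerate_lines(xml_text):
    """Return (line id, reason) for each TextLine whose Coords enclose no pixels."""
    problems = []
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Only look at the direct Coords child of the TextLine itself.
        coords = next((c for c in elem if local_name(c.tag) == "Coords"), None)
        if coords is None:
            problems.append((elem.get("id"), "no Coords element"))
            continue
        pts = parse_points(coords.get("points", ""))
        if len(pts) < 3:
            # Fewer than three points cannot form a polygon.
            problems.append((elem.get("id"), f"only {len(pts)} points"))
            continue
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        if max(xs) == min(xs) or max(ys) == min(ys):
            problems.append((elem.get("id"), "zero-area bounding box"))
    return problems


# 'l6' is the line from this issue; 'l5' is a made-up valid line for contrast.
SAMPLE = """<Page>
  <TextLine id="l5"><Coords points="100,100 400,100 400,160 100,160"/></TextLine>
  <TextLine id="l6"><Coords points="911,399 911,459"/></TextLine>
</Page>"""

print(find_degenerate_lines(SAMPLE))  # [('l6', 'only 2 points')]
```

Reading a real file would only need `find_degenerate_lines(open(path, encoding="utf-8").read())`; any line it reports can then be fixed or deleted in the editor before recognition is rerun.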

Was this PAGE XML created using OCR4all only, or were any changes made to it manually?

@alexander-winkler
Author

Oh, great, thank you! Maybe it's useful for you to know that the pagexml is the result of the new function that converts legacy data into pagexml.

@maxnth
Member

maxnth commented Jun 24, 2020

That's actually pretty useful, thank you.
Would it be possible for you to send us the legacy project (or all legacy files related to page 0125 in case the project is very large)?

@alexander-winkler
Author

I hope this helps.

0125.zip

@maxnth
Member

maxnth commented Jun 24, 2020

Thank you for the files.
Sadly, I couldn't reproduce the broken coordinates when converting to latest with the supplied legacy files, but we'll keep looking into this issue.

@maxnth maxnth closed this as completed Aug 31, 2020