Bad results in web interface and download not possible #60

Closed
stweil opened this issue Dec 9, 2017 · 1 comment

@stweil
Member

stweil commented Dec 9, 2017

The web interface (for example https://digi.bib.uni-mannheim.de/ocr-fileformat/) shows damaged results when converting from hOCR to ALTO2.0 or ALTO2.1:

<?xml version="1.0" encoding="utf-8"?>
<alto xmlns="http://www.loc.gov/standards˺lto/ns-v2#"
      xmlns:xlink="http://www.w3.org�/xlink"
      xmlns:xsi="http://www.w3.org�/XMLSchema-instance"
      xsi:schemaLocation="http://www.loc.gov/standards˺lto/ns-v2# http://www.loc.gov/standards˺lto/v2˺lto-2-0.xsd">
   <Description>
      <MeasurementUnit>pixel</MeasurementUnit>

It should look like this:

<?xml version="1.0" encoding="utf-8"?>
<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.loc.gov/standards/alto/ns-v2# http://www.loc.gov/standards/alto/v2/alto-2-0.xsd">
   <Description>
      <MeasurementUnit>pixel</MeasurementUnit>
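
A converted file can be checked for this kind of damage by comparing the namespace of its root element with the expected ALTO 2 namespace shown above. The following is only a minimal sketch using Python's standard library. The function name `has_expected_alto_namespace` is hypothetical and not part of ocr-fileformat, and the check only looks at the root namespace, not at `xsi:schemaLocation`.

```python
import xml.etree.ElementTree as ET

# Expected namespace, taken from the correct output shown above.
ALTO_V2_NS = "http://www.loc.gov/standards/alto/ns-v2#"


def has_expected_alto_namespace(xml_text: str) -> bool:
    """Return True if the root element is <alto> in the ALTO 2 namespace.

    Hypothetical helper for illustration only; it is not part of
    ocr-fileformat.
    """
    try:
        # Parse from bytes so the encoding declaration in the prolog is honoured.
        root = ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError:
        # Truncated or otherwise malformed output counts as damaged.
        return False
    # ElementTree reports qualified tags as "{namespace}localname".
    return root.tag == "{%s}alto" % ALTO_V2_NS
```

A damaged namespace URI such as `http://www.loc.gov/standards˺lto/ns-v2#` is still well-formed XML, so it parses without error but fails the comparison.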

It is also not possible to download the result using the Download button.

According to user feedback, the web interface still worked two months ago. Since then, we have upgraded the server from Debian Jessie to Stretch, so the problem might be related to the change from PHP 5.6 to PHP 7.0.

@stweil stweil changed the title Bad results in web interface Bad results in web interface and download not possible Dec 9, 2017
@stweil stweil added the bug label Dec 9, 2017
@stweil stweil added this to the release 0.2.2 milestone Dec 9, 2017
@zuphilip
Member

zuphilip commented Dec 9, 2017

The encoding problem is also visible in the screenshot, which was created a year ago, and the Dockerfile has used PHP 7.0 from the beginning, so the PHP upgrade is probably not the cause.

zuphilip added a commit to zuphilip/ocr-fileformat that referenced this issue Dec 9, 2017
This seems to fix the encoding problem mentioned in UB-Mannheim#60
without any other negative effect.
@stweil stweil closed this as completed in deba1a6 Dec 10, 2017