Can't set the headers #1

Open
ghost opened this issue Nov 11, 2020 · 3 comments

Comments

@ghost

ghost commented Nov 11, 2020

Hey there,
Thank you for this image. It works great on desktop: a screenshot from a Georgian-language news channel was OCR'ed perfectly.
I am trying to use it on Google Cloud Run. I deployed it after changing some of the environment variables, and the deployment went fine.
But there is an issue with the cloud version.

I can't set the headers on the build that's deployed to the cloud.
For example, I can't set the X-Tika-OCRLanguage header via Postman. I can send the PNG file as a binary body (not multipart form data), and it runs OCR with the default languages, which are eng+rus+ara. Please note that I install tesseract-ocr-all via apt-get in the Dockerfile.

I only get this WARN in the logs section:
WARN Both org.apache.tika.server.resource.TikaResource#getHTML and org.apache.tika.server.resource.TikaResource#getText are equal candidates for handling the current request which can lead to unpredictable results

Do I need to send an extra header to Tika?

What do you think about this? It is a common problem in all the images I have come across in the wild.
Is this a CORS issue? Do you know a way to fix it? Maybe it requires a new setting in tika-config.template.xml?

Best,

@kujira-docker
Owner

Thank you for sharing your information.
I am not very familiar with Google Cloud, so this is a general opinion.

This does not seem to be a Tika-Server issue. It looks like you are using a reverse proxy, since you are accessing your app via HTTPS (Tika-Server itself does not support HTTPS).
So you should check that your server configuration passes the headers through. I think you are currently in the state shown in Fig2.

Fig1. Good OCR with localhost Tika-Server

Fig2. Bad OCR with HTTPS reverse proxy and passing no headers

Fig3. Good OCR with HTTPS reverse proxy and passing all headers
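
To make the difference between Fig2 and Fig3 concrete, here is a minimal sketch of a reverse proxy that passes request headers through to Tika-Server. It is only an illustration: the proxy actually sitting in front of the Cloud Run deployment is not shown in this thread, and `forwardableHeaders`, `createProxy`, and the upstream address `localhost:9998` are names and values assumed for the example.

```javascript
const http = require('http');

// Hop-by-hop headers apply to a single connection and are not forwarded.
const HOP_BY_HOP = new Set([
  'connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization',
  'te', 'trailer', 'transfer-encoding', 'upgrade'
]);

// Copy every end-to-end header so custom ones such as
// X-Tika-OCRLanguage reach the upstream server (the Fig3 case).
function forwardableHeaders(incoming) {
  const out = {};
  for (const [name, value] of Object.entries(incoming)) {
    if (!HOP_BY_HOP.has(name.toLowerCase())) out[name] = value;
  }
  return out;
}

// Assumed upstream: a Tika-Server listening on localhost:9998.
function createProxy(upstreamHost = 'localhost', upstreamPort = 9998) {
  return http.createServer((req, res) => {
    const upstream = http.request({
      host: upstreamHost,
      port: upstreamPort,
      method: req.method,
      path: req.url,
      headers: forwardableHeaders(req.headers)
    }, (upRes) => {
      // Relay status, end-to-end headers, and body back to the client.
      res.writeHead(upRes.statusCode, forwardableHeaders(upRes.headers));
      upRes.pipe(res);
    });
    upstream.on('error', () => {
      if (!res.headersSent) res.writeHead(502);
      res.end('Bad gateway');
    });
    req.pipe(upstream);
  });
}
```

Calling `createProxy().listen(8080)` would start it on a port of your choice. A proxy that builds the upstream request with a fixed header list instead of the incoming headers would produce the Fig2 behaviour.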

@ghost
Author

ghost commented Nov 13, 2020

Thank you very much for the reply.
I resolved the issue by appending the following to the last line of the entrypoint:
--cors http://localhost:3000
That value is for localhost, but anyone can change it to the origin their requests are sent from.
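
For anyone who wants to verify that the `--cors` setting took effect, here is a rough sketch that sends a request carrying an `Origin` header and looks at the `Access-Control-Allow-Origin` response header. It only covers that one CORS rule, not preflight requests, and the helper names `isOriginAllowed` and `probeCors` are made up for this example.

```javascript
const http = require('http');
const https = require('https');

// Simplified check of a single CORS rule: the response must either
// allow every origin ("*") or echo back the requesting origin.
function isOriginAllowed(responseHeaders, origin) {
  const allowed = responseHeaders['access-control-allow-origin'];
  return allowed === '*' || allowed === origin;
}

// Send a GET with an Origin header and resolve to true or false.
function probeCors(url, origin) {
  const client = url.startsWith('https:') ? https : http;
  return new Promise((resolve, reject) => {
    client.get(url, { headers: { Origin: origin } }, (res) => {
      res.resume(); // discard the body; only the headers matter here
      resolve(isOriginAllowed(res.headers, origin)); // Node lowercases header names
    }).on('error', reject);
  });
}
```

Assuming the server is reachable at `http://localhost:9998`, something like `probeCors('http://localhost:9998/version', 'http://localhost:3000').then(console.log)` should tell you whether that origin is accepted.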

@ghost
Author

ghost commented Nov 15, 2020

After adding the --cors whateverClient line to the entrypoint,
it certainly works for string PUT requests to the /language/string API:

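const axios = require('axios');

// Request headers: ask Tika for a plain-text response
const putRequestConfig = {
  headers: {
    'Accept': 'text/plain',
    'Connection': 'keep-alive'
  }
};

// Top-level await is not available in CommonJS, so wrap the call in an async function
async function detectLanguage() {
  try {
    const res = await axios.put(
      'https://tika-5ev2qfbmha-ew.a.run.app/language/string',
      'console log which language is this text?', // Any text to request
      putRequestConfig
    );
    console.log('The language of this text is ', res.data); // Response to request
  } catch (err) {
    // Runs only when the request actually fails (the original logged this unconditionally)
    console.log('Put request cannot work because of CORS in docker server', err.message);
  }
}

detectLanguage();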

But I haven't managed to PUT or POST any image or PDF yet.
I must be doing something wrong, but what?
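
For reference, a minimal sketch of how an image upload with an OCR language header could look from Node. It is untested against this deployment and makes several assumptions: Node 18+ for the built-in `fetch`, a PNG input, Tika's `/tika` endpoint, and `kat` as the Tesseract code for Georgian. The names `buildOcrHeaders` and `ocrImage` are made up for the example.

```javascript
const fs = require('fs');

// Build the headers for a binary PUT to Tika's /tika endpoint.
// `languages` uses Tesseract's "+"-joined form, e.g. "kat" or "eng+rus".
function buildOcrHeaders(languages, contentType) {
  return {
    'Accept': 'text/plain',
    'Content-Type': contentType,
    'X-Tika-OCRLanguage': languages
  };
}

// Send the raw file bytes as the request body (not multipart form data).
// `baseUrl` and `filePath` are placeholders supplied by the caller.
async function ocrImage(baseUrl, filePath, languages) {
  const body = fs.readFileSync(filePath);
  const res = await fetch(`${baseUrl}/tika`, {
    method: 'PUT',
    headers: buildOcrHeaders(languages, 'image/png'),
    body
  });
  if (!res.ok) {
    throw new Error(`Tika responded with HTTP ${res.status}`);
  }
  return res.text();
}
```

It could be called as `ocrImage(baseUrl, './screenshot.png', 'kat').then(console.log).catch(console.error)`. When the same request is sent from a browser, a custom header like X-Tika-OCRLanguage triggers a CORS preflight, so the server may also need to allow that header, which might explain why plain string requests work while this one does not.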
