[docai-ocr-python] #758

Open
crypto-frog opened this issue Mar 3, 2024 · 4 comments
@crypto-frog

crypto-frog commented Mar 3, 2024

I have followed the tutorial exactly, and I keep getting this error. My key.json is located in home/jonathan_ruben_fernandes/key.json

Here is my error message. I found nothing on Stack Overflow, and OpenAI could not help me figure out the error:

/usr/bin/python /home/jonathan_ruben_fernandes/online_processing.py
jonathan_ruben_fernandes@cloudshell:~$ /usr/bin/python /home/jonathan_ruben_fernandes/online_processing.py
Traceback (most recent call last):
  File "/home/jonathan_ruben_fernandes/online_processing.py", line 2, in <module>
    from google.cloud import documentai
ImportError: cannot import name 'documentai' from 'google.cloud' (unknown location)
jonathan_ruben_fernandes@cloudshell:~$ 

from google.api_core.client_options import ClientOptions
from google.cloud import documentai


PROJECT_ID = "project-doc-ocr-416018"
LOCATION = "us"  # Format is 'us' or 'eu'
PROCESSOR_ID = "51d53e5ecbc3418d"  # Create processor in Cloud Console

# The local file in your current working directory
FILE_PATH = "Winnie_the_Pooh_3_Pages.pdf"
# Refer to https://cloud.google.com/document-ai/docs/file-types
# for supported file types
MIME_TYPE = "application/pdf"

# Instantiates a client
docai_client = documentai.DocumentProcessorServiceClient(
    client_options=ClientOptions(api_endpoint=f"{LOCATION}-documentai.googleapis.com")
)

# The full resource name of the processor, e.g.:
# projects/project-id/locations/location/processors/processor-id
# You must create new processors in the Cloud Console first
RESOURCE_NAME = docai_client.processor_path(PROJECT_ID, LOCATION, PROCESSOR_ID)

# Read the file into memory
with open(FILE_PATH, "rb") as image:
    image_content = image.read()

# Load Binary Data into Document AI RawDocument Object
raw_document = documentai.RawDocument(content=image_content, mime_type=MIME_TYPE)

# Configure the process request
request = documentai.ProcessRequest(name=RESOURCE_NAME, raw_document=raw_document)

# Use the Document AI client to process the sample form
result = docai_client.process_document(request=request)

document_object = result.document
print("Document processing complete.")
print(f"Text: {document_object.text}")
@holtskinner
Collaborator

Hi @crypto-frog, just to clarify, did you install the Document AI client library?

pip install --upgrade google-cloud-documentai

Are you running this in a Colab notebook or in an IPython interactive environment? I have seen this same behavior in certain Colab notebooks even after installing.

@crypto-frog
Author

crypto-frog commented Mar 4, 2024

Hi holtskinner! Great to hear from you. I followed the tutorial at https://codelabs.developers.google.com/codelabs/docai-ocr-python#7 step by step, including this step:

pip3 install --upgrade google-cloud-documentai
pip3 install --upgrade google-cloud-storage
pip3 install --upgrade google-cloud-documentai-toolbox

I am using the Google Cloud CLI; everything runs remotely on Google Cloud, as in the tutorial. I tried creating a new project. The library is installed; I checked. I tried reinstalling, and the requirement is reported as already satisfied. I have used AI to look at my code and the steps and can find nothing wrong.

Thank you !!!

@holtskinner
Collaborator

My theory is that it's a Python versioning issue: you're running the code with a different Python version than the one the libraries were installed for.

Try running:

/usr/bin/python -m pip install --upgrade google-cloud-documentai google-cloud-storage google-cloud-documentai-toolbox

Then try running again.
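
One way to test the version-mismatch theory is a short diagnostic script, run with the same `/usr/bin/python` that runs `online_processing.py`. This is only a sketch: the helper name `describe_environment` is made up for illustration, and it reports what the interpreter can see rather than fixing anything.

```python
import importlib.util
import sys


def describe_environment(module_name="google.cloud.documentai"):
    """Report which interpreter is running and whether it can locate the module."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (for example 'google') is itself missing
        spec = None
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If `found` is `False` here while pip reports the package as installed, that would suggest pip and the script are using different environments.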

@crypto-frog
Author

I ran the command and got "requirement already satisfied". Same error. I am not using any local resources for this; everything is in Google Cloud, and I am following the tutorial. I don't want to give up, because Google services usually work. Please help!
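
Since the error persists after reinstalling, another thing that may be worth checking is where `google.cloud` itself resolves. The "(unknown location)" in the ImportError typically appears when the package has no single `__file__`, as with namespace packages. The sketch below lists each directory that contributes to `google.cloud` and whether it contains a `documentai` entry; the helper name `namespace_contents` is hypothetical, and whether this is the actual cause here is an assumption.

```python
import importlib
import os


def namespace_contents(package_name="google.cloud"):
    """Map each directory backing a package to the sorted entries it contains."""
    try:
        package = importlib.import_module(package_name)
    except ImportError:
        # The package cannot be imported at all from this interpreter
        return {}
    contents = {}
    # __path__ lists every directory that contributes to the package;
    # plain modules have no __path__, so they yield an empty result
    for directory in getattr(package, "__path__", []):
        if os.path.isdir(directory):
            contents[directory] = sorted(os.listdir(directory))
    return contents


if __name__ == "__main__":
    for directory, entries in namespace_contents().items():
        has_docai = any(entry.startswith("documentai") for entry in entries)
        print(directory)
        print(f"  documentai present: {has_docai}")
```

If none of the listed directories contains `documentai`, the library was probably installed somewhere this interpreter does not search.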

2 participants