Seeking Assistance with Passing a PDF to Gemini-1.5-Pro in Multimodal Mode Using LangChain #20464
-
To pass a PDF document to Gemini-1.5-Pro in multimodal mode using LangChain, you'll need to adjust your approach. One option is to load the PDF with a LangChain document loader first.
For a local PDF:
from langchain_community.document_loaders import PyPDFLoader
file_path = "path/to/your/document.pdf"
loader = PyPDFLoader(file_path)
documents = loader.load()
For an online PDF:
from langchain_community.document_loaders import OnlinePDFLoader
file_path = "http://example.com/your/document.pdf"
loader = OnlinePDFLoader(file_path)
documents = loader.load()
After loading the document(s), you'll have a list of loaded documents. This approach should align more closely with the capabilities of Gemini-1.5-Pro and the LangChain framework, allowing you to pass and analyze PDF documents in multimodal mode.
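The step that follows loading is not preserved in this reply. As a rough sketch of one way to continue, assuming each loaded item exposes a `page_content` string and a `metadata` dict with a `page` key (as PyPDFLoader's output does), the page texts could be joined into a single text prompt. `FakePage` and `build_prompt` are hypothetical names used only for illustration. Note that this sends extracted text only, not the PDF itself.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    """Hypothetical stand-in for one loaded page, used only for the demo."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_prompt(documents, question, max_chars=20000):
    """Join page texts into one prompt, labelling each page when known."""
    sections = []
    for index, doc in enumerate(documents):
        # Fall back to the list position if the loader gave no page number.
        page = doc.metadata.get("page", index)
        text = doc.page_content.strip()
        if text:  # skip pages with no extractable text
            sections.append(f"[page {page}]\n{text}")
    # Crude character cap so a very long PDF does not produce a huge prompt.
    context = "\n\n".join(sections)[:max_chars]
    return f"{context}\n\nQuestion: {question}"


pages = [
    FakePage("Intro text.", {"page": 0}),
    FakePage("   ", {"page": 1}),
    FakePage("Results.", {"page": 2}),
]
prompt = build_prompt(pages, "What are the results?")
print(prompt)
```

With real loader output, `documents` from the snippets above could be passed in place of `pages`, and the resulting string sent to the model as an ordinary text message.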
-
Is there any update on this? I reckon this should be turned into an issue/feature.
-
I created this issue here, using your example as the description: https://github.com/langchain-ai/langchain/issues/21430
-
The PDF loader approach is not what we want. I believe that OP and I would prefer to use Vertex's built-in PDF parser, which includes OCR among other things, while also ensuring that the embeddings format is compatible with/native to the model. Have there been any updates on extending the LangChain integration to support this? For reference: https://cloud.google.com/vertex-ai/generative-ai/docs/samples/generativeaionvertexai-gemini-pdf Thanks!
-
Description
I am trying to pass a PDF document to gemini-1.5-pro in multimodal mode, following a process similar to the one in the documentation, which illustrates how to pass an image and query Gemini Pro Vision. I want to pass a PDF directly instead.
Here is my attempt:
Unfortunately this code fails.
However, if I use the official Vertex AI library, I am able to do it. Here is part of my code:
This approach works, but I was hoping to make the LangChain method function similarly.
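The Vertex AI code referred to above is not preserved in this thread. As a minimal sketch of that kind of call (not the poster's actual code), assuming the `vertexai` SDK from `google-cloud-aiplatform` is installed and `vertexai.init(...)` has already been called, a PDF can be sent as raw bytes with `Part.from_data`. The helper names `read_pdf_bytes` and `ask_gemini_about_pdf` and the model name `gemini-1.5-pro` are assumptions for illustration.

```python
from pathlib import Path


def read_pdf_bytes(path: str) -> bytes:
    """Read a local file and check that it starts with the PDF header."""
    data = Path(path).read_bytes()
    if not data.startswith(b"%PDF-"):
        raise ValueError(f"{path} does not look like a PDF")
    return data


def ask_gemini_about_pdf(pdf_bytes: bytes, question: str,
                         model_name: str = "gemini-1.5-pro") -> str:
    """Send the raw PDF plus a text question to Gemini via the Vertex AI SDK.

    Assumes vertexai.init(project=..., location=...) was called beforehand
    and that the chosen model name is available in that project.
    """
    # Imported lazily so read_pdf_bytes works without the SDK installed.
    from vertexai.generative_models import GenerativeModel, Part

    # The PDF goes in as its own part; no text extraction happens client-side.
    pdf_part = Part.from_data(data=pdf_bytes, mime_type="application/pdf")
    model = GenerativeModel(model_name)
    response = model.generate_content([pdf_part, question])
    return response.text
```

A call would then look like `ask_gemini_about_pdf(read_pdf_bytes("path/to/your/document.pdf"), "Summarize this document.")`. For a file already in Cloud Storage, `Part.from_uri(uri, mime_type="application/pdf")` is the analogous constructor.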