<a href="https://colab.research.google.com/github/cdelia/ai_colabs/blob/main/chatWithPdf.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**This notebook lets you chat with the contents of a PDF file.**


Let's install the requirements.

First we need to work around a compatibility problem:

In [None]:
# Pin pydantic to an older 1.x release to work around an incompatibility with chromadb.
# TODO: check whether this pin can be removed in the future.
pip install -q pydantic==1.10.12

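To confirm that the pin took effect, you can inspect the installed version. The sketch below is a minimal check using only the standard library. The helper names `major_version` and `pin_is_active` are made up for illustration, and it assumes the incompatibility only affects pydantic 2.x and later.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading (major) component of a dotted version string."""
    return int(version.split(".")[0])


def pin_is_active(package: str = "pydantic", max_major: int = 1) -> bool:
    """Return True if `package` is installed and its major version is <= max_major.

    Hypothetical helper: the threshold of 1 assumes only pydantic 2.x and later
    conflict with chromadb.
    """
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed at all, so the pin cannot be active.
        return False
    return major_version(installed) <= max_major


print("pydantic pin active:", pin_is_active())
```

If this prints `False` after the install cell has run, restarting the runtime usually makes the newly installed version visible.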

In [None]:
pip install -q langchain openai pypdf gradio chromadb tiktoken


Now let's enter an OpenAI API key.
You can create one on the [OpenAI API keys page](https://platform.openai.com/account/api-keys).


---



In [None]:
import os
import getpass

os.environ['OPENAI_API_KEY'] = getpass.getpass("Enter OpenAI API Key: ")


In [None]:
!wget -O document.pdf https://www.ibm.com/annualreport/assets/downloads/IBM_Annual_Report_2020.pdf

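A download can silently save an HTML error page instead of the report, which only shows up later as a confusing loader error. The sketch below checks that the file starts with the `%PDF-` signature that PDF files normally begin with. The helper name `looks_like_pdf` is hypothetical, and the check is a quick sanity test, not a full validation of the file.

```python
from pathlib import Path


def looks_like_pdf(path: str) -> bool:
    """Return True if the file exists and begins with the PDF signature bytes."""
    file = Path(path)
    if not file.is_file():
        return False
    with file.open("rb") as handle:
        # PDF files normally start with "%PDF-" followed by a version number.
        return handle.read(5) == b"%PDF-"


print("document.pdf looks like a PDF:", looks_like_pdf("document.pdf"))
```

If this prints `False`, check the URL in the download cell before continuing.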
In [None]:
from langchain.agents.agent_toolkits import (
    create_vectorstore_agent,
    VectorStoreInfo,
    VectorStoreToolkit,
)
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
import gradio as gr

# The LLM and the embeddings both read OPENAI_API_KEY from the environment (set above).
# gpt-4 is a chat model, so use the chat wrapper instead of the completion-style OpenAI class.
llm = ChatOpenAI(model_name="gpt-4", temperature=0.9)
embeddings = OpenAIEmbeddings()

# Load the PDF and split it into chunks.
loader = PyPDFLoader('document.pdf')
pages = loader.load_and_split()

# Embed the chunks and index them in a Chroma collection.
store = Chroma.from_documents(pages, embeddings, collection_name='documentContents')

# Describe the store so the agent knows when to search it.
vectorStore_info = VectorStoreInfo(
    name='pdf document',
    description="The pdf document to search",
    vectorstore=store,
)
toolkit = VectorStoreToolkit(vectorstore_info=vectorStore_info)
agent_executor = create_vectorstore_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True,
)

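def answer(prompt):
    # Run the agent on the user's question and return its final answer as text.
    return agent_executor.run(prompt)

# Minimal Gradio UI: one text box for the question, one for the answer.
demo = gr.Interface(fn=answer, inputs="text", outputs="text")

demo.launch()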
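
For intuition about the retrieval step behind the agent, here is a toy sketch of similarity search over text chunks. It is not how `OpenAIEmbeddings` or `Chroma` work internally: the word-count "embedding", the function names, and the sample sentences are all made up for illustration. Real embeddings are dense vectors produced by a model. The idea of ranking chunks by their similarity to the question is the same.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, chunks: list, k: int = 1) -> list:
    """Return the k chunks most similar to the query, best match first."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda chunk: cosine(query_vector, embed(chunk)), reverse=True)
    return ranked[:k]


# Made-up sample chunks, standing in for the pages of a PDF.
chunks = [
    "the cat sat on the mat",
    "stock prices rose sharply today",
    "the recipe needs two eggs",
]

print(top_k("which recipe needs eggs", chunks, k=1))
```

In the notebook, the agent passes the chunks retrieved this way to the LLM as context for its answer.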