AI to query a PDF document using Large Language Model.

bloomverga/bookquery

BookQuery

A project built to chat with PDF documents.

If you have a PDF document you want to chat with in order to get insights, this tool can help. It is a Jupyter Notebook, intended as a proof of concept, that uses the following process:

  • Extract text from the PDF using UnstructuredPDFLoader from langchain.document_loaders,
  • Split the extracted text into smaller documents using RecursiveCharacterTextSplitter from langchain.text_splitter,
  • Create embeddings of the documents using OpenAI Embeddings,
  • Index the document embeddings and store them in a Pinecone database,
  • Query those documents.
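
The chunk, embed, and index steps above can be sketched without any external services. This is not the notebook's code: `split_text` is a simplified stand-in for RecursiveCharacterTextSplitter, `toy_embed` (a bag-of-words count vector) stands in for OpenAI Embeddings, and a plain Python list stands in for the Pinecone index. All function names and the `chunk_size` and `overlap` defaults are assumptions chosen for illustration.

```python
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows.

    Simplified stand-in for RecursiveCharacterTextSplitter: it cuts on
    fixed character offsets and ignores paragraph or sentence boundaries.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    stop = max(len(text) - overlap, 1)
    chunks = [text[i:i + chunk_size] for i in range(0, stop, step)]
    return [c for c in chunks if c.strip()]


def toy_embed(text):
    """Hypothetical embedding: token counts instead of a dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(text, chunk_size=200, overlap=40):
    """Chunk the text and store one record per chunk in an in-memory list
    (the role a vector database plays in the real pipeline)."""
    return [
        {"id": f"chunk-{n}", "text": chunk, "vector": toy_embed(chunk)}
        for n, chunk in enumerate(split_text(text, chunk_size, overlap))
    ]


if __name__ == "__main__":
    index = build_index("hello world " * 50)
    print(len(index), index[0]["id"])
```

The overlap means that a sentence cut at a chunk boundary still appears whole in one of the two neighbouring chunks, which tends to help retrieval.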

Each user query is processed as follows:

  • Create an embedding of the user query using OpenAI Embeddings,
  • Run a semantic search against the indexed embeddings database,
  • Summarize the search results using LangChain question answering and the OpenAI GPT-3.5 Turbo model. This summary is the chatbot's answer.
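
The query steps above can be sketched in the same spirit. Again, this is an illustration under stated assumptions rather than the notebook's code: `toy_embed` is a hypothetical bag-of-words stand-in for OpenAI Embeddings, `search` ranks an in-memory list by cosine similarity instead of querying Pinecone, and `build_prompt` only assembles the text that would be sent to the language model, without calling it.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical embedding: token counts instead of a dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b.get(token, 0) for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(index, query, top_k=2):
    """Embed the query and return the top_k most similar records."""
    query_vector = toy_embed(query)
    ranked = sorted(
        index,
        key=lambda record: cosine(query_vector, record["vector"]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, hits):
    """Assemble the retrieved chunks and the question into one prompt."""
    context = "\n\n".join(hit["text"] for hit in hits)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    texts = [
        "cats sleep all day",
        "pinecone stores the document vectors",
        "the model summarizes retrieved chunks",
    ]
    index = [{"text": t, "vector": toy_embed(t)} for t in texts]
    question = "which database stores the vectors"
    print(build_prompt(question, search(index, question)))
```

Only the retrieved chunks are passed to the model, so the prompt stays small even when the source PDF is long.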
