# 🤖BotPDF

A simple Large Language Model (LLM) chatbot: upload a PDF and receive tailored responses generated directly from the document's contents. It is built with open-source tools and technologies and runs entirely in a local environment, since many people do not have access to an OpenAI API key or other paid options. Running locally also eliminates any reliance on cloud services and keeps the setup simple enough to just work.

## What I learned

- How to feed an LLM input other than plain text prompts
- How to supply the LLM with custom context via Retrieval-Augmented Generation (RAG) without having to train your own model
- How RAG works by using vector embeddings, which represent the semantics of text in numerical form so relevant passages can be found by similarity (see the sketch after this list)
- How to build a RAG pipeline using open-source tools and technologies
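
To make the embeddings-and-retrieval idea concrete, here is a minimal retrieval sketch. It uses ChromaDB, which the `chroma_db_data` folder suggests is the vector store behind this project; the collection name, IDs, and example chunks are invented for illustration and are not taken from the actual code.

```python
# Minimal sketch of RAG-style retrieval with ChromaDB (assumed stack).
import chromadb

# Persist embeddings on disk; Chroma embeds documents with its default
# embedding model when they are added.
client = chromadb.PersistentClient(path="chroma_db_data")
collection = client.get_or_create_collection("pdf_chunks")  # hypothetical name

# In the real app these chunks would come from the uploaded PDFs.
collection.add(
    ids=["chunk-1", "chunk-2"],
    documents=[
        "The warranty covers manufacturing defects for 24 months.",
        "Returns must be initiated within 30 days of purchase.",
    ],
)

# Find the chunk whose embedding is closest to the question, then hand it
# to the LLM as context alongside the user's prompt.
results = collection.query(query_texts=["How long is the warranty?"], n_results=1)
print(results["documents"][0])
```

The key point is that no fine-tuning is involved: the retrieved chunks are simply prepended to the user's question as context, which is how the chatbot can answer from the PDF contents.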

## Tech Stack

## Setup

After cloning the repository, follow the steps below to install the dependencies and run the app:

- Run `pip install -r requirements.txt`
- Download and install [Ollama](https://ollama.com/)
- Run `ollama pull llama2`
- Run `ollama serve` (a quick sanity check is sketched below the note)
- Run `streamlit run frontend_chatbot.py` in the command line

Note: If you want to quickly run the app with an empty knowledge base (i.e. forget the previously uploaded PDFs), run `reset.bat` on Windows or `reset.sh` on Unix-like systems. Alternatively, manually delete the contents of the `data` folder and delete the `chroma_db_data` folder entirely.
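
If you want to confirm that the local model is reachable before opening the Streamlit app, here is a minimal sanity check. It assumes Ollama's default HTTP API on `localhost:11434` and the `llama2` model pulled in the steps above; it is not part of the project itself.

```python
# Minimal sanity check against a locally running `ollama serve`.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama2", "prompt": "Say hello in one sentence.", "stream": False},
    timeout=120,  # the first request can be slow while the model loads
)
resp.raise_for_status()
print(resp.json()["response"])
```

If this prints a short greeting, the model server is up and `streamlit run frontend_chatbot.py` should be able to talk to it.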

## Demo

### Upload your PDF

pdf_upload.mp4

### Ask away

pdf_query.mp4

### Multiple uploads

multi_upload.mp4

### Cross-reference your PDFs

crossref_pdf.mp4

## Feedback

All manner of feedback is highly appreciated. I am relatively new to this and I would love to hear your comments and suggestions as part of the learning experience. Thank you!