An OpenAI, LangChain and Python based application that uses embeddings and a vector database to build context from PDFs. Backend not deployed.

Tathagat017/Pdf-Baba

Pdf-Baba

For the Application:

React, Chakra UI, Django REST Framework, Python

For the Additional Technologies:

LangChain, OpenAI, FAISS, Embeddings

An OpenAI, LangChain and Streamlit application that uses LangChain and OpenAI's language model to provide a conversational Q&A chatbot for uploaded PDFs. Users can upload one or more PDF documents and ask questions about them in natural language; the chatbot answers based on the content of those documents. Please note that the app only responds to questions related to the loaded PDFs.

Installation & Getting Started

Front-End (React.js):

  1. Clone the repository: git clone https://github.com/Tathagat017/Pdf-Baba.git
  2. Navigate to the folder: frontend_pdf/pdf_baba
  3. Install dependencies: npm install
  4. Start the development server: npm start

Back-End (Django):

  1. Clone the repository: git clone https://github.com/Tathagat017/Pdf-Baba.git
  2. Navigate to the folder: Django_Backend/PdfBabaBackend
  3. Install dependencies: pip install -r requirements.txt
  4. Start the development server: python manage.py runserver

Design Architecture

pdf_baba_architecture_flow

The chatbot works in several steps:

Upload PDF: You upload the desired PDF file that you want to ask questions about.

Text Extraction: The bot uses the PyPDF2 library to read the PDF file and extract text from it.

Text Splitting: The bot then splits the text into smaller chunks so that each piece fits within the model's token limit and can be embedded and retrieved on its own.

Embeddings Creation: Using OpenAIEmbeddings, the bot creates text embeddings from the chunks.

Document Search Creation: The bot then uses these embeddings to create a document search via the FAISS vectorstore.

Conversational Chain Creation: A LangChain ConversationalRetrievalChain is created using the OpenAI model and the document retriever.

User Query: Finally, you enter your query. The bot will provide a response based on the contents of the uploaded PDF, also citing the source sections from the PDF.
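The splitting and retrieval steps above can be illustrated with a minimal, dependency-free sketch. It is not the project's code: the bag-of-words `embed` function stands in for OpenAIEmbeddings, the sorted cosine-similarity scan in `top_chunks` stands in for a FAISS index, and all function names and the `chunk_size`/`overlap` defaults are hypothetical.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut
    at a boundary still appears whole in one of the chunks.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text):
    """Toy embedding: a word-count vector (assumption, replaces OpenAIEmbeddings)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question (replaces a FAISS search)."""
    query_vector = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]
```

In the real application the retrieved chunks would be passed to the language model together with the question; here `top_chunks` only shows how the most relevant pieces of the PDF are selected.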

flow_chart2

Login / Register (Protected Route: Front-End and Back-End)

register

login_dark

login_light

Chat

pdf_baba_chat

Click the image below to watch the presentation

youtube

Watch the video

| Action | Endpoint | Request Body | Response |
| --- | --- | --- | --- |
| Register User | POST /api/user/register | username (String, required), email (String, required) | User object with id, username, email |
| Log In User | POST /api/user/login | username (String, required), password (required) | User object with id, username, email, access token |
| Upload PDFs | POST /api/pdf/uploadAll | pdf_files (required), token (required) as form-data key-value pairs | - |
| Upload Single PDF | POST /api/pdf/upploadOne | pdf_files (required), token (required) as form-data key-value pairs | - |
| Get All PDFs | GET /api/list-pdfs | - | Array of pdf_names |
| POST user_question | POST /api/pdf/AnswerQuestion | - | JSON response with the chatbot's answer |
| Delete All PDFs | DELETE /api/deleteAll | - | Deletes all PDFs |
| Delete One PDF | POST /api/delete-one-by-name/ | pdf_files (required), token (required) as form-data key-value pairs | - |
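
As a rough illustration of calling the register and login endpoints listed above, the sketch below builds the requests with Python's standard library. Several details are assumptions rather than facts from this repository: that the back-end runs on Django's default development address (`http://127.0.0.1:8000`), that it accepts JSON bodies, and that the login field is named `password`. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: the back-end runs on Django's default development server address.
BASE_URL = "http://127.0.0.1:8000"


def build_json_request(path, payload):
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_request(username, email):
    """Request for POST /api/user/register with the fields listed in the table."""
    return build_json_request(
        "/api/user/register", {"username": username, "email": email}
    )


def login_request(username, password):
    """Request for POST /api/user/login (field name `password` is an assumption)."""
    return build_json_request(
        "/api/user/login", {"username": username, "password": password}
    )


def send(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the back-end running locally, `send(login_request("alice", "secret"))` should return the user object described in the table, including the access token needed by the PDF endpoints.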
