GitHub - Arbazkhan-cs/Retrieval-Augmented-Generation: 🔍📄 A Retrieval-Augmented Generation (RAG) project that enhances text generation by integrating relevant retrieved information. Get more accurate and context-aware outputs!

PDF Question Answering System

Overview

The PDF Question Answering System is a web application that lets users upload PDF documents and ask questions about their content. It extracts text from the PDFs, splits and indexes it, and passes the most relevant passages to a large language model to generate context-aware answers. The project's Hugging Face Space: https://huggingface.co/spaces/Arbazkhan-cs/Retrieval-Augmented-Generation

Features

PDF Text Extraction: Utilizes PyPDF2 to extract text from uploaded PDF documents.
Text Chunking: Employs RecursiveCharacterTextSplitter to split text into manageable chunks based on newline characters.
Vector Store Indexing: Uses FAISS for efficient text indexing and retrieval.
Embeddings Generation: Leverages HuggingFace's sentence-transformers for creating embeddings.
Natural Language Processing: Uses ChatGroq, LangChain's chat interface to Groq-hosted large language models (LLMs), to answer natural language queries.
User-Friendly Interface: Built with Streamlit for easy interaction and query submission.
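
The chunking feature can be illustrated without installing LangChain. The sketch below is a simplified, pure-Python approximation of recursive character splitting, not the library's implementation: it tries coarse separators first (blank lines, then newlines, then spaces) and only falls back to a hard cut when nothing else fits. The function name `recursive_split`, the separator order, and the `chunk_size` value are assumptions for illustration, and chunk overlap is left out.

```python
from typing import List, Tuple


def recursive_split(
    text: str,
    chunk_size: int = 200,
    separators: Tuple[str, ...] = ("\n\n", "\n", " "),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Coarse separators are tried first so paragraphs and lines stay
    together whenever they fit into a single chunk.
    """
    text = text.strip()
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators:
        # No separator left: cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep packing pieces into this chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    # Drop surrounding whitespace and any empty leftovers.
    return [c for c in (c.strip() for c in chunks) if c]


if __name__ == "__main__":
    demo = "Line one of the PDF.\nLine two of the PDF.\n\nA second paragraph."
    for chunk in recursive_split(demo, chunk_size=30):
        print(repr(chunk))
```

Smaller chunks make retrieval more precise but give the LLM less surrounding context per chunk, so the chunk size is worth tuning per document type.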

How It Works

Upload a PDF: Users upload a PDF document through the Streamlit interface.
Text Processing: The application extracts and splits the text into chunks.
Indexing: The text chunks are indexed using FAISS.
Query Submission: Users input questions related to the PDF content.
Answer Retrieval: The system retrieves the most relevant text chunks and generates answers using the LLM.
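
The retrieval and answer steps above can be sketched in plain Python. This is a toy stand-in, not the project's code: word-count vectors replace the sentence-transformers embeddings, a linear scan with cosine similarity replaces the FAISS index, and the prompt that would be sent to the LLM is only built, not sent. All function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the prompt wording are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a dense vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the question, best first."""
    q = embed(question)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble the text an LLM would receive: retrieved context, then the question."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    pdf_chunks = [
        "FAISS stores dense vectors for similarity search.",
        "Streamlit renders the upload widget.",
        "The invoice total is 420 dollars.",
    ]
    question = "What is the invoice total?"
    best = [chunk for _, chunk in retrieve(question, pdf_chunks, k=1)]
    print(build_prompt(question, best))
```

Restricting the prompt to the top-ranked chunks is what keeps the answer grounded in the uploaded PDF rather than in the model's general knowledge.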

Setup and Installation

Prerequisites

Python 3.8 or higher
Pip package manager

Installation

Clone the Repository:

git clone https://github.com/Arbazkhan-cs/Retrieval-Augmented-Generation.git
cd Retrieval-Augmented-Generation

Install Dependencies:
```
pip install -r requirements.txt
```

Then, run the app using the following command:

streamlit run app.py
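
Because the project lists dotenv among its libraries, the app presumably reads credentials from a .env file or from the environment. The sketch below is a hypothetical pre-flight check, not part of the repository: it parses simple KEY=VALUE text and reports which required keys are missing. The variable name GROQ_API_KEY is an assumption based on the name conventionally used with Groq clients; check app.py for the name it actually expects.

```python
import os
from typing import Dict, Iterable, List, Mapping, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(
    required: Iterable[str],
    file_values: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return required keys that are unset or empty in both sources."""
    if environ is None:
        environ = os.environ
    merged = {**file_values, **environ}  # real environment wins over the file
    return [key for key in required if not merged.get(key)]


if __name__ == "__main__":
    # GROQ_API_KEY is an assumed name, used here only for illustration.
    file_values: Dict[str, str] = {}
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as handle:
            file_values = parse_env_text(handle.read())
    missing = missing_keys(["GROQ_API_KEY"], file_values)
    print("Missing keys:", missing if missing else "none")
```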

Technologies Used

Languages and Frameworks: Python, Streamlit
Libraries: PyPDF2, langchain, sentence-transformers, FAISS, dotenv
Models and Tools: Hugging Face sentence-transformers embedding models, ChatGroq, RecursiveCharacterTextSplitter

Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Contact

For any inquiries, please contact the maintainer through the GitHub repository.


