A Retrieval-Augmented Generation (RAG) based search engine that lets users upload documents (PDF/TXT) and query them for synthesized answers produced by an LLM.
- Document ingestion: Upload multiple PDF and TXT files
- Embeddings: Uses OpenAI embeddings for vectorization
- Vector store: FAISS for efficient similarity search
- LLM: OpenAI GPT for answer synthesis
- API: FastAPI backend
- Frontend: Simple web interface for upload and query
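The components above form a standard RAG pipeline: extract text, chunk it, embed the chunks, index them in FAISS, then retrieve context for the LLM. The sketch below illustrates that flow under the assumption that the ingestion side uses the openai, faiss-cpu, pypdf, and numpy packages directly; the file name, chunk sizes, and embedding model are illustrative and not taken from this repository's code.

```python
# Illustrative sketch of the ingestion/indexing flow described above.
# Package choices, chunk sizes, and the embedding model are assumptions.
import faiss
import numpy as np
from openai import OpenAI
from pypdf import PdfReader

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def load_text(path: str) -> str:
    """Extract raw text from a PDF or TXT file."""
    if path.lower().endswith(".pdf"):
        return "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)
    with open(path, encoding="utf-8") as f:
        return f.read()

def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks sized for the embedding model."""
    return [text[i:i + size] for i in range(0, len(text), size - overlap)]

def embed(texts: list[str]) -> np.ndarray:
    """Vectorize text with an OpenAI embedding model."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data], dtype="float32")

# Ingest one document: extract, chunk, embed, and index the vectors in FAISS.
chunks = chunk(load_text("example.pdf"))
index = faiss.IndexFlatL2(1536)  # dimension of text-embedding-3-small
index.add(embed(chunks))

# Retrieval: embed the query and pull the closest chunks as LLM context.
_, ids = index.search(embed(["What does the document conclude?"]), 3)
context = "\n\n".join(chunks[i] for i in ids[0])
```

The retrieved context is what the GPT model synthesizes the final answer from; the LangChain sketch near the end of this README shows that step.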
- Ensure Python 3.11+ is installed.
- Clone or download the project.
- Navigate to the project directory.
- Create a virtual environment: `py -3 -m venv venv`
- Activate the environment: `call venv\Scripts\activate.bat`
- Install dependencies: `pip install -r requirements.txt` (a sample requirements.txt is sketched after these steps)
- Set your OpenAI API key: `set OPENAI_API_KEY=your_api_key_here`
- Run the application: `uvicorn main:app --reload --port 5000`
- Open your browser to http://localhost:5000
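The repository's requirements.txt is not reproduced here; a plausible dependency set for the stack described in this README (FastAPI, OpenAI, LangChain, FAISS, PDF parsing, multipart uploads) might look like the following. Package names are assumptions, so adjust and pin versions to match the actual project.

```text
fastapi
uvicorn
openai
langchain
langchain-openai
langchain-community
faiss-cpu
pypdf
python-multipart
```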
- Upload documents using the upload form (select multiple PDF/TXT files).
- Once uploaded, enter a query in the query field and click Search.
- The synthesized answer will be displayed.
- `POST /upload`: Upload files (multipart/form-data)
- `POST /query`: Query with JSON body `{"query": "your question"}`
- `GET /`: Serve the frontend
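For scripted access instead of the web form, the endpoints above can be called directly. In the sketch below, the multipart field name ("files") and the shape of the JSON responses are assumptions; only the `{"query": ...}` request body is documented above.

```python
# Minimal client for the two endpoints listed above.
# The "files" field name and response keys are assumptions, not documented.
import requests

BASE_URL = "http://localhost:5000"

# Upload a PDF and a TXT file as multipart/form-data.
with open("report.pdf", "rb") as pdf, open("notes.txt", "rb") as txt:
    files = [
        ("files", ("report.pdf", pdf, "application/pdf")),
        ("files", ("notes.txt", txt, "text/plain")),
    ]
    upload_resp = requests.post(f"{BASE_URL}/upload", files=files)
    upload_resp.raise_for_status()

# Ask a question against the uploaded documents.
query_resp = requests.post(
    f"{BASE_URL}/query",
    json={"query": "Summarize the key findings in the report."},
)
query_resp.raise_for_status()
print(query_resp.json())  # synthesized answer payload
```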
- Retrieval accuracy: Based on FAISS similarity search
- Synthesis quality: Depends on OpenAI LLM
- Code structure: Modular with separate files for ingestion and retrieval
- LLM integration: Uses LangChain for seamless integration
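As an illustration of the LangChain-based integration mentioned above, the following sketch wires FAISS retrieval and GPT synthesis together with the langchain-openai and langchain-community packages; the class choices, model name, and prompt are assumptions rather than this project's actual implementation.

```python
# Hedged sketch: FAISS retrieval + GPT synthesis wired through LangChain.
# Model name, prompt, and example chunks are assumptions for illustration only.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# Index a few example chunks (in the real app these come from uploaded files).
chunks = [
    "FAISS stores dense vectors and supports fast nearest-neighbour search.",
    "The API exposes /upload for ingestion and /query for question answering.",
]
vectorstore = FAISS.from_texts(chunks, OpenAIEmbeddings())

# Retrieve the most relevant chunks and hand them to the LLM for synthesis.
question = "How are documents searched?"
docs = vectorstore.similarity_search(question, k=2)
context = "\n\n".join(d.page_content for d in docs)

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
answer = llm.invoke(
    f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```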