uday98/EventBot

Event Bot

Event Bot is a PDF document assistant with a FastAPI backend and a Streamlit frontend. The backend integrates with Google Gemini AI and the Pinecone vector database, so users can upload PDF files and ask natural-language questions about their content through the frontend. Event Bot app: https://eventbot-pinecone-db.streamlit.app/

🚀 Features

  • PDF Document Processing: Upload and process PDF files into searchable vectors.
  • AI-Powered Q&A: Ask questions about uploaded PDFs using Google Gemini AI.
  • Vector Search: Efficient document retrieval using Pinecone vector database.
  • Resume Support: Special handling for resume/CV documents (filename-based user ID extraction).
  • Streamlit Frontend: User-friendly interface for uploading PDFs and interacting with the chatbot.
  • RESTful API: Clean REST endpoints for the backend, consumed by the frontend.
  • Health Monitoring: Built-in health checks and logging for the backend.

📋 Prerequisites

  • Python 3.8 or higher
  • Google AI Studio account (for Gemini API key)
  • Pinecone account (for Pinecone API key and index)
  • Git

🛠️ Installation & Setup

For a quick start, follow these steps:

  1. Clone the Repository:

    git clone https://github.com/vijender883/Chatbot_Pinecone_flask_backend
    cd Chatbot_Pinecone_flask_backend
  2. Create and Activate Virtual Environment:

    • macOS/Linux:
      python3 -m venv venv
      source venv/bin/activate
    • Windows (Command Prompt):
      python -m venv venv
      venv\Scripts\activate

    For PowerShell or other shells, please refer to the detailed guide.

  3. Install Dependencies:

    pip install -r requirements.txt

For comprehensive instructions, including API key setup, environment configuration, running the application, deployment, and troubleshooting, please see our Detailed Installation and Setup Guide.

Running the Backend

To run the FastAPI backend server:

make run-backend
# Alternatively: uvicorn app:app --reload (or similar command based on your app structure)

The backend typically starts on http://localhost:8000 (Uvicorn's default port), or on http://localhost:5000 if PORT is set accordingly.

Running the Frontend

To run the Streamlit frontend application:

  1. Ensure the backend is running.
  2. Set the ENDPOINT environment variable if your backend is not on http://localhost:5000. For local development, you can add ENDPOINT=http://localhost:5000 to your .env file.
  3. Start the frontend:

    make run-frontend
    # Alternatively: streamlit run src/frontend/streamlit_app.py

The frontend will typically be available at http://localhost:8501.

📡 API Endpoints

The API routes are primarily defined in src/backend/routes/chat.py. The root / endpoint is in app.py.

| Endpoint | Method | Description | Request Body (Format) | Success Response (JSON Example) |
|---|---|---|---|---|
| / | GET | Basic API information and available endpoints. | N/A | {"message": "PDF Assistant Chatbot API", "version": "1.0.0", "endpoints": {"/health": "GET - Health check", ...}} (from chat.py if routed, or app.py's version) |
| /health | GET | Detailed health check of backend services. | N/A | {"status": "success", "health": {"gemini_api": true, ...}, "healthy": true} |
| /uploadpdf | POST | Uploads a PDF file for processing and vectorization. | FormData: file (PDF file) | {"success": true, "message": "PDF 'name.pdf' uploaded...", "filename": "name.pdf"} |
| /answer | POST | Asks a question about the processed PDF content. | JSON: {"query": "Your question?"} | {"answer": "AI generated answer."} |

Note: The root endpoint / defined in app.py provides a simple welcome message. The one in chat.py (if chat_bp is mounted at root) offers more detail. The table reflects the more detailed one for completeness.
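
As a rough illustration of how a client might call the /answer endpoint described above, the sketch below builds the JSON request with Python's standard library only. The helper names build_answer_request and ask, the BASE_URL default, and the timeout are assumptions made for this example; they are not part of the project's code.

```python
import json
import urllib.request

# Assumption: adjust to wherever your backend is listening.
BASE_URL = "http://localhost:8000"


def build_answer_request(query: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /answer with a JSON body of the form {"query": ...}."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/answer",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(query: str, base_url: str = BASE_URL, timeout: float = 30.0) -> str:
    """Send the question and return the "answer" field of the JSON response."""
    request = build_answer_request(query, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["answer"]
```

With the backend running and at least one PDF processed, calling ask("What is this document about?") should return the generated answer as a string. Separating request construction from sending keeps the payload format easy to check without a live server.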

For deployment instructions, see the Detailed Installation and Setup Guide.

🔧 Development

Project Structure

Chatbot_Pinecone_flask_backend/
├── .env                   # Local environment variables (gitignored)
├── .env.template          # Template for .env file
├── .git/                  # Git version control directory
├── .gitignore             # Specifies intentionally untracked files for Git
├── README.md              # This guide
├── Makefile               # Defines common tasks like running, testing, linting
├── app.py                 # Main FastAPI application entry point for the backend
├── requirements.txt       # Python package dependencies for both backend and frontend
├── requirements-dev.txt   # Development-specific dependencies (testing, linting)
├── start.sh               # Shell script for starting the backend application (e.g., via Gunicorn with Uvicorn workers)
├── src/                   # Main source code directory
│   ├── backend/           # Source code for the FastAPI backend
│   │   ├── __init__.py
│   │   ├── agents/
│   │   │   ├── base.py
│   │   │   └── rag_agent.py
│   │   ├── config.py
│   │   ├── routes/
│   │   │   ├── __init__.py
│   │   │   └── chat.py
│   │   ├── services/
│   │   │   └── orchestrator.py
│   │   └── utils/
│   │       └── helper.py
│   └── frontend/          # Source code for the Streamlit frontend
│       └── streamlit_app.py # Main Streamlit application file
└── tests/                 # Automated tests (primarily for the backend)
    ├── conftest.py
    ├── test_agents/
    └── test_routes/

Key Components

Backend:

  • app.py: Initializes the FastAPI app, includes routers, and defines the root (/) endpoint for the backend. (Filename might be main.py)
  • src/backend/routes/chat.py: Contains the FastAPI APIRouter for core API endpoints: /health, /uploadpdf, and /answer.
  • src/backend/agents/rag_agent.py: Implements the core RAG (Retrieval Augmented Generation) logic, including PDF processing, vector embedding, and question answering using Gemini and Pinecone.
  • src/backend/services/orchestrator.py: Acts as a layer between API routes and the RAGAgent.
  • src/backend/config.py: Manages application configuration, loading settings from environment variables.
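
To make the retrieval step of RAG concrete, here is a toy, standard-library-only sketch. It is not the project's rag_agent.py: word-count vectors stand in for Gemini embeddings, an in-memory list stands in for the Pinecone index, and every function name (chunk_text, embed, cosine, retrieve, build_prompt) is hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def chunk_text(text: str, chunk_size: int = 40) -> List[str]:
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 2) -> List[str]:
    """Return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: List[str]) -> str:
    """Combine the retrieved context and the question into a prompt for the LLM."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"
```

In a real pipeline the prompt returned by build_prompt would be sent to the language model, and the similarity search would be delegated to the vector database rather than computed in Python.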

Frontend:

  • src/frontend/streamlit_app.py: A Streamlit application providing the user interface. It interacts with the backend API to upload PDFs and get answers to questions.

Environment Variables

The application uses environment variables for configuration. These are typically defined in a .env file in the project root for local development. See .env.template for a list of required variables.

Backend Variables:

  • GEMINI_API_KEY: Your Google Gemini API key.
  • PINECONE_API_KEY: Your Pinecone API key.
  • PINECONE_INDEX_NAME: The name of your Pinecone index.
  • PINECONE_CLOUD: The cloud provider for your Pinecone index (e.g., aws).
  • PINECONE_REGION: The region of your Pinecone index (e.g., us-east-1).
  • APP_ENV: Set to development or production. This variable typically controls debug mode and other environment-specific settings. (Formerly FLASK_ENV)
  • PORT: Port for the backend server (defaults to 8000 for Uvicorn/FastAPI, but can be 5000 if configured).
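
One possible way to load the backend variables listed above is sketched here. This is an illustration, not the contents of src/backend/config.py: the BackendConfig class, the load_config function, the choice of which variables are required, and the fallback values for PINECONE_CLOUD, PINECONE_REGION, and APP_ENV are all assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumption: these three are treated as mandatory in this sketch.
REQUIRED = ("GEMINI_API_KEY", "PINECONE_API_KEY", "PINECONE_INDEX_NAME")


@dataclass(frozen=True)
class BackendConfig:
    gemini_api_key: str
    pinecone_api_key: str
    pinecone_index_name: str
    pinecone_cloud: str
    pinecone_region: str
    app_env: str
    port: int

    @property
    def debug(self) -> bool:
        """Debug behaviour is switched on only in development."""
        return self.app_env == "development"


def load_config(env: Optional[Mapping[str, str]] = None) -> BackendConfig:
    """Read settings from env (defaults to os.environ) and fail fast on missing keys."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return BackendConfig(
        gemini_api_key=env["GEMINI_API_KEY"],
        pinecone_api_key=env["PINECONE_API_KEY"],
        pinecone_index_name=env["PINECONE_INDEX_NAME"],
        pinecone_cloud=env.get("PINECONE_CLOUD", "aws"),          # assumed default
        pinecone_region=env.get("PINECONE_REGION", "us-east-1"),  # assumed default
        app_env=env.get("APP_ENV", "production"),                 # assumed default
        port=int(env.get("PORT", "8000")),
    )
```

Failing at startup when a key is missing tends to be easier to diagnose than a failure on the first request to Gemini or Pinecone.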

Frontend Variables:

  • ENDPOINT: The URL of the backend API. For local development, this would typically be http://localhost:5000.

🧪 Running Tests

(This section outlines general steps. Specific test setup might vary.)

For information on installing test dependencies, see the Detailed Installation and Setup Guide.

  1. Run Tests: Navigate to the project root directory and execute:
    pytest
    Pytest will automatically discover and run tests (typically files named test_*.py or *_test.py in the tests/ directory).

Refer to the tests/ directory and any specific test documentation or configuration files for more detailed instructions on running tests.
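
For orientation, a test file that pytest would discover might look like the sketch below. The validate_answer_payload helper is hypothetical and defined inline so the example is self-contained; the project's real tests live under tests/ and may be organised differently.

```python
from typing import Any, Mapping, Optional


def validate_answer_payload(payload: Mapping[str, Any]) -> Optional[str]:
    """Return an error message if the payload is invalid, otherwise None."""
    query = payload.get("query")
    if not isinstance(query, str) or not query.strip():
        return "Field 'query' is required and must be a non-empty string."
    return None


# pytest collects functions whose names start with "test_".
def test_accepts_non_empty_query():
    assert validate_answer_payload({"query": "What is the venue?"}) is None


def test_rejects_missing_query():
    assert validate_answer_payload({}) is not None


def test_rejects_blank_query():
    assert validate_answer_payload({"query": "   "}) is not None
```

Saved as, for example, tests/test_payload.py (a hypothetical filename), these functions would be picked up by a plain pytest run from the project root.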

🛠️ Troubleshooting

For troubleshooting common installation and setup issues, refer to the Detailed Installation and Setup Guide.

Debug Mode (Local Development)

For more verbose error output locally:

  1. Set APP_ENV=development in your .env file. This often enables FastAPI's debug mode.
  2. Optionally, set LOG_LEVEL=DEBUG in .env for more detailed application logs.
  3. Run the app (e.g., uvicorn app:app --reload).
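
How LOG_LEVEL is consumed depends on the application code. A minimal sketch of one common approach, using only the standard logging module, is shown here; the function names resolve_log_level and configure_logging and the INFO fallback are assumptions for illustration.

```python
import logging
import os


def resolve_log_level(name: str) -> int:
    """Map a level name such as "debug" to its logging constant, falling back to INFO."""
    level = getattr(logging, name.strip().upper(), None)
    return level if isinstance(level, int) else logging.INFO


def configure_logging() -> int:
    """Configure root logging from LOG_LEVEL and return the level that was resolved."""
    level = resolve_log_level(os.environ.get("LOG_LEVEL", "INFO"))
    # basicConfig has no effect if the root logger already has handlers.
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    return level
```

Calling configure_logging() once at startup, before the app is created, keeps the verbosity controlled entirely by the .env file.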

📊 Monitoring

Health Checks

The /health endpoint (see API Endpoints) provides detailed status of backend components. Regularly polling this endpoint can help ensure system availability.
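
A polling script could interpret the /health response roughly as follows. This sketch assumes the response shape shown in the API Endpoints section ("health" as a map of component names to booleans, plus a top-level "healthy" flag); the function names are hypothetical, and any component name other than gemini_api is illustrative.

```python
import json
import urllib.request
from typing import Any, Dict, List


def failing_components(payload: Dict[str, Any]) -> List[str]:
    """Return the names of components reported as unhealthy in a /health response."""
    health = payload.get("health", {})
    return sorted(name for name, ok in health.items() if not ok)


def is_healthy(payload: Dict[str, Any]) -> bool:
    """True when the response marks the service as healthy overall."""
    return bool(payload.get("healthy", False))


def fetch_health(base_url: str, timeout: float = 5.0) -> Dict[str, Any]:
    """GET /health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url.rstrip('/')}/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A scheduler or uptime monitor could call fetch_health at a fixed interval and alert whenever is_healthy returns False, reporting failing_components for context.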

Logs

  • Local Development: Logs are written to the console where the backend server is running (e.g., uvicorn app:app --reload). Adjust LOG_LEVEL in .env for the desired verbosity.
  • Render Deployment: Access and monitor logs via the Render dashboard for your service. This is crucial for diagnosing issues in the production environment.

Key information to look for in logs:

  • Successful/failed PDF uploads and processing durations.
  • Question answering request details.
  • Errors from external services (Gemini, Pinecone).
  • Any unexpected application exceptions or tracebacks.

🔒 Security

  • API Keys: Handled via environment variables (.env locally, Render's environment settings). Never hardcode keys. Ensure .env is in .gitignore.
  • File Uploads:
    • werkzeug.utils.secure_filename is used to sanitize filenames.
    • File type and size are validated as per ALLOWED_EXTENSIONS and MAX_FILE_SIZE in the app configuration.
  • Input Validation: Basic validation for presence of query in /answer and file in /uploadpdf. Sensitive inputs should always be validated and sanitized.
  • CORS: FastAPI handles CORS through CORSMiddleware. Ensure it's configured securely, especially in production, by specifying allowed origins, methods, and headers. For example:
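    from fastapi import FastAPI
    from fastapi.middleware.cors import CORSMiddleware

    app = FastAPI()  # in this project, the app instance created in app.py

    app.add_middleware(
        CORSMiddleware,
        # List trusted origins explicitly in production; ["*"] is for local development only
        allow_origins=["https://your.frontend.domain.com"],
        allow_credentials=True,
        allow_methods=["*"],
        allow_headers=["*"],
    )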
  • Error Handling: FastAPI has built-in support for returning structured JSON error responses (e.g., using HTTPException) and allows for custom exception handlers. This helps avoid exposing raw stack traces.
  • Dependency Management: Keep requirements.txt up-to-date. Regularly audit dependencies for vulnerabilities using tools like pip-audit or GitHub's Dependabot.
  • HTTPS: Render automatically provides HTTPS for deployed services.
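
The file type and size checks mentioned under File Uploads above can be sketched as follows. The values of ALLOWED_EXTENSIONS and MAX_FILE_SIZE and the validate_upload function are assumptions for illustration; the actual limits are whatever the app configuration defines.

```python
import os
from typing import Optional

ALLOWED_EXTENSIONS = {"pdf"}        # assumption: only PDFs are accepted
MAX_FILE_SIZE = 10 * 1024 * 1024    # assumption: 10 MiB limit


def validate_upload(filename: str, size_bytes: int) -> Optional[str]:
    """Return an error message for a rejected upload, or None if it passes."""
    if not filename:
        return "No file name provided."
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    if extension not in ALLOWED_EXTENSIONS:
        return f"Unsupported file type: '{extension or 'none'}'."
    if size_bytes <= 0:
        return "File is empty."
    if size_bytes > MAX_FILE_SIZE:
        return f"File exceeds the {MAX_FILE_SIZE} byte limit."
    return None
```

An extension check alone does not prove the content is a PDF, so a stricter setup might also inspect the file's leading bytes before processing.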

📝 License

This project is licensed under the MIT License. It's good practice to include a LICENSE file in the repository root with the full text of the MIT License.

🀝 Contributing

Contributions are welcome! Please adhere to the following process:

  1. Fork the Repository: Create your own fork on GitHub.
  2. Create a Branch: git checkout -b feature/your-new-feature or bugfix/issue-description.
  3. Develop: Make your changes.
  4. Test: Add and run tests for your changes using pytest.
  5. Commit: Write clear, concise commit messages.
  6. Push: Push your branch to your fork: git push origin your-branch-name.
  7. Pull Request: Open a PR against the main branch of the original repository. Clearly describe your changes and link any relevant issues.

📞 Support

If you encounter issues or have questions:

  • Check GitHub Issues: See if your question or problem has already been addressed.
  • Review Troubleshooting Section: The Detailed Installation and Setup Guide might have a solution.
  • Create a New Issue: If your issue is new, provide detailed information:
    • Steps to reproduce.
    • Expected vs. actual behavior.
    • Error messages and relevant logs.
    • Your environment (OS, Python version).
  • For Render-specific deployment issues, consult the Render documentation.

Happy coding! 🚀
