Skip to content

dparikh79/AI-Chatbot

Repository files navigation

AI-Chatbot

A multi-tenant Flask app that turns a stack of PDFs into a branded, embeddable chat widget. Upload your documents, get a one-line <script> tag you can paste into any website, and ship a grounded, retrieval-based assistant trained on your own content.

Built October 2023 while exploring how to package a small RAG pipeline behind a real product surface (auth, per-user state, CDN-served widget) rather than as a notebook demo.


Why I built it

I wanted to learn what it takes to wrap a language model into something a non-technical user could actually deploy. The interesting parts were not the model call. They were the boring product seams: per-user document isolation, persisting assistants across restarts, generating a customizable widget per tenant, and serving it from a CDN so the host site does not need to know anything about Flask.

The result is a small but end-to-end SaaS pattern for a "chatbot for your docs" service.

What it does

  1. User registers, logs in, gets a unique PIN (used as the widget's tenant key).
  2. User uploads one or more PDFs.
  3. The app extracts text with PyPDF2, sentence-splits it, and builds a TF-IDF index per assistant.
  4. User clicks "Create Chatbot" and gets a flash-message containing a <script> tag.
  5. That script (served from S3 via CloudFront) injects a floating chat widget into the host page.
  6. When a visitor sends a message, the widget POSTs to /chat/ with the tenant PIN; the server retrieves the best-matching passage via cosine similarity and asks GPT-3.5-turbo to answer using only that passage.

A strict system prompt forces the model to refuse anything outside the uploaded corpus, which is how grounding is enforced (no embeddings DB, no vector store, just TF-IDF retrieval and a guarded prompt).

Architecture

                        Browser (host site)
                              |
                              | <script defer src="cloudfront/<pin>_embedChatbot.js">
                              v
   +---------------------------------------------------+
   |  CloudFront  <----  S3 (per-tenant widget JS)     |
   +---------------------------------------------------+
                              |
                              | POST /chat/ {message, userPin}
                              v
   +---------------------------------------------------+
   |  Flask app (app.py)                               |
   |   - Flask-Login auth, Postgres user store         |
   |   - per-PIN Assistant dict (in-process)           |
   |   - pickled to S3 on create/delete                |
   +---------------------------------------------------+
                              |
                              v
   +---------------------------------------------------+
   |  Assistant  ->  PDFProcessor (PyPDF2 + regex)     |
   |             ->  Chatbot     (TF-IDF + cosine sim) |
   |             ->  OpenAI chat completions API       |
   +---------------------------------------------------+

Key files:

  • app.py (480 lines): routes, auth, S3/CloudFront glue, assistant lifecycle
  • assistant.py: orchestrates PDF -> sentences -> retrieval -> LLM
  • chatbot.py: TF-IDF index + GPT-3.5 call with grounding prompt
  • pdf_processor.py: PyPDF2 text extraction + basic cleanup
  • static/embedChatbot.js: ~12KB self-contained widget template (CSS, HTML, fetch loop, all scoped)
  • templates/: login, register, password reset, create-chatbot flows

Quickstart

You will need: Python 3.10+, Postgres, an S3 bucket fronted by CloudFront, and an OpenAI API key.

git clone https://github.com/dparikh79/AI-Chatbot.git
cd AI-Chatbot
python -m venv aichatbot_env
source aichatbot_env/bin/activate
pip install -r requirements.txt

Create a .env file at the repo root (do not commit it, see .gitignore):

# OpenAI
API_ENDPOINT=https://api.openai.com/v1/chat/completions
API_KEY=sk-...

# Flask
FLASK_SECRET_KEY=<long-random-hex>
DATABASE_URL=postgresql://user:pass@host:5432/aichatbot

# Local storage
BASE_UPLOAD_FOLDER=/absolute/path/to/uploads
ALLOWED_EXTENSIONS=pdf

# AWS (S3 + CloudFront for the embeddable widget)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
AWS_BUCKET_NAME=your-bucket
CLOUDFRONT_DOMAIN=https://dxxxxxx.cloudfront.net

Then:

flask --app app.py run

Visit http://127.0.0.1:5000/, register, upload a PDF, click "Create Test Chatbot", and copy the generated <script> tag into any HTML page to see the widget appear.

Tech stack

  • Backend: Flask, Flask-Login, Flask-SQLAlchemy, Flask-CORS, gunicorn
  • Data: PostgreSQL (users, reset tokens), pickle on S3 (assistant cache)
  • Retrieval: scikit-learn TfidfVectorizer + cosine similarity
  • LLM: OpenAI gpt-3.5-turbo via raw HTTP (no SDK)
  • PDF: PyPDF2
  • Storage / CDN: AWS S3 + CloudFront (per-tenant widget JS)
  • Frontend: vanilla JS widget template, Jinja for the admin pages

What I would change today

This was an exploration, not a hardened product. If I rebuilt it now:

  • Swap TF-IDF for a real embedding model (OpenAI text-embedding-3-small or a local sentence-transformer) and a small vector store (FAISS, pgvector, or Chroma). Cosine sim over TF-IDF is brittle for paraphrased queries.
  • Move the assistant cache from a pickled S3 blob to Postgres or DynamoDB. Pickle + global dict + atexit save is fine for a prototype, not for concurrent workers.
  • Use a managed auth provider (Clerk, Auth0, or Cognito) instead of hand-rolled Flask-Login + SHA-256 password hashing. The current generate_password_hash(..., method="sha256") should be bcrypt or argon2.
  • Move conversation history out of a per-request local variable (currently each /chat/ call starts a fresh history, so the bot has no memory). Persist per-PIN history in Redis or Postgres with a TTL.
  • Containerize and deploy on Fargate or Lambda instead of a single gunicorn box.
  • Add streaming responses and a rate limiter (Flask-Limiter is already in requirements.txt but unused).

Limits and honest notes

  • No multi-turn memory: process_question creates a new conversation_history = [] every call. The bot forgets immediately. This was on my list to fix but the project moved on.
  • AWS coupling: the embeddable-widget path requires S3 + CloudFront. You can run the upload-and-chat flow locally without AWS by skipping /chatbot/create/ and using only /chat/ directly.
  • Single-process assumption: the assistants dict lives in process memory and is reloaded from a single pickle on S3. Running multiple gunicorn workers will see stale state.
  • Python pickle on S3 is a known footgun: anyone with write access to the bucket could ship arbitrary code into the app. Fine for solo dev, not for production.

License

MIT. See LICENSE.


Built by Dhruvil Parikh while learning how a RAG prototype becomes a deployable product surface.

About

Multi-tenant Flask app that turns PDFs into a branded, embeddable chat widget. TF-IDF retrieval + GPT-3.5, per-tenant JS served from S3/CloudFront.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors