This application lets users ask natural language questions of any PDF document they upload. The solution combines the text generation and analysis capabilities of an LLM with vector search over the document's content, built entirely on serverless services: Amazon Bedrock for access to foundation models, AWS Lambda to run LangChain, and Amazon DynamoDB for conversational memory.
**Note:** This project incurs AWS costs. Refer to [AWS Pricing](https://aws.amazon.com/pricing/) for more details.
- Amazon Bedrock for serverless embeddings and inference.
- LangChain to orchestrate a Q&A LLM chain.
- FAISS for vector storage.
- Amazon DynamoDB for serverless conversational memory.
- AWS Lambda for serverless compute.
- Frontend built with React, TypeScript, TailwindCSS, and Vite.
- Runs locally, with the option to deploy via Vercel Hosting.
- Amazon Cognito for authentication.
- A user uploads a PDF document through a static web frontend into an Amazon S3 bucket.
- The upload triggers metadata extraction and document embedding, converting text to vectors for storage in S3.
- When a user chats with a PDF document, a Lambda function retrieves the relevant vectors and generates a response grounded in the document using an LLM; both flows are sketched below.
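To make the ingestion flow concrete, here is a minimal sketch of what the embedding Lambda does. This is not the repository's actual code: the handler shape, chunking parameters, and S3 key layout are assumptions, but the Bedrock + FAISS + LangChain flow matches the architecture above.

```python
# Sketch of the embedding Lambda (cf. backend/src/generate_embeddings/main.py).
# The S3 key layout and chunking parameters here are assumptions.
import boto3
from langchain_aws import BedrockEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from pypdf import PdfReader

s3 = boto3.client("s3")

def handler(event, context):
    # The S3 "ObjectCreated" event carries the bucket and key of the uploaded PDF.
    record = event["Records"][0]["s3"]
    bucket, key = record["bucket"]["name"], record["object"]["key"]

    # Download the PDF into Lambda's writable /tmp directory and extract its text.
    local_pdf = "/tmp/input.pdf"
    s3.download_file(bucket, key, local_pdf)
    text = "".join(page.extract_text() or "" for page in PdfReader(local_pdf).pages)

    # Split into overlapping chunks so each embedding covers a focused passage.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    chunks = splitter.split_text(text)

    # Embed with Amazon Titan via Bedrock and build a FAISS index in memory.
    index = FAISS.from_texts(
        chunks, BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")
    )

    # Persist the index (index.faiss + index.pkl) back to S3 next to the PDF.
    index.save_local("/tmp/index")
    for name in ("index.faiss", "index.pkl"):
        s3.upload_file(f"/tmp/index/{name}", bucket, f"{key}/{name}")
```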
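On the query side, the response Lambda can be pictured like this. Again a sketch under assumed names (the `doc-chat-memory` table, the request shape, and the S3 layout are placeholders); the real logic lives in `backend/src/generate_response/main.py`.

```python
# Sketch of the response Lambda (cf. backend/src/generate_response/main.py).
# Table name, request shape, and S3 layout are assumptions.
import boto3
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_aws import BedrockEmbeddings, ChatBedrock
from langchain_community.chat_message_histories import DynamoDBChatMessageHistory
from langchain_community.vectorstores import FAISS

s3 = boto3.client("s3")

def handler(event, context):
    question = event["question"]      # assumed request shape
    bucket = event["bucket"]
    doc_key = event["document_key"]
    session_id = event["session_id"]

    # Fetch the prebuilt FAISS index for this document from S3.
    for name in ("index.faiss", "index.pkl"):
        s3.download_file(bucket, f"{doc_key}/{name}", f"/tmp/{name}")
    embeddings = BedrockEmbeddings(model_id="amazon.titan-embed-text-v1")
    index = FAISS.load_local("/tmp", embeddings, allow_dangerous_deserialization=True)

    # Conversation history lives in DynamoDB, keyed by chat session.
    memory = ConversationBufferMemory(
        chat_memory=DynamoDBChatMessageHistory(
            table_name="doc-chat-memory", session_id=session_id
        ),
        memory_key="chat_history",
        return_messages=True,
    )

    # Retrieval-augmented chain: embed the question, pull similar chunks from
    # FAISS, and let Claude answer using those chunks plus prior turns.
    chain = ConversationalRetrievalChain.from_llm(
        llm=ChatBedrock(model_id="anthropic.claude-3-haiku-20240307-v1:0"),
        retriever=index.as_retriever(),
        memory=memory,
    )
    return {"answer": chain.invoke({"question": question})["answer"]}
```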
- AWS SAM CLI
- Python 3.11 or later
1. Clone this repository:

   ```bash
   git clone https://github.com/mhrjdv/doc-chat.git
   ```
2. Configure the Amazon Bedrock model and region parameters in `backend/src/generate_response/main.py` and `backend/src/generate_embeddings/main.py` to customize the models if desired (a configuration sketch follows these steps).

3. Update IAM permissions to allow model access in your preferred region:

   ```yaml
   GenerateResponseFunction:
     Type: AWS::Serverless::Function
     Properties:
       Policies:
         - Statement:
             - Sid: "BedrockScopedAccess"
               Effect: "Allow"
               Action: "bedrock:InvokeModel"
               Resource:
                 - "arn:aws:bedrock:*::foundation-model/anthropic.claude-3-haiku"
                 - "arn:aws:bedrock:*::foundation-model/amazon.titan-embed-text-v1"
   ```
4. Build and deploy the application:

   ```bash
   cd backend
   sam build
   sam deploy --guided
   ```
5. Note the stack outputs, which include the API endpoint and Cognito configuration values needed below.
6. Create a `.env.development` file in the `frontend` directory with values from your deployment:

   ```
   VITE_REGION=us-east-1
   VITE_API_ENDPOINT=https://abcd1234.execute-api.us-east-1.amazonaws.com/dev/
   VITE_USER_POOL_ID=us-east-1_gxKtRocFs
   VITE_USER_POOL_CLIENT_ID=874ghcej99f8iuo0lgdpbrmi76k
   ```
7. Install frontend dependencies and start the local server:

   ```bash
   npm ci
   npm run dev
   ```
8. Access the application locally at `http://localhost:5173`.
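Step 2 above mentions model and region parameters. As a hedged illustration only (the constant names here are hypothetical, not the repository's actual contents), the configuration in `backend/src/generate_response/main.py` typically boils down to values like these:

```python
# Hypothetical configuration constants; adjust names to match the actual file.
import os

# The region must be one where the chosen Bedrock models are available.
AWS_REGION = os.environ.get("AWS_REGION", "us-east-1")

# Model IDs must stay consistent with the BedrockScopedAccess policy above.
EMBEDDING_MODEL_ID = "amazon.titan-embed-text-v1"
CHAT_MODEL_ID = "anthropic.claude-3-haiku-20240307-v1:0"
```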
For a managed deployment, use Vercel Hosting by importing the GitHub repository.
- Delete any secrets in AWS Secrets Manager.
- Empty the Amazon S3 bucket created for this application (a short sketch follows this list).
- Run `sam delete` from the `backend` directory to remove the associated resources.
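One way to empty the bucket is with a few lines of boto3 (the bucket name below is a placeholder; take the real one from your stack outputs):

```python
# Sketch: empty the application's S3 bucket before running `sam delete`.
# "your-doc-chat-bucket" is a placeholder for the bucket in your stack outputs.
import boto3

bucket = boto3.resource("s3").Bucket("your-doc-chat-bucket")
bucket.objects.all().delete()          # remove all current objects
bucket.object_versions.all().delete()  # and old versions, if versioning is on
```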
While this project demonstrates serverless document-chat capabilities, review these security best practices before any production use:
- Review encryption options, especially for AWS KMS, S3, and DynamoDB.
- Adjust API Gateway access logging, S3 access logging, and apply specific IAM policies as needed.
- For advanced security needs, consider connecting AWS Lambda to a VPC using the `VpcConfig` setting (a template sketch follows this list).
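As an illustration of that last point, attaching a SAM function to a VPC is a matter of adding `VpcConfig` to its properties (the subnet and security group IDs below are placeholders):

```yaml
GenerateResponseFunction:
  Type: AWS::Serverless::Function
  Properties:
    VpcConfig:
      SecurityGroupIds:
        - sg-0123456789abcdef0        # placeholder security group
      SubnetIds:
        - subnet-0123456789abcdef0    # placeholder private subnet
```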
This project is licensed under the MIT-0 License. See the LICENSE file for details.