FastAPI PDF Retrieval Augmented Generation (RAG) APIs

Introduction

This FastAPI server acts as the main backend API for the full-stack Chat PDF AI Assistant.

Highlights:

  • A custom "Read & Chunk" component preprocesses PDF files, splitting pages into sentences and then merging them into chunks for embedding and indexing.
  • MongoDB performs vector similarity and keyword search operations over the ingested vectors.
  • The Jina AI APIs are used to embed text data and to rerank retrieved results.
  • The "api/hybrid_search" route handles hybrid search queries, combining traditional text search (BM25) with vector similarity search, followed by Jina AI reranking (see the sketch below).

Preview

www.chat-pdf-ai.com

Overall Architecture

Frontend GitHub: https://github.com/Nelsonlin0321/chat-pdf-ai-assistant

Details about PDF ingestion

  • The PDF file is first split into individual pages using PyPDF.
  • Each page is then processed with the TextBlob library to convert the page content into sentences.
  • The sentences from each page are merged into larger chunks with overlapping text between consecutive chunks. This overlap helps maintain context during search and retrieval.
  • The chunked text data is then passed through an embedding pipeline using the Jina Embedding API. This step converts the textual data into high-dimensional vector representations.
  • The resulting vector embeddings, along with their corresponding text chunks, are ingested and stored in MongoDB as the vector database.
  • The original PDF file is also uploaded to S3 storage for display in the frontend (see the sketch after this list).
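
A minimal sketch of this ingestion pipeline, assuming pypdf, TextBlob, the Jina Embedding API, pymongo and boto3. The database, collection, bucket and model names are illustrative, not necessarily what the repository uses:

import os
import boto3
import requests
from pypdf import PdfReader
from textblob import TextBlob
from pymongo import MongoClient

def read_and_chunk(pdf_path, chunk_size=5, overlap=1):
    # Split the PDF into pages, each page into sentences, then merge
    # sentences into overlapping chunks to preserve context.
    sentences = []
    for page in PdfReader(pdf_path).pages:
        sentences.extend(str(s) for s in TextBlob(page.extract_text() or "").sentences)
    chunks, step = [], chunk_size - overlap
    for i in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[i:i + chunk_size]))
    return chunks

def embed(texts):
    # Call the Jina Embedding API to turn text chunks into vectors.
    response = requests.post(
        "https://api.jina.ai/v1/embeddings",
        headers={"Authorization": f"Bearer {os.environ['JINA_API_KEY']}"},
        json={"model": "jina-embeddings-v2-base-en", "input": texts},
    )
    return [item["embedding"] for item in response.json()["data"]]

def ingest(pdf_path):
    chunks = read_and_chunk(pdf_path)
    vectors = embed(chunks)
    # Store chunks and their embeddings in MongoDB for vector/keyword search.
    collection = MongoClient(os.environ["MONGODB_URL"])["rag"]["chunks"]
    collection.insert_many(
        [{"text": t, "embedding": v, "source": pdf_path} for t, v in zip(chunks, vectors)]
    )
    # Upload the original PDF to S3 so the frontend can display it (bucket name assumed).
    boto3.client("s3").upload_file(pdf_path, "chat-pdf-ai-assistant", os.path.basename(pdf_path))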

Run Backend API Locally

python -m venv venv
source venv/bin/activate
pip3 install -r requirements.txt

source .env
uvicorn server:app --reload \
                   --reload-dir ./app \
                   --host localhost \
                   --port 8000
# or

sh boot.sh

Build Docker

image_name=rag-backend-api
docker build -t ${image_name}:latest -f ./Dockerfile .
docker run --env-file docker.env -p 8000:8000 -it --rm --name ${image_name} ${image_name}:latest

APIs Description
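
The routes referenced in this README are /api/health_check and /api/hybrid_search. Below is a minimal example of calling them against a locally running server; the hybrid_search payload fields are an assumption, so check the interactive FastAPI docs at http://localhost:8000/docs for the actual schema.

import requests

BASE_URL = "http://localhost:8000"

# Health check route (also exercised in the Lambda test below).
print(requests.get(f"{BASE_URL}/api/health_check").json())

# Hybrid search; the payload field names here are illustrative assumptions.
payload = {"query": "What is retrieval augmented generation?", "file_name": "example.pdf"}
print(requests.post(f"{BASE_URL}/api/hybrid_search", json=payload).json())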

Run Backend API Using Docker

image_name=rag-backend-api
docker build -t ${image_name}:latest .

Create a docker.env file with the required credentials:

MONGODB_URL=
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
JINA_API_KEY=

Then run the container with the env file:

image_name=rag-backend-api
docker run --env-file docker.env -p 8000:8000 -it ${image_name}:latest

Build AWS Lambda FastAPI Container

image_name=lambda-rag-backend-api
docker build -t ${image_name}:latest -f ./Dockerfile.aws.lambda .

Test the Lambda

image_name=lambda-rag-backend-api
docker run --env-file docker.env -p 9000:8080 --name lambda-rag-backend-api -it --rm ${image_name}:latest
curl -XPOST "http://localhost:9000/2015-03-31/functions/function/invocations" -d '{
    "resource": "/api/health_check",
    "path": "/api/health_check",
    "httpMethod": "GET",
    "requestContext": {
    },
    "isBase64Encoded": false
}'

Push To ECR

source .env
account_id=932682266260
region=ap-southeast-1
image_name=lambda-rag-backend-api
repo_name=${image_name}
aws ecr get-login-password --region ${region} | docker login --username AWS --password-stdin ${account_id}.dkr.ecr.${region}.amazonaws.com
aws ecr create-repository \
    --repository-name ${repo_name} \
    --region ${region}
docker tag ${image_name}:latest ${account_id}.dkr.ecr.${region}.amazonaws.com/${repo_name}:latest
docker push ${account_id}.dkr.ecr.${region}.amazonaws.com/${repo_name}:latest

Deploy To AWS with Infra Code

cd ./infra
terraform init
terraform apply

About

Chat PDF AI Assistant Powered By Gemini
