```bash
# For unit tests
uv run --extra test pytest test_main.py
```
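A rough idea of what a unit test in test_main.py might look like, using FastAPI's TestClient; the /health path and the `app` object name are assumptions based on the rest of this README, not confirmed code:

```python
# Hypothetical test sketch: assumes main.py exposes a FastAPI instance named `app`
# and that the health check is served at /health.
from fastapi.testclient import TestClient

from main import app

client = TestClient(app)


def test_health_reports_status():
    resp = client.get("/health")
    assert resp.status_code == 200
    assert "status" in resp.json()
```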
A FastAPI-based service for querying vector embeddings stored in PostgreSQL with pgvector. This API provides retrieval endpoints for RAG (Retrieval Augmented Generation) applications.
- Vector Similarity Search: Retrieve top-K most similar documents using cosine similarity
- Metadata Filtering: Filter results by document metadata
- RAG Query Engine: Synthesized responses using retrieved context
- AWS Bedrock Integration: Uses Amazon Titan embeddings
- Health Monitoring: Built-in health check endpoints
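Under the hood, the similarity search amounts to a pgvector nearest-neighbour query. A minimal sketch of that lookup, assuming columns named node_id, text, and embedding (the real schema is managed by PGVectorStore and may differ):

```python
# Illustrative only: shows how top-K cosine similarity works at the SQL level.
# Column names are assumptions; connection values are the documented defaults.
import psycopg2

conn = psycopg2.connect(
    dbname="embeddings",
    user="burrow",
    password="capstone",
    host="burrow-serverless-wilson.cluster-cwxgyacqyoae.us-east-1.rds.amazonaws.com",
    port=5432,
)


def top_k_similar(query_embedding: list[float], k: int = 5):
    # pgvector's <=> operator returns cosine distance, so 1 - distance is the
    # similarity score exposed by the API.
    vec = "[" + ",".join(str(x) for x in query_embedding) + "]"
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT node_id, text, 1 - (embedding <=> %s::vector) AS score
            FROM data_burrow_table
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (vec, vec, k),
        )
        return cur.fetchall()
```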
```
┌─────────────┐      ┌──────────────┐      ┌─────────────────┐
│   Client    │─────▶│  Query API   │─────▶│  PostgreSQL +   │
│             │      │  (FastAPI)   │      │    pgvector     │
└─────────────┘      └──────────────┘      └─────────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │ AWS Bedrock  │
                     │   (Titan)    │
                     └──────────────┘
```
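A hedged sketch of how these components are typically wired together with llama-index; module paths and the Bedrock parameter names assume a recent llama-index release, and the values shown are the documented defaults rather than the service's actual code:

```python
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.bedrock import BedrockEmbedding
from llama_index.vector_stores.postgres import PGVectorStore

# Query embeddings come from Bedrock Titan (parameter names may vary by version).
Settings.embed_model = BedrockEmbedding(
    model_name="amazon.titan-embed-text-v2:0",
    region_name="us-east-1",
)

# table_name omits the "data_" prefix because PGVectorStore adds it itself.
vector_store = PGVectorStore.from_params(
    database="embeddings",
    user="burrow",
    password="capstone",
    host="burrow-serverless-wilson.cluster-cwxgyacqyoae.us-east-1.rds.amazonaws.com",
    port=5432,
    table_name="burrow_table",
    embed_dim=1024,
)

index = VectorStoreIndex.from_vector_store(vector_store)
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("What is machine learning?")
```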
- Python 3.11+
- PostgreSQL with pgvector extension
- AWS credentials configured for Bedrock access
Using uv:

- Install uv if you haven't already:

  ```bash
  curl -LsSf https://astral.sh/uv/install.sh | sh
  ```

- Navigate to the project directory:

  ```bash
  cd query_api
  ```

- Install dependencies (uv automatically manages the virtual environment):

  ```bash
  uv sync
  ```

- Run the application:

  ```bash
  uv run python main.py
  ```

Using pip and a virtual environment:

- Clone the repository and navigate to the project directory:

  ```bash
  cd query_api
  ```

- Create and activate a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Run the application:

  ```bash
  python main.py
  ```

Create a .env file (optional, defaults are in config.py):
```bash
DB_NAME=embeddings
DB_USER=burrow
DB_PASSWORD=capstone
DB_HOST=burrow-serverless-wilson.cluster-cwxgyacqyoae.us-east-1.rds.amazonaws.com
DB_PORT=5432
TABLE_NAME=burrow_table
EMBED_DIM=1024
AWS_REGION=us-east-1
```

The API will be available at http://localhost:8000.
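A guess at how config.py might read these variables; the defaults mirror the configuration table further down, but the real file may be organized differently:

```python
# Hypothetical sketch of config.py (structure assumed, defaults from this README).
import os

DB_NAME = os.getenv("DB_NAME", "embeddings")
DB_USER = os.getenv("DB_USER", "burrow")
DB_PASSWORD = os.getenv("DB_PASSWORD", "capstone")
DB_HOST = os.getenv(
    "DB_HOST",
    "burrow-serverless-wilson.cluster-cwxgyacqyoae.us-east-1.rds.amazonaws.com",
)
DB_PORT = int(os.getenv("DB_PORT", "5432"))
TABLE_NAME = os.getenv("TABLE_NAME", "burrow_table")
EMBED_DIM = int(os.getenv("EMBED_DIM", "1024"))
AWS_REGION = os.getenv("AWS_REGION", "us-east-1")
BEDROCK_MODEL_ID = os.getenv("BEDROCK_MODEL_ID", "amazon.titan-embed-text-v2:0")
```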
Root endpoint with API information.
Health check endpoint showing service and database status.
Response:
```json
{
  "status": "healthy",
  "database_connected": true,
  "vector_store_initialized": true
}
```

Retrieve top-K similar documents without synthesis.
Request:
```json
{
  "query": "What is machine learning?",
  "top_k": 5,
  "filters": {
    "filters": [
      {
        "key": "source",
        "value": "wikipedia",
        "operator": "=="
      }
    ],
    "condition": "and"
  }
}
```

Response:
```json
{
  "nodes": [
    {
      "node_id": "abc123",
      "text": "Machine learning is...",
      "score": 0.89,
      "metadata": {
        "source": "wikipedia",
        "date": "2024-01-15"
      }
    }
  ],
  "query": "What is machine learning?",
  "total_results": 5
}
```

RAG query with synthesis (requires an LLM to be configured).
Request:
```json
{
  "query": "Explain machine learning",
  "top_k": 5,
  "filters": null
}
```

Response:
```json
{
  "response": "Machine learning is a branch of artificial intelligence...",
  "source_nodes": [...],
  "query": "Explain machine learning"
}
```

Build the Docker image:

```bash
docker build -t query-api .
```

Run the container locally:

```bash
docker run -p 8000:8000 \
  -e DB_PASSWORD=your_password \
  -e AWS_ACCESS_KEY_ID=your_key \
  -e AWS_SECRET_ACCESS_KEY=your_secret \
  query-api
```

Push the image to ECR:

```bash
# Authenticate to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com

# Build and tag
docker build -t query-api .
docker tag query-api:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/query-api:latest

# Push
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/query-api:latest
```

When deploying the container, make sure the following are configured:

- DB credentials
- AWS region
- Table configuration
Supported operators:
- `==`: Equal
- `>`: Greater than
- `<`: Less than
- `>=`: Greater than or equal
- `<=`: Less than or equal
- `!=`: Not equal
- `in`: In list
- `nin`: Not in list
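Internally, a filters payload in this shape presumably maps onto llama-index metadata-filter objects. A hedged sketch of that translation; the build_filters helper and the module path are assumptions, not the service's actual code:

```python
# Sketch: convert the request's "filters" object into llama-index filter types.
from llama_index.core.vector_stores import (
    FilterCondition,
    FilterOperator,
    MetadataFilter,
    MetadataFilters,
)


def build_filters(payload: dict | None) -> MetadataFilters | None:
    if not payload:
        return None
    return MetadataFilters(
        filters=[
            MetadataFilter(
                key=f["key"],
                value=f["value"],
                operator=FilterOperator(f["operator"]),  # "==", ">=", "in", ...
            )
            for f in payload["filters"]
        ],
        condition=FilterCondition(payload.get("condition", "and")),
    )
```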
Example with multiple filters:
```json
{
  "query": "search term",
  "top_k": 10,
  "filters": {
    "filters": [
      {"key": "category", "value": "science", "operator": "=="},
      {"key": "year", "value": 2020, "operator": ">="}
    ],
    "condition": "and"
  }
}
```

All configuration is managed through config.py and can be overridden with environment variables:
| Variable | Default | Description |
|---|---|---|
| DB_NAME | embeddings | Database name |
| DB_USER | burrow | Database user |
| DB_PASSWORD | capstone | Database password |
| DB_HOST | burrow-serverless-wilson... | Database host |
| DB_PORT | 5432 | Database port |
| TABLE_NAME | burrow_table | Vector store table name (PGVectorStore adds "data_" prefix) |
| EMBED_DIM | 1024 | Embedding dimensions |
| AWS_REGION | us-east-1 | AWS region |
| BEDROCK_MODEL_ID | amazon.titan-embed-text-v2:0 | Bedrock model |
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
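A Python equivalent of the curl call below, in case you prefer scripting the check; the requests library and the printed fields are just one way to do it:

```python
# Quick smoke test against a locally running instance of the API.
import requests

resp = requests.post(
    "http://localhost:8000/retrieve",
    json={"query": "test query", "top_k": 3},
    timeout=30,
)
resp.raise_for_status()
for node in resp.json()["nodes"]:
    print(f"{node['score']:.3f}  {node['text'][:80]}")
```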
```bash
curl -X POST "http://localhost:8000/retrieve" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "test query",
    "top_k": 3
  }'
```

Troubleshooting:

- Verify database credentials
- Check security group rules for Aurora
- Ensure pgvector extension is installed
- Confirm table exists:
  ```sql
  SELECT COUNT(*) FROM data_burrow_table;
  ```

- Check `embed_dim` matches your Bedrock model output
- Important: If your table is named `data_X`, set `TABLE_NAME=X` (without the "data_" prefix), because PGVectorStore adds the prefix automatically
- Verify IAM permissions for Bedrock
- Check AWS credentials are configured
- Ensure model is available in your region
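Independently of this service, a quick way to sanity-check Bedrock access from Python (assumes boto3 and AWS credentials are already set up; the request body shown is the Titan text-embedding format):

```python
# Minimal Bedrock check: embed a short string with Titan and print the vector size
# (it should match EMBED_DIM, 1024 for amazon.titan-embed-text-v2:0).
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")
resp = client.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": "hello world"}),
)
embedding = json.loads(resp["body"].read())["embedding"]
print(len(embedding))
```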
License: MIT