AI-powered document processing system for HR documents using AWS Bedrock, Strands Agents, and serverless architecture.
This project provides an automated solution for extracting structured data from HR documents (CVs, payrolls, contracts, lab results) using:
- AWS Bedrock with Nova Lite model for AI processing
- Strands Agents SDK for agentic workflows
- API Gateway for secure document uploads
- App Runner for containerized application hosting
- S3 + DynamoDB for storage
- Langfuse for observability (optional)
Client → API Gateway (API Key validation)
↓
Lambda (generates presigned URL)
↓
Client uploads to S3
↓
S3 Event → Lambda → App Runner (Strands Agent)
↓
Results saved to DynamoDB
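Seen from the client, the first two steps of this flow are two HTTP requests: a POST to the API for a presigned URL, then a PUT of the file to S3. The sketch below builds both requests with the Python standard library only. The /upload path, the x-api-key header, and the JSON fields come from the curl examples in this README; the function names are hypothetical and are not part of the project code. The name of the response field that carries the presigned URL is not documented here, so the URL is passed in as a plain argument.

```python
import json
import urllib.request


def build_presign_request(api_url, api_key, document_type, filename, content_type):
    """Build the POST that asks the API for a presigned upload URL."""
    body = json.dumps(
        {
            "document_type": document_type,
            "filename": filename,
            "content_type": content_type,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url=api_url.rstrip("/") + "/upload",
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )


def build_s3_put_request(presigned_url, payload, content_type):
    """Build the PUT that sends the document bytes directly to S3."""
    return urllib.request.Request(
        url=presigned_url,
        data=payload,
        method="PUT",
        headers={"Content-Type": content_type},
    )
```

Either request can be sent with urllib.request.urlopen(request). The Content-Type of the PUT should normally match the content_type sent in the first request, as it does in the curl examples.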
- AWS Account with Bedrock access
- AWS CLI configured
- Node.js 18+ (for CDK)
- Python 3.11+
- Docker
- jq (for client script)
git clone <repository-url>
cd hr-reader-simple
cp .env.example .env
# Edit .env with your AWS profile and Langfuse keys (optional)

# Install dependencies
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# Deploy with Langfuse (optional)
./scripts/deploy/deploy_with_langfuse.sh
# Or deploy without Langfuse
./scripts/deploy/deploy.sh

./scripts/service/manage_api_keys.sh create your-user-id
# Save the generated API key

export API_URL="<your-api-gateway-url>"
export API_KEY="<your-api-key>"
# Test with lab results
./scripts/test/client_example.sh lab tests/sample_lab_result.txt
# Test with physical evaluation
./scripts/test/client_example.sh evaluacion_fisica tests/sample_evaluacion_fisica.txt

hr-reader-simple/
├── app/                          # Application code
│   ├── agent.py                  # Strands agent configuration
│   ├── api.py                    # FastAPI endpoints
│   ├── config.py                 # Settings management
│   ├── observability.py          # Langfuse integration
│   └── tools.py                  # Agent tools (S3, DynamoDB, Textract)
├── cdk/                          # Infrastructure as Code
│   ├── app.py                    # CDK app entry point
│   └── stacks/                   # CloudFormation stacks
├── lambda/                       # Lambda functions
│   ├── presigned_url/            # Generate upload URLs
│   ├── process_document/         # Trigger processing
│   └── s3_notification_config/   # S3 event configuration
└── scripts/                      # Utility scripts
    ├── deploy/                   # Deployment scripts
    │   ├── deploy.sh
    │   └── deploy_with_langfuse.sh
    ├── service/                  # Service management
    │   ├── manage_api_keys.sh
    │   ├── pause.sh
    │   └── resume.sh
    └── test/                     # Testing scripts
        ├── client_example.sh
        ├── client_test.sh
        └── demo_video.sh

# AWS
AWS_REGION=us-east-1
AWS_PROFILE=your-profile
# Bedrock
MODEL_ID=us.amazon.nova-lite-v1:0
# Langfuse (Optional)
LANGFUSE_ENABLED=true
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://cloud.langfuse.com

- lab - Laboratory results (blood tests, biochemistry, etc.)
  - Example: tests/sample_lab_result.txt
- evaluacion_fisica - Physical evaluation (anthropometry, vital signs, body composition)
  - Example: tests/sample_evaluacion_fisica.txt
- cv - Curriculum Vitae
- nomina - Payroll documents
- contrato - Employment contracts

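A client can check the document_type value before it requests an upload URL. The sketch below uses the five identifiers listed above (lab, evaluacion_fisica, cv, nomina, contrato). The function name is hypothetical, and the trimming and lowercasing are assumptions for illustration; the service itself may be stricter or looser about what it accepts.

```python
# Identifiers and descriptions copied from the list of supported document types.
SUPPORTED_DOCUMENT_TYPES = {
    "lab": "Laboratory results",
    "evaluacion_fisica": "Physical evaluation",
    "cv": "Curriculum Vitae",
    "nomina": "Payroll documents",
    "contrato": "Employment contracts",
}


def validate_document_type(value):
    """Return the normalized identifier, or raise ValueError if it is unknown."""
    # Assumption: surrounding whitespace and letter case are not significant.
    normalized = value.strip().lower()
    if normalized not in SUPPORTED_DOCUMENT_TYPES:
        allowed = ", ".join(sorted(SUPPORTED_DOCUMENT_TYPES))
        raise ValueError(f"Unknown document type {value!r}; expected one of: {allowed}")
    return normalized
```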
The system integrates with Langfuse for complete observability:
- LLM call tracing with token usage and costs
- Tool execution monitoring
- Agent reasoning cycles
- Performance metrics
Enable it by setting LANGFUSE_ENABLED=true in .env and deploying with ./scripts/deploy/deploy_with_langfuse.sh.
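An application could read the Langfuse variables from .env roughly as follows. This is a minimal sketch, not the project's config.py or observability.py. The variable names come from the example .env in this README, while the function name, the accepted truthy values, and the treatment of an unset LANGFUSE_ENABLED as disabled are assumptions.

```python
import os

REQUIRED_KEYS = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")
DEFAULT_HOST = "https://cloud.langfuse.com"


def langfuse_settings(env=None):
    """Return Langfuse settings as a dict, or None when tracing is disabled."""
    env = os.environ if env is None else env
    # Assumption: an unset flag means disabled; "1", "true" and "yes" enable it.
    flag = env.get("LANGFUSE_ENABLED", "false").strip().lower()
    if flag not in {"1", "true", "yes"}:
        return None
    # Fail early if tracing is enabled but a key is missing or empty.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Langfuse is enabled but these are unset: " + ", ".join(missing))
    return {
        "public_key": env["LANGFUSE_PUBLIC_KEY"],
        "secret_key": env["LANGFUSE_SECRET_KEY"],
        "host": env.get("LANGFUSE_HOST", DEFAULT_HOST),
    }
```

Passing a plain dict as env makes the function easy to exercise without touching the real environment.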
./scripts/service/pause.sh

./scripts/service/resume.sh

cd cdk
cdk destroy --all

- API Gateway with API key authentication
- Presigned URLs with 5-minute expiration
- App Runner not publicly accessible
- S3 bucket encryption enabled
- DynamoDB encryption at rest
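Because presigned URLs expire after 5 minutes, a client that queues uploads may want to check whether a URL is still usable before sending the file. The sketch below assumes the URL is a Signature Version 4 presigned URL, which carries the signing time in X-Amz-Date and the lifetime in seconds in X-Amz-Expires. Whether this project's Lambda produces URLs in that form is an assumption, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_url_expiry(url):
    """Return the UTC time at which a SigV4 presigned URL stops being valid."""
    params = parse_qs(urlsplit(url).query)
    # X-Amz-Date uses the ISO 8601 basic format, for example 20240101T120000Z.
    signed_at = datetime.strptime(params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    lifetime = timedelta(seconds=int(params["X-Amz-Expires"][0]))
    return signed_at + lifetime


def is_expired(url, now=None):
    """Return True once the current time has reached the expiry time."""
    now = datetime.now(timezone.utc) if now is None else now
    return now >= presigned_url_expiry(url)
```

A URL without these query parameters raises KeyError, which a caller can treat as "expiry unknown" and handle by requesting a fresh URL.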
curl -X POST $API_URL/upload \
  -H "x-api-key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document_type": "lab",
    "filename": "test.pdf",
    "content_type": "application/pdf"
  }'

curl -X PUT "<presigned_url>" \
  -H "Content-Type: application/pdf" \
  --data-binary "@document.pdf"

# Get specific document results
curl $APPRUNNER_URL/results/<document_key>
# List all unconsulted results
curl $APPRUNNER_URL/results

# Run locally
source venv/bin/activate
uvicorn app.api:app --reload

./scripts/deploy/deploy_with_langfuse.sh

This project was created for a hackathon. Feel free to fork and adapt for your needs.
MIT License - See LICENSE file for details
- AWS for Bedrock and infrastructure services
- Strands team for the Agents SDK
- Langfuse for the observability platform