SQS Microservice - Segment Relationship Processor

A production-ready microservice that consumes messages from AWS SQS, processes segment relationships using a Neo4j graph database, and stores the results in PostgreSQL. Built for high-throughput processing with support for multiple concurrent workers.

🌟 Features

  • AWS SQS Consumer: Reliable message consumption with automatic retry and error handling
  • Multi-Worker Support: Race condition protection for concurrent processing
  • Atomic Task Claiming: Prevents duplicate processing across multiple workers
  • Dragonfly Cache: High-performance caching for Neo4j alignment queries
  • Graph Database Integration: Neo4j for complex relationship traversal (BFS algorithm)
  • PostgreSQL Storage: Persistent storage of job status and results
  • Structured Logging: Production-ready logging with proper log levels
  • Database Migrations: Alembic integration for schema management
  • Error Recovery: Automatic retry mechanism with error tracking

πŸ—οΈ Architecture

┌─────────────────────────────────────────────────┐
│            SQS Consumer Workers                 │
│  (Multiple instances supported with atomic      │
│   task claiming to prevent duplicates)          │
├─────────────────────────────────────────────────┤
│  ┌───────────────────────────────────────────┐  │
│  │  1. Receive message from SQS              │  │
│  │  2. Atomically claim task (QUEUED →       │  │
│  │     IN_PROGRESS)                          │  │
│  │  3. Query Neo4j for related segments      │  │
│  │  4. Store results in PostgreSQL           │  │
│  │  5. Update job completion status          │  │
│  │  6. Send completion message to SQS        │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
          │                           │
          ▼                           ▼
    ┌──────────┐               ┌──────────┐
    │ AWS SQS  │               │  Neo4j   │
    │  Queue   │               │ Database │
    └──────────┘               └──────────┘
          │
          ▼
    ┌──────────┐
    │PostgreSQL│
    │ Database │
    └──────────┘

🚀 Quick Start

Prerequisites

  • Python 3.12+
  • AWS Account with SQS access
  • PostgreSQL database
  • Neo4j database (cloud or self-hosted)

Installation

  1. Clone the repository

git clone <repository-url>
cd sqs-microservice

  2. Create a virtual environment

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

  3. Install dependencies

pip install -r requirements.txt

  4. Configure the environment

cp env.example .env
# Edit .env with your credentials

  5. Run database migrations

alembic upgrade head

  6. Start the consumer

python -m app.main

📝 Configuration

Environment Variables

Create a .env file with the following variables:

# Neo4j Database
NEO4J_URI=neo4j+s://xxx.databases.neo4j.io
NEO4J_USER=neo4j
NEO4J_PASSWORD=your-password

# PostgreSQL Database
POSTGRES_URL=postgresql://user:password@host:5432/database

# AWS Configuration
AWS_REGION=us-east-1
AWS_ACCESS_KEY_ID=your-access-key
AWS_SECRET_ACCESS_KEY=your-secret-key

# SQS Queues
SQS_QUEUE_URL=https://sqs.region.amazonaws.com/account/queue-name
SQS_COMPLETED_QUEUE_URL=https://sqs.region.amazonaws.com/account/completed-queue

# Dragonfly Cache (Optional - improves performance)
DRAGONFLY_URL=redis://default:password@your-instance.dragonflydb.cloud:6379

See env.example for a complete template.
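
The application reads these values through the get() helper in app/config.py (used throughout this README). A minimal sketch of such a helper, assuming python-dotenv loads the .env file; the actual implementation may differ:

import os

from dotenv import load_dotenv  # assumption: python-dotenv is a dependency

load_dotenv()  # pull variables from .env into the process environment

def get(key: str, default: str | None = None) -> str | None:
    """Return a configuration value from the environment."""
    return os.environ.get(key, default)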

🔄 How It Works

Message Flow

  1. Message Reception: Worker receives a message from SQS containing (a Pydantic sketch of this payload appears after this list):

    {
      "job_id": "uuid",
      "manifestation_id": "string",
      "segment_id": "string",
      "start": 0,
      "end": 100
    }
  2. Atomic Task Claiming: Worker attempts to claim the task by updating status from QUEUED to IN_PROGRESS. If another worker already claimed it, this worker skips processing.

  3. Relationship Processing: Worker queries Neo4j to find all related segments using BFS traversal through alignment pairs.

  4. Result Storage: Worker stores the related segments in PostgreSQL with the segment task record.

  5. Job Completion: When all segments are processed, a completion message is sent to the completed queue.
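
The payload from step 1 maps naturally onto a Pydantic model. app/models.py holds the project's Pydantic models; the class below is a hypothetical sketch built from the JSON fields above:

from uuid import UUID

from pydantic import BaseModel

class SegmentMessage(BaseModel):  # hypothetical name; see app/models.py
    job_id: UUID
    manifestation_id: str
    segment_id: str
    start: int
    end: int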

Race Condition Protection

The microservice uses atomic task claiming to prevent duplicate processing:

# Only updates if status is currently "QUEUED"
UPDATE segment_task 
SET status = 'IN_PROGRESS' 
WHERE job_id = ? AND segment_id = ? AND status = 'QUEUED'

This ensures that even with multiple workers consuming from the same queue, each segment task is claimed, and therefore processed, by only one worker.
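
Expressed with SQLAlchemy, the claim is a guarded UPDATE whose row count tells the worker whether it won the race. A minimal sketch of what the service's _try_claim_segment_task could look like (the session handling and function signature here are assumptions):

from sqlalchemy.orm import Session

from app.db.models import SegmentTask  # see Database Schema below

def try_claim_segment_task(session: Session, job_id: str, segment_id: str) -> bool:
    """Atomically flip QUEUED -> IN_PROGRESS; True means this worker won."""
    claimed = (
        session.query(SegmentTask)
        .filter_by(job_id=job_id, segment_id=segment_id, status="QUEUED")
        .update({"status": "IN_PROGRESS"}, synchronize_session=False)
    )
    session.commit()
    return claimed == 1  # 0 rows updated: another worker claimed it first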

📊 Database Schema

RootJob Table

- job_id (UUID, PK)
- manifestation_id (String)
- total_segments (Integer)
- completed_segments (Integer)
- status (Enum: QUEUED, IN_PROGRESS, COMPLETED, FAILED)
- created_at (Timestamp)
- updated_at (Timestamp)

SegmentTask Table

- task_id (UUID, PK)
- job_id (UUID, FK)
- segment_id (String)
- status (Enum: QUEUED, IN_PROGRESS, COMPLETED, RETRYING, FAILED)
- result_json (JSONB) - Stores related segments
- result_location (String, optional)
- error_message (Text, nullable)
- created_at (Timestamp)
- updated_at (Timestamp)
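
A sketch of how these tables might be declared as SQLAlchemy models (the real definitions live in app/db/models.py and may differ in detail, e.g. proper Enum columns):

import uuid

from sqlalchemy import Column, DateTime, ForeignKey, Integer, String, Text, func
from sqlalchemy.dialects.postgresql import JSONB, UUID
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class RootJob(Base):
    __tablename__ = "root_job"

    job_id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    manifestation_id = Column(String, nullable=False)
    total_segments = Column(Integer, nullable=False)
    completed_segments = Column(Integer, default=0)
    status = Column(String, default="QUEUED")  # Enum in the real schema
    created_at = Column(DateTime, server_default=func.now())
    updated_at = Column(DateTime, server_default=func.now(), onupdate=func.now())

class SegmentTask(Base):
    __tablename__ = "segment_task"

    task_id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    job_id = Column(UUID(as_uuid=True), ForeignKey("root_job.job_id"))
    segment_id = Column(String, nullable=False)
    status = Column(String, default="QUEUED")  # Enum in the real schema
    result_json = Column(JSONB)  # stores related segments
    result_location = Column(String, nullable=True)
    error_message = Column(Text, nullable=True)
    created_at = Column(DateTime, server_default=func.now())
    updated_at = Column(DateTime, server_default=func.now(), onupdate=func.now())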

🚒 Deployment

Render (Background Worker)

  1. Create a new Background Worker on Render
  2. Environment: Select Python 3
  3. Build Command: pip install -r requirements.txt
  4. Start Command: python -m app.main
  5. Add environment variables from your .env file
  6. Deploy: Render will automatically deploy when you push to main branch

Important: Create a runtime.txt file to specify the Python version:

python-3.12.8

Running Multiple Workers

To scale processing, deploy multiple instances:

On Render:

  • Increase the number of instances in the service settings
  • Each instance will automatically claim tasks atomically

Locally:

# Terminal 1
python -m app.main

# Terminal 2
python -m app.main

# Terminal 3
python -m app.main

All workers will consume from the same queue without duplicating work.

📁 Project Structure

sqs-microservice/
├── app/
│   ├── main.py                 # SQS consumer entry point
│   ├── tasks.py                # Task processing logic
│   ├── relation.py             # API endpoints (optional)
│   ├── config.py               # Configuration management
│   ├── models.py               # Pydantic models
│   ├── neo4j_database.py       # Neo4j integration
│   ├── neo4j_quries.py         # Cypher queries
│   ├── neo4j_database_validator.py
│   ├── alembic/                # Database migrations
│   │   └── versions/
│   ├── alembic.ini
│   └── db/
│       ├── postgres.py         # PostgreSQL connection
│       └── models.py           # SQLAlchemy models
├── requirements.txt            # Python dependencies
├── runtime.txt                 # Python version for deployment
├── env.example                 # Environment template
├── purge_sqs.py                # Utility to purge SQS queue
└── README.md

🔧 Key Components

1. SQS Consumer (app/main.py)

The main consumer that polls SQS and processes messages:

from app.config import get  # env-backed configuration helper

queue_url = get("SQS_QUEUE_URL")
consumer = SimpleConsumer(
    queue_url=queue_url,
    region=get("AWS_REGION"),
    polling_wait_time_ms=50,
)
consumer.start()  # blocks, polling SQS and dispatching each message

2. Task Processor (app/tasks.py)

Handles the business logic:

  • Atomic task claiming
  • Neo4j relationship queries
  • Result storage
  • Error handling and retries

3. Neo4j Database (app/neo4j_database.py)

Graph database operations (a traversal sketch follows this list):

  • BFS traversal for related segments
  • Alignment pair queries
  • Segment overlap detection
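
A minimal sketch of such a BFS with the official neo4j Python driver (the Segment label, ALIGNED_WITH relationship, and id property are assumptions, not the repository's actual graph schema):

from collections import deque

from neo4j import GraphDatabase

from app.config import get

def related_segments(driver, start_id: str) -> set[str]:
    """BFS over alignment edges, collecting every reachable segment id."""
    visited, queue = {start_id}, deque([start_id])
    with driver.session() as session:
        while queue:
            current = queue.popleft()
            result = session.run(
                "MATCH (:Segment {id: $id})-[:ALIGNED_WITH]-(t:Segment) "
                "RETURN t.id AS id",
                id=current,
            )
            for record in result:
                if record["id"] not in visited:
                    visited.add(record["id"])
                    queue.append(record["id"])
    return visited - {start_id}

driver = GraphDatabase.driver(
    get("NEO4J_URI"), auth=(get("NEO4J_USER"), get("NEO4J_PASSWORD"))
)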

4. Database Models (app/db/models.py)

SQLAlchemy models for job and task tracking.

πŸ› Troubleshooting

Consumer Not Starting

# Check environment variables
python -c "from app.config import get; print(get('SQS_QUEUE_URL'))"

# Test AWS credentials
aws sqs get-queue-attributes --queue-url <YOUR_QUEUE_URL>

Database Connection Issues

# Test PostgreSQL
python -c "from app.db.postgres import engine; engine.connect(); print('✓ PostgreSQL connected')"

# Test Neo4j
python -c "from app.neo4j_database import Neo4JDatabase; db = Neo4JDatabase(); print('✓ Neo4j connected')"

No Messages Being Processed

  1. Check SQS queue has messages: aws sqs get-queue-attributes --queue-url <URL> --attribute-names ApproximateNumberOfMessages
  2. Verify the visibility timeout is appropriate (recommended: 60+ seconds; see the command after this list)
  3. Check worker logs for errors
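
If the timeout needs raising, it can be set with the AWS CLI (value in seconds):

aws sqs set-queue-attributes --queue-url <YOUR_QUEUE_URL> --attributes VisibilityTimeout=60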

Duplicate Processing

If you see duplicate processing when running multiple workers:

  1. Ensure all workers are using the same database
  2. Check that _try_claim_segment_task is being called before processing
  3. Verify database transactions are properly committed

📊 Monitoring

Logging

The application uses structured logging:

2025-11-18 10:30:45 - app.tasks - INFO - Processing segment ABC123 for job 550e8400
2025-11-18 10:30:45 - app.tasks - INFO - Attempting to claim segment task ABC123
2025-11-18 10:30:45 - app.tasks - INFO - Successfully claimed segment task ABC123
2025-11-18 10:30:46 - app.neo4j_database - INFO - Starting BFS traversal from manifestation XYZ

Job Status

Query the database to check job progress:

SELECT 
    job_id,
    manifestation_id,
    completed_segments,
    total_segments,
    status,
    updated_at
FROM root_job
WHERE status = 'IN_PROGRESS'
ORDER BY updated_at DESC;
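
Per-segment progress for a single job can be checked against the segment_task table (the job_id below is a placeholder):

SELECT
    status,
    COUNT(*) AS segments
FROM segment_task
WHERE job_id = '<job-id>'
GROUP BY status;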

🧪 Testing

Utility Scripts

Purge SQS Queue (for testing):

python purge_sqs.py

Manual Testing

  1. Send a test message to SQS (see the boto3 example after this list)
  2. Check logs for processing
  3. Query database for results
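
For step 1, a test message can be sent with boto3, reusing the payload format from the How It Works section (queue URL and IDs below are placeholders):

import json

import boto3

sqs = boto3.client("sqs", region_name="us-east-1")
sqs.send_message(
    QueueUrl="<YOUR_QUEUE_URL>",
    MessageBody=json.dumps({
        "job_id": "550e8400-e29b-41d4-a716-446655440000",
        "manifestation_id": "example-manifestation",
        "segment_id": "example-segment",
        "start": 0,
        "end": 100,
    }),
)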

🔐 Security Best Practices

  • ✅ Never commit the .env file
  • ✅ Use IAM roles instead of access keys (when running on AWS)
  • ✅ Rotate credentials regularly
  • ✅ Use connection pooling for database connections
  • ✅ Set appropriate SQS visibility timeouts
  • ✅ Monitor dead-letter queues for failed messages

⚡ Performance Tips

  1. Adjust Polling Wait Time: Increase polling_wait_time_ms to reduce API calls
  2. Database Connection Pooling: Already configured in PostgreSQL connection
  3. Neo4j Connection Reuse: Singleton driver pattern implemented
  4. Multiple Workers: Scale horizontally for high throughput
  5. SQS Batch Size: Consumer handles messages one at a time for reliable processing

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

[Add your license here]

📞 Support

For issues or questions:

  1. Check the troubleshooting section
  2. Review application logs
  3. Check database records for task status
  4. Open an issue on GitHub

Built with Python, AWS SQS, PostgreSQL, and Neo4j 🚀
