AWS Tutorial: Queueing Analysis for Database Batch Scheduling

This repository is a simple tutorial for a canonical database-engineering problem: a nightly maintenance queue where ETL jobs, refreshes, backfills, and index work compete for a limited execution window.

The repository is intentionally DynamoDB-centric. It uses DynamoDB as a workload catalog and status store, then queries a single batch night and runs a queueing simulation in Python to estimate delay, congestion, and overflow risk. It is a teaching example for capacity planning, not a production scheduler.

What This Tutorial Teaches

model a nightly queue of database jobs in DynamoDB
design the table for Query access by nightly batch date and requested window
derive queueing parameters such as lambda, mu, c, and rho
simulate wait time, queue length, and overflow risk
understand when PyTorch is useful and when it is unnecessary

Industry Standards

Two standards matter here:

For a real asynchronous work queue on AWS, the usual default is Amazon SQS, not DynamoDB. Standard AWS guidance emphasizes queue choice, retries, dead-letter queues, long polling, deduplication or ordering requirements, and idempotent consumers.
For DynamoDB workloads, the standard practice is to design for access patterns first and favor Query over table-wide Scan.

This tutorial intentionally keeps DynamoDB in the foreground because the problem is batch scheduling analysis for database engineers, not operational message brokering.

Repository Layout

README.md
LICENSE
.gitignore
requirements.txt
requirements-ml.txt
aws_costs.py
sample_jobs.py
queue_analysis.py
seed_job_requests.py
tutorial.py
pytorch_extension.py
cleanup_demo.py
docs/
  architecture.md
  costs.md
  industry_standards.md
  results.md
infra/
  dynamodb_table.yaml
tests/

Data Model

Each DynamoDB item represents one queued batch job.

batch_date
window_job_id
job_id
requested_window
submitted_at
workload_class
estimated_runtime_minutes
priority
requires_exclusive_lock
status

The primary key is:

partition key: batch_date
sort key: window_job_id, formatted as <requested_window>#<job_id>
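
As a minimal sketch of the composite sort key described above, the helpers below build and split a <requested_window>#<job_id> value. The function names make_sort_key and parse_sort_key are hypothetical and are not taken from the repository's scripts.

```python
from typing import NamedTuple


class SortKeyParts(NamedTuple):
    requested_window: str
    job_id: str


def make_sort_key(requested_window: str, job_id: str) -> str:
    """Compose the sort key as <requested_window>#<job_id>."""
    if "#" in requested_window:
        raise ValueError("requested_window must not contain '#'")
    return f"{requested_window}#{job_id}"


def parse_sort_key(window_job_id: str) -> SortKeyParts:
    """Split on the first '#', so a job_id containing '#' survives a round trip."""
    requested_window, sep, job_id = window_job_id.partition("#")
    if not sep:
        raise ValueError(f"not a <window>#<job_id> key: {window_job_id!r}")
    return SortKeyParts(requested_window, job_id)
```

Because the window is a zero-padded HH:MM string, lexicographic ordering of the sort key matches chronological ordering of the requested windows within one batch_date partition.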

Example item:

{
  "batch_date": "2026-05-13",
  "window_job_id": "01:00#JOB-20260513-001",
  "job_id": "JOB-20260513-001",
  "requested_window": "01:00",
  "submitted_at": "2026-05-13T00:07:00Z",
  "workload_class": "etl",
  "estimated_runtime_minutes": 25,
  "priority": 2,
  "requires_exclusive_lock": false,
  "status": "queued"
}

Core Workflow

The main tutorial path is:

DynamoDB -> Query batch_date -> summarize arrivals and runtimes -> queue simulation -> queueing metrics

The simulator reports:

arrival rate lambda
service rate mu
worker count c
utilization rho
average wait
p95 wait
average queue length
worker utilization
overflow risk after the batch cutoff
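
One way the first four quantities could be derived from a night's job list is sketched below. It assumes the usual multi-server convention rho = lambda / (c * mu), with lambda estimated as jobs per minute over the batch window and mu as the reciprocal of the mean estimated runtime. The names QueueParams and derive_params are hypothetical, and the repository's queue_analysis.py may define these quantities differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class QueueParams:
    lam: float  # arrival rate, jobs per minute
    mu: float   # service rate per worker, jobs per minute
    c: int      # worker count
    rho: float  # offered utilization, lam / (c * mu)


def derive_params(
    runtimes_minutes: Sequence[float], window_minutes: float, workers: int
) -> QueueParams:
    """Estimate lambda, mu, c, and rho from one batch night (illustrative only)."""
    if not runtimes_minutes or window_minutes <= 0 or workers <= 0:
        raise ValueError("need at least one job, a positive window, and workers")
    lam = len(runtimes_minutes) / window_minutes
    mean_runtime = sum(runtimes_minutes) / len(runtimes_minutes)
    mu = 1.0 / mean_runtime
    rho = lam / (workers * mu)
    return QueueParams(lam=lam, mu=mu, c=workers, rho=rho)
```

A value of rho at or above 1 means work arrives faster than the pool can clear it on average, so the backlog grows until the cutoff.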

Exclusive-lock jobs are treated as special blocking jobs that consume the full worker pool while they run. This keeps the example simple while making lock contention visible.
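
The blocking rule above can be illustrated with a small first-come-first-served simulation. This is a sketch under stated assumptions, not the repository's simulator: dispatch is strictly in arrival order, jobs are not preempted, an exclusive job waits for every worker to drain and then occupies all of them, and "overflow" counts jobs that finish after the cutoff. The names Job and simulate are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Job:
    arrival: float           # minutes after the batch window opens
    runtime: float           # minutes of service
    exclusive: bool = False  # True if the job needs the whole worker pool


def simulate(jobs: Iterable[Job], workers: int, cutoff: float) -> Tuple[float, float, int]:
    """Return (average wait, p95 wait, jobs finishing after cutoff)."""
    ordered_jobs = sorted(jobs, key=lambda j: j.arrival)  # stable for ties
    if not ordered_jobs or workers <= 0:
        raise ValueError("need at least one job and one worker")
    free_at = [0.0] * workers  # time at which each worker next becomes idle
    waits = []
    overflow = 0
    for job in ordered_jobs:
        if job.exclusive:
            # Wait for the whole pool to drain, then hold every worker.
            start = max(job.arrival, max(free_at))
            end = start + job.runtime
            free_at = [end] * workers
        else:
            # Take the worker that becomes idle first.
            idx = min(range(workers), key=free_at.__getitem__)
            start = max(job.arrival, free_at[idx])
            end = start + job.runtime
            free_at[idx] = end
        waits.append(start - job.arrival)
        if end > cutoff:
            overflow += 1
    ranked = sorted(waits)
    # Nearest-rank 95th percentile.
    p95 = ranked[max(0, math.ceil(0.95 * len(ranked)) - 1)]
    return sum(waits) / len(waits), p95, overflow
```

With two workers, two 30-minute jobs arriving at minute 0, and an exclusive job arriving at minute 5, the exclusive job cannot start until minute 30, and anything queued behind it waits until it finishes. That is the lock contention the tutorial aims to make visible.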

Setup

Install the core dependencies:

pip install -r requirements.txt

Run the local test suite:

python -m unittest discover -s tests -v

Optional PyTorch extension:

pip install -r requirements-ml.txt

Configure AWS credentials and region if you want the AWS-backed path:

aws configure

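or set the region for the current shell. The line below uses Windows Command Prompt syntax; on macOS or Linux, use export AWS_REGION=us-east-1 instead:

set AWS_REGION=us-east-1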

Execution

Seed the DynamoDB table:

python seed_job_requests.py

Run the local tutorial. By default this prints one stable case and one overloaded case:

python tutorial.py --source local

Run one selected batch night from DynamoDB:

python tutorial.py --source aws --batch-date 2026-05-13 --workers 3

Optionally filter one requested window:

python tutorial.py --source aws --batch-date 2026-05-13 --window 02:00 --workers 3
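
Given the key design in the Data Model section, a window filter like the one above could map onto a single Query: an equality condition on batch_date plus begins_with on the window_job_id sort key. The sketch below only builds the request parameters in the low-level DynamoDB client format. The function name build_query_kwargs and the table name job_requests are hypothetical, and tutorial.py may construct its query differently.

```python
from typing import Any, Dict, Optional


def build_query_kwargs(
    table_name: str, batch_date: str, window: Optional[str] = None
) -> Dict[str, Any]:
    """Build Query parameters for one batch night, optionally one window."""
    kwargs: Dict[str, Any] = {
        "TableName": table_name,  # hypothetical name supplied by the caller
        "KeyConditionExpression": "batch_date = :d",
        "ExpressionAttributeValues": {":d": {"S": batch_date}},
    }
    if window is not None:
        # The trailing '#' keeps a prefix such as "02:0" from matching by accident.
        kwargs["KeyConditionExpression"] += " AND begins_with(window_job_id, :w)"
        kwargs["ExpressionAttributeValues"][":w"] = {"S": f"{window}#"}
    return kwargs
```

The resulting dictionary can be passed to boto3.client("dynamodb").query(**kwargs). A single Query response is paginated, so a caller would repeat the request with ExclusiveStartKey set to the previous response's LastEvaluatedKey until that key is absent.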

Optional PyTorch extension:

python pytorch_extension.py

Why PyTorch Is Optional

For a first queueing tutorial, PyTorch is usually the wrong starting point. A simple queueing problem is better explained with explicit assumptions and a discrete-event simulation.

PyTorch becomes useful when:

service time changes sharply by workload class
lock contention creates nonlinear delay patterns
arrival pressure depends on many correlated upstream signals
the goal is forecasting or learned dispatch, not first-principles explanation

This repository keeps PyTorch in a separate optional script for that reason.

Limitations

The sample dataset is small and instructional.
The simulator uses a simplified dispatch policy.
Exclusive-lock jobs are modeled as full-pool blockers for clarity.
The tutorial is not a production scheduler.
The tutorial is not the canonical AWS operational queue pattern; SQS is.

Verification

The local tutorial path is intended to work without provisioning AWS resources:

python tutorial.py --source local

The AWS-backed path and the optional PyTorch extension are kept separate so the core teaching flow stays lightweight and easy to review.
