Skip to content

tianfa-lee/sql-batch-job

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

SQL Batch Job Execution Sequencer

Project Structure

sql-batch-job/
├── schema/
│   └── 01_create_tables.sql       # Table definitions and indexes
├── data/
│   └── 02_seed_data.sql           # Sample batch job data (13 steps, 22 rules)
├── src/
│   └── 04_execution_order_view.sql  # Core solution: recursive CTE view
└── tests/
    └── 03_tests.sql               # 9 automated assertion-based tests

Note: Files are numbered to reflect the intended execution order. The tests/03_tests.sql is numbered ahead of src/04_execution_order_view.sql because of TDD approach when working on the assessment


How to Run

Prerequisites

  • PostgreSQL 12 or later (required for CREATE OR REPLACE VIEW with recursive CTEs and DO $$ ... $$ anonymous blocks)

Steps

Run scripts in order:

-- 1. Create schema
\i schema/01_create_tables.sql

-- 2. Seed data
\i data/02_seed_data.sql

-- 3. Create the view (core solution)
\i src/04_execution_order_view.sql

-- 4. Run tests
\i tests/03_tests.sql

Or chain them in a single psql call:

psql -U <user> -d <database> \
  -f schema/01_create_tables.sql \
  -f data/02_seed_data.sql \
  -f src/04_execution_order_view.sql \
  -f tests/03_tests.sql

Solution Design

Core Logic — batch_execution_order View (04_execution_order_view.sql)

The view uses a recursive CTE with three stages:

  1. anchor — Seeds the recursion with all steps that have no dependency (STEP_DEP_ID = 0), assigning them execution_level = 1.

  2. level_resolver — Recursively walks the dependency graph, incrementing the execution level at each hop.

  3. final_level — Collapses multiple paths to the same step by taking MAX(execution_level). This correctly handles steps with multiple dependencies: a step that depends on both a Level 2 and a Level 3 step will resolve to Level 4, ensuring all its parents have completed first.

The final SELECT joins back to prog_name to return the human-readable procedure name alongside the execution level.

Key Design Decisions

Decision Rationale
MAX(execution_level) in final_level A step with multiple dependencies must wait for all of them. Taking the maximum level across all paths guarantees this.
Recursive CTE over procedural logic Pure SQL, portable within any SQL engine supporting SQL:1999 recursive CTEs. No stored procedures or application code required.
View (not materialised) Data stays live and consistent with the underlying tables. Re-query the view after any data change to get an updated plan.
Indexes on (unit_nbr, step_dep_id) and (unit_nbr, step_seq_id) The recursive join hits dependency_rules on every iteration; covering indexes on both sides of the join keep performance predictable as job sizes grow.

Schema

prog_name

Column Type Description
unit_nbr INTEGER Batch job identifier
step_seq_id INTEGER Unique step ID within the batch job
step_prog_name VARCHAR(200) Name of the stored procedure to execute

Primary key: (unit_nbr, step_seq_id)

dependency_rules

Column Type Description
unit_nbr INTEGER Batch job identifier
rule_id INTEGER Unique rule ID within the batch job
step_seq_id INTEGER The step that has the dependency
step_dep_id INTEGER The step that must complete first (0 = no dependency)

Primary key: (unit_nbr, rule_id)


Tests

Nine automated tests are included in 03_tests.sql, using PostgreSQL DO blocks with ASSERT statements. Passing tests emit a RAISE NOTICE message; failing tests raise an exception with a descriptive message.

# Test What It Verifies
1 Step 1 is Level 1 The root step with no dependencies executes first
2 Step 2 is Level 2 A step depending on Step 1 runs in the next wave
3 Steps 3 & 4 are both Level 3 Steps sharing the same dependency run in parallel
4 Steps 5–9 are all Level 4 Steps with multiple dependencies wait for all of them
5 Step 10 is Level 5 Fan-in step correctly waits for all Level 4 steps
6 Steps 11, 12, 13 are Levels 6, 7, 8 Strictly sequential tail correctly ordered
7 No duplicate steps in output Each step appears exactly once in the view
8 Step count matches prog_name No steps are silently dropped or added
9 No dependency ordering violations For every rule, the parent's level is strictly less than the child's level

Design Document

Refer to the design_doc.docx for the assumed business context, diagram visualization, assumptions made and possible improvements.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors