An industry-oriented, portfolio-grade Data Structures and Algorithms (DSA) project implementing a single-processor batch task scheduling simulator. This system optimizes task execution order to maximize total value/profit and meet deadline constraints using a custom Max-Heap (Priority Queue) and Greedy Heuristics.
In cloud computing, operating system kernels, and project management pipelines, resources (CPU, threads, workers) are finite, while tasks are dynamic. If tasks are executed randomly or in simple arrival order (First-In, First-Out), critical deadlines can be missed, leading to system failure, SLA violations, or loss of business profit.
The Task Scheduler Optimization System is a modular Python command-line utility that simulates execution queues. It implements two distinct scheduling paradigms:
- Profit-Greedy Scheduler (Max-Heap driven): Prioritizes scheduling tasks with high profit density (value per time-unit) as late as possible before their deadlines to optimize total value.
- Earliest Deadline First (EDF) Scheduler: Focuses on task urgency by sorting tasks ascending by their deadlines to maximize completed task counts.
Given a set of
- A unique Task ID and Name.
- A Priority (criticality rank from 1 to 10).
- A Deadline (the time-step by which it must complete).
- A Duration (execution time steps required, non-preemptive).
- A Profit (value or weight earned if completed on time).
The goal is to determine the optimal sequence of tasks to execute on a single CPU resource such that we maximize the total profit earned and analyze scheduling tradeoffs (turnaround time, waiting time, and CPU utilization).
-
Custom Max-Heap (Priority Queue): Implemented from scratch in heap.py to extract the most important/highest profit task dynamically in
$O(\log N)$ time. -
Greedy Strategy: Uses profit density (
$\frac{\text{Profit}}{\text{Duration}}$ ) and deadline-matching backwards allocation to find high-yield schedules. - Sorting Algorithms: Timsort is utilized to pre-sort timelines and format data reports.
- Array-Based Execution Timeline: Simulates sequential CPU slots to track idle and active intervals.
Task-Scheduler-Optimization-System/
│
├── data/
│ └── sample_tasks.csv # Default CSV file containing simulation workloads
│
├── src/
│ ├── __init__.py
│ ├── task.py # Task class & CSV Loader
│ ├── heap.py # Custom Max-Heap implementation
│ ├── scheduler.py # Greedy and EDF scheduling engines
│ ├── metrics.py # Performance calculations & comparison tables
│ ├── report.py # ASCII visualizer & CSV/Text exporters
│ └── test_heap.py # Unit tests for verification
│
├── outputs/
│ ├── schedule_report.txt # Formatted report showing comparison and timeline
│ └── schedule_report.csv # Detailed tabular log of CPU execution steps
│
├── docs/
│ └── project_guide.md # Technical deep-dive & interview preparation guide
│
├── requirements.txt # Project python environment dependencies
├── .gitignore # Git tracking exclusions
├── main.py # CLI Interactive Controller
└── README.md # Repository landing page
- Python 3.8 or higher installed on your system.
- Standard libraries only (No external dependencies required).
-
Clone or download the project folder:
cd "Task Scheduler Optimization System"
-
Run the interactive CLI application:
- Windows (PowerShell / cmd):
python main.py
- macOS / Linux (Terminal):
python3 main.py
- Windows (PowerShell / cmd):
-
Verify the Custom Heap: You can run unit tests to verify the correctness of the heap operations:
python -m unittest src/test_heap.py
Below is an example of the terminal simulation comparison for a workload of 8 tasks:
============================================================
TASK SCHEDULER OPTIMIZATION SYSTEM
============================================================
1. Load and Run Default Sample Tasks (from CSV)
2. Load and Run Custom CSV Task File
3. Enter Tasks Manually via Command Line
4. Generate and Run Random Task Workload
5. Exit
Enter choice (1-5): 1
[ Profit-Greedy (Max-Heap) Timeline ]
Time: t=1 |t=2 |t=3 |t=4 |t=5 |t=6 |t=7 |t=8 |t=9 |t=10 |
----------------------------------------------------------------------------
CPU: [IDLE]|[ T3 ]|[ T6 ]|[ T1 ]|[ T1 ]|[ T1 ]|[ T4 ]|[ T4 ]|[IDLE]|[IDLE]|
[ Earliest Deadline First (EDF) Timeline ]
Time: t=1 |t=2 |t=3 |t=4 |t=5 |t=6 |t=7 |t=8 |
--------------------------------------------------------------
CPU: [ T3 ]|[ T6 ]|[ T5 ]|[ T5 ]|[ T8 ]|[ T8 ]|[ T4 ]|[ T4 ]|
|--------------------------------+----------------------------+--------------------------------|
| Metric | Profit-Greedy (Max-Heap) | Earliest Deadline First (EDF) |
|--------------------------------+----------------------------+--------------------------------|
| Algorithm Strategy | Greedy (Profit Optimization) | Chronological Deadline-First |
| Tasks Completed On-Time | 4/8 (50.0%) | 5/8 (62.5%) |
| Tasks Missed (Lateness) | 4 | 3 |
| Total Profit/Value Earned | $250.00 | $240.00 |
| CPU Utilization Rate | 70.0% | 100.0% |
| Average Waiting Time | 3.00 units | 2.60 units |
| Average Turnaround Time | 4.75 units | 4.20 units |
|--------------------------------+----------------------------+--------------------------------|
[Analysis] Tradeoff Analysis:
1. **Profit/Value**: Profit-Greedy earned $10.00 more than EDF because it prioritized high-value tasks.
2. **Success Rate (Throughput)**: EDF completed 1 more task(s) on-time by focusing on urgency first.
3. **Wait & Turnaround**: EDF achieved lower average waiting time, which is common as it processes tasks sequentially without gaps.
By completing and review this project, you will demonstrate:
- Core Data Structure Mastery: Writing a custom Binary Max-Heap from scratch rather than importing library wrappers.
- Heuristic Algorithm Design: Understanding the mechanics of greedy scheduling and identifying its performance boundaries.
- Software Architecture Best Practices: Developing modular, production-ready Python packages with separate folders for models, services, reports, and tests.
- Interview Readiness: Explaining core system design principles behind process scheduling, cloud optimization, and latency tradeoffs (see docs/project_guide.md).