Techietal/Task-Scheduling-Dataset

Task Scheduling Dataset for Cloud Environments

A collection of workflow task graphs for benchmarking and evaluating task scheduling algorithms in heterogeneous computing environments.

Dataset Structure

Each top-level directory represents a scientific workflow application. Subdirectories within each workflow are organized by the number of nodes (tasks) in the workflow graph.

| Workflow | Description | Node Sizes Available |
|----------|-------------|----------------------|
| FFT | Fast Fourier Transform workflows | 40, 96, 224, 512, 1152 |
| GE | Gene Expression analysis workflows | 9, 35, 135, 527, 2079 |
| Genome | Genome sequencing workflows | 50 – 1000 (in increments of 50–100) |
| LA | Linear Algebra workflows | 11, 37, 137, 529, 2081 |
| LIGO | Laser Interferometer Gravitational-Wave Observatory workflows | 50 – 1000 (in increments of 50–100) |
| Montage | Astronomical image mosaicking workflows | 50 – 1000 (in increments of 50–100) |

Directory Layout

Dataset/
├── FFT/
│   ├── 40Nodes/
│   ├── 96Nodes/
│   ├── 224Nodes/
│   ├── 512Nodes/
│   └── 1152Nodes/
├── GE/
│   ├── 9Nodes/
│   ├── 35Nodes/
│   ├── 135Nodes/
│   ├── 527Nodes/
│   └── 2079Nodes/
├── Genome/
│   ├── 50Nodes/
│   ├── 100Nodes/
│   ├── ...
│   └── 1000Nodes/
├── LA/
│   ├── 11Nodes/
│   ├── 37Nodes/
│   ├── 137Nodes/
│   ├── 529Nodes/
│   └── 2081Nodes/
├── LIGO/
│   ├── 50Nodes/
│   ├── 100Nodes/
│   ├── ...
│   └── 1000Nodes/
└── Montage/
    ├── 50Nodes/
    ├── 100Nodes/
    ├── ...
    └── 1000Nodes/
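
The layout above can be walked programmatically to discover which node counts are present for each workflow. The following is a minimal sketch that relies only on the `<N>Nodes` folder naming shown in the tree; the function name `list_node_sizes` is hypothetical, and the code makes no assumption about the files stored inside each folder.

```python
import re
from pathlib import Path

# Folder names in the layout look like "40Nodes", "1152Nodes", ...
_NODE_DIR = re.compile(r"^(\d+)Nodes$")


def list_node_sizes(dataset_root):
    """Map each workflow folder name to a sorted list of its node counts."""
    root = Path(dataset_root)
    sizes = {}
    for workflow_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts = []
        for sub in workflow_dir.iterdir():
            match = _NODE_DIR.match(sub.name)
            # Skip anything that is not a "<N>Nodes" directory.
            if sub.is_dir() and match:
                counts.append(int(match.group(1)))
        sizes[workflow_dir.name] = sorted(counts)
    return sizes
```

Calling `list_node_sizes("Dataset")` from the repository root should return a dictionary keyed by workflow name (for example `"FFT"`), assuming the checkout matches the tree above.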

Usage

These datasets can be used to evaluate scheduling algorithms on metrics such as:

  • Makespan — total execution time of the workflow
  • Resource utilization — CPU/memory efficiency across cloud instances
  • Cost — monetary cost of cloud resource allocation
  • Scalability — algorithm performance as workflow size increases
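
As an illustration of the first three metrics, the sketch below computes them from a finished schedule. The `Slot` record, the busy-time definition of utilization, and the pay-per-busy-time cost model are all assumptions made for this example; the dataset does not prescribe a schedule format or a pricing model.

```python
from collections import namedtuple

# Hypothetical record for one scheduled task; not a format defined by the dataset.
Slot = namedtuple("Slot", ["task", "processor", "start", "finish"])


def makespan(schedule):
    """Time from the earliest task start to the latest task finish."""
    return max(s.finish for s in schedule) - min(s.start for s in schedule)


def utilization(schedule, num_processors):
    """Fraction of available processor time spent running tasks (assumed definition)."""
    busy = sum(s.finish - s.start for s in schedule)
    return busy / (num_processors * makespan(schedule))


def cost(schedule, price_per_time_unit):
    """Busy time on each processor multiplied by that processor's price (assumed model)."""
    return sum((s.finish - s.start) * price_per_time_unit[s.processor] for s in schedule)


# Tiny hand-made schedule: three tasks on two processors.
example = [
    Slot("t0", "p0", 0.0, 2.0),
    Slot("t1", "p0", 2.0, 5.0),
    Slot("t2", "p1", 2.0, 4.0),
]
print(makespan(example))                       # 5.0
print(utilization(example, num_processors=2))  # 0.7
print(cost(example, {"p0": 1.0, "p1": 2.0}))   # 9.0
```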

The varying node counts within each workflow allow for scalability testing, from small workflows (9–50 nodes) to large-scale workflows (1000–2081 nodes).
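
A scalability run could be organized roughly as follows: time the same scheduler on graphs of increasing size and compare the results. This is only a sketch. `chain_dag` builds a synthetic stand-in graph and `toy_scheduler` simply returns a topological order, so both are placeholders to be swapped for a graph loaded from the dataset and the algorithm under test. It uses `graphlib` from the standard library (Python 3.9+).

```python
import time
from graphlib import TopologicalSorter


def chain_dag(n):
    """Stand-in graph: task i depends on task i-1. Replace with a dataset graph."""
    return {i: ({i - 1} if i > 0 else set()) for i in range(n)}


def toy_scheduler(dag):
    """Placeholder scheduler: returns the tasks in a valid topological order."""
    return list(TopologicalSorter(dag).static_order())


def time_by_size(scheduler, graphs):
    """Run the scheduler on each graph and return {node_count: seconds}."""
    timings = {}
    for dag in graphs:
        start = time.perf_counter()
        scheduler(dag)
        timings[len(dag)] = time.perf_counter() - start
    return timings


# Node counts borrowed from the LA workflow; the graphs themselves are synthetic.
timings = time_by_size(toy_scheduler, [chain_dag(n) for n in (11, 37, 137, 529, 2081)])
for size, seconds in timings.items():
    print(f"{size:>5} nodes: {seconds * 1e3:.3f} ms")
```

Wall-clock timings from a single run are noisy, so repeating each measurement and reporting a median is usually a safer basis for comparison.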

About

Dataset curated for simulating task scheduling of DAG-based workflows in heterogeneous processing environments.
