daggen

A Python port of the daggen tool proposed by Suter & Hunold.

This library generates random, synthetic task graphs for simulation. You can easily convert the generated task graphs into directed acyclic graphs (DAGs) in tools such as NetworkX or igraph, or export them to files. The library helps evaluate scheduling algorithms using various application configurations.

Here is an example of how to generate a DAG of 15 tasks, accepting the default parameters, and import it into igraph:

import daggen as dg
import igraph as ig

dag = dg.DAG(seed=42, num_tasks=15)
tasks, edges = dag.task_n_edge_dicts()
igraph_dag = ig.Graph.DictList(vertices=tasks, edges=edges, directed=True)

You can alternatively convert the task graph into a NetworkX' DiGraph:

import daggen as dg
import networkx as nx

dag = dag.DAG(seed=46, num_tasks=15)
tasks, edges = dag.task_n_edge_tuples()
net_dag = nx.DiGraph()
net_dag.add_nodes_from(tasks)
net_dag.add_edges_from(edges)

Task and edge attributes

Each generated task has two attributes:

computation: a sequential cost (in Flops).
alpha: the alpha parameter of Amdahl's Law, used to encode the overhead of parallelizing tasks in parallel task graphs.

Each edge represents a communication from a parent task to a child task and has a data attribute representing the amount of data transferred from parent to child.

DAG parameters

You can configure the characteristics of the generated DAGs by parametrizing the DAG() function. Following the C implementation, one can set the following parameters:

`seed`: used to seed the random number generator and ensure reproducibility.
`num_tasks`: Number of computation nodes (application tasks) in the DAG.
`min_data`: Minimum amount of data in bytes a task processes.
`max_data`: Maximum amount of data in bytes a task processes.
`min_alpha`: Minimum value for the extra parameter (e.g., Amdahl's law parameter).
`max_alpha`: Minimum value for the extra parameter.
`fat`: Width of the DAG, the maximum number of tasks executed concurrently. A small value results in a thin DAG. (e.g., chain) with low task parallelism, while a large value creates a fat DAG (e.g., fork-join). with a high degree of parallelism.
`density`: Determines the dependencies between tasks of two consecutive DAG levels.
`regular`: Regularity of the task distribution between the different levels of the DAG.
`ccr`: Communication to computation ratio. It encodes the complexity of the computation of a task depending on the size n of the dataset it processes. The encoding is as follows:
- 1 : a * n (a is a constant picked randomly between 26 and 29).
- 2 : a * n log n
- 3 : n3/2
- 0 : Random choice among the three complexities.
`jump_size`: Maximum number of levels spanned by inter-task communications, which enables DAGs with execution paths of multiple lengths.

Examples on Google Colab

A couple examples are available on this Colab notebook:

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
.github/workflows		.github/workflows
daggen		daggen
notebooks		notebooks
tests		tests
.gitignore		.gitignore
LICENSE.txt		LICENSE.txt
MANIFEST.in		MANIFEST.in
README.rst		README.rst
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

.github/workflows

.github/workflows

daggen

daggen

notebooks

notebooks

tests

tests

.gitignore

.gitignore

LICENSE.txt

LICENSE.txt

MANIFEST.in

MANIFEST.in

README.rst

README.rst

pyproject.toml

pyproject.toml

requirements.txt

requirements.txt

setup.py

setup.py

Repository files navigation

daggen

Task and edge attributes

DAG parameters

Examples on Google Colab

About

Releases

Packages

Languages

License

assuncaomarcos/daggen

Folders and files

Latest commit

History

Repository files navigation

daggen

Task and edge attributes

DAG parameters

Examples on Google Colab

About

Resources

License

Stars

Watchers

Forks

Languages