
Commit df8e673

added explanatory text to the tutorial (#80)
1 parent 629abe3 commit df8e673

File tree

1 file changed

+153
-66
lines changed


example_workflows/arithmetic/jobflow.ipynb

Lines changed: 153 additions & 66 deletions
Original file line number | Diff line number | Diff line change
@@ -24,58 +24,63 @@
2424
{
2525
"id": "982a4fbe-7cf9-45dd-84ae-9854149db0b9",
2626
"cell_type": "markdown",
27-
"source": "# jobflow",
27+
"source": [
28+
"# jobflow"
29+
],
2830
"metadata": {}
2931
},
3032
{
3133
"id": "e6180712-d081-45c7-ba41-fc5191f10427",
3234
"cell_type": "markdown",
33-
"source": "## Define workflow with jobflow",
35+
"source": [
36+
"## Define workflow with jobflow\n",
37+
"\n",
38+
"This tutorial demonstrates how to define a workflow with `jobflow` using the Python Workflow Definition (PWD) and then load that workflow with `aiida` and `pyiron`.\n",
39+
"\n",
40+
"[`jobflow`](https://joss.theoj.org/papers/10.21105/joss.05995) was developed to simplify the development of high-throughput workflows. It uses a decorator-based approach to define “Job”s that can be connected to form complex workflows (“Flow”s). `jobflow` is the workflow language of the workflow library [`atomate2`](https://chemrxiv.org/engage/chemrxiv/article-details/678e76a16dde43c9085c75e9), designed to replace [atomate](https://www.sciencedirect.com/science/article/pii/S0927025617303919), which was central to the development of the [Materials Project](https://pubs.aip.org/aip/apm/article/1/1/011002/119685/Commentary-The-Materials-Project-A-materials) database."
41+
],
3442
"metadata": {}
3543
},
44+
{
45+
"cell_type": "markdown",
46+
"source": [
47+
"First, we import the `job` decorator and the `Flow` class from jobflow, as well as the necessary modules from the Python Workflow Definition and the example arithmetic workflow."
48+
],
49+
"metadata": {
50+
"collapsed": false
51+
},
52+
"id": "69bedfb9ec12c092"
53+
},
3654
{
3755
"id": "000bbd4a-f53c-4eea-9d85-76f0aa2ca10b",
3856
"cell_type": "code",
39-
"source": "from jobflow import job, Flow",
57+
"source": [
58+
"from jobflow import job, Flow"
59+
],
4060
"metadata": {
4161
"trusted": true,
4262
"ExecuteTime": {
43-
"end_time": "2025-04-24T10:30:16.328511Z",
44-
"start_time": "2025-04-24T10:30:16.309562Z"
63+
"end_time": "2025-04-24T12:51:34.747117656Z",
64+
"start_time": "2025-04-24T12:51:33.203979325Z"
4565
}
4666
},
47-
"outputs": [
48-
{
49-
"ename": "ModuleNotFoundError",
50-
"evalue": "No module named 'jobflow'",
51-
"output_type": "error",
52-
"traceback": [
53-
"\u001B[31m---------------------------------------------------------------------------\u001B[39m",
54-
"\u001B[31mModuleNotFoundError\u001B[39m Traceback (most recent call last)",
55-
"\u001B[36mCell\u001B[39m\u001B[36m \u001B[39m\u001B[32mIn[4]\u001B[39m\u001B[32m, line 1\u001B[39m\n\u001B[32m----> \u001B[39m\u001B[32m1\u001B[39m \u001B[38;5;28;01mfrom\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[34;01mjobflow\u001B[39;00m\u001B[38;5;250m \u001B[39m\u001B[38;5;28;01mimport\u001B[39;00m job, Flow\n",
56-
"\u001B[31mModuleNotFoundError\u001B[39m: No module named 'jobflow'"
57-
]
58-
}
59-
],
60-
"execution_count": 4
67+
"outputs": [],
68+
"execution_count": 1
6169
},
6270
{
6371
"id": "06c2bd9e-b2ac-4b88-9158-fa37331c3418",
6472
"cell_type": "code",
65-
"source": "from python_workflow_definition.jobflow import write_workflow_json",
73+
"source": [
74+
"from python_workflow_definition.jobflow import write_workflow_json"
75+
],
6676
"metadata": {
6777
"trusted": true
6878
},
6979
"outputs": [],
7080
"execution_count": 2
7181
},
7282
{
73-
"metadata": {
74-
"ExecuteTime": {
75-
"end_time": "2025-04-24T10:30:04.618439Z",
76-
"start_time": "2025-04-24T10:30:04.598701Z"
77-
}
78-
},
83+
"metadata": {},
7984
"cell_type": "code",
8085
"source": [
8186
"from workflow import (\n",
@@ -85,7 +90,17 @@
8590
],
8691
"id": "f9217ce7b093b5fc",
8792
"outputs": [],
88-
"execution_count": 1
93+
"execution_count": null
94+
},
95+
{
96+
"cell_type": "markdown",
97+
"source": [
98+
"Using the `job` decorator, the functions imported from the arithmetic workflow are transformed into jobflow “Job”s. A “Job” delays the execution of a Python function so that it can be chained with other “Job”s into a workflow (“Flow”). A “Job” can return serializable outputs (e.g., a number, a dictionary, or a Pydantic model) or a so-called “Response” object, which enables dynamic workflows where the number of nodes is not known before the workflow is executed."
99+
],
100+
"metadata": {
101+
"collapsed": false
102+
},
103+
"id": "2639deadfae9c591"
89104
},
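The delayed-execution pattern this cell describes can be sketched in plain Python. The toy decorator below is a standard-library-only illustration, not jobflow's actual implementation — the real `job` decorator additionally handles output serialization, result stores, and `Response` objects:

```python
import functools

class DelayedJob:
    """Toy stand-in for a jobflow Job: records the call, runs it later."""
    def __init__(self, func, args, kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs
        self.output = None  # filled in when the job is resolved

    def resolve(self):
        # Replace any DelayedJob arguments with their already-resolved outputs
        args = [a.output if isinstance(a, DelayedJob) else a for a in self.args]
        kwargs = {k: (v.output if isinstance(v, DelayedJob) else v)
                  for k, v in self.kwargs.items()}
        self.output = self.func(*args, **kwargs)
        return self.output

def toy_job(func):
    """Decorator: calling the function builds a DelayedJob instead of running it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return DelayedJob(func, args, kwargs)
    return wrapper

@toy_job
def add(x, y):
    return x + y

a = add(x=1, y=2)   # nothing is computed yet
b = add(x=a, y=10)  # b depends on a's future output
a.resolve()
print(b.resolve())  # -> 13
```

Resolving `a` before `b` mirrors the dependency ordering a workflow engine derives from the graph of job references.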
90105
{
91106
"metadata": {
@@ -95,7 +110,9 @@
95110
}
96111
},
97112
"cell_type": "code",
98-
"source": "workflow_json_filename = \"jobflow_simple.json\"",
113+
"source": [
114+
"workflow_json_filename = \"jobflow_simple.json\""
115+
],
99116
"id": "1feba0898ee4e361",
100117
"outputs": [],
101118
"execution_count": 2
@@ -110,31 +127,17 @@
110127
"get_prod_and_div = job(_get_prod_and_div)"
111128
],
112129
"metadata": {
113-
"trusted": true,
114-
"ExecuteTime": {
115-
"end_time": "2025-04-24T10:30:05.169761Z",
116-
"start_time": "2025-04-24T10:30:05.043635Z"
117-
}
130+
"trusted": true
118131
},
119-
"outputs": [
120-
{
121-
"ename": "NameError",
122-
"evalue": "name 'job' is not defined",
123-
"output_type": "error",
124-
"traceback": [
125-
"\u001B[31m---------------------------------------------------------------------------\u001B[39m",
126-
"\u001B[31mNameError\u001B[39m Traceback (most recent call last)",
127-
"\u001B[36mCell\u001B[39m\u001B[36m \u001B[39m\u001B[32mIn[3]\u001B[39m\u001B[32m, line 1\u001B[39m\n\u001B[32m----> \u001B[39m\u001B[32m1\u001B[39m get_sum = \u001B[43mjob\u001B[49m(_get_sum)\n\u001B[32m 2\u001B[39m get_prod_and_div = job(_get_prod_and_div, data=[\u001B[33m\"\u001B[39m\u001B[33mprod\u001B[39m\u001B[33m\"\u001B[39m, \u001B[33m\"\u001B[39m\u001B[33mdiv\u001B[39m\u001B[33m\"\u001B[39m])\n",
128-
"\u001B[31mNameError\u001B[39m: name 'job' is not defined"
129-
]
130-
}
131-
],
132-
"execution_count": 3
132+
"outputs": [],
133+
"execution_count": null
133134
},
134135
{
135136
"id": "ecef1ed5-a8d3-48c3-9e01-4a40e55c1153",
136137
"cell_type": "code",
137-
"source": "obj = get_prod_and_div(x=1, y=2)",
138+
"source": [
139+
"obj = get_prod_and_div(x=1, y=2)"
140+
],
138141
"metadata": {
139142
"trusted": true
140143
},
@@ -144,7 +147,9 @@
144147
{
145148
"id": "2b88a30a-e26b-4802-89b7-79ca08cc0af9",
146149
"cell_type": "code",
147-
"source": "w = get_sum(x=obj.output.prod, y=obj.output.div)",
150+
"source": [
151+
"w = get_sum(x=obj.output.prod, y=obj.output.div)"
152+
],
148153
"metadata": {
149154
"trusted": true
150155
},
@@ -154,17 +159,31 @@
154159
{
155160
"id": "a5e5ca63-2906-47c9-bac6-adebf8643cba",
156161
"cell_type": "code",
157-
"source": "flow = Flow([obj, w])",
162+
"source": [
163+
"flow = Flow([obj, w])"
164+
],
158165
"metadata": {
159166
"trusted": true
160167
},
161168
"outputs": [],
162169
"execution_count": 8
163170
},
171+
{
172+
"cell_type": "markdown",
173+
"source": [
174+
"As jobflow itself is only a workflow language, workflows are typically executed on high-performance computers with a workflow manager such as [Fireworks](https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.3505) or [jobflow-remote](https://github.com/Matgenix/jobflow-remote). For smaller and test workflows, jobflow itself can perform a simple linear, non-parallel execution of the workflow graph. All outputs of individual jobs are saved in a database: for high-throughput applications this is typically a MongoDB database, while for testing and smaller workflows an in-memory database can be used instead."
175+
],
176+
"metadata": {
177+
"collapsed": false
178+
},
179+
"id": "27688edd256f1420"
180+
},
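The linear, non-parallel execution with an in-memory output store described in this cell can be sketched as a toy model (this is not jobflow's actual runner; a real run persists each job's output to a document store such as MongoDB):

```python
def run_linearly(tasks):
    """Toy linear executor: run tasks in order, keeping every output in an
    in-memory dict that plays the role of the results database."""
    db = {}  # job name -> output
    for name, func, dep_names in tasks:
        inputs = [db[d] for d in dep_names]  # look up upstream results
        db[name] = func(*inputs)
    return db

# The arithmetic example with x=1, y=2, expressed as (name, func, deps) tuples
def get_prod_and_div():
    return {"prod": 1 * 2, "div": 1 / 2}

def get_sum(d):
    return d["prod"] + d["div"]

results = run_linearly([
    ("prod_and_div", get_prod_and_div, []),
    ("sum", get_sum, ["prod_and_div"]),
])
print(results["sum"])  # -> 2.5
```

Because execution is strictly in list order, each job can assume all of its dependencies are already present in the store when it runs.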
164181
{
165182
"id": "e464da97-16a1-4772-9a07-0a47f152781d",
166183
"cell_type": "code",
167-
"source": "write_workflow_json(flow=flow, file_name=workflow_json_filename)",
184+
"source": [
185+
"write_workflow_json(flow=flow, file_name=workflow_json_filename)"
186+
],
168187
"metadata": {
169188
"trusted": true
170189
},
@@ -174,7 +193,9 @@
174193
{
175194
"id": "bca646b2-0a9a-4271-966a-e5903a8c9031",
176195
"cell_type": "code",
177-
"source": "!cat {workflow_json_filename}",
196+
"source": [
197+
"!cat {workflow_json_filename}"
198+
],
178199
"metadata": {
179200
"trusted": true
180201
},
@@ -187,16 +208,34 @@
187208
],
188209
"execution_count": 10
189210
},
211+
{
212+
"cell_type": "markdown",
213+
"source": [
214+
"Finally, you can write the workflow data into a JSON file to be imported later."
215+
],
216+
"metadata": {
217+
"collapsed": false
218+
},
219+
"id": "65389ef27c38fdec"
220+
},
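The contents of the exported file are not reproduced in this diff. Purely as an illustration of what a node-and-edge workflow serialization can look like, the sketch below writes a small graph to JSON — the field names here are hypothetical and not necessarily the actual PWD schema:

```python
import json

# Hypothetical node/edge layout -- field names are illustrative only,
# not necessarily what write_workflow_json() actually produces.
workflow = {
    "nodes": [
        {"id": 0, "function": "workflow.get_prod_and_div"},
        {"id": 1, "function": "workflow.get_sum"},
    ],
    "edges": [
        {"target": 1, "targetPort": "x", "source": 0, "sourcePort": "prod"},
        {"target": 1, "targetPort": "y", "source": 0, "sourcePort": "div"},
    ],
}

with open("workflow_sketch.json", "w") as f:
    json.dump(workflow, f, indent=2)
```

Serializing only function references plus typed edges is what makes such a file engine-neutral: any backend that can resolve the function names can rebuild the graph.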
190221
{
191222
"id": "87a27540-c390-4d34-ae75-4739bfc4c1b7",
192223
"cell_type": "markdown",
193-
"source": "## Load Workflow with aiida",
224+
"source": [
225+
"## Load Workflow with aiida\n",
226+
"\n",
227+
"In this part, we will demonstrate how to import the `jobflow` workflow into `aiida` via the PWD."
228+
],
194229
"metadata": {}
195230
},
196231
{
197232
"id": "66a1b3a6-3d3b-4caa-b58f-d8bc089b1074",
198233
"cell_type": "code",
199-
"source": "from aiida import load_profile\n\nload_profile()",
234+
"source": [
235+
"from aiida import load_profile\n",
236+
"\n",
237+
"load_profile()"
238+
],
200239
"metadata": {
201240
"trusted": true
202241
},
@@ -215,17 +254,32 @@
215254
{
216255
"id": "4679693b-039b-45cf-8c67-5b2b3d705a83",
217256
"cell_type": "code",
218-
"source": "from python_workflow_definition.aiida import load_workflow_json",
257+
"source": [
258+
"from python_workflow_definition.aiida import load_workflow_json"
259+
],
219260
"metadata": {
220261
"trusted": true
221262
},
222263
"outputs": [],
223264
"execution_count": 12
224265
},
266+
{
267+
"cell_type": "markdown",
268+
"source": [
269+
"We import the necessary modules from `aiida` and the PWD, and then load the workflow JSON file."
270+
],
271+
"metadata": {
272+
"collapsed": false
273+
},
274+
"id": "cc7127193d31d8ef"
275+
},
225276
{
226277
"id": "68c41a61-d185-47e8-ba31-eeff71d8b2c6",
227278
"cell_type": "code",
228-
"source": "wg = load_workflow_json(file_name=workflow_json_filename)\nwg",
279+
"source": [
280+
"wg = load_workflow_json(file_name=workflow_json_filename)\n",
281+
"wg"
282+
],
229283
"metadata": {
230284
"trusted": true
231285
},
@@ -246,10 +300,22 @@
246300
],
247301
"execution_count": 13
248302
},
303+
{
304+
"cell_type": "markdown",
305+
"source": [
306+
"Finally, we are now able to run the workflow with `aiida`."
307+
],
308+
"metadata": {
309+
"collapsed": false
310+
},
311+
"id": "4816325767559bbe"
312+
},
249313
{
250314
"id": "05228ece-643c-420c-8df8-4ce3df379515",
251315
"cell_type": "code",
252-
"source": "wg.run()",
316+
"source": [
317+
"wg.run()"
318+
],
253319
"metadata": {
254320
"trusted": true
255321
},
@@ -265,13 +331,19 @@
265331
{
266332
"id": "2c942094-61b4-4e94-859a-64f87b5bec64",
267333
"cell_type": "markdown",
268-
"source": "## Load Workflow with pyiron_base",
334+
"source": [
335+
"## Load Workflow with pyiron_base\n",
336+
"\n",
337+
"In this part, we will demonstrate how to import the `jobflow` workflow into `pyiron` via the PWD."
338+
],
269339
"metadata": {}
270340
},
271341
{
272342
"id": "ea102341-84f7-4156-a7d1-c3ab1ea613a5",
273343
"cell_type": "code",
274-
"source": "from python_workflow_definition.pyiron_base import load_workflow_json",
344+
"source": [
345+
"from python_workflow_definition.pyiron_base import load_workflow_json"
346+
],
275347
"metadata": {
276348
"trusted": true
277349
},
@@ -281,7 +353,10 @@
281353
{
282354
"id": "8f2a621d-b533-4ddd-8bcd-c22db2f922ec",
283355
"cell_type": "code",
284-
"source": "delayed_object_lst = load_workflow_json(file_name=workflow_json_filename)\ndelayed_object_lst[-1].draw()",
356+
"source": [
357+
"delayed_object_lst = load_workflow_json(file_name=workflow_json_filename)\n",
358+
"delayed_object_lst[-1].draw()"
359+
],
285360
"metadata": {
286361
"trusted": true
287362
},
@@ -300,7 +375,9 @@
300375
{
301376
"id": "cf80267d-c2b0-4236-bf1d-a57596985fc1",
302377
"cell_type": "code",
303-
"source": "delayed_object_lst[-1].pull()",
378+
"source": [
379+
"delayed_object_lst[-1].pull()"
380+
],
304381
"metadata": {
305382
"trusted": true
306383
},
@@ -322,14 +399,24 @@
322399
"execution_count": 17
323400
},
324401
{
325-
"id": "9d819ed0-689c-46a7-9eff-0afb5ed66efc",
326-
"cell_type": "code",
327-
"source": "",
402+
"cell_type": "markdown",
403+
"source": [
404+
"Here, the procedure is the same as before: import the `load_workflow_json` function from the PWD's `pyiron_base` module, load the workflow JSON file, and run the workflow with `pyiron`."
405+
],
328406
"metadata": {
329-
"trusted": true
407+
"collapsed": false
330408
},
409+
"id": "9414680d1cbc3b2e"
410+
},
411+
{
412+
"cell_type": "code",
413+
"execution_count": null,
331414
"outputs": [],
332-
"execution_count": null
415+
"source": [],
416+
"metadata": {
417+
"collapsed": false
418+
},
419+
"id": "c199b28f3c0399cc"
333420
}
334421
]
335422
}
