
Conversation

@jan-janssen (Member) commented May 23, 2025

Status of the Implementation for pyiron_workflow

| Example Name | Export to Python Workflow Definition | Import from Python Workflow Definition |
| --- | --- | --- |
| arithmetic | | |
| nfdi4ing | | |
| quantum_espresso | | |

@jan-janssen jan-janssen marked this pull request as draft May 23, 2025 08:29
@liamhuber

Hi @jan-janssen, I'm just migrating your e-mail comment here

The export from pyiron_workflow works fine, at least for the first of the three examples, but the import from the Python Workflow Definition to pyiron_workflow currently still has issues, so I was looking for existing code I could recycle.

I forked the repo. I didn't install all the dependencies for the other managers, but for the arithmetic example I can see that

  • ✅ building the pyiron_workflow workflow up from pieces works fine and saving it makes (up to some ordering) the same json file
  • ❌ loading the existing workflow.json file directly doesn't produce the correct pyiron_workflow.Workflow object

Later this morning I'll take a crack at debugging python_workflow_definition.pyiron_workflow.load_workflow_json to resolve the failing step.

@jan-janssen (Member, Author)

@liamhuber Thanks a lot, based on your pull request I was able to get the arithmetic example to work as well as the nfdi4ing example. The missing part is the quantum espresso example, here we use the following get_dict() function to merge the outputs of multiple individual jobs into one dictionary:

```python
def get_dict(**kwargs) -> dict:
    return {k: v for k, v in kwargs.items()}
```

This is not compatible with the type checking from pyiron_workflow so the tests currently fail.
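The incompatibility can be seen with plain Python introspection (a minimal sketch, independent of pyiron_workflow): a **kwargs-only signature exposes no named parameters, so an engine that derives input channels from the signature has nothing to work with before the function is actually called.

```python
import inspect

def get_dict(**kwargs) -> dict:
    return dict(kwargs)

# The signature contains only a VAR_KEYWORD parameter -- no named fields
# exist at class-definition ("recipe") time for a workflow engine to
# turn into typed input channels.
params = inspect.signature(get_dict).parameters
print([p.kind.name for p in params.values()])  # ['VAR_KEYWORD']
```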

@jan-janssen jan-janssen changed the title Implement Python Workflow Definition in pyiron_workflow [New WfMS] Implement Python Workflow Definition in pyiron_workflow May 24, 2025
@liamhuber

Super, glad it is working so well!

Unfortunately the function you show for the quantum espresso example will never work. In pyiron_workflow, we insist that each node's IO is known already at the class level (i.e. at recipe time).

You might be able to hack something together with the inputs-to-dict meta-node (which creates a subclass of a generic inputs-to-dictionary node with fixed input channels). Better, IMO, would be to write a more explicit function for the workflow.
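The meta-node idea can be illustrated in plain Python (a sketch of the concept only, not the pyiron_workflow API; the factory name is made up): a factory receives the field names up front, so the resulting callable's accepted inputs are fixed before anything runs.

```python
def make_inputs_to_dict(input_specification):
    """Return a callable whose accepted inputs are fixed by the specification."""
    def inputs_to_dict_node(**values):
        unexpected = set(values) - set(input_specification)
        if unexpected:
            raise TypeError(f"unexpected inputs: {sorted(unexpected)}")
        # Order the output by the specification, not by call order
        return {key: values[key] for key in input_specification}
    return inputs_to_dict_node

collect = make_inputs_to_dict(["volume", "energy"])
print(collect(volume=16.2, energy=-4.7))  # {'volume': 16.2, 'energy': -4.7}
```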

@jan-janssen (Member, Author)

The challenge is that the calculate_qe() function takes a dictionary as an input with one parameter being the structure of the previous step, so we use the above function to insert the structure into the dictionary.

@liamhuber Can you explain a bit more how the hacky solution would work?

@liamhuber

> The challenge is that the calculate_qe() function takes a dictionary as an input with one parameter being the structure of the previous step, so we use the above function to insert the structure into the dictionary.

The problem isn't the dict, it's the **kwargs. If you know a priori what fields are going in and explicitly write a function that takes each and puts them in a dictionary, everything should work with pyiron_workflow out-of-the-box.
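For the quantum espresso case, such an explicit function might look like the following (the function name and fields are hypothetical, chosen only to illustrate the pattern of inserting the structure into a parameter dictionary):

```python
# Hypothetical explicit replacement for get_dict(**kwargs): the fields are
# fixed in the signature, so the node's input channels are known up front.
def make_qe_input(input_dict: dict, structure) -> dict:
    merged = dict(input_dict)        # copy the static QE parameters
    merged["structure"] = structure  # insert the structure from the previous step
    return merged

print(make_qe_input({"kpts": (3, 3, 3)}, structure="relaxed_cell"))
```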

> @liamhuber Can you explain a bit more how the hacky solution would work?

Not really, I just wonder if there's some way to abuse inputs_to_dict to work in a bespoke way for this particular workflow.

In general though,

```python
def get_dict(**kwargs) -> dict:
    return {k: v for k, v in kwargs.items()}
```

seems like a bad node to me. It doesn't have clear handles for its IO at the recipe level -- you only actually know what the node's interface looks like at runtime. I haven't dug into how you're getting it to work for the other examples, so maybe you have some graceful way around this.

@jan-janssen (Member, Author)

@liamhuber I was able to implement a workaround for importing the Quantum ESPRESSO workflow from the Python Workflow Definition to pyiron_workflow. For the other direction, exporting from pyiron_workflow to the Python Workflow Definition, I am still struggling with the inputs_to_dict() function. For some reason the nodes are already executed when I call inputs_to_dict(), as you can see in the demonstration notebook:
https://github.com/pythonworkflow/python-workflow-definition/blob/pyiron_workflow/example_workflows/quantum_espresso/pyiron_workflow.ipynb

Furthermore, the inputs_to_dict() node is not correctly visualized when I draw the corresponding workflow. I created a small example to demonstrate this in a separate issue:
pyiron/pyiron_workflow#648

@liamhuber

@jan-janssen, following the pyiron_workflow issue, let me know if you are still having hiccups once you skip the "generator" step. Fingers crossed that will clear up the remaining issues.

@jan-janssen (Member, Author)

@liamhuber Thanks a lot, that definitely clarified the issue with the execution before calling run(). I updated the corresponding notebook and I am optimistic that it is now possible to export it to the Python Workflow Definition.

Still, one confusing issue remains. When I use the following code, everything works fine:

```python
volume_dict = {"s_" + str(i): j["volume"] for i, j in enumerate(job_strain_lst)}
wf.volume_dict = inputs_to_dict(
    input_specification=list(volume_dict.keys()),
    **volume_dict,
)
energy_dict = {"s_" + str(i): j["energy"] for i, j in enumerate(job_strain_lst)}
wf.energy_dict = inputs_to_dict(
    input_specification=list(energy_dict.keys()),
    **energy_dict,
)
wf.volume_lst = get_values_from_dict(
    input_dict=wf.volume_dict
)
wf.energy_lst = get_values_from_dict(
    input_dict=wf.energy_dict
)
```

But when I use the following modified code, it does not:

```python
volume_dict = {"s_" + str(i): j["volume"] for i, j in enumerate(job_strain_lst)}
wf.volume_lst = get_values_from_dict(
    input_dict=inputs_to_dict(
        input_specification=list(volume_dict.keys()),
        **volume_dict,
    )
)
energy_dict = {"s_" + str(i): j["energy"] for i, j in enumerate(job_strain_lst)}
wf.energy_lst = get_values_from_dict(
    input_dict=inputs_to_dict(
        input_specification=list(energy_dict.keys()),
        **energy_dict,
    )
)
```

I get the following error message:

```
KeyError: 'The channel /my_workflow/volume_lst.input_dict has a connection to the upstream channel /InputsToDictm3100931232959297192.dict, but the upstream owner InputsToDictm3100931232959297192 was not found among nodes. All nodes in the data flow dependency tree must be included.'
```

@liamhuber

@jan-janssen, super!

Yes, I'm open to suggestions for how to make the error message friendlier, but it contains all the necessary information. Namely, we're trying to connect InputsToDictm3100931232959297192.dict -> my_workflow/volume_lst.input_dict, but InputsToDictm3100931232959297192 isn't a sibling node, so the connection causes trouble. (I can't remember, or tell from the error message, whether you're getting this immediately at connection time (optimal) or later at runtime (better than never, but failing earlier would be preferable).)

It works in the first example because you're explicitly declaring the inputs_to_dict node with wf.*_dict =. IMO this is the clearest and therefore best way to do it. If you prefer the second example, where the node gets an automatically generated name, you can pass in the parentage at node instantiation. I think this will fix it for you:

```python
volume_dict = {"s_" + str(i): j["volume"] for i, j in enumerate(job_strain_lst)}
wf.volume_lst = get_values_from_dict(
    input_dict=inputs_to_dict(
        input_specification=list(volume_dict.keys()),
        parent=wf,  # Parent the implicit node before using it as input
        **volume_dict,
    )
)
energy_dict = {"s_" + str(i): j["energy"] for i, j in enumerate(job_strain_lst)}
wf.energy_lst = get_values_from_dict(
    input_dict=inputs_to_dict(
        input_specification=list(energy_dict.keys()),
        parent=wf,  # Parent the implicit node before using it as input
        **energy_dict,
    )
)
```

@jan-janssen (Member, Author)

@liamhuber Thanks again, I was able to fix the last remaining issues. Now the interface between pyiron_workflow and the Python Workflow Definition supports all three examples in both directions, import and export.
