Data models #69

liamhuber · 2026-01-07T01:14:31Z

~~TODO: tests~~ done

Adds pydantic models for the atomic¹ and workflow node types.
Edges are structured as dictionaries per #65, and ~~workflows~~ all nodes explicitly specify their IO labels per #63.
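
As a rough picture of the shape described above, here is a minimal sketch using stdlib dataclasses in place of the PR's pydantic models, so that it runs without dependencies. Every class and field name (`AtomicNode`, `WorkflowNode`, `fully_qualified_name`, `nodes`, `edges`), the example module path, and the target-to-source edge direction are assumptions for illustration, not the contents of `flowrep/model.py`.

```python
from dataclasses import dataclass, field

# Hypothetical handle for one port on one child node: (node label, port label)
Port = tuple[str, str]


@dataclass(frozen=True)
class AtomicNode:
    fully_qualified_name: str  # hypothetical field, e.g. "my_module.add"
    inputs: list[str]  # IO labels are always given explicitly
    outputs: list[str]


@dataclass(frozen=True)
class WorkflowNode:
    inputs: list[str]
    outputs: list[str]
    nodes: dict[str, "AtomicNode | WorkflowNode"] = field(default_factory=dict)
    # Assumed direction: target port -> source port
    edges: dict[Port, Port] = field(default_factory=dict)


add = AtomicNode("my_module.add", inputs=["x", "y"], outputs=["total"])
square = AtomicNode("my_module.square", inputs=["x"], outputs=["squared"])
workflow = WorkflowNode(
    inputs=["a", "b"],
    outputs=["result"],
    nodes={"add": add, "square": square},
    edges={("square", "x"): ("add", "total")},
)
```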

Out of scope (will stack as separate PRs):

- Parsing Python into these types
- Parsers-as-decorators
- Other graph elements (for, ...)
- Metadata and things pertaining to node store, etc.

¹ I'm running with "atomic" right now since both Sam and I are fine with it, but this is subject to change.

I'm lifting the validator directly from @XzzX's [`PythonWorkflowDefinitionFunctionNode`](https://github.com/XzzX/python-workflow-definition/blob/fec059137d5c23a5983a798d347a50dbb911e56b/src/python_workflow_definition/models.py#L57)

Co-authored-by: Sebastian Eibl <xzzx@users.noreply.github.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Again, lifted from @XzzX's approach for the python workflow definition [`PythonWorkflowDefinitionNode`](https://github.com/XzzX/python-workflow-definition/blob/fec059137d5c23a5983a798d347a50dbb911e56b/src/python_workflow_definition/models.py#L68)

Co-authored-by: Sebastian Eibl <xzzx@users.noreply.github.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

github-actions · 2026-01-07T01:14:41Z

👈 Launch a binder notebook on branch pyiron/flowrep/data_models

codecov · 2026-01-07T01:15:59Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.52%. Comparing base (5a6a828) to head (01e2965).
⚠️ Report is 53 commits behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #69      +/-   ##
==========================================
+ Coverage   95.50%   96.52%   +1.01%     
==========================================
  Files           4        4              
  Lines         668      805     +137     
==========================================
+ Hits          638      777     +139     
+ Misses         30       28       -2
```

liamhuber · 2026-01-07T01:16:05Z

@XzzX, I lifted almost verbatim a couple pydantic snippets from your python workflow definition model file, so I added you as a co-author on those commits. LMK if you'd prefer to handle this a different way.

liamhuber · 2026-01-07T18:21:06Z

OK, I'm happy with this. I extended the data model to always include the inputs and outputs per this comment. The next thing I'd like to stack onto this work is to actually import the fully qualified names and validate the node model against the ast inspection of the functions. This would be either a precursor to, or part of, writing a parser to go from a Python function object to the recipe model (and from there the decorator is a trivial step).
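
To make the proposed follow-up concrete, here is a minimal sketch of importing a function from its fully qualified name and comparing its parameters with a node's declared input labels. It uses `inspect.signature` rather than `ast`, and both helper names are hypothetical; the eventual parser may look quite different.

```python
import importlib
import inspect


def import_from_fully_qualified_name(name: str):
    """Import `package.module.attr` and return the attribute.

    Assumes the last dotted part is a module-level attribute, so nested
    names such as `module.Class.method` are not handled here.
    """
    module_name, _, attribute = name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


def inputs_match_signature(name: str, declared_inputs: list[str]) -> bool:
    """Check declared input labels against the function's parameter names."""
    function = import_from_fully_qualified_name(name)
    parameters = list(inspect.signature(function).parameters)
    return parameters == declared_inputs
```

For example, `inputs_match_signature("fnmatch.fnmatch", ["name", "pat"])` holds because the stdlib function is defined as `fnmatch(name, pat)`.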

I really like the file serialization helpers @XzzX made here and think they'd be great on the base NodeModel class, but will leave that to another PR.

XzzX

I do not know if I like the enum/literal stuff. I see what you want to achieve, however. I need to check on Monday if an alternative solution exists.

liamhuber · 2026-01-16T15:39:21Z

> I do not know if I like the enum/literal stuff. I see what you want to achieve, however. I need to check on Monday if an alternative solution exists.

Can you be more specific about what you don't like about the enums or literals? Is there a feature you wish we had that is missing? I'm not aware of any other Python data structures that more concretely satisfy our needs, but if you can highlight a missing need or an extraneous feature, maybe I could search better.

One use case we want to handle is version migration in the case of field renaming. But as far as I understand, this is something we handle on the pydantic side, like

```python
from enum import StrEnum

import pydantic


class RecipeElementType(StrEnum):
    MACRO = "macro"  # new canonical name


# In your model:
class NodeModel(pydantic.BaseModel):
    type: RecipeElementType

    @pydantic.field_validator("type", mode="before")
    @classmethod
    def migrate_legacy_names(cls, v):
        # Map the legacy name onto the canonical one before validation
        if v == "workflow":
            return "macro"
        return v
```

From my perspective, we could switch back to Literal for less boilerplate, but we'd lose the tab-completion and object reference inside the codebase. The above is a situation where I like enums over literals, because in my IDE I can go to the enum value declaration, right click, and hit "find usages". For literals I could search for the string, but might get false positives. It's a minor win, but still a win, so overall I prefer the enums.

XzzX · 2026-01-16T18:35:45Z

After looking at it again with a little more time, I am totally OK with it. I do not know why I was confused this morning.

XzzX

Good job 👍

liamhuber and others added 12 commits January 6, 2026 15:56

Add pydantic dependency

d9c4520

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Scaffold the two most basic models

2ff08fa

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Make the validation stricter

9aa6fff

E.g. to avoid double ".."

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
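
One way such a stricter check could look, assuming the validated string is a dotted, fully qualified name: require every dot-separated part to be a Python identifier, which rejects doubled, leading, and trailing dots. The function name is hypothetical and this is not the validator from the commit.

```python
def is_valid_dotted_name(name: str) -> bool:
    """True if every dot-separated part is a valid Python identifier.

    An empty part, as produced by "a..b", ".a", or "a.", is not an
    identifier, so those strings are rejected.
    """
    return all(part.isidentifier() for part in name.split("."))
```

Note that `str.isidentifier` also accepts keywords such as `class`; a stricter version could additionally consult `keyword.iskeyword`.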

Fix message string

07d1f74

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Black newlines

96d1b14

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Ruff

4240152

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Fill out the WorkflowNode

e294052

And correct the nodes typing

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Add a list of excluded child names

ab5a39c

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Add a cyclicity check

e16e82a

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
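
A cyclicity check of this kind can be sketched with the stdlib `graphlib` module. The sketch assumes edges are stored as a dictionary mapping a target `(node, port)` pair to a source `(node, port)` pair; that direction and the function name are assumptions, and the commit's own implementation is not shown here.

```python
from graphlib import CycleError, TopologicalSorter


def has_cycle(edges: dict[tuple[str, str], tuple[str, str]]) -> bool:
    """Return True if the node-level dependency graph contains a cycle."""
    # Collapse port-level edges into node-level predecessor sets
    predecessors: dict[str, set[str]] = {}
    for (target_node, _), (source_node, _) in edges.items():
        predecessors.setdefault(target_node, set()).add(source_node)
    try:
        # prepare() raises CycleError when no topological order exists
        TopologicalSorter(predecessors).prepare()
    except CycleError:
        return True
    return False
```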

Black

d9c7cea

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

liamhuber changed the base branch from main to type_names January 7, 2026 15:36

liamhuber mentioned this pull request Jan 7, 2026

How to represent a function? #21

Closed

liamhuber added 11 commits January 7, 2026 08:45

Add tests

c605ac6

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Fix JSON tuple-key issue

963a93b

Without this, tuple keys ("node", "port") get transformed into "node,port" and destroy the deserialization.

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
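
JSON object keys must be strings, so a tuple-keyed dictionary cannot survive a round trip unchanged; the stdlib `json` module rejects tuple keys outright with a `TypeError`. One possible workaround, sketched here with hypothetical helper names and an assumed target-to-source edge direction, is to dump the edges as a list of pairs and rebuild the dictionary on load. The format actually chosen in the commit is not shown here.

```python
import json

Port = tuple[str, str]


def edges_to_jsonable(edges: dict[Port, Port]) -> list[list[list[str]]]:
    """Flatten {target: source} into [[target, source], ...] pairs."""
    return [[list(target), list(source)] for target, source in edges.items()]


def edges_from_jsonable(pairs: list[list[list[str]]]) -> dict[Port, Port]:
    """Rebuild the tuple-keyed dictionary from the list-of-pairs form."""
    return {tuple(target): tuple(source) for target, source in pairs}


edges = {("square", "x"): ("add", "total")}
payload = json.dumps(edges_to_jsonable(edges))
restored = edges_from_jsonable(json.loads(payload))
```

In pure Python the original dictionary can be kept as-is; only the JSON path needs the conversion.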

Pull inputs and output up to the base model

1e0068d

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Validate entire graph topology

3de62ed

Including child port names

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Add unpacking arguments suggested by @XzzX

0f07c94

And do basic model validation for their interaction with the output labels

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

lint

8c0b070

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Make output unpacking choices disjoint

0a80c25

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Use an enum instead of a literal

fb139b3

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Fine-grain the edge dumping

3c49f48

We only need to convert the format for JSON (so far) -- for python we can retain the original dict-structure.

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Hold reserved names on the workflow class

b2f49e7

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Rename

298e3e6

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

liamhuber changed the title ~~(WIP) Data models~~ Data models Jan 7, 2026

liamhuber added 9 commits January 12, 2026 11:14

Enforce IO label uniqueness

ef7a54d

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Lint test file

3ed57b9

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Merge branch 'type_names' into data_models

e1d90c0

# Conflicts:
#   flowrep/model.py

Validate IO labels

b5a6f2c

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Update input names in tests

383708a

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Refactor: merge and rename method

8c51494

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Apply label validation to node labels

c867706

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Remove unused attribute

c7869d9

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Test exception branch

cf43247

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Base automatically changed from type_names to main January 14, 2026 15:55

liamhuber requested a review from XzzX January 14, 2026 15:56

XzzX reviewed Jan 15, 2026

liamhuber added 5 commits January 15, 2026 07:03

Replace Enum with StrEnum

b2ab71e

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Replace Literal with StrEnum

963ab99

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Freeze type fields

932104c

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Ensure node models specify a type

82fc4bc

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

Return an empty set instead of None

b714694

Per @XzzX's suggestion

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>

XzzX reviewed Jan 16, 2026

Update flowrep/model.py

01e2965

Co-authored-by: Sebastian Eibl <XzzX@users.noreply.github.com>

This was referenced Jan 16, 2026

Questions on storing version data #74

Open

Atomic parser #73

Open

This was linked to issues Jan 16, 2026

Representing edges #65

Closed

How to represent a macro/subgraph #63

Closed

How to represent a function? #21

Closed

XzzX approved these changes Jan 16, 2026

liamhuber mentioned this pull request Jan 16, 2026

How to represent if - elif - else #22

Open

liamhuber merged commit 686f347 into main Jan 16, 2026
20 checks passed

liamhuber deleted the data_models branch January 16, 2026 21:55
