@liamhuber liamhuber commented Jan 7, 2026

TODO: tests done

Adds pydantic models for the atomic [1] and workflow node types.
Edges are structured as dictionaries per #65, and all nodes explicitly specify their IO labels per #63.

Out of scope (will stack as separate PRs):

  • Parsing python into these types
  • Parsers-as-decorators
  • Other graph elements (for, ...)
  • Metadata and things pertaining to node store, etc.

Footnotes

  1. I'm running with "atomic" right now since both Sam and I are fine with it, but this is subject to change.
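
For readers skimming the thread, here is a minimal sketch of what pydantic models for these two node types could look like. It is not the code merged in this PR: the class names `AtomicNode` and `WorkflowNode`, the field names, and the edge direction (target port mapped to source port) are all assumptions made for illustration.

```python
from typing import Literal, Union

import pydantic


class AtomicNode(pydantic.BaseModel):
    # Hypothetical field names; the real model may differ.
    type: Literal["atomic"] = "atomic"
    fully_qualified_name: str
    inputs: list[str]
    outputs: list[str]


class WorkflowNode(pydantic.BaseModel):
    type: Literal["workflow"] = "workflow"
    inputs: list[str]
    outputs: list[str]
    # Children may be atomic nodes or nested workflows.
    nodes: dict[str, Union[AtomicNode, "WorkflowNode"]]
    # Assumed convention: (target node, target port) -> (source node, source port).
    edges: dict[tuple[str, str], tuple[str, str]]

    @pydantic.model_validator(mode="after")
    def _edges_reference_known_nodes(self):
        # Reject edges whose endpoints are not child nodes of this workflow.
        for (target, _), (source, _) in self.edges.items():
            for name in (target, source):
                if name not in self.nodes:
                    raise ValueError(f"edge references unknown node {name!r}")
        return self


wf = WorkflowNode(
    inputs=["x"],
    outputs=["y"],
    nodes={
        "a": AtomicNode(fully_qualified_name="math.sqrt", inputs=["x"], outputs=["out"]),
        "b": AtomicNode(fully_qualified_name="math.floor", inputs=["x"], outputs=["out"]),
    },
    edges={("b", "x"): ("a", "out")},
)
```

Every node carries its own `inputs` and `outputs` labels, so an edge can be checked against its endpoints without importing the underlying function.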

liamhuber and others added 12 commits January 6, 2026 15:56
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
I'm lifting the validator directly from @XzzX's [`PythonWorkflowDefinitionFunctionNode`](https://github.com/XzzX/python-workflow-definition/blob/fec059137d5c23a5983a798d347a50dbb911e56b/src/python_workflow_definition/models.py#L57)

Co-authored-by: Sebastian Eibl <xzzx@users.noreply.github.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Again, lifted from @XzzX's attack for the python workflow definition [`PythonWorkflowDefinitionNode`](https://github.com/XzzX/python-workflow-definition/blob/fec059137d5c23a5983a798d347a50dbb911e56b/src/python_workflow_definition/models.py#L68)

Co-authored-by: Sebastian Eibl <xzzx@users.noreply.github.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
E.g. to avoid double ".."

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
And correct the nodes typing

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
@github-actions

github-actions bot commented Jan 7, 2026

Binder 👈 Launch a binder notebook on branch pyiron/flowrep/data_models

@codecov

codecov bot commented Jan 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.52%. Comparing base (5a6a828) to head (01e2965).
⚠️ Report is 53 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #69      +/-   ##
==========================================
+ Coverage   95.50%   96.52%   +1.01%     
==========================================
  Files           4        4              
  Lines         668      805     +137     
==========================================
+ Hits          638      777     +139     
+ Misses         30       28       -2     


@liamhuber
Member Author

@XzzX, I lifted almost verbatim a couple pydantic snippets from your python workflow definition model file, so I added you as a co-author on those commits. LMK if you'd prefer to handle this a different way.

@liamhuber liamhuber changed the base branch from main to type_names January 7, 2026 15:36
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Without this, tuple keys ("node", "port") get transformed into "node,port" and destroy the deserialization.

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Including child port names

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
And do basic model validation for their interaction with the output labels

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
We only need to convert the format for JSON (so far) -- for python we can retain the original dict-structure.

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
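
The two commit messages above about tuple keys describe a JSON-only conversion. A minimal sketch of that idea, assuming a hypothetical `EdgeHolder` model and a `"node.port"` string encoding (neither is confirmed by this PR), could look like this:

```python
import json

import pydantic


class EdgeHolder(pydantic.BaseModel):
    # Hypothetical stand-in for the workflow model's edge field.
    edges: dict[tuple[str, str], tuple[str, str]]

    @pydantic.field_validator("edges", mode="before")
    @classmethod
    def _split_dotted_strings(cls, v):
        # Accept "node.port" strings (as read back from JSON) alongside tuples.
        def split(item):
            return tuple(item.split(".", 1)) if isinstance(item, str) else item

        if isinstance(v, dict):
            return {split(key): split(value) for key, value in v.items()}
        return v

    @pydantic.field_serializer("edges", when_used="json")
    def _join_to_dotted_strings(self, edges):
        # Only JSON output is converted; python-mode dumps keep the tuples.
        return {".".join(key): ".".join(value) for key, value in edges.items()}


holder = EdgeHolder(edges={("b", "x"): ("a", "out")})
as_python = holder.model_dump()
as_json = json.loads(holder.model_dump_json())
restored = EdgeHolder.model_validate(as_json)
```

This encoding only round-trips if node labels never contain a dot, which is an assumption of the sketch.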
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
@liamhuber
Member Author

Ok, I'm happy with this. I extended the data model to always include the inputs and outputs per this comment. The next thing I'd like to stack onto this work is to actually import the fully qualified names and validate the node model against the ast inspection of the functions. This would be either a precursor to, or part of, writing a parser to go from a python function object to the recipe model (and from there the decorator is a trivial step).

I really like the file serialization helpers @XzzX made here and think they'd be great on the base NodeModel class, but will leave that to another PR.
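
As a rough illustration of the follow-up validation described above, the sketch below resolves a fully qualified name and compares a node's declared inputs against the function's parameters. It substitutes `inspect.signature` for the ast inspection mentioned in the comment, and the helper names `resolve` and `inputs_match_signature` are hypothetical.

```python
import importlib
import inspect


def resolve(fully_qualified_name: str):
    """Import the module part of the name and return the named attribute."""
    module_name, _, attribute = fully_qualified_name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


def inputs_match_signature(fully_qualified_name: str, inputs: list[str]) -> bool:
    """Check that the declared input labels equal the parameter names, in order."""
    func = resolve(fully_qualified_name)
    parameters = list(inspect.signature(func).parameters)
    return parameters == list(inputs)


matches = inputs_match_signature("textwrap.indent", ["text", "prefix", "predicate"])
mismatch = inputs_match_signature("textwrap.indent", ["text"])
```

Validating output labels would need the function body, which is where ast inspection comes in; this sketch does not attempt that.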

@liamhuber liamhuber changed the title (WIP) Data models Data models Jan 7, 2026
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
# Conflicts:
#	flowrep/model.py
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Base automatically changed from type_names to main January 14, 2026 15:55
@liamhuber liamhuber requested a review from XzzX January 14, 2026 15:56
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Per @XzzX's suggestion

Signed-off-by: liamhuber <liamhuber@greyhavensolutions.com>
Contributor

@XzzX XzzX left a comment


I do not know if I like the enum/literal stuff. I see what you want to achieve, however. I need to check on Monday if an alternative solution exists.

Co-authored-by: Sebastian Eibl <XzzX@users.noreply.github.com>
@liamhuber
Member Author

I do not know if I like the enum/literal stuff. I see what you want to achieve, however. I need to check on Monday if an alternative solution exists.

Can you be more specific about what it is that you don't like about the enums or literals? Is there a feature you wish we had that is missing? I'm not aware of any other python data structures that more concretely satisfy our needs, but if you can highlight a missing need/extraneous feature, maybe I could search better.

One use case we want to handle is version migration under the case of field re-naming. But as far as I understand, this is something we handle on the pydantic side like

class RecipeElementType(StrEnum):
    MACRO = "macro"  # new canonical name

# In your model:
@pydantic.field_validator("type", mode="before")
@classmethod
def migrate_legacy_names(cls, v):
    if v == "workflow":
        return "macro"
    return v

From my perspective, we could switch back to Literal for less boilerplate, but we'd lose the tab-completion and object reference inside the codebase. The above is a situation where I like enums over literals, because in my IDE I can go to the enum value declaration, right click, and hit "find usages". For literals I could search for the string, but might get false positives. It's a minor win, but still a win, so overall I prefer the enums.

@XzzX
Contributor

XzzX commented Jan 16, 2026


After looking at it again with a little more time, I am totally ok with it. I do not know why I was confused this morning.

Contributor

@XzzX XzzX left a comment


Good job 👍

@liamhuber liamhuber merged commit 686f347 into main Jan 16, 2026
20 checks passed
@liamhuber liamhuber deleted the data_models branch January 16, 2026 21:55
Development

Successfully merging this pull request may close these issues:

  • Representing edges
  • How to represent a macro/subgraph
  • How to represent a function?

3 participants