
[To discuss] @node decorator syntax #2471

Closed
merelcht opened this issue Mar 27, 2023 · 7 comments

@merelcht (Member)

Introduction

ChatGPT once suggested this syntax to a user

[screenshot: ChatGPT-suggested @node decorator syntax]

Is this a syntax we'd like to add to Kedro?

merelcht changed the title from "To Discuss: @node decorator syntax" to "[To discuss] @node decorator syntax" on Mar 29, 2023
@jasonmhite (Contributor)

Some folks on my team have been doing this for a while and it's pretty ergonomic. A basic implementation is trivial:

from kedro.pipeline import node

def make_node(**kwargs):
    # Return a decorator that wraps a function in a Kedro node,
    # forwarding inputs/outputs/name/tags etc. via **kwargs.
    return lambda func: node(func=func, **kwargs)
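
For illustration, usage might then look like this (the dataset names and pandas-style function body are hypothetical). Note that the decorated name is bound to the resulting Node object, not to the plain function:

from kedro.pipeline import pipeline

@make_node(inputs="companies", outputs="preprocessed_companies", name="preprocess")
def preprocess_companies(companies):
    return companies.dropna()

# The resulting node can be collected into a pipeline as usual.
demo_pipeline = pipeline([preprocess_companies])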

@antonymilne (Contributor)

Copying here my comments from #2408.

As noted already, there's a problem here if you want to reuse the same function in multiple nodes and when you start using modular pipelines. These are not insurmountable with the decorator approach but it does get cumbersome.

That said, I actually think there's potentially a lot of value in the decorator method for simple projects. Just like auto-registration of pipelines, anything we can do to make it easier for people to make a simple project is good in my book. Certainly just slapping a node decorator on a function is easier than needing to make a pipeline.py file and probably sufficient in lots of cases.

In fact, a while ago there was a QB tool that did something similar and took the simplification even further:

  • use the name of the function arguments as dataset names by default (which is very common practice already), so you don't even need to write out the inputs mapping unless it's something other than the default
  • use the name of the function to generate the output(s?) name by default, so you don't even need to write out the outputs mapping unless it's something other than the default
  • automatically detect all functions decorated with node so that you don't even need to write a pipeline.py file at all
  • in cases where I don't want to change input/output name or apply tags etc., there's in fact no need to use the decorator at all - go through and automatically detect node functions by finding functions with a certain pattern in their name like node_xyz and automatically make the pipeline based on this

I actually think there's a lot to be said for at least some aspects of this approach, although it won't ever replace the current system.
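
A minimal sketch of the name-inference idea, for illustration only (this is not the QB tool's actual implementation): default inputs to the function's argument names and outputs to the function name, as a hypothetical auto_node helper.

import inspect

from kedro.pipeline import node

def auto_node(func=None, **kwargs):
    # Hypothetical helper: fill in `inputs` from the function's argument
    # names and `outputs` from the function name, unless overridden.
    def wrap(f):
        kwargs.setdefault("inputs", list(inspect.signature(f).parameters) or None)
        kwargs.setdefault("outputs", f.__name__)
        return node(func=f, **kwargs)
    return wrap(func) if func is not None else wrap

@auto_node
def model_input_table(companies, shuttles):
    # Reads the "companies" and "shuttles" datasets and produces a
    # dataset named "model_input_table", with no explicit mapping.
    return companies.merge(shuttles, on="id")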

@noklam (Contributor) commented May 4, 2023

> use the name of the function arguments as dataset names by default (which is very common practice already), so you don't even need to write out the mapping of the input unless it's something other than the default

This is a reasonable default, but I suspect it is less scalable: you are likely to use generic variable names like df, and these names collide.
In addition, I use CAPITAL letters for persisted datasets as a convention (I think I have seen similar patterns elsewhere), to differentiate them from the MemoryDataSets that Kedro generates automatically.
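
For example (dataset and function names hypothetical), the convention makes the persisted datasets stand out in a pipeline definition:

from kedro.pipeline import node, pipeline

def clean(raw):
    return raw

def summarise(companies):
    return companies

# CAPITALISED names would be registered in the catalog and persisted;
# lowercase names fall back to auto-generated MemoryDataSets.
naming_demo = pipeline([
    node(clean, inputs="RAW_COMPANIES", outputs="companies_clean"),
    node(summarise, inputs="companies_clean", outputs="COMPANY_SUMMARY"),
])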

> use the name of the function to generate the output(s?) name by default, so you don't even need to write out the outputs mapping unless it's something other than the default

> automatically detect all functions decorated with node so that you don't even need to write a pipeline.py file at all

This is my favourite part. It's sometimes annoying to search across more than ten pipeline.py files.

> in cases where I don't want to change input/output name or apply tags etc., there's in fact no need to use the decorator at all - go through and automatically detect node functions by finding functions with a certain pattern in their name like node_xyz and automatically make the pipeline based on this

@jasonmhite (Contributor)

Definitely agree with @antonymilne: this functionality would not replace the existing scheme, just act as sugar for what I suspect is a decent chunk of use cases. The ability to automatically infer names and arguments would be slick.

@noklam (Contributor) commented Jun 3, 2023

Coming back to this again: it may be quite useful for smoothing the Kedro learning curve.

I have been teaching some people about Kedro lately, and sometimes they get stuck on how to write a node. Then I ask: "forget about Kedro, how would you write it in a normal Python script?" Suddenly they can write the code properly and then just map it to Kedro.

I suspect this happens more for beginners when they try to jump too fast into thinking about nodes. @astrojuanlu have you ever encountered this? Is it worth checking with other people who teach Kedro?

@astrojuanlu (Member)

> Hi @astrojuanlu, I thought of the decorator syntax but didn't find it satisfactory for two reasons:
>
>   • It conflates functions and nodes. I have functions that are reused in different pipelines (and want to keep open the ability to reuse others in new pipelines). Tying functions to specific inputs and outputs makes reuse harder.
>   • It reduces readability, because the pipeline is now scattered across different places. To grasp a pipeline, one would need to match input/output dataset names and scroll up and down to collect the nodes in their head. Whereas with the proposed pipeline builder, everything is in one block and one can understand a pipeline at a glance.

Originally posted by @mle-els in #2726 (comment)
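
To make the first concern concrete, here is a hypothetical illustration (function and dataset names invented): with plain node() calls the same function can be wired to different datasets, whereas a decorator binds the name to a single Node.

from kedro.pipeline import node

def clean(df):
    return df.dropna()

# Plain function: reusable across nodes and pipelines.
clean_companies = node(clean, inputs="companies", outputs="clean_companies")
clean_shuttles = node(clean, inputs="shuttles", outputs="clean_shuttles")

# With a decorator such as make_node above, `clean` itself would become
# one Node tied to one set of inputs/outputs and could not be re-wired.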

@astrojuanlu (Member)

I think it's safe to say that the current approach will not go away for the time being, so the question is whether to maintain this as an alternative, recommended approach. From that perspective, it looks like a lot of documentation and maintenance effort, plus the potential confusion of users having to pick between two approaches, all for little gain. There have been some very valid concerns about this approach, most importantly that it blends functions with nodes, which hampers reusability of those functions by coupling them to Kedro.

I'm closing this issue; if you disagree and think this deserves further discussion, feel free to leave a comment.

astrojuanlu closed this as not planned on Nov 25, 2023