Configuration-driven DAG shaping + grouping #108

Merged
merged 4 commits into from
Mar 16, 2023

elijahbenizzy
Collaborator

[Short description explaining the high-level reason for the pull request]

Changes

How I tested this

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

@elijahbenizzy elijahbenizzy force-pushed the configure-everything branch 9 times, most recently from a4eb9fd to 4b4f996 Compare March 11, 2023 02:36
tests/test_end_to_end.py (outdated)
Comment on lines 20 to 30
@dynamic(
    lambda columns_to_sum_map: parameterize(
        **{
            key: {"col_1": source(value[0]), "col_2": source(value[1])}
            for key, value in columns_to_sum_map.items()
        }
    ),
    at=ResolveAt.COMPILE,
)
def generic_summation(col_1: pd.Series, col_2: pd.Series) -> pd.Series:
    return col_1 + col_2
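
To make the comprehension in this snippet concrete, here is a minimal plain-Python sketch of the keyword arguments it would build for a sample config. The `source` function below is a hypothetical stand-in that only tags a column name; it is not Hamilton's `source`. The `build_parameterization` helper and the sample `columns_to_sum_map` are also invented for illustration.

```python
from typing import Dict, List, Tuple


def source(name: str) -> Tuple[str, str]:
    """Hypothetical stand-in: tags a name as an upstream dependency."""
    return ("source", name)


def build_parameterization(
    columns_to_sum_map: Dict[str, List[str]],
) -> Dict[str, Dict[str, Tuple[str, str]]]:
    """Mirror the dict comprehension from the reviewed snippet."""
    return {
        key: {"col_1": source(value[0]), "col_2": source(value[1])}
        for key, value in columns_to_sum_map.items()
    }


# Assumed sample config: each output name maps to the two columns to add.
sample_config = {"a_plus_b": ["a", "b"], "c_plus_d": ["c", "d"]}
expanded = build_parameterization(sample_config)
```

Each key of `expanded` would become one generated node name, with `col_1` and `col_2` bound to the two listed columns.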
Collaborator

@skrawcz skrawcz Mar 12, 2023

@config.inject(
    lambda columns_to_sum_map: parameterize(
        **{
            key: {"col_1": source(value[0]), "col_2": source(value[1])}
            for key, value in columns_to_sum_map.items()
        }
    ),
    at=ResolveAt.COMPILE,
)
def generic_summation(col_1: pd.Series, col_2: pd.Series) -> pd.Series:
    return col_1 + col_2

I think that reads better...

Collaborator Author

Yeah that's not bad

Collaborator Author

OK, so I don't actually like putting this under config now that I think about it -- it's pretty confusing. Specifically, we're not actually injecting anything, we're using it to delay evaluation. Furthermore, it doesn't actually have anything to do with the other config decorators, which people already have a tough time understanding. It's also pretty meta, so I want it kept on its own.

I'm thinking @delay_resolution(until=CONFIG_AVAILABLE, lambda ...). That accurately describes what it does, lives separately, and is nice and readable.

Collaborator Author

@resolve(
    when=CONFIG_AVAILABLE,
    decorate_with=lambda conf_var_1: ...,
)

@elijahbenizzy elijahbenizzy force-pushed the configure-everything branch 4 times, most recently from 64f6eed to 5e21e4b Compare March 12, 2023 23:09
@elijahbenizzy elijahbenizzy marked this pull request as ready for review March 12, 2023 23:19
@elijahbenizzy elijahbenizzy changed the title WIP for configuration-driven DAG shaping Configuration-driven DAG shaping + grouping Mar 13, 2023
@elijahbenizzy elijahbenizzy force-pushed the configure-everything branch 3 times, most recently from 4044982 to 27a9281 Compare March 13, 2023 19:00
See #109

Note this makes the following decisions:
1. Chooses to use the generic `@dynamic` decorator
2. Buries it under a power-user mode
3. Uses the parameter names of the lambda/function to draw from the
   config
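
The decisions above can be sketched in plain Python: a decorator is recorded rather than applied, and is built later from the config by matching the factory's parameter names to config keys. This is only an illustration of the idea, not Hamilton's implementation; `delay_decoration`, `resolve_with_config`, `scale_by`, and the `_pending_decorator_factory` attribute are all hypothetical names.

```python
import inspect
from typing import Any, Callable, Dict

_PENDING = "_pending_decorator_factory"  # hypothetical attribute name


def delay_decoration(factory: Callable[..., Callable]) -> Callable:
    """Record a decorator factory on the function instead of applying it now."""

    def marker(fn: Callable) -> Callable:
        setattr(fn, _PENDING, factory)
        return fn

    return marker


def resolve_with_config(fn: Callable, config: Dict[str, Any]) -> Callable:
    """Build and apply the deferred decorator once the config is known."""
    factory = getattr(fn, _PENDING, None)
    if factory is None:
        return fn  # nothing was deferred
    # The factory's parameter names select which config keys it receives.
    names = list(inspect.signature(factory).parameters)
    missing = [n for n in names if n not in config]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")
    decorator = factory(**{n: config[n] for n in names})
    return decorator(fn)


def scale_by(factor: int) -> Callable:
    """Toy decorator: multiplies the wrapped function's result by `factor`."""

    def decorator(fn: Callable) -> Callable:
        def wrapped(*args: Any, **kwargs: Any) -> Any:
            return factor * fn(*args, **kwargs)

        return wrapped

    return decorator


@delay_decoration(lambda factor: scale_by(factor))
def add(a: int, b: int) -> int:
    return a + b


# Only the "factor" key is passed to the lambda; other keys are ignored.
resolved = resolve_with_config(add, {"factor": 10, "unused_key": "ignored"})
```

Until `resolve_with_config` runs, `add` behaves as if undecorated, which is the "delay" part of the design.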

We also deprecate dynamic_transform

This was an old way of doing the same thing. It was never accessible
to users, as it required implementing a subclass and knowing some of
Hamilton's internals. delay_resolution solves the same problem, and
gives users access to any of the current decorators.

A few things we need to iron out, but this allows group() in addition to
@source and @value.

This is a little hacky but the API is solid. Basically this is a
`parameterize` with a single parameterization. It goes very well with
config-driven pipelines, allowing you to group a set of functions to
operate on together.

There is a bit of duplicated code here, but it's well tested and solves
a common problem. At some point we'll clean it up when we rewrite
decorators, but for now this is OK.

This allows mapping between the following:
- group(*args) -> List[...]
- group(**kwargs) -> Dict[str, ...]
Anything else is not supported yet.
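
As a rough illustration of the two supported shapes, the sketch below uses a hypothetical stand-in named `group` that only mimics the mapping described above (positional arguments to a list, keyword arguments to a dict). It is not Hamilton's `group`. Rejecting a mix of both forms is an assumption, based on the note that anything else is not supported yet.

```python
from typing import Any, Dict, List, Union


def group(*args: Any, **kwargs: Any) -> Union[List[Any], Dict[str, Any]]:
    """Hypothetical stand-in: collect dependencies into a list or a dict."""
    if args and kwargs:
        # Assumption: mixing the two forms falls under "not supported yet".
        raise ValueError("use either positional or keyword arguments, not both")
    if kwargs:
        return dict(kwargs)  # group(**kwargs) -> Dict[str, ...]
    return list(args)  # group(*args) -> List[...]


as_list = group("col_a", "col_b")
as_dict = group(left="col_a", right="col_b")
```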
@elijahbenizzy elijahbenizzy deleted the configure-everything branch March 16, 2023 14:30
2 participants