Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds group feature for injecting kwargs #107

Merged
merged 3 commits into from
Mar 13, 2023

Conversation

elijahbenizzy
Copy link
Collaborator

We were doing this in the replacement function. Looks like everythin ghas to go in the replacement function, which makes things slightly trickier, but at least we can centralize the logic in that function.

[Short description explaining the high-level reason for the pull request]

Changes

How I tested this

Notes

Checklist

  • PR has an informative and human-readable title (this will be pulled into the release notes)
  • Changes are limited to a single goal (no scope creep)
  • Code passed the pre-commit check & code is left cleaner/nicer than when first encountered.
  • Any change in functionality is tested
  • New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future TODOs are captured in comments
  • Project documentation has been updated if adding/changing functionality.

@elijahbenizzy elijahbenizzy changed the title Removes unecessary line in parameterize Adds group feature for injecting kwargs Mar 9, 2023
@elijahbenizzy elijahbenizzy force-pushed the grouped-parameterize branch 2 times, most recently from eb26af0 to d3db8e5 Compare March 11, 2023 00:06
@elijahbenizzy elijahbenizzy changed the base branch from main to configure-everything March 11, 2023 02:29
@elijahbenizzy elijahbenizzy force-pushed the grouped-parameterize branch 2 times, most recently from 807ef28 to 96f4457 Compare March 11, 2023 05:40
@skrawcz
Copy link
Collaborator

skrawcz commented Mar 12, 2023

Part of me wants the grouped dependency to be a dictionary (or list of tuples) and not a straight list. Why? So that someone could do this and get the correct behavior:

@inject(group(source("foo"), source("bar"), source("baz")))
def my_df(series_list: List[pd.Series]) -> pd.DataFrame:
    """One would likely expect the series to be named and thus the output to be "foo", "bar", "baz"... but it's not guaranteed."""
    return pd.concat(series_list)

@inject(group(source("foo"), source("bar"), source("baz")))
def my_df(series_list: List[Tuple[str, pd.Series]]) -> pd.DataFrame:
    """This would be guaranteed to work"""
    return pd.DataFrame(dict(series_list))

@inject(group(source("foo"), source("bar"), source("baz")))
def my_df(series_dict: Dict[str, pd.Series]) -> pd.DataFrame:
    """This would be guaranteed to work"""
    return pd.DataFrame(series_dict)

Or maybe we can do all the above and know what to do based on the parameter type annotation?

@skrawcz
Copy link
Collaborator

skrawcz commented Mar 12, 2023

Otherwise @elijahbenizzy should someone be able to do this:

@inject(group(source("foo"), source("foo"), source("foo")))
def my_df(series_list: List[pd.Series]) -> pd.DataFrame:
    """Should we have three? or just one?"""
    return sum(series_list)

constant = pd.Series([1,2,3,4])
@inject(group(source("foo"), source("boo"), source("baz"), value(constant)))
def my_df(series_list: List[pd.Series]) -> pd.DataFrame:
    """Are values allowed?"""
    return sum(series_list)

@elijahbenizzy
Copy link
Collaborator Author

Otherwise @elijahbenizzy should someone be able to do this:

@inject(group(source("foo"), source("foo"), source("foo")))
def my_df(series_list: List[pd.Series]) -> pd.DataFrame:
    """Should we have three? or just one?"""
    return sum(series_list)

Ughh, yeah, currently this is broken but not for great reasons. Need to fix + add tests. Was thinking of releasing then fixing -- as this is a weird case. IMO the strangest case that will currently break is:

@Inject(group(source("foo"), source("boo"), source("baz")))
def my_df(series_list: List[pd.Series], foo: pd.Series) -> pd.DataFrame:
"""Are values allowed?"""
return sum(series_list)

This is busted due to the popping of kwargs when we parse

constant = pd.Series([1,2,3,4])
@Inject(group(source("foo"), source("boo"), source("baz"), value(constant)))
def my_df(series_list: List[pd.Series]) -> pd.DataFrame:
"""Are values allowed?"""
return sum(series_list)

This is fine by me -- it'll just work I think?

@skrawcz
Copy link
Collaborator

skrawcz commented Mar 12, 2023

Otherwise @elijahbenizzy should someone be able to do this:

@inject(group(source("foo"), source("foo"), source("foo")))
def my_df(series_list: List[pd.Series]) -> pd.DataFrame:
    """Should we have three? or just one?"""
    return sum(series_list)

Ughh, yeah, currently this is broken but not for great reasons. Need to fix + add tests. Was thinking of releasing then fixing -- as this is a weird case.

I would argue against allowing duplicates, since you can achieve that another way more clearly.

IMO the strangest case that will currently break is:

@Inject(group(source("foo"), source("boo"), source("baz")))
def my_df(series_list: List[pd.Series], foo: pd.Series) -> pd.DataFrame:
"""Are values allowed?"""
return sum(series_list)

This is busted due to the popping of kwargs when we parse

constant = pd.Series([1,2,3,4])
@Inject(group(source("foo"), source("boo"), source("baz"), value(constant)))
def my_df(series_list: List[pd.Series]) -> pd.DataFrame:
"""Are values allowed?"""
return sum(series_list)

This is fine by me -- it'll just work I think?

I don't think we want that to work TBH. What's the "name" of the constant?

@elijahbenizzy
Copy link
Collaborator Author

Part of me wants the grouped dependency to be a dictionary (or list of tuples) and not a straight list. Why? So that someone could do this and get the correct behavior:

@inject(group(source("foo"), source("bar"), source("baz")))
def my_df(series_list: List[pd.Series]) -> pd.DataFrame:
    """One would likely expect the series to be named and thus the output to be "foo", "bar", "baz"... but it's not guaranteed."""
    return pd.concat(series_list)

@inject(group(source("foo"), source("bar"), source("baz")))
def my_df(series_list: List[Tuple[str, pd.Series]]) -> pd.DataFrame:
    """This would be guaranteed to work"""
    return pd.DataFrame(dict(series_list))

@inject(group(source("foo"), source("bar"), source("baz")))
def my_df(series_dict: Dict[str, pd.Series]) -> pd.DataFrame:
    """This would be guaranteed to work"""
    return pd.DataFrame(series_dict)

Or maybe we can do all the above and know what to do based on the parameter type annotation?

Yeah, so I think we should allow it to be a dict depending on the annotation -- it should be pretty easy. That said, we could just as well do a grouped_dict as well as grouped_list, so you could have to annotate it in the right way...

@elijahbenizzy elijahbenizzy force-pushed the grouped-parameterize branch 4 times, most recently from 6b79312 to 6f3828f Compare March 12, 2023 22:19
@elijahbenizzy elijahbenizzy force-pushed the configure-everything branch 2 times, most recently from 64f6eed to 5e21e4b Compare March 12, 2023 23:09
A few things we need to iron out, but this allows group() in addition to
@source and @value.
This is a little hacky but the API is solid. Basically this is a
parameterized of 1 parameterization. It goes very well with
config-driven pipelines, allowing you to group a set of functions to do
stuff with.
@elijahbenizzy elijahbenizzy marked this pull request as ready for review March 12, 2023 23:19
There is a bit of duplicated code here, but its well-tested and solves
a common problem. At some point we'll clean it up when we rewrite
decorators, but for now this is OK.

This allows mapping between the following:
- group(*args) -> List[...]
- group(**kwargs) -> Dict[str, ...]
Anything else is not supported yet.
@elijahbenizzy
Copy link
Collaborator Author

elijahbenizzy commented Mar 13, 2023

Merging this to consolidate PRs -- these two were using similar pieces so I wanted to merge them together. See https://github.com/DAGWorks-Inc/hamilton/tree/configure-everything for continuation.

@elijahbenizzy elijahbenizzy merged commit 08e3072 into configure-everything Mar 13, 2023
@elijahbenizzy elijahbenizzy deleted the grouped-parameterize branch March 13, 2023 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants