Adds parameterized_inputs #104

skrawcz · 2022-04-06T05:23:11Z

This change adds paramterized_inputs decorator, which
is almost a carbon copy of paramterized_input but that it allows
any number of parameters to be mapped.

Why is it a separate class? Well for backwards compatibility
we don't want to break parameterized_input. Should we try to
consolidate them? I think so -- so we should then mark parameterized_input
as deprecated and will be removed in a 2.0 release? ✅

I should then probably update all documentation to reflect parameterized_inputs
and thus the documentation on paramterized_input to either be hidden or
non-existant? Hmm. ✅

Changes

Add parameterized_inputs decorator.
Deprecates parameterized_input decorator.
Adjust documentation to promote parameterized_inputs instead.

Testing

unit tests pass.

Notes

Motivation is a side project and having to write out repetitive functions is laborious -- this is admittedly a power
user mode, but one I think that makes sense to have.
e.g. this looks nice and readable

from hamilton.function_modifiers import parametrized_inputs
import pandas as pd

NUMERIC_FEATURES = ['age', 'height', 'weight', 'body_mass_index', 'transportation_expense',
                    'distance_from_residence_to_work', 'service_time', 'work_load_average_per_day',
                    'hit_target']


@parametrized_inputs(**{f'{k}_mean': dict(feature=k) for k in NUMERIC_FEATURES})
def mean_computer(feature: pd.Series) -> pd.Series:
    """Average of {feature}"""
    return feature.mean()


@parametrized_inputs(**{f'{k}_std_dev': dict(feature=k) for k in NUMERIC_FEATURES})
def std_dev_computer(feature: pd.Series) -> pd.Series:
    """Standard deviation of {feature}"""
    return feature.std()


@parametrized_inputs(**{f'{k}_zero_mean': dict(feature=k, feature_mean=f'{k}_mean') for k in NUMERIC_FEATURES})
def zero_mean_computer(feature: pd.Series, feature_mean: pd.Series) -> pd.Series:
    """Creates zero mean value of {feature} by subtracting {feature_mean}"""
    return feature - feature_mean


@parametrized_inputs(**{f'{k}_zero_mean_unit_var': dict(feature_mean=f'{k}_mean', feature_std_dev=f'{k}_std_dev') 
                        for k in NUMERIC_FEATURES})
def zero_mean_unit_var_computer(feature_mean: pd.Series, feature_std_dev: pd.Series) -> pd.Series:
    """Computes zero mean unit variance of {feature_mean} with {feature_std_dev} to create {output_name}"""
    return feature_mean / feature_std_dev

Checklist

Testing checklist

Python - local testing

python 3.6
python 3.7

skrawcz · 2022-04-06T05:25:00Z

So next steps I'm thinking:

Change docs to default to parameterized_inputs (plural).
Put a deprecation warning on parameterized_input -- and log when it's used.

tests/resources/parametrized_inputs.py

hamilton/function_modifiers.py

elijahbenizzy · 2022-04-11T03:26:34Z

Not completely sure how I feel about this API. Definitely cleaner but it seems that its not particularly consistent with other APIs. Its dryer/slightly easier to read (which is good), but it takes quite a bit of work to figure out what's happening/parse the API. The design philosophy behind decorators is specifically to make it easy to figure out what's going on. I worry that with objects/named tuples its going to be harder to figure out the shape of the DAG.

Honestly, I wonder if that's just an inherent issue with @parametrized_inputs?

elijahbenizzy

Some thoughts

decorators.md

elijahbenizzy

LGTM -- not convinced we want parameterization. I get it if we need later kwargs, but I think using **parameterization is going to look cleaner/be more readable.

hamilton/function_modifiers.py

elijahbenizzy

LGTM!

hamilton/function_modifiers.py

This is a squashed commit of all the commits to create the parameterized_inputs decorator. See the commits below that are in reverse chronological order. Basically, this replaces `parameterized_input`, and provides a more succinct API by using the function doc as a template, removing the need for a tuple. We believe this is a simpler API to read/maintain — the kwargs assumption will make it harder to extend the API, but we don’t think we need to. ——— Consolidated commits below —— Adds format doc function to parameterized_inputs To consolidate formatting in a single place. Want to keep the code DRY. (+7 squashed commits) Squashed commits: [338c02a] Touches up documentation typos And adds reraising original exception for formatting errors. [f2725ec] Adds validate test for parameterized_inputs for the doc string Doc string templates errors should be checked during validate. That way we can throw a better error. [4689177] Changes parameterized_inputs to use kwargs 1. So that we don't scope creep this API. Key words arguments are assumed to be outputs. This removes a level of nesting on the API. 2. We're inline with other decorators in having the kwargs behavior, e.g. config.when. [df571f3] Changes parameterized_inputs to take dict and format doc string With the prior design, the issue was documentation. But after noodling on it and prodding from Eljiah, I realized that the documentation string should just be a template. Since you're going to be using the same function, the doc string should be fairly generic, with only the parameters being passed changing the meaning -- which with templatization, allows for that to come through. Note - stylistically I think we prefer: ```python @parametrized_inputs( parameterization={ 'output_123': dict(input_value_tbd1='input_1', input_value_tbd2='input_2', input_value_tbd3='input_3') } ) ``` because syntax highlighting will make it clearer what is being replaced in the function with what input value. to ```python @parametrized_inputs( parameterization={ 'output_123': { 'input_value_tbd1'='input_1', 'input_value_tbd2='input_2', 'input_value_tbd3'='input_3' } } ) ``` or ```python @parametrized_inputs( parameterization=dict( output_123={ 'input_value_tbd1'='input_1', 'input_value_tbd2='input_2', 'input_value_tbd3'='input_3' } ) ) ``` I think the last two ones are less clear/readable. Otherwise I did have the thought of using actual functions in the mapping. It would then be clearer to trace what is going on with an IDE. However that brings up the possibility of people importing functions and running into import messes... So punting on that for now. We can always add that in later if we think that's required. [0e21e9d] Changes parameterized_input to parameterized_inputs in decorator docs Since parameterized_input is deprecated, we should just remove it from the docs and instead just push parameterized_inputs. [4379a38] Adds ParameterMapping named tuple object to adjust parameterized_inputs People creating a dictionary of tuple to tuples is probably unwieldy. This is a power use case, and thus it should afford a little more friction to use, and the net result of that is that the code should become more readable. We should preference keyword arguments everywhere here -- as that will make the code much more readable than without it. E.g. : ```python @parametrized_inputs( parameters=['input_value_tbd1', 'input_value_tbd2', 'input_value_tbd3'], parameter_mappings=[ ParameterMapping( inputs=['input_1', 'input_2', 'input_3'], output_name='output_123', output_docs='function_with_multiple_inputs called using input_1'), ] ) def func(...) @parametrized_inputs( ['input_value_tbd1', 'input_value_tbd2', 'input_value_tbd3'], [ParameterMapping(['input_1', 'input_2', 'input_3'], 'output_123', 'function_with_multiple_inputs called using input_1')] ) ``` Adjusts tests and documentation accordingly. [c1d5fa8] Adds parameterized_inputs This change adds `paramterized_inputs` decorator, which is almost a carbon copy of `paramterized_input` but that it allows any number of parameters to be mapped. Why is it a separate class? Well for backwards compatibility we don't want to break parameterized_input. Should we try to consolidate them? I think so -- so we should then mark `parameterized_input` as deprecated and will be removed in a 2.0 release? I should then probably update all documentation to reflect `parameterized_inputs` and thus the documentation on `paramterized_input` to either be hidden or non-existant? Hmm.

So that we can warn users not to use this, and instead migrate to `parameterized_inputs` (plural). Not sure when we'd be ready for a 2.X release, but when we do, we should remove this decorator.

skrawcz requested a review from elijahbenizzy April 6, 2022 05:23

skrawcz force-pushed the add_parameterized_inputs branch from 1c45530 to c1d5fa8 Compare April 6, 2022 05:38

elijahbenizzy reviewed Apr 6, 2022

View reviewed changes

tests/resources/parametrized_inputs.py Outdated Show resolved Hide resolved

hamilton/function_modifiers.py Outdated Show resolved Hide resolved

hamilton/function_modifiers.py Outdated Show resolved Hide resolved

hamilton/function_modifiers.py Outdated Show resolved Hide resolved

skrawcz marked this pull request as ready for review April 9, 2022 06:24

elijahbenizzy reviewed Apr 11, 2022

View reviewed changes

decorators.md Outdated Show resolved Hide resolved

decorators.md Outdated Show resolved Hide resolved

skrawcz requested a review from elijahbenizzy April 11, 2022 21:37

elijahbenizzy reviewed Apr 11, 2022

View reviewed changes

hamilton/function_modifiers.py Show resolved Hide resolved

hamilton/function_modifiers.py Outdated Show resolved Hide resolved

elijahbenizzy reviewed Apr 12, 2022

View reviewed changes

hamilton/function_modifiers.py Outdated Show resolved Hide resolved

skrawcz requested a review from elijahbenizzy April 12, 2022 06:01

elijahbenizzy approved these changes Apr 12, 2022

View reviewed changes

hamilton/function_modifiers.py Outdated Show resolved Hide resolved

skrawcz force-pushed the add_parameterized_inputs branch from 8e02d7f to b8e4507 Compare April 12, 2022 17:22

skrawcz added 2 commits April 12, 2022 10:29

Adds deprecation warning to parameterized_input (singular)

72ea019

So that we can warn users not to use this, and instead migrate to `parameterized_inputs` (plural). Not sure when we'd be ready for a 2.X release, but when we do, we should remove this decorator.

skrawcz force-pushed the add_parameterized_inputs branch from b8e4507 to 72ea019 Compare April 12, 2022 17:33

skrawcz merged commit b4e874b into main Apr 12, 2022

skrawcz deleted the add_parameterized_inputs branch April 12, 2022 17:35

gitbook-com bot pushed a commit that referenced this pull request Aug 2, 2022

GitBook: [#104] Small updates

05b2c17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Adds parameterized_inputs #104

Adds parameterized_inputs #104

skrawcz commented Apr 6, 2022 •

edited

Loading

skrawcz commented Apr 6, 2022 •

edited

Loading

elijahbenizzy commented Apr 11, 2022

elijahbenizzy left a comment

elijahbenizzy left a comment

elijahbenizzy left a comment

Adds parameterized_inputs #104

Adds parameterized_inputs #104

Conversation

skrawcz commented Apr 6, 2022 • edited Loading

Changes

Testing

Notes

Checklist

Testing checklist

Python - local testing

skrawcz commented Apr 6, 2022 • edited Loading

elijahbenizzy commented Apr 11, 2022

elijahbenizzy left a comment

Choose a reason for hiding this comment

elijahbenizzy left a comment

Choose a reason for hiding this comment

elijahbenizzy left a comment

Choose a reason for hiding this comment

skrawcz commented Apr 6, 2022 •

edited

Loading

skrawcz commented Apr 6, 2022 •

edited

Loading