This repository has been archived by the owner on Jul 3, 2023. It is now read-only.

Adds documentation showing scalar creation & input #31

Merged 1 commit on Dec 23, 2021
3 changes: 3 additions & 0 deletions README.md
@@ -68,6 +68,8 @@ def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
The astute observer will notice we have not defined `spend` or `signups` as functions. That is okay;
it just means they need to be provided as inputs when we actually create a dataframe.

Note: functions can take or create scalar values too.
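
The note above can be illustrated with a minimal sketch in plain pandas. The names `spend_total` and `spend_share` are hypothetical, chosen only for illustration, and the functions are called directly here rather than through Hamilton:

```python
import pandas as pd


def spend_total(spend: pd.Series) -> float:
    """Creates a scalar: the sum of the whole spend column (hypothetical example)."""
    return float(spend.sum())


def spend_share(spend: pd.Series, spend_total: float) -> pd.Series:
    """Takes a scalar: each row's share of total spend (hypothetical example)."""
    return spend / spend_total


# Sample data copied from the README's initial_columns.
spend = pd.Series([10, 10, 20, 40, 40, 50])

total = spend_total(spend)         # a single float
share = spend_share(spend, total)  # a series with the same length as spend
print(total)
print(share)
```

The scalar parameter is named after the function that produces it, mirroring how `spend_zero_mean` takes `spend_mean` in this PR's `my_functions.py`.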

2. Create a `my_script.py` which is where code will live to tell Hamilton what to do:
```python
import importlib
@@ -80,6 +82,7 @@ from hamilton import driver
logger = logging.getLogger(__name__)
logging.basicConfig(stream=sys.stdout)
initial_columns = {  # load from actuals or wherever -- this is our initial data we use as input.
    # Note: these do not have to be all series, they could be scalar inputs.
    'signups': pd.Series([1, 10, 50, 100, 200, 400]),
    'spend': pd.Series([10, 10, 20, 40, 40, 50]),
}
20 changes: 20 additions & 0 deletions examples/hello_world/my_functions.py
@@ -9,3 +9,23 @@ def avg_3wk_spend(spend: pd.Series) -> pd.Series:
def spend_per_signup(spend: pd.Series, signups: pd.Series) -> pd.Series:
    """The cost per signup in relation to spend."""
    return spend / signups


def spend_mean(spend: pd.Series) -> float:
    """Shows function creating a scalar. In this case it computes the mean of the entire column."""
    return spend.mean()


def spend_zero_mean(spend: pd.Series, spend_mean: float) -> pd.Series:
    """Shows function that takes a scalar. In this case to zero mean spend."""
    return spend - spend_mean


def spend_std_dev(spend: pd.Series) -> float:
    """Function that computes the standard deviation of the spend column."""
    return spend.std()


def spend_zero_mean_unit_variance(spend_zero_mean: pd.Series, spend_std_dev: float) -> pd.Series:
    """Function showing one way to make spend have zero mean and unit variance."""
    return spend_zero_mean / spend_std_dev
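
As a quick sanity check, the four functions added in this file can be called directly on the sample `spend` data from `my_script.py`, without Hamilton. This is a sketch and not part of the PR; the intermediate variable names are chosen for illustration. Note that pandas `Series.std()` defaults to the sample standard deviation (`ddof=1`), so "unit variance" here means unit sample variance.

```python
import pandas as pd


def spend_mean(spend: pd.Series) -> float:
    return spend.mean()


def spend_zero_mean(spend: pd.Series, spend_mean: float) -> pd.Series:
    return spend - spend_mean


def spend_std_dev(spend: pd.Series) -> float:
    return spend.std()  # pandas default is ddof=1 (sample standard deviation)


def spend_zero_mean_unit_variance(spend_zero_mean: pd.Series, spend_std_dev: float) -> pd.Series:
    return spend_zero_mean / spend_std_dev


# Sample data copied from my_script.py.
spend = pd.Series([10, 10, 20, 40, 40, 50])

# Wire the functions together by hand, in dependency order.
mean = spend_mean(spend)
zero_mean = spend_zero_mean(spend, mean)
std_dev = spend_std_dev(spend)
normalized = spend_zero_mean_unit_variance(zero_mean, std_dev)

# The mean should be approximately 0 and the sample std approximately 1.
print(normalized.mean(), normalized.std())
```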
2 changes: 2 additions & 0 deletions examples/hello_world/my_script.py
@@ -8,6 +8,7 @@
logger = logging.getLogger(__name__)
logging.basicConfig(stream=sys.stdout)
initial_columns = {  # load from actuals or wherever -- this is our initial data we use as input.
    # Note: these values don't have to be all series, they could be a scalar.
    'signups': pd.Series([1, 10, 50, 100, 200, 400]),
    'spend': pd.Series([10, 10, 20, 40, 40, 50]),
}
@@ -21,6 +22,7 @@
    'signups',
    'avg_3wk_spend',
    'spend_per_signup',
    'spend_zero_mean_unit_variance'
]
# let's create the dataframe!
df = dr.execute(output_columns, display_graph=True)