This repository has been archived by the owner on Jul 3, 2023. It is now read-only.

Adds simple case to help motivate @extract_fields #66

Merged
5 commits merged on Feb 10, 2022

Commits on Feb 7, 2022

  1. Adds simple case to help motivate @extract_outputs

    If someone wanted to use Hamilton to express a model-training dataflow,
    they would struggle. We need a new decorator to handle extracting
    outputs from functions that return multiple things and aren't
    a dataframe.
    skrawcz committed Feb 7, 2022 (95bf960)

Commits on Feb 9, 2022

  1. Adds extract_fields decorator for operating over dicts

    The API to use it looks like this:
    
    ```python
    @function_modifiers.extract_fields(
        {'X_train': np.ndarray, 'X_test': np.ndarray, 'y_train': np.ndarray, 'y_test': np.ndarray})
    def train_test_split_func( ... , ... ) -> dict:
        ...
    ```
    
    I decided to go with a straight dict of `field_name` to `field_type` because that seemed
    like the simplest thing to define. Note: we use the documentation of the original function,
    rather than enabling individual docstrings for the types. I think this suffices for now.
    
    To support TypedDict, I didn't want to have to import typing_extensions to handle
    it. Also, you can't define a TypedDict class inline, so it would be more verbose, which
    is less than ideal.
    
    We can always add TypedDict support later. I also punted on `Tuple` support -- that
    might be another decorator...
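    
    For context, here is a minimal sketch of how this could look end to end. The data-splitting
    logic and the downstream `y_train_mean` function are illustrative assumptions, not code from
    this commit:
    
    ```python
    import numpy as np

    from hamilton import function_modifiers


    @function_modifiers.extract_fields(
        {'X_train': np.ndarray, 'X_test': np.ndarray, 'y_train': np.ndarray, 'y_test': np.ndarray})
    def train_test_split_func(features: np.ndarray, labels: np.ndarray, test_size: float = 0.25) -> dict:
        """Each key of the returned dict becomes its own node in the dataflow."""
        cutoff = int(len(features) * (1 - test_size))
        return {
            'X_train': features[:cutoff],
            'X_test': features[cutoff:],
            'y_train': labels[:cutoff],
            'y_test': labels[cutoff:],
        }


    def y_train_mean(y_train: np.ndarray) -> float:
        """Downstream functions can then depend on an extracted field by name."""
        return float(y_train.mean())
    ```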
    skrawcz committed Feb 9, 2022 (1f4f6de)
  2. Refactors model example to show how to be generic

    This example is interesting because it shows how one might
    build a "bank" of Hamilton functions that do some generic
    modeling, while keeping things generic enough that adding new contexts
    or running with new models requires only a small amount of work.
    
    The key things to get this to work are:
    
     - Different Python modules to load data. They have to output what's required to link
       with the my_train_evaluate_logic functions.
     - Config & @config.when to add the correct model function to the dataflow.
    
    If you want to switch between model types: easy -- change the config.
    If you want to fit models on different data: easy -- change the data-loading module.
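    
    To make that concrete, here is a small sketch of the @config.when piece. The model choices
    and function names are assumptions for illustration, not necessarily what the example uses:
    
    ```python
    from sklearn import base, linear_model, svm

    from hamilton.function_modifiers import config


    @config.when(clf='logistic')
    def prefit_clf__logistic(penalty: str = 'l2') -> base.ClassifierMixin:
        """Included in the dataflow when the driver config says clf='logistic'."""
        return linear_model.LogisticRegression(penalty=penalty)


    @config.when(clf='svm')
    def prefit_clf__svm(gamma: str = 'scale') -> base.ClassifierMixin:
        """Included when clf='svm'; the __svm suffix is stripped, so both
        variants resolve to the same node name, prefit_clf."""
        return svm.SVC(gamma=gamma)
    ```
    
    Switching model types is then a one-line change to the config passed to the driver, and
    switching datasets is a matter of passing in a different data-loading module.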
    skrawcz committed Feb 9, 2022 (bb896ad)
  3. Adds unit tests for extract_fields decorator

    Helps prove things work as intended!
    skrawcz committed Feb 9, 2022 (f5dfc77)
  4. Fixes graphviz test

    I wonder if this test is flaky somehow? Anyway, adding this to see if CircleCI complains
    or not. There could be a version mismatch somewhere that causes this,
    i.e. my local env versus what CircleCI installs, etc.
    skrawcz committed Feb 9, 2022 (b359b3b)