Updates for running home credit example with Dask (#953)
* home credit tests

* update home credit test

* Improve column assignment for trans features

* update home credit test

* testing updates

* home credit test updates

* home credit notebook update

* home credit test updates

* home credit test updates

* home credit test updates

* home credit test updates

* home credit updates

* update feature_set_calculator.py

* remove unnecessary repartitioning from notebook

* update notebook text

* update notebook to use os.path.join
thehomebrewnerd committed May 12, 2020
1 parent e7de404 commit 8a30486
Showing 2 changed files with 614 additions and 4 deletions.
7 changes: 3 additions & 4 deletions featuretools/computational_backends/feature_set_calculator.py
@@ -732,7 +732,8 @@ def last_n(df):
 # the column (in actuality grouping by both index and group would
 # work)
 if isinstance(base_frame, dd.core.DataFrame):
-    to_merge = base_frame.groupby(base_frame[groupby_var]).agg(to_agg)
+    to_merge = base_frame.groupby(groupby_var).agg(to_agg)
+
 else:
     to_merge = base_frame.groupby(base_frame[groupby_var],
                                   observed=True, sort=False).agg(to_agg)
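
The hunk above switches the Dask branch from grouping by a column Series to grouping by the column name. A minimal sketch of why the two forms can be interchangeable, shown in plain pandas for brevity (Dask's DataFrame API is modeled on pandas, but Dask-specific behavior such as partitioning is not exercised here). The table, column names, and aggregation spec are hypothetical and only illustrate the pattern:

```python
import pandas as pd

# Hypothetical child table; the names below are illustrative only and
# are not taken from the featuretools code base.
base_frame = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "amount": [10.0, 20.0, 5.0, 1.0, 2.0, 3.0],
})
groupby_var = "customer_id"
to_agg = {"amount": ["sum", "mean"]}

# Form removed by the hunk: group by the column passed as a Series.
by_series = base_frame.groupby(base_frame[groupby_var]).agg(to_agg)

# Form added by the hunk: group by the column name.
by_name = base_frame.groupby(groupby_var).agg(to_agg)

# In pandas, both forms give the same aggregated frame for this input.
same_result = by_series.equals(by_name)
print(same_result)
print(by_name)
```

In both forms the result is indexed by the grouping column, so `to_merge.index.name` equals `groupby_var` in this sketch.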
@@ -747,9 +748,7 @@ def last_n(df):
to_merge.index = to_merge.index.astype(object).astype(categories)

 if isinstance(frame, dd.core.DataFrame):
-    child_merge_var = to_merge.index.name
-    frame = dd.merge(left=frame, right=to_merge.reset_index(),
-                     left_on=parent_merge_var, right_on=child_merge_var, how='left')
+    frame = frame.merge(to_merge, left_on=parent_merge_var, right_index=True, how='left')
 else:
     frame = pd.merge(left=frame, right=to_merge,
                      left_index=True, right_index=True, how='left')
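
The second hunk replaces a `reset_index()` plus `left_on`/`right_on` merge with a merge on the right frame's index. The sketch below compares the two call shapes in plain pandas, as an assumption-laden illustration rather than a reproduction of the featuretools code path: the frames, the column names, and the `child_customer_id` index name are hypothetical, and Dask-specific behavior is not shown.

```python
import pandas as pd

# Hypothetical parent frame and aggregated child frame (illustrative names).
frame = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "region": ["a", "b", "a", "c"],
})
to_merge = pd.DataFrame(
    {"SUM(amount)": [30.0, 5.0, 6.0]},
    index=pd.Index([1, 2, 3], name="child_customer_id"),
)
parent_merge_var = "customer_id"

# Shape of the removed call: move the index into a column, then join
# column to column.
child_merge_var = to_merge.index.name
old_style = pd.merge(left=frame, right=to_merge.reset_index(),
                     left_on=parent_merge_var, right_on=child_merge_var,
                     how="left")

# Shape of the added call: join the parent column directly against the
# index of the aggregated frame.
new_style = frame.merge(to_merge, left_on=parent_merge_var,
                        right_index=True, how="left")

print(list(old_style.columns))
print(list(new_style.columns))
```

In this sketch both calls attach the same aggregated values to each parent row, with a missing value for the parent that has no children. The column-to-column form also carries the child key along as an extra column when its name differs from the parent key, while the index-based form does not.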