Skip to content

Rename target_dataframe_name parameter to target_dataframe_index#353

Merged
gsheni merged 26 commits intomainfrom
var_rename
Dec 12, 2022
Merged

Rename target_dataframe_name parameter to target_dataframe_index#353
gsheni merged 26 commits intomainfrom
var_rename

Conversation

@gsheni
Copy link
Copy Markdown
Contributor

@gsheni gsheni commented Nov 24, 2022

@gsheni gsheni self-assigned this Nov 24, 2022
@gsheni gsheni marked this pull request as ready for review November 24, 2022 19:36
@gsheni gsheni requested a review from a team November 24, 2022 19:55
@codecov
Copy link
Copy Markdown

codecov bot commented Nov 24, 2022

Codecov Report

Merging #353 (ae874cb) into main (0b476f1) will not change coverage.
The diff coverage is 100.00%.

@@            Coverage Diff            @@
##              main      #353   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           29        29           
  Lines         1339      1339           
=========================================
  Hits          1339      1339           
Impacted Files Coverage Δ
composeml/conftest.py 100.00% <ø> (ø)
composeml/tests/test_data_slice/test_extension.py 100.00% <ø> (ø)
composeml/tests/test_featuretools.py 100.00% <ø> (ø)
composeml/tests/test_label_maker.py 100.00% <ø> (ø)
composeml/tests/test_label_serialization.py 100.00% <ø> (ø)
composeml/tests/test_label_times.py 100.00% <ø> (ø)
composeml/label_maker.py 100.00% <100.00%> (ø)
composeml/label_times/description.py 100.00% <100.00%> (ø)
composeml/label_times/object.py 100.00% <100.00%> (ø)
...mposeml/tests/test_label_transforms/test_sample.py 100.00% <100.00%> (ø)

metadata = label_times.settings
target_column = metadata["label_times"]["target_columns"][0]
target_type = metadata["label_times"]["target_types"][target_column]
target_dataframe_name = metadata["label_times"]["target_dataframe_name"]
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unsure about what this does and if we should remove this or not.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I look at the LabelTimes class definition, it seems like we have some pretty confusing terminology. I think in one sense target_column_name refers to the column we want to use as the "target" for creating labels (as in, I want to create labels for my customers using customer_id).

Then target_column here seems to refer to the column that is the machine learning target column (the label column, as in, total_spent for each customer).

I'm not sure how best to resolve this off the top of my head, but we should probably only use target in one case.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeff-hernandez Any idea what to do here?

Copy link
Copy Markdown
Contributor

@jeff-hernandez jeff-hernandez Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intent is to use the same terminology as featuretools. For example, in the past, we used target_entity to specify which entity to create features for. Similarly, the idea was the use target_entity to specify which entity to create labels for. The reason we put it in the metadata is so that we have the settings for reconstructing the label times if we want to re-create the labels.

Copy link
Copy Markdown
Contributor

@jeff-hernandez jeff-hernandez Dec 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe using target_column_index instead of target_column_name can help clarify (or target_dataframe_index)

Comment thread docs/source/release_notes.rst
Gaurav Sheni added 2 commits December 12, 2022 14:02
@gsheni gsheni changed the title Rename target_dataframe_name parameter to target_column_name Rename target_dataframe_name parameter to target_dataframe_index Dec 12, 2022
Comment thread composeml/label_maker.py Outdated
jeff-hernandez
jeff-hernandez previously approved these changes Dec 12, 2022
Copy link
Copy Markdown
Contributor

@jeff-hernandez jeff-hernandez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just had a couple of comments.

Comment thread composeml/label_times/description.py
@gsheni gsheni enabled auto-merge (squash) December 12, 2022 20:26
@gsheni gsheni disabled auto-merge December 12, 2022 20:45
@jeff-hernandez jeff-hernandez self-requested a review December 12, 2022 20:49
@gsheni gsheni merged commit c559039 into main Dec 12, 2022
@gsheni gsheni mentioned this pull request Jan 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Rename target_dataframe_name parameter to target_dataframe_index

3 participants