Rename target_dataframe_name parameter to target_dataframe_index #353
gsheni commented Nov 24, 2022
- Closes Rename target_dataframe_name parameter to target_dataframe_index #348
Codecov Report
@@ Coverage Diff @@
## main #353 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 29 29
Lines 1339 1339
=========================================
Hits 1339 1339
@@ -20,11 +20,9 @@ def describe_label_times(label_times):
metadata = label_times.settings
target_column = metadata["label_times"]["target_columns"][0]
target_type = metadata["label_times"]["target_types"][target_column]
target_dataframe_name = metadata["label_times"]["target_dataframe_name"]
I am unsure about what this does and if we should remove this or not.
When I look at the LabelTimes class definition, it seems like we have some pretty confusing terminology. I think in one sense target_column_name refers to the column we want to use as the "target" for creating labels (as in, I want to create labels for my customers using customer_id).
Then target_column here seems to refer to the column that is the machine learning target column (the label column, as in, total_spent for each customer).
I'm not sure how best to resolve this off the top of my head, but we should probably only use target in one case.
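To make the two meanings of "target" concrete, here is a minimal plain-Python sketch. It does not use the library's API: make_labels, its parameters, and the sample rows are hypothetical names invented for illustration. The index column (customer_id) decides which rows are grouped together, while the label column (total_spent) is what the labeling function produces for each group.

```python
from collections import defaultdict

# Hypothetical sample data; not taken from the project.
transactions = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 1, "amount": 5.0},
    {"customer_id": 2, "amount": 7.5},
]


def total_spent(group):
    """Labeling function: the machine learning target for one customer."""
    return sum(row["amount"] for row in group)


def make_labels(rows, target_dataframe_index, labeling_function, label_name):
    """Group rows by the index column, then compute one label per group.

    `target_dataframe_index` names the column identifying who gets a label.
    `label_name` names the column holding the computed label value.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[target_dataframe_index]].append(row)
    return [
        {target_dataframe_index: key, label_name: labeling_function(group)}
        for key, group in groups.items()
    ]


labels = make_labels(transactions, "customer_id", total_spent, "total_spent")
```

In this sketch the two roles never share a name: "customer_id" is only ever the grouping index and "total_spent" is only ever the label, which is the separation the comment above is asking for.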
@jeff-hernandez Any idea what to do here?
I think the intent is to use the same terminology as featuretools. For example, in the past, we used target_entity to specify which entity to create features for. Similarly, the idea was to use target_entity to specify which entity to create labels for. The reason we put it in the metadata is so that we have the settings for reconstructing the label times if we want to re-create the labels.
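As a rough illustration of why the key lives in the metadata, the sketch below reads the index column back out of a settings dictionary shaped like the one in the diff above. The helper read_target_index and the fallback from the new key to the old key are assumptions made for this example, not something the project is confirmed to do; the sample values are also invented.

```python
def read_target_index(settings):
    """Return the column that labels were created for.

    Hypothetical helper: prefers the renamed key and falls back to the
    old one so that settings saved before the rename can still be read.
    """
    label_times = settings["label_times"]
    if "target_dataframe_index" in label_times:
        return label_times["target_dataframe_index"]
    return label_times["target_dataframe_name"]


# Settings saved under the old parameter name (illustrative values).
old_settings = {
    "label_times": {
        "target_columns": ["total_spent"],
        "target_types": {"total_spent": "continuous"},
        "target_dataframe_name": "customer_id",
    }
}

# The same settings saved under the new parameter name.
new_settings = {
    "label_times": {
        "target_columns": ["total_spent"],
        "target_types": {"total_spent": "continuous"},
        "target_dataframe_index": "customer_id",
    }
}

old_index = read_target_index(old_settings)
new_index = read_target_index(new_settings)
```

Whether a fallback like this is worth keeping depends on whether previously saved label times need to remain loadable after the rename.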
Maybe using target_column_index instead of target_column_name can help clarify (or target_dataframe_index).
Overall looks good, just had a couple of comments.