Rename target_dataframe_name parameter to target_dataframe_index #353
gsheni commented on Nov 24, 2022
- Closes #348 (Rename target_dataframe_name parameter to target_dataframe_index)
Codecov Report
@@            Coverage Diff            @@
##              main     #353   +/-   ##
=========================================
  Coverage   100.00%  100.00%
=========================================
  Files           29       29
  Lines         1339     1339
=========================================
  Hits          1339     1339
metadata = label_times.settings
target_column = metadata["label_times"]["target_columns"][0]
target_type = metadata["label_times"]["target_types"][target_column]
target_dataframe_name = metadata["label_times"]["target_dataframe_name"]
I am unsure what this does and whether we should remove it.
When I look at the LabelTimes class definition, our terminology seems pretty confusing. In one sense, target_column_name refers to the column we use as the "target" for creating labels (for example, I want to create labels for my customers using customer_id).
Then target_column here seems to refer to the machine learning target column (the label column, for example total_spent for each customer).
I'm not sure how best to resolve this off the top of my head, but we should probably use "target" in only one of these two senses.
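To make the two senses concrete, here is a minimal sketch in plain Python. It is not composeml code: the transaction records, the `make_labels` helper, and its parameter names are all hypothetical and only illustrate the distinction between the column that identifies who gets a label (customer_id) and the column that holds the label value (total_spent).

```python
from collections import defaultdict

# Hypothetical transaction records; the column names are illustrative only.
transactions = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 2, "amount": 5.0},
    {"customer_id": 1, "amount": 2.5},
]


def make_labels(rows, instance_column, value_column, label_name):
    """Sum value_column for each distinct value of instance_column.

    instance_column: the column identifying what we create labels for.
    label_name: the name of the resulting label (machine learning target) column.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row[instance_column]] += row[value_column]
    # One output row per instance, ordered by the instance identifier.
    return [
        {instance_column: key, label_name: total}
        for key, total in sorted(totals.items())
    ]


labels = make_labels(transactions, "customer_id", "amount", "total_spent")
```

In this sketch, customer_id plays the first role ("target" as the thing being labeled) and total_spent plays the second ("target" as the value a model would predict), which is why reusing one word for both is easy to misread.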
@jeff-hernandez Any idea what to do here?
I think the intent is to use the same terminology as featuretools. For example, in the past, we used target_entity to specify which entity to create features for. Similarly, the idea was to use target_entity to specify which entity to create labels for. The reason we put it in the metadata is so that we have the settings needed to reconstruct the label times if we want to re-create the labels.
Maybe using target_column_index instead of target_column_name (or target_dataframe_index) would help clarify.
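If the key stored in the settings is renamed as well, previously saved settings would still carry the old key. One possible way to handle that when reading the metadata is sketched below. The `read_target_settings` helper and the fallback behavior are assumptions for illustration, not part of this PR or of the library; only the dictionary layout is taken from the snippet under review.

```python
def read_target_settings(settings):
    """Read the target fields from a settings dictionary.

    Hypothetical helper: prefers the new "target_dataframe_index" key and
    falls back to the old "target_dataframe_name" key if the new one is absent.
    """
    label_times = settings["label_times"]
    target_column = label_times["target_columns"][0]
    target_type = label_times["target_types"][target_column]
    if "target_dataframe_index" in label_times:
        target_dataframe_index = label_times["target_dataframe_index"]
    else:
        # Assumption: settings saved before the rename use the old key.
        target_dataframe_index = label_times["target_dataframe_name"]
    return target_column, target_type, target_dataframe_index


# Example settings written with the old key name (values are made up).
old_settings = {
    "label_times": {
        "target_columns": ["total_spent"],
        "target_types": {"total_spent": "continuous"},
        "target_dataframe_name": "customer_id",
    }
}

result = read_target_settings(old_settings)
```

Whether such a fallback is worth keeping depends on how long old settings files need to remain loadable; dropping it is simpler if no saved settings predate the rename.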
jeff-hernandez
left a comment
Overall looks good, just had a couple of comments.