DateTimeFeaturizer should return integer-valued features #1477

Closed
freddyaboulton opened this issue Nov 30, 2020 · 0 comments · Fixed by #1479

freddyaboulton commented Nov 30, 2020

The DateTimeFeaturizer returns string-valued features for month and day of week. These features are useful, but the string values require an encoder downstream in the pipeline. The DateTimeFeaturizer should handle the encoding itself and provide a mapping so users know how each value was encoded.

Proposed output:

image

New API method:

dt = DateTimeFeaturizer()
dt.fit(X, y)
dt.get_feature_names()
{"Date_month": ["January", ..., "December"],
 "Date_day_of_week": ["Monday", ..., "Sunday"]
}
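
A minimal sketch of the idea, not the library's implementation: the snippet below uses only the Python standard library instead of the pipeline's dataframes. The helper name `encode_dates` is hypothetical, and it assumes each integer code is the position of the name in the mapping list (January = 0, Monday = 0), which the issue does not specify.

```python
from datetime import date

# Lookup lists: the integer code of a value is its index in the list.
# Zero-based codes are an assumption; the issue does not state the base.
MONTHS = ["January", "February", "March", "April", "May", "June",
          "July", "August", "September", "October", "November", "December"]
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def encode_dates(dates, name="Date"):
    """Hypothetical helper: return integer-valued features and their mapping."""
    features = {
        # date.month is 1-based, so shift it to line up with MONTHS.
        f"{name}_month": [d.month - 1 for d in dates],
        # date.weekday() already returns Monday=0 ... Sunday=6.
        f"{name}_day_of_week": [d.weekday() for d in dates],
    }
    mapping = {
        f"{name}_month": MONTHS,
        f"{name}_day_of_week": DAYS,
    }
    return features, mapping


features, mapping = encode_dates([date(2020, 11, 30), date(2020, 12, 5)])

# Decode the first row back to readable names using the mapping.
first_month = mapping["Date_month"][features["Date_month"][0]]
first_day = mapping["Date_day_of_week"][features["Date_day_of_week"][0]]
print(features)
print(first_month, first_day)
```

Because the features are already integers, no separate encoder is needed downstream, and the mapping lets a user translate a code back to its name.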
freddyaboulton added the enhancement label (An improvement to an existing feature.) Nov 30, 2020
freddyaboulton added this to the December 2020 milestone Nov 30, 2020
freddyaboulton self-assigned this Nov 30, 2020
freddyaboulton changed the title from "DateTimeFeaturizer Should Return Integer-valued columns" to "DateTimeFeaturizer should return integer-valued features" Nov 30, 2020