Contributor

@shchur commented Aug 12, 2025

Issue #, if available:

Description of changes:

  • Previously, categorical dynamic features were kept as object dtype, which broke GluonTS models that accept feat_dynamic_real / past_feat_dynamic_real and attempt to convert them to float32 inside the transform. Now categorical features are encoded as integers (using ordinal encoding).
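
The failure mode described above can be reproduced on toy data. This is a minimal sketch, not the GluonTS transform itself: the `weather` series and its values are made up for illustration, and the cast is done with plain NumPy.

```python
import numpy as np
import pandas as pd

# Toy dynamic feature holding string labels (illustrative data, not from the PR).
weather = pd.Series(["sunny", "rainy", "sunny", "cloudy"], dtype="object")

# Casting string labels to float32 fails, which mirrors the breakage described above.
try:
    np.asarray(weather, dtype=np.float32)
    cast_failed = False
except ValueError:
    cast_failed = True

# Ordinal encoding: each category is replaced by its integer code
# (categories are sorted, so cloudy=0, rainy=1, sunny=2).
codes = weather.astype("category").cat.codes

# Integer codes convert to float32 without error.
as_float = np.asarray(codes, dtype=np.float32)
```

Note that the resulting numbers impose an arbitrary order on the categories, which a downstream model may treat as meaningful.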

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@shchur requested a review from abdulfatir August 12, 2025 08:15
df = df.astype(astype_dict)
if category_as_ordinal:
    cat_cols = [col for col in df.select_dtypes(include="category").columns if col != id_column]
    df = df.assign(**{col: df[col].cat.codes for col in cat_cols})
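
For readers who want to try the snippet in isolation, here is a self-contained sketch of the same two lines wrapped in a helper. The function name `ordinal_encode_categoricals` and the example frame are hypothetical and do not appear in the PR. One pandas behavior worth knowing: `cat.codes` maps missing values to `-1`.

```python
import pandas as pd


def ordinal_encode_categoricals(df: pd.DataFrame, id_column: str) -> pd.DataFrame:
    """Replace every category-dtype column except `id_column` with its integer codes."""
    cat_cols = [col for col in df.select_dtypes(include="category").columns if col != id_column]
    return df.assign(**{col: df[col].cat.codes for col in cat_cols})


# Illustrative data (assumed, not taken from the PR).
df = pd.DataFrame(
    {
        "item_id": pd.Categorical(["A", "A", "B"]),
        "promo": pd.Categorical(["none", None, "discount"]),
        "price": [1.0, 2.0, 3.0],
    }
)
encoded = ordinal_encode_categoricals(df, id_column="item_id")

# "promo" now holds integer codes; the missing entry becomes -1.
# "item_id" is left untouched because it is the ID column.
```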
Contributor Author

Possible alternatives:

  • automatically one-hot-encode categorical columns
  • drop categorical columns

Ideally, this should be a configurable option, but currently the fev.convert_input_data method does not allow routing kwargs to the individual adapters. @abdulfatir what do you think?
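
The two alternatives listed above could look roughly like this in pandas. This is a sketch on made-up data, not code from the PR; the column names are assumptions.

```python
import pandas as pd

# Illustrative data (assumed, not taken from the PR).
df = pd.DataFrame(
    {
        "item_id": ["A", "A", "B"],
        "promo": pd.Categorical(["none", "discount", "none"]),
        "price": [1.0, 2.0, 3.0],
    }
)
cat_cols = list(df.select_dtypes(include="category").columns)

# Alternative 1: one-hot encode each categorical column into float indicator columns.
one_hot = pd.get_dummies(df, columns=cat_cols, dtype="float32")

# Alternative 2: drop the categorical columns entirely.
dropped = df.drop(columns=cat_cols)
```

One-hot encoding avoids imposing an order on the categories but adds one column per category, which changes the number of dynamic features the model sees.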

Contributor

Just discussed target encoding as a good option with @abdulfatir. Why not also use it here?
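
For context, target encoding replaces each category with a statistic of the target, typically the mean, computed over rows in that category. A minimal sketch on made-up data follows; the helper name `target_encode` and the column names are hypothetical, and in a forecasting setting the means would presumably need to be computed on the historical window only to avoid leaking future targets.

```python
import pandas as pd


def target_encode(df: pd.DataFrame, cat_column: str, target_column: str) -> pd.DataFrame:
    """Replace each category with the mean target observed for that category."""
    category_means = df.groupby(cat_column, observed=True)[target_column].transform("mean")
    return df.assign(**{cat_column: category_means.astype("float32")})


# Illustrative data (assumed, not taken from the PR).
df = pd.DataFrame(
    {
        "promo": pd.Categorical(["none", "none", "discount", "discount"]),
        "target": [1.0, 3.0, 10.0, 20.0],
    }
)
encoded = target_encode(df, "promo", "target")
```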

Contributor Author

My initial idea was that adapters perform the bare minimum preprocessing such that the data can be consumed by the respective frameworks, but I agree that we can also incorporate the best practices here.

If we go for target encoding, we should probably enable/disable it via an optional argument to the GluonTSAdapter. Currently these are not supported since fev.convert_input_data does not forward kwargs to the adapters.

How about we

  1. Merge this (or some other simple strategy) as a simple default that unbreaks GluonTS models with covariates
  2. Add a better strategy after the Task refactor with an optional argument to the GluonTSAdapter?
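
To illustrate what an optional adapter argument with forwarded kwargs might look like, here is a purely hypothetical sketch. None of these names (`SketchGluonTSAdapter`, `build_adapter`, `categorical_encoding`) are part of fev; they only show the general pattern of routing keyword arguments to an adapter constructor.

```python
from typing import Any

# Hypothetical stand-ins: these are NOT the fev API, only an illustration.
ALLOWED_ENCODINGS = {"ordinal", "target", "drop"}


class SketchGluonTSAdapter:
    def __init__(self, categorical_encoding: str = "ordinal") -> None:
        if categorical_encoding not in ALLOWED_ENCODINGS:
            raise ValueError(f"Unknown categorical_encoding: {categorical_encoding!r}")
        self.categorical_encoding = categorical_encoding


ADAPTERS = {"gluonts": SketchGluonTSAdapter}


def build_adapter(name: str, **adapter_kwargs: Any) -> SketchGluonTSAdapter:
    """Look up the adapter class and forward any keyword arguments to it."""
    return ADAPTERS[name](**adapter_kwargs)


# The simple default stays in place when no argument is passed ...
default_adapter = build_adapter("gluonts")
# ... while callers can opt into another strategy explicitly.
custom_adapter = build_adapter("gluonts", categorical_encoding="target")
```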

Collaborator

I would vote for putting as little model-related stuff here as possible. If the user wants to do other types of encodings, they should do this on the model side.

@shchur merged commit 9a835d0 into main Aug 19, 2025
@shchur deleted the ordinal-encode-cat-features-gluonts branch August 19, 2025 06:52