@codeloop codeloop commented Sep 3, 2025

Improve auto-select logic and handle missing data

This commit introduces the following changes to the forecasting operator.

Improved AUTO_SELECT_SERIES Logic:

  • A fallback to the AUTO_SELECT model has been implemented for cases where AUTO_SELECT_SERIES is used without specifying target_category_columns.
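
The fallback described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the operator's actual code: the `Spec` dataclass, the `resolve_model` helper, and the string value assumed for `AUTO_SELECT` are hypothetical stand-ins introduced here for clarity.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed constant values; the real ones live in the operator's constants module.
AUTO_SELECT = "auto-select"
AUTO_SELECT_SERIES = "auto-select-series"


@dataclass
class Spec:
    """Hypothetical stand-in for the operator's spec object."""
    model: str
    target_category_columns: Optional[List[str]] = None


def resolve_model(spec: Spec) -> str:
    """Return the model name to build.

    AUTO_SELECT_SERIES needs series identifiers to pick a model per series,
    so when no target_category_columns are given it falls back to AUTO_SELECT.
    """
    if spec.model == AUTO_SELECT_SERIES and not spec.target_category_columns:
        return AUTO_SELECT
    return spec.model


# Usage: no category columns triggers the fallback, otherwise the model is kept.
fallback = resolve_model(Spec(model=AUTO_SELECT_SERIES))
kept = resolve_model(Spec(model=AUTO_SELECT_SERIES, target_category_columns=["store"]))
```

Treating both `None` and an empty list as "not specified" is a design choice of this sketch; the PR may handle these cases differently.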

Missing Data Handling:

  • The build_fforms_meta_features function now fills missing values in the target column with zeros. This prevents errors during meta-feature calculation when the data contains NaNs.
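
To make the effect of zero-filling concrete, here is a small library-free sketch that contrasts it with backfilling on a plain list. The helper names `fill_zero` and `backfill` are hypothetical; the PR itself calls `fillna(0)` on the DataFrame column, and this sketch only mirrors that behaviour for illustration.

```python
import math
from typing import List


def fill_zero(values: List[float]) -> List[float]:
    """Replace every NaN with 0.0 (the strategy used in this PR)."""
    return [0.0 if math.isnan(v) else v for v in values]


def backfill(values: List[float]) -> List[float]:
    """Replace each NaN with the next valid value; trailing NaNs stay NaN."""
    out = list(values)
    next_valid = math.nan
    # Walk from the end so each gap can take the value that follows it.
    for i in range(len(out) - 1, -1, -1):
        if math.isnan(out[i]):
            out[i] = next_valid
        else:
            next_valid = out[i]
    return out


series = [1.0, math.nan, 3.0, math.nan]
zero_filled = fill_zero(series)  # no NaNs remain
back_filled = backfill(series)   # the trailing NaN has nothing to copy from
```

Zero-filling always removes every NaN, which is what the meta-feature calculation needs, but it can distort statistics such as the mean or trend of the series. Backfilling preserves the local level but can leave trailing gaps unfilled.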

New Test Case:

  • A new test case has been added to verify that the auto-select-series model functions correctly with datasets containing missing values.

@oracle-contributor-agreement oracle-contributor-agreement bot added the "OCA Required" label (At least one contributor does not have an approved Oracle Contributor Agreement.) Sep 3, 2025
@codeloop codeloop force-pushed the vikaspa/fix-auto-select branch from 250585d to a57b872 September 3, 2025 15:49
@oracle-contributor-agreement oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) and removed the "OCA Required" label (At least one contributor does not have an approved Oracle Contributor Agreement.) Sep 3, 2025
@codeloop codeloop changed the title from "improve auto-select logic and handle missing data" to "[ODSC - 76829/76830] : improve auto-select logic and handle missing data" Sep 4, 2025
@codeloop codeloop marked this pull request as ready for review September 4, 2025 15:33
@codeloop codeloop requested a review from prasankh September 4, 2025 15:33
@codeloop codeloop enabled auto-merge September 5, 2025 05:23
@codeloop codeloop changed the title from "[ODSC - 76829/76830] : improve auto-select logic and handle missing data" to "[ODSC-76829/76830] : improve auto-select logic and handle missing data" Sep 5, 2025

@ahosler ahosler left a comment

A couple of comments. This just needs a bit of polish.

if target_col not in data.columns:
    raise ValueError(f"Target column '{target_col}' not found in DataFrame")

data[target_col] = data[target_col].fillna(0)

Why fillna with 0? Why not backfill? Did we discuss this?
Isn't this already covered in the pre-processing steps? What do we gain from doing it here?

):

operator_config.spec.model = AUTO_SELECT
model = ForecastOperatorModelFactory.get_model(operator_config, datasets)

nice!

Can we reflect this in the report? Make sure it's still saying "auto-select-series".

Can we add a unit test for this?
