Fix converters #346

Merged
cristian-tamblay merged 8 commits into develop from fix/converters on Oct 23, 2025

@Irozuku (Collaborator) commented Oct 22, 2025

This pull request refactors several scikit-learn and imbalanced-learn converter classes in the backend to simplify their initialization and schema definitions, especially regarding the handling of missing values and feature selection tags. The main themes are the removal of redundant code for missing value casting, schema simplification, and improved handling of internal parameters for feature selection and sampling converters.

Refactoring and simplification of missing value handling:

  • Removed the cast_string_to_type utility and related code from the KNNImputer, SimpleImputer, and MissingIndicator converters, resulting in cleaner __init__ methods and schemas. This includes dropping the missing_values field from their schemas and removing unnecessary imports.

Feature selection converters improvements:

  • Added explicit __init__ methods to the SelectFdr, SelectFpr, and SelectFwe converters to ensure proper initialization.
  • Updated the SelectKBest and SelectPercentile converters to dynamically set the "requires_y" tag in their __init__ methods, ensuring compatibility with scikit-learn's requirements for target variables.
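
As a rough illustration of the "requires_y" change, the sketch below sets the tag per instance inside __init__ and checks it before fitting. The class names and the `tags` dictionary are hypothetical stand-ins, not DashAI's actual base classes, and the PR description does not say how the tag is stored.

```python
from typing import Any, Dict, Optional, Sequence


class BaseConverterSketch:
    """Hypothetical stand-in for a converter base class (not DashAI's real one)."""

    def __init__(self, **params: Any) -> None:
        self.params: Dict[str, Any] = dict(params)
        # By default a converter is assumed not to need a target vector.
        self.tags: Dict[str, bool] = {"requires_y": False}

    def fit(self, X: Sequence[Any], y: Optional[Sequence[Any]] = None) -> "BaseConverterSketch":
        # Refuse to fit when the instance declares that it needs y and none is given.
        if self.tags["requires_y"] and y is None:
            raise ValueError(f"{type(self).__name__} requires a target vector y")
        return self


class SelectKBestSketch(BaseConverterSketch):
    """Hypothetical converter that marks itself as supervised at construction time."""

    def __init__(self, k: int = 10, **params: Any) -> None:
        super().__init__(k=k, **params)
        # Univariate score functions compare each feature with the target,
        # so the tag is set on the instance inside __init__.
        self.tags["requires_y"] = True
```

With this layout, `SelectKBestSketch(k=3).fit(X)` raises a `ValueError`, while `fit(X, y)` succeeds, which is the kind of early check the tag makes possible.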

Other parameter and schema adjustments:

  • Changed the default value of n_components from 100 to 2 in the NystroemSchema, reducing the default number of constructed features.
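
To show what the smaller default means in practice, here is a minimal sketch using scikit-learn's `Nystroem` directly rather than the DashAI converter. The helper name `nystroem_output_width` and the random input data are assumptions made for illustration only.

```python
def nystroem_output_width(n_components: int = 2, n_samples: int = 50) -> int:
    """Return the number of columns Nystroem produces for a small random matrix."""
    # Imported lazily so the helper can be defined without scikit-learn installed.
    import numpy as np
    from sklearn.kernel_approximation import Nystroem

    # Illustrative data: n_samples rows with 4 input features each.
    X = np.random.default_rng(0).normal(size=(n_samples, 4))
    transformed = Nystroem(n_components=n_components, random_state=0).fit_transform(X)
    # The transformed matrix has one column per constructed feature.
    return transformed.shape[1]
```

The output width follows `n_components`, so a default of 2 yields two constructed features per row instead of 100.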

SMOTEENN converter fix:

  • Modified the SMOTEENNConverter to explicitly create and pass a SMOTE instance as part of its initialization.
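
A minimal sketch of the same idea with imbalanced-learn's public API, where `SMOTEENN` accepts a `smote` argument. The helper name `build_smoteenn` and the parameters it exposes are assumptions. The actual SMOTEENNConverter may forward a different set of parameters.

```python
from typing import Any, Optional


def build_smoteenn(k_neighbors: int = 5, random_state: Optional[int] = None) -> Any:
    """Build a SMOTEENN sampler with an explicitly constructed SMOTE step."""
    # Imported lazily so the helper can be defined without imbalanced-learn installed.
    from imblearn.combine import SMOTEENN
    from imblearn.over_sampling import SMOTE

    # Create the SMOTE step up front instead of leaving `smote=None`,
    # so its parameters are controlled by the caller.
    smote = SMOTE(k_neighbors=k_neighbors, random_state=random_state)
    return SMOTEENN(smote=smote, random_state=random_state)
```

Passing the instance explicitly keeps the oversampling parameters visible at the call site rather than relying on the default SMOTE that is used when `smote` is left unset.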

Base automatically changed from feat/metadata-explorer-converter to develop on October 23, 2025 13:44
@Irozuku marked this pull request as ready for review on October 23, 2025 13:54
Copilot AI (Contributor) left a comment

Pull Request Overview

This PR refactors scikit-learn and imbalanced-learn converter classes to simplify initialization logic and schema definitions. The main focus is removing redundant missing value handling code and improving feature selection converters.

  • Removes cast_string_to_type utility usage and missing_values schema fields from imputer converters
  • Adds explicit __init__ methods to feature selection converters to set requires_y tag
  • Updates SMOTEENN converter to explicitly instantiate its SMOTE component

Reviewed Changes

Copilot reviewed 10 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| DashAI/back/converters/scikit_learn/simple_imputer.py | Removes missing_values schema field and related initialization logic |
| DashAI/back/converters/scikit_learn/knn_imputer.py | Removes missing_values schema field and simplifies imports |
| DashAI/back/converters/scikit_learn/missing_indicator.py | Replaces entire schema with empty pass statement, removes all custom initialization |
| DashAI/back/converters/scikit_learn/select_k_best.py | Adds __init__ to dynamically set requires_y tag |
| DashAI/back/converters/scikit_learn/select_percentile.py | Adds __init__ to dynamically set requires_y tag |
| DashAI/back/converters/scikit_learn/select_fdr.py | Adds minimal __init__ calling super |
| DashAI/back/converters/scikit_learn/select_fpr.py | Adds minimal __init__ calling super |
| DashAI/back/converters/scikit_learn/select_fwe.py | Adds minimal __init__ calling super |
| DashAI/back/converters/scikit_learn/nystroem.py | Changes n_components default from 100 to 2 |
| DashAI/back/converters/imbalanced_learn/smoteenn_converter.py | Explicitly creates SMOTE instance with parameters |

Comment thread DashAI/back/converters/scikit_learn/select_k_best.py
Comment thread DashAI/back/converters/scikit_learn/select_percentile.py
Comment thread DashAI/back/converters/scikit_learn/nystroem.py
Comment thread DashAI/back/converters/imbalanced_learn/smoteenn_converter.py
@cristian-tamblay merged commit 1073140 into develop on Oct 23, 2025
18 checks passed
@cristian-tamblay deleted the fix/converters branch on October 23, 2025 19:10

3 participants