Fix converters #346
Merged
…nd select percentile
…ds and parameters
…or proper initialization
…requires_y' tag is set
…from schema fields
Pull Request Overview
This PR refactors scikit-learn and imbalanced-learn converter classes to simplify initialization logic and schema definitions. The main focus is removing redundant missing value handling code and improving feature selection converters.
- Removes `cast_string_to_type` utility usage and `missing_values` schema fields from imputer converters
- Adds explicit `__init__` methods to feature selection converters to set the `requires_y` tag
- Updates the `SMOTEENN` converter to explicitly instantiate its `SMOTE` component
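The `requires_y` change can be pictured with a minimal sketch. It does not use DashAI's actual classes: `BaseConverterSketch`, `SelectKBestSketch`, and the `tags` dictionary are hypothetical names chosen for illustration, and the real base converter may store and check tags differently. The point is only that the tag is set per instance inside `__init__`, after the parent initializer has run.

```python
from typing import Any, Dict, Optional, Sequence


class BaseConverterSketch:
    """Hypothetical stand-in for a converter base class (not DashAI's API)."""

    def __init__(self, **params: Any) -> None:
        self.params: Dict[str, Any] = params
        # Default: the converter can be fitted without a target column.
        self.tags: Dict[str, bool] = {"requires_y": False}

    def fit(self, X: Sequence[Any], y: Optional[Sequence[Any]] = None):
        # Refuse to fit when the instance declared that it needs a target.
        if self.tags["requires_y"] and y is None:
            raise ValueError(f"{type(self).__name__} needs a target (y).")
        self.fitted_ = True
        return self


class SelectKBestSketch(BaseConverterSketch):
    """Hypothetical feature selector that marks itself as needing y."""

    def __init__(self, k: int = 10, **params: Any) -> None:
        super().__init__(k=k, **params)
        # Set on the instance in __init__, after the parent has initialized tags.
        self.tags["requires_y"] = True


selector = SelectKBestSketch(k=2)
print(selector.tags["requires_y"])
```

Under this assumed design, calling `selector.fit(X)` without `y` fails early with a clear error instead of failing inside the underlying scoring function.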
Reviewed Changes
Copilot reviewed 10 out of 13 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| DashAI/back/converters/scikit_learn/simple_imputer.py | Removes missing_values schema field and related initialization logic |
| DashAI/back/converters/scikit_learn/knn_imputer.py | Removes missing_values schema field and simplifies imports |
| DashAI/back/converters/scikit_learn/missing_indicator.py | Replaces entire schema with empty pass statement, removes all custom initialization |
| DashAI/back/converters/scikit_learn/select_k_best.py | Adds `__init__` to dynamically set requires_y tag |
| DashAI/back/converters/scikit_learn/select_percentile.py | Adds `__init__` to dynamically set requires_y tag |
| DashAI/back/converters/scikit_learn/select_fdr.py | Adds minimal `__init__` calling super |
| DashAI/back/converters/scikit_learn/select_fpr.py | Adds minimal `__init__` calling super |
| DashAI/back/converters/scikit_learn/select_fwe.py | Adds minimal `__init__` calling super |
| DashAI/back/converters/scikit_learn/nystroem.py | Changes n_components default from 100 to 2 |
| DashAI/back/converters/imbalanced_learn/smoteenn_converter.py | Explicitly creates SMOTE instance with parameters |
cristian-tamblay
approved these changes
Oct 23, 2025
This pull request refactors several scikit-learn and imbalanced-learn converter classes in the backend to simplify their initialization and schema definitions, especially regarding the handling of missing values and feature selection tags. The main themes are the removal of redundant code for missing value casting, schema simplification, and improved handling of internal parameters for feature selection and sampling converters.
Refactoring and simplification of missing value handling:
- Removed the `cast_string_to_type` utility and related code from the `KNNImputer`, `SimpleImputer`, and `MissingIndicator` converters, resulting in cleaner `__init__` methods and schemas. This includes dropping the `missing_values` field from their schemas and removing unnecessary imports.

Feature selection converters improvements:
- Added `__init__` methods to the `SelectFdr`, `SelectFpr`, and `SelectFwe` converters to ensure proper initialization.
- Updated the `SelectKBest` and `SelectPercentile` converters to dynamically set the `"requires_y"` tag in their `__init__` methods, ensuring compatibility with scikit-learn's requirements for target variables.

Other parameter and schema adjustments:
- Changed the default value of `n_components` from 100 to 2 in the `NystroemSchema`, reducing the default number of constructed features.

SMOTEENN converter fix:
- Updated the `SMOTEENNConverter` to explicitly create and pass a `SMOTE` instance as part of its initialization.
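The SMOTEENN fix follows a common composition pattern: build the inner over-sampler explicitly and hand it to the combined sampler, rather than relying on the combined sampler's default. The sketch below is an illustration only, not the converter's actual code. `build_smoteenn` and `KwargsRecorder` are hypothetical names, and the sampler classes are injected so the example runs without imbalanced-learn installed; which parameters the real converter forwards is an assumption.

```python
from typing import Any, Callable, Optional


def build_smoteenn(
    combined_cls: Callable[..., Any],
    smote_cls: Callable[..., Any],
    *,
    sampling_strategy: str = "auto",
    k_neighbors: int = 5,
    random_state: Optional[int] = None,
) -> Any:
    """Create the SMOTE component first, then pass it to the combined sampler."""
    # The inner over-sampler receives its own parameters explicitly.
    smote = smote_cls(
        sampling_strategy=sampling_strategy,
        k_neighbors=k_neighbors,
        random_state=random_state,
    )
    # The combined sampler gets the ready-made instance via its `smote` argument.
    return combined_cls(
        sampling_strategy=sampling_strategy,
        random_state=random_state,
        smote=smote,
    )


class KwargsRecorder:
    """Stand-in that stores keyword arguments as attributes (illustration only)."""

    def __init__(self, **kwargs: Any) -> None:
        self.__dict__.update(kwargs)


sampler = build_smoteenn(KwargsRecorder, KwargsRecorder, k_neighbors=3, random_state=0)
print(sampler.smote.k_neighbors)
```

With imbalanced-learn available, the same helper could presumably be called as `build_smoteenn(SMOTEENN, SMOTE, k_neighbors=3)`, using `imblearn.combine.SMOTEENN` and `imblearn.over_sampling.SMOTE`, both of which accept the keyword arguments used here.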