Support for feature selection #15
Comments
Not supported at the moment. See jpmml/jpmml-sklearn#3. Very likely, this will never be supported in the context of "feature preparation and pre-processing". You should adopt the
JPMML-SkLearn supports
By JPMML-SkLearn conventions, if you are converting a supervised estimator object (eg. some classification or regression model), then If you are converting an unsupervised estimator object (eg. some clustering model), then all
Currently there is no feature selection support. So, you would need to perform feature selection "manually" in your SkLearn code, and then construct an appropriate However, built-in support for feature selection seems like a desirable functionality, so perhaps the
def sklearn2pmml(estimator, mapper, mapper_to_estimator_connection = None, pmml):
    pass
The responsibility of the In the beginning, it could be a single SkLearn feature selection transformer. Later on, if the concept is successfully validated, it could be a list of transformers or even a pipeline of transformers. |
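As a rough illustration of the "manual" route described above, the sketch below ranks columns by their absolute Pearson correlation with the target and keeps the top k names, which could then be the only columns listed when constructing the mapper. The scoring rule is a plain-Python stand-in, not what any SkLearn selector does internally, and the helper names (`pearson`, `select_top_k`) and the toy data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_top_k(columns, target, k):
    """Return the k column names with the highest |correlation| to target.

    `columns` maps a column name to its list of values (hypothetical layout).
    Names are returned in their original order, not in score order.
    """
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    keep = set(sorted(scores, key=scores.get, reverse=True)[:k])
    return [name for name in columns if name in keep]


# Toy data, made up for this sketch.
columns = {
    "age": [25, 32, 47, 51, 62],
    "noise": [3, 1, 4, 1, 5],
    "income": [20, 35, 50, 58, 70],
}
target = [0, 0, 1, 1, 1]

selected = select_top_k(columns, target, k=2)
print(selected)
```

The resulting list of names is what you would hard-code into the mapper definition, so that the converted PMML document only references the selected columns.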
What kind of feature selector classes do you need? Specifically, class |
Thank you very much for the quick response. |
I'm reopening this issue, because there needs to be support for feature selection in PMML conversion workflows. If the feature selection happens in transformed feature space (eg. after categorical columns have been expanded using |
JPMML-SkLearn version 1.2.0 added limited support for the By definition, the pipeline contains a list of zero or more transformation steps followed by a final estimator step. You cannot do arbitrary feature transformation work there (as it should be encapsulated into the
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import Pipeline
from sklearn2pmml import sklearn2pmml

pipeline = Pipeline([
    ("selector", SelectKBest(k = 5)),
    ("estimator", ...)
])
sklearn2pmml(pipeline, mapper, "Workflow.pmml") |
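To see which columns a selector step in such a pipeline actually passes on to the final estimator, the fitted selector can be inspected with `get_support()`. The following is a minimal sketch on synthetic data; the choice of `LogisticRegression`, the `f_classif` scoring function, `k = 2` and the data itself are assumptions made for illustration, and the PMML conversion call is left out.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Synthetic data: eight columns, of which only columns 0 and 3 drive the label.
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

# Same layout as above: a selector step followed by a final estimator step.
pipeline = Pipeline([
    ("selector", SelectKBest(f_classif, k=2)),
    ("estimator", LogisticRegression()),
])
pipeline.fit(X, y)

# Boolean mask over the input columns; True marks a column that was kept.
mask = pipeline.named_steps["selector"].get_support()
selected = [index for index, keep in enumerate(mask) if keep]
print(selected)
```

Only the columns marked in the mask reach the estimator, so these are the ones the final model depends on.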
Is it possible to directly use a regular sklearn pipeline with sklearn2pmml?
How can I specify the label encoding and the split into target / X for PMML?
How can I perform the selection of continuous / factor fields? Would I need to hard-code them?