Creating a primitive for Ballet feature engineering pipelines would allow these pipelines to be included in an MLPipeline.
Prototype (not generic):
```json
{
    "name": "predict_census_income.engineer_features",
    "contributors": [
        "Micah Smith <micahs@mit.edu>"
    ],
    "documentation": "",
    "description": "Applies the feature engineering pipeline from the predict_census_income project",
    "classifiers": {
        "type": "preprocessor",
        "subtype": "transformer"
    },
    "modalities": [],
    "primitive": "predict_census_income.api.make_feature_engineering_pipeline",
    "fit": {
        "method": "fit",
        "args": [
            {
                "name": "X",
                "type": "pandas.DataFrame"
            },
            {
                "name": "y",
                "type": "pandas.DataFrame"
            }
        ]
    },
    "produce": {
        "method": "transform",
        "args": [
            {
                "name": "X",
                "type": "pandas.DataFrame"
            }
        ],
        "output": [
            {
                "name": "X",
                "type": "pandas.DataFrame"
            }
        ]
    },
    "hyperparameters": {}
}
```
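Since the prototype above hard-codes one project, here is a rough sketch of how the same annotation might be generated for any Ballet project. The helper name `make_engineer_features_annotation` is hypothetical, and it assumes every project exposes `<package>.api.make_feature_engineering_pipeline` the way `predict_census_income` does, which this issue does not confirm.

```python
import json


def make_engineer_features_annotation(package, contributors):
    """Build an annotation dict shaped like the prototype above.

    Hypothetical helper. Assumption: the project exposes
    `<package>.api.make_feature_engineering_pipeline`.
    """
    dataframe = "pandas.DataFrame"
    return {
        "name": f"{package}.engineer_features",
        "contributors": list(contributors),
        "documentation": "",
        "description": (
            f"Applies the feature engineering pipeline from the {package} project"
        ),
        "classifiers": {"type": "preprocessor", "subtype": "transformer"},
        "modalities": [],
        "primitive": f"{package}.api.make_feature_engineering_pipeline",
        # fit receives both the features and the targets
        "fit": {
            "method": "fit",
            "args": [
                {"name": "X", "type": dataframe},
                {"name": "y", "type": dataframe},
            ],
        },
        # produce only transforms the features
        "produce": {
            "method": "transform",
            "args": [{"name": "X", "type": dataframe}],
            "output": [{"name": "X", "type": dataframe}],
        },
        "hyperparameters": {},
    }


annotation = make_engineer_features_annotation(
    "predict_census_income", ["Micah Smith <micahs@mit.edu>"]
)
print(json.dumps(annotation, indent=4))
```

Called with the `predict_census_income` package name, this reproduces the fields of the prototype; whether a single template fits every Ballet project is an open question here.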
Something like this gives the feature engineering pipeline access to the unencoded targets for supervised transformations:
```python
import mlblocks
from ballet import b
from sklearn.metrics import classification_report

# Load the training and validation splits as raw DataFrames
X_df, y_df = b.api.load_data()
X_df_te, y_df_te = b.api.load_data(input_dir='data/val')

# Encode the targets for the classifier; keep y_df unencoded
encoder = b.api.encoder
y = encoder.fit_transform(y_df)
y_te = encoder.transform(y_df_te)

pipeline = mlblocks.MLPipeline(
    primitives=[
        'predict_census_income.engineer_features',
        'sklearn.ensemble.RandomForestClassifier',
    ],
    # Feed the unencoded targets to the feature engineering step as its `y`
    input_names={
        'predict_census_income.engineer_features#1': {
            'y': 'y_df',
        }
    },
)

# The classifier receives the encoded y; y_df is passed as an extra input
pipeline.fit(X_df, y, y_df=y_df)

# Evaluate on the training data
y_pred = pipeline.predict(X_df)
report = classification_report(y, y_pred, output_dict=True)

# Evaluate on the validation data
y_pred_te = pipeline.predict(X_df_te)
report_te = classification_report(y_te, y_pred_te, output_dict=True)
```
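To make the `input_names` remapping easier to follow, here is a toy, standard-library-only illustration of the idea: each primitive argument is looked up in a shared context under either its own name or a remapped one. `resolve_args` is a hypothetical function written for this comment, not MLBlocks code, and the string values are stand-ins for the real DataFrames and arrays.

```python
def resolve_args(arg_specs, context, input_names=None):
    """Pick each argument from the context, honoring any name remapping.

    Simplified illustration only; not the MLBlocks implementation.
    """
    input_names = input_names or {}
    kwargs = {}
    for spec in arg_specs:
        name = spec["name"]
        # Use the remapped context variable if one is given, else the arg name
        variable = input_names.get(name, name)
        if variable in context:
            kwargs[name] = context[variable]
    return kwargs


# Both steps declare the same fit arguments: X and y
fit_args = [{"name": "X"}, {"name": "y"}]

# Stand-in values for what pipeline.fit(X_df, y, y_df=y_df) puts in the context
context = {"X": "X_df", "y": "encoded y", "y_df": "unencoded y_df"}

# The feature engineering step remaps its `y` to the `y_df` variable
engineer_kwargs = resolve_args(fit_args, context, {"y": "y_df"})

# The classifier has no remapping, so it sees the encoded targets
classifier_kwargs = resolve_args(fit_args, context)

print(engineer_kwargs)
print(classifier_kwargs)
```

In this sketch the feature engineering step ends up with the unencoded targets while the classifier still gets the encoded ones, which is the behavior the snippet above relies on.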
Added in #86