Open
Description
- Add a notebook + video showing how all the pandas code in the "Visual inspection of data" subsection can be simplified using `skrub.TableReport`.
- Replace `ColumnTransformer` with `skrub.TableVectorizer`, starting from the "Using numerical and categorical variables together" notebook.
- In the same notebook, section "Fitting a more powerful model", replace `OrdinalEncoder` with `skrub.ToCategorical`.
- Explicitly mention that `TableVectorizer` selects columns automatically based on their `dtype`.
- Introduce the concept of "low/high cardinality" and demonstrate the effect of `cardinality_threshold` on the "native-country" column of the Adult Census dataset.
- Update the "Visualizing scikit-learn pipelines" video to use `TableVectorizer` (with scikit-learn version >= 1.8).
- Modify the wrap-up quizzes that use the Ames Housing dataset, i.e. M1, M4 and M5, to select a subset of numerical columns with pandas.
- Redo the datasets description using `TableReport`.
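
For the "low/high cardinality" item, a rough sketch of the idea could help when drafting the notebook. The snippet below is plain Python and is not skrub's implementation: the helper `split_by_cardinality` and the toy data are hypothetical, and the "strictly fewer unique values than the threshold" rule is an assumption about how `cardinality_threshold` behaves, to be checked against the skrub documentation.

```python
def split_by_cardinality(columns, cardinality_threshold=40):
    """Split categorical columns into low- and high-cardinality groups.

    `columns` maps a column name to its list of values.
    Assumption: a column counts as low cardinality when it has strictly
    fewer unique values than `cardinality_threshold`.
    """
    low, high = [], []
    for name, values in columns.items():
        n_unique = len(set(values))
        if n_unique < cardinality_threshold:
            low.append(name)
        else:
            high.append(name)
    return low, high


# Hypothetical toy rows, loosely modeled on Adult Census column names.
toy = {
    "sex": ["Male", "Female", "Female", "Male"],
    "native-country": ["United-States", "Mexico", "India", "Cuba"],
}

# With a small threshold, "native-country" falls in the high-cardinality group.
low, high = split_by_cardinality(toy, cardinality_threshold=3)

# With a larger threshold, both toy columns are low cardinality.
low_default, high_default = split_by_cardinality(toy, cardinality_threshold=40)
```

Varying the threshold like this and showing which group "native-country" ends up in is one possible way to frame the demonstration before switching to the real `TableVectorizer`.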
SebastienMelo