From 1bf55e7318a914125f60f46b2f7770d83e58278b Mon Sep 17 00:00:00 2001
From: Gaurav Gupta <47334368+gaugup@users.noreply.github.com>
Date: Wed, 20 Apr 2022 17:01:11 -0700
Subject: [PATCH] Add pre-built cohort into adult census notebook (#1243)

* [WIP] Add pre-built cohort into adult census notebook
Signed-off-by: Gaurav Gupta
* erroranalysis version bump in raiwidgets to 0.1.31 (#1245)
* Make cohortData empty list in case no pre-defined cohorts are injected (#1247)
Signed-off-by: Gaurav Gupta
* Simplify the train pipeline responsibleaidashboard-census-classification-model-debugging.ipynb (#1195)
* Simplify the train pipeline responsibleaidashboard-census-classification-model-debugging.ipynb
Signed-off-by: Gaurav Gupta
* Address code review comments
* Update notebooks/responsibleaidashboard/responsibleaidashboard-census-classification-model-debugging.ipynb
Co-authored-by: Roman Lutz
Co-authored-by: Roman Lutz
Signed-off-by: Gaurav Gupta
* Add regression test for pre-defined cohorts in raiwidgets (#1249)
Signed-off-by: Gaurav Gupta
* color (#1248)
* Add feature importance box & bar chart (#1241)
* refactor
* build
* build
* temp
* temp
* temp
* temp
* box
* cache
* e2e
* e2e
* fix
* e2e fix
* e2e
* fix e2e
* widget
* widget
* fix
* widget
* e2e
* e2e
* e2e
* test
* test
* PreBuilt cohorts UX changes (#1242)
* Initial SDK implementation cohorts
Signed-off-by: Gaurav Gupta
* Add basic validation for cohorts
Signed-off-by: Gaurav Gupta
* Add serialized version of cohort config to ResponsibleAiDashboard
Signed-off-by: Gaurav Gupta
* Add more tests cohorts
Signed-off-by: Gaurav Gupta
* fix broken builds due to pip upgrade which broke pip-tools (#1185)
* refactor matrix filter and area state to be private static (#1179)
* Change variable name
Signed-off-by: Gaurav Gupta
* Add more cohort filters
Signed-off-by: Gaurav Gupta
* Add cohort data to dashboard e2e
Signed-off-by: Gaurav Gupta
* Add more cohorts filters
Signed-off-by: Gaurav Gupta
* Document various data validation for cohorts
Signed-off-by: Gaurav Gupta
* Add new interfaces for pre-built cohort
Signed-off-by: Gaurav Gupta
* Add more cohort filters
Signed-off-by: Gaurav Gupta
* Add prebuilt cohort walking logic in UI and add more data validation scenarios
Signed-off-by: Gaurav Gupta
* Add basic data validation checks
Signed-off-by: Gaurav Gupta
* Add logic to translate the Index cohort filter
Signed-off-by: Gaurav Gupta
* Remove commented out code
Signed-off-by: Gaurav Gupta
* Add SDK validations for Index based cohort filter
Signed-off-by: Gaurav Gupta
* Add code for validating classification outcome
Signed-off-by: Gaurav Gupta
* Add error filter validations and add tests
Signed-off-by: Gaurav Gupta
* Add fake cohorts for regression dataset
Signed-off-by: Gaurav Gupta
* Add fake cohorts for multi-class classification dataset
Signed-off-by: Gaurav Gupta
* Add handling of regression filter
Signed-off-by: Gaurav Gupta
* Add support for classification outcome in UI
Signed-off-by: Gaurav Gupta
* Add validations for Predicted Y and True Y cohort filters
Signed-off-by: Gaurav Gupta
* Add UI code to handle predicted Y and true Y for pre-built cohort filters
Signed-off-by: Gaurav Gupta
* Add cohort validation with test data to raiwidgets
Signed-off-by: Gaurav Gupta
* Add tests for validating Predicted/True Y cohorts
Signed-off-by: Gaurav Gupta
* Add UI support for TrueY/PredictedY for classification
Signed-off-by: Gaurav Gupta
* Rename cohort_filter_list to cohort_list
Signed-off-by: Gaurav Gupta
* Rename UI variables to match SDK
Signed-off-by: Gaurav Gupta
* Fix duplicate cohort name
Signed-off-by: Gaurav Gupta
* Add SDK cohorts to notebook
Signed-off-by: Gaurav Gupta
* Add dataset validations and add categorical features
Signed-off-by: Gaurav Gupta
* Add validations for categorical_features
Signed-off-by: Gaurav Gupta
* Fix sorted imports
Signed-off-by: Gaurav Gupta
* Add code for translating categorical values
Signed-off-by: Gaurav Gupta
* Move cohort processing to a separate file
Signed-off-by: Gaurav Gupta
* Fix code review comments
Signed-off-by: Gaurav Gupta
* Refactor cohort translated function into different small functions
Signed-off-by: Gaurav Gupta
* Change to lowercase for outcome
Signed-off-by: Gaurav Gupta
* Fix code review comments
Signed-off-by: Gaurav Gupta
* Refactor cohort_list validations and converge pytest common functions into fixtures
Signed-off-by: Gaurav Gupta
* Add conftest into raiwidgets tests
Signed-off-by: Gaurav Gupta
* Add validations for cohort list
Signed-off-by: Gaurav Gupta
* Add cohortData test
Signed-off-by: Gaurav Gupta
* Fix sorted imports
Signed-off-by: Gaurav Gupta
* isort fix
Signed-off-by: Gaurav Gupta
* Add UI unit tests for cohort translation
Signed-off-by: Gaurav Gupta
* Add more checks in UI unit test
Signed-off-by: Gaurav Gupta
* Add UI tests for regression cohorts
Signed-off-by: Gaurav Gupta
* Remove notebook change
Signed-off-by: Gaurav Gupta
* Fix typescript build
Signed-off-by: Gaurav Gupta
* Change cohort filter values so that cohort filters non-zero points
Signed-off-by: Gaurav Gupta
* Fix for empty cohort list
Signed-off-by: Gaurav Gupta
* Simplify the train pipeline responsibleaidashboard-census-classification-model-debugging.ipynb (#1195)
* Simplify the train pipeline responsibleaidashboard-census-classification-model-debugging.ipynb
Signed-off-by: Gaurav Gupta
* Address code review comments
* Update notebooks/responsibleaidashboard/responsibleaidashboard-census-classification-model-debugging.ipynb
Co-authored-by: Roman Lutz
Co-authored-by: Roman Lutz
* Propagate error strings instead of raising exceptions
Signed-off-by: Gaurav Gupta
* Fix code issues
Signed-off-by: Gaurav Gupta
* Fix code review comments
Signed-off-by: Gaurav Gupta
* Fix code review comments
Signed-off-by: Gaurav Gupta
Co-authored-by: Ilya Matiach
Co-authored-by: Roman Lutz
* Make _cohort.py module a public module (#1253)
* Make _cohort.py a public module
Signed-off-by: Gaurav Gupta
* Add missing file
Signed-off-by: Gaurav Gupta
* fix notebook build failures due to pywinpty dependency release failing in python 3.6 (#1257)
* fix notebook build failures due to pywinpty dependency release failing in python 3.6
* build pywinpty from conda instead
* add lowerbound
* fixup
* fixup
* Add supported models and data types to README.md responsibleai (#1259)
Signed-off-by: Gaurav Gupta
* make getting-started notebook a markdown file showing APIs (#1223)
* refactor tabs out of RAI dashboard into a separate component (#1256)
* Add individual causal scatter chart (#1258)
* temp
* refactor
* test
* style fix
* comment
* minor fix to url for responsibleai package in setup.py (#1260)
* Fix UX e2e tests and address code review comments
Signed-off-by: Gaurav Gupta
* Fix eslint
Signed-off-by: Gaurav Gupta
* Address review comments
Signed-off-by: Gaurav Gupta
* Reset the number of samples in test dataset
Signed-off-by: Gaurav Gupta
Co-authored-by: Ilya Matiach
Co-authored-by: Roman Lutz
Co-authored-by: Bo Zhang <71688188+zhb000@users.noreply.github.com>
---
 .../modelAssessmentDatasets.ts                | 13 ++++
 .../describeModelPerformanceSideBar.ts        | 14 +++-
 ...ensus-classification-model-debugging.ipynb | 78 ++++++++++++++++++-
 3 files changed, 101 insertions(+), 4 deletions(-)

diff --git a/apps/widget-e2e/src/describer/modelAssessment/modelAssessmentDatasets.ts b/apps/widget-e2e/src/describer/modelAssessment/modelAssessmentDatasets.ts
index 30806fad91..c6548aeb25 100644
--- a/apps/widget-e2e/src/describer/modelAssessment/modelAssessmentDatasets.ts
+++ b/apps/widget-e2e/src/describer/modelAssessment/modelAssessmentDatasets.ts
@@ -56,6 +56,14 @@ const modelAssessmentDatasets = {
       "capital-loss"
     ],
     modelStatisticsData: {
+      cohortDropDownValues: [
+        "All data",
+        "Cohort Age and Hours-Per-Week",
+        "Cohort Marital-Status",
+        "Cohort Index",
+        "Cohort Predicted Y",
+        "Cohort True Y"
+      ],
       defaultXAxis: "Probability : <=50K",
       defaultXAxisPanelValue: "Prediction probabilities",
       defaultYAxis: "Cohort",
@@ -115,6 +123,7 @@ const modelAssessmentDatasets = {
       "s6"
     ],
     modelStatisticsData: {
+      cohortDropDownValues: ["All data"],
       defaultXAxis: "Error",
       defaultXAxisPanelValue: "Error",
       defaultYAxis: "Cohort",
@@ -180,6 +189,7 @@ const modelAssessmentDatasets = {
     ],
     isRegression: true,
     modelStatisticsData: {
+      cohortDropDownValues: ["All data"],
       defaultXAxis: "Error",
       defaultXAxisPanelValue: "Error",
       defaultYAxis: "Cohort",
@@ -272,6 +282,7 @@ const modelAssessmentDatasets = {
       "YrSold"
     ],
     modelStatisticsData: {
+      cohortDropDownValues: ["All data"],
       defaultXAxis: "Probability : Less than median",
       defaultXAxisPanelValue: "Prediction probabilities",
       defaultYAxis: "Cohort",
@@ -364,6 +375,7 @@ const modelAssessmentDatasets = {
       "YrSold"
     ],
     modelStatisticsData: {
+      cohortDropDownValues: ["All data"],
       hasModelStatisticsComponent: false,
       hasSideBar: false
     },
@@ -416,6 +428,7 @@ const modelAssessmentDatasets = {
     ],
     isMulticlass: true,
     modelStatisticsData: {
+      cohortDropDownValues: ["All data"],
       defaultXAxis: "Predicted Y",
       defaultXAxisPanelValue: "Prediction probabilities",
       defaultYAxis: "Cohort",
diff --git a/apps/widget-e2e/src/describer/modelAssessment/modelStatistics/describeModelPerformanceSideBar.ts b/apps/widget-e2e/src/describer/modelAssessment/modelStatistics/describeModelPerformanceSideBar.ts
index e540b1c9f1..d12b2f2463 100644
--- a/apps/widget-e2e/src/describer/modelAssessment/modelStatistics/describeModelPerformanceSideBar.ts
+++ b/apps/widget-e2e/src/describer/modelAssessment/modelStatistics/describeModelPerformanceSideBar.ts
@@ -19,7 +19,12 @@ export function describeModelPerformanceSideBar(
     });
 
     it("Side bar should be updated with updated values", () => {
-      cy.get(Locators.MSSideBarCards).should("have.length", 1);
+      cy.get(Locators.MSSideBarCards).should(
+        "have.length",
+        dataShape.modelStatisticsData?.cohortDropDownValues
+          ? dataShape.modelStatisticsData?.cohortDropDownValues.length
+          : 0
+      );
       cy.get(`${Locators.MSCRotatedVerticalBox} button`)
         .click()
         .get(
@@ -50,7 +55,12 @@ export function describeModelPerformanceSideBar(
       cy.get(`${Locators.MSCRotatedVerticalBox}`).contains(
         dataShape.modelStatisticsData?.defaultYAxis || "Cohort"
       );
-      cy.get(Locators.MSSideBarCards).should("have.length", 1);
+      cy.get(Locators.MSSideBarCards).should(
+        "have.length",
+        dataShape.modelStatisticsData?.cohortDropDownValues
+          ? dataShape.modelStatisticsData?.cohortDropDownValues.length
+          : 0
+      );
     });
 
     it("Should have dropdown to select cohort when y axis is changed to different value than cohort", () => {
diff --git a/notebooks/responsibleaidashboard/responsibleaidashboard-census-classification-model-debugging.ipynb b/notebooks/responsibleaidashboard/responsibleaidashboard-census-classification-model-debugging.ipynb
index 9fcb2a1390..783dafb5ad 100644
--- a/notebooks/responsibleaidashboard/responsibleaidashboard-census-classification-model-debugging.ipynb
+++ b/notebooks/responsibleaidashboard/responsibleaidashboard-census-classification-model-debugging.ipynb
@@ -252,6 +252,80 @@
     "rai_insights.compute()"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "b84c6c0d",
+   "metadata": {},
+   "source": [
+    "Compose some cohorts which can be injected into the `ResponsibleAIDashboard`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0994b7d6",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from raiwidgets.cohort import Cohort, CohortFilter, CohortFilterMethods\n",
+    "\n",
+    "# Cohort on age and hours-per-week features in the dataset\n",
+    "cohort_filter_age = CohortFilter(\n",
+    "    method=CohortFilterMethods.METHOD_LESS,\n",
+    "    arg=[65],\n",
+    "    column='age')\n",
+    "cohort_filter_hours_per_week = CohortFilter(\n",
+    "    method=CohortFilterMethods.METHOD_GREATER,\n",
+    "    arg=[40],\n",
+    "    column='hours-per-week')\n",
+    "\n",
+    "user_cohort_age_and_hours_per_week = Cohort(name='Cohort Age and Hours-Per-Week')\n",
+    "user_cohort_age_and_hours_per_week.add_cohort_filter(cohort_filter_age)\n",
+    "user_cohort_age_and_hours_per_week.add_cohort_filter(cohort_filter_hours_per_week)\n",
+    "\n",
+    "# Cohort on marital-status feature in the dataset\n",
+    "cohort_filter_marital_status = CohortFilter(\n",
+    "    method=CohortFilterMethods.METHOD_INCLUDES,\n",
+    "    arg=[\"Never-married\", \"Divorced\"],\n",
+    "    column='marital-status')\n",
+    "\n",
+    "user_cohort_marital_status = Cohort(name='Cohort Marital-Status')\n",
+    "user_cohort_marital_status.add_cohort_filter(cohort_filter_marital_status)\n",
+    "\n",
+    "# Cohort on index of the row in the dataset\n",
+    "cohort_filter_index = CohortFilter(\n",
+    "    method=CohortFilterMethods.METHOD_LESS,\n",
+    "    arg=[20],\n",
+    "    column='Index')\n",
+    "\n",
+    "user_cohort_index = Cohort(name='Cohort Index')\n",
+    "user_cohort_index.add_cohort_filter(cohort_filter_index)\n",
+    "\n",
+    "# Cohort on predicted target value\n",
+    "cohort_filter_predicted_y = CohortFilter(\n",
+    "    method=CohortFilterMethods.METHOD_INCLUDES,\n",
+    "    arg=['>50K'],\n",
+    "    column='Predicted Y')\n",
+    "\n",
+    "user_cohort_predicted_y = Cohort(name='Cohort Predicted Y')\n",
+    "user_cohort_predicted_y.add_cohort_filter(cohort_filter_predicted_y)\n",
+    "\n",
+    "# Cohort on true target value\n",
+    "cohort_filter_true_y = CohortFilter(\n",
+    "    method=CohortFilterMethods.METHOD_INCLUDES,\n",
+    "    arg=['>50K'],\n",
+    "    column='True Y')\n",
+    "\n",
+    "user_cohort_true_y = Cohort(name='Cohort True Y')\n",
+    "user_cohort_true_y.add_cohort_filter(cohort_filter_true_y)\n",
+    "\n",
+    "cohort_list = [user_cohort_age_and_hours_per_week,\n",
+    "               user_cohort_marital_status,\n",
+    "               user_cohort_index,\n",
+    "               user_cohort_predicted_y,\n",
+    "               user_cohort_true_y]"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "elder-fleet",
@@ -267,7 +341,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "ResponsibleAIDashboard(rai_insights)"
+    "ResponsibleAIDashboard(rai_insights, cohort_list=cohort_list)"
    ]
   },
   {
@@ -510,7 +584,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.7.11"
+   "version": "3.6.12"
   }
  },
  "nbformat": 4,