diff --git a/notebooks/features/responsible_ai/Interpretability - ICE explainer.ipynb b/notebooks/features/responsible_ai/Interpretability - PDP and ICE explainer.ipynb
similarity index 91%
rename from notebooks/features/responsible_ai/Interpretability - ICE explainer.ipynb
rename to notebooks/features/responsible_ai/Interpretability - PDP and ICE explainer.ipynb
index 4d3849c5bb..a2ef0e2770 100644
--- a/notebooks/features/responsible_ai/Interpretability - ICE explainer.ipynb
+++ b/notebooks/features/responsible_ai/Interpretability - PDP and ICE explainer.ipynb
@@ -25,7 +25,9 @@
 }
 },
 "source": [
- "Partial Dependence Plot (PDP) and Individual Condition Expectation (ICE) are interpretation methods which describe the average behavior of a classification or regression model. They are particularly useful when the model developer wants to understand generally how the model depends on individual feature values or to catch unexpected model behavior.\n",
+ "Partial Dependence Plot (PDP) and Individual Conditional Expectation (ICE) are interpretation methods which describe the average behavior of a classification or regression model. They are particularly useful when the model developer wants to understand generally how the model depends on individual feature values, understand overall model behavior, or debug the model.\n",
+ "\n",
+ "In terms of [Responsible AI](https://www.microsoft.com/en-us/ai/responsible-ai), understanding which features drive your predictions facilitates the creation of [Transparency Notes](https://docs.microsoft.com/en-us/legal/cognitive-services/language-service/transparency-note), driving not only transparency but also accountability, while facilitating auditing to meet regulatory requirements.\n",
 "\n",
 "The goal of this notebook is to show how these methods work for a pretrained model."
 ]
@@ -228,7 +230,7 @@
 "source": [
 "## Partial Dependence Plots\n",
 "\n",
- "Partial dependence plots (PDP) show the dependence between the target response and a set of input features of interest, marginalizing over the values of all other input features. It can show whether the relationship between the target response and the input feature is linear, smooth, monotonic, or more complex. This is relevant when you want to have an overall understanding of model behavior. \n",
+ "Partial dependence plots (PDP) show the dependence between the target response and a set of input features of interest, marginalizing over the values of all other input features. They can show whether the relationship between the target response and the input feature is linear, smooth, monotonic, or more complex. This is relevant when you want to have an overall understanding of model behavior. For example, a PDP can reveal whether a specific age group receives more favorable predictions than other age groups.\n",
 "\n",
 "If you want to learn more please visit [this link](https://scikit-learn.org/stable/modules/partial_dependence.html#partial-dependence-plots)."
 ]
@@ -258,7 +260,24 @@
 }
 },
 "source": [
- "To plot PDP we need to set up the instance of `ICETransformer` first and set the `kind` parameter to `average` and then call the `transform` function. For the setup we need to pass the pretrained model, specify the target column (\"probability\" in our case), and pass categorical and numeric feature names."
+ "To plot PDP, we first need to set up an instance of `ICETransformer` with the `kind` parameter set to `average`, and then call the `transform` function. \n",
+ "\n",
+ "For the setup we need to pass the pretrained model, specify the target column (\"probability\" in our case), and pass categorical and numeric feature names.\n",
+ "\n",
+ "Categorical and numeric features can be passed as a list of names. Alternatively, we can specify parameters for individual features by passing a list of dicts, where each dict represents one feature. 
\n",
+ "\n",
+ "For the numeric features a dictionary can look like this:\n",
+ "\n",
+ "{\"name\": \"capital-gain\", \"numSplits\": 20, \"rangeMin\": 0.0, \"rangeMax\": 10000.0, \"outputColName\": \"capital-gain_dependence\"}\n",
+ "\n",
+ "Where the required key-value pair is `name` - the name of the numeric feature. The following key-value pairs are optional: `numSplits` - the number of splits for the value range of the numeric feature (default value is 10), `rangeMin` - the min value of the range for the numeric feature, `rangeMax` - the max value of the range for the numeric feature (if not specified, `rangeMin` and `rangeMax` will be computed from the background dataset), `outputColName` - the name of the output column with explanations for the feature (default value is the input name of the feature + \"_dependence\").\n",
+ "\n",
+ "\n",
+ "For the categorical features a dictionary can look like this:\n",
+ "\n",
+ "{\"name\": \"marital-status\", \"numTopValues\": 10, \"outputColName\": \"marital-status_dependence\"}\n",
+ "\n",
+ "Where the required key-value pair is `name` - the name of the categorical feature. The following key-value pairs are optional: `numTopValues` - the max number of top-occurring values to be included for the categorical feature (default value is 100), `outputColName` - the name of the output column with explanations for the feature (default value is the input name of the feature + \"_dependence\")."
 ]
 },
 {
@@ -780,7 +799,9 @@
 "source": [
 "#### Example 1: Numeric feature: \"age\"\n",
 "\n",
- "We can overlay the PDP on top of ICE plots. In the graph, the red line shows the PDP plot for the \"age\" feature, and the black lines show ICE plots for 50 randomly selected observations."
+ "We can overlay the PDP on top of ICE plots. In the graph, the red line shows the PDP plot for the \"age\" feature, and the black lines show ICE plots for 50 randomly selected observations. \n",
+ "\n",
+ "The visualization will show that all curves follow a similar course. 
That means that the PDP (red line) is already a good summary of the relationship between the displayed feature \"age\" and the model's average predictions of \"income\"."
 ]
 },
 {
@@ -845,7 +866,9 @@
 "For visualization of categorical features, we are using a star plot.\n",
 "\n",
 "- The X-axis here is a circle which is splitted into equal parts, each representing a feature value.\n",
- "- The Y-coordinate shows the dependence values. Each line represents a sample observation."
+ "- The Y-coordinate shows the dependence values. Each line represents a sample observation.\n",
+ "\n",
+ "Here we can see that \"Farming-fishing\" contributes the least to the predictions, because its values accumulate near the lowest probabilities, whereas \"Exec-managerial\", for example, seems to have one of the highest impacts on model predictions."
 ]
 },
 {
@@ -916,7 +939,7 @@
 "\n",
 "Partial dependence plots (PDP) and Individual Conditional Expectation (ICE) plots can be used to visualize and analyze interaction between the target response and a set of input features of interest.\n",
 "\n",
- "PDP shows the dependence on average, and ICE shows the dependence in a individual sample level.\n",
+ "PDP shows the dependence on average, and ICE shows the dependence at an individual sample level. This is important not only for debugging and understanding how the model behaves, but also for Responsible AI. These methodologies help drive transparency for the users of the model and accountability for the model creators.\n",
 "\n",
 "Using examples above we showed how to calculate and visualize such plots at a scalable manner to understand how a classification or regression model makes predictions, which features heavily impact the model, and how model prediction changes when feature value changes."
 ]