# EBMs Overall Importances

In this notebook we show how to compute and interpret the overall importances shown in EBM global explanations. We also show how to compute the importance of a group of terms, i.e. a composite importance.

Throughout the notebook we use _term_ to denote both single features and interactions (pairs of features).

## Train an Explainable Boosting Machine (EBM) for a regression task

Let's use the California housing dataset as a reference and train an EBM:

In [None]:
from sklearn.datasets import fetch_california_housing
from interpret.glassbox import ExplainableBoostingRegressor

cal = fetch_california_housing()

ebm = ExplainableBoostingRegressor(feature_names=cal.feature_names)
ebm.fit(cal.data, cal.target)

## Explain the Model

EBMs provide explanations at both the global (overall behavior) and local (individual predictions) levels.

### Global Explanation

Global Explanations are useful for understanding what a model finds important, as well as identifying potential flaws in its decision making. Let's start by computing and displaying a global explanation:

In [None]:
from interpret import show

ebm_global = ebm.explain_global(name='EBM')
show(ebm_global)

Because EBMs are additive models, we can measure exactly how much each term contributes to a prediction. Let's take a look at the graph of the first term, `MedInc`, by selecting it in the above drop-down menu.

The way to interpret this is that if a new datapoint came in with `MedInc` = 10, the model would add about +2 to its final prediction. For a different datapoint with `MedInc` = 4, the model would add ~0 to the prediction, and for datapoints that have `MedInc` = 2, the model would add approx. -0.6.

To make individual predictions, the model uses each term graph as a look up table, notes the contribution per term, and sums them together with the learned intercept to make a prediction.
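The lookup-and-sum procedure described above can be sketched in plain Python. This is only an illustration, not the library's implementation: the bin edges, scores, and intercept below are hypothetical values chosen to roughly echo the `MedInc` graph, and `term_graphs`, `term_contribution`, and `predict` are names invented for this sketch.

```python
from bisect import bisect_right

# Hypothetical term graphs: each term maps bin edges to per-bin scores.
# These numbers are made up for illustration; they are not read from the trained EBM.
term_graphs = {
    "MedInc": {"edges": [3.0, 6.0], "scores": [-0.6, 0.0, 2.0]},
    "HouseAge": {"edges": [20.0], "scores": [-0.1, 0.1]},
}
intercept = 2.0  # assumed learned intercept


def term_contribution(graph, value):
    """Look up the score of the bin that `value` falls into."""
    bin_index = bisect_right(graph["edges"], value)
    return graph["scores"][bin_index]


def predict(sample):
    """Sum the intercept and the per-term contributions for one sample."""
    contributions = {
        name: term_contribution(graph, sample[name])
        for name, graph in term_graphs.items()
    }
    return intercept + sum(contributions.values()), contributions


prediction, contributions = predict({"MedInc": 10.0, "HouseAge": 30.0})
print(contributions)
print(prediction)
```

Because the prediction is just the intercept plus a sum of looked-up scores, each term's share of the prediction can be read off directly, which is what the local explanation plots display.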

### Local Explanations

We can see the full breakdown for any individual prediction with Local Explanations. Here's the prediction breakdown for the first sample in our dataset:

In [None]:
from interpret import show
show(ebm.explain_local(cal.data[:1], cal.target[:1]))

The model prediction is 4.04. We can see that the intercept adds about +2, `MedInc` adds ~+1.8, and `Longitude` adds about +0.4. So far, counting the intercept and the top two contributing terms, we're at a cumulative prediction of ~+4.2. If you repeat this process for all the terms, you'll arrive exactly at the model prediction of +4.04.

### Overall Importance

Going back to the global explanation's Summary, the overall importance is calculated as _the average absolute contribution (score) a term (feature or pair) makes when predicting across the training dataset._

In our example above, we saw that `MedInc` contributed +1.8 for this datapoint. If we repeat this process for all datapoints in the training set and average the absolute contributions, we arrive at `MedInc`'s overall importance. This can be interpreted as _on average, across the training dataset, how much does `MedInc` contribute to a single prediction in absolute terms?_

In other words, the overall importance is not a measure of positive/negative -- it is a measure of how important each term is in the score overall. These scores are represented in the same units as the y-axis of the feature graphs, so for a classification problem it would be in logits and for regression the original label space.
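As a minimal sketch of this definition, the snippet below averages the absolute per-sample contributions of a single term. The scores are hypothetical numbers invented for illustration, and `overall_importance` is a helper written for this sketch, not a function of the package.

```python
# Hypothetical per-sample contributions (scores) of one term across a tiny
# training set. In a real EBM these would be looked up from the term's graph.
medinc_scores = [1.8, -0.6, 0.0, -0.4, 2.0]


def overall_importance(scores):
    """Mean of the absolute per-sample contributions of a term."""
    return sum(abs(score) for score in scores) / len(scores)


importance = overall_importance(medinc_scores)
signed_mean = sum(medinc_scores) / len(medinc_scores)

print(f"overall importance (mean |score|): {importance:.2f}")
print(f"signed mean of scores:             {signed_mean:.2f}")
```

The signed mean lets positive and negative contributions cancel, while the overall importance does not, which is why it measures magnitude rather than direction.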

## Computing overall importances

To compute the overall importances of a trained EBM we use `get_importances()`:

In [None]:
importances = ebm.get_importances()
names = ebm.get_feature_names_out()

for (term_name, importance) in zip(names, importances):
    print(f"Term {term_name} importance: {importance}")

Note that this isn't the only way of calculating feature importance. Another metric our package provides is the `"max"` option:

In [None]:
importances = ebm.get_importances("max")
names = ebm.get_feature_names_out()

for (term, importance) in zip(names, importances):
    print(f"Term {term} importance: {importance}")

## Composite Importances

A composite is a set of terms. We provide utility functions to compute the importances of composites and, optionally, append them to global explanations. Note that no individual graph is generated for a composite; only its overall importance is shown in the Summary.

### Computing composite importances

We create a list of terms -- single features or interactions -- as our composite and then compute its importance:

In [None]:
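# Import only the helpers used in this notebook instead of a wildcard import
from interpret.glassbox.ebm.research.composite_importance import (
    append_composite_importance,
    compute_composite_importance,
    get_composite_and_individual_terms,
)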

composite_terms_1 = ["MedInc", "Population", "Latitude x Longitude"]
importance = compute_composite_importance(composite_terms_1, ebm, cal.data)
print(f"Composite: {composite_terms_1} - Importance: {importance}")

In this example we create a composite with three terms, two features (`MedInc` and `Population`) and one interaction (`Latitude x Longitude`), and compute its importance. Similar to single feature importances, we interpret this score as _the average absolute contribution this set of terms makes when predicting across the training dataset._

Note that for each prediction, the contributions of all terms in the composite are summed before taking the absolute value.
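The order of operations (sum per prediction first, absolute value second) matters. The following sketch uses hypothetical per-sample scores for two made-up terms, `term_a` and `term_b`; the helper functions are written for this illustration and are not the package's `compute_composite_importance`.

```python
# Hypothetical per-sample contributions of two terms on four samples.
# The values are invented so that the terms often pull in opposite directions.
scores = {
    "term_a": [0.5, -0.5, 0.4, -0.2],
    "term_b": [-0.4, 0.5, -0.1, 0.3],
}


def term_importance(term):
    """Mean absolute contribution of a single term."""
    values = scores[term]
    return sum(abs(v) for v in values) / len(values)


def composite_importance(terms):
    """Sum the terms' contributions per sample, then average the absolute sums."""
    per_sample_sums = [sum(row) for row in zip(*(scores[t] for t in terms))]
    return sum(abs(s) for s in per_sample_sums) / len(per_sample_sums)


composite = composite_importance(["term_a", "term_b"])

for term in scores:
    print(f"{term}: {term_importance(term):.3f}")
print(f"composite: {composite:.3f}")
```

When the terms' contributions have opposite signs on the same sample, they partly cancel inside the sum, so a composite's importance can be smaller than that of one of its own terms. It can never exceed the sum of the individual importances.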

We also have the option to create a global explanation containing the composite importance or append it to an existing explanation:

In [None]:
my_global_exp = append_composite_importance(composite_terms_1, ebm, cal.data)
show(my_global_exp)

The importance of `composite_terms_1` is about 0.53, which is higher than any individual importance. We could make this type of comparison between different composites too:

In [None]:
composite_terms_2 = ["AveRooms", "HouseAge"]
my_global_exp = append_composite_importance(composite_terms_2, ebm, cal.data, global_exp=my_global_exp)
show(my_global_exp)

The importance of `composite_terms_2` is about 0.11, higher than each of its terms but smaller than other important terms such as `Longitude`.

We can also compare one composite we are interested in (e.g. `composite_terms_1`) and a composite of all other terms:

In [None]:
all_other_terms = [term for term in ebm.get_feature_names_out() if term not in composite_terms_1]

my_global_exp = append_composite_importance(composite_terms_1, ebm, cal.data)
my_global_exp = append_composite_importance(all_other_terms, ebm, cal.data, composite_name="all_other_terms", global_exp=my_global_exp)
show(my_global_exp)

Note that `composite_terms_1` still has the highest importance score. Moreover, although `Latitude` is one of the terms in `all_other_terms`, its individual importance is higher than that of the composite itself -- this is possible because when we consider a composite, for each prediction, all the scores of the composite terms are added before taking the absolute value.

Finally, we also expose a function to compute the importances of a list of composites as well as all of the model's original terms:

In [None]:
my_dict = get_composite_and_individual_terms([composite_terms_1, composite_terms_2], ebm, cal.data)
for key in my_dict:
    print(f"Term: {key} - Importance: {my_dict[key]}")