framework.qmd

# Evaluation Framework

In this section, we propose a human-centric framework for evaluating model
explanation methods that is grounded in causal reasoning and the premise that
explanations should align with an explainee’s objectives. An evaluation
framework is necessary because the multiplicity of available methods makes it
impossible to generate every possible explanation, effectively forcing
practitioners (explainers) to select a subset of methods. However, to perform
this selection in a principled way requires the explainer to define criteria
for comparing methods. Since there are innumerable valid criteria, it is
helpful to establish a principle, from which, criteria are derived. One
approach, leveraged by @miller_explanation_2018, is to start from the premise
that explanations should mirror human explanations. Consequently, model
explanations that more closely resemble human explanations are considered
better, which implicitly treats human explanations as a gold standard. In our
view, this is a valid, but arguably objectionable standard. Instead, we propose
the principle that every explanation should be aligned with the purpose of the
explanation as defined by the explainee. The implication is that explanations
that are more closely aligned with the explainee’s objectives are better, or
more correct, than explanations that are misaligned. In this way, we avoid the
pitfalls of prioritizing human-like explanations, while maintaining a
human-centric approach.

To aid practitioners in generating human-centric explanations, we propose a
four step process closely modeled after the formulate-approximate-explain (FAE)
framework @merrick_explanation_2020. First, the explainer and explainee must
specify a set of target explanatory questions that explanations should answer.
Next, the explainer identifies an explanation-generating method aligned with
each question. Third, the explainer generates the explanations and provides
them to the explainee. In the final stage, the explanations are interpreted,
evaluated, and the cycle may be repeated.

Every explanation is an answer to a question, so the first step in generating
correct model explanations is to specify the question that the explanation
should address. In our view, these questions can be elicited by the explainee,
but should be governed by the explainee’s objectives. The explainer and
explainee’s objectives may not always be aligned @mittelstadt_explaining_2019,
necessitating a choice over whose objectives to prioritize. In our framework,
we prioritize the explainee’s objectives. Since there are many potential
explainees, each of which may have different objectives, these questions must
be formulated on a per-case basis. A second consideration that the explainer
must keep in mind is whether the specified questions are answerable under the
relevant constraints of the explanation-generating process. These constraints
could involve the explainer’s knowledge of the available methods, the time
available for generating explanations, the explainee’s level of domain
expertise, the acceptability of the assumptions required to generate the
explanation, and many other factors.

Every target explanatory question has an associated level and type. By level,
we refer to the idea introduced earlier that explanations can either consider
the model independent of the real world, or attempt to account for real-world
relationships. We will address exactly what this means in subsequent sections.
A target explanatory question can also be one of three types: associative,
interventional, and counterfactual. Together, these groups are referred to as
the “ladder of causality” @pearl_causality_2009. Although explanation and
causality may seem like separate topics, they are highly intertwined
@miller_explanation_2018. To make things concrete, consider a model used to
predict risk of default as part of a loan application. The following are all
possible explanatory questions:

1. How did race influence the model’s decision to approve the application?
2. What if the user increases their income to $X'$ from $X$?
3. Would my loan application have been approved had my income been $X'$ rather than $X$?

Each of these questions implies a different explainee with different
objectives: a model auditor interested in assessing the model for potential
bias, an employee of the bank trying to understand the model’s behavior, and an
individual interested in what might have happened under different
circumstances. 

Specifying a target question may require multiple iterations, in particular, to
resolve any lingering ambiguity. For example, the intended level of explanation
is not immediately clear with the current wording . We suggest that it is the
explainer’s responsibility to resolve such ambiguity when possible. In
situations where the explainer cannot interact directly with the explainee, the
explainer may be forced to make assumptions about the explainees objectives.

Once the target questions have been specified, the next step is to select an
explanation-generating method that addresses each question. In order to make
this selection, the explainer must have a clear understanding of the specific
types of questions that each method is capable of addressing. In reality, this
stage occurs in parallel with the first, as the explainer must keep the
association between explanatory methods and the questions they address in mind
in order to assess the feasibility of answering the explainee’s target
questions. We provide additional details and specific recommendations about
which methods are most appropriate in different contexts in our discussion
section.

After a method has been identified, the explainer generates the explanation.
The explainer must have a sufficiently deep understanding of the method to
accurately assess whether the resulting explanation, when interpreted
correctly, addresses the target question. There are a variety of ways that the
two may become misaligned. For example, many explanation-generating methods
rely on sampling rather than computing exact values such that if the estimator
is biased, then the resulting explanations may not address the target question
unless certain other assumptions are met. As with the selection step, these
considerations should be kept in mind when specifying target questions during
the first step. Functionally, this means that the explainer may need to
communicate and assess the validity of any additional assumptions that are
required at this step while specifying the target questions.

The final step is for the explainer to provide the explanations to the
explainee, and – when possible – engage in a dialog with the explainee about
the explanations. This interactive approach to providing explanations is
aligned with @miller_explanation_2018, who suggests that conversation between
explainee and explainer is necessary because XAI is fundamentally a human-agent
interaction problem.

Generating correct explanations requires the explainer to have a deep
understanding of the available methods.  In particular, the explainer must have
a mental catalog of the relevant methods, the types of explanatory questions
that each addresses, any required assumptions, and other relevant
considerations. Causality is central to this effort because the types of
questions that typically motivate the desire for model explanations span the
rungs of the ladder of causality.