# The accuracy-explainability trade-off

In the previous chapters, we explored how **Concept Bottleneck Models (CBMs)** can answer the driving evaluator’s three questions: “What did the model see?”, “What does the model usually do in similar situations when something changes in what it sees?”, and “What should have changed for the model to decide to cross instead?”. These answers give the evaluator insight into the model’s decision-making process. However, the CBM’s ability to answer these questions rests on a **key assumption**: the set of concepts in the bottleneck (“traffic light color” and “ambulance presence”) must contain all the information needed to solve the driving task (“cross the intersection”).

If the provided concepts are **incomplete**, perfect concept prediction is not enough to solve the task correctly, because the concepts don’t contain enough information. Conversely, perfect task prediction is not enough to answer the driving evaluator’s questions: to reach it, the model will attempt to encode useful additional information into the concept representation, which makes the effects of interventions unpredictable.
This leads to a trade-off during optimization between two conflicting objectives:

1. **Maximizing task accuracy**: ensuring the task is solved as accurately as possible.
2. **Maximizing concept accuracy**: ensuring that concepts are predicted as accurately as possible.
    
This trade-off is a specific instance of the **accuracy-explainability trade-off**. When this occurs, CBMs can exhibit two different effects:

1. **Drop in task predictive performance** if we prioritize learning the concepts perfectly.
2. **Drop in concept performance** if we prioritize learning the downstream task perfectly.
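
To make the two objectives concrete, here is a minimal sketch that assumes they are combined into a single weighted sum of a task loss and a concept loss. The text above does not specify the training objective, so the weighted-sum form, the function names, and the `concept_weight` parameter are illustrative assumptions, not a definitive implementation.

```python
import math


def binary_cross_entropy(prob, label, eps=1e-12):
    """Cross-entropy between a predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def joint_cbm_loss(concept_probs, concept_labels, task_prob, task_label,
                   concept_weight=1.0):
    """Weighted sum of task loss and mean concept loss.

    concept_weight is a hypothetical knob: it sets how strongly concept
    accuracy is prioritized relative to task accuracy.
    """
    concept_loss = sum(
        binary_cross_entropy(p, y)
        for p, y in zip(concept_probs, concept_labels)
    ) / len(concept_probs)
    task_loss = binary_cross_entropy(task_prob, task_label)
    return task_loss + concept_weight * concept_loss


# Same predictions, two different priorities.
predictions = dict(concept_probs=[0.6], concept_labels=[1],
                   task_prob=0.9, task_label=1)
print(joint_cbm_loss(**predictions, concept_weight=0.1))   # task-focused
print(joint_cbm_loss(**predictions, concept_weight=10.0))  # concept-focused
```

Under this assumption, a large `concept_weight` pushes the optimizer toward accurate concepts at the possible expense of the task, while a value near zero does the opposite.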

## The consequences of concept incompleteness
Let’s consider an example where the CBM is provided only with the concept “traffic light color” and has no access to the concept “ambulance presence.” This concept incompleteness can lead to two possible effects:

- In the first case (drop in task performance), the model decides whether to cross the intersection based solely on the traffic light, so it would cross on green even when an ambulance is present, a potentially dangerous decision. This happens because the information about the ambulance is not available in the concept set, which limits the model’s ability to fully solve the driving task.
- In the second case (drop in concept performance), the model may encode information about the ambulance’s presence indirectly within the “traffic light color” concept representation. This phenomenon, known as **leakage**, mixes the two concepts. As a result, when we intervene and change the value of “traffic light color” (for example, from red to green), we may unintentionally alter information about the ambulance’s presence as well. Because the traffic light concept is now entangled with information about the ambulance, the model’s behavior after such an intervention is hard to predict, which ultimately undermines the model’s interpretability.
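
The two cases above can be sketched with a toy example. Everything in it is an illustrative assumption: the four equally likely scenes, the rule "cross only on green with no ambulance", the hand-picked soft scores in `LEAKY_SCORE`, and the 0.75 threshold in `task_head`. It is not a trained CBM, only a way to see the two effects side by side.

```python
from itertools import product

# All four scenes, assumed equally likely: (light_is_green, ambulance_present).
SCENES = list(product([0, 1], [0, 1]))


def should_cross(green, ambulance):
    """Assumed ground truth: cross only on green with no ambulance."""
    return int(green == 1 and ambulance == 0)


def accuracy(model):
    """Fraction of scenes where the model's decision matches the label."""
    hits = sum(model(g, a) == should_cross(g, a) for g, a in SCENES)
    return hits / len(SCENES)


# Case 1: a clean bottleneck holding only the true light color.
def clean_model(green, ambulance):
    concept = green   # perfect concept, no ambulance information
    return concept    # task head: cross iff the light is green


# Case 2: a leaky bottleneck. The soft "green" score (hand-picked values)
# is lowered when an ambulance is present, so it carries hidden information.
LEAKY_SCORE = {(0, 0): 0.1, (0, 1): 0.1, (1, 0): 0.9, (1, 1): 0.6}


def task_head(green_score):
    """Task head that exploits the leaked signal via a high threshold."""
    return int(green_score > 0.75)


def leaky_model(green, ambulance):
    return task_head(LEAKY_SCORE[(green, ambulance)])


print("clean task accuracy:", accuracy(clean_model))
print("leaky task accuracy:", accuracy(leaky_model))

# Intervention on the green-light-with-ambulance scene: the evaluator sets
# the concept to its true value (green = 1.0), which erases the leaked signal.
before = task_head(LEAKY_SCORE[(1, 1)])
after = task_head(1.0)
print("decision before intervention:", before, "after:", after)
```

In this toy setup the clean model cannot separate the two green-light scenes, so it loses task accuracy. The leaky model solves the task, but its soft score for a green light with an ambulance drifts away from the true concept value, and an intervention that sets the concept to its correct value flips the decision to "cross" despite the ambulance.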