# Plot Slopes

`slopes` and `plot_slopes` are part of Bambi's sub-package `plots`, which provides a set of functions for interpreting complex regression models. This sub-package is inspired by the R package [marginaleffects](https://vincentarelbundock.github.io/marginaleffects/articles/predictions.html#conditional-adjusted-predictions-plot). These two functions allow the modeler to view a summary table and/or a plot of the **slopes** estimated by the model. The slope measures the association between a change in a regressor $x$ and a change in the response $y$.

Below, we describe why estimating the slope of the prediction function is useful for interpreting generalized linear models (GLMs), how this methodology is implemented in Bambi, and how to use `slopes` and `plot_slopes`. We assume the reader is familiar with the basics of GLMs. If not, refer to the Bambi [Basic Building Blocks](https://bambinos.github.io/bambi/notebooks/how_bambi_works.html#Link-functions) example.

## Interpretation of Regression Coefficients

### Interpreting interaction effects

Specifying interactions in a regression model is a way of allowing parameters to be conditional on certain aspects of the data. By contrast, in a model with no interactions, the parameters are **not** conditional, so the effect associated with one parameter does not depend on the value of another covariate. Once interactions exist, however, multiple parameters are always in play at the same time. Additionally, interactions can be specified for combinations of multiple categorical and continuous variables, which makes the interpretation of the slopes more difficult.

As outlined in the "interpretation of regression coefficients" section, if the regression model is specified with a Gaussian response and the identity link function, then the regression coefficients can be interpreted as the expected change in the response for a one unit change in the regressor. However, once a GLM is specified with a non-identity link function, or interaction effects are included, a one unit change in a covariate no longer produces a constant change in the mean of the response. Instead, a one unit change in $x_i$ may produce a larger or smaller change in the probability $p_i$, for example, depending upon how far from zero the log-odds are.

With GLMs, every covariate essentially interacts with itself, because the impact of a change in a covariate depends upon the value of the covariate before the change. Generally speaking, every covariate effectively interacts with every other covariate, whether or not an interaction term is specified in the model. An example of a covariate interacting with itself is shown below.

Consider the mean of a Gaussian linear model
$$\mu = \alpha + \beta x$$

where the rate of change in $\mu$ with respect to (w.r.t.) $x$ is just $\beta$, i.e., the rate of change is constant no matter what the value of $x$ is. But when we consider GLMs with link functions used to map outputs to exponential family distribution parameters, calculating the derivative of the mean output $\mu$ w.r.t. the predictor is not as straightforward as in the Gaussian linear model. For example, consider the binomial probability $p$ as a function of a predictor $x$:

$$p = \frac{\exp(\alpha + \beta x)}{1 + \exp(\alpha + \beta x)}$$

Taking the derivative w.r.t. $x$ yields

$$\frac{\partial p}{\partial x} = \frac{\beta}{2(1 + \cosh(\alpha + \beta x))}$$
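
For readers who want the intermediate steps, the result can be derived as follows. The shorthand $\eta = \alpha + \beta x$ is introduced here only for readability and is not notation used elsewhere in this document. Applying the chain rule to the inverse logit and using $\cosh(\eta) = (e^{\eta} + e^{-\eta})/2$:

$$\frac{\partial p}{\partial x} = \beta \, p \, (1 - p) = \frac{\beta \, e^{\eta}}{(1 + e^{\eta})^2} = \frac{\beta}{e^{-\eta} + 2 + e^{\eta}} = \frac{\beta}{2(1 + \cosh(\eta))}$$

The first form, $\beta \, p \, (1 - p)$, makes it clear that the slope is largest when $p = 0.5$ (where it equals $\beta / 4$) and shrinks towards zero as $p$ approaches 0 or 1.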

Since $x$ appears in the derivative, the impact of a change in $x$ depends upon the value of $x$ itself, i.e., $x$ interacts with itself even though no interaction term was specified in the model.
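
To make this concrete, the short NumPy sketch below evaluates the analytic derivative at several values of $x$ and checks it against a finite-difference approximation. The coefficient values and the helper names `logistic_mean` and `logistic_slope` are hypothetical, chosen only for illustration; nothing here uses Bambi.

```python
import numpy as np


def logistic_mean(x, alpha, beta):
    """Binomial probability p for the linear predictor alpha + beta * x."""
    return 1.0 / (1.0 + np.exp(-(alpha + beta * x)))


def logistic_slope(x, alpha, beta):
    """Analytic derivative of p with respect to x."""
    return beta / (2.0 * (1.0 + np.cosh(alpha + beta * x)))


# Hypothetical coefficients, for illustration only.
alpha, beta = -1.0, 0.5
x = np.array([-4.0, 0.0, 2.0, 6.0])

# Analytic slope at each value of x.
analytic = logistic_slope(x, alpha, beta)

# Forward finite-difference approximation of the same slope.
eps = 1e-6
numeric = (logistic_mean(x + eps, alpha, beta) - logistic_mean(x, alpha, beta)) / eps

for xi, a, n in zip(x, analytic, numeric):
    print(f"x = {xi:5.1f}  analytic = {a:.6f}  finite difference = {n:.6f}")
```

The slope differs at every value of $x$ even though $\beta$ is fixed, and it peaks where the linear predictor is zero (here $x = 2$), which is exactly the self-interaction described above.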

## Average Predictive Slopes

Here, we adopt the notation from Chapter 14.4 of [Regression and Other Stories](https://avehtari.github.io/ROS-Examples/) to describe first average predictive differences, which are essential to computing `slopes`, and then average predictive slopes. Assume we have fit a Bambi model predicting an outcome $Y$ based on inputs $X$ and parameters $\theta$. Consider the following inputs:

$$u: \text{the input of interest}$$
$$v: \text{all the other inputs}$$
$$X = (u, v)$$

In contrast to `comparisons`, for `slopes` we are interested in comparing $u^{\text{value}}$ to $u^{\text{value}+\epsilon}$ (perhaps age = 60 and 60.0001 respectively) with all other inputs $v$ held constant. The _predictive difference_ in the outcome changing **only** $u$ is:

$$\text{average predictive difference} = \mathbb{E}(y|u^{\text{value}+\epsilon}, v, \theta) - \mathbb{E}(y|u^{\text{value}}, v, \theta)$$

Evaluating both terms at every observed value of the other inputs $v$ in the data gives a new "hypothetical" dataset and corresponds to counting all pairs of transitions of $(u^\text{value})$ to $(u^{\text{value}+\epsilon})$, i.e., differences in $u$ with $v$ held constant. Averaging the difference between these two terms over $v$ gives the average predictive difference.

To obtain the slope estimate, we divide the above difference by $\epsilon$, which gives the _average predictive slope_:

$$\text{average predictive slope} = \frac{\mathbb{E}(y|u^{\text{value}+\epsilon}, v, \theta) - \mathbb{E}(y|u^{\text{value}}, v, \theta)}{\epsilon}$$
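
As a rough illustration of this quantity, the sketch below computes a forward difference, i.e., the expected outcome at $u^{\text{value}+\epsilon}$ minus the expected outcome at $u^{\text{value}}$, divided by $\epsilon$, and then averages it over the observed $v$. It assumes a logistic regression with a single other input $v$ and one fixed parameter vector $\theta$; the function `expected_y`, the parameter values, and the simulated data are all hypothetical and are not part of Bambi.

```python
import numpy as np

rng = np.random.default_rng(0)


def expected_y(u, v, theta):
    """E(y | u, v, theta) for a hypothetical logistic regression."""
    intercept, coef_u, coef_v = theta
    return 1.0 / (1.0 + np.exp(-(intercept + coef_u * u + coef_v * v)))


# Hypothetical parameter values and simulated values of the other input v.
theta = (-0.5, 0.8, -0.3)
v = rng.normal(size=200)

u_value = 1.0  # the value of u at which the slope is evaluated
eps = 1e-4     # the small change in u

# Predictive difference for every observed v, with v held constant.
diff = expected_y(u_value + eps, v, theta) - expected_y(u_value, v, theta)

# Dividing by eps gives one slope per observation; averaging over v
# gives the average predictive slope.
unit_slopes = diff / eps
average_predictive_slope = unit_slopes.mean()
print(average_predictive_slope)
```

In a fitted Bayesian model, $\theta$ is a set of posterior draws rather than a single vector, so the same calculation would be repeated for each draw.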

## Computing Slopes

The objective of `slopes` and `plot_slopes` is to compute the rate of change (slope) in the mean of the response $y$ with respect to (w.r.t.) a small change $\epsilon$ in the predictor $u$, conditional on the other covariates $v$ specified in the model. The predictor $u$ is specified by the user. Its original value is either provided by the user or, if omitted, set to a default value (the mean) computed by Bambi, since computing a small change in $u$ requires an _original_ value. The values for the other covariates $v$ specified in the model can be determined under the following three scenarios:

1. user provided values 
2. a grid of equally spaced and central values
3. empirical distribution (original data used to fit the model)

In the case of (1) and (2) above, Bambi assembles all pairwise combinations (transitions) of $u$ and $v$ into a new "hypothetical" dataset. In (3), Bambi uses the original $v$, but replaces $u$ with the user provided value or the default value computed by Bambi. In each scenario, predictions are made on the data using the fitted model. Once the predictions are made, the posterior samples are used to take the difference in the predicted outcome for each pair of transitions. The average of these differences is the average predictive difference, and dividing it by $\epsilon$ gives the average predictive slope.
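
The workflow for scenarios (1) and (2) can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not Bambi's implementation: the "posterior draws" are simulated, the model is a hypothetical logistic regression with one other covariate $v$, the helper `predict_mean` is made up for this example, and the uncertainty summary uses simple percentiles.

```python
import itertools

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior draws with shape (n_draws, 3):
# intercept, coefficient on u, coefficient on v.
draws = rng.normal(loc=[-0.5, 0.8, -0.3], scale=0.05, size=(500, 3))


def predict_mean(u, v, draws):
    """Mean of the response for each posterior draw and each grid row."""
    eta = draws[:, [0]] + draws[:, [1]] * u + draws[:, [2]] * v
    return 1.0 / (1.0 + np.exp(-eta))  # shape (n_draws, n_rows)


# Values of u and v, e.g. user provided or taken from a grid.
u_values = [1.0]
v_values = [-1.0, 0.0, 1.0]

# All pairwise combinations of u and v form the "hypothetical" dataset.
grid = np.array(list(itertools.product(u_values, v_values)))
u, v = grid[:, 0], grid[:, 1]

# Predict at u and at u + eps with v held constant, then difference and
# divide by eps to get one slope per posterior draw and grid row.
eps = 1e-4
slope_draws = (predict_mean(u + eps, v, draws) - predict_mean(u, v, draws)) / eps

# Summarize across posterior draws for each row of the grid.
estimate = slope_draws.mean(axis=0)
lower, upper = np.percentile(slope_draws, [3, 97], axis=0)

for row, est, lo, hi in zip(grid, estimate, lower, upper):
    print(f"u = {row[0]:.1f}, v = {row[1]:4.1f}: slope = {est:.3f} [{lo:.3f}, {hi:.3f}]")
```

For scenario (3), the grid would be replaced by the original data with the column for $u$ overwritten by the chosen value, and the rest of the calculation would be unchanged.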

For continuous variables


For categorical variables


Thus, the goal of `slopes` and `plot_slopes` is to provide the modeler with a summary dataframe and a visualization of the average slope conditional on, or averaged by, other covariates. Below, we demonstrate how to use the two functions using several examples.

## Some model

### User provided values

### Multiple slope values

### Conditional slopes

### Unit level slopes

#### Marginalizing over covariates

#### Average by subgroups

### Interpreting coefficients as an elasticity