Description
There have been requests (#350, #349) for new functionality around PPC error plots. These requests are reasonable and worth implementing. However, to avoid cluttering the current PPC error documentation page, and to avoid offering functions that people are unlikely to use while missing useful ones, it makes sense to first decide which functions and plots we want for error and residual plots. Here is what we currently have:
PPC Error
Right now, we have PPC error plots which plot predictive errors y - yrep.
ppc_error_hist(): A separate histogram is plotted for the predictive errors computed from y and each dataset (row) in yrep. For this plot, yrep should have only a small number of rows.
ppc_error_hist_grouped(): Like ppc_error_hist(), except errors are computed within levels of a grouping variable. The number of histograms is therefore equal to the product of the number of rows in yrep and the number of groups (unique values of group).
ppc_error_scatter(): A separate scatterplot is displayed for y vs. the predictive errors computed from y and each dataset (row) in yrep. For this plot, yrep should have only a small number of rows.
ppc_error_scatter_avg(): A single scatterplot of y vs. the average of the errors computed from y and each dataset (row) in yrep. For each individual data point y[n], the average error is the average of the errors for y[n] computed over the draws from the posterior predictive distribution.
ppc_error_scatter_avg_vs_x(): Same as ppc_error_scatter_avg(), except the average is plotted on the y-axis and a predictor variable x is plotted on the x-axis.
ppc_error_binned(): Intended for use with binomial data. A separate binned error plot (similar to arm::binnedplot()) is generated for each dataset (row) in yrep. For this plot, y and yrep should contain proportions rather than counts, and yrep should have only a small number of rows.
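To make the quantities behind these plots concrete, here is a minimal sketch of the arithmetic in plain Python. It is not the bayesplot implementation; the helper names predictive_errors and average_errors are hypothetical, and yrep is assumed to be a list of rows, one per posterior predictive draw.

```python
from statistics import fmean

def predictive_errors(y, yrep):
    """Return one row of errors y - yrep[s] per draw s (hypothetical helper)."""
    return [[y_n - yrep_sn for y_n, yrep_sn in zip(y, row)] for row in yrep]

def average_errors(y, yrep):
    """Average the errors for each observation y[n] over all draws."""
    errors = predictive_errors(y, yrep)
    # zip(*errors) iterates over observations (columns) instead of draws (rows)
    return [fmean(col) for col in zip(*errors)]

# Toy data: 3 observations, 2 posterior predictive draws
y = [1.0, 2.0, 4.0]
yrep = [[1.5, 2.0, 3.0],
        [0.5, 3.0, 4.0]]

errors = predictive_errors(y, yrep)  # what the per-draw plots would display
avg = average_errors(y, yrep)        # what the averaged scatterplots would display
```

The per-draw functions (histograms, scatterplots) would show one panel per row of errors, which is why a small number of rows in yrep is recommended, while the averaged variants collapse the draws into a single value per observation.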
Proposed Plots
Here is a short list of functions that people have proposed:
ppc_residual_scatter(): A single scatterplot of y vs. the errors computed from y and a summary of each dataset (row) in yrep. For each individual data point y[n], the error is computed as the difference between y[n] and the summary of the draws from the posterior predictive distribution. (source)
ppc_error_pava(): The PAVA-residual plot is of the form stat(cep_y - p_pred), where cep_y is a matrix of conditional event probabilities obtained by PAVA-transforming y based on the predictive probability samples in p_pred. (source)
ppc_residual_binned(): A residual plot that allows for discrete observations, similar to ppc_error_binned(). (source)
Future of PPC Error and Residual
Now, we need to decide which of these existing or proposed plots are useful, and what other error or residual plots are needed. Please mention which functionality you would like to have and which existing functionality is not needed!