Augmented-data projection (`augmat` and `augvec` objects): Replace attribute `nobs_orig` by `ndiscrete` #473
This replaces the former attribute `nobs_orig` of `augmat` and `augvec` objects by a new attribute called `ndiscrete`, giving the number of (possibly latent) response categories ($C$) instead of the number of observations ($N$).

The reason is that subsetting the rows of an augmented-rows matrix (or the elements of an augmented-length vector) is allowed in terms of the observations (individuals), but not in terms of the (possibly latent) response categories. So $C$ should always stay the same, in contrast to $N$.
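To illustrate the reasoning, here is a minimal base-R sketch. It is not projpred's internal code: the helper `subset_obs()` and the toy object `aug` are hypothetical, and the layout ($C$ blocks of $N$ rows each) is an assumption made for this example. It shows that when $C$ is stored, $N$ can always be recovered from the current number of rows, even after subsetting by observation.

```r
# Hypothetical illustration in base R; not projpred's internal code.
n_obs <- 4L    # N: number of observations
n_cat <- 3L    # C: number of (possibly latent) response categories
n_draws <- 2L  # number of columns (e.g., posterior draws)

# Assumed augmented-rows layout: C blocks of N rows each.
aug <- matrix(seq_len(n_obs * n_cat * n_draws),
              nrow = n_obs * n_cat, ncol = n_draws)
attr(aug, "ndiscrete") <- n_cat

# Hypothetical helper: subset by observation indices, keeping all categories.
subset_obs <- function(x, obs_idx) {
  n_cat <- attr(x, "ndiscrete")
  n_obs <- nrow(x) / n_cat  # N is derived from the current object, not stored
  # Select the requested observations within every category block.
  rows <- as.vector(outer(obs_idx, (seq_len(n_cat) - 1L) * n_obs, "+"))
  out <- x[rows, , drop = FALSE]
  attr(out, "ndiscrete") <- n_cat  # C is unchanged by observation subsetting
  out
}

aug_sub <- subset_obs(aug, c(1L, 3L))
n_obs_sub <- nrow(aug_sub) / attr(aug_sub, "ndiscrete")
```

Had the attribute stored $N$ instead, it would have needed updating on every such subset, whereas the stored $C$ stays valid as long as only observations are subsetted.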
Note that this subsetting convention (only observations, not categories) is only an unofficial one; there is no code preventing us from subsetting any rows/elements, even across the (possibly latent) response categories, because functions like `str()` do not adhere to that subsetting convention (this is also the reason why previously, the global option `projpred.additional_checks` was used to activate related checks only in the unit tests).

I'm sorry that storing $N$ was a bad design choice on my part in PR #322. I guess the reasons why I chose $N$ instead of $C$ back then were (i) I thought that the switch between latent space and response space might be a problem for storing $C$ and (ii) I did not think of the problems when subsetting an augmented-rows matrix (`augmat` objects) or an augmented-length vector (`augvec` objects) with $N$ being stored instead of $C$ (such subsetting, in particular in a fashion so that $N$ changes, is used only very rarely in projpred; subsampled PSIS-LOO CV is an example, see #433 and #434).