GRPtests: Goodness-of-fit testing for high-dimensional generalized linear models

Contents of this repository

This repository contains:

R package GRPtests in the GRPtests_Rpackage_update_Jan_2021 folder. The package can also be installed from CRAN (see the Installation section below).
Code reproducing empirical results from Section 4 of [1].

Installation of R package GRPtests

Installation in R from CRAN repository:

install.packages("GRPtests")

library("GRPtests")

Installation in R from this GitHub repository:

install.packages("devtools")

library(devtools)

install_github("jankova/GRPtests/GRPtests_Rpackage_update_Jan_2021/GRPtests")

Examples from empirical section of [1]

The code reproducing the empirical results from Section 4 of [1] is available in

Example_Section_4-1.R,
Example1_Section_4-2.R and Example2_Section_4-2.R,
Example_Section_4-3.R,
Example_Section_4-4.R.

Method heuristics

We briefly sketch the method's heuristics.

Let y be the response vector and let X be the design matrix, with features as columns.

Split the observations randomly into two parts (X_A, y_A) and (X_B, y_B).

Fit a GLM of y_A on X_A and, separately, of y_B on X_B. For each of the two fitted models, compute the Pearson residuals, R_A and R_B.

The main idea of the method is then as follows: if the GLM (for example, a logistic regression model) were not a good fit, we would expect some nonlinear signal to be left in the residuals. We therefore use a machine learning method (random forest by default, but any method may be used) to predict the residuals R_A from X_A.

Fitting the random forest to predict R_A from X_A gives a prediction function f_A(). If nonlinear signal was indeed left in the residuals and the random forest picked it up, the same signal should also be present in the residuals from part B. Thus, if we evaluate f_A() on X_B and compute the scalar product of f_A(X_B) and R_B, the result should be large, and we would reject the null hypothesis that the model is a good fit.
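The steps above can be sketched in a few lines of base R. This is only an illustration of the heuristic, not the implementation in the GRPtests package. It makes several assumptions: the data are simulated and low-dimensional, the GLM is fitted with plain glm() rather than a penalized fit, a quadratic least-squares fit stands in for the random forest, and the final scaling of the scalar product is an ad hoc normalization rather than the calibrated test statistic from [1]. All variable names are hypothetical.

```r
set.seed(1)

# Simulated data (hypothetical): the true logit contains a quadratic term in x2
# that the fitted GLM below omits, so the fitted model is misspecified.
n <- 1000
X <- matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("x1", "x2")))
y <- rbinom(n, 1, plogis(X[, 1] + X[, 2]^2 - 1))

# Step 1: split the observations randomly into parts A and B.
idx_A <- sample(n, n / 2)
dat_A <- data.frame(y = y[idx_A], X[idx_A, ])
dat_B <- data.frame(y = y[-idx_A], X[-idx_A, ])

# Step 2: fit the GLM on each part and compute the Pearson residuals.
fit_A <- glm(y ~ x1 + x2, family = binomial, data = dat_A)
fit_B <- glm(y ~ x1 + x2, family = binomial, data = dat_B)
R_A <- residuals(fit_A, type = "pearson")
R_B <- residuals(fit_B, type = "pearson")

# Step 3: learn the leftover signal in R_A from X_A.
# Assumption: a quadratic least-squares fit replaces the random forest.
f_A <- lm(R_A ~ poly(x1, 2) + poly(x2, 2), data = cbind(dat_A, R_A = R_A))

# Step 4: evaluate f_A on X_B and take the scalar product with R_B.
# Assumption: dividing by the norm of the predictions is an ad hoc scaling.
pred_B <- predict(f_A, newdata = dat_B)
stat <- sum(pred_B * R_B) / sqrt(sum(pred_B^2))
stat
```

Under this sketch, a value of stat that is large and positive suggests that the signal learned on part A carries over to the residuals of part B, which is the kind of evidence against the fitted model described above. Turning the statistic into a p-value requires the calibration described in [1].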

A schematic illustration of the procedure is provided in grpimage.jpg.

References

[1] Janková, J., Shah, R. D., Bühlmann, P. and Samworth, R. J. (2020). Goodness-of-fit testing in high-dimensional generalized linear models. Journal of the Royal Statistical Society: Series B 82(3), 773–795.
