Regression influence diagnostics #7044

exalate-issue-sync · 2023-05-11T15:42:01Z

Regression influence diagnostics

The SAS documentation provides a conceptual overview and formulas for the diagnostics. The one we need in particular is the DFBETA statistic. This statistic measures how much the coefficient for a variable changes when an observation is deleted.
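To make the "change when an observation is deleted" idea concrete, here is a minimal sketch that refits a one-predictor least-squares line once per deleted row and records how far each coefficient moves. It is an illustration only: the function names `fit_simple_ols` and `raw_dfbeta` and the toy data are assumptions made for this example, and the result is the raw, unscaled difference, whereas the SAS statistic also divides by a standard-error estimate. It is not the implementation referenced later in this issue.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def raw_dfbeta(xs, ys):
    """For each row i, return (full fit - fit without row i) per coefficient."""
    full_intercept, full_slope = fit_simple_ols(xs, ys)
    rows = []
    for i in range(len(xs)):
        # Drop observation i and refit on the remaining rows.
        loo_x = xs[:i] + xs[i + 1:]
        loo_y = ys[:i] + ys[i + 1:]
        loo_intercept, loo_slope = fit_simple_ols(loo_x, loo_y)
        rows.append((full_intercept - loo_intercept, full_slope - loo_slope))
    return rows


# Hypothetical toy data: the last point is far from the line y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 20.0]

for i, (d_intercept, d_slope) in enumerate(raw_dfbeta(xs, ys)):
    print(i, round(d_intercept, 3), round(d_slope, 3))
```

The row order of the result matches the row order of the input, and the outlying last observation produces the largest slope change. Refitting once per row is quadratic in cost, so this is useful for explaining the statistic rather than for large data.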

PROC LOGISTIC: https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_logistic_sect042.htm

PROC REG: https://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_reg_sect040.htm

The minimal requirement is to implement the DFBETA statistic for a binary-target model (PROC LOGISTIC). Conceptually, the same statistic can also be calculated for a continuous-target model (PROC REG).
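For the continuous-target case, the scaled form of the statistic as it is commonly written for linear regression is shown below for orientation. The notation is an assumption of this note rather than something taken from the issue, and the linked SAS pages remain the reference for the exact definitions, in particular for the logistic case, where the deleted-observation estimate is approximated rather than refit.

```latex
\mathrm{DFBETAS}_{ij}
  = \frac{b_j - b_{(i)j}}
         {s_{(i)} \sqrt{\left[ (X^\top X)^{-1} \right]_{jj}}}
```

Here $b_j$ is the $j$-th coefficient from the full fit, $b_{(i)j}$ is the same coefficient estimated without observation $i$, $s_{(i)}$ is the residual standard error estimated without observation $i$, and $X$ is the design matrix.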

The desired output is a data frame that has the same number of rows as the data frame used to estimate the model. It should have columns that correspond to the variables included in the final model after any selection algorithms have been run. The columns should contain the DFBETA values and could perhaps be named DFBETA_. The ordering of the rows should correspond to the ordering of the rows of the input data frame used to estimate the model, so that the DFBETA values can be combined back with the input data.
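The output layout described above can be sketched with plain Python dictionaries standing in for data-frame rows. Everything named here is hypothetical: the helper `attach_dfbeta`, the predictor names, the sample values, and the choice to append the variable name to the `DFBETA_` prefix are all assumptions made for illustration, not the naming used by the final implementation.

```python
def attach_dfbeta(input_rows, dfbeta_rows, predictors, prefix="DFBETA_"):
    """Append one DFBETA column per predictor to each input row, by position."""
    if len(input_rows) != len(dfbeta_rows):
        raise ValueError("DFBETA output must have one row per input row")
    combined = []
    for row, deltas in zip(input_rows, dfbeta_rows):
        if len(deltas) != len(predictors):
            raise ValueError("each DFBETA row needs one value per predictor")
        out = dict(row)  # copy so the input rows are left untouched
        for name, delta in zip(predictors, deltas):
            out[prefix + name] = delta
        combined.append(out)
    return combined


# Hypothetical training rows and placeholder DFBETA values, in the same order.
training = [
    {"age": 34, "income": 52.0, "default": 0},
    {"age": 51, "income": 31.5, "default": 1},
    {"age": 27, "income": 78.0, "default": 0},
]
dfbeta_values = [(0.01, -0.02), (-0.15, 0.30), (0.04, 0.00)]

combined = attach_dfbeta(training, dfbeta_values, ["age", "income"])
for row in combined:
    print(row)
```

Because the rows are matched by position, the result can be joined back to the input only if the diagnostics are returned in the original row order, which is the requirement stated above.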

exalate-issue-sync · 2023-05-11T15:42:03Z

Arun Aryasomayajula commented: [~accountid:557058:04659f86-fbfe-4d01-90c9-146c34df6ee6] and team to discuss and assign an engineer

exalate-issue-sync · 2023-05-11T15:42:04Z

Wendy Wong commented: I have written up a description of my implementation; see the attachment "Regression Influence Diagnostics.pdf".

h2o-ops · 2023-05-14T18:06:15Z

JIRA Issue Details

Jira Issue: PUBDEV-8638
Assignee: Wendy Wong
Reporter: Arun Aryasomayajula
State: Resolved
Fix Version: 3.40.0.1
Attachments: Available (Count: 2)
Development PRs: Available

h2o-ops · 2023-05-14T18:06:19Z

Attachments From Jira

Attachment Name: Regression Influence Diagnostics.pdf
Attached By: Wendy Wong
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8638/Regression Influence Diagnostics.pdf

Attachment Name: Regression influence diagnostics feature request.docx
Attached By: Arun Aryasomayajula
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8638/Regression influence diagnostics feature request.docx

h2o-ops · 2023-05-14T18:06:20Z

Linked PRs from JIRA

#6468

h2o-ops assigned wendycwong May 14, 2023

h2o-ops added the fixVersion/3.40.0.1 label May 14, 2023

h2o-ops closed this as completed May 14, 2023
