Studentized residuals vs fitted plot should return the threshold value #16

Closed
aravindhebbali opened this issue Jun 3, 2017 · 1 comment

commented Jun 3, 2017

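ols_dsrvsp_plot() should return the threshold value used to classify observations as outliers. At present it returns only the flagged observations, their fitted values, and their deleted studentized residuals: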

> library(olsrr)
> library(caret)
> data("Sacramento")
> lm_fit2 <- lm(price ~ beds + baths + log(sqft), data = Sacramento)
> k <- ols_dsrvsp_plot(lm_fit2)

> k$Observation
 [1] 153 154 157 158 173 292 294 313 321 322 329 331 332 333 334 366 382 511
[19] 519 542 543 548 549 550 551 552 553 612 705 781 784 794 801 803 807 808
[37] 811 813 814

> k$`Fitted Values`
 [1] 360772.40 381583.73 383950.67 515681.86 267446.73 -33842.84 148765.08
 [8] 218837.66 299506.69 293232.33 329874.54 435111.42 387833.36 328982.41
[15] 247340.74 309322.96 316369.22 192166.95 173431.54 317974.74 207462.00
[22] 390030.50 460137.30 381167.93 336377.25 462055.20 452521.77 355958.76
[29] 463038.32 232302.16 238606.40 284384.56 207829.30 220043.13 352189.48
[36] 353470.84 435119.40 499025.57 541641.87

> k$`Deleted Studentized Residual`
 [1]  2.131717  2.434828  3.315315  3.807551 -2.016768  4.447093  2.274949
 [8]  2.710143  2.208636  2.292741  3.020921  2.141023  3.546566  4.471201
[15]  7.311647 -2.275739 -2.153874  2.006514  2.445023  2.174906  3.588165
[22]  2.945940  2.494206  3.543963  4.098482  2.753821  3.699032 -2.221324
[29] -2.327895  2.168987  2.231032  2.276834  4.058212  4.278508  2.967438
[36]  2.951958  2.243076  4.593773  4.147903

@aravindhebbali aravindhebbali self-assigned this Jun 3, 2017

@aravindhebbali aravindhebbali added this to the v0.2.0 milestone Jun 5, 2017

aravindhebbali added a commit that referenced this issue Jun 5, 2017

@aravindhebbali

commented Jun 5, 2017

ols_dsrvsp_plot() now returns the threshold value used to classify observations as outliers, alongside a tibble of the flagged observations.

> library(olsrr)
> library(caret)
> data("Sacramento")
> lm_fit2 <- lm(price ~ beds + baths + log(sqft), data = Sacramento)
> k <- ols_dsrvsp_plot(lm_fit2)

> k$outliers
# A tibble: 39 × 3
   Observation `Fitted Values` `Deleted Studentized Residual`
         <int>           <dbl>                          <dbl>
 1         153       360772.40                       2.131717
 2         154       381583.73                       2.434828
 3         157       383950.67                       3.315315
 4         158       515681.86                       3.807551
 5         173       267446.73                      -2.016768
 6         292       -33842.84                       4.447093
 7         294       148765.08                       2.274949
 8         313       218837.66                       2.710143
 9         321       299506.69                       2.208636
10         322       293232.33                       2.292741
# ... with 29 more rows
Warning message:
In getNamespace("grid") : reached elapsed time limit

> k$threshold
[1] 2
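
For readers who want to see how a threshold of 2 could be applied by hand, here is a minimal base-R sketch. It is not the olsrr implementation: it assumes the rule is "absolute deleted studentized residual greater than the threshold", which is consistent with the values listed above but is not confirmed in this thread. The built-in mtcars data and the names fit, dsr, flagged, and outliers are stand-ins chosen for illustration, so the example runs without caret or olsrr.

```r
# Illustrative sketch only; mtcars stands in for the Sacramento data above.
fit <- lm(mpg ~ wt + hp, data = mtcars)

threshold <- 2                    # same value as k$threshold in the output above
dsr <- rstudent(fit)              # deleted (externally) studentized residuals
flagged <- abs(dsr) > threshold   # ASSUMED rule: |residual| > threshold

# Collect the flagged rows in the same shape as k$outliers
outliers <- data.frame(
  observation = which(flagged),
  fitted      = fitted(fit)[flagged],
  dsr         = dsr[flagged],
  row.names   = NULL
)
print(outliers)
```

Having the threshold returned next to the outliers means a check like this can reuse k$threshold instead of hard-coding 2.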