Cook's d bar plot should return the threshold used in the plot #13

Closed
aravindhebbali opened this issue Jun 3, 2017 · 1 comment

@aravindhebbali

commented Jun 3, 2017

ols_cooksd_barplot() does not return the threshold value used in the plot.

> library(olsrr)
> library(caret)
> data("Sacramento")
> lm_fit2 <- lm(price  ~ beds + baths + log(sqft), data = Sacramento)
> k <- ols_cooksd_barplot(lm_fit2)

> k$Observation
 [1]  19  48  69  74 109 121 153 154 155 156 157 158 173 278 292 294 313 315
[19] 321 322 327 329 331 332 333 334 366 382 443 457 511 543 544 547 548 549
[37] 550 551 552 553 555 599 612 705 761 781 784 801 803 805 806 807 808 809
[55] 810 811 812 813 814

> k$`Cook's Distance`
 [1] 0.004925816 0.006469155 0.009081726 0.005898704 0.007078976 0.005087501
 [7] 0.005863605 0.007676384 0.004586336 0.004469663 0.017683564 0.072050631
[13] 0.007834145 0.006855228 0.050686585 0.005360302 0.010846417 0.007199480
[19] 0.008042144 0.004714478 0.007306099 0.008606838 0.007368080 0.082314429
[25] 0.045272571 0.246924097 0.018766002 0.008591291 0.004412350 0.004904602
[31] 0.004877418 0.004500206 0.009037852 0.009439700 0.007430731 0.009794273
[37] 0.016138561 0.015583934 0.014069024 0.023726397 0.012552851 0.015113836
[43] 0.005988976 0.028129643 0.004754092 0.007761762 0.007677705 0.020385093
[49] 0.026974805 0.005075846 0.010551736 0.006427076 0.006372290 0.006065736
[55] 0.006524888 0.007945376 0.012811643 0.046742065 0.054528741

aravindhebbali self-assigned this Jun 3, 2017

aravindhebbali changed the title from "Cook's d bar plot does not return the threshold used in the plot" to "Cook's d bar plot should return the threshold used in the plot" Jun 3, 2017

aravindhebbali added this to the v0.2.0 milestone Jun 5, 2017

@aravindhebbali


commented Jun 5, 2017

ols_cooksd_barplot() now returns the threshold value used to classify observations as outliers, alongside the outliers themselves.

> library(olsrr)
> library(caret)
> data("Sacramento")
> lm_fit2 <- lm(price  ~ beds + baths + log(sqft), data = Sacramento)
> k <- ols_cooksd_barplot(lm_fit2)

> k$outliers
# A tibble: 59 × 2
   Observation `Cook's Distance`
         <int>             <dbl>
1           19       0.004925816
2           48       0.006469155
3           69       0.009081726
4           74       0.005898704
5          109       0.007078976
6          121       0.005087501
7          153       0.005863605
8          154       0.007676384
9          155       0.004586336
10         156       0.004469663
# ... with 49 more rows

> k$threshold
[1] 0.004
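
For readers who want to see the shape of such a return value without installing olsrr, here is a minimal base R sketch. The helper name `cooksd_outliers()` is hypothetical and not part of olsrr, and the cutoff of 4/n rounded to three digits is an assumption (a common rule of thumb for Cook's distance), not something this issue confirms about the package internals. The `mtcars` model is used only because the dataset ships with R.

```r
# Hypothetical helper (NOT part of olsrr): returns the observations flagged
# by Cook's distance together with the threshold used to flag them.
cooksd_outliers <- function(model) {
  cd <- cooks.distance(model)       # stats::cooks.distance, one value per observation
  n <- length(cd)
  threshold <- round(4 / n, 3)      # ASSUMPTION: 4/n rule of thumb, rounded to 3 digits
  flagged <- which(cd > threshold)  # positions of observations above the cutoff

  list(
    outliers = data.frame(
      observation = unname(flagged),
      cooks_distance = unname(cd[flagged])
    ),
    threshold = threshold
  )
}

# Illustration on a built-in dataset
fit <- lm(mpg ~ wt + hp, data = mtcars)
res <- cooksd_outliers(fit)

res$threshold  # the cutoff, available to the caller
res$outliers   # the flagged observations and their Cook's distances
```

Returning a list with both elements lets the caller reuse the cutoff, for example to filter or annotate the data, rather than reading it off the plot.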