Mismatches between significance test result of R and SPSS #100
Hi!
Personally, I would prefer to leave it as is, to be consistent with other parts of R. But if you consider it a serious issue, I can change this behaviour.
prop.test(x = c(9, 19), n = c(85, 19)) # x is number of successes, n - number of trials
# 2-sample test for equality of proportions with continuity correction
#
# data: c(9, 19) out of c(85, 19)
# X-squared = 58.636, df = 1, p-value = 1.897e-14
# alternative hypothesis: two.sided
# 95 percent confidence interval:
# -0.9917263 -0.7965090
# sample estimates:
# prop 1 prop 2
# 0.1058824 1.0000000
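For context, the `X-squared` value that `prop.test` reports for two groups is the square of the pooled two-proportion z statistic when the continuity correction is switched off (`correct = FALSE`). The sketch below checks this on the same counts. Whether `significance_cpct` uses exactly this formula is not stated in this thread, so treat it as an illustration of the base R test only.

```r
# Same counts as in the prop.test call above
x <- c(9, 19)   # number of successes
n <- c(85, 19)  # number of trials

p <- x / n                     # sample proportions
p_pooled <- sum(x) / sum(n)    # pooled proportion under H0

# Pooled two-proportion z statistic
z <- (p[1] - p[2]) / sqrt(p_pooled * (1 - p_pooled) * sum(1 / n))

# prop.test without continuity correction
res <- suppressWarnings(prop.test(x = x, n = n, correct = FALSE))

# The chi-squared statistic equals z squared
print(all.equal(unname(res$statistic), z^2))
```

With the default `correct = TRUE`, the statistic is smaller because the continuity correction is applied, which is one possible source of differences between tools.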
Hi @gdemin, Thank you for the quick and informative response.
Hi @gdemin, I hope you are doing well 😊 Many thanks!
Hi @khanhhtt, I will add an option for rounding in the next version, but I can't promise when it will be ready. As a workaround, you can use the following code:
library(expss)
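# Round half away from zero (e.g. 12.5 -> 13), unlike base round()
round2 = function(x, digits = 0) {
    posneg = sign(x)
    z = abs(x)*10^digits
    # the small epsilon guards against floating-point representation error
    z = z + 0.5 + sqrt(.Machine$double.eps)
    z = trunc(z)
    z = z/10^digits
    z*posneg
}
# Apply round2 to every column of the table except the first one (row labels)
round_table_values = function(tbl, digits = 1){
    col_index = seq_along(tbl)[-1]
    # prefix, first numeric run, suffix (e.g. significance letters)
    cell_pattern = "^(.*?)([-0-9.]+)(.*?)$"
    for(i in col_index){
        curr = tbl[[i]]
        if(is.character(curr)){
            # extract the number from each cell; cells without a number become NA
            numeric_values = suppressWarnings(as.numeric(gsub(cell_pattern, "\\2", curr)))
            numeric_values = round2(numeric_values, digits = digits)
            not_na_index = which(!is.na(numeric_values))
            # put the rounded number back between the original prefix and suffix
            curr[not_na_index] = sapply(not_na_index,
                function(cell_index)
                    gsub(cell_pattern,
                         paste0("\\1", numeric_values[cell_index], "\\3"),
                         curr[cell_index])
            )
        } else {
            curr = round2(curr, digits = digits)
        }
        tbl[[i]] = curr
    }
    tbl
}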
data(mtcars)
expss_digits(3)
mtcars = apply_labels(mtcars,
    mpg = "Miles/(US) gallon",
    cyl = "Number of cylinders",
    disp = "Displacement (cu.in.)",
    hp = "Gross horsepower",
    drat = "Rear axle ratio",
    wt = "Weight (lb/1000)",
    qsec = "1/4 mile time",
    vs = "Engine",
    vs = c("V-engine" = 0,
           "Straight engine" = 1),
    am = "Transmission",
    am = c("Automatic" = 0,
           "Manual" = 1),
    gear = "Number of forward gears",
    carb = "Number of carburetors"
)
mtcars_table = cross_cpct(mtcars,
    list(cyl, gear),
    list(total(), vs, am)
)
res = significance_cpct(mtcars_table)
round_table_values(res)
Hi @gdemin, That's great! Thank you so much for spending time on Christmas day to give me the workaround solution.
Fixed in version 0.11.6
Hi @gdemin,
Thank you for the great package. This thread is more of a question than an issue report.
I have a couple of questions about mismatches between the significance test output of R and SPSS, and I would appreciate your help:
1. Rounding: when a cell percentage has exactly 5 after the decimal point, e.g. 12.5, it is rounded half down to 12 instead of half up to 13 in the significance test output. Rounding still works as expected when the test is not performed.
This is the R script I used:
And here is the SPSS syntax:
The comparison of the significance test results is then:
![image](https://user-images.githubusercontent.com/93310387/203708179-25d6a154-a8ed-44cc-9dbb-3287b8e1e6fc.png)
The results without the significance test are still fine:
![image](https://user-images.githubusercontent.com/93310387/203708814-87c81b6f-8834-4404-8f3f-4b6aab49179d.png)
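The rounding difference described above can be reproduced in base R alone. Base `round()` rounds exact halves to the nearest even digit, so 12.5 becomes 12, while SPSS-style output rounds halves up. The sketch below compares the two on a few exact halves. `half_up` is a hypothetical name introduced for illustration, and the `floor(x + 0.5)` formula is a simplification that assumes non-negative values.

```r
# Values that sit exactly on a half
x <- c(0.5, 1.5, 2.5, 12.5)

# Base R: round half to even
half_even <- round(x)

# Simple round half up (assumption: x is non-negative)
half_up <- floor(x + 0.5)

print(data.frame(x, half_even, half_up))
```

For 12.5 this gives 12 under base `round()` and 13 under round half up, which matches the mismatch seen in the table.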
2. Some pairwise comparisons are marked as significant in R but not in SPSS.
![image](https://user-images.githubusercontent.com/93310387/203709224-3b6e8db4-b30f-4327-a067-a1cdd9f276fa.png)
In particular, R also performs the significance test when a proportion is 1.
However, the SPSS Statistics Algorithms 22 document (page 264) states that the test is not performed in this case:
![image](https://user-images.githubusercontent.com/93310387/203697328-cb623f16-0668-4a79-ada5-4b8904987310.png)
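To make the difference concrete, here is a minimal sketch of a guard that skips the test when a proportion is degenerate, in the spirit of the SPSS rule quoted above. `prop_test_or_skip` is a hypothetical helper, not part of expss or base R, and treating a proportion of 0 the same way as 1 is an assumption made here by symmetry.

```r
# Hypothetical helper: returns NA instead of a p-value when any
# proportion is exactly 0 or 1 (the 0 case is an assumption)
prop_test_or_skip <- function(x, n) {
    p <- x / n
    if (any(p == 0 | p == 1)) {
        return(NA_real_)
    }
    suppressWarnings(prop.test(x = x, n = n)$p.value)
}

# Second proportion is 19/19 = 1, so the test is skipped
skipped <- prop_test_or_skip(x = c(9, 19), n = c(85, 19))

# Both proportions are strictly between 0 and 1, so the test runs
tested <- prop_test_or_skip(x = c(9, 10), n = c(85, 19))

print(skipped)
print(tested)
```

Plain `prop.test` returns a p-value in both cases, which is consistent with R flagging comparisons that SPSS leaves untested.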
I suspect I have used the R function incorrectly, which leads to the mismatch.
Could you please take a look and give me some advice on this matter?
The attachment contains the data file, R script, SPSS script, and the comparison of results between R and SPSS for your reference.
The first sheet of the Excel file is the significance results and the second sheet is the results without the test.
Thank you in advance!
Reproducible examples.zip