This issue was moved to a discussion.
blog/chi-square-test-of-independence-in-r/ #46
Comments
Comment written by Herivelto Cordeiro dos Santos on November 30, 2020 11:55:35: Hi Antoine! I really liked your blog, I think it is clear and focused! Let me ask you about this one... I have tried the scripts for method #1 and method #2 and I got different p-values. Do you know what could be the reason for that, as they should be equal? My table is matrix(c(63,78,94,65), ncol=2).
Comment written by Antoine Soetewey on November 30, 2020 12:20:12: Thank you for your feedback! I just tried on my side, and I have the same p-values with your data, see my code here. One potential reason you have different p-values is that the first method uses Yates's continuity correction by default. Add the argument (I've added a note at the end of this section following your comment.) Hope this helps. Regards,
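For readers hitting the same discrepancy, here is a minimal sketch of the effect described above, using the table from the question. It assumes the argument in question is `correct = FALSE`, which is the documented `chisq.test()` switch for Yates's continuity correction; the exact wording of the original reply is not preserved here.

```r
# The 2 x 2 table from the question above
tab <- matrix(c(63, 78, 94, 65), ncol = 2)

# Default behaviour for 2 x 2 tables: Yates's continuity correction is applied
with_correction <- chisq.test(tab)

# correct = FALSE switches the correction off (plain Pearson chi-square)
without_correction <- chisq.test(tab, correct = FALSE)

# The two p-values differ because the correction shrinks the test statistic
with_correction$p.value
without_correction$p.value
```

If one method applies the correction and the other does not, the p-values will not match even though the data are identical.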
Hi Antoine! Thank you so much for your blog, it's very helpful! I have a question for you: I have a data frame with several (10) different categorical variables that I would like to test for possible correlations between each other. Is there a way that I can test them all at the same time, like you explained for the quantitative variables? Or is it really only possible for two variables at the same time? Not sure how I would do this for 10 variables...
Dear Clarice,
Do you want to compute correlation coefficients or perform chi-square tests? You mentioned correlations but you posted the comment on the article about the chi-square test, so I'm not sure.
For correlation, if your categorical variables are ordinal, you can simply use cor(dat, method = "spearman"), where dat is the name of your dataframe.
The standard Chi-square test for independence (with the chisq.test() function and presented in this article) is only possible between two categorical variables, so you'd need to tweak your code a bit to do it for all possible pairs of variables.
Hope this helps.
Regards,
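One possible way to run the test for all pairs of variables is sketched below. This is an illustration under assumptions, not the code from the article: the data frame `dat` and its columns (`smoker`, `region`, `gender`) are made up, and `combn()` is used to list the pairs.

```r
# Hypothetical data frame with three categorical variables (illustration only)
set.seed(42)
dat <- data.frame(
  smoker = factor(sample(c("yes", "no"), 200, replace = TRUE)),
  region = factor(sample(c("north", "centre", "south"), 200, replace = TRUE)),
  gender = factor(sample(c("female", "male"), 200, replace = TRUE))
)

# All unordered pairs of column names: one column of `pairs` per pair
pairs <- combn(names(dat), 2)

# One row of results per pair of variables
results <- data.frame(
  var1 = pairs[1, ],
  var2 = pairs[2, ],
  p_value = NA_real_
)

# Build the contingency table for each pair and store the test's p-value
for (i in seq_len(ncol(pairs))) {
  tab <- table(dat[[pairs[1, i]]], dat[[pairs[2, i]]])
  results$p_value[i] <- chisq.test(tab)$p.value
}

results
```

With 10 variables the loop would run 45 tests, so adjusting the p-values for multiple comparisons (for instance with `p.adjust()`) may be worth considering.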
Hey Antoine! Thank you so much for your quick answer. My goal is to find out whether the different variables correlate with each other or not, so I can exclude them before computing a model.
From your blog I learned that it’s not possible to compute correlation coefficients between two categorical variables (if I understood that correctly?) but only to do a contingency analysis.
Some of my categorical variables are ordinal with 3-4 levels, most of them are nominal though. I tried to use the corr() function that you suggested, but unfortunately I just can’t make it work with my R version… not sure why.
So I’ll have to find another way I guess!
Best wishes,
Hanna
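You understood correctly:
You can compute the correlation between your ordinal variables (thanks to the cor() function, with only one r and not two as you wrote in your comment),
But for your nominal variables, you cannot compute the correlation. You'll need to apply the Chi-square test of independence (with the chisq.test() function).
Regards,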
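As a rough sketch of the ordinal case: `cor()` expects numeric input, so ordered factors generally need to be converted to their integer level codes first. The data frame and variable names below (`education`, `income`) are hypothetical and only serve as an illustration.

```r
# Hypothetical ordinal variables stored as ordered factors (illustration only)
dat <- data.frame(
  education = factor(c("low", "mid", "high", "mid", "low", "high"),
                     levels = c("low", "mid", "high"), ordered = TRUE),
  income = factor(c("low", "mid", "high", "high", "low", "mid"),
                  levels = c("low", "mid", "high"), ordered = TRUE)
)

# Map each ordered factor to its integer level codes (low = 1, mid = 2, high = 3)
dat_num <- data.frame(lapply(dat, as.integer))

# Spearman correlation matrix between the ordinal variables
rho <- cor(dat_num, method = "spearman")
rho
```

The conversion only makes sense when the factor levels are declared in their natural order; for nominal variables the level codes are arbitrary and the resulting coefficient would be meaningless.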
Alright, I get it. Sorry for the confusion and thanks for your help!
Best wishes,
Hanna
Hello Mr Antoine,
Hello, Here is a reproducible example using a for loop:
Hope this helps. Regards,
Chi-square test of independence in R - Stats and R
Learn when and how to use the Chi-square test of independence in R. See also how it works in practice and how to interpret the results of the Chi-square test
https://statsandr.com/blog/chi-square-test-of-independence-in-r/