Calculating m when reduced vector contains repeated values #16

Closed
wleoncio opened this issue Sep 18, 2024 · 1 comment

Comments

@wleoncio
Member

Consider a case where the reduced vector contains duplicated values. For example, {12, 12, 16, 17, 13}. If we apply the ordering process (e.g. through the reducedVector() function in permChacko), we get:

Original vector: 12     12      16      17      13
Reduced vector : 12     12      15.333

The respective weights are 1, 1, and 3, since two reductions were needed: first {17, 13} was pooled into 15, and then 16 was pooled with that 15 (which already carried weight 2) to give 46/3 ≈ 15.333.
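The ordering process described above can be sketched in a few lines. This is a minimal pool-adjacent-violators illustration in Python, not the code behind `reducedVector()` in permChacko; the function name `reduce_vector` is hypothetical, and the choice to pool only on a strict decrease (so that ties stay separate) is an assumption made to match the output shown above.

```python
from fractions import Fraction


def reduce_vector(x):
    """Pool adjacent violators until the sequence is non-decreasing.

    Returns (values, weights); each weight counts how many original
    observations were merged into the corresponding value.
    Hypothetical helper, not part of permChacko.
    """
    blocks = []  # each block is [sum of pooled values, number pooled]
    for value in x:
        blocks.append([Fraction(value), 1])
        # Merge backwards while the previous block mean strictly exceeds
        # the current one. Ties are left alone (assumption, see lead-in).
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    values = [float(total / count) for total, count in blocks]
    weights = [count for _, count in blocks]
    return values, weights


values, weights = reduce_vector([12, 12, 16, 17, 13])
print([round(v, 3) for v in values], weights)
# prints: [12.0, 12.0, 15.333] [1, 1, 3]
```

Because ties are not pooled, the two leading 12s survive as separate entries with weight 1 each, which is exactly the situation the question below is about.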

My question now is: what is the value of m, the number of distinct quantities? Should I take the word "distinct" literally (as permChacko 1.0.0 does) and say m = 2 or should m = 3 simply because that's the length of the reduced vector? The answer directly affects which elements appear in the test statistic.

I've looked for an answer in both Chacko papers as well as in Brunk (1958), Section 6, which Chacko (1963) cites for the ordering process, all to no avail.

@wleoncio wleoncio added the question Further information is requested label Sep 18, 2024
@wleoncio wleoncio self-assigned this Sep 18, 2024
@wleoncio
Member Author

Replies from Graeme and Morten, respectively (TL;DR: m = 3):

Graeme: My instinct is that we have three weights here, so m = 3. However, I am happy to hear different views.

Morten: My instincts agree, but there are two ways of looking at it. Either you have

12, 12, 15.333 with weights 1, 1, 3, and m = 3

or

12, 15.333 with weights 2, 3, and m = 2

These two give identical values of the test statistic; however, the reference chi-squared distribution has m - 1 degrees of freedom, so comparing the test statistic with the reference distribution yields two different P-values. My instinct would be to interpret "distinct" as "non-decreasing", i.e. as complying with the assumption of the ordering, thus m = 3.
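The point about identical statistics but different P-values can be checked numerically. The sketch below assumes the statistic has the form (k/n) Σ t_j (x̄_j − n/k)², with k the number of original categories and n the total count; that form is not stated in this thread, so treat it as an assumption. It also takes the reply's statement about a chi-squared reference with m − 1 degrees of freedom at face value. The helper names `chacko_statistic` and `chi2_sf` are hypothetical, and `chi2_sf` only covers the two degrees of freedom needed here.

```python
import math


def chacko_statistic(values, weights):
    """Assumed form: (k / n) * sum_j t_j * (xbar_j - n / k) ** 2."""
    k = sum(weights)                                  # original categories
    n = sum(v * t for v, t in zip(values, weights))   # total count
    expected = n / k
    return (k / n) * sum(
        t * (v - expected) ** 2 for v, t in zip(values, weights)
    )


def chi2_sf(x, df):
    """Upper-tail chi-squared probability, closed forms for df = 1, 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch only covers df = 1 and df = 2")


pooled = 46 / 3  # the 15.333 entry of the reduced vector

# Interpretation 1: keep the tie, m = 3.
stat_m3 = chacko_statistic([12, 12, pooled], [1, 1, 3])
# Interpretation 2: merge the tie, m = 2.
stat_m2 = chacko_statistic([12, pooled], [2, 3])

p_m3 = chi2_sf(stat_m3, df=3 - 1)
p_m2 = chi2_sf(stat_m2, df=2 - 1)

print(f"statistic: m=3 -> {stat_m3:.4f}, m=2 -> {stat_m2:.4f}")
print(f"P-value:   m=3 -> {p_m3:.4f}, m=2 -> {p_m2:.4f}")
```

Under these assumptions both interpretations give a statistic of 20/21 ≈ 0.952, while the P-value is larger for m = 3 than for m = 2, so the choice of m changes the conclusion only through the degrees of freedom.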

wleoncio added a commit that referenced this issue Sep 18, 2024
wleoncio added a commit that referenced this issue Sep 18, 2024