Remove bad check in cronbach_alpha calculation #11

Closed

@jeremyevans

There is no reason to return nil if a single vector has a 0 variance. For example, let's say you are giving a test, and every single taker gets the easiest question correct. The variance for that question vector is 0, but Cronbach's alpha can still be calculated correctly for the entire dataset.
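As a rough illustration of the point above, here is a minimal plain-Ruby sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The helper names `sample_variance` and `cronbach_alpha_sketch` and the toy data are assumptions made for this example; this is not Statsample's implementation.

```ruby
# Sample variance (n - 1 denominator) of an array of numbers.
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |v| (v - mean)**2 } / (values.size - 1)
end

# Cronbach's alpha for an array of item vectors (one array of scores per
# item, all the same length). Hypothetical helper, not the library's method.
def cronbach_alpha_sketch(items)
  k = items.size
  totals = items.transpose.map(&:sum) # total score per test taker
  item_variance_sum = items.sum { |item| sample_variance(item) }
  (k.to_f / (k - 1)) * (1 - item_variance_sum / sample_variance(totals))
end

items = [
  [1, 1, 1, 1], # every taker got this question right: variance is 0
  [1, 0, 1, 0],
  [1, 0, 1, 1],
  [1, 0, 0, 0]
]

alpha = cronbach_alpha_sketch(items) # roughly 0.667
```

The zero-variance item adds nothing to the sum of item variances and shifts every total score by the same constant, so the formula still yields a finite value.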

@jeremyevans committed 002b873: Remove bad check in cronbach_alpha calculation
@clbustos
Owner

It is a design decision. You could calculate alpha using all variables with variance > 0, but you should warn the user first. Because Statsample is an in-program library, not a REPL one, I decided to be strict with this one.
Maybe a more general function should be created that is resilient to problems in the input.

@jeremyevans

I guess I don't understand why having a vector with variance = 0 indicates any problem with the input. It seems to me to be a normal situation that the library can and should handle. Now if all vectors have variance 0, I can see returning nil. Maybe change any? to all?
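To make the `any?` versus `all?` suggestion concrete, here is a small sketch. It assumes the check is roughly a predicate over per-vector variances; the method names below are hypothetical and are not taken from the Statsample source.

```ruby
# Sample variance (n - 1 denominator) of an array of numbers.
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |v| (v - mean)**2 } / (values.size - 1)
end

# Hypothetical strict check: true as soon as one vector is constant.
def any_zero_variance?(items)
  items.any? { |item| sample_variance(item).zero? }
end

# Hypothetical relaxed check: true only when every vector is constant,
# which is the case where the variance of the totals is also zero.
def all_zero_variance?(items)
  items.all? { |item| sample_variance(item).zero? }
end

mixed = [[1, 1, 1, 1], [1, 0, 1, 0], [1, 0, 0, 0]]
constant_only = [[1, 1, 1], [0, 0, 0]]

mixed_any = any_zero_variance?(mixed)            # true
mixed_all = all_zero_variance?(mixed)            # false
constant_all = all_zero_variance?(constant_only) # true
```

With `all?`, only a dataset in which no item varies at all would trigger the early return.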

@clbustos
Owner

From a psychometric perspective, one or more items with 0 variance is very serious, because it implies a bad selection of items. The meaning of the index (lower bound of correlation for equal-size tau-equivalent measurements) no longer applies directly, because the library omits one or more variables. So I should give a warning about it.
Anyway, as R does, I can provide an option to relax the requirements, like na.rm on the mean function.

@jeremyevans

I'd say that what you said is true for large datasets. In my case, I was calculating alpha from a small dataset (16 takers), and there were multiple questions that everyone got right. Since alpha can be calculated correctly even if some vectors have variance = 0, I don't see the reason to purposely refuse to calculate it. It should be up to the user to determine the meaning of the result; the library's responsibility is just to perform the calculation.

At the very least, if you are going to refuse to calculate alpha because of artificial restrictions, please raise an error with a descriptive message indicating why. Returning nil is bad as it doesn't indicate why the calculation was not done. When I first used the library and got nil, I thought I was doing something wrong, and it caused quite a bit of extra debugging time.

@clbustos
Owner

OK, you convinced me. I will add an option to raise an error (strict mode), but we should delete any vector with variance = 0.
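One possible shape for that behavior, sketched in plain Ruby. The method name `cronbach_alpha_lenient`, the `strict:` keyword, and the choice of `ArgumentError` are all assumptions made for illustration; nothing here reflects what was eventually merged upstream.

```ruby
# Sample variance (n - 1 denominator) of an array of numbers.
def sample_variance(values)
  mean = values.sum.to_f / values.size
  values.sum { |v| (v - mean)**2 } / (values.size - 1)
end

# Hypothetical helper: drops zero-variance item vectors by default, or
# raises a descriptive error when strict: true is passed.
def cronbach_alpha_lenient(items, strict: false)
  constant, usable = items.partition { |item| sample_variance(item).zero? }
  if strict && !constant.empty?
    raise ArgumentError,
          "#{constant.size} of #{items.size} item vectors have zero variance"
  end
  if usable.size < 2
    raise ArgumentError, "need at least two item vectors with nonzero variance"
  end

  k = usable.size
  totals = usable.transpose.map(&:sum) # total score per test taker
  item_variance_sum = usable.sum { |item| sample_variance(item) }
  (k.to_f / (k - 1)) * (1 - item_variance_sum / sample_variance(totals))
end

items = [
  [1, 1, 1, 1], # constant item, dropped in the default mode
  [1, 0, 1, 0],
  [1, 0, 1, 1],
  [1, 0, 0, 0]
]

lenient_alpha = cronbach_alpha_lenient(items) # roughly 0.75

# In strict mode the caller gets an explanatory error instead of nil.
strict_error = begin
  cronbach_alpha_lenient(items, strict: true)
  nil
rescue ArgumentError => e
  e
end
```

Note that dropping an item reduces k in the k/(k-1) factor, so the result can differ from the value computed with the zero-variance item left in.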

@agarie

Hi @justin808, thanks for the pull request! We're currently in the process of centralizing SciRuby's gems in the organization repositories. Can you reopen your PR on sciruby/statsample?

Thanks! I'll take a look at your PR as soon as I finish moving the other gems' issues there. :)

@justin808

Hi @agarie How do I reopen my PR? Can you please give me a link?

@agarie

Hey @justin808, I found a page in the documentation showing how to change which repository you send the PR to: https://help.github.com/articles/using-pull-requests/#changing-the-branch-range-and-destination-repository.

So, if you close these two, you probably can create new PRs pointed to SciRuby/statsample.

@jeremyevans added a commit that referenced this pull request on Mar 18, 2015
Remove bad check in cronbach_alpha calculation
@clbustos agreed this change should be made in #11, but never merged it
3eaa53e
@jeremyevans

This was fixed in the new upstream, so it can be closed now.
