Bug in stats.fisher_exact #3014

Closed
WarrenWeckesser opened this Issue Oct 23, 2013 · 4 comments

4 participants

@WarrenWeckesser

A problem with stats.fisher_exact was reported on stackoverflow:
http://stackoverflow.com/questions/19548854/python-scipy-stats-module-valueerror-axis-entry-is-out-of-bounds

The culprit appears to be this line:
https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L2633
I assume that np.max(pexact, pmode) should be np.maximum(pexact, pmode).

As it is now, pmode is being passed to np.max() as its axis argument. Usually pmode is less than 1, so the axis value is truncated to 0 and an error is not generated. For the special case reported in the Stack Overflow question, stats.fisher_exact([[1,2],[9,84419233]]), pmode is 1.00000001738, so an error is raised.

@josef-pkt
SciPy member

the change to np.maximum looks correct to me, but I also don't know the algorithm.

pmode shouldn't be really above one since it's a probability.

I don't think this bug could have produced numbers that are much wrong, since the condition only holds if pexact and pmode are almost the same.

@andreas-h

I also don't know the algorithm. But will happily prepare a PR if you think this doesn't need further investigation.

@andreas-h

Why the abs in the denominator of https://github.com/scipy/scipy/blob/master/scipy/stats/stats.py#L2583? Both pmode and pexact should be positive.

Why the float in the numerator? Both pmode and pexact are results of hypergeom.pmf, so should both be float, so their difference should be float.

@rgommers: Since this seems to be "your" code, can you comment?

@rgommers
SciPy member

IIRC that was due to a bug in stats.hypergeom, where p-value could be negative. Fixed in 0.14.0, so abs can be removed from denominator.

max --> maximum fix looks right to me.

`float: don't remember. Could be useless, or also a bug fix. Anyway, looks to me like it can be removed now.

@andreas-h andreas-h added a commit to andreas-h/scipy that referenced this issue Feb 18, 2014
@andreas-h andreas-h BUG: fix use of np.max in stats.fisher_exact
fixes #3014
312ddb9
@rgommers rgommers added this to the 0.14.0 milestone Feb 18, 2014
@rgommers rgommers closed this in #3347 Feb 18, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment