Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

binomial(n,k<0) is no longer always 0 #19158

Closed
wants to merge 5 commits into from
Closed

Conversation

smichr
Copy link
Member

@smichr smichr commented Apr 21, 2020

References to other Issues or PRs

Fixes #19082
Fixes #19030
Fixes #14577
Fixes #14612
work of #18960 continues from this by defining multinomial for arbitrary input.

Brief description of what is fixed or changed

A new branch of behavior has been opened up for binomial(n,k).

The former behavior was to return 0 whenever k was negative. Now

(n, k) == (n, n - k) for all n and k but the identity (n, k) = (n-1, k-1) + (n-1, k) no longer holds for (0,0).

>>> binomial(0,0)
1
>>> binomial(-1,-1),binomial(-1,0)
(1, 1)

In addition, all rewriting now has to check that the sign of
the arguments will produce a rewrite value that is true for
all values matching the assumptions of the arguments. As a case
in point, consider the Piecewise form for binomial(x,y)

    def _eval_rewrite_as_Piecewise(self, n, k, **kwargs):
        from sympy import gamma, Abs, Lt
        isint = lambda x: Eq(x, floor(x))
        nneg = lambda x: Or(Eq(x, 0), Eq(x/Abs(x), 1))
        for i in igen():
            if not n.has(i) and not k.has(i):
                break
        return Piecewise(
            (factorial(n)/factorial(k)/factorial(n - k),
                And(isint(n), isint(k),
                nneg(n), nneg(k), nneg(n - k))),
            (ff(n, k)/factorial(k),
                And(isint(k), nneg(k))),
            (ff(n, n - k)/factorial(n - k),
                And(isint(n - k), nneg(n - k))),
            (0, And(*[i for x in (n, k, n - k)
                for i in [isint(x), Lt(x, 0)]])),
            (gamma(n + 1)/gamma(k + 1)/gamma(n - k + 1),
                True))

The checks for non-negative values are in place to guarantee that
the given expression will be true for arguments with matching
assumptions. The 4th condition (x, y and x - y all negative integers)
is there to guard from returning gamma for that case which would give
nan when the actual answer is 0.

Other comments

Release Notes

  • functions
    • binomial(n,k) no longer gives 0 for negative values of k
    • binomial can now be rewritten as Piecewise and all other rewrites respect assumptions on args parsing

@sympy-bot
Copy link

sympy-bot commented Apr 21, 2020

Hi, I am the SymPy bot (v158). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

  • functions
    • binomial(n,k) no longer gives 0 for negative values of k (#19158 by @smichr)

    • binomial can now be rewritten as Piecewise and all other rewrites respect assumptions on args parsing (#19158 by @smichr)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.6.

Note: This comment will be updated with the latest check if you edit the pull request. You need to reload the page to see it.

Click here to see the pull request description that was parsed.

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->

Fixes #19082
Fixes #19030
Fixes #14577
Fixes #14612
work of #18960 continues from this by defining `multinomial` for arbitrary input.


#### Brief description of what is fixed or changed

A new branch of behavior has been opened up for binomial(n,k).

The former behavior was to return 0 whenever k was negative. Now

`(n, k) == (n, n - k)` for all `n` and `k` but the identity `(n, k) = (n-1, k-1) + (n-1, k)` no longer holds for (0,0).
```python
>>> binomial(0,0)
1
>>> binomial(-1,-1),binomial(-1,0)
(1, 1)
```
In addition, all rewriting now has to check that the sign of
the arguments will produce a rewrite value that is true for
all values matching the assumptions of the arguments. As a case
in point, consider the Piecewise form for binomial(x,y)
```python
    def _eval_rewrite_as_Piecewise(self, n, k, **kwargs):
        from sympy import gamma, Abs, Lt
        isint = lambda x: Eq(x, floor(x))
        nneg = lambda x: Or(Eq(x, 0), Eq(x/Abs(x), 1))
        for i in igen():
            if not n.has(i) and not k.has(i):
                break
        return Piecewise(
            (factorial(n)/factorial(k)/factorial(n - k),
                And(isint(n), isint(k),
                nneg(n), nneg(k), nneg(n - k))),
            (ff(n, k)/factorial(k),
                And(isint(k), nneg(k))),
            (ff(n, n - k)/factorial(n - k),
                And(isint(n - k), nneg(n - k))),
            (0, And(*[i for x in (n, k, n - k)
                for i in [isint(x), Lt(x, 0)]])),
            (gamma(n + 1)/gamma(k + 1)/gamma(n - k + 1),
                True))
```
The checks for non-negative values are in place to guarantee that
the given expression will be true for arguments with matching
assumptions. The 4th condition (x, y and x - y all negative integers)
is there to guard from returning gamma for that case which would give
nan when the actual answer is 0.

<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->


#### Other comments


#### Release Notes

<!-- Write the release notes for this release below. See
https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more information
on how to write release notes. The bot will check your release notes
automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
* functions
  * binomial(n,k) no longer gives 0 for negative values of k
  * binomial can now be rewritten as Piecewise and all other rewrites respect assumptions on args parsing

<!-- END RELEASE NOTES -->

@smichr smichr changed the title interpret binomial(n,k) for k < 0 binomial(n,k<0) no longer gives uniform 0 for value Apr 21, 2020
@codecov
Copy link

codecov bot commented Apr 21, 2020

Codecov Report

Merging #19158 into master will increase coverage by 0.066%.
The diff coverage is 92.920%.

@@              Coverage Diff              @@
##            master    #19158       +/-   ##
=============================================
+ Coverage   75.653%   75.720%   +0.066%     
=============================================
  Files          651       650        -1     
  Lines       169353    169462      +109     
  Branches     39969     40001       +32     
=============================================
+ Hits        128122    128317      +195     
+ Misses       35628     35555       -73     
+ Partials      5603      5590       -13     

@asmeurer
Copy link
Member

My main concern with changing the definition of binomial is how it affects simplification. Certain identities are not true for certain definitions, which means we shouldn't apply them in simplification unless the assumptions are met. Which identities are the most important in simplification, and for which definition(s) do they hold?

@iammosespaulr
Copy link
Contributor

Which identities are the most important in simplification, and for which definition(s) do they hold?

@asmeurer if you take a look below, you'll see that the first 5 identities hold without any assumptions (except finite=True)
https://github.com/sympy/sympy/blob/457fd213b4d17ce90c7db664e7948a4d472400ba/sympy/functions/combinatorial/tests/test_comb_factorials.py#L423-L437


w.r.t simplification, it tends to call rewrite_as_gamma and we've got that covered. It'll work given the correct assumptions.

@@ -16,7 +16,7 @@
def test_rational_algorithm():
f = 1 / ((x - 1)**2 * (x - 2))
assert rational_algorithm(f, x, k) == \
(-2**(-k - 1) + 1 - (factorial(k + 1) / factorial(k)), 0, 0)
(-2**(-k - 1) - k, 0, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually simplified this part even further.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're equivalent btw

>>> (-2**(-k - 1) + 1 - (factorial(k + 1) / factorial(k))).equals(-2**(-k - 1) - k)
True

@asmeurer
Copy link
Member

Another thing that I'd like to see is documentation that discusses the definition, what the advantages are, alternate definitions (it should at least be clear that there are alternate definitions), and so on. It can go in the docstring, unless it ends up being really long, in which case it may be better as a separate document in the Sphinx docs.

@asmeurer
Copy link
Member

And we should make it clear in the release notes that this is a breaking change.

result = result * d // i
return Ntype(result)
else:
return Mul(*[n + i for i in range(1 - k, 1)])/factorial(k)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screen Shot 2020-04-22 at 1 56 45 AM
link here
This is what our implementation does.

>>> n = -3; k = -5
>>> ff(n,k)/factorial(k)
nan
>>> k = n - k
>>> ff(n,k)/factorial(k) <- we choose the second option
6

and yes binomial(-3, -5) == 6

@iammosespaulr
Copy link
Contributor

@smichr the sympy-bot only has one author in it's release notes, even though the commits have 2 authors. Bug ?

@asmeurer
Copy link
Member

Yeah, the SymPy Bot doesn't know about the co-authored-by flag. I opened an issue here sympy/sympy-bot#85. If it isn't fixed before this is merged you can just update the entry in the wiki manually.

@smichr
Copy link
Member Author

smichr commented Apr 21, 2020

this is a breaking change.

@iammosespaulr , correct me if I am wrong, but the only thing that will break is the unlimited rewriting of binomial when assumptions do not allow. So what this means is that one must now declare their k values as positive to get the previous rewriting behavior. (I say positive because some rewriting won't happen if both n and k can be zero.)

@asmeurer
Copy link
Member

If it isn't fixed before this is merged you can just update the entry in the wiki manually.

@smichr you could also fix this by making the author metadata in the first commit point to @iammosespaulr. Git's own metadata can only have one author, which is why the co-authored-by thing in the commit message exists.

@iammosespaulr , correct me if I am wrong, but the only thing that will break is the unlimited rewriting of binomial when assumptions do not allow. So what this means is that one must now declare their k values as positive to get the previous rewriting behavior. (I say positive because some rewriting won't happen if both n and k can be zero.)

I mean breaking in the sense that the definition of the function is different. So if someone relied on the old definition somehow, then this will break things for them. It might be worth in the documentation, in the part that discusses alternate definitions, to give Piecewise expressions for them for anyone who prefers to work with them.

@sylee957
Copy link
Member

Should this be considered as fixing the wrong result or making a backward incompatible change?
I think that this an ambiguous place between, but I'm afraid that we may need to provide both binomial definitions with a new function like binomial2 if we take backward compatibility most seriously.

On the contrary, we should investigate if it is easy to build the definition of Feller/Comtet/Knuth on top of this. I suppose that it can be built with a simple piecewise wrapper, and this can be documented.

@iammosespaulr
Copy link
Contributor

this is a breaking change.

@iammosespaulr , correct me if I am wrong, but the only thing that will break is the unlimited rewriting of binomial when assumptions do not allow.

Yes!

So what this means is that one must now declare their k values as positive to get the previous rewriting behavior. (I say positive because some rewriting won't happen if both n and k can be zero.)

Yeah!, nonnegative will do, cause we have gamma(n + 1) and while n is nonnegative we're good to go 😄

@iammosespaulr
Copy link
Contributor

It might be worth in the documentation, in the part that discusses alternate definitions, to give Piecewise expressions for them for anyone who prefers to work with them.

Gotcha

@iammosespaulr
Copy link
Contributor

iammosespaulr commented Apr 22, 2020

Should this be considered as fixing the wrong result or making a backward incompatible change?

Fixing the wrong result, not necessarily wrong but, it worked well for the previous definition cause it was limited.

On the contrary, we should investigate if it is easy to build the definition of Feller/Comtet/Knuth on top of this. I suppose that it can be built with a simple piecewise wrapper, and this can be documented.

tbh I'm not exactly sure which implementation this is, cause FCK doesn't work for negative integers and JRL uses a limiting gamma value 🤷‍♂️.
our's is consistent with the version mathematica uses tho

#19067 needs to be fixed before a textbook implementation is possible ?

@smichr
Copy link
Member Author

smichr commented Apr 22, 2020

Should this be considered as fixing the wrong result or making a backward incompatible change?

As a suggestion for the future, it would be better to raise NotImplemented for an unhandled range of values (in this case k < 0) rather than making an arbitrary value "for convenience" (as was the previous behavior).

@smichr smichr changed the title binomial(n,k<0) no longer gives uniform 0 for value binomial(n,k<0) is no longer always 0 Apr 22, 2020
@asmeurer
Copy link
Member

It's not a wrong result. The different definitions are all valid. There's no unique way of extending the binomial to negative arguments, so you have to define how to do it. See the discussions in #14577

@asmeurer
Copy link
Member

As a suggestion for the future, it would be better to raise NotImplemented for an unhandled range of values (in this case k < 0) rather than making an arbitrary value "for convenience" (as was the previous behavior).

The existing definition is not a random thing that was chosen "for convenience". It's a well established definition that has mathematical backing. See #14577 and the John Cook blog post.

I don't think "the values are all 0" should be considered inherently a bad thing. That's what falls out of the limiting definition of gammas. One could just as well ask why factorials of negative integers are infinite, when it could be more "useful" for them to be some finite value. But the gamma function has a very clean mathematical definition, and nice properties (like being the only log-convex extension of factorial), and it just happens to have poles at negative integers. So we should be focused on which definition has the better inherent properties, not just which has values that seem nicer at a glance because they aren't 0.

@smichr
Copy link
Member Author

smichr commented Apr 22, 2020

not a random thing that was chosen "for convenience"

I am just quoting the documentation (which may be wrong). Whereas factorial says

By convention (consistent with the gamma function and the
binomial coefficients), factorial of a negative integer is complex infinity.

binomial says

For the sake of convenience for negative integer k this function will return zero no matter what
valued is the other argument.

And that automatically means the relationship (n,k) = (n, n - k) is being ignored since (3,-2) could be (3, 5). But I really don't know much about these things and will rely on @iammosespaulr for the heavy lifting here (or anyone else that wants to help).

@asmeurer
Copy link
Member

binomial(3, 5) is 0.

@iammosespaulr
Copy link
Contributor

I think binomial(-3 , -5) is more appropriate 😅 .
binomial(-3, 2) is 6 and now binomial(-3 , -5) is also 6. earlier it used to be zero

binomial(3, 5) is 0.

@iammosespaulr
Copy link
Contributor

and @asmeurer I've got an idea about the gamma rewrite. One that is consistent with all integers negative and positive.
Yes gamma is infinity for any integers other than .
I propose doing something like this.
Screen Shot 2020-04-23 at 2 04 06 AM
the version I just came up with seems to work flawlessly for both negative and positive ints.
@smichr if this works out, we wouldn't need to use conditionals in rewrite_gamma

@asmeurer
Copy link
Member

This is the sort of thing that should be documented (even if we keep the current definition it should be documented). The definition of binomial(n, k) for 1 <= k <= n integer is completely unambiguous, because its defined combinatorially (the number of ways of choosing k elements from n, or the size of the set {A | card(A)=k and x ⊆ {1, ..., n}}). The extension for k > n or k = 0 follows from the common conventions on counting. The ambiguity comes when n or k are negative integers, or more generally when they are arbitrary complex numbers. You can't define those cases combinatorially, so you have to extend the base definition somehow. You can use n!/(k!(n-k)!), and use gammas instead of factorials, but the issue is that for negative integers, the factorial is infinite, so you have to take limits. But there are two variables, so the limit isn't unique (it depends on what order you take limits).

I am just quoting the documentation (which may be wrong). Whereas factorial says

I see. That's a poorly chosen phrase in the documentation. There is a motivation for the definition. It just isn't unique in the same way factorial is. There aren't strong arguments to make factorial(-n) anything other than infinity. The only factorial identity is n! = n*(n-1)!, which works with the definition (0! = 1 = 0*oo is an indeterminate), but it does work out with the gamma extension, as well as the combinatorial definition of binomial. But for binomial there are arguments for both definitions, because some identities work for one and some for the other. And there isn't a strong combinatorial reason to choose one over the other, at least that I'm aware of (if there is let me know).

@smichr
Copy link
Member Author

smichr commented Apr 22, 2020

I think binomial(-3 , -5) is more appropriate

yes...that's what I meant. Ugh. I should just stay out of this.

@iammosespaulr
Copy link
Contributor

And there isn't a strong combinatorial reason to choose one over the other, at least that I'm aware of (if there is let me know).

@asmeurer the reason why I advocate for -ve int n's and -k's are because.

Take the example ( y + x ) -3
( I'm setting y = 1 for simplicity )

(1 + x)-3 = 1 - 3 x1 + 6 x2 - 10 x3 + 15 x4 + O(x5) (Taylor series)
(1 + x)-3 = (1/x)3 - 3/x4 + 6/x5 - 10/x6 + O((1/x)7) (Laurent series)

binomial(-3, 3)= 10, binomial(-3, 4) = 15 ,binomial(-3, -5)= 6, binomial(-3, -6) = 10 and so on

so binomial(n, k) can be interpreted as the coefficient of the term with power k in a polynomial with two variables of order n

I think this beautifully captures the essence of having binomial support all integers. ( this interpretation works for complex numbers too )
This also easily extends to multinomials.

@asmeurer
Copy link
Member

so binomial(n, k) can be interpreted as the coefficient of the term with power k in a polynomial with two variables of order n

#14577 (comment)

@iammosespaulr
Copy link
Contributor

@asmeurer also, extending binomials to -ve integers gives multinomial a whole new level of symmetry

the order of the arguments shouldn't matter in multinomial !


With extension to negative ints...

binomial(n , k) = multinomial(k , n - k) = multinomial(n - k , k) = binomial(n , n - k)

i.e multinomial(i , j) == multinomial(j , i)


Without extension

binomial(n, k) == multinomial(k, n - k)
e.g
binomial(-3  , -5) == 0 == multinomial(-5 , 2)
binomial(-3 , 2) == 6 == multinomial(2, -5)
so
multinomial(i , j) != multinomial(j , i)

This is just wrong! cause multinomial(i , j) == multinomial(j , i)

smichr and others added 4 commits April 27, 2020 10:54
The reason this commit is so long is that an entire new branch of
behavior has been opened up. So the assumption that

(n, k) = (n-1, k-1) + (n-1, k)

no longer holds for (0,0).

In addition, all rewriting now has to check that the sign of
the arguments will produce a rewrite value that is true for
all values matching the assumptions of the arguments. As a case
in point, consider the Piecewise form for binomial(x,y)

    def _eval_rewrite_as_Piecewise(self, n, k, **kwargs):
        from sympy import gamma, Abs, Lt
        isint = lambda x: Eq(x, floor(x))
        nneg = lambda x: Or(Eq(x, 0), Eq(x/Abs(x), 1))
        for i in igen():
            if not n.has(i) and not k.has(i):
                break
        return Piecewise(
            (factorial(n)/factorial(k)/factorial(n - k),
                And(isint(n), isint(k),
                nneg(n), nneg(k), nneg(n - k))),
            (ff(n, k)/factorial(k),
                And(isint(k), nneg(k))),
            (ff(n, n - k)/factorial(n - k),
                And(isint(n - k), nneg(n - k))),
            (0, And(*[i for x in (n, k, n - k)
                for i in [isint(x), Lt(x, 0)]])),
            (gamma(n + 1)/gamma(k + 1)/gamma(n - k + 1),
                True))

The checks for non-negative values are in place to guarantee that
the given expression will be true for arguments with matching
assumptions. The 4th condition (x, y and x - y all negative integers)
is there to guard from returning gamma for that case which would give
nan when the actual answer is 0.

Co-authored-by: Moses Paul R <iammosespaulr@gmail.com>
Co-authored-by: Chris Smith <smichr@gmail.com>
It is tempting to return an unconditional

gamma(n + 1)/(gamma(k + 1)*gamma(n - k + 1))

The problem is that when an arg in the denominator is zoo
then there is the immediate possibility of getting zoo/(zoo*zoo)
when k = -1, n = -2 is substituted, and this would give nan instead
of 0. If only a single argument of the 2 denominators were zoo then
the correct value of 0 would be returned. The problem is that if
the denominator is unevaluated and the expression is multiplied by
a gamma with the same argument the two arguments would silently
cancel:

    >>> gamma(n + 1)/(gamma(k + 1)*gamma(n - k + 1))*gamma(k + 1)
    gamma(n + 1)/gamma(n - k + 1)
    >>> _.subs(k,-2).subs(n,0)
    1/2

This value of 1/2 is the correct limiting value but if the terms
were not cancelled, a nan would be the result since 0*zoo is nan.
@smichr smichr force-pushed the negk branch 2 times, most recently from 5be4592 to f8d6070 Compare April 27, 2020 19:08
@czgdp1807
Copy link
Member

Any news on this? @smichr

Copy link
Member

@asmeurer asmeurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see documentation on this before this is merged. #19158 (comment)

@iammosespaulr
Copy link
Contributor

Another thing that I'd like to see is documentation that discusses the definition, what the advantages are, alternate definitions (it should at least be clear that there are alternate definitions), and so on. It can go in the docstring, unless it ends up being really long, in which case it may be better as a separate document in the Sphinx docs.

It's almost done, I added definitions for the main implementation.
we worked out most of the kinks w.r.t rewriting as gamma, so now simplification works for both neg int's etc.
Working on docs for ☝️ .

@smichr
Copy link
Member Author

smichr commented Apr 28, 2020

Any news on this? @smichr

work is continuing at #18960 -- the addition of multinomial is trivial by comparison so let's doo all the review there.

@smichr smichr closed this Apr 28, 2020
@smichr smichr deleted the negk branch May 4, 2020 04:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants