Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

concrete: Implement Zeilberger's algorithm for finding recurrences #14701

Open
wants to merge 21 commits into
base: master
Choose a base branch
from

Conversation

Kircheisser
Copy link

@Kircheisser Kircheisser commented May 6, 2018

I have written an implementation of Zeilberger's algorithm, if this is of interest. It allows one to compute a great deal of definite hypergeometric sums that can't be done by for example gosper.

For example, if we want to calculate sum of F(n, k) = (-1)^k * binomial(2n, k)^3 from k = 0 to 2n, this algorithm provides us a first order recurrence in n that F satisfies, this leads to a first order recurrence our sum satisfies. This recurrence is then quickly solved and sum found to be (-1)^n (3n)! / (n!)^3.

  • concrete
    • Added an implementation of Zeilbergers algorithm for computing definite hypergeometric sum and integrated this implementation into the methods that compute definite sums, this allows some new sums to be computed.

Other comments

If this is of interest there are other more general things that can be done in the same vein.

This implements Zeilberger's algorithm for finding recurrences that
hypergeometric functions satisfy. Given a hypergeometric function
F(n, k), we find a function G and polynomials a_0, ... , a_J in n
such that

a_0F(n, k) + a_1F(n+1, k) + ... + a_JF(n+J, k) = G(n, k+1) - G(n, k)

This is of great benefit when finding definite sums that do not have
closed form expressions for their indefinite sums.

A simple example,

>>> F = binomial(n, k)
>>> zb_recur(F, 1, n, k)
([-2, 1], k/(k - n -1))

This tells us that for F(n, k) = binomial(n, k) we have
-2F(n, k) + F(n+1, k) = G(n, k+1) - G(n, k)
where G(n, k) = F * k/(k - n -1) (and 0 for k>n)

Now denote the sum of F(n, k) from k = 0 to n by f(n), summing
both sides of the recurrences from k = 0 to n+1, the right hand side
telescopes and we arrive at -2f(n) + f(n + 1) = 0. After checking an
initial condition we find f(n) = 2^n.

For another example taking
F(n, k) = (-1)^k * binomial(n, k) / binomial(x+k, k)
we quickly find the sum from k = 0 to n to be x / (x + n)
At some point in the algorithm we (clumsily) find the first symbol
in an expression, we intended to break out after finding it and failed
to do so.
return [c[1] for c in coeffs[0:J + 1]], combsimp(G / F)

"""
Tests
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests should be kept separate.

Also changed formatting of zb_recur docstring as did not intend for
it to provide tests via ./bin/doctest.
@asmeurer
Copy link
Member

asmeurer commented May 6, 2018

This needs tests.

@asmeurer
Copy link
Member

asmeurer commented May 6, 2018

This is promising. The example you gave gives:

>>> pprint(summation((-1)**k * binomial(2*n, k)**3, (k, 0, 2*n)))
⎧         3
⎪⎛  2⋅n  ⎞   ┌─  ⎛        1, 1, 1, 1        │  ⎞    ┌─  ⎛-2⋅n, -2⋅n, -2⋅n │  ⎞
⎪⎜       ⎟ ⋅ ├─  ⎜                          │ 1⎟ +  ├─  ⎜                 │ 1⎟  for -6⋅n - 2 < 0
⎪⎝2⋅n + 1⎠  4╵ 3 ⎝2⋅n + 2, 2⋅n + 2, 2⋅n + 2 │  ⎠   3╵ 2 ⎝      1, 1       │  ⎠
⎪
⎪                              2⋅n
⎪                              ____
⎨                              ╲
⎪                               ╲              3
⎪                                ╲      k ⎛2⋅n⎞
⎪                                ╱  (-1) ⋅⎜   ⎟                                    otherwise
⎪                               ╱         ⎝ k ⎠
⎪                              ╱
⎪                              ‾‾‾‾
⎩                             k = 0

which looks like the right answer, except SymPy can't simplify it:

>>> a = summation((-1)**k * binomial(2*n, k)**3, (k, 0, 2*n))
>>> b = (-1)**n*factorial(3*n)/factorial(n)**3
>>> [hyperexpand(a.subs(n, i)) == b.subs(n, i) for i in range(10)]
[True, True, True, True, True, True, True, True, True, True]

Have you looked into what it will take to include this in Sum.doit?

There are 3 examples in the docstring of zb_recur. We check the
entire result of 2 and just the recurrence of 1. In a different file
we have a number of tests, for convenience we just check that
the function G/F returned by the function is correct. It is highly
unlikely this function be correct and the recurrence wrong.
@Kircheisser
Copy link
Author

Kircheisser commented May 6, 2018

I have looked into it briefly. It shouldn't be so hard. Certainly given the recurrence for F(n, k) it is simple to get to the sum itself. However it involves solving a recurrence relation with polynomial coefficients, e.g. via rsolve.

This is fine for order one, but rsolve does not seem to produce the most general answers for higher order recurrences, which may lead to it not always finding a solution when it should.

Certainly I am happy to include it in.

b_deg = _find_b_deg(p_2, p_3, p)

if b_deg < 0:
return "try higher order"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functions should not return strings. If there is an error, it should raise an exception (like raise ValueError("try higher order")).

if b_deg < 0:
return "try higher order"

b = Poly(sum(w(i) * k**i for i in range(b_deg + 1)), k)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just pass Poly a list of coefficients. It will be more efficient that way too.

@Abdullahjavednesar Abdullahjavednesar added the PR: author's turn The PR has been reviewed and the author needs to submit more changes. label May 7, 2018
This function attempts to compute tricky definite sums by find a
recursion with zb_recur, this function presently has some issues
and is not doing as much as it could be. Nonetheless it allows one
to compute sums which sympy is seemingly unable to do (e.g. both
examples in the docstring).

An example,

F = k * binomial(2 * n + 1, 2 * k + 1), calling
zb_sum(F, n, 1, n, k) gives

4*2**(2*n - 3)*factorial(n - 1/2)/factorial(n - 3/2)

which is correct, and simplifies to 4^(n-1) * (2n - 1).

Presently zb_sum has some problems when it comes to actually making
use of the recurrence (which is working fine), for example rsolve
does not produce the correct result for the recurrence that
(-1)^k * binomial(2n, k)^3 satisfies and as a result can not arrive at
the sum.

Other changes -

Now zb_recur gives a value error rather than returning a string.

In zb_recur we at some point create a polynomial in k and we are
able to just provide the coefficients in a list, so we do that.

The docstring of zb_recur was slightly changed. Previously it
asserted that something it returned was a polynomial, while this
can be made to be true. There is no make it be so, the docstring
now better reflects reality.
@Kircheisser
Copy link
Author

Kircheisser commented May 7, 2018

I have added a function which attempts to do sums with zb_recur. It does some things well and others not so much, for sure it needs some thought (from me).

An example lifted from the docstring.

Take F = (-1)**k * binomial(x - k + 1, k) * binomial(x - 2 * k, n - k) then zb_sum(F, n, 2, n, k)
returns (-1)**n/2 + 1/2

A big issue however is rsolve. It for example is not able to solve the recurrence for (-1)^k * binomial(2n,k)^3 (zb_recur for sure provides the correct recurrence). Many other examples exist (perhaps most that zb_recur will provide).

A particularly simple example of where this is an issue, we attempt to compute the sum of
F(n, k) = binomial(n, k) * binomial(2k, k) / (-2)^k.

zb_recur informs us that the sum satisfies the recurrence f(n) - (n + 2)*f(n + 2)/(n + 1) = 0, which is correct. But rsolve is unable to solve it.

In fact, there is a first order recursion rsolve fails at (simpler than the (-1)^k * binomial(2n, k)^3 one)

Take r = n*f(n) + f(n+1) then rsolve(r, f(n)) returns None

Such trivial recursions are likely quite common in these sorts of hypergeometric sums - there are a number of identities where things are say 1 for n = 0 and otherwise 0.

I would be happy to look at rsolve some time to see if I can fix these things.

@asmeurer
Copy link
Member

asmeurer commented May 7, 2018

So it sounds like we may need to improve rsolve before this can be really be useful. Even so, it may be useful to integrate this into Sum, even if it can only compute a subset of what it could because of rsolve.

Changed the arguments so that the function can be called just with
what is provided to sum. For now we will check conditions so that
vaniish is true and won't attempt to use zb_sum if it is not.

Also corrected some limits that would be wrong for more exotic
upper limits than n + c.
@Abdullahjavednesar Abdullahjavednesar added PR: sympy's turn and removed PR: author's turn The PR has been reviewed and the author needs to submit more changes. labels May 8, 2018
Now zb_sum can be called with just the information provided to it
by summation. So it can now conceptually be included. There is now
a function which tries to work out if a hypergeometric term vanishes
past some window by looking at the roots of F(n, k + 1) / F(n, k).

Noticed that thing were being called hypergeometric functions when
what was meant was hypergeometric term, so that's changed.
Need this so that it simplifies correctly.
Now summation calls zb_sum if it has decided it has something
hypergeometric and is just about to return the piecewise representation.

We call zb_sum with J = 1 and J = 2 only to make it less likely the sum
gets caught in zb_sum for a long time. This still allows for a great
deal of sums, 'most' of the ones of interest that can be done are
liable to be caught just with J = 1, maybe.

Now for example summation(binomial(n, k) + 1, (k, 0, n)) will return
2**n + n + 1.

Will see if this breaks anything and if not have a clean up and check
everything is fine.
@Kircheisser
Copy link
Author

Kircheisser commented May 10, 2018

I have implemented zb_sum into sum now, it will definitely need more checking by me. but it can now do sums it couldn't do before e.g. the sum (-1)**k * binomial(x - k + 1, k) * binomial(x - 2 * k, n - k) k=0 to n one, and if rsolve could solve the first order recurrence it is given it would solve sum (-1)^k * binomial(2n, k)^3 from k = 0 to 2n.

Anywho, I have broken some tests, I will need to see what's happened.

The first one I see

def test_issue_2787():
    n, k = symbols('n k', positive=True, integer=True)
    p = symbols('p', positive=True)
    binomial_dist = binomial(n, k)*p**k*(1 - p)**(n - k)
    s = Sum(binomial_dist*k, (k, 0, n))
    res = s.doit().simplify()
    assert res == Piecewise(
        (n*p, p/Abs(p - 1) <= 1),
        ((-p + 1)**n*Sum(k*p**k*(-p + 1)**(-k)*binomial(n, k), (k, 0, n)),
        True))

It looks like summation with zb_sum is just returning np.

Edit - I only see now that sympy is doing some clever things piecewise not assuming n is integer aha, pretty cool. I'll have to change things so that zb_sum is only called with this assumption.

@asmeurer
Copy link
Member

If Zeilberger isn't able to give convergence conditions you might want to move it below the meijerg algorithms that do give them so that they are tried first.

@Kircheisser
Copy link
Author

Kircheisser commented May 10, 2018

Sorry which convergence do you mean? Zeilberger (I believe, I am somewhat learning as I go along) should be such that when it tries to sum say from 0 to n (I only say things about finite sums), the resulting expression will be the correct one for positive integer n where it's defined provided the summand is defined everywhere in the sum.

For the above binomial distribution example, do we not want it to just return np, given that n is positive integer?

@asmeurer
Copy link
Member

Ah you're right. I don't know if the condition given for p is useful. The bounds given by Sum or Integral aren't guaranteed to be tight.

If it only deals with finite sums I think having Zeilberger first should be fine.

zb_sum is only valid for such n and summation allows for more than
this, so we need to check.

Changed a test in sums_products, previously a piecewise result
was being returned, while this was correct, zb_sum produces the
exact result.
Before zb_sum could potentially recursively call itself if there
were strange upper limits on the sum like n^2. If it couldn't solve
the resulting thing this could cause an infinite loop.
Added try-catch to zb_sum as it seemed the neatest way to deal with
G_a or G_b not being defined, or rsolve recieving too complicated
an expression.

Summation used to call zb_sum with J=1 and J=2, but this can lead to
some sums taking quite a lot of time while zb_recur tries to solve the
J=2 case, and while often a recurrence is produced, rsolve is unable
to solve it so ultimately None is returned anyway.
@moorepants
Copy link
Member

@Kircheisser Can you add release notes to the first comment box as per: https://github.com/sympy/sympy/wiki/Writing-Release-Notes

@moorepants
Copy link
Member

This looks fine to me. I'm not familiar with the methods or code. The only thing questionable is the changed test.

@moorepants moorepants added PR: author's turn The PR has been reviewed and the author needs to submit more changes. and removed PR: sympy's turn labels Oct 7, 2018
@Kircheisser
Copy link
Author

Hey, just seen this, I'll add these release notes.

@sympy-bot
Copy link

sympy-bot commented Oct 13, 2018

Hi, I am the SymPy bot (v149). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

  • concrete
    • Added an implementation of Zeilbergers algorithm for computing definite hypergeometric sum and integrated this implementation into the methods that compute definite sums, this allows some new sums to be computed. (#14701 by @Kircheisser and @sidhantnagpal)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.6.

Note: This comment will be updated with the latest check if you edit the pull request. You need to reload the page to see it.

Click here to see the pull request description that was parsed.

I have written an implementation of Zeilberger's algorithm, if this is of interest. It allows one to compute a great deal of definite hypergeometric sums that can't be done by for example gosper. 

For example, if we want to calculate sum of F(n, k) = (-1)^k * binomial(2n, k)^3 from k = 0 to 2n, this algorithm provides us a first order recurrence in n that F satisfies, this leads to a first order recurrence our sum satisfies. This recurrence is then quickly solved and sum found to be (-1)^n (3n)! / (n!)^3. 

<!-- BEGIN RELEASE NOTES -->
* concrete
    * Added an implementation of Zeilbergers algorithm for computing definite hypergeometric sum and         integrated this implementation into the methods that compute definite sums, this allows some new sums to be computed.
<!-- END RELEASE NOTES -->

#### Other comments
If this is of interest there are other more general things that can be done in the same vein.

@sidhantnagpal sidhantnagpal added PR: sympy's turn and removed PR: author's turn The PR has been reviewed and the author needs to submit more changes. labels Oct 19, 2018
@sidhantnagpal
Copy link
Member

I have resolved the conflict (it was related to the ordering of imports). Let's see if this can be completed.

@czgdp1807
Copy link
Member

@Kircheisser Are you still working on it? Please resolve the conflicts. Thanks.

@czgdp1807
Copy link
Member

ping @smichr @asmeurer Can we continue this?

@czgdp1807 czgdp1807 added PR: author's turn The PR has been reviewed and the author needs to submit more changes. and removed PR: sympy's turn labels Nov 19, 2019
@czgdp1807 czgdp1807 added Please take over and removed PR: author's turn The PR has been reviewed and the author needs to submit more changes. labels Dec 24, 2019
@Chinrank
Copy link

Hello, I am Kircheisser (though I have since forgotten my credentials for that account). I completely forgot about this after I got a job as a software developer. I still understand what is going on and am happy to continue; and I would hope I can do a better job programming wise now, if there is still an interest?

@asmeurer
Copy link
Member

I think so, unless the algorithms have already been implemented. You can check some of the test cases in SymPy master. You'll need to start a new PR if you can't push to your old account.

@sachin-4099
Copy link
Contributor

I will try to work on this along with @Chinrank to merge this PR.

@Chinrank
Copy link

Chinrank commented Feb 4, 2020

That's great @sachin-4099, I intend to look at it this weekend.

My belief is that we should go through what is already there, document it, confirm its correctness, and write it a bit more nicely. In the code there is a comment referencing where the algorithm can be found, the algorithm itself is reasonably complex but what is written already should be a straightforward implementation of it. Provided you are not already familar with this algorithm I think a good litmus test for this being a reasonable peice of work is that you are satisfied it is correct and understand it.

@Kircheisser
Copy link
Author

I managed to login to my account. Looking through the code and rereading the algorithm, the implementation looks to me to be correct. @sachin-4099 let me know what you think and I can make further changes.

@czgdp1807
Copy link
Member

Thanks @Kircheisser for the update. I have removed the Please take over label. Feel free to push more commits to this PR. Thanks for the contributions.

@abhinav-anand-addepar
Copy link
Member

@Kircheisser This seems good addition to sympy. Are you still working on this? If not, then i would love to take over this pr.

@czgdp1807
Copy link
Member

@abhinav28071999 Would you still like to take this over?

@eagleoflqj eagleoflqj added the Almost Done Active PR that may only needs minor tweak. Shouldn't be burried in long PR list. label May 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Almost Done Active PR that may only needs minor tweak. Shouldn't be burried in long PR list. concrete Merge conflict
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet