Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Combination (Math) implementation in NLTK #1181

Closed
alvations opened this Issue Oct 28, 2015 · 6 comments

Comments

Projects
None yet
3 participants
@alvations
Copy link
Contributor

alvations commented Oct 28, 2015

Strangely, python don't seem to provide a quick/easy way to get mathematical combinations/choose (nCr). It's sort of essential for the RIBES algorithms and most rank-based Machine Translation metrics.

Would it be okay to add this function to NLTK? It seems to be the fastest computation of combinations.

def choose(n, k):
    """
    This function is a fast way to calculate binomial coefficients, commonly
    known as nCk, i.e. the number of combinations of n things taken k at a time. 
    (https://en.wikipedia.org/wiki/Binomial_coefficient).

        >>> choose(4, 2)
        6
        >>> choose(6, 2)
        15

    :param n: The number of things.
    :type n: int
    :param r: The number of times a thing is taken.
    :type r: int
    """
    if 0 <= k <= n:
        ntok, ktok = 1, 1
        for t in range(1, min(k, n - k) + 1):
            ntok *= n
            ktok *= t
            n -= 1
        return ntok // ktok
    else:
        return 0

I could do a wrap for the code shipped with native python but it's kind of silly to generate combinations everytime i need to count nCr and then do a length from it, i.e.

import itertools
def choose(n,k):
    return len(list(itertools.combinations(range(n),k)))

If it's alright to add the combination calculation function to NLTK where should it go? nltk.util? Or should there be a nltk.math?

Or maybe it's already in the pipeline in newer versions of python. Does anyone know about that?

@alvations

This comment has been minimized.

Copy link
Contributor Author

alvations commented Oct 28, 2015

Currently, the combination binomial coefficient is implemented in nltk.translate.ribes_score.choose() at https://github.com/alvations/nltk/blob/ribes/nltk/translate/ribes_score.py#L184 in a sandbox branch but I'm not sure whether it should reside there.

@hoontw

This comment has been minimized.

Copy link
Contributor

hoontw commented Oct 29, 2015

+1 for nCr function. I wonder if any of the external packages used in nltk already have that.

@stevenbird

This comment has been minimized.

Copy link
Member

stevenbird commented Oct 30, 2015

+1 yes good idea to check whether it's provided in any of the other packages we import

@alvations

This comment has been minimized.

Copy link
Contributor Author

alvations commented Oct 30, 2015

scipy does that but it's slower than this implementation =(

@stevenbird

This comment has been minimized.

Copy link
Member

stevenbird commented Oct 31, 2015

That's surprising. It might be worth mentioning that to them. Anyway, I'm happy for us to include nCr. It probably belongs in nltk.util.

@alvations

This comment has been minimized.

Copy link
Contributor Author

alvations commented Nov 2, 2015

@alvations alvations closed this Nov 7, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.