classification performance measures #1577

Open
jseabold opened this issue Apr 8, 2014 · 4 comments

jseabold commented Apr 8, 2014

Add classification performance statistics

import numpy as np


def precision(pred_table):
    """
    Precision given pred_table. Binary classification only. Assumes group 0
    is the True.

    Analogous to (absence of) Type I errors. Probability that a randomly
    selected predicted positive is truly positive. I.e., no false positives.
    """
    tp, fp, fn, tn = map(float, pred_table.flatten())
    try:
        return tp / (tp + fp)
    except ZeroDivisionError:
        return np.nan


def recall(pred_table):
    """
    Recall given pred_table. Binary classification only. Assumes group 0
    is the True.

    Analogous to (absence of) Type II errors. Out of all the ones that are
    true, how many did you predict as true. I.e., no false negatives.
    """
    tp, fp, fn, tn = map(float, pred_table.flatten())
    try:
        return tp / (tp + fn)
    except ZeroDivisionError:
        return np.nan


def accuracy(pred_table):
    """
    Accuracy given pred_table. Binary classification only. Assumes group 0
    is the True. Fraction of all cases classified correctly.
    """
    tp, fp, fn, tn = map(float, pred_table.flatten())
    return (tp + tn) / (tp + tn + fp + fn)


def fscore_measure(pred_table, b=1):
    """
    For b, 1 = equal importance. 2 = recall is twice important. .5 recall is
    half as important, etc.
    """
    r = recall(pred_table)
    p = precision(pred_table)
    try:
        return (1 + b**2) * r*p/(b**2 * p + r)
    except ZeroDivisionError:
        return np.nan
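For reference, a quick sanity check by hand (the numbers in the table are made up; the 2x2 table is assumed to be laid out as [[tp, fp], [fn, tn]], matching the flatten order above):

```python
import numpy as np

# Hypothetical 2x2 prediction table: [[tp, fp], [fn, tn]]
pred_table = np.array([[50, 10],
                       [5, 35]])

tp, fp, fn, tn = map(float, pred_table.flatten())
precision = tp / (tp + fp)                   # 50 / 60
recall = tp / (tp + fn)                      # 50 / 55
accuracy = (tp + tn) / (tp + fp + fn + tn)   # 85 / 100
f1 = 2 * precision * recall / (precision + recall)  # b=1 F-score
print(precision, recall, accuracy, f1)
```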

Also missing ROC curve.
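On the ROC piece, a minimal sketch of how the curve points could be computed by sweeping a threshold over predicted scores (the function name and the example labels/scores are my own illustration, not from this issue; tied scores are not handled):

```python
import numpy as np

def roc_points(y_true, y_score):
    """Hypothetical helper: sweep the classification threshold over the
    sorted scores and return (fpr, tpr) arrays, one point per cut."""
    order = np.argsort(y_score)[::-1]   # sort scores descending
    y_true = np.asarray(y_true)[order]
    tps = np.cumsum(y_true)             # true positives at each cut
    fps = np.cumsum(1 - y_true)         # false positives at each cut
    tpr = tps / max(tps[-1], 1)         # recall at each cut
    fpr = fps / max(fps[-1], 1)
    return fpr, tpr

# Made-up labels and scores for illustration
fpr, tpr = roc_points([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2])
# Trapezoidal area under the curve
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)
```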

@ysunmi0427

@jseabold I would like to work on this enhancement. Could you guide me on getting started?


rluedde commented Jul 7, 2020

Correct me if I'm wrong, but it looks like this has already been done in statsmodels/tools/eval_measures.py?

@ysunmi0427 Do you agree?

@anuragwatane

We would like to work on this issue.

jseabold removed this from the 0.7 milestone Apr 6, 2021

jseabold commented Apr 6, 2021

If anyone wants to work on this, there is a start on a solution in #1650. Reading that would be a good place to start. That branch needs some tests and some small function API changes.
