# Fast computation of the ROC AUC metric in Python

Python code to compute the ROC AUC metric. On the example below, it runs about twice as fast as the corresponding scikit-learn function.

In [1]:
import numpy as np 
from numba import jit

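@jit
def fast_auc(y_true, y_prob):
    # Assumes binary 0/1 labels; tied scores get no special treatment.
    y_true = np.asarray(y_true)
    # Order the labels by increasing predicted score.
    y_true = y_true[np.argsort(y_prob)]
    nfalse = 0  # negatives seen so far
    auc = 0     # (negative, positive) pairs ranked in the right order
    n = len(y_true)
    for i in range(n):
        y_i = y_true[i]
        nfalse += (1 - y_i)
        # Each positive outranks every negative seen before it.
        auc += y_i * nfalse
    # Normalize by the total number of (negative, positive) pairs.
    auc /= (nfalse * (n - nfalse))
    return auc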
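
To make the loop above easier to follow, here is a minimal pure-Python sketch (no Numba) of the quantity it computes: the fraction of (positive, negative) pairs in which the positive example gets the higher score. It assumes binary 0/1 labels and no tied scores. The names pairwise_auc and running_count_auc are illustrative and not part of the original code.

```python
def pairwise_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Brute-force O(n^2) reference; assumes 0/1 labels and no tied scores.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    correct = sum(1 for p in pos for q in neg if p > q)
    return correct / (len(pos) * len(neg))


def running_count_auc(y_true, y_score):
    """Same quantity in O(n log n): sort by score, then make one pass."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    nfalse = 0   # negatives seen so far
    correct = 0  # correctly ordered (negative, positive) pairs
    for i in order:
        if y_true[i] == 1:
            # This positive outranks every negative seen before it.
            correct += nfalse
        else:
            nfalse += 1
    return correct / (nfalse * (len(y_true) - nfalse))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(pairwise_auc(labels, scores), running_count_auc(labels, scores))
```

In this small example three of the four (positive, negative) pairs are ordered correctly, so both functions return 0.75.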

Let's create a random example.

In [2]:
y_true = np.random.randint(0,2,1000000)
y_pred = np.random.rand(1000000)

The ROC AUC should be close to 0.5 for random predictions.

In [3]:
fast_auc(y_true, y_pred)

0.501004845745664

It is. Let's see what scikit-learn returns for the same data.

In [4]:
from sklearn.metrics import roc_auc_score

roc_auc_score(y_true, y_pred)

0.50100484574566395

We are in good shape: the two results agree to about 15 significant digits.
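
When comparing two floating-point results like these, a tolerance-based check is usually safer than exact equality, because two implementations can differ in the last few digits. A minimal sketch using the values printed above (the tolerance of 1e-9 is an arbitrary choice for illustration):

```python
import math

fast_result = 0.501004845745664       # value printed by fast_auc above
sklearn_result = 0.50100484574566395  # value printed by roc_auc_score above

# True when the two values agree to within a relative tolerance of 1e-9.
agree = math.isclose(fast_result, sklearn_result, rel_tol=1e-9)
print(agree)
```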

A quick sanity check: scoring the labels against themselves should give a perfect AUC.

In [5]:
fast_auc(y_true, y_true)

1.0
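
The check above involves many tied scores, but every tie is between examples with the same label, so the order within a tie does not matter. When a positive and a negative share a score, the single-pass loop counts that pair as right or wrong depending on how the sort happens to order them, whereas the usual convention (which the trapezoidal ROC area used by scikit-learn also yields) gives such a pair half credit. Below is a hedged pure-Python sketch of a tie-aware variant. The name auc_with_ties is hypothetical and not part of the original code.

```python
from itertools import groupby


def auc_with_ties(y_true, y_score):
    """ROC AUC with half credit for tied (positive, negative) pairs.

    Assumes binary 0/1 labels and at least one example of each class.
    """
    pairs = sorted(zip(y_score, y_true))
    neg_before = 0  # negatives with a strictly lower score
    total = 0.0     # credited (positive, negative) pairs
    for _, group in groupby(pairs, key=lambda p: p[0]):
        labels = [y for _, y in group]
        pos_here = sum(labels)
        neg_here = len(labels) - pos_here
        # Positives beat all earlier negatives and tie with those in the group.
        total += pos_here * (neg_before + 0.5 * neg_here)
        neg_before += neg_here
    n_pos = len(pairs) - neg_before
    return total / (neg_before * n_pos)


print(auc_with_ties([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]))   # all scores tied
print(auc_with_ties([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # no ties
```

With every score tied the result is 0.5, and without ties it matches the plain pair count (0.75 for the second example).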

Which one is faster?

In [6]:
%timeit fast_auc(y_true, y_pred)

10 loops, best of 3: 130 ms per loop


In [7]:
%timeit roc_auc_score(y_true, y_pred)

1 loop, best of 3: 275 ms per loop


On this run, fast_auc is a little more than twice as fast as roc_auc_score (130 ms versus 275 ms per loop).
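
Outside a notebook, the %timeit magic is not available. A rough equivalent with the standard timeit module is sketched below. best_time is a hypothetical helper, and the warm-up call is there because a Numba-compiled function is compiled on its first call, which would otherwise inflate the measurement.

```python
import timeit


def best_time(func, *args, repeat=3, number=10):
    """Best average time per call, in seconds, over `repeat` rounds."""
    func(*args)  # warm-up call, so one-off costs such as JIT compilation are excluded
    rounds = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(rounds) / number


# Stand-in workload; in the notebook this would be fast_auc or roc_auc_score.
elapsed = best_time(sorted, list(range(10000)))
print(f"{elapsed * 1e3:.3f} ms per call")
```

Taking the minimum over several rounds mirrors the "best of 3" figure that %timeit reports above.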