# Convert a logistic regression into C

The logistic regression is trained in Python and executed in C.

In [1]:
from jyquickhelper import add_notebook_menu
add_notebook_menu()

## Train a logistic regression

In [2]:
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data[:, :2]
y = iris.target
y[y == 2] = 1
lr = LogisticRegression()
lr.fit(X, y)

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)

## Export into C

In [3]:
# The grammar describes the expected scoring model.
from mlprodict.grammar_sklearn import sklearn2graph
gr = sklearn2graph(lr, output_names=['Prediction', 'Score'])
gr

<mlprodict.grammar.gmlactions.MLModel at 0x152f261be0>

We can check the score the function should produce. Types are strict, so the features are built as `float32` values.

In [4]:
import numpy
X = numpy.array([[numpy.float32(1), numpy.float32(2)]])
e2 = gr.execute(Features=X[0, :])
print(e2)

[ 0.         -6.34157228]
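
Since types are strict, it can help to normalise inputs before passing them on. The sketch below is a minimal illustration, not part of mlprodict: `as_float32_features` is a hypothetical helper, and it assumes a one-dimensional `float32` array is what is expected, as the cell above suggests.

```python
import numpy

def as_float32_features(values):
    """Hypothetical helper: return a 1-D, C-contiguous float32 array."""
    arr = numpy.ascontiguousarray(values, dtype=numpy.float32)
    return arr.ravel()

raw = [1, 2]  # plain Python integers
features = as_float32_features(raw)
print(features.dtype, features.shape)
```

The same array can then be reused for every call instead of being rebuilt each time.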


We compare with scikit-learn.

In [5]:
lr.decision_function(X[0:1, :])

array([-6.34157245])
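
`decision_function` returns the raw score, not a probability. As a rough sketch of how the two outputs named above (`Prediction`, `Score`) relate to it, the code below maps a score to a label and to a probability with the logistic function. The function names are illustrative, and the rule "positive score gives 1, otherwise 0" is an assumption inferred from the outputs shown in this notebook.

```python
import math

def score_to_prediction(score):
    """Assumed rule: 1 for a strictly positive score, 0 otherwise."""
    return 1.0 if score > 0 else 0.0

def score_to_probability(score):
    """Logistic (sigmoid) function applied to the raw score."""
    return 1.0 / (1.0 + math.exp(-score))

score = -6.34157245  # value returned by lr.decision_function above
prediction = score_to_prediction(score)
probability = score_to_probability(score)
print(prediction, probability)
```

A strongly negative score gives a probability close to zero for the positive class, which is consistent with the prediction `0.` printed earlier.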

Conversion into C:

In [6]:
res = gr.export(lang='c', hook={'array': lambda v: v.tolist(), 'float32': lambda v: float(v)})
print(res["code"])

int LogisticRegression (float* pred, float* Features)
{
    // 90985339872-LogisticRegression - children
    // 90985339648-concat - children
    // 90985339592-sign - children
    // 90985339536-+ - children
    // 90985030320-adot - children
    float pred0c0c00c0[2] = {(float)2.4957928882125406, (float)-4.010113006761804};
    float* pred0c0c00c1 = Features;
    // 90985030320-adot - itself
    float pred0c0c00;
    adot_float_float(&pred0c0c00, pred0c0c00c0, pred0c0c00c1, 2);
    // 90985030320-adot - done
    float pred0c0c01 = (float)-0.8171393275260925;
    // 90985339536-+ - itself
    float pred0c0c0 = pred0c0c00 + pred0c0c01;
    // 90985339536-+ - done
    // 90985339592-sign - itself
    float pred0c0;
    sign_float(&pred0c0, pred0c0c0);
    // 90985339592-sign - done
    // 90985339648-concat - itself
    float pred0[2];
    concat_float_float(pred0, pred0c0, pred0c0c0);
    // 90985339648-concat - done
    memcpy(pred, pred0, 2*sizeof(float));
    // 90985339872-Logistic

We compile and execute the code with the [cffi](https://cffi.readthedocs.io/en/latest/) module.

In [7]:
from mlprodict.cc import compile_c_function
fct = compile_c_function(res["code"], 2)
fct

<function mlprodict.cc.c_compilation.compile_c_function.<locals>.wrapper_float>

In [8]:
e2 = fct(X[0, :])
e2

array([ 0.        , -6.34157276], dtype=float32)
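
The three scores obtained so far (-6.34157228, -6.34157245 and -6.34157276) agree only up to about the seventh significant digit. A plausible explanation is single-precision arithmetic. The sketch below mirrors the generated C code (dot product plus intercept) in plain numpy, once in `float32` and once in `float64`, using the constants printed in the C code. It is an illustration, not the code mlprodict runs.

```python
import numpy

# Constants copied from the generated C code above.
coef = [2.4957928882125406, -4.010113006761804]
intercept = -0.8171393275260925
features = [1.0, 2.0]

def score(dtype):
    """Dot product plus intercept, evaluated entirely in the given dtype."""
    w = numpy.array(coef, dtype=dtype)
    x = numpy.array(features, dtype=dtype)
    b = dtype(intercept)
    return w.dot(x) + b

s32 = score(numpy.float32)
s64 = score(numpy.float64)
# The last value is the gap between single and double precision.
print(s32, s64, abs(float(s32) - s64))
```

The gap between the two results has the same order of magnitude as the differences between the outputs above.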

## Time comparison

In [9]:
%timeit lr.decision_function(X[0:1, :])

30 µs ± 7.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [10]:
%timeit fct(X[0, :])

12.5 µs ± 3.57 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


The compiled function is about 2.4 times faster than scikit-learn on this example (12.5 µs versus 30 µs per call). It could be faster still by removing some of the Python wrapper code and optimizing the code produced by [cffi](https://cffi.readthedocs.io/en/latest/). We can also avoid allocating the output array on every call by reusing an existing one.

In [11]:
out = fct(X[0, :])

In [12]:
%timeit fct(X[0, :], out)

8.91 µs ± 1.67 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
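
Reusing an output buffer is a general pattern, not something specific to the compiled function. The sketch below shows the same idea with `numpy.dot` and its `out` argument, timed with the standard `timeit` module so that it also runs outside a notebook. The weights are the ones from the generated C code; the measured times depend on the machine and are not meant to reproduce the figures above.

```python
import timeit
import numpy

w = numpy.array([[2.4957928882125406, -4.010113006761804]], dtype=numpy.float32)
x = numpy.array([1.0, 2.0], dtype=numpy.float32)
out = numpy.empty(1, dtype=numpy.float32)  # allocated once, reused for every call

res = numpy.dot(w, x, out=out)  # writes into `out` and returns it

n = 10000
t_alloc = min(timeit.repeat(lambda: numpy.dot(w, x), number=n, repeat=5))
t_reuse = min(timeit.repeat(lambda: numpy.dot(w, x, out=out), number=n, repeat=5))
print(t_alloc / n, t_reuse / n)  # seconds per call
```

Taking the minimum over several repetitions reduces the influence of other processes on the measurement.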
