FTRL-Proximal

This is an implementation of the FTRL-Proximal algorithm in C with python bindings. FTRL-Proximal is an algorithm for online learning which is quite successful in solving sparse problems. The implementation is based on the algorithm from the "Ad Click Prediction: a View from the Trenches" paper.

Some of the features:

Uses Open MP to parallelize training, and hence is very fast
The python code can operate directly on scipy CSR matrices

Pre-requisites

Dependensies:

It needs: numpy, scipy and open mp
If you use anaconda, it already has numpy, scipy
to install GOMP_4.0 for anaconda, use conda install libgcc

Building

cmake . && make
mv libftrl.so ftrl/
python setup.py install

If you don't have cmake, it's easy to install:

mkdir cmake && cd cmake
wget https://cmake.org/files/v3.10/cmake-3.10.0-Linux-x86_64.sh
bash cmake-3.10.0-Linux-x86_64.sh --skip-license
export CMAKE_HOME=`pwd`
export PATH=$PATH:$CMAKE_HOME/bin

Example

import numpy as np
import scipy.sparse as sp

from sklearn.metrics import roc_auc_score

import ftrl

X = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],   
]

X = sp.csr_matrix(X)
y = np.array([1, 1, 1, 0, 0], dtype='float32')

model = ftrl.FtrlProximal(alpha=1, beta=1, l1=10, l2=0)

# make 10 passes over the data
for i in range(10):
    model.fit(X, y)
    y_pred = model.predict(X)
    auc = roc_auc_score(y, y_pred)
    print('%02d: %.5f' % (i + 1, auc))

We can also use it to solve the regression problem:

from sklearn.metrics import mean_squared_error

y = np.array([1, 2, 3, 4, 5], dtype='float32')

model = ftrl.FtrlProximal(alpha=0.5, beta=1, l1=0, l2=0, model_type='regression')

# make 10 passes over the data
for i in range(10):
    model.fit(X, y)
    y_pred = model.predict(X)
    mse = mean_squared_error(y, y_pred)
    print('%02d: %.5f' % (i + 1, mse))

Use case

This library was used for the Criteo Ad Placement Challenge and showed very competitive performance. You can have a look at the solution here: https://github.com/alexeygrigorev/nips-ad-placement-challenge

In particular, it performed significantly faster than sklearn's Logistic Regression (a wrapper for LIBLINEAR):

sklearn: 1.2 hours to train, auc=0.734
libftrl-python: 2 minutes to train, auc=0.734

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
ftrl		ftrl
src		src
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
README.md		README.md
build_wheel.sh		build_wheel.sh
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FTRL-Proximal

Pre-requisites

Building

Example

Use case

About

Releases

Packages

Languages

alexeygrigorev/libftrl-python

Folders and files

Latest commit

History

Repository files navigation

FTRL-Proximal

Pre-requisites

Building

Example

Use case

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages