A Python wrapper for the libffm library.
    git clone git@github.com:turi-code/GraphLab-Create-SDK.git sdk
    git clone git@github.com:turi-code/python-libffm.git ffm
    cd ffm
    make
To run the following examples, you will also need to register for GraphLab Create. The software is free for non-commercial use and otherwise comes with a 30-day free trial.
After that, try running the basic example:
If you want to try a less synthetic example, download the 1TB Criteo dataset. First test things out with a small sample of the dataset.
    gzip -cd day_0.gz | head -n 1000000 > criteo-sample.tsv
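If you would rather take the sample from Python than from the shell, the following is a minimal sketch of the same idea. The helper name `sample_head` is hypothetical and not part of this package; it only assumes the input is a gzip-compressed, line-oriented file such as `day_0.gz`.

```python
import gzip
import itertools


def sample_head(src_path, dst_path, n_lines=1000000):
    """Copy the first n_lines lines of a gzip file into a plain file.

    Hypothetical helper, not part of python-libffm. Returns the number
    of lines actually written (fewer than n_lines if the file is short).
    """
    written = 0
    # Binary mode avoids decoding the data; lines are copied byte for byte.
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # islice stops after n_lines, so the rest of the archive is never read.
        for line in itertools.islice(src, n_lines):
            dst.write(line)
            written += 1
    return written


# Example call, equivalent to the shell pipeline above:
# sample_head("day_0.gz", "criteo-sample.tsv", 1000000)
```

Because the file is streamed line by line, memory use stays small regardless of the size of the archive.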
Next we have a sample script for performing some of the same types of feature engineering that the contest winners have been using:
Train an FFM model on this data.
You should see something like the following (which appears to be overfitting in this example):
    PROGRESS: iter  tr_logloss  va_logloss
    PROGRESS: 0     0.12794     0.12353
    PROGRESS: 1     0.10907     0.12636
    PROGRESS: 2     0.09263     0.13318
    PROGRESS: 3     0.07679     0.14200
    PROGRESS: 4     0.06411     0.15130
    PROGRESS: 5     0.05484     0.16034
    ...
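To make the overfitting concrete, here is a small sketch that parses progress lines like the ones above and reports the iteration with the lowest validation log loss. The functions `parse_progress` and `best_iteration` are hypothetical helpers, not part of this package, and the sketch assumes you have captured the output as text in exactly the format shown.

```python
def parse_progress(log_text):
    """Return a list of (iteration, tr_logloss, va_logloss) tuples."""
    rows = []
    for line in log_text.splitlines():
        parts = line.split()
        # Data rows look like: PROGRESS: <iter> <tr_logloss> <va_logloss>
        # The header row is skipped because its second token is not a number.
        if len(parts) == 4 and parts[0] == "PROGRESS:" and parts[1].isdigit():
            rows.append((int(parts[1]), float(parts[2]), float(parts[3])))
    return rows


def best_iteration(rows):
    """Return the row with the lowest validation log loss."""
    return min(rows, key=lambda r: r[2])


# The six iterations printed in the sample output above.
LOG = """\
PROGRESS: iter tr_logloss va_logloss
PROGRESS: 0 0.12794 0.12353
PROGRESS: 1 0.10907 0.12636
PROGRESS: 2 0.09263 0.13318
PROGRESS: 3 0.07679 0.14200
PROGRESS: 4 0.06411 0.15130
PROGRESS: 5 0.05484 0.16034
"""

rows = parse_progress(LOG)
best = best_iteration(rows)
print(best)  # (0, 0.12794, 0.12353)
```

In this run the training loss falls at every iteration while the validation loss is lowest at iteration 0 and rises afterwards, which is the usual signature of overfitting.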
The package makes it easy to train models directly from SFrames.
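    import graphlab as gl
    import ffm

    # Load the sample train and test SFrames shipped with the examples.
    train = gl.SFrame('examples/small.tr.sframe')
    test = gl.SFrame('examples/small.te.sframe')

    # Fit an FFM (lam is the regularization parameter), then predict on the test set.
    m = ffm.FFM(lam=.1)
    m.fit(train, target='y', nr_iters=50)
    yhat = m.predict(test)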
Each column is interpreted as a separate "field" in the model. Only dict columns are currently supported, where the keys of each dict are integers that represent the feature id.
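As an illustration of that input format, the sketch below turns raw categorical values into one dict per field, keyed by an integer feature id. It is only one possible encoding: the hashing scheme, the bucket count, the field names, and the use of `1.0` as the feature value are all assumptions made for this example, not something the package prescribes.

```python
import zlib

NUM_BUCKETS = 2 ** 20  # hypothetical size of the hashed feature space


def feature_id(field, value, num_buckets=NUM_BUCKETS):
    """Map a (field, value) pair to a stable integer id (hashing trick)."""
    key = ("%s=%s" % (field, value)).encode("utf-8")
    # Masking keeps the CRC non-negative on both Python 2 and Python 3.
    return (zlib.crc32(key) & 0xFFFFFFFF) % num_buckets


def encode_row(raw_row):
    """Turn {'field': 'value'} into {'field': {feature_id: 1.0}}.

    Each field becomes its own dict, so each field can be stored
    in its own dict column.
    """
    return {field: {feature_id(field, value): 1.0}
            for field, value in raw_row.items()}


# Hypothetical raw record with two categorical fields.
raw = {"site": "example.com", "device": "mobile"}
encoded = encode_row(raw)
print(encoded)
```

Collecting the per-field dicts across rows gives one list of dicts per field, which is the shape a dict column needs. Hash collisions are possible with this approach; a larger `NUM_BUCKETS` makes them rarer at the cost of a larger model.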
libffm.cpp: uses C++ macros provided by Turi's SDK to wrap libffm's methods as Python classes and methods.
ffm.py: a scikit-learn-style wrapper.
lib/: the original library, where cout statements have been replaced with Turi's progress_stream to allow progress printing to Python.
examples/: example scripts for training models using the sample data provided with the original package, as well as with data similar to Kaggle's Criteo competition.
For more on how and why we made this, see the blog post.
This package is provided under the 3-clause BSD license.