# Training a larger model
In this example we'll train a recommender on more data: the entire StackOverflow dataset, with over 600,000 users and 11 million questions.

In [1]:
import numpy as np
import scipy.sparse as sp

from lightfm import LightFM
from lightfm.datasets import fetch_stackexchange
from lightfm.evaluation import auc_score

# Downloading the data may take a short while
data = fetch_stackexchange('stackoverflow',
                           test_set_fraction=0.1,
                           indicator_features=False,
                           tag_features=True)

train = data['train']
test = data['test']
features = data['item_features']
labels = data['item_feature_labels']


In [2]:
print('Training set: {:,} users and {:,} items with {:,} interactions.'
     .format(train.shape[0], train.shape[1], train.getnnz()))
print('{:,} distinct tag features with labels like {}.'
     .format(features.shape[1], labels[:3].tolist()))

Training set: 621,594 users and 11,280,896 items with 15,650,381 interactions.
44,256 distinct tag features with labels like [u'c#', u'winforms', u'type-conversion'].


Because the dataset is rather large, let's first run a single epoch and time it.

The timings below were measured on a relatively underpowered dual-core laptop. On a large multi-core server, training should be much faster, scaling almost linearly with the number of cores you make available through `num_threads`.

In [3]:
model = LightFM(loss='warp',
                no_components=32)

In [4]:
%time model = model.fit(train, item_features=features, epochs=1, num_threads=2)

CPU times: user 14min 45s, sys: 280 ms, total: 14min 45s
Wall time: 7min 26s
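
The CPU time reported above is roughly twice the wall time, which is what we would expect if both threads were busy for most of the epoch. `%time` is an IPython magic, so it is not available in a plain Python script. A minimal sketch of measuring the same two numbers with the standard library follows; `timed` is a hypothetical helper (not part of LightFM), and the workload is a stand-in for the `model.fit(...)` call.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, wall_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time of the whole process
    result = fn(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


# Stand-in workload; in the notebook this would be the model.fit(...) call.
result, wall, cpu = timed(sum, range(1_000_000))
print('CPU time: {:.2f}s, wall time: {:.2f}s'.format(cpu, wall))
```

Because `time.process_time` is process-wide, a multi-threaded fit should show CPU time exceeding wall time, as in the output above.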


Evaluation can also be slow: to evaluate a learning-to-rank model, we need to compute a full ranking over all items for every user who has interactions in the test set.
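
To see why this is expensive, consider a single user: AUC is the probability that a randomly chosen test positive is scored above a randomly chosen negative, so it needs a score for every item. The sketch below computes it with NumPy from rank sums. `user_auc` is a hypothetical helper for illustration, not LightFM's implementation; it assumes there are no tied scores and does not exclude training interactions.

```python
import numpy as np


def user_auc(scores, positives):
    """AUC for one user, given scores for ALL items and the positive item ids."""
    scores = np.asarray(scores, dtype=np.float64)
    n_items = scores.shape[0]

    is_pos = np.zeros(n_items, dtype=bool)
    is_pos[positives] = True
    n_pos = int(is_pos.sum())
    n_neg = n_items - n_pos
    if n_pos == 0 or n_neg == 0:
        return float('nan')  # AUC is undefined without both classes

    # Rank every item from 1 (lowest score) to n_items (highest score).
    order = np.argsort(scores)
    ranks = np.empty(n_items, dtype=np.int64)
    ranks[order] = np.arange(1, n_items + 1)

    # Mann-Whitney U: number of (positive, negative) pairs ordered correctly.
    u = ranks[is_pos].sum() - n_pos * (n_pos + 1) / 2
    return float(u / (n_pos * n_neg))


print(user_auc([0.1, 0.9, 0.4, 0.8], positives=[1, 3]))
```

The sort over all items is repeated for each test user, which is why the number of users evaluated matters so much with over 11 million items.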

In [5]:
print('Test set has {:,} users with test interactions.'.format((test.getnnz(axis=1) > 0).sum()))

Test set has 159,551 users with test interactions.


That is too many users to evaluate efficiently, since we have to calculate model scores for all items for each of them. To reduce the cost, let's randomly select a much smaller sample of users (about 0.1%) for evaluation.

In [6]:
# Keep each user (row) with probability 0.001 by multiplying by a 0/1 diagonal mask
test_small = (sp.diags([(np.random.random(size=test.shape[0]) < 0.001).astype(np.float32)], [0], format='csr')
              * test.tocsr())
print('Test set has {:,} users with test interactions.'.format((test_small.getnnz(axis=1) > 0).sum()))

Test set has 151 users with test interactions.
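
Since no random seed is set, the sampled users (and the count above) will differ from run to run. A reproducible variant could look like the sketch below, shown on a small random matrix rather than the real test set. `sample_rows` and its `seed` parameter are hypothetical names introduced here, and the sketch assumes a NumPy version that provides `np.random.default_rng`.

```python
import numpy as np
import scipy.sparse as sp


def sample_rows(interactions, fraction, seed=0):
    """Zero out all but a random `fraction` of rows, keeping the matrix shape."""
    rng = np.random.default_rng(seed)
    # 1.0 for rows (users) we keep, 0.0 for rows we drop.
    keep = (rng.random(interactions.shape[0]) < fraction).astype(np.float32)
    mask = sp.diags(keep, 0, format='csr')
    sampled = (mask @ interactions.tocsr()).tocsr()
    sampled.eliminate_zeros()  # make sure dropped rows store no entries
    return sampled


# Toy stand-in for the test interactions matrix.
toy = sp.random(1000, 50, density=0.1, format='coo', random_state=0)
small = sample_rows(toy, fraction=0.1, seed=42)
print('{:,} of {:,} rows kept.'.format(
    int((small.getnnz(axis=1) > 0).sum()), toy.shape[0]))
```

Keeping the original shape matters because the evaluation functions expect the test matrix to line up row by row with the training matrix.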


In [7]:
%time auc = auc_score(model, test_small, item_features=features, train_interactions=train, num_threads=2)

CPU times: user 5min 51s, sys: 48 ms, total: 5min 51s
Wall time: 3min 29s
