# Collaborative filtering

In [None]:
from fastai.gen_doc.nbdoc import *
from fastai.collab import * 
from fastai.docs import *

This package contains all the necessary functions to quickly train a model for a collaborative filtering task.

## Overview

Collaborative filtering is the task of predicting how much a user will like a given item. The fastai library contains a `ColabFilteringDataset` class that helps you create datasets suitable for training, and a function `get_collab_learner` that builds a simple model to go with it. Let's first see how to get started before delving into the documentation.

For our example, we'll use a small subset of the [MovieLens](https://grouplens.org/datasets/movielens/) dataset. The task is to predict the rating a user gave a given movie (from 0 to 5). The data comes as a CSV file in which each line is one person's rating of one movie.

In [None]:
ratings = get_movie_lens()
ratings.head()

Unnamed: 0,userId,movieId,rating,timestamp
0,73,1097,4.0,1255504951
1,561,924,3.5,1172695223
2,157,260,3.5,1291598691
3,358,1210,5.0,957481884
4,130,316,2.0,1138999234


We'll first turn the `userId` and `movieId` columns into category codes, so that instead of arbitrary numbers we have indexes from 0 to the number of users/movies - 1. This step would be even more important if our CSV contained the names of users or items.

In [None]:
series2cat(ratings, 'userId','movieId')
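
To make the idea of category codes concrete, here is a minimal plain-Python sketch. It is not the implementation of `series2cat`, whose internals are not shown here; the helper `to_category_codes` is a hypothetical name introduced only for illustration, and sorting the distinct values is an assumption made to keep the mapping deterministic.

```python
def to_category_codes(values):
    """Map each distinct value to an integer index in 0..n_unique-1."""
    # Sort the distinct values so the mapping is deterministic (an assumption).
    categories = sorted(set(values))
    index_of = {value: idx for idx, value in enumerate(categories)}
    return [index_of[v] for v in values], categories


# Raw user ids taken from the five sample rows shown above.
user_ids = [73, 561, 157, 358, 130]
codes, categories = to_category_codes(user_ids)
print(codes)       # [0, 4, 2, 3, 1]
print(categories)  # [73, 130, 157, 358, 561]
```

The codes can be used directly as row indexes into an embedding table, which is why contiguous indexes starting at 0 matter.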

### Global Variable Definitions:

In [None]:
show_doc(ColabFilteringDataset)

### <a id=ColabFilteringDataset></a><em>class</em> `ColabFilteringDataset`
(<code>user</code>:<code>Series</code>, <code>item</code>:<code>Series</code>, <code>ratings</code>:<code>DataFrame</code>) -> <code>NoneType</code> :: Inherits ([<code>DatasetBase</code>](fastai.data.html#DatasetBase))<div style="text-align: right"><a href="https://github.com/fastai/fastai_pytorch/blob/master/fastai/colab.py#L9">[source]</a></div>


Base dataset for collaborative filtering

[`ColabFilteringDataset`](/colab.html#ColabFilteringDataset)

In [None]:
show_doc(ColabFilteringDataset.from_csv)

#### <a id=from_csv></a>`from_csv`
(<code>csv_name</code>:<code>str</code>, <code>kwargs</code>) -> <code>Tuple</code>[<code>str</code>, <code>str</code>]<div style="text-align: right"><a href="https://github.com/fastai/fastai_pytorch/blob/master/fastai/colab.py#L47">[source]</a></div>


Splits a table stored in a CSV file into a training and a validation set

`ColabFilteringDataset.from_csv`

In [None]:
show_doc(ColabFilteringDataset.from_df)

#### <a id=from_df></a>`from_df`
(<code>rating_df</code>:<code>DataFrame</code>, <code>pct_val</code>:<code>float</code>=`0.2`, <code>user_name</code>:`Optional`[<code>str</code>]=`None`, <code>item_name</code>:`Optional`[<code>str</code>]=`None`, <code>rating_name</code>:`Optional`[<code>str</code>]=`None`) -> <code>Tuple</code>[<code>str</code>, <code>str</code>]<div style="text-align: right"><a href="https://github.com/fastai/fastai_pytorch/blob/master/fastai/colab.py#L32">[source]</a></div>


Splits a given dataframe into a training and a validation set

`ColabFilteringDataset.from_df`
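
As a rough illustration of what a `pct_val` split does, the sketch below holds out a fraction of the rows for validation. It assumes the split is random, which the signature above does not confirm; `split_train_valid` and its `seed` argument are hypothetical names, not part of fastai.

```python
import random


def split_train_valid(rows, pct_val=0.2, seed=42):
    """Randomly hold out a fraction `pct_val` of the rows for validation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_val = int(len(rows) * pct_val)
    valid_idx = set(indices[:n_val])
    train = [row for i, row in enumerate(rows) if i not in valid_idx]
    valid = [row for i, row in enumerate(rows) if i in valid_idx]
    return train, valid


# Ten toy (user, item, rating) triples.
rows = [(u, u % 3, float(u % 5)) for u in range(10)]
train, valid = split_train_valid(rows, pct_val=0.2)
print(len(train), len(valid))  # 8 2
```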

In [None]:
show_doc(EmbeddingDotBias)

### <a id=EmbeddingDotBias></a><em>class</em> `EmbeddingDotBias`
(<code>n_factors</code>:<code>int</code>, <code>n_users</code>:<code>int</code>, <code>n_items</code>:<code>int</code>, <code>min_score</code>:<code>float</code>=`None`, <code>max_score</code>:<code>float</code>=`None`) :: Inherits ([<code>Module</code>](https://pytorch.org/docs/stable/nn.html#torch.nn.Module))<div style="text-align: right"><a href="https://github.com/fastai/fastai_pytorch/blob/master/fastai/colab.py#L53">[source]</a></div>


Base model for collaborative filtering

[`EmbeddingDotBias`](/colab.html#EmbeddingDotBias)
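
The class name and its parameters suggest a dot product of user and item factors plus bias terms, optionally squashed into `[min_score, max_score]`. The plain-Python sketch below shows that computation for a single (user, item) pair. The sigmoid rescaling is an assumption about how `min_score` and `max_score` are used, and `dot_bias_score` is a hypothetical helper, not the library's code.

```python
import math


def dot_bias_score(user_vec, item_vec, user_bias, item_bias,
                   min_score=None, max_score=None):
    """Score one (user, item) pair from latent factors and biases."""
    # Dot product of the two latent-factor vectors, plus both biases.
    raw = sum(u * i for u, i in zip(user_vec, item_vec)) + user_bias + item_bias
    if min_score is None or max_score is None:
        return raw
    # Assumed rescaling: a sigmoid maps the raw score to (0, 1),
    # which is then stretched to the (min_score, max_score) range.
    squashed = 1.0 / (1.0 + math.exp(-raw))
    return squashed * (max_score - min_score) + min_score


raw_score = dot_bias_score([0.5, -1.0], [1.0, 0.5], 0.1, -0.1)
scaled_score = dot_bias_score([0.5, -1.0], [1.0, 0.5], 0.1, -0.1,
                              min_score=0.0, max_score=5.0)
print(raw_score, scaled_score)  # 0.0 2.5
```

In the real model, the vectors and biases would be rows looked up from learned embedding tables of size `n_users` and `n_items`, with `n_factors` columns for the factor tables.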

In [None]:
show_doc(EmbeddingDotBias.forward)

#### <a id=forward></a>`forward`
(<code>users</code>:<code>LongTensor</code>, <code>items</code>:<code>LongTensor</code>) -> <code>Tensor</code><div style="text-align: right"><a href="https://github.com/fastai/fastai_pytorch/blob/master/fastai/colab.py#L62">[source]</a></div>


Should be overridden by all subclasses.

.. note::
    Although the recipe for forward pass needs to be defined within
    this function, one should call the :class:`Module` instance afterwards
    instead of this since the former takes care of running the
    registered hooks while the latter silently ignores them.

`EmbeddingDotBias.forward`

In [None]:
show_doc(get_collab_learner)

#### <a id=get_collab_learner></a>`get_collab_learner`
(<code>n_factors</code>:<code>int</code>, <code>data</code>:[<code>DataBunch</code>](fastai.data.html#DataBunch), <code>min_score</code>:<code>float</code>=`None`, <code>max_score</code>:<code>float</code>=`None`, <code>loss_fn</code>:<code>Callable</code>[<code>Tensor</code>, <code>Tensor</code>, <code>OneEltTensor</code>]=`mse_loss`, <code>kwargs</code>) -> [<code>Learner</code>](fastai.basic_train.html#Learner)<div style="text-align: right"><a href="https://github.com/fastai/fastai_pytorch/blob/master/fastai/colab.py#L68">[source]</a></div>


Creates a Learner for collaborative filtering

[`get_collab_learner`](/colab.html#get_collab_learner)
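
The default `loss_fn` in the signature above is a mean squared error loss. For readers unfamiliar with it, here is a minimal plain-Python sketch of the quantity being minimized; the function `mse` is a hypothetical stand-in written for illustration and operates on lists rather than tensors.

```python
def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    if not predictions or len(predictions) != len(targets):
        raise ValueError("predictions and targets must be non-empty and equal in length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)


# Two predicted ratings compared with the ratings users actually gave.
loss = mse([3.0, 5.0], [3.5, 3.5])
print(loss)  # 1.25
```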