<a href="https://colab.research.google.com/github/neoyipeng2018/happier/blob/master/Happier_(test).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Test case
I created a sample data set with two similar users, 1 and 2. They rate 9 of the activities very similarly. Activities 10 and 11 have each been rated by only one of the two users, so we can use them to check how accurately the model predicts the missing rating for the other user.

## Updating and importing fastai

In [1]:
!curl https://course.fast.ai/setup/colab | bash

Updating fastai...
spacy 2.0.18 has requirement numpy>=1.15.0, but you'll have numpy 1.14.6 which is incompatible.
Done.


In [0]:
from fastai.collab import *
from fastai.tabular import *

## Getting sample data
From git

In [3]:
!git clone --recursive https://github.com/neoyipeng2018/happier.git 

Cloning into 'happier'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.

In [18]:
ratings = pd.read_csv('happier/sample.csv')
ratings.head(6)

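|   | userId | activityId | rating | timestamp |
|---|--------|------------|--------|-----------|
| 0 | 1 | 1 | 3.0 | 1 |
| 1 | 2 | 1 | 4.0 | 2 |
| 2 | 1 | 2 | 3.0 | 3 |
| 3 | 2 | 2 | 2.0 | 4 |
| 4 | 1 | 3 | 1.0 | 5 |
| 5 | 2 | 3 | 1.0 | 6 |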


In [0]:
data = CollabDataBunch.from_df(ratings, seed=42, bs=10)

## Basic Linear model (essentially sigmoid/logistic)
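Each user and each activity gets an embedding of size 50 plus a bias. The prediction is the dot product of the two embeddings plus both biases, passed through a sigmoid scaled to `y_range`.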

In [0]:
y_range = [0,5.5]

In [0]:
learn = collab_learner(data, n_factors=50, y_range=y_range)

In [43]:
learn.model

EmbeddingDotBias(
  (u_weight): Embedding(3, 50)
  (i_weight): Embedding(11, 50)
  (u_bias): Embedding(3, 1)
  (i_bias): Embedding(11, 1)
)
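The printed structure shows one embedding and one bias per user and per activity. As a rough illustration of how a dot-product-plus-bias model can turn those into a rating, here is a minimal sketch in plain Python. It assumes the sigmoid-scaled output implied by `y_range`; the function name `dot_bias_predict` and the toy 3-dimensional vectors are hypothetical and are not taken from the notebook or from fastai.

```python
import math

def dot_bias_predict(u_vec, i_vec, u_bias, i_bias, y_range=(0.0, 5.5)):
    """Dot product of the two embeddings plus both biases, squashed into y_range."""
    raw = sum(u * i for u, i in zip(u_vec, i_vec)) + u_bias + i_bias
    lo, hi = y_range
    # The sigmoid maps any real number into (0, 1); rescale that to (lo, hi).
    return lo + (hi - lo) / (1.0 + math.exp(-raw))

# Toy 3-dimensional embeddings (hypothetical; the notebook uses 50 dimensions).
user = [0.9, -0.2, 0.4]
liked = [1.1, -0.3, 0.5]       # points the same way as the user vector
disliked = [-1.0, 0.4, -0.6]   # points the opposite way

high = dot_bias_predict(user, liked, 0.1, 0.2)
low = dot_bias_predict(user, disliked, 0.1, -0.2)
print(round(high, 2), round(low, 2))
```

Because a sigmoid only approaches its bounds, a range that ends slightly above the top rating (5.5 here) lets the model actually reach 5. That is also why the predictions further down are capped with `min(..., 5)`.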

In [44]:
learn.fit_one_cycle(12, 1e-1)

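| epoch | train_loss | valid_loss |
|-------|------------|------------|
| 1 | 2.604629 | 2.038578 |
| 2 | 2.532891 | 2.038153 |
| 3 | 2.444727 | 2.085938 |
| 4 | 2.144872 | 1.541443 |
| 5 | 1.782877 | 0.855113 |
| 6 | 1.494981 | 0.719653 |
| 7 | 1.320429 | 0.705728 |
| 8 | 1.196937 | 0.692869 |
| 9 | 1.097701 | 0.693210 |
| 10 | 1.009081 | 0.697472 |
| 11 | 0.937875 | 0.703308 |
| 12 | 0.884744 | 0.705842 |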


### Time to test the model
We expect user 2 to rate activity 10 high (close to 5, as user 1 did) and user 1 to rate activity 11 low (close to 1, as user 2 did).

In [45]:
ratings.tail(2)

|    | userId | activityId | rating | timestamp |
|----|--------|------------|--------|-----------|
| 18 | 1 | 10 | 5.0 | 19 |
| 19 | 2 | 11 | 1.0 | 20 |


In [54]:
df = pd.DataFrame(data={'userId':[2],'activityId':[10],'rating':[0],'timestamp':[0]})
df

|   | activityId | rating | timestamp | userId |
|---|------------|--------|-----------|--------|
| 0 | 10 | 0 | 0 | 2 |


In [55]:
pred = min(learn.predict(df.iloc[0])[1],5)
f'Model predicted user 2 would rate activity 10 as {pred}'

'Model predicted user 2 would rate activity 10 as 5'

In [56]:
df = pd.DataFrame(data={'userId':[1],'activityId':[11],'rating':[0],'timestamp':[0]})
df

|   | activityId | rating | timestamp | userId |
|---|------------|--------|-----------|--------|
| 0 | 11 | 0 | 0 | 1 |


In [57]:
pred = min(learn.predict(df.iloc[0])[1],5)
f'Model predicted user 1 would rate activity 11 as {pred}'

'Model predicted user 1 would rate activity 11 as 2.2434725761413574'

## Basic Neural Net model
Embedding layers followed by two Linear/ReLU/BatchNorm blocks and a final Linear output layer.

In [0]:
learn = collab_learner(data, y_range=y_range, use_nn=True, emb_szs={'userId':50, 'activityId':50}, layers = [100,50], emb_drop=0.5)

In [34]:
learn.model

EmbeddingNN(
  (embeds): ModuleList(
    (0): Embedding(3, 50)
    (1): Embedding(11, 50)
  )
  (emb_drop): Dropout(p=0.5)
  (bn_cont): BatchNorm1d(0, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layers): Sequential(
    (0): Linear(in_features=100, out_features=100, bias=True)
    (1): ReLU(inplace)
    (2): BatchNorm1d(100, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Linear(in_features=100, out_features=50, bias=True)
    (4): ReLU(inplace)
    (5): BatchNorm1d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (6): Linear(in_features=50, out_features=1, bias=True)
  )
)
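Unlike the dot-product model, this network concatenates the user and activity embeddings and feeds the resulting vector through dense layers. The sketch below mirrors only the layer shapes printed above (100 → 100 → 50 → 1) in plain Python. It leaves out dropout and batch normalisation, uses random untrained weights, and assumes the same sigmoid scaling to `y_range`; the helper names are hypothetical and this is not how fastai implements `EmbeddingNN`.

```python
import math
import random

def linear(x, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(x):
    return [max(0.0, v) for v in x]

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def embedding_nn_forward(u_vec, i_vec, layers, y_range=(0.0, 5.5)):
    x = u_vec + i_vec  # list concatenation: the embeddings are joined, not multiplied
    for weights, bias in layers[:-1]:
        x = relu(linear(x, weights, bias))
    weights, bias = layers[-1]
    raw = linear(x, weights, bias)[0]  # final layer has a single output
    lo, hi = y_range
    return lo + (hi - lo) / (1.0 + math.exp(-raw))

rng = random.Random(0)
emb = 50
sizes = [2 * emb, 100, 50, 1]  # same shapes as the Linear layers in the printout
layers = [init_layer(rng, n_in, n_out) for n_in, n_out in zip(sizes, sizes[1:])]
u_vec = [rng.gauss(0, 1) for _ in range(emb)]
i_vec = [rng.gauss(0, 1) for _ in range(emb)]
rating = embedding_nn_forward(u_vec, i_vec, layers)
print(rating)
```

With untrained weights the output is meaningless as a prediction; the point is only the data flow from two embeddings to one bounded rating.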

In [35]:
learn.fit_one_cycle(12, 1e-1)

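| epoch | train_loss | valid_loss |
|-------|------------|------------|
| 1 | 3.262094 | 2.029315 |
| 2 | 2.622941 | 1.805037 |
| 3 | 2.433406 | 1.176069 |
| 4 | 2.486806 | 1.121737 |
| 5 | 2.153610 | 1.591873 |
| 6 | 2.005908 | 1.577900 |
| 7 | 1.794547 | 1.779902 |
| 8 | 1.704020 | 1.770603 |
| 9 | 1.558691 | 1.444977 |
| 10 | 1.405872 | 1.372299 |
| 11 | 1.306211 | 1.473061 |
| 12 | 1.239546 | 1.569193 |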


### Time to test the model
We expect user 2 to rate activity 10 high (close to 5, as user 1 did) and user 1 to rate activity 11 low (close to 1, as user 2 did).

In [36]:
ratings.tail(2)

|    | userId | activityId | rating | timestamp |
|----|--------|------------|--------|-----------|
| 18 | 1 | 10 | 5.0 | 19 |
| 19 | 2 | 11 | 1.0 | 20 |


In [37]:
df = pd.DataFrame(data={'userId':[2],'activityId':[10],'rating':[0],'timestamp':[0]})
df

|   | activityId | rating | timestamp | userId |
|---|------------|--------|-----------|--------|
| 0 | 10 | 0 | 0 | 2 |


In [39]:
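# .item() turns the one-element tensor into a Python float, so capping at 5 also works when the raw value exceeds 5
pred = min(learn.predict(df.iloc[0])[1].item(), 5)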
f'Model predicted user 2 would rate activity 10 as {pred}'

'Model predicted user 2 would rate activity 10 as 4.4239583015441895'

In [40]:
df = pd.DataFrame(data={'userId':[1],'activityId':[11],'rating':[0],'timestamp':[0]})
df

|   | activityId | rating | timestamp | userId |
|---|------------|--------|-----------|--------|
| 0 | 11 | 0 | 0 | 1 |


In [41]:
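# .item() turns the one-element tensor into a Python float, so capping at 5 also works when the raw value exceeds 5
pred = min(learn.predict(df.iloc[0])[1].item(), 5)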
f'Model predicted user 1 would rate activity 11 as {pred}'

'Model predicted user 1 would rate activity 11 as 1.389351725578308'

### Extra: How someone tackled the cold start problem - Create a meta model
At my place of work we have the following approach to a cold-start problem:

For context, I am at a financial institution and we have trained a collaborative filtering model on data purchased from a third party which consists of ~40k businesses detailing their cash management needs. Specifically, it might look like business Y has reported it uses lock-box, fraud-management, equity-management, etc. (from a list of about 40 products).

How we use this to build a recommender for our own customers (who were not part of the model training process): the model works by embedding each unique user and product into a space of some fixed dimension and modeling the probability as a dot product or perhaps shallow neural network. So given a new user, if we knew where they stood in the embedding dimension we would be able to apply our model and say how likely does it think this user is to want product X.

We build a second model (as Jeremy has suggested) using auxiliary data such as sales volume, number of employees, and SIC codes (what ‘kind’ of business it is). We train it on the same 40k datapoints: we know these features for the businesses in our purchased dataset, so we build a regressor that maps from these “identifying features” to each business's position in the embedding space, using RMSE as the loss. Once we are satisfied that we can situate a new client reasonably well from these identifying features, we take a new user, apply model 2 to place them in the embedding space, and then apply the original collaborative filtering model.

Happy to discuss if someone is curious!

Link: https://forums.fast.ai/t/lesson-4-advanced-discussion/30319/37
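The two-model idea in the forum post can be sketched roughly as follows. This is a toy illustration in plain Python, not the poster's implementation: the data is synthetic, the regressor is a plain linear map fitted by gradient descent on mean squared error (minimising MSE also minimises RMSE), and every name (`predict_embedding`, `fit_linear_map`, `new_customer`, `product_vec`) is hypothetical.

```python
import math
import random

def predict_embedding(x, W, b):
    """Model 2: map auxiliary features to a position in embedding space."""
    return [sum(x[j] * W[j][k] for j in range(len(x))) + b[k] for k in range(len(b))]

def fit_linear_map(features, targets, lr=0.1, epochs=1000):
    """Fit targets ~ features @ W + b by gradient descent on mean squared error."""
    n, n_feat, n_emb = len(features), len(features[0]), len(targets[0])
    W = [[0.0] * n_emb for _ in range(n_feat)]
    b = [0.0] * n_emb
    for _ in range(epochs):
        grad_W = [[0.0] * n_emb for _ in range(n_feat)]
        grad_b = [0.0] * n_emb
        for x, y in zip(features, targets):
            pred = predict_embedding(x, W, b)
            for k in range(n_emb):
                err = 2.0 * (pred[k] - y[k]) / n
                grad_b[k] += err
                for j in range(n_feat):
                    grad_W[j][k] += err * x[j]
        for k in range(n_emb):
            b[k] -= lr * grad_b[k]
            for j in range(n_feat):
                W[j][k] -= lr * grad_W[j][k]
    return W, b

# Synthetic stand-in for "known businesses": 2 scaled features -> 2-d embedding.
rng = random.Random(0)
true_W = [[0.5, -1.0], [1.5, 0.25]]
true_b = [0.1, -0.2]
features = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(40)]
embeddings = [predict_embedding(x, true_W, true_b) for x in features]

W, b = fit_linear_map(features, embeddings)

# Cold start: place an unseen customer, then score a product as model 1 would.
new_customer = [0.3, -0.6]
position = predict_embedding(new_customer, W, b)
product_vec = [0.8, -0.4]  # hypothetical learned product embedding
score = sum(p * q for p, q in zip(position, product_vec))
probability = 1.0 / (1.0 + math.exp(-score))
print(position, probability)
```

In practice the embedding targets would come from the trained collaborative filtering model, and the regressor could be any model that predicts a vector from the auxiliary features.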