### Matrix Factorizations

In this assignment you will get hands-on experience with matrix factorization.
The assignment consists of 4 tasks:
1. Implement an SVD factorization trained with SGD on explicit data
2. Implement a matrix factorization trained with ALS on implicit data
3. Implement a matrix factorization trained with BPR (pair-wise loss) on implicit data
4. Implement a matrix factorization trained with WARP (list-wise loss) on implicit data


In [116]:
!pip install implicit lightfm faiss
!apt-get install libopenblas-dev
!apt-get install libomp-dev

Reading package lists... Done
Building dependency tree       
Reading state information... Done
libopenblas-dev is already the newest version (0.2.20+ds-4).
0 upgraded, 0 newly installed, 0 to remove and 30 not upgraded.
Reading package lists... Done
Building dependency tree       
Reading state information... Done
libomp-dev is already the newest version (5.0.1-1).
0 upgraded, 0 newly installed, 0 to remove and 30 not upgraded.


In [2]:
import implicit
import pandas as pd
import numpy as np
import scipy.sparse as sp

from lightfm.datasets import fetch_movielens

In [118]:
!wget --no-check-certificate https://files.grouplens.org/datasets/movielens/ml-1m.zip
!mkdir RecSysHSE
!unzip ml-1m.zip -d RecSysHSE/

--2021-03-25 19:42:51--  https://files.grouplens.org/datasets/movielens/ml-1m.zip
Resolving files.grouplens.org (files.grouplens.org)... 128.101.65.152
Connecting to files.grouplens.org (files.grouplens.org)|128.101.65.152|:443... connected.
  Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response... 200 OK
Length: 5917549 (5.6M) [application/zip]
Saving to: ‘ml-1m.zip.1’


2021-03-25 19:42:52 (25.9 MB/s) - ‘ml-1m.zip.1’ saved [5917549/5917549]

mkdir: cannot create directory ‘RecSysHSE’: File exists
Archive:  ml-1m.zip
replace RecSysHSE/ml-1m/movies.dat? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
  inflating: RecSysHSE/ml-1m/movies.dat  
  inflating: RecSysHSE/ml-1m/ratings.dat  
  inflating: RecSysHSE/ml-1m/README  
  inflating: RecSysHSE/ml-1m/users.dat  


In this assignment we will work with the explicit MovieLens dataset, which contains (user_id, movie_id) pairs together with the rating the user gave the movie.

The dataset can be downloaded from https://grouplens.org/datasets/movielens/1m/

In [3]:
ratings = pd.read_csv('RecSysHSE/ml-1m/ratings.dat', delimiter='::', header=None, 
        names=['user_id', 'movie_id', 'rating', 'timestamp'], 
        usecols=['user_id', 'movie_id', 'rating'], engine='python')

In [4]:
movie_info = pd.read_csv('RecSysHSE/ml-1m/movies.dat', delimiter='::', header=None, 
        names=['movie_id', 'name', 'category'], engine='python')

Explicit data

In [5]:
ratings.head(10)

Unnamed: 0,user_id,movie_id,rating
0,1,1193,5
1,1,661,3
2,1,914,3
3,1,3408,4
4,1,2355,5
5,1,1197,3
6,1,1287,5
7,1,2804,5
8,1,594,4
9,1,919,4


To turn the dataset into an implicit one, let's treat a rating >= 4 as positive feedback.

In [6]:
implicit_ratings = ratings.loc[(ratings['rating'] >= 4)]

In [7]:
implicit_ratings.head(10)

Unnamed: 0,user_id,movie_id,rating
0,1,1193,5
3,1,3408,4
4,1,2355,5
6,1,1287,5
7,1,2804,5
8,1,594,4
9,1,919,4
10,1,595,5
11,1,938,4
12,1,2398,4


Sparse matrices are more convenient to work with, so let's convert the DataFrame into CSR matrices.

In [8]:
users = implicit_ratings["user_id"]
movies = implicit_ratings["movie_id"]
user_item = sp.coo_matrix((np.ones_like(users), (users, movies)))
user_item_t_csr = user_item.T.tocsr()
user_item_csr = user_item.tocsr()
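
To make the conversion above concrete, here is a small sketch on made-up data (the toy ids below are illustrative and not taken from MovieLens). Because user and movie ids start at 1, row 0 and column 0 of the matrix stay empty, so each dimension is one larger than the largest id.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical toy interactions: (user_id, movie_id) pairs with 1-based ids
toy_users = np.array([1, 1, 2, 3])
toy_movies = np.array([2, 3, 1, 3])

# One entry per positive interaction, stored as 1
toy_user_item = sp.coo_matrix(
    (np.ones_like(toy_users), (toy_users, toy_movies))
).tocsr()

print(toy_user_item.shape)      # (4, 4): ids 0..3, with id 0 unused
print(toy_user_item.toarray())
```

The same effect explains the progress-bar totals of 6041 users and 3953 items later in the notebook.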

As an example, let's use the ALS factorization from the implicit library.

We set the dimensionality of the latent space to 64; this is also the size of the user/item embeddings.

In [9]:
model = implicit.als.AlternatingLeastSquares(factors=64, iterations=100, calculate_training_loss=True)



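The loss here is based on squared error, a close relative of everyone's favorite RMSE (for implicit ALS the errors are additionally weighted by confidence).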

In [10]:
model.fit(user_item_t_csr)


Let's find movies similar to movie_id = 1, Toy Story.

In [11]:
movie_info.head(5)

Unnamed: 0,movie_id,name,category
0,1,Toy Story (1995),Animation|Children's|Comedy
1,2,Jumanji (1995),Adventure|Children's|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance
3,4,Waiting to Exhale (1995),Comedy|Drama
4,5,Father of the Bride Part II (1995),Comedy


In [12]:
get_similars = lambda item_id, model : [movie_info[movie_info["movie_id"] == x[0]]["name"].to_string() 
                                        for x in model.similar_items(item_id)]

As we can see, the similar items really are similar.

The quality of similar items is often a good sanity check for the quality of an algorithm.

P.S. If you want to dig deeper into how different algorithms shape different latent spaces, I recommend loading the resulting vectors into TensorBoard and inspecting the space.

In [13]:
get_similars(1, model)

['0    Toy Story (1995)',
 '3045    Toy Story 2 (1999)',
 "2286    Bug's Life, A (1998)",
 '33    Babe (1995)',
 '584    Aladdin (1992)',
 '2315    Babe: Pig in the City (1998)',
 '360    Lion King, The (1994)',
 '1526    Hercules (1997)',
 '1838    Mulan (1998)',
 '2618    Tarzan (1999)']
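
One common way to score item similarity is the cosine between item embeddings. The sketch below assumes that measure (whether a particular library uses exactly it is not confirmed here); `top_k_similar` and the toy embeddings are hypothetical names and values.

```python
import numpy as np

def top_k_similar(item_factors, item_id, k=3):
    """Return indices of the k items most cosine-similar to item_id (itself included)."""
    norms = np.linalg.norm(item_factors, axis=1)
    norms[norms == 0] = 1.0                 # avoid division by zero for all-zero rows
    unit = item_factors / norms[:, None]    # L2-normalize every item vector
    scores = unit @ unit[item_id]           # cosine similarity to the query item
    return np.argsort(-scores)[:k]

# Hypothetical 2-dimensional item embeddings
toy_items = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
print(top_k_similar(toy_items, 0))
```

As in the output above, the query item itself comes first because its similarity to itself is maximal.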

Now let's build recommendations for users.

As we can see, the user likes science fiction, so we expect to see science fiction in the recommendations as well.

In [14]:
get_user_history = lambda user_id, implicit_ratings : [movie_info[movie_info["movie_id"] == x]["name"].to_string() 
                                            for x in implicit_ratings[implicit_ratings["user_id"] == user_id]["movie_id"]]

In [15]:
get_user_history(4, implicit_ratings)

['3399    Hustler, The (1961)',
 '2882    Fistful of Dollars, A (1964)',
 '1196    Alien (1979)',
 '1023    Die Hard (1988)',
 '257    Star Wars: Episode IV - A New Hope (1977)',
 '1959    Saving Private Ryan (1998)',
 '476    Jurassic Park (1993)',
 '1180    Raiders of the Lost Ark (1981)',
 '1885    Rocky (1976)',
 '1081    E.T. the Extra-Terrestrial (1982)',
 '3349    Thelma & Louise (1991)',
 '3633    Mad Max (1979)',
 '2297    King Kong (1933)',
 '1366    Jaws (1975)',
 '1183    Good, The Bad and The Ugly, The (1966)',
 '2623    Run Lola Run (Lola rennt) (1998)',
 '2878    Goldfinger (1964)',
 '1220    Terminator, The (1984)']

It worked!

We really did recommend science fiction and action movies to the user; moreover, the list includes sequels to movies the user rated highly.

In [16]:
get_recommendations = lambda user_id, model : [movie_info[movie_info["movie_id"] == x[0]]["name"].to_string() 
                                               for x in model.recommend(user_id, user_item_csr)]

In [17]:
get_recommendations(4, model)

['585    Terminator 2: Judgment Day (1991)',
 '1271    Indiana Jones and the Last Crusade (1989)',
 '1182    Aliens (1986)',
 '2502    Matrix, The (1999)',
 '1284    Butch Cassidy and the Sundance Kid (1969)',
 '1178    Star Wars: Episode V - The Empire Strikes Back...',
 '3402    Close Encounters of the Third Kind (1977)',
 '847    Godfather, The (1972)',
 '2460    Planet of the Apes (1968)',
 '1892    Rain Man (1988)']

Now it's your turn to implement the most popular matrix factorization algorithms.

What will be graded:
1. Correctness of the algorithm
2. Quality of the resulting similar items
3. Quality of the final recommendations for a user

### Task 1. Without using ready-made solutions, implement an SVD factorization trained with SGD on explicit data
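
The notebook gives no solution for this task. A minimal sketch of one possible approach is shown below, assuming the usual biased model r_ui ≈ mu + b_u + b_i + p_u · q_i trained with plain SGD and L2 regularization. The function names, hyperparameters and toy ratings are all illustrative assumptions, not the reference solution.

```python
import numpy as np

def sgd_mf(triples, n_users, n_items, nf=8, lr=0.05, reg=0.02, epochs=200, seed=0):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i on (user, item, rating) triples with SGD."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, nf))   # user factors
    Q = rng.normal(scale=0.1, size=(n_items, nf))   # item factors
    bu = np.zeros(n_users)                          # user biases
    bi = np.zeros(n_items)                          # item biases
    mu = float(np.mean([r for _, _, r in triples])) # global mean rating

    for _ in range(epochs):
        for idx in rng.permutation(len(triples)):   # visit ratings in random order
            u, i, r = triples[idx]
            err = r - (mu + bu[u] + bi[i] + P[u] @ Q[i])
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            pu_old = P[u].copy()                    # use the old p_u in the q_i update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu_old - reg * Q[i])
    return mu, bu, bi, P, Q

def rmse(triples, mu, bu, bi, P, Q):
    """Root mean squared error of the model on the given triples."""
    errs = [r - (mu + bu[u] + bi[i] + P[u] @ Q[i]) for u, i, r in triples]
    return float(np.sqrt(np.mean(np.square(errs))))

# Hypothetical toy ratings with 0-based ids
toy = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]
params = sgd_mf(toy, n_users=3, n_items=3)
print(rmse(toy, *params))
```

On the real data the triples would come from the `ratings` DataFrame, and the error should be tracked on a held-out split rather than on the training ratings.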

### Task 2. Without using ready-made solutions, implement a matrix factorization trained with ALS on implicit data

In [18]:
from scipy.sparse.linalg import spsolve
from scipy import sparse
from tqdm.auto import tqdm
from sklearn.neighbors import NearestNeighbors

class ALS():
  def __init__(self, data, r_lambda = 40, nf = 200, alpha = 40):
    self.data = data
    self.r_lambda = r_lambda
    self.nf = nf
    self.alpha = alpha

    self.nu = data.shape[0]
    self.ni = data.shape[1]

    self.X = sparse.csr_matrix(np.random.rand(self.nu, self.nf) * 0.01) #user latent matrix
    self.Y = sparse.csr_matrix(np.random.rand(self.ni, self.nf) * 0.01) #item latent matrix

    self.C = alpha * self.data #confidence matrix

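  def optimize_user(self):
      # Solve (Y^T Y + Y^T (C^u - I) Y + lambda * I) x_u = Y^T C^u p(u) for every user u
      yT = self.Y.T
      yT_y = yT.dot(self.Y)
      reg = self.r_lambda * sparse.eye(self.nf)  # L2 regularization term lambda * I

      for u in tqdm(range(self.nu), leave=False):
          c = self.C[u, :].toarray()  # alpha * r_u, i.e. the diagonal of C^u - I
          p = c.copy()
          p[p > 0] = 1  # binary preference vector p(u)

          d = sparse.diags(c, [0])  # C^u - I
          conf = d + sparse.eye(self.Y.shape[0])  # C^u
          
          yT_Cu_pu = yT.dot(conf).dot(p.T)
          yT_d_y = yT.dot(d).dot(self.Y)
          self.X[u] = spsolve(yT_y + yT_d_y + reg, yT_Cu_pu)

  def optimize_item(self):
      # Symmetric update: solve (X^T X + X^T (C^i - I) X + lambda * I) y_i = X^T C^i p(i)
      xT = self.X.T
      xT_x = xT.dot(self.X)
      reg = self.r_lambda * sparse.eye(self.nf)  # L2 regularization term lambda * I

      for i in tqdm(range(self.ni), leave=False):
          c = self.C[:, i].T.toarray()  # alpha * r_i, i.e. the diagonal of C^i - I
          p = c.copy()
          p[p > 0] = 1  # binary preference vector p(i)

          d = sparse.diags(c, [0])  # C^i - I
          conf = d + sparse.eye(self.X.shape[0])  # C^i

          xT_Ci_pi = xT.dot(conf).dot(p.T)
          xT_d_x = xT.dot(d).dot(self.X)
          self.Y[i] = spsolve(xT_x + xT_d_x + reg, xT_Ci_pi)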
  
  def fit_KNN(self):
    self.nbrs = NearestNeighbors(n_neighbors=100).fit(self.Y)

  def similar_items(self, item_id):
    distances, indices = self.nbrs.kneighbors(self.Y[item_id])
    distances = distances[0][:10]
    indices = indices[0][:10]
    return zip(list(indices), list(distances))
  
  def recommend(self, user_id, user_item_csr):
    distances, indices = self.nbrs.kneighbors(self.X[user_id])
    distances = distances[0]
    indices = indices[0]
    recommendation = []
    user_data = user_item_csr[user_id, :].toarray().flatten()
    for i, d in zip(list(indices), list(distances)):
        if user_data[i] == 0: recommendation.append((i, d))
    return recommendation[:10]
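
For reference, in the standard implicit-feedback ALS formulation (assumed here to be the intended algorithm), each user row is obtained by solving a regularized least-squares problem. In the notation of the class above, $Y$ is the item factor matrix, $p(u)$ the binary preference vector, $\alpha$ corresponds to `alpha` and $\lambda$ to `r_lambda`:

```latex
x_u = \left( Y^\top Y + Y^\top (C^u - I)\, Y + \lambda I \right)^{-1} Y^\top C^u\, p(u),
\qquad C^u_{ii} = 1 + \alpha\, r_{ui}

y_i = \left( X^\top X + X^\top (C^i - I)\, X + \lambda I \right)^{-1} X^\top C^i\, p(i),
\qquad C^i_{uu} = 1 + \alpha\, r_{ui}
```

Splitting $Y^\top C^u Y$ into $Y^\top Y + Y^\top (C^u - I) Y$ lets $Y^\top Y$ be computed once per sweep, while the second term only involves the items the user interacted with.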

In [20]:
als = ALS(user_item_csr)

for i in range(2):
      als.optimize_user()
      als.optimize_item()


In [21]:
als.fit_KNN()

In [22]:
get_similars(1, als)

['0    Toy Story (1995)',
 '2111    Torn Curtain (1966)',
 '3663    Fury, The (1978)',
 '175    Lord of Illusions (1995)',
 '2035    Tex (1982)',
 '2780    Queens Logic (1991)',
 '2371    Another Day in Paradise (1998)',
 '130    Jade (1995)',
 '3665    Prince of the City (1981)',
 '2599    Swamp Thing (1982)']

In [23]:
get_recommendations(4, als)

['1178    Star Wars: Episode V - The Empire Strikes Back...',
 '604    Fargo (1996)',
 '2125    Untouchables, The (1987)',
 '1192    Star Wars: Episode VI - Return of the Jedi (1983)',
 '1182    Aliens (1986)',
 '1271    Indiana Jones and the Last Crusade (1989)',
 '1284    Butch Cassidy and the Sundance Kid (1969)',
 '1201    Psycho (1960)',
 '3555    Shanghai Noon (2000)',
 '1568    Hunt for Red October, The (1990)']

### Task 3. Without using ready-made solutions, implement a BPR matrix factorization on implicit data

In [24]:
import torch
from torch.utils.data import Dataset
from tqdm.auto import tqdm
class PrepareDataset(Dataset):
  def __init__(self, data):
    self.user_number = data.shape[0]
    self.item_number = data.shape[1]

    self.positive = {}
    self.positive_pairs = []
    users, items = data.nonzero()
    for user, item in zip(users, items):
        if user not in self.positive.keys(): self.positive[user] = []
        self.positive[user].append(item)
        self.positive_pairs.append((user, item))
    self.all_users = list(self.positive.keys())

  def __getitem__(self, i):
    user = self.all_users[i]
    negative = np.random.randint(0, self.item_number) 
    positive = np.random.choice(self.positive[user], 1)[0] 
    while negative in self.positive[user]: 
      negative = np.random.randint(0, self.item_number) 
    return user, positive, negative

  def __len__(self):
    return len(self.all_users)


In [25]:
import torch
dataset = PrepareDataset(user_item)
dataloader = torch.utils.data.DataLoader(
    dataset,
    batch_size=64,
    shuffle=True,
    drop_last=True,
    num_workers=2,
    pin_memory=True,
)

In [26]:
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

class BPR(nn.Module):
  def __init__(self, nf = 64):
    super(BPR, self).__init__()
    user_number = dataset.user_number
    item_number = dataset.item_number
    
    self.nf = nf
    self.user_emb = nn.Embedding(user_number, self.nf)
    self.item_emb = nn.Embedding(item_number, self.nf)

    nn.init.normal_(self.user_emb.weight, std=0.01)
    nn.init.normal_(self.item_emb.weight, std=0.01)

  def forward(self, user, item_i, item_j):
    user = self.user_emb(user)
    item_i = self.item_emb(item_i)
    item_j = self.item_emb(item_j)

    prediction_i = (user * item_i).sum(dim=-1)
    prediction_j = (user * item_j).sum(dim=-1)
   
    return prediction_i, prediction_j

  def fit_KNN(self):
    self.X = self.user_emb.weight.detach().cpu().numpy()
    self.Y = self.item_emb.weight.detach().cpu().numpy()

    self.X = self.X.reshape(-1, self.nf)  # use the configured embedding size instead of a hard-coded 64

    self.nbrs = NearestNeighbors(n_neighbors=100).fit(self.Y)

  def similar_items(self, item_id):
    distances, indices = self.nbrs.kneighbors(self.Y[item_id].reshape(1, -1))
    distances = distances[0][:10]
    indices = indices[0][:10]
    return zip(list(indices), list(distances))
  
  def recommend(self, user_id, user_item_csr):
    distances, indices = self.nbrs.kneighbors(self.X[user_id].reshape(1, -1))
    distances = distances[0]
    indices = indices[0]
    recommendation = []
    user_data = user_item_csr[user_id, :].toarray().flatten()
    for i, d in zip(list(indices), list(distances)):
        if user_data[i] == 0: recommendation.append((i, d))
    return recommendation[:10]

In [27]:
bpr = BPR().to('cuda')

In [28]:
import torch.optim as optim
optimizer = optim.Adam(bpr.parameters())

In [29]:
for epoch in range(70):
  train_loss = 0
  for user, positive, negative in tqdm(dataloader, leave=False):
    optimizer.zero_grad()

    user = user.cuda()
    positive = positive.cuda()
    negative = negative.cuda()

    pred_i, pred_j = bpr(user, positive, negative)
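    # BPR loss: -sum(log(sigmoid(x_ui - x_uj))); logsigmoid is the numerically stable form
    loss = -torch.nn.functional.logsigmoid(pred_i - pred_j).sum()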
    
    train_loss += loss.item()

    loss.backward()
    optimizer.step()
  
  train_loss /= len(dataloader)
  print('Epoch {}: loss = {:.5f}'.format(epoch, train_loss))

Epoch 0: loss = 44.36158
Epoch 1: loss = 44.35756
Epoch 2: loss = 44.34785
Epoch 3: loss = 44.32889
Epoch 4: loss = 44.29044
Epoch 5: loss = 44.20076
Epoch 6: loss = 44.01517
Epoch 7: loss = 43.61272
Epoch 8: loss = 42.86509
Epoch 9: loss = 41.63302
Epoch 10: loss = 39.81862
Epoch 11: loss = 37.30613
Epoch 12: loss = 34.46879
Epoch 13: loss = 31.57035
Epoch 14: loss = 28.63795
Epoch 15: loss = 25.84411
Epoch 16: loss = 23.60860
Epoch 17: loss = 21.89035
Epoch 18: loss = 20.28318
Epoch 19: loss = 19.01249
Epoch 20: loss = 17.53947
Epoch 21: loss = 16.82644
Epoch 22: loss = 16.08925
Epoch 23: loss = 15.77649
Epoch 24: loss = 15.54648
Epoch 25: loss = 14.52459
Epoch 26: loss = 14.63101
Epoch 27: loss = 13.90514
Epoch 28: loss = 14.06139
Epoch 29: loss = 13.50927
Epoch 30: loss = 13.51672
Epoch 31: loss = 13.44833
Epoch 32: loss = 13.09240
Epoch 33: loss = 13.63652
Epoch 34: loss = 13.16884
Epoch 35: loss = 13.10024
Epoch 36: loss = 12.75135
Epoch 37: loss = 13.19280
Epoch 38: loss = 12.06841
Epoch 39: loss = 12.55978
Epoch 40: loss = 12.66473
Epoch 41: loss = 12.36668
Epoch 42: loss = 12.36913
Epoch 43: loss = 12.52010
Epoch 44: loss = 11.70323
Epoch 45: loss = 12.03721
Epoch 46: loss = 12.20656
Epoch 47: loss = 12.35571
Epoch 48: loss = 11.86238
Epoch 49: loss = 12.20940
Epoch 50: loss = 12.67693
Epoch 51: loss = 11.95577
Epoch 52: loss = 11.08206
Epoch 53: loss = 11.40734
Epoch 54: loss = 11.37830
Epoch 55: loss = 12.06861
Epoch 56: loss = 11.68362
Epoch 57: loss = 11.85500
Epoch 58: loss = 11.95824
Epoch 59: loss = 11.73510
Epoch 60: loss = 11.39891
Epoch 61: loss = 11.06750
Epoch 62: loss = 11.65210
Epoch 63: loss = 11.53992
Epoch 64: loss = 11.24533
Epoch 65: loss = 11.42805
Epoch 66: loss = 11.23210
Epoch 67: loss = 11.06913
Epoch 68: loss = 11.54966
Epoch 69: loss = 11.44806
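
The quantity summed in the training loop is the BPR objective, the negative log-sigmoid of the score difference between a positive and a negative item. The NumPy sketch below (`bpr_loss` is a hypothetical helper) computes it in a numerically stable way and shows why the first epoch starts near 44.36: with near-zero embeddings every pair contributes log 2, and a batch holds 64 pairs.

```python
import numpy as np

def bpr_loss(pos_scores, neg_scores):
    """Sum of -log(sigmoid(pos - neg)), computed stably as log(1 + exp(-(pos - neg)))."""
    diff = np.asarray(pos_scores, dtype=float) - np.asarray(neg_scores, dtype=float)
    return float(np.logaddexp(0.0, -diff).sum())

# Untrained model: all scores are ~0, so the batch loss is 64 * log(2)
print(bpr_loss(np.zeros(64), np.zeros(64)))

# A correctly ordered pair costs little, a wrongly ordered pair costs a lot
print(bpr_loss([3.0], [0.0]), bpr_loss([0.0], [3.0]))
```

Because the loss is summed rather than averaged over the batch, its scale depends on `batch_size`.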


In [30]:
bpr.fit_KNN()

In [31]:
get_similars(1, bpr)

['0    Toy Story (1995)',
 '353    Four Weddings and a Funeral (1994)',
 '584    Aladdin (1992)',
 '2031    Splash (1984)',
 '2728    Big (1988)',
 '2432    October Sky (1999)',
 '1726    As Good As It Gets (1997)',
 '1636    Truman Show, The (1998)',
 '586    Dances with Wolves (1990)',
 '3045    Toy Story 2 (1999)']

In [32]:
get_recommendations(4, bpr)

['2298    King Kong (1976)',
 '2917    Robocop 2 (1990)',
 '970    Picnic (1955)',
 '88    Nick of Time (1995)',
 '932    Lost Horizon (1937)',
 '2471    Corruptor, The (1999)',
 '928    Adventures of Robin Hood, The (1938)',
 '3310    On the Beach (1959)',
 '917    Foreign Correspondent (1940)',
 '1033    Beautiful Thing (1996)']
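
BPR is trained on dot-product scores, so ranking items by `user · item` is a natural alternative to the Euclidean nearest-neighbour search used in `recommend`. A minimal sketch under that assumption follows; `recommend_by_score` and the toy vectors are hypothetical and not part of the notebook.

```python
import numpy as np

def recommend_by_score(user_vec, item_factors, seen_items, k=10):
    """Rank items by dot product with the user vector, skipping already seen items."""
    scores = (item_factors @ user_vec).astype(float)
    if seen_items:
        scores[list(seen_items)] = -np.inf   # never recommend what the user already has
    top = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in top if np.isfinite(scores[i])]

# Hypothetical 2-dimensional embeddings
toy_items = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [2.0, 0.0]])
toy_user = np.array([1.0, 0.2])
print(recommend_by_score(toy_user, toy_items, seen_items={0}, k=2))
```

With the trained model this would be called with `bpr.X[user_id]`, `bpr.Y` and the set of items from the user's row of `user_item_csr`.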

### Task 4. Without using ready-made solutions, implement a WARP matrix factorization on implicit data
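
The notebook gives no solution for this task. The sketch below shows one WARP step for a single (user, positive item) pair, assuming the usual recipe: sample negatives until one violates the margin, estimate the rank from the number of trials, and scale a hinge-loss gradient step by the sum of 1/k up to that rank. `warp_update`, its parameters and the toy data are illustrative assumptions, not the reference solution.

```python
import numpy as np

def warp_update(user_vec, item_factors, pos_item, seen, lr=0.05, margin=1.0,
                max_trials=50, rng=None):
    """One WARP step; updates user_vec and item_factors in place.

    Returns the estimated rank of the positive item (0 if no violator was found).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_items = item_factors.shape[0]
    # Negatives are all items the user has not interacted with
    candidates = np.setdiff1d(np.arange(n_items), list(seen))
    pos_score = item_factors[pos_item] @ user_vec

    for trial in range(1, max_trials + 1):
        neg_item = rng.choice(candidates)
        neg_score = item_factors[neg_item] @ user_vec
        if neg_score + margin > pos_score:                  # margin violation found
            rank = len(candidates) // trial                 # fewer trials => worse rank
            weight = np.sum(1.0 / np.arange(1, rank + 1))   # L(rank) = sum_{k<=rank} 1/k
            step = lr * weight
            u_old = user_vec.copy()
            # Gradient step on the hinge loss margin - s_pos + s_neg
            user_vec += step * (item_factors[pos_item] - item_factors[neg_item])
            item_factors[pos_item] += step * u_old
            item_factors[neg_item] -= step * u_old
            return rank
    return 0

# Hypothetical toy run: 5 items, one positive item for a single user
rng = np.random.default_rng(0)
items = rng.normal(scale=0.1, size=(5, 4))
user = rng.normal(scale=0.1, size=4)
for _ in range(300):
    warp_update(user, items, pos_item=2, seen={2}, rng=rng)
print(items @ user)
```

A full solution would loop over all positive pairs, add regularization, and usually cap the embedding norms; those parts are left out here.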