### Matrix Factorizations

In this assignment you will get hands-on experience with matrix factorization.
The assignment consists of 4 tasks:
1. Implement SVD factorization using SGD on explicit data
2. Implement matrix factorization using ALS on implicit data
3. Implement matrix factorization using BPR on implicit data
4. Implement matrix factorization using WARP on implicit data

Soft deadline: October 13 (comments are written, a grade is assigned, and you can make corrections before the hard deadline)

Hard deadline: October 20 (final review)

In [1]:
import implicit
import pandas as pd
import numpy as np
import scipy.sparse as sp

from lightfm.datasets import fetch_movielens



In this assignment we will work with the explicit MovieLens dataset, which contains (user_id, movie_id) pairs together with the rating the user gave the movie.

The dataset can be downloaded from https://grouplens.org/datasets/movielens/1m/

In [2]:
ratings = pd.read_csv('./data/ratings.dat', delimiter='::', header=None, 
        names=['user_id', 'movie_id', 'rating', 'timestamp'], 
        usecols=['user_id', 'movie_id', 'rating'], engine='python')

In [3]:
ratings

Unnamed: 0,user_id,movie_id,rating
0,1,1193,5
1,1,661,3
2,1,914,3
3,1,3408,4
4,1,2355,5
...,...,...,...
1000204,6040,1091,1
1000205,6040,1094,5
1000206,6040,562,5
1000207,6040,1096,4


In [4]:
movie_info = pd.read_csv('./data/movies.dat', delimiter='::', header=None, encoding="ISO-8859-1",
        names=['movie_id', 'name', 'category'], engine='python')

Explicit data

In [5]:
ratings.head(10)

Unnamed: 0,user_id,movie_id,rating
0,1,1193,5
1,1,661,3
2,1,914,3
3,1,3408,4
4,1,2355,5
5,1,1197,3
6,1,1287,5
7,1,2804,5
8,1,594,4
9,1,919,4


To convert the current dataset to implicit feedback, let's treat a rating >= 4 as a positive interaction.

In [6]:
implicit_ratings = ratings.loc[(ratings['rating'] >= 4)]

In [7]:
implicit_ratings.head(10)

Unnamed: 0,user_id,movie_id,rating
0,1,1193,5
3,1,3408,4
4,1,2355,5
6,1,1287,5
7,1,2804,5
8,1,594,4
9,1,919,4
10,1,595,5
11,1,938,4
12,1,2398,4


Sparse matrices are more convenient to work with, so let's convert the DataFrame into CSR matrices.

In [8]:
users = implicit_ratings["user_id"]
movies = implicit_ratings["movie_id"]
user_item = sp.coo_matrix((np.ones_like(users), (users, movies)))
user_item_t_csr = user_item.T.tocsr()
user_item_csr = user_item.tocsr()
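
A small illustration of what this conversion produces, using made-up toy ids (the names `toy_users`, `toy_movies`, and `toy_csr` are hypothetical and not part of the assignment). Because the ids start at 1 and are used directly as indices, row 0 and column 0 of the matrix stay empty.

```python
import numpy as np
import scipy.sparse as sp

# Toy interactions with 1-based ids, mimicking the MovieLens layout.
toy_users = np.array([1, 1, 2, 3])
toy_movies = np.array([2, 3, 3, 1])

# COO is convenient for construction; CSR gives fast access to one user's row.
toy_coo = sp.coo_matrix((np.ones_like(toy_users), (toy_users, toy_movies)))
toy_csr = toy_coo.tocsr()

print(toy_csr.shape)       # the shape is (max id + 1) along each axis
print(toy_csr[1].indices)  # column ids of the items user 1 interacted with
```

The same effect explains why the real matrix has one more row than there are users: index 0 is unused padding.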

As an example, let's use the ALS factorization from the implicit library.

We set the dimensionality of the latent space to 64; this also determines the size of the user/item embeddings.

In [9]:
model = implicit.als.AlternatingLeastSquares(factors=64, iterations=10, calculate_training_loss=True)

In [10]:
user_item_csr.nonzero()

(array([   1,    1,    1, ..., 6040, 6040, 6040], dtype=int32),
 array([   1,   48,  150, ..., 3735, 3751, 3819], dtype=int32))

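The training loss reported here is a squared-error loss in the spirit of everyone's favorite RMSE; for implicit-feedback ALS it is weighted by confidence rather than being plain RMSE.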
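
For reference, the implicit-feedback ALS objective is usually written as follows. This is the textbook formulation, with $p_{ui} \in \{0, 1\}$ the binarized preference, $r_{ui}$ the raw interaction value, and $\alpha$, $\lambda$ hyperparameters; the exact quantity the library prints as training loss may be normalized differently.

$$
\min_{X, Y} \sum_{u, i} c_{ui}\,\bigl(p_{ui} - x_u^\top y_i\bigr)^2 + \lambda \Bigl(\sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2\Bigr), \qquad c_{ui} = 1 + \alpha\, r_{ui}
$$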

In [11]:
model.fit(user_item_t_csr)

  0%|          | 0/10 [00:00<?, ?it/s]

Let's find movies similar to movie_id = 1, Toy Story.

In [12]:
movie_info.head(5)

Unnamed: 0,movie_id,name,category
0,1,Toy Story (1995),Animation|Children's|Comedy
1,2,Jumanji (1995),Adventure|Children's|Fantasy
2,3,Grumpier Old Men (1995),Comedy|Romance
3,4,Waiting to Exhale (1995),Comedy|Drama
4,5,Father of the Bride Part II (1995),Comedy


In [13]:
get_similars = lambda item_id, model : [movie_info[movie_info["movie_id"] == x[0]]["name"].to_string() 
                                        for x in model.similar_items(item_id)]

As we can see, the similar items really do turn out to be similar.

The quality of similar items is often a good way to check the quality of an algorithm.

P.S. If you want to dig deeper into how different algorithms shape different latent spaces, I recommend loading the resulting vectors into TensorBoard and inspecting the space they form.
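
One common way to compute similar items from learned embeddings is cosine similarity between item vectors. The sketch below assumes the item factors are available as a dense NumPy array; `similar_items_cosine` is a hypothetical helper, not the API of the implicit library, although it returns `(item_id, score)` pairs in the same spirit as the output consumed by `get_similars`.

```python
import numpy as np

def similar_items_cosine(item_factors, item_id, n=10):
    """Return (item_id, score) pairs for the n items closest by cosine similarity."""
    norms = np.linalg.norm(item_factors, axis=1)
    norms[norms == 0] = 1e-12  # avoid division by zero for unused item ids
    scores = item_factors @ item_factors[item_id] / (norms * norms[item_id])
    top = np.argsort(-scores)[:n]
    return [(int(i), float(scores[i])) for i in top]

# Toy check: item 4 points in exactly the same direction as item 1.
rng = np.random.default_rng(0)
factors = rng.normal(size=(5, 3))
factors[4] = 2.0 * factors[1]
result = similar_items_cosine(factors, 1, n=2)
print(result)
```

The query item itself is always among its own nearest neighbours, which matches the ALS output where Toy Story appears first.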

In [14]:
get_similars(1, model)

['0    Toy Story (1995)',
 '3045    Toy Story 2 (1999)',
 "2286    Bug's Life, A (1998)",
 '584    Aladdin (1992)',
 '2692    Iron Giant, The (1999)',
 '1838    Mulan (1998)',
 '2252    Pleasantville (1998)',
 '33    Babe (1995)',
 '1526    Hercules (1997)',
 '2618    Tarzan (1999)']

Now let's build recommendations for users.

As we can see, this user likes science fiction, so we expect to see science fiction in the recommendations.

In [15]:
get_user_history = lambda user_id, implicit_ratings : [movie_info[movie_info["movie_id"] == x]["name"].to_string() 
                                            for x in implicit_ratings[implicit_ratings["user_id"] == user_id]["movie_id"]]

In [16]:
get_user_history(4, implicit_ratings)

['3399    Hustler, The (1961)',
 '2882    Fistful of Dollars, A (1964)',
 '1196    Alien (1979)',
 '1023    Die Hard (1988)',
 '257    Star Wars: Episode IV - A New Hope (1977)',
 '1959    Saving Private Ryan (1998)',
 '476    Jurassic Park (1993)',
 '1180    Raiders of the Lost Ark (1981)',
 '1885    Rocky (1976)',
 '1081    E.T. the Extra-Terrestrial (1982)',
 '3349    Thelma & Louise (1991)',
 '3633    Mad Max (1979)',
 '2297    King Kong (1933)',
 '1366    Jaws (1975)',
 '1183    Good, The Bad and The Ugly, The (1966)',
 '2623    Run Lola Run (Lola rennt) (1998)',
 '2878    Goldfinger (1964)',
 '1220    Terminator, The (1984)']

It worked!

We did recommend science fiction and action movies to the user; moreover, the list includes sequels to movies they rated highly.

In [17]:
get_recommendations = lambda user_id, model : [movie_info[movie_info["movie_id"] == x[0]]["name"].to_string() 
                                               for x in model.recommend(user_id, user_item_csr)]
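
For a custom model, recommendations can be produced by scoring every item with a dot product and dropping the items the user has already interacted with. This is a minimal sketch under the assumption that the model exposes dense user and item factor arrays; `recommend_top_n` and its arguments are hypothetical names.

```python
import numpy as np
import scipy.sparse as sp

def recommend_top_n(user_factors, item_factors, user_items, user_id, n=10):
    """Return (item_id, score) pairs for the best unseen items of one user."""
    scores = item_factors @ user_factors[user_id]
    seen = user_items[user_id].indices
    scores[seen] = -np.inf  # never recommend items from the user's history
    top = np.argsort(-scores)[:n]
    return [(int(i), float(scores[i])) for i in top if np.isfinite(scores[i])]

# Toy data: one user who has already seen item 0.
toy_user_factors = np.array([[1.0, 0.0]])
toy_item_factors = np.array([[1.0, 0.0], [0.9, 0.0], [0.0, 1.0]])
toy_history = sp.csr_matrix(np.array([[1, 0, 0]]))
recs = recommend_top_n(toy_user_factors, toy_item_factors, toy_history, 0, n=2)
print(recs)
```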

In [18]:
get_recommendations(4, model)

['585    Terminator 2: Judgment Day (1991)',
 '1284    Butch Cassidy and the Sundance Kid (1969)',
 '1178    Star Wars: Episode V - The Empire Strikes Back...',
 '1271    Indiana Jones and the Last Crusade (1989)',
 '1182    Aliens (1986)',
 '1884    French Connection, The (1971)',
 '2502    Matrix, The (1999)',
 '2875    Dirty Dozen, The (1967)',
 '2880    Dr. No (1962)',
 '1892    Rain Man (1988)']

Now it is your turn to implement the most popular matrix factorization algorithms.

What will be graded:
1. Correctness of the algorithm
2. Quality of the resulting similar items
3. Quality of the final recommendations for a user

### Task 1. Without using ready-made solutions, implement SVD factorization using SGD on explicit data
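
A minimal sketch of the idea, not the `modules.svd.SVD` class used below: the rating is modeled as a global mean plus user and item biases plus a dot product of latent vectors, and each observed rating triggers one stochastic gradient step on the regularized squared error. The function name `sgd_mf` and all hyperparameter values are assumptions chosen for illustration.

```python
import numpy as np

def sgd_mf(rows, cols, vals, n_users, n_items, factors=8,
           lr=0.01, reg=0.02, n_iter=50, seed=42):
    """Biased matrix factorization trained with plain SGD."""
    rng = np.random.default_rng(seed)
    P = rng.normal(scale=0.1, size=(n_users, factors))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, factors))  # item factors
    bu = np.zeros(n_users)
    bi = np.zeros(n_items)
    mu = vals.mean()
    for _ in range(n_iter):
        for k in rng.permutation(len(vals)):  # visit ratings in random order
            u, i = rows[k], cols[k]
            err = vals[k] - (mu + bu[u] + bi[i] + P[u] @ Q[i])
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            pu = P[u].copy()  # keep the old user vector for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return mu, bu, bi, P, Q

# Toy ratings: 3 users, 3 items, 6 observed entries.
rows = np.array([0, 0, 1, 1, 2, 2])
cols = np.array([0, 1, 0, 2, 1, 2])
vals = np.array([5.0, 3.0, 4.0, 1.0, 2.0, 5.0])
mu, bu, bi, P, Q = sgd_mf(rows, cols, vals, n_users=3, n_items=3, lr=0.05, n_iter=200)
pred = mu + bu[rows] + bi[cols] + np.sum(P[rows] * Q[cols], axis=1)
train_rmse = np.sqrt(np.mean((vals - pred) ** 2))
print(train_rmse)  # training RMSE on the toy data
```

On the real dataset the per-rating Python loop is slow, which is consistent with the one-million-step progress bar per epoch shown below; vectorizing or compiling the inner loop is a natural optimization.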

In [19]:
from modules.svd import SVD

In [20]:
user_item = sp.coo_matrix((ratings['rating'], (ratings['user_id'], ratings['movie_id'])))

In [23]:
svd = SVD(64, n_iter=50, random_state=42)

In [24]:
svd.fit(user_item)

  0%|          | 0/50 [00:00<?, ?it/s]

Fitting:   0%|          | 0/1000209 [00:00<?, ?it/s]

<modules.svd.SVD at 0x7ffd0213b460>

In [26]:
get_similars(1, svd)

['3045    Toy Story 2 (1999)',
 "2286    Bug's Life, A (1998)",
 '584    Aladdin (1992)',
 '1132    Wrong Trousers, The (1993)',
 '591    Beauty and the Beast (1991)',
 '3115    Montana (1998)',
 '2692    Iron Giant, The (1999)',
 '360    Lion King, The (1994)',
 '3682    Chicken Run (2000)',
 '1250    Back to the Future (1985)']

In [28]:
get_recommendations(4, svd)

['3352    Animal House (1978)',
 '900    Casablanca (1942)',
 "941    It's a Wonderful Life (1946)",
 '1186    Lawrence of Arabia (1962)',
 '1267    Ben-Hur (1959)',
 '2506    Dreamlife of Angels, The (La Vie rêvée des ang...',
 '1729    Men With Guns (1997)',
 '3801    Shane (1953)',
 '3031    River Runs Through It, A (1992)',
 '1234    Treasure of the Sierra Madre, The (1948)']

### Task 2. Without using ready-made solutions, implement matrix factorization using ALS on implicit data
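
A minimal sketch of implicit-feedback ALS in the confidence-weighted formulation, not the `modules.als.ALS` class used below. With one side's factors fixed, each row of the other side has a closed-form ridge-regression solution, and the `YtY` trick avoids touching unobserved entries. The names `als_step` and `als_fit`, the confidence form `1 + alpha * r`, and the hyperparameter values are assumptions for illustration.

```python
import numpy as np
import scipy.sparse as sp

def als_step(C, Y, reg, alpha):
    """Recompute the factors for the rows of C while the factors Y stay fixed."""
    n_factors = Y.shape[1]
    YtY = Y.T @ Y  # shared by all rows, computed once per half-step
    X = np.zeros((C.shape[0], n_factors))
    for u in range(C.shape[0]):
        start, end = C.indptr[u], C.indptr[u + 1]
        idx = C.indices[start:end]
        conf = 1.0 + alpha * C.data[start:end]  # confidence of observed entries
        Yu = Y[idx]
        # A = Y^T C_u Y + reg * I, using C_u = I + diag(conf - 1) on observed items
        A = YtY + (Yu.T * (conf - 1.0)) @ Yu + reg * np.eye(n_factors)
        b = Yu.T @ conf  # Y^T C_u p_u with p_ui = 1 on observed items
        X[u] = np.linalg.solve(A, b)
    return X

def als_fit(user_item, factors=8, n_iter=10, reg=0.1, alpha=10.0, seed=42):
    rng = np.random.default_rng(seed)
    Cui = sp.csr_matrix(user_item, dtype=np.float64)
    Ciu = Cui.T.tocsr()
    U = rng.normal(scale=0.1, size=(Cui.shape[0], factors))
    V = rng.normal(scale=0.1, size=(Cui.shape[1], factors))
    for _ in range(n_iter):
        U = als_step(Cui, V, reg, alpha)  # update users with items fixed
        V = als_step(Ciu, U, reg, alpha)  # update items with users fixed
    return U, V

# Toy data: two groups of users with disjoint tastes.
toy = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 1]])
U, V = als_fit(toy, factors=4, n_iter=15)
scores = U @ V.T
print(np.round(scores, 2))
```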

In [31]:
from modules.als import ALS

In [39]:
als = ALS(64, n_iter=20, lambda_=1e-5, random_state=42)

In [40]:
als.fit(user_item_csr)

  0%|          | 0/20 [00:00<?, ?it/s]

<modules.als.ALS at 0x7ffd22277ca0>

In [42]:
get_similars(1, als)

['3045    Toy Story 2 (1999)',
 "2286    Bug's Life, A (1998)",
 '33    Babe (1995)',
 '584    Aladdin (1992)',
 '2315    Babe: Pig in the City (1998)',
 '2692    Iron Giant, The (1999)',
 '2252    Pleasantville (1998)',
 '360    Lion King, The (1994)',
 '3088    Stuart Little (1999)',
 '3817    Went to Coney Island on a Mission From God... ...']

In [43]:
get_recommendations(4, als)

['585    Terminator 2: Judgment Day (1991)',
 '1271    Indiana Jones and the Last Crusade (1989)',
 '1284    Butch Cassidy and the Sundance Kid (1969)',
 '1178    Star Wars: Episode V - The Empire Strikes Back...',
 '1182    Aliens (1986)',
 '2502    Matrix, The (1999)',
 '847    Godfather, The (1972)',
 '3402    Close Encounters of the Third Kind (1977)',
 '2460    Planet of the Apes (1968)',
 '1884    French Connection, The (1971)']

### Task 3. Without using ready-made solutions, implement BPR matrix factorization on implicit data
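
A minimal sketch of BPR (Bayesian Personalized Ranking), not the `modules.bpr.BPR` class used below. For each user it samples one positive item `i` and one negative item `j`, then takes a gradient ascent step on `ln sigmoid(x_ui - x_uj)` with L2 regularization. The function name `bpr_fit`, the sampling scheme (one triple per user per epoch), and the hyperparameters are assumptions for illustration.

```python
import numpy as np
import scipy.sparse as sp

def bpr_fit(user_item, factors=8, lr=0.05, reg=0.01, n_iter=300, seed=42):
    rng = np.random.default_rng(seed)
    X = sp.csr_matrix(user_item)
    n_users, n_items = X.shape
    U = rng.normal(scale=0.1, size=(n_users, factors))
    V = rng.normal(scale=0.1, size=(n_items, factors))
    counts = np.diff(X.indptr)
    # Only users with at least one positive and at least one negative item.
    users = np.flatnonzero((counts > 0) & (counts < n_items))
    for _ in range(n_iter):
        for u in rng.permutation(users):
            pos = X.indices[X.indptr[u]:X.indptr[u + 1]]
            i = rng.choice(pos)
            j = rng.integers(n_items)
            while j in pos:  # rejection-sample a negative item
                j = rng.integers(n_items)
            x_uij = U[u] @ (V[i] - V[j])
            g = 1.0 / (1.0 + np.exp(x_uij))  # sigmoid(-x_uij), gradient scale
            u_old = U[u].copy()
            U[u] += lr * (g * (V[i] - V[j]) - reg * U[u])
            V[i] += lr * (g * u_old - reg * V[i])
            V[j] += lr * (-g * u_old - reg * V[j])
    return U, V

# Toy data: two groups of users with disjoint tastes.
toy = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 1]])
U, V = bpr_fit(toy)
scores = U @ V.T
print(np.round(scores, 2))
```

Because BPR optimizes only the relative order of items, popular items tend to score well for everyone, which may explain why the similar items below are dominated by blockbusters.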

In [44]:
from modules.bpr import BPR

In [45]:
bpr = BPR(64, lambda_=1e-5, random_state=42)

In [46]:
bpr.fit(user_item_csr)

  0%|          | 0/250 [00:00<?, ?it/s]

Fitting:   0%|          | 0/6041 [00:00<?, ?it/s]

<modules.bpr.BPR at 0x7fb029679610>

In [47]:
get_similars(1, bpr)

['1245    Groundhog Day (1993)',
 '2789    American Beauty (1999)',
 '3045    Toy Story 2 (1999)',
 '1250    Back to the Future (1985)',
 '2327    Shakespeare in Love (1998)',
 '1959    Saving Private Ryan (1998)',
 '257    Star Wars: Episode IV - A New Hope (1977)',
 '108    Braveheart (1995)',
 '1192    Star Wars: Episode VI - Return of the Jedi (1983)',
 '2693    Sixth Sense, The (1999)']

In [48]:
get_recommendations(4, bpr)

['2789    American Beauty (1999)',
 '1192    Star Wars: Episode VI - Return of the Jedi (1983)',
 '589    Silence of the Lambs, The (1991)',
 '1178    Star Wars: Episode V - The Empire Strikes Back...',
 '585    Terminator 2: Judgment Day (1991)',
 '2502    Matrix, The (1999)',
 '2693    Sixth Sense, The (1999)',
 '108    Braveheart (1995)',
 '453    Fugitive, The (1993)',
 '1179    Princess Bride, The (1987)']

### Task 4. Without using ready-made solutions, implement WARP matrix factorization on implicit data
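
A minimal sketch of WARP (Weighted Approximate-Rank Pairwise), not the `modules.warp.WARP` class used below. For a positive item it keeps sampling negatives until one violates the margin, estimates the positive item's rank from the number of trials needed, and scales a hinge-loss update by a weight that grows with that rank. The function name `warp_fit`, the margin of 1, the harmonic rank weight, the `max(1, ...)` floor on the rank estimate, and the hyperparameters are assumptions for illustration.

```python
import numpy as np
import scipy.sparse as sp

def warp_fit(user_item, factors=8, lr=0.05, reg=0.01, n_iter=200,
             max_trials=50, seed=42):
    rng = np.random.default_rng(seed)
    X = sp.csr_matrix(user_item)
    n_users, n_items = X.shape
    U = rng.normal(scale=0.1, size=(n_users, factors))
    V = rng.normal(scale=0.1, size=(n_items, factors))
    for _ in range(n_iter):
        for u in rng.permutation(n_users):
            pos = X.indices[X.indptr[u]:X.indptr[u + 1]]
            if len(pos) == 0 or len(pos) == n_items:
                continue  # nothing to rank for this user
            i = rng.choice(pos)
            s_i = U[u] @ V[i]
            for trial in range(1, max_trials + 1):
                j = rng.integers(n_items)
                while j in pos:  # rejection-sample a negative item
                    j = rng.integers(n_items)
                if 1.0 + U[u] @ V[j] - s_i > 0:  # margin violated by item j
                    # Few trials needed means the positive item is ranked low.
                    rank_est = max(1, (n_items - 1) // trial)
                    w = np.sum(1.0 / np.arange(1, rank_est + 1))
                    u_old = U[u].copy()
                    U[u] += lr * (w * (V[i] - V[j]) - reg * U[u])
                    V[i] += lr * (w * u_old - reg * V[i])
                    V[j] += lr * (-w * u_old - reg * V[j])
                    break  # one update per sampled positive
    return U, V

# Toy data: two groups of users with disjoint tastes.
toy = np.array([[1, 1, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 1],
                [0, 0, 1, 1]])
U, V = warp_fit(toy)
scores = U @ V.T
print(np.round(scores, 2))
```

Unlike BPR, no update happens once every sampled negative is already a full margin below the positive item, so training effort concentrates on the pairs that are still ranked incorrectly.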

In [49]:
from modules.warp import WARP

In [58]:
warp = WARP(64, n_iter=30, random_state=42)

In [59]:
warp.fit(user_item_csr)

  0%|          | 0/30 [00:00<?, ?it/s]

Fitting:   0%|          | 0/6041 [00:00<?, ?it/s]

<modules.warp.WARP at 0x7fb02a0b05b0>

In [60]:
get_similars(1, warp)

['3045    Toy Story 2 (1999)',
 '1245    Groundhog Day (1993)',
 '1726    As Good As It Gets (1997)',
 '3352    Animal House (1978)',
 "2849    Ferris Bueller's Day Off (1986)",
 '584    Aladdin (1992)',
 '2728    Big (1988)',
 '1468    Grosse Pointe Blank (1997)',
 '1058    Willy Wonka and the Chocolate Factory (1971)',
 '3141    Fast Times at Ridgemont High (1982)']

In [61]:
get_recommendations(4, warp)

['1178    Star Wars: Episode V - The Empire Strikes Back...',
 '2502    Matrix, The (1999)',
 '1192    Star Wars: Episode VI - Return of the Jedi (1983)',
 '585    Terminator 2: Judgment Day (1991)',
 '108    Braveheart (1995)',
 '1568    Hunt for Red October, The (1990)',
 '1250    Back to the Future (1985)',
 '2847    Total Recall (1990)',
 '2559    Star Wars: Episode I - The Phantom Menace (1999)',
 '453    Fugitive, The (1993)']