# AutoRec cs3639 Recommendation Systems course IDC

### general explanations will go here

In [1]:
import numpy as np
import pandas as pd
import sklearn
import torch
from torch import nn

In [2]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)

Using device: cuda


## datasets

This project uses two datasets:
* **movielens**, which can be downloaded using `utils.datasets_download.py` or directly from [here](http://files.grouplens.org/datasets/movielens/).
* **netflixprize**, which can be downloaded as a [semi-parsed version from kaggle](https://www.kaggle.com/netflix-inc/netflix-prize-data) or as a [raw version](https://archive.org/download/nf_prize_dataset.tar).

**NOTE**: for the notebook to run properly, save each dataset in its own folder under `data`:
the movielens dataset in `data/movielens` and the netflixprize dataset in `data/netflix`.
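
The folder layout from the note can be checked before any loading code runs. The snippet below is a minimal sketch: the helper name `missing_dataset_dirs` is hypothetical and not part of this repository, and it assumes the notebook is run from the project root so that `data` resolves relative to the working directory.

```python
from pathlib import Path

# Folder names taken from the note above; DATA_ROOT is assumed to be
# relative to the directory the notebook is started from.
DATA_ROOT = Path("data")
EXPECTED_DIRS = ("movielens", "netflix")


def missing_dataset_dirs(root=DATA_ROOT, names=EXPECTED_DIRS):
    """Return the expected dataset folders that do not exist under root."""
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]


missing = missing_dataset_dirs()
if missing:
    print("Missing dataset folders:", ", ".join(str(DATA_ROOT / m) for m in missing))
else:
    print("All dataset folders found.")
```

Running this first gives a readable message instead of a file-not-found error raised from inside the loaders.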

In [3]:
from src.mf.model import MatrixFactorization
from src.mf.training import MFTrainer

In [4]:
from src.data_prep import movielens_load
train, test = movielens_load(1)
print(train.shape)
train

(80000, 4)


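| | user_id | item_id | rating | timestamp |
|---|---|---|---|---|
| 0 | 1 | 1 | 5 | 874965758 |
| 1 | 1 | 2 | 3 | 876893171 |
| 2 | 1 | 3 | 4 | 878542960 |
| 3 | 1 | 4 | 3 | 876893119 |
| 4 | 1 | 5 | 3 | 889751712 |
| ... | ... | ... | ... | ... |
| 79995 | 943 | 1067 | 2 | 875501756 |
| 79996 | 943 | 1074 | 4 | 888640250 |
| 79997 | 943 | 1188 | 3 | 888640250 |
| 79998 | 943 | 1228 | 3 | 888640275 |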


In [5]:
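# ids are 1-based, so the largest id in the training split is used as the count.
# Caveat: an id that appears only in the test split and exceeds these values would be out of range.
num_users = train.user_id.max()
num_items = train.item_id.max()
# k is the dimension of the latent user and item factors.
model = MatrixFactorization(num_users, num_items, k=80)
mf_trainer = MFTrainer(train, test, model, epochs=150, lr=0.002, reg=0.001, batch_size=128)
mf_trainer.train_model()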

EPOCH 1: Avg losses: train: 6.981, val: 5.174
EPOCH 2: Avg losses: train: 4.361, val: 3.969
EPOCH 3: Avg losses: train: 3.678, val: 3.689
EPOCH 4: Avg losses: train: 3.476, val: 3.502
EPOCH 5: Avg losses: train: 3.174, val: 3.088
EPOCH 6: Avg losses: train: 2.604, val: 2.525
EPOCH 7: Avg losses: train: 2.208, val: 2.311
EPOCH 8: Avg losses: train: 2.106, val: 2.267
EPOCH 9: Avg losses: train: 2.092, val: 2.268
EPOCH 10: Avg losses: train: 2.092, val: 2.262
EPOCH 11: Avg losses: train: 2.089, val: 2.265
EPOCH 12: Avg losses: train: 2.094, val: 2.261
EPOCH 13: Avg losses: train: 2.088, val: 2.274
EPOCH 14: Avg losses: train: 2.093, val: 2.270
EPOCH 15: Avg losses: train: 2.094, val: 2.266
EPOCH 16: Avg losses: train: 2.095, val: 2.264
EPOCH 17: Avg losses: train: 2.092, val: 2.264
EPOCH 18: Avg losses: train: 2.092, val: 2.270
EPOCH 19: Avg losses: train: 2.093, val: 2.263
EPOCH 20: Avg losses: train: 2.094, val: 2.253
EPOCH 21: Avg losses: train: 2.088, val: 2.266
EPOCH 22: Avg losses: 

KeyboardInterrupt: 
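
The code of `MatrixFactorization` in `src.mf.model` is not shown in this notebook. As a rough illustration of what such a model could look like, here is a minimal sketch of biased matrix factorization in PyTorch. The class name `MFSketch`, the bias terms, and the `+ 1` padding for 1-based ids are all assumptions made for this example, not the repository's implementation. The toy sizes are those of MovieLens 100K (943 users, 1682 items).

```python
import torch
from torch import nn


class MFSketch(nn.Module):
    """Hypothetical biased matrix factorization; not the code in src.mf.model."""

    def __init__(self, num_users, num_items, k=80):
        super().__init__()
        # +1 because MovieLens ids start at 1, so row 0 is simply left unused.
        self.user_factors = nn.Embedding(num_users + 1, k)
        self.item_factors = nn.Embedding(num_items + 1, k)
        self.user_bias = nn.Embedding(num_users + 1, 1)
        self.item_bias = nn.Embedding(num_items + 1, 1)

    def forward(self, user_ids, item_ids):
        # Predicted rating = <p_u, q_i> + b_u + b_i
        dot = (self.user_factors(user_ids) * self.item_factors(item_ids)).sum(dim=1)
        return dot + self.user_bias(user_ids).squeeze(1) + self.item_bias(item_ids).squeeze(1)


# Toy usage with three (user, item) pairs.
sketch = MFSketch(num_users=943, num_items=1682, k=8)
users = torch.tensor([1, 1, 943])
items = torch.tensor([1, 2, 1228])
preds = sketch(users, items)
print(preds.shape)
```

Each prediction is the dot product of a user vector and an item vector plus two scalar biases, so one forward pass over a batch of `(user_id, item_id)` pairs returns one predicted rating per pair.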

In [1]:
from src.data_prep import movielens_create_ratings
train, test = movielens_create_ratings(1)
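
`movielens_create_ratings` is defined in `src.data_prep` and is not shown here. Since the next cells read `train.shape` as `(num_users, num_items)`, it presumably returns dense user-by-item rating matrices. The sketch below shows one way such a matrix could be built from the `user_id`, `item_id`, `rating` columns seen earlier. The function name `to_rating_matrix` and the use of `0` for unobserved ratings are assumptions for illustration only.

```python
import numpy as np
import pandas as pd


def to_rating_matrix(df, num_users, num_items):
    """Dense (num_users, num_items) matrix; 0 marks an unobserved rating."""
    matrix = np.zeros((num_users, num_items), dtype=np.float32)
    # ids are 1-based in MovieLens, hence the -1 shift to array indices.
    rows = df["user_id"].to_numpy() - 1
    cols = df["item_id"].to_numpy() - 1
    matrix[rows, cols] = df["rating"].to_numpy()
    return matrix


# Toy data: two users, three items, three observed ratings.
toy = pd.DataFrame({"user_id": [1, 1, 2], "item_id": [1, 3, 2], "rating": [5, 3, 4]})
ratings = to_rating_matrix(toy, num_users=2, num_items=3)
print(ratings)
```

Using `0` as the missing-value marker works here because MovieLens ratings start at 1, so an observed rating is never confused with a missing one.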

In [2]:
from src.autorec.model import AutoRec
from src.autorec.training import AutoRecTrainer

In [3]:
num_users, num_items = train.shape
model = AutoRec(num_hidden=512, num_features=num_users)
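
Passing `num_features=num_users` suggests that each input vector holds one item's ratings across all users, i.e. the item-based variant of AutoRec. The `AutoRec` class in `src.autorec.model` is not shown in this notebook, so the following is only a sketch under assumptions: one sigmoid hidden layer, a linear decoder, and a loss computed on observed entries only, with `0` standing for a missing rating. The names `AutoRecSketch` and `masked_mse` are hypothetical.

```python
import torch
from torch import nn


class AutoRecSketch(nn.Module):
    """Hypothetical one-hidden-layer autoencoder; not the code in src.autorec.model."""

    def __init__(self, num_hidden, num_features):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(num_features, num_hidden), nn.Sigmoid())
        self.decoder = nn.Linear(num_hidden, num_features)

    def forward(self, x):
        # Reconstruct the full rating vector from its hidden representation.
        return self.decoder(self.encoder(x))


def masked_mse(pred, target):
    """Mean squared error over observed entries only (assumes 0 = missing rating)."""
    mask = (target > 0).float()
    return ((pred - target) ** 2 * mask).sum() / mask.sum().clamp(min=1)


# Toy batch: two rating vectors with four features each, half of them unobserved.
batch = torch.tensor([[5.0, 0.0, 3.0, 0.0], [0.0, 4.0, 0.0, 1.0]])
net = AutoRecSketch(num_hidden=8, num_features=4)
loss = masked_mse(net(batch), batch)
loss.backward()  # gradients come only from the observed entries
print(loss.item())
```

Masking the loss keeps the network from being penalized for its predictions on unobserved ratings. Those predictions are what the model is later used for.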

In [4]:
autorec_trainer = AutoRecTrainer(train, test, model, epochs=100, batch_size=64, lr=0.001, reg=0.001)
autorec_trainer.train_model()

EPOCH 1: Avg losses: train: 1.631, val: 1.096
EPOCH 2: Avg losses: train: 1.018, val: 0.982
EPOCH 3: Avg losses: train: 0.951, val: 0.971
EPOCH 4: Avg losses: train: 0.918, val: 0.926
EPOCH 5: Avg losses: train: 0.894, val: 0.919
EPOCH 6: Avg losses: train: 0.871, val: 0.899
EPOCH 7: Avg losses: train: 0.858, val: 0.888
EPOCH 8: Avg losses: train: 0.842, val: 0.887
EPOCH 9: Avg losses: train: 0.830, val: 0.867
EPOCH 10: Avg losses: train: 0.816, val: 0.860
EPOCH 11: Avg losses: train: 0.806, val: 0.861
EPOCH 12: Avg losses: train: 0.795, val: 0.850
EPOCH 13: Avg losses: train: 0.785, val: 0.850
EPOCH 14: Avg losses: train: 0.783, val: 0.848
EPOCH 15: Avg losses: train: 0.775, val: 0.857
EPOCH 16: Avg losses: train: 0.765, val: 0.834
EPOCH 17: Avg losses: train: 0.755, val: 0.833
EPOCH 18: Avg losses: train: 0.749, val: 0.812
EPOCH 19: Avg losses: train: 0.741, val: 0.818
EPOCH 20: Avg losses: train: 0.733, val: 0.826
EPOCH 21: Avg losses: train: 0.725, val: 0.824
EPOCH 22: Avg losses: 