## Recommender system

### Env Setup (Using conda)
```bash
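conda create -n recsys python=3.9
conda activate recsys
# pickle ships with Python, so only these two packages are needed
conda install -c conda-forge scikit-surprise pandas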
```

### Env config
OS: Windows 10 64-bit

Python: 3.9.12

### Import libraries

In [None]:
import pandas as pd
import pickle
from surprise import SVD
from surprise import Reader, Dataset
from surprise.model_selection import cross_validate

### Import training data

In [None]:
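# The ratings file has no header row; each line is "user,item,rating".
rating = pd.read_csv("./training.txt", names=["user", "item", "rating"], sep=",", header=None)
# Surprise expects the columns in user, item, rating order.
reader = Reader(rating_scale=(0, 5))
training = Dataset.load_from_df(rating, reader)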

### 10-fold cross-validation

In [10]:
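# Small model for a quick check: 10 latent factors, 5 SGD epochs.
algo = SVD(n_epochs=5, n_factors=10, verbose=True)
# cv=10 runs 10-fold cross-validation; n_jobs=-1 uses all available CPU cores.
cross_validate(algo, training, measures=["RMSE"], cv=10, n_jobs=-1, verbose=True)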

Evaluating RMSE of algorithm SVD on 10 split(s).

                  Fold 1  Fold 2  Fold 3  Fold 4  Fold 5  Fold 6  Fold 7  Fold 8  Fold 9  Fold 10 Mean    Std     
RMSE (testset)    0.8482  0.8494  0.8478  0.8497  0.8495  0.8488  0.8493  0.8473  0.8491  0.8498  0.8489  0.0008  
Fit time          190.27  186.92  186.08  190.58  189.19  187.12  188.84  190.21  188.64  184.29  188.22  1.95    
Test time         33.54   33.26   33.59   33.11   33.23   32.44   32.32   31.66   31.65   30.97   32.58   0.87    


{'test_rmse': array([0.84823811, 0.84943687, 0.84783578, 0.84968161, 0.84951116,
        0.84877141, 0.84926796, 0.84732361, 0.84908188, 0.84975719]),
 'fit_time': (190.2735013961792,
  186.9219970703125,
  186.08299946784973,
  190.5819993019104,
  189.19499969482422,
  187.11599445343018,
  188.8420009613037,
  190.21349906921387,
  188.64299893379211,
  184.29099798202515),
 'test_time': (33.54449820518494,
  33.25850057601929,
  33.59200072288513,
  33.11400079727173,
  33.23400139808655,
  32.43899965286255,
  32.32249903678894,
  31.660500526428223,
  31.649001836776733,
  30.96950054168701)}
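
The table above reports one RMSE per held-out fold plus their mean and standard deviation. As a rough illustration of what that involves, here is a minimal pure-Python sketch of k-fold splitting and RMSE scoring. It is not Surprise's implementation: the helper names (`rmse`, `k_fold_indices`), the toy ratings, and the global-mean baseline used in place of SVD are all assumptions made for the example.

```python
import math
import random


def rmse(pairs):
    """Root mean squared error over (true, predicted) pairs."""
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) for k shuffled, near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    for fold in range(k):
        test = idx[fold::k]  # every k-th shuffled index, starting at `fold`
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        yield train, test


# Toy ratings (hypothetical), scored with a global-mean baseline.
ratings = [4.0, 3.0, 5.0, 2.0, 4.0, 1.0, 3.0, 5.0, 4.0, 2.0]
fold_scores = []
for train, test in k_fold_indices(len(ratings), k=5):
    mean = sum(ratings[i] for i in train) / len(train)
    fold_scores.append(rmse([(ratings[i], mean) for i in test]))

print(len(fold_scores), round(sum(fold_scores) / len(fold_scores), 4))
```

Each rating is held out exactly once, so the mean over folds uses every observation as test data while never scoring a rating the model was fitted on.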

### Train on the full training set

In [17]:
# Use a new name so `training` stays a Dataset and this cell can be re-run safely.
trainset = training.build_full_trainset()
algo.fit(trainset)

Processing epoch 0
Processing epoch 1
Processing epoch 2
Processing epoch 3
Processing epoch 4


<surprise.prediction_algorithms.matrix_factorization.SVD at 0x19baea3c520>
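
Each "Processing epoch" line above corresponds to one pass of stochastic gradient descent over all ratings. The sketch below shows the general idea of a biased matrix-factorization model, where a prediction is the global mean plus a user bias, an item bias, and the dot product of user and item factor vectors. It is a simplified pure-Python illustration, not Surprise's code: the function name `train_biased_mf`, the toy ratings, and the learning rate and regularization values are assumptions chosen for the example.

```python
import math
import random


def train_biased_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit r_hat(u, i) = mu + b_u + b_i + p_u . q_i with plain SGD."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    bu = {u: 0.0 for u in users}
    bi = {i: 0.0 for i in items}
    # Small random factors so the dot product starts near zero.
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        return mu + bu[u] + bi[i] + sum(a * b for a, b in zip(p[u], q[i]))

    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]  # use the old values for both updates
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return mu, predict


# Toy (user, item, rating) triples, hypothetical.
ratings = [("a", "x", 5), ("a", "y", 3), ("b", "x", 4),
           ("b", "z", 1), ("c", "y", 2), ("c", "z", 1)]
mu, predict = train_biased_mf(ratings)


def train_rmse(pred_fn):
    """RMSE of pred_fn on the toy training triples."""
    return math.sqrt(sum((r - pred_fn(u, i)) ** 2 for u, i, r in ratings) / len(ratings))


baseline_rmse = train_rmse(lambda u, i: mu)  # always predict the global mean
trained_rmse = train_rmse(predict)
print(f"baseline {baseline_rmse:.3f} -> trained {trained_rmse:.3f}")
```

The comparison against the global mean is measured on the training data only, so it shows that the updates reduce error, not that the model generalizes.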

### Save the model in local storage

In [18]:
# Serialize the fitted model to disk.
with open("model.sav", "wb") as f:
    pickle.dump(algo, f)
# Reload it to confirm the file is usable for prediction.
with open("model.sav", "rb") as f:
    model = pickle.load(f)
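
A quick way to confirm that a pickle round trip preserves an object is to compare the reloaded copy with the original. The sketch below does this in a temporary directory with a plain dictionary standing in for the fitted model; `model_stub` and its contents are hypothetical, and passing `protocol=pickle.HIGHEST_PROTOCOL` is an optional choice rather than something the notebook does.

```python
import os
import pickle
import tempfile

# Stand-in for the fitted model (hypothetical); any picklable object works.
model_stub = {"n_factors": 10, "n_epochs": 5}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.sav")
    with open(path, "wb") as f:
        pickle.dump(model_stub, f, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The reloaded object is an equal but separate copy.
print(restored == model_stub, restored is model_stub)
```

Unpickling can execute arbitrary code, so only load model files that come from a trusted source.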

### Import testing data and predict the rating

In [23]:
testing = pd.read_csv("./testing.txt", names=["user", "item", "rating"], header=None)

# Predict a rating for each (user, item) pair; .est is the estimated rating.
predicted_ratings = [
    model.predict(user, item).est
    for user, item in zip(testing["user"], testing["item"])
]

### Save predictions to the prediction file

In [24]:
testing["rating"] = predicted_ratings
testing.to_csv("./prediction.txt", index=False, header=False)
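
Before submitting the prediction file, it can help to check that every row has three fields and a rating inside the expected range. The following is a minimal sketch using only the standard library. The helper `check_prediction_file` is hypothetical, the sample rows are made up, and the default bounds assume the `(0, 5)` rating scale used when loading the training data.

```python
import csv
import os
import tempfile


def check_prediction_file(path, low=0.0, high=5.0):
    """Return the number of rows; raise ValueError on a malformed row."""
    n_rows = 0
    with open(path, newline="") as f:
        for line_no, row in enumerate(csv.reader(f), start=1):
            if len(row) != 3:
                raise ValueError(f"line {line_no}: expected 3 fields, got {len(row)}")
            rating = float(row[2])
            if not low <= rating <= high:
                raise ValueError(f"line {line_no}: rating {rating} outside [{low}, {high}]")
            n_rows += 1
    return n_rows


# Demo on a throwaway file with two made-up rows (user, item, rating).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "prediction.txt")
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows([[1, 10, 3.7], [2, 11, 4.2]])
    n_checked = check_prediction_file(path)

print(n_checked)
```

In the notebook, the same check could be pointed at `./prediction.txt`, and the returned count compared with `len(testing)`.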