# LibFM - Cross Validation

Here we rewrite the methods used for the libFM runs so that they are compatible with the cross-validation used in the rest of the project. We import them from the .py scripts and run them as an example.

Usual imports first.

In [None]:
import pywFM
import numpy as np
import pandas as pd

from helpers import df_load
from cv import cross_validationALSBias_demo, cross_validationMCMC_demo
from predictionAlgorithms import ALSBias_pywFM, MCMC_pywFM

## 1. Run ALS with Bias

In [None]:
cross_validationALSBias_demo()

In [None]:
ALSBias_pywFM(df_load("data_train.csv"), df_load("sampleSubmission.csv"))

## 2. Run MCMC

In [None]:
cross_validationMCMC_demo()

In [None]:
MCMC_pywFM(df_load("data_train.csv"), df_load("sampleSubmission.csv"), num_iter=5)

## 3. Logs of the runs

```
features_te, target_te, features_tr, target_tr = df_to_sparse_split(df_load("data_train.csv"), 0.1)
best_error, best_std, best_rank, best_r0, best_r1, best_r2 = ALS_CV(features_tr, target_tr, features_te, target_te,
        num_iter=40, std_init_vec=[0.01, 0.05, 0.1], rank_vec=[7, 9], r0_reg_vec=[0.5, 2], r1_reg_vec=[0.5, 2], r2_reg_vec=[0.5, 2])
```

Error = 0.997084 (for ALS with 40 iterations, std_init = 0.1, k = 7, r0_reg = 0.5, r1_reg = 2, r2_reg = 2)
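
The hold-out split that precedes each run in these logs can be pictured with a minimal sketch. It assumes that the second argument of `df_to_sparse_split` (0.1 here) is the fraction of ratings held out for testing; the helper's actual implementation is not shown in this notebook. `holdout_split` and the toy ratings are hypothetical names used only for illustration, and the sparse feature encoding suggested by the helper's name is not reproduced.

```python
import random


def holdout_split(ratings, test_fraction, seed=0):
    """Shuffle (user, item, rating) triples and hold out a fraction for testing.

    Hypothetical stand-in for the splitting step of df_to_sparse_split;
    it works on plain tuples and does not build sparse features.
    """
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(ratings)       # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[:n_test], shuffled[n_test:]   # (test, train)


# Toy data: 20 made-up (user, item, rating) triples.
toy_ratings = [(u, i, 1 + (u + i) % 5) for u in range(4) for i in range(5)]
test_set, train_set = holdout_split(toy_ratings, 0.1)
print(len(test_set), len(train_set))
```

Because the errors in these logs each come from a single split of this kind, small differences between runs may partly reflect which ratings happened to land in the test set.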


```
features_te, target_te, features_tr, target_tr = df_to_sparse_split(df_load("data_train.csv"), 0.1)
best_error, best_std, best_rank, best_r0, best_r1, best_r2 = ALS_CV(features_tr, target_tr, features_te, target_te,
        num_iter=40, std_init_vec=[0.5, 1], rank_vec=[7, 8], r0_reg_vec=[0.5, 2], r1_reg_vec=[2, 3], r2_reg_vec=[2, 3])
```

Error = 0.993506 (for ALS with 40 iterations, std_init = 0.5, k = 7, r0_reg = 0.5, r1_reg = 3, r2_reg = 3)

```
best_error, best_std, best_rank, best_r0, best_r1, best_r2 = ALS_CV(features_tr, target_tr, features_te, target_te,
        num_iter=40, std_init_vec=[0.375, 0.43], rank_vec=[7], r0_reg_vec=[0.5], r1_reg_vec=[15, 20], r2_reg_vec=[20, 25])
```

Error = 0.983401 (for ALS with 40 iterations, std_init = 0.43, k = 7, r0_reg = 0.5, r1_reg = 15, r2_reg = 25)
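
The `ALS_CV` calls above each take a list of candidate values per hyperparameter and return the best combination. A minimal sketch of that kind of search follows, assuming `ALS_CV` performs an exhaustive grid search (the `_vec` arguments and `best_*` return values suggest this, but its code is not shown here). `grid_search` and `toy_error` are hypothetical; in a real run the evaluation function would train libFM and return the test error.

```python
from itertools import product


def grid_search(evaluate, **param_grid):
    """Try every combination of the candidate values and keep the lowest error."""
    names = list(param_grid)
    best_error, best_params = float("inf"), None
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        error = evaluate(**params)
        if error < best_error:
            best_error, best_params = error, params
    return best_error, best_params


def toy_error(std_init, rank, r1_reg):
    """Made-up error surface with its minimum at std_init=0.5, rank=7, r1_reg=3."""
    return (std_init - 0.5) ** 2 + (rank - 7) ** 2 + (r1_reg - 3) ** 2


best_error, best_params = grid_search(
    toy_error, std_init=[0.5, 1], rank=[7, 8], r1_reg=[2, 3]
)
print(best_error, best_params)
```

The number of model fits is the product of the list lengths, which is one reason to narrow the candidate lists from one run to the next, as the three logged runs do.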

```
features_te, target_te, features_tr, target_tr = df_to_sparse_split(df_load("data_train.csv"), 0.1)
MCMC_CV(features_tr, target_tr, features_te, target_te, num_iter=60, std_init_vec=[0.5, 0.8, 1, 1.5, 2, 5])
```

- Error = 0.979054 (for MCMC with 60 iterations, std_init = 0.5)
- Error = 0.979325 (for MCMC with 60 iterations, std_init = 0.8)
- Error = 0.978956 (for MCMC with 60 iterations, std_init = 1)
- Error = 0.979788 (for MCMC with 60 iterations, std_init = 1.5)
- Error = 0.986542 (for MCMC with 60 iterations, std_init = 2)
- Error = 1.01712 (for MCMC with 60 iterations, std_init = 5)
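
To read off the best setting from the MCMC log above, the logged pairs can be compared directly. The snippet below only copies the six values from the log; `mcmc_log` is a name introduced here for illustration.

```python
# (std_init, error) pairs copied from the MCMC log above.
mcmc_log = [
    (0.5, 0.979054),
    (0.8, 0.979325),
    (1, 0.978956),
    (1.5, 0.979788),
    (2, 0.986542),
    (5, 1.01712),
]

# Pick the pair with the smallest logged error.
best_std_init, best_error = min(mcmc_log, key=lambda pair: pair[1])
print(best_std_init, best_error)  # 1 0.978956
```

The lowest logged error is at std_init = 1, but the four settings from 0.5 to 1.5 differ by less than 0.001, so on a single split they are close to interchangeable. All four are below the best ALS error logged above (0.983401).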
