<i>Copyright (c) Microsoft Corporation. All rights reserved.</i>

<i>Licensed under the MIT License.</i>

# xDeepFM: the eXtreme Deep Factorization Machine
This notebook gives a quick example of how to train an xDeepFM model.
xDeepFM \[1\] is a deep learning-based model that aims to capture both lower- and higher-order feature interactions for precise recommender systems. It can therefore learn feature interactions more effectively, and the manual feature engineering effort can be substantially reduced. To summarize, xDeepFM has the following key properties:
* It contains a component, named CIN (Compressed Interaction Network), that learns feature interactions explicitly and at the vector-wise level;
* It contains a traditional DNN component that learns feature interactions implicitly and at the bit-wise level;
* The implementation makes this model quite configurable. We can enable different subsets of components by setting hyperparameters such as `use_Linear_part`, `use_FM_part`, `use_CIN_part`, and `use_DNN_part`. For example, by enabling only `use_Linear_part` and `use_FM_part`, we get a classical FM model.
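
To make the vector-wise versus bit-wise distinction concrete, here is a minimal pure-Python sketch of a single CIN layer, following the layer definition in \[1\]: each output row is a weighted sum of element-wise (Hadamard) products between rows of the previous layer and rows of the field embedding matrix. The function name `cin_layer` and the toy inputs are illustrative assumptions; this is not the `reco_utils` implementation, which runs in TensorFlow.

```python
def cin_layer(x_prev, x0, weights):
    """One CIN layer (illustrative sketch, not the library code).

    x_prev:  H_prev rows of length D (output of the previous layer)
    x0:      m rows of length D (one embedding vector per field)
    weights: H_next matrices, each of shape H_prev x m
    Returns H_next rows of length D.
    """
    dim = len(x0[0])
    out = []
    for w in weights:
        row = [0.0] * dim
        for i, prev_vec in enumerate(x_prev):
            for j, field_vec in enumerate(x0):
                # A single scalar weight scales the whole Hadamard product,
                # so each embedding vector is treated as a unit (vector-wise).
                for d in range(dim):
                    row[d] += w[i][j] * prev_vec[d] * field_vec[d]
        out.append(row)
    return out


# Toy example: m = 2 fields, embedding size D = 2, one output feature map.
x0 = [[1.0, 2.0], [3.0, 4.0]]
all_ones = [[[1.0, 1.0], [1.0, 1.0]]]
first_layer = cin_layer(x0, x0, all_ones)  # the first layer interacts x0 with itself
print(first_layer)
```

A plain DNN, by contrast, would flatten the embeddings and let every individual coordinate interact with every other one, which is what "bit-wise" refers to.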


## Global Settings and Imports

In [1]:
import os
import sys
sys.path.append("../../")
import papermill as pm
import tensorflow as tf

from reco_utils.recommender.deeprec.deeprec_utils import *
from reco_utils.recommender.deeprec.models.xDeepFM import *
from reco_utils.recommender.deeprec.IO.iterator import *

print("System version: {}".format(sys.version))
print("Tensorflow version: {}".format(tf.__version__))



System version: 3.5.5 |Anaconda custom (64-bit)| (default, May 13 2018, 21:12:35) 
[GCC 7.2.0]
Tensorflow version: 1.10.1


### Parameters

In [2]:
#EPOCHS_FOR_SYNTHETIC_RUN = 15
#EPOCHS_FOR_CRITEO_RUN = 30
#BATCH_SIZE_SYNTHETIC = 128
#BATCH_SIZE_CRITEO = 4096

## Download data
xDeepFM uses the FFM format as data input: `<label> <field_id>:<feature_id>:<feature_value>`  
Each line represents an instance; `<label>` is a binary value, with 1 denoting a positive instance and 0 a negative instance.  
Features are divided into fields. For example, a user's gender is one field with three possible values: male, female, and unknown. Occupation can be another field, with many more possible values than the gender field. Both the field index and the feature index start from 1. <br>
Now let's start with the MovieLens dataset.
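
Before moving on to pre-processing, the FFM line format described above can be made concrete with a small parser. This is only a sketch: `parse_ffm_line` is a hypothetical helper, not part of `reco_utils`, and the sample line is made up for illustration.

```python
def parse_ffm_line(line):
    """Parse '<label> <field_id>:<feature_id>:<feature_value> ...'.

    Returns (label, [(field_id, feature_id, feature_value), ...]).
    """
    tokens = line.split()
    label = float(tokens[0])
    features = []
    for token in tokens[1:]:
        field_id, feature_id, value = token.split(":")
        features.append((int(field_id), int(feature_id), float(value)))
    return label, features


# Hypothetical instance: a positive label with one feature in field 1
# and one feature in field 2.
label, features = parse_ffm_line("1 1:3:1.0 2:17:0.5")
print(label, features)
```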

In [3]:
# Pre-process the MovieLens dataset. Please go to the ../../tests/resources/deeprec/movielens folder.
# ML-100K2Libffm.py loads the user rating data together with the movie genre data
# and transforms them into the libffm format <field_id>:<feature_id>:<feature_value>.

#wget http://files.grouplens.org/datasets/movielens/ml-100k.zip
#unzip ml-100k.zip
#python ML-100K2Libffm.py

In [4]:
data_path = '../../tests/resources/deeprec/movielens'
yaml_file = os.path.join(data_path, r'network_xdeepFM.yaml')
# Alternative files for the regression task:
#train_file = os.path.join(data_path, r'ua.base.regression.final')
#valid_file = os.path.join(data_path, r'ua.test.regression.final')
#test_file = os.path.join(data_path, r'ua.test.regression.final')

# The following files are for the classification task:
train_file = os.path.join(data_path, r'ua.base.classification.final')
valid_file = os.path.join(data_path, r'ua.test.classification.final')
#test_file = os.path.join(data_path, r'ua.test.classification.final')
test_file = os.path.join(data_path, r'ua.test.classification_topN.final')

output_file = os.path.join(data_path, r'output.txt')

#if not os.path.exists(yaml_file):
#    download_deeprec_resources(r'https://recodatasets.blob.core.windows.net/deeprec/', data_path, 'xdeepfmresources.zip')


## Create hyper-parameters
`prepare_hparams()` creates a full set of hyper-parameters for model training, such as the learning rate, feature count, and dropout ratio. We can put those parameters in a YAML file, or pass them as arguments to the function (in which case they override the YAML settings).
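
The precedence rule can be illustrated with a small stand-alone sketch. The helper `merge_hparams` below is hypothetical and only mimics the described behaviour (explicit arguments win over file settings); it is not how `prepare_hparams()` is implemented. The sample values mirror those printed in the output of the next cell.

```python
def merge_hparams(yaml_settings, **overrides):
    """Return a new dict in which keyword overrides take precedence
    over the settings loaded from the YAML file."""
    merged = dict(yaml_settings)  # copy, so the file settings stay untouched
    merged.update(overrides)
    return merged


# Settings as they might be read from the YAML file.
yaml_settings = {"learning_rate": 0.001, "epochs": 10, "batch_size": 128}

# Overriding a single value leaves the others as defined in the file.
merged = merge_hparams(yaml_settings, epochs=15)
print(merged)
```

With the real function, the equivalent would presumably be to pass the override alongside `yaml_file`, assuming the hyper-parameter names match those printed below.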

In [5]:
hparams = prepare_hparams(yaml_file)
print(hparams)

[('DNN_FIELD_NUM', None), ('FEATURE_COUNT', 213), ('FIELD_COUNT', 5), ('MODEL_DIR', None), ('PAIR_NUM', None), ('SUMMARIES_DIR', None), ('activation', ['relu', 'relu']), ('attention_activation', None), ('attention_dropout', 0.0), ('attention_layer_sizes', None), ('batch_size', 128), ('cross_activation', 'identity'), ('cross_l1', 0.0), ('cross_l2', 0.0), ('cross_layer_sizes', [100, 100, 50]), ('cross_layers', None), ('data_format', 'ffm'), ('dim', 10), ('doc_size', None), ('dropout', [0.0, 0.0]), ('dtype', 32), ('embed_l1', 0.0), ('embed_l2', 0.0), ('enable_BN', False), ('entityEmb_file', None), ('entity_dim', None), ('entity_embedding_method', None), ('entity_size', None), ('epochs', 10), ('fast_CIN_d', 0), ('filter_sizes', None), ('init_method', 'tnormal'), ('init_value', 0.01), ('is_clip_norm', 0), ('iterator_type', None), ('kg_file', None), ('kg_training_interval', 5), ('layer_l1', 0.0), ('layer_l2', 0.0), ('layer_sizes', [400, 400]), ('learning_rate', 0.001), ('load_model_name', No

## Create data loader
Designate a data iterator for the model. xDeepFM uses `FFMTextIterator`.

In [6]:
input_creator = FFMTextIterator

## Create model
Once both the hyper-parameters and the data iterator are ready, we can create a model:

In [7]:
model = XDeepFMModel(hparams, input_creator)

## sometimes we don't want to train a model from scratch
## then we can load a pre-trained model like this: 
#model.load_model(r'your_model_path')

Add CIN part.

Now let's see what the model's performance is at this point (before any training):

### Train the model, evaluating on validation data

In [8]:
model.fit(train_file, valid_file)

step 20 , total_loss: 0.6903, data_loss: 0.6903
step 40 , total_loss: 0.6876, data_loss: 0.6876
step 60 , total_loss: 0.6820, data_loss: 0.6820
step 80 , total_loss: 0.7653, data_loss: 0.7653
step 100 , total_loss: 0.6709, data_loss: 0.6709
step 120 , total_loss: 0.6529, data_loss: 0.6529
step 140 , total_loss: 0.6902, data_loss: 0.6902
step 160 , total_loss: 0.6162, data_loss: 0.6162
step 180 , total_loss: 0.6748, data_loss: 0.6748
step 200 , total_loss: 0.7213, data_loss: 0.7213
step 220 , total_loss: 0.6183, data_loss: 0.6183
step 240 , total_loss: 0.6238, data_loss: 0.6238
step 260 , total_loss: 0.6671, data_loss: 0.6671
step 280 , total_loss: 0.5815, data_loss: 0.5815
step 300 , total_loss: 0.6722, data_loss: 0.6722
step 320 , total_loss: 0.7872, data_loss: 0.7872
step 340 , total_loss: 0.6791, data_loss: 0.6791
step 360 , total_loss: 0.6202, data_loss: 0.6202
step 380 , total_loss: 0.7010, data_loss: 0.7010
step 400 , total_loss: 0.6599, data_loss: 0.6599
step 420 , total_loss: 0

step 80 , total_loss: 0.6821, data_loss: 0.6821
step 100 , total_loss: 0.6905, data_loss: 0.6905
step 120 , total_loss: 0.6283, data_loss: 0.6283
step 140 , total_loss: 0.7133, data_loss: 0.7133
step 160 , total_loss: 0.6490, data_loss: 0.6490
step 180 , total_loss: 0.6986, data_loss: 0.6986
step 200 , total_loss: 0.7408, data_loss: 0.7408
step 220 , total_loss: 0.5873, data_loss: 0.5873
step 240 , total_loss: 0.5347, data_loss: 0.5347
step 260 , total_loss: 0.6016, data_loss: 0.6016
step 280 , total_loss: 0.5482, data_loss: 0.5482
step 300 , total_loss: 0.6572, data_loss: 0.6572
step 320 , total_loss: 0.5996, data_loss: 0.5996
step 340 , total_loss: 0.6610, data_loss: 0.6610
step 360 , total_loss: 0.5230, data_loss: 0.5230
step 380 , total_loss: 0.6623, data_loss: 0.6623
step 400 , total_loss: 0.6724, data_loss: 0.6724
step 420 , total_loss: 0.7089, data_loss: 0.7089
step 440 , total_loss: 0.7482, data_loss: 0.7482
step 460 , total_loss: 0.6120, data_loss: 0.6120
step 480 , total_loss

step 140 , total_loss: 0.6729, data_loss: 0.6729
step 160 , total_loss: 0.6329, data_loss: 0.6329
step 180 , total_loss: 0.6859, data_loss: 0.6859
step 200 , total_loss: 0.7361, data_loss: 0.7361
step 220 , total_loss: 0.5721, data_loss: 0.5721
step 240 , total_loss: 0.5464, data_loss: 0.5464
step 260 , total_loss: 0.5985, data_loss: 0.5985
step 280 , total_loss: 0.5458, data_loss: 0.5458
step 300 , total_loss: 0.6616, data_loss: 0.6616
step 320 , total_loss: 0.4879, data_loss: 0.4879
step 340 , total_loss: 0.6758, data_loss: 0.6758
step 360 , total_loss: 0.5256, data_loss: 0.5256
step 380 , total_loss: 0.6367, data_loss: 0.6367
step 400 , total_loss: 0.6630, data_loss: 0.6630
step 420 , total_loss: 0.7034, data_loss: 0.7034
step 440 , total_loss: 0.7279, data_loss: 0.7279
step 460 , total_loss: 0.6288, data_loss: 0.6288
step 480 , total_loss: 0.6498, data_loss: 0.6498
step 500 , total_loss: 0.5864, data_loss: 0.5864
step 520 , total_loss: 0.5195, data_loss: 0.5195
step 540 , total_los

<reco_utils.recommender.deeprec.models.xDeepFM.XDeepFMModel at 0x7fc81978e748>

### Evaluate model on test data

In [9]:
res_syn = model.run_eval(test_file)
print(res_syn)
pm.record("res_syn", res_syn)

{'exp_var': -6.733170509338379, 'mae': 0.55921715, 'logloss': 0.9299, 'rmse': 0.5893216, 'rsquare': -66.95369215530899, 'auc': 0.5113}
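
The `auc` and `logloss` entries above are standard binary-classification metrics. As a rough illustration of what they measure, the sketch below computes both from their textbook definitions in pure Python. The function names are illustrative assumptions, and the library may compute these values differently; the toy labels and scores are made up.

```python
import math


def log_loss(labels, preds, eps=1e-12):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(labels, preds):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def auc(labels, preds):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(labels, preds) if y == 1]
    neg = [p for y, p in zip(labels, preds) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: two negatives and two positives.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores), log_loss(toy_labels, toy_scores))
```

Under these definitions, an AUC of 0.5 corresponds to a random ordering of positives and negatives, and a constant prediction of 0.5 gives a log loss of ln 2 (about 0.693).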




### Evaluation of top-N recommendation

In [None]:
res_syn = model.run_eval_topN(test_file, hparams)

## References
\[1\] Lian, J., Zhou, X., Zhang, F., Chen, Z., Xie, X., & Sun, G. (2018). xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2018, London, UK, August 19-23, 2018.<br>
\[2\] The Criteo datasets: http://labs.criteo.com/category/dataset/.