<i>Copyright (c) Microsoft Corporation. All rights reserved.</i>

<i>Licensed under the MIT License.</i>

# xDeepFM: the eXtreme Deep Factorization Machine
This notebook will give you a quick example of how to train an xDeepFM model. 
xDeepFM \[1\] is a deep learning-based model that aims to capture both lower- and higher-order feature interactions for precise recommender systems. Because it learns feature interactions effectively, it can substantially reduce manual feature engineering effort. To summarize, xDeepFM has the following key properties:
* It contains a component, named CIN (Compressed Interaction Network), that learns feature interactions explicitly and at the vector-wise level.
* It contains a traditional DNN component that learns feature interactions implicitly and at the bit-wise level.
* The implementation makes this model quite configurable. We can enable different subsets of components by setting hyperparameters like `use_Linear_part`, `use_FM_part`, `use_CIN_part`, and `use_DNN_part`. For example, by enabling only the `use_Linear_part` and `use_FM_part`, we can get a classical FM model.
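
To make "explicit, vector-wise" interactions more concrete, here is a minimal pure-Python sketch of a single CIN layer as described in \[1\]. It is not the code used by `XDeepFMModel`; the function name `cin_layer` and the toy inputs are assumptions made for illustration. Each output feature map is a weighted sum of element-wise products between every row of the previous layer and every field embedding.

```python
def cin_layer(x_prev, x0, weights):
    """One CIN layer (illustrative sketch, not the library implementation).

    x_prev:  list of H_prev vectors (feature maps of the previous layer)
    x0:      list of m field embedding vectors
    weights: list of H_next matrices, each of shape H_prev x m
    Returns a list of H_next vectors with the same embedding dimension.
    """
    dim = len(x0[0])
    out = []
    for w in weights:  # one weight matrix per output feature map
        row = [0.0] * dim
        for i, xi in enumerate(x_prev):
            for j, xj in enumerate(x0):
                for d in range(dim):
                    # Element-wise (Hadamard) product keeps the interaction
                    # at the vector level: dimension d only mixes with d.
                    row[d] += w[i][j] * xi[d] * xj[d]
        out.append(row)
    return out


# Toy input: 2 fields with 2-dimensional embeddings (hypothetical values).
x0 = [[1.0, 2.0], [3.0, 4.0]]
# A single output feature map whose weights are all 1.0.
weights = [[[1.0, 1.0], [1.0, 1.0]]]

x1 = cin_layer(x0, x0, weights)  # the first layer interacts x0 with itself
print(x1)  # [[16.0, 36.0]]
```

Stacking such layers, each one taking the previous output as `x_prev`, raises the interaction order by one per layer while the embedding dimension stays fixed.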


## Global Settings and Imports

In [1]:
import os
import sys
sys.path.append("../../")
import papermill as pm
import tensorflow as tf

from reco_utils.recommender.deeprec.deeprec_utils import *
from reco_utils.recommender.deeprec.models.xDeepFM import *
from reco_utils.recommender.deeprec.IO.iterator import *

print("System version: {}".format(sys.version))
print("Tensorflow version: {}".format(tf.__version__))


System version: 3.5.5 |Anaconda custom (64-bit)| (default, May 13 2018, 21:12:35) 
[GCC 7.2.0]
Tensorflow version: 1.10.1


### Parameters

In [2]:
#EPOCHS_FOR_SYNTHETIC_RUN = 15
#EPOCHS_FOR_CRITEO_RUN = 30
#BATCH_SIZE_SYNTHETIC = 128
#BATCH_SIZE_CRITEO = 4096

## Download data
xDeepFM uses the FFM format as data input: `<label> <field_id>:<feature_id>:<feature_value>`  
Each line represents one instance. `<label>` is a binary value, where 1 denotes a positive instance and 0 a negative instance.  
Features are divided into fields. For example, a user's gender is a field with three possible values: male, female, and unknown. Occupation can be another field, with many more possible values than the gender field. Both the field index and the feature index start from 1. <br>
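
As a rough illustration of the format described above, the following sketch parses one FFM line into a label and a list of `(field_id, feature_id, feature_value)` triples. The helper `parse_ffm_line` and the sample line are hypothetical and are not part of `reco_utils`; the real `FFMTextIterator` may handle the data differently.

```python
def parse_ffm_line(line):
    """Split one FFM-formatted line into (label, features).

    Hypothetical helper for illustration only.
    """
    tokens = line.strip().split()
    label = float(tokens[0])
    features = []
    for token in tokens[1:]:
        field_id, feature_id, value = token.split(":")
        features.append((int(field_id), int(feature_id), float(value)))
    return label, features


# Made-up sample line: a positive instance with three active features.
label, features = parse_ffm_line("1 1:3:1.0 2:17:1.0 3:45:0.5")
print(label, features)  # 1.0 [(1, 3, 1.0), (2, 17, 1.0), (3, 45, 0.5)]
```
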
Now let's start with the MovieLens dataset.

In [3]:
# Pre-processing the MovieLens dataset. Please go to the ../../tests/resources/deeprec/movielens folder.
# ML-100K2Libffm.py loads the user rating data together with the movie genre data
# and transforms it into libffm format: <label> <field_id>:<feature_id>:<feature_value>.

#wget http://files.grouplens.org/datasets/movielens/ml-100k.zip
#unzip ml-100k.zip
#python ML-100K2Libffm.py

In [4]:
data_path = '../../tests/resources/deeprec/movielens'
yaml_file = os.path.join(data_path, r'network_xdeepFM.yaml')
#train_file = os.path.join(data_path, r'ua.base.classification.final')
#train_file = os.path.join(data_path, r'ua.base.regression.final')
#valid_file = os.path.join(data_path, r'ua.test.regression.final')
#test_file = os.path.join(data_path, r'ua.test.regression.final')

# The following files are for classification
train_file = os.path.join(data_path, r'ua.base.classification.final')
valid_file = os.path.join(data_path, r'ua.test.classification.final')
test_file = os.path.join(data_path, r'ua.test.classification.final')

#test_file = os.path.join(data_path, r'ua.test.classification_topN.final')
output_file = os.path.join(data_path, r'output.txt')

#if not os.path.exists(yaml_file):
#    download_deeprec_resources(r'https://recodatasets.blob.core.windows.net/deeprec/', data_path, 'xdeepfmresources.zip')


## Create hyper-parameters
`prepare_hparams()` creates a full set of hyper-parameters for model training, such as the learning rate, feature count, and dropout ratio. We can put those parameters in a yaml file or pass them as arguments to the function (which will overwrite the yaml settings).

In [6]:
hparams = prepare_hparams(yaml_file)
print(hparams)

[('DNN_FIELD_NUM', None), ('FEATURE_COUNT', 213), ('FIELD_COUNT', 5), ('MODEL_DIR', None), ('PAIR_NUM', None), ('SUMMARIES_DIR', None), ('activation', ['relu', 'relu']), ('attention_activation', None), ('attention_dropout', 0.0), ('attention_layer_sizes', None), ('batch_size', 128), ('cross_activation', 'identity'), ('cross_l1', 0.0), ('cross_l2', 0.0), ('cross_layer_sizes', [100, 100, 50]), ('cross_layers', None), ('data_format', 'ffm'), ('dim', 10), ('doc_size', None), ('dropout', [0.0, 0.0]), ('dtype', 32), ('embed_l1', 0.0), ('embed_l2', 0.0), ('enable_BN', False), ('entityEmb_file', None), ('entity_dim', None), ('entity_embedding_method', None), ('entity_size', None), ('epochs', 10), ('fast_CIN_d', 0), ('filter_sizes', None), ('init_method', 'tnormal'), ('init_value', 0.01), ('is_clip_norm', 0), ('iterator_type', None), ('kg_file', None), ('kg_training_interval', 5), ('layer_l1', 0.0), ('layer_l2', 0.0), ('layer_sizes', [400, 400]), ('learning_rate', 0.001), ('load_model_name', No
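
The precedence rule mentioned above (function arguments overwrite yaml settings) can be sketched with plain dictionaries. This is only a conceptual model: `merge_hparams` is a hypothetical helper, not the library's `prepare_hparams`, and the starting values are copied from the printed hyper-parameters.

```python
def merge_hparams(yaml_settings, **overrides):
    """Return a copy of the yaml settings with keyword overrides applied.

    Hypothetical helper that mimics the precedence rule only.
    """
    merged = dict(yaml_settings)  # copy so the yaml values stay untouched
    merged.update(overrides)      # explicit arguments win over the yaml file
    return merged


# Values as they appear in the printed hyper-parameters of this notebook.
yaml_settings = {"learning_rate": 0.001, "epochs": 10, "batch_size": 128}

merged = merge_hparams(yaml_settings, epochs=15, batch_size=4096)
print(merged)  # {'learning_rate': 0.001, 'epochs': 15, 'batch_size': 4096}
```

Settings that are not overridden, such as `learning_rate` here, keep the value read from the yaml file.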

## Create data loader
Designate a data iterator for the model. xDeepFM uses FFMTextIterator. 

In [7]:
input_creator = FFMTextIterator

## Create model
When both hyper-parameters and data iterator are ready, we can create a model:

In [8]:
model = XDeepFMModel(hparams, input_creator)

## sometimes we don't want to train a model from scratch
## then we can load a pre-trained model like this: 
#model.load_model(r'your_model_path')

Add CIN part.



Now let's check the model's performance at this point (before any training):

In [None]:
print(model.run_eval(test_file))

An AUC of 0.5 corresponds to random guessing, so before training the model behaves like a random guesser. Next, we train the model on the training set and check its performance on the validation set. Training the model is as simple as a function call:

In [9]:
model.fit(train_file, valid_file)

step 20 , total_loss: 0.6897, data_loss: 0.6897
step 40 , total_loss: 0.6849, data_loss: 0.6849
step 60 , total_loss: 0.6796, data_loss: 0.6796
step 80 , total_loss: 0.7847, data_loss: 0.7847
step 100 , total_loss: 0.6977, data_loss: 0.6977
step 120 , total_loss: 0.6618, data_loss: 0.6618
step 140 , total_loss: 0.7070, data_loss: 0.7070
step 160 , total_loss: 0.6447, data_loss: 0.6447
step 180 , total_loss: 0.6987, data_loss: 0.6987
step 200 , total_loss: 0.7039, data_loss: 0.7039
step 220 , total_loss: 0.6430, data_loss: 0.6430
step 240 , total_loss: 0.6372, data_loss: 0.6372
step 260 , total_loss: 0.6415, data_loss: 0.6415
step 280 , total_loss: 0.6041, data_loss: 0.6041
step 300 , total_loss: 0.6991, data_loss: 0.6991
step 320 , total_loss: 0.8631, data_loss: 0.8631
step 340 , total_loss: 0.6868, data_loss: 0.6868
step 360 , total_loss: 0.6192, data_loss: 0.6192
step 380 , total_loss: 0.6641, data_loss: 0.6641
step 400 , total_loss: 0.6668, data_loss: 0.6668
step 420 , total_loss: 0

step 80 , total_loss: 0.7178, data_loss: 0.7178
step 100 , total_loss: 0.7420, data_loss: 0.7420
step 120 , total_loss: 0.6171, data_loss: 0.6171
step 140 , total_loss: 0.6798, data_loss: 0.6798
step 160 , total_loss: 0.5726, data_loss: 0.5726
step 180 , total_loss: 0.6281, data_loss: 0.6281
step 200 , total_loss: 0.7530, data_loss: 0.7530
step 220 , total_loss: 0.6576, data_loss: 0.6576
step 240 , total_loss: 0.5658, data_loss: 0.5658
step 260 , total_loss: 0.6383, data_loss: 0.6383
step 280 , total_loss: 0.5884, data_loss: 0.5884
step 300 , total_loss: 0.6432, data_loss: 0.6432
step 320 , total_loss: 0.6640, data_loss: 0.6640
step 340 , total_loss: 0.7083, data_loss: 0.7083
step 360 , total_loss: 0.5434, data_loss: 0.5434
step 380 , total_loss: 0.7015, data_loss: 0.7015
step 400 , total_loss: 0.6670, data_loss: 0.6670
step 420 , total_loss: 0.7314, data_loss: 0.7314
step 440 , total_loss: 0.7399, data_loss: 0.7399
step 460 , total_loss: 0.5929, data_loss: 0.5929
step 480 , total_loss

step 140 , total_loss: 0.6682, data_loss: 0.6682
step 160 , total_loss: 0.5386, data_loss: 0.5386
step 180 , total_loss: 0.6219, data_loss: 0.6219
step 200 , total_loss: 0.7412, data_loss: 0.7412
step 220 , total_loss: 0.6084, data_loss: 0.6084
step 240 , total_loss: 0.5623, data_loss: 0.5623
step 260 , total_loss: 0.6326, data_loss: 0.6326
step 280 , total_loss: 0.5914, data_loss: 0.5914
step 300 , total_loss: 0.6427, data_loss: 0.6427
step 320 , total_loss: 0.4490, data_loss: 0.4490
step 340 , total_loss: 0.6956, data_loss: 0.6956
step 360 , total_loss: 0.5195, data_loss: 0.5195
step 380 , total_loss: 0.6544, data_loss: 0.6544
step 400 , total_loss: 0.6635, data_loss: 0.6635
step 420 , total_loss: 0.7136, data_loss: 0.7136
step 440 , total_loss: 0.7365, data_loss: 0.7365
step 460 , total_loss: 0.5926, data_loss: 0.5926
step 480 , total_loss: 0.6251, data_loss: 0.6251
step 500 , total_loss: 0.6335, data_loss: 0.6335
step 520 , total_loss: 0.5495, data_loss: 0.5495
step 540 , total_los

<reco_utils.recommender.deeprec.models.xDeepFM.XDeepFMModel at 0x7fca841b0518>

Again, let's check the model's performance, this time after training:

In [10]:
res_syn = model.run_eval(test_file)
print(res_syn)
pm.record("res_syn", res_syn)

{'mae': 0.44340557, 'rsquare': 0.07173032307047911, 'logloss': 0.6442, 'rmse': 0.47549975, 'exp_var': 0.07379478216171265, 'auc': 0.6605}



Other ranking metric: map_at_k = 0.008857395925597878
ndcg_at_k = 1.0
precision_at_k = 1.0
recall_at_k = 0.008857395925597878
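
To help interpret the `auc` and `logloss` values reported above, here is a small self-contained sketch of both metrics. The functions are illustrative assumptions and are not the ones `run_eval` uses internally; the labels and scores are made up. The sketch shows that an uninformative predictor that always outputs 0.5 yields an AUC of 0.5 and a log loss of ln 2, which is close to the losses logged during the first training steps.

```python
import math


def auc(labels, scores):
    """Pairwise AUC: probability that a positive outranks a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def logloss(labels, scores, eps=1e-15):
    """Mean binary cross-entropy, with scores clipped away from 0 and 1."""
    total = 0.0
    for y, s in zip(labels, scores):
        s = min(max(s, eps), 1.0 - eps)
        total -= y * math.log(s) + (1 - y) * math.log(1.0 - s)
    return total / len(labels)


labels = [1, 0, 1, 0]          # made-up ground truth
constant = [0.5, 0.5, 0.5, 0.5]  # uninformative predictor

print(auc(labels, constant))                # 0.5
print(round(logloss(labels, constant), 4))  # 0.6931
```
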


## Reference
\[1\] Lian, J., Zhou, X., Zhang, F., Chen, Z., Xie, X., & Sun, G. (2018). xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2018, London, UK, August 19-23, 2018.<br>
\[2\] The Criteo datasets: http://labs.criteo.com/category/dataset/. 