| batch_size | Model | max-AUC |
|---|---|---|
| 32 | Two-Tower | 0.877 |
| 32 | DIN (without Dice) | 0.893 |
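
For readers unfamiliar with the metric in the table above, here is a minimal sketch of how a plain AUC can be computed from labels and predicted scores. It is an illustration only: the function name `auc` is hypothetical, and this README does not say which AUC variant or TensorFlow metric main.py actually reports.

```python
def auc(labels, scores):
    """Pairwise AUC: the fraction of (positive, negative) pairs in which
    the positive sample gets the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Three of the four (positive, negative) pairs are ranked correctly.
    print(auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

The double loop is quadratic in the number of samples, so it is only meant to make the definition concrete, not to evaluate a full test set.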
- Python 3.6
- NumPy 1.18.5
- Pandas 1.1.3
- TensorFlow 2.3.1
wget -c http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/reviews_Electronics_5.json.gz
gzip -d reviews_Electronics_5.json.gz
wget -c http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/meta_Electronics.json.gz
gzip -d meta_Electronics.json.gz
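
After decompression, each file holds one record per line. The sketch below shows one way to read them, assuming that some lines may be Python-style dict literals (single quotes) rather than strict JSON, so it tries `json.loads` first and falls back to `ast.literal_eval`. The helper names `parse_record` and `read_records` are hypothetical and are not taken from generate_tfrecord.py.

```python
import ast
import json


def parse_record(line):
    """Parse one line as strict JSON, falling back to a Python dict literal."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        # Assumption: non-JSON lines are plain Python literals.
        return ast.literal_eval(line)


def read_records(path):
    """Yield one parsed record per non-empty line of the file at `path`."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield parse_record(line)


if __name__ == "__main__":
    # Print the field names of the first review record.
    first = next(read_records("reviews_Electronics_5.json"))
    print(sorted(first.keys()))
```

Reading lazily with a generator keeps memory use flat, which helps because the decompressed files are large.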
- Step 1: generate tfrecord dataset
python generate_tfrecord.py
Alternatively, use Spark to generate the TFRecord dataset
(I used Zeppelin; the code is saved in generate_tfrecord.scala).
- Step 2: training and evaluation
python main.py
Before running, confirm the TFRecord dataset path and the "data_gen_method" parameter ("spark" or "python").
- To switch from the Two-Tower model to the DIN model, change model_fn in main.py.
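
To make Step 1 more concrete, here is a minimal sketch of writing and reading a TFRecord file with TensorFlow 2.x. The feature names (`user_id`, `hist_item_ids`, `target_item_id`, `label`) and the helper functions are assumptions chosen for illustration; the schema that generate_tfrecord.py actually produces may differ.

```python
import tensorflow as tf


def _int64_feature(values):
    """Wrap a list of ints as a tf.train.Feature."""
    return tf.train.Feature(int64_list=tf.train.Int64List(value=values))


def make_example(user_id, hist_item_ids, target_item_id, label):
    """Build one tf.train.Example (hypothetical schema)."""
    feature = {
        "user_id": _int64_feature([user_id]),
        "hist_item_ids": _int64_feature(hist_item_ids),  # variable length
        "target_item_id": _int64_feature([target_item_id]),
        "label": _int64_feature([label]),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))


def write_tfrecord(path, samples):
    """Serialize (user_id, hist_item_ids, target_item_id, label) tuples."""
    with tf.io.TFRecordWriter(path) as writer:
        for sample in samples:
            writer.write(make_example(*sample).SerializeToString())


def parse_example(serialized):
    """Decode one serialized Example back into a dict of tensors."""
    spec = {
        "user_id": tf.io.FixedLenFeature([], tf.int64),
        "hist_item_ids": tf.io.VarLenFeature(tf.int64),
        "target_item_id": tf.io.FixedLenFeature([], tf.int64),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, spec)
    # VarLenFeature yields a SparseTensor; convert it to a dense 1-D tensor.
    parsed["hist_item_ids"] = tf.sparse.to_dense(parsed["hist_item_ids"])
    return parsed


if __name__ == "__main__":
    write_tfrecord("demo.tfrecord", [(1, [3, 7, 9], 12, 1), (2, [5], 8, 0)])
    for row in tf.data.TFRecordDataset("demo.tfrecord").map(parse_example):
        print(row["user_id"].numpy(), row["hist_item_ids"].numpy())
```

Because the history lengths differ per sample, batching such a dataset would typically need `padded_batch` rather than `batch`.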