aliang-rec/Amazon-DIN-TFrecord-estimator

Two-Tower model and DIN model (without Dice), built with the TF Estimator API on the Amazon Electronics dataset

batch_size   Model                max-AUC
32           Two-Tower            0.877
32           DIN (without Dice)   0.893
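
For readers unfamiliar with the metric in the table, here is a minimal sketch of plain ROC AUC, computed with the rank-sum formulation. It is an illustration only: the function name roc_auc is hypothetical, and the metric code this repository actually uses may differ (for example, it may rely on a TensorFlow metric or a per-user weighted variant).

```python
def roc_auc(labels, scores):
    """ROC AUC via the rank-sum (Mann-Whitney U) formulation.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of predicted scores, same length as labels.
    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the end (exclusive) of the run of tied scores starting at i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        positives_in_run = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_run
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

A value of 0.5 corresponds to random ranking and 1.0 to every positive being scored above every negative.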

Requirements

  • Python 3.6
  • Numpy 1.18.5
  • Pandas 1.1.3
  • TensorFlow 2.3.1

Amazon Electronics dataset download

wget -c http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/reviews_Electronics_5.json.gz  
gzip -d reviews_Electronics_5.json.gz  
wget -c http://snap.stanford.edu/data/amazon/productGraph/categoryFiles/meta_Electronics.json.gz  
gzip -d meta_Electronics.json.gz
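
To check the download before generating TFRecords, the files can be read one record per line. The sketch below is not part of this repository; the helper names parse_record and iter_records are hypothetical. It assumes each line holds one record, and because some files in this collection are reported to contain Python dict literals rather than strict JSON, it falls back to ast.literal_eval when json.loads fails.

```python
import ast
import gzip
import json


def parse_record(line):
    """Parse one line as strict JSON, falling back to a Python dict literal."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        # Assumption: non-JSON lines are plain Python literals (single quotes).
        return ast.literal_eval(line)


def iter_records(path):
    """Yield one dict per non-empty line of a .json or .json.gz file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield parse_record(line)
```

For example, next(iter_records("reviews_Electronics_5.json")) returns the first review as a dict, which is a quick way to confirm the file decompressed correctly.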

Training and Evaluation

  • Step 1: generate the TFRecord dataset
python generate_tfrecord.py

Alternatively, generate the TFRecord files with Spark
(I used Zeppelin; the code is saved in generate_tfrecord.scala)

  • Step 2: train and evaluate
python main.py

Before running, confirm the TFRecord dataset path and set the parameter "data_gen_method" ("spark" or "python") to match how the dataset was generated.

  • To switch from the Two-Tower model to the DIN model, edit model_fn in main.py
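
The "data_gen_method" check from Step 2 can be sketched as a small lookup that fails loudly on an unexpected value. This is only an illustration, not the code in main.py: the directory paths and the names TFRECORD_DIRS and resolve_tfrecord_dir are hypothetical, and how main.py actually uses the parameter is not shown here.

```python
# Hypothetical locations; replace them with wherever your TFRecord files were written.
TFRECORD_DIRS = {
    "python": "./data/tfrecord_python",
    "spark": "./data/tfrecord_spark",
}


def resolve_tfrecord_dir(data_gen_method):
    """Return the TFRecord directory for the chosen generator, or raise ValueError."""
    try:
        return TFRECORD_DIRS[data_gen_method]
    except KeyError:
        raise ValueError(
            'data_gen_method must be "spark" or "python", got %r' % (data_gen_method,)
        ) from None
```

Failing early on a wrong value avoids starting a training run that then finds no input files.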

Reference:

https://github.com/zhougr1993/DeepInterestNetwork.git
