smellslikeml/tfr-bert-demo

TF-Ranking + BERT demo using MovieLens data

This is the sample code for the accompanying blog post. It is based on the extension example in the official TF-Ranking repo.

Setup

Install the Python requirements listed in requirements.txt:

$ pip install -r requirements.txt

Download the official BERT checkpoint for TensorFlow 2+ and extract it into this directory:

$ wget https://storage.googleapis.com/cloud-tpu-checkpoints/bert/keras_bert/uncased_L-12_H-768_A-12.tar.gz
$ tar -xvzf uncased_L-12_H-768_A-12.tar.gz 

For this demo, we enriched the MovieLens 100k dataset with movie title descriptions. You can download this set from here. Extract the zip file into the data/ directory:

$ unzip movieLens_tfrank.zip -d ./data/

Run the create_tfrecords.py script to create TFRecords from the downloaded MovieLens data:

$ python create_tfrecords.py
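
A ranking model is trained on lists, not single rows: each training example pairs one query with several candidate items and their relevance labels. The sketch below shows that grouping idea in plain Python. The row layout `(user_id, movie_title, rating)` and the helper name `group_by_user` are assumptions for illustration and are not taken from create_tfrecords.py, which may organize the data differently.

```python
from collections import defaultdict


def group_by_user(rows):
    """Group (user_id, movie_title, rating) rows into one candidate list per user.

    Hypothetical helper: the real script's input columns may differ.
    """
    lists = defaultdict(list)
    for user_id, title, rating in rows:
        lists[user_id].append((title, rating))
    return dict(lists)


# Toy rows in the assumed layout; not real MovieLens records.
rows = [
    (1, "Toy Story (1995)", 5),
    (2, "GoldenEye (1995)", 3),
    (1, "Four Rooms (1995)", 2),
]
ranking_lists = group_by_user(rows)
```

Each value in `ranking_lists` is the candidate list that a ranking example would be built from, with the rating serving as the relevance label.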

Train and evaluate

Run the train.py script to train and evaluate the model. The script writes logs that can be visualized with TensorBoard. Every flag has a default value, so the script runs without arguments, but feel free to experiment.

# run with default parameters
$ python train.py

or set the flags explicitly:

$ python train.py --train_input_pattern=tfrecords/train.tfrecord \
   --eval_input_pattern=tfrecords/test.tfrecord \
   --vocab_input_pattern=tfrecords/vocab.tfrecord \
   --bert_config_file=uncased_L-12_H-768_A-12/bert_config.json \
   --bert_init_ckpt=uncased_L-12_H-768_A-12/bert_model.ckpt \
   --bert_max_seq_length=64 \
   --model_dir="models/" \
   --loss=softmax_loss \
   --train_batch_size=1 \
   --eval_batch_size=1 \
   --learning_rate=1e-5 \
   --num_train_steps=50 \
   --num_eval_steps=10 \
   --checkpoint_secs=120 \
   --num_checkpoints=20
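
To give a feel for what a listwise softmax loss such as the one selected by `--loss=softmax_loss` optimizes, here is a simplified sketch in plain Python. It computes a softmax cross-entropy over one list of scores and labels. The function name is an assumption, and the library's own implementation handles details (padding, weighting, label handling) that are not shown here.

```python
import math


def softmax_cross_entropy(scores, labels):
    """Listwise softmax cross-entropy for one list of candidates.

    Simplified illustration only; not the TF-Ranking implementation.
    """
    # Subtract the max score before exponentiating for numerical stability.
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    # Each label weights the negative log-probability of its item.
    return -sum(y * (s - log_z) for s, y in zip(scores, labels))


labels = [1.0, 0.0, 0.0]  # only the first item is relevant
good = softmax_cross_entropy([3.0, 0.5, 0.1], labels)  # relevant item scored highest
bad = softmax_cross_entropy([0.1, 0.5, 3.0], labels)   # relevant item scored lowest
```

The loss is smaller when the relevant item receives the highest score, which is what pushes the model to rank relevant items first.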

Generate Recommendations

Run the predict.py script to generate recommendations for a set of users. We've included a sample CSV of user data to run inference on.

$ python predict.py
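
Once a model has scored the candidate movies for a user, producing recommendations amounts to sorting by score and keeping the top few. The sketch below illustrates that step only. The `(title, score)` pair format and the `top_k` helper are assumptions, since the output format of predict.py is not described here.

```python
def top_k(scored_items, k=3):
    """Return the titles of the k highest-scoring items, best first.

    Hypothetical helper operating on (title, score) pairs.
    """
    ranked = sorted(scored_items, key=lambda pair: pair[1], reverse=True)
    return [title for title, _ in ranked[:k]]


# Made-up scores for illustration; not model output.
scored = [("Movie A", 0.1), ("Movie B", 0.9), ("Movie C", 0.5)]
recommendations = top_k(scored, k=2)
```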
