<a href="https://colab.research.google.com/github/hodeld/rv-ranking-problem/blob/master/rv_ranking_problem.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Tutorial: TF-Ranking for sparse features



This tutorial is an end-to-end walkthrough of training a TensorFlow Ranking (TF-Ranking) neural network model which incorporates sparse textual features.

A Python script version of this code is available [here](https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/tf_ranking_tfrecord.py). The script version supports flags for hyperparameters, and advanced use-cases like [Document Interaction Networks](https://arxiv.org/pdf/1910.09676.pdf).

TF-Ranking is a library for solving large scale ranking problems using deep learning. TF-Ranking can handle heterogeneous dense and sparse features, and scales up to millions of data points. For more details, please read the technical paper published on [arXiv](https://arxiv.org/abs/1812.00073).

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/tensorflow/ranking/blob/master/tensorflow_ranking/examples/handling_sparse_features.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/handling_sparse_features.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

## Motivation

Learning to Rank (LTR) deals with learning to optimally order a list of examples, given some context. For instance, in search applications, examples are documents and context is the query.

These models are usually trained using user relevance feedback, which can be explicit (human ratings) or implicit (clicks).

This tutorial demonstrates how to build ranking estimators over sparse features, such as textual data. Textual data is prevalent in several settings for ranking, and plays a significant role in relevance judgment by a user.

In three different LTR scenarios, the following textual features provide useful signals for ranking:

*   Search: queries and document titles
*   Question Answering: questions and answers
*   Recommendation: titles of items and their descriptions

Hence it is important for LTR models to effectively incorporate textual features.

## Task: Ranking over Question-Answering data



### ANTIQUE: A Question Answering Dataset

For the purpose of this tutorial, we consider a ranking problem over ANTIQUE, a question-answering dataset. Given a query and a list of answers, the objective is to maximize a rank-related metric (say, NDCG).

[ANTIQUE](http://hamedz.ir/resources/) is a publicly available dataset for open-domain non-factoid question answering, collected over Yahoo! answers.

Each question has a list of answers, whose relevance is graded on a scale of 1-5.

The list size can vary depending on the query, so we use a fixed "list size" of 50, where the list is either truncated or padded with dummy values.
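
The truncate-or-pad step can be sketched in plain Python. This is only an illustration of the idea, not the code TF-Ranking runs internally; the helper name `pad_or_truncate` is hypothetical, and the padding value of -1 mirrors the `_PADDING_LABEL` constant defined later in this notebook.

```python
def pad_or_truncate(labels, list_size=50, padding_label=-1):
    """Return `labels` cut or right-padded to exactly `list_size` entries."""
    trimmed = list(labels[:list_size])
    return trimmed + [padding_label] * (list_size - len(trimmed))


# A short list is padded, a long list is truncated.
short = pad_or_truncate([4, 2, 1], list_size=5)
long = pad_or_truncate([3, 1, 2, 4, 1, 2, 3], list_size=5)
print(short)  # [4, 2, 1, -1, -1]
print(long)   # [3, 1, 2, 4, 1]
```

Using a negative padding label makes padded positions easy to recognize, so that they can be ignored in losses and metrics.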

This dataset is well suited to a learning-to-rank scenario. The dataset is split into 2206 queries for training and 200 queries for testing. For more details, please read the technical paper on [arXiv](https://arxiv.org/pdf/1905.08957.pdf).

In [None]:
#!wget -O "/tmp/vocab.txt" "http://ciir.cs.umass.edu/downloads/Antique/tf-ranking/vocab.txt"
#!wget -O "/tmp/train.tfrecords" "http://ciir.cs.umass.edu/downloads/Antique/tf-ranking/ELWC/train.tfrecords"
#!wget -O "/tmp/test.tfrecords" "http://ciir.cs.umass.edu/downloads/Antique/tf-ranking/ELWC/test.tfrecords"


Download base files

In [None]:
import sys
#choose in runtime type
print ('python-version: ', sys.version)
import pandas as pd

from google.colab import drive
drive.mount("/content/drive", force_remount=False)
main_path = '/content/drive/My Drive/Colab Notebooks/rbs-data/alopt_files/'
timelines_raw = pd.read_csv(main_path+'timelines.csv', index_col =0) #, header=0)
samples = pd.read_csv(main_path+'Samples.csv')
rvs = pd.read_csv(main_path+'RVs.csv')
allevents = pd.read_csv(main_path+'AllEvents.csv', index_col =0)
rvfirstev_raw = pd.read_csv(main_path+'rvfirstev.csv', index_col =0)


#data, categories, tendencies = ReadCsv('/content/drive/My Drive/Colab Notebooks/etiki_data','test.csv','companies.csv', 'categories.csv','references.csv','tendencies.csv', 'topics.csv')


python-version:  3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0]
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


Prepare train, eval and test data

In [None]:
samples.head(3)
#allevents.head(5)
#rvfirstev_raw.head(5)
#timelines_raw.head(3)

Unnamed: 0,Id,Location,Dbid,Day evs,Sevs,Rv eq,Start,End,Rv,Group,Category,Type,Rv ff,Gespever,Hwx,Uma
0,21195,2,21195,21195;21194,32309,,2628,2664,33,3,1,22,33,0,0,0
1,21585,2,21585,21585,32454;32456;32455,,2628,2664,34,3,1,23,34,0,0,0
2,21194,2,21194,21195;21194,32307,,2536,2568,34,3,1,22,34,2,1,0


In [None]:
from datetime import datetime
#help(timelines_raw.loc)
dtform = '%Y-%m-%d %H:%M:%S'
tstart = datetime.strptime(timelines_raw.loc['dt_col', '0'], dtform)
tend = datetime.strptime(timelines_raw.loc['dt_col', '1'], dtform)  
TD_SEQ = (tend - tstart).seconds/60
PPH = 60/TD_SEQ
WEEKS_B = 4 #for cutting timelines
WEEKS_A = WEEKS_B
KMAX = int(timelines_raw.columns[-1]) #last column name as int
print(TD_SEQ, PPH)

timelines = timelines_raw.drop(index='dt_col')

#now get rv with timelines.loc[str(52)]  
timelines.head(3)

15.0 4.0


k_col,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,...,2648,2649,2650,2651,2652,2653,2654,2655,2656,2657,2658,2659,2660,2661,2662,2663,2664,2665,2666,2667,2668,2669,2670,2671,2672,2673,2674,2675,2676,2677,2678,2679,2680,2681,2682,2683,2684,2685,2686,2687
52,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,23,23,23,23,...,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
33,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,18,18,18,18,...,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
38,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,...,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
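
The index arithmetic implied by these constants can be sketched as follows. With 15-minute periods there are 4 periods per hour, hence 96 per day and 672 per week, and the timelines above run from column 0 to column 2687. The helper names `day_start` and `window` are hypothetical; the start index 2628 is taken from the first sample shown earlier.

```python
PPH = 4                                  # periods per hour (15-minute periods)
PERIODS_PER_DAY = 24 * PPH               # 96
PERIODS_PER_WEEK = 7 * PERIODS_PER_DAY   # 672
KMAX = 2687                              # last timeline column shown above


def day_start(period):
    """Index of the first period of the day that contains `period`."""
    return (period // PERIODS_PER_DAY) * PERIODS_PER_DAY


def window(start, weeks_before=4, weeks_after=4):
    """Clip a window of whole weeks around `start` to the valid index range."""
    first = max(0, start - weeks_before * PERIODS_PER_WEEK)
    last = min(KMAX, start + weeks_after * PERIODS_PER_WEEK)
    return first, last


print(day_start(2628))  # 2592
print(window(2628))     # (0, 2687)
```

Keeping all of these values as integers matters, because the timeline columns are labeled with integer strings such as `'2628'`.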


In [None]:
rvfirstev = rvfirstev_raw.copy()
rvfirstev[rvfirstev_raw == 0] = 1
rvfirstev.head(5)

evs,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
52,,,,,,1.0,,1.0,1.0,1.0,1.0,1.0,1.0,1.0,,,,1.0,1.0,1.0,,,1.0,1.0
33,,,,,,1.0,,1.0,1.0,1.0,1.0,1.0,1.0,,1.0,,,1.0,1.0,1.0,1.0,,1.0,1.0
38,,,,,,,,,,,,,,,,,,,,,,,,
34,,,,,2688.0,1.0,,1.0,1.0,1.0,1.0,1.0,1.0,,,,,1.0,1.0,,1.0,,1.0,1.0
39,,,,,,,,,,,1.0,,,,,,,,,,,,,


In [None]:
#for testing
tline = timelines.loc[str(52)]  
#tline.head(3)
#tline = list(tline)


#tline = tline.loc[str(5):str(10)] 
#tline.head(5)

#print (tline[:5]#)


In [None]:
#for testing
ev = allevents.loc[21195]
ev.head(5)
rvid = ev['Rv']
print (rvid)

33


In [None]:
#for testing
firstev = rvfirstev.loc[52, str(4)]
print (firstev)

nan


In [None]:
import copy
import operator

class Sample():

  """class for events for ranking problem"""
  def __init__(self,  sample_li):
    (s_id, location, dbid, day_evs, sevs, rv_eq, 
    start, end, rv, group, cat, 
    evtype, rv_ff, gespever, hwx, uma) = sample_li

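    #index of the first period of the sample's day (24*PPH periods per day)
    periods_per_day = int(24*PPH)
    day = (start//periods_per_day)*periods_per_day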
    locday =str(location)+'-' + str(day)

    def get_li(li_str):
      if isinstance(li_str, str):
        li = [int(s) for s in li_str.split(';')]
      else:
        li = []
      return li

    day_evs = get_li(day_evs)
    rv_eq = get_li(rv_eq)
    sevs = get_li(sevs)

    self.location = location
    self.day = day
    self.locday = locday
    self.start = start
    self.end = end
    self.rv = rv
    self.rv_eq = rv_eq
    self.id = s_id
    self.evtype = evtype        
    self.group = group
    self.day_evs = day_evs
    self.sevs = sevs
    self.rv_ff = rv_ff
    self.gespever = gespever
    self.hwx = hwx
    self.uma = uma
    self.rvli = None
class SampleList(list):
  '''base class for list of samples'''
    
  def get(self, variable_value, item_attr  = 'id'): 
      vv = variable_value
      ra = item_attr
      f = operator.attrgetter(ra)
      for s in self:
          if f(s) == vv: 
              return s                
      return None

class RV():
  '''base class for rvs'''
  def __init__(self,  rvvals
                ):
    (rvid, location,
     sex) = rvvals
    self.id = rvid
    self.location = location
    self.sex = sex

class RVList(list):
  '''base class for list of rvs'''
  def filter(self, variable_value, rv_attr): 
        vv = variable_value
        ra = rv_attr
        f = operator.attrgetter(ra)
        newli = RVList(filter(lambda x: f(x) == vv, self)) #calls x.ev
        return newli
    
  def get(self, variable_value, rv_attr  = 'id'): 
      vv = variable_value
      ra = rv_attr
      f = operator.attrgetter(ra)
      for rv in self:
          if f(rv) == vv: 
              return rv                
      return None

  
def get_example_features(s):  
  if s.rvli is None:
    rvli = get_rvlist(s)
    s.rvli = rvli
    set_tlines(s)
  get_pot_rvs(s)

def get_rvlist(s):
  rvli = rvli_d.get(s.locday, None)
  if rvli:
    return rvli
  
  rvlist = rvlist_all.filter(s.location, 'location') #[rv for rv in rvs if rv.location == loc_id]
  rvlist = cut_timelines(rvlist, s) #same for all same day events
  rvli_d[s.locday] = rvlist
  return rvlist

def cut_timelines(rvlist, s):
  weeks_before = WEEKS_B
  weeks_after = WEEKS_A
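  td_perwk = int(PPH*24*7) #int, so that the slice labels match the column names
  ist = s.start - (td_perwk*weeks_before)
  if ist < 0:
    ist = 0
  iet = s.start + td_perwk*weeks_after
  if iet > KMAX:
    iet = KMAX

  for rv in rvlist:
    tline = timelines.loc[str(rv.id)]
    rv.tline = tline.loc[str(ist):str(iet)].copy() #copy, so that later edits do not touch `timelines`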
  return rvlist
def set_tlines(s):
  rvli = s.rvli
  #work on copies so that s.sevs and s.day_evs are not modified in place
  evs = list(s.sevs) #can be empty list
  day_evs = list(s.day_evs) #randomized, only part of it set to zero
  evs += day_evs
  sample_li = []
  '''
  -> delete ev and sev for 1 rv 
  -> assign rvlist to sample
  -> copy rvlist
  -> delete ev and sev for next ev in 1 rv
  -> rvlist gets more and more zeros'''
  for eid in day_evs:
    ev = sample_list.get(eid)
    if ev is None: #event is not among the loaded samples
      continue
    evs += ev.sevs
    sample_li.append(ev)

  for eid in evs:
    ev = allevents.loc[eid]
    rv = rvli.get(ev['Rv'])
    if rv is None: #rv is not in the list for this location
      continue
    rv.tline.loc[str(int(ev['Start'])):str(int(ev['End']))] = 0
  #all day_evs have same rvli
  for day_sample in sample_li:
    day_sample.rvli = copy.deepcopy(rvli)


def get_pot_rvs(s):
  #build a new list instead of removing items while iterating over it
  s.rvli = RVList(
      rv for rv in s.rvli
      if check_availability(rv, s) and check_evtype(rv, s))
  check_feat(None, s) #probably better leave features; the rv argument is unused
    
def check_availability(rv, s):
  #available only if the timeline is all 0 between start and end of the sample
  return not rv.tline.loc[str(s.start):str(s.end)].any()
     
def check_evtype(rv, s):
  firstev = rvfirstev.loc[rv.id, str(s.evtype)] #1, NaN or date_int
  if pd.isna(firstev): #missing entries are read as NaN, not None
    return False
  return firstev <= s.start

def check_feat(rv, s):
  if s.gespever > 0: 
    rvlist = s.rvli
    rvlist = rvlist.filter(s.gespever, 'sex') #NOTE: the filtered list is not stored back on the sample yet
  #UMA and HWX just left with features

#iterating over pd rows not efficient -> could all be done with pd Dataframe
#but nested lists not working well with pd (like in samples)

sample_list = SampleList([Sample(s) for i, s in samples.iloc[:5].iterrows()])
rvlist_all = RVList([RV(r) for i, r in rvs.iterrows()])

rvli_d = {}
i = 0
for s in sample_list:
  print ('samplelist', i)
  i += 1
  get_example_features(s) #rvs, SHOULD BE ALWAYS SAME # OF RVS
  correct_rvs = [s.rv] + list(s.rv_eq) #correct answer; own name, so the `rvs` DataFrame is not overwritten
  '''relevance as key for RV (feature_key)
  features as key 'FEATure'
  '''
  #relevance of all rvs in order for the sample, mostly 1, 0, ...
  #corresponding features of the event
  #corresponding Y rvs with X features








samplelist 0


AttributeError: ignored

In [None]:
%debug

Next, we discuss data formats in more detail, and show how to generate and store dummy ranking data.

## Data Formats


### Data Formats for Ranking

For representing ranking data, [protobuffers](https://developers.google.com/protocol-buffers/) are extensible structures suitable for storing data in a serialized format, either locally or in a distributed manner.

Ranking usually consists of features corresponding to each of the examples being sorted. In addition, features related to query, user or session are also useful for ranking. We refer to these as context features, as these are independent of the examples.

We use the popular [tf.Example](https://www.tensorflow.org/tutorials/load_data/tf_records) proto to represent the features for context, and each of the examples. We use the protobuffer, **ExampleListWithContext** (ELWC), to store context as a tf.Example proto and the list of examples to be ranked as a list of tf.Example protos.

The ExampleListWithContext protobuffer is defined [here](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/input.proto#L72).

Let us create some dummy data in ELWC format. We will use this dummy data to show what the proto looks like.
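
As a rough picture of the layout, the nesting of an ELWC record can be mirrored in plain Python. This is a stand-in for illustration only, not the actual protobuffer API: the class `ExampleListWithContextSketch` and the dummy token values are made up, while the feature names `query_tokens`, `document_tokens` and `relevance` are the ones used later in this notebook.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# A feature map stands in for one tf.Example: feature name -> list of values.
FeatureMap = Dict[str, list]


@dataclass
class ExampleListWithContextSketch:
    """Plain-Python stand-in mirroring the two fields of the ELWC proto."""
    context: FeatureMap = field(default_factory=dict)          # one per query
    examples: List[FeatureMap] = field(default_factory=list)   # one per document


elwc = ExampleListWithContextSketch(
    context={"query_tokens": ["what", "is", "learning", "to", "rank", "?"]},
    examples=[
        {"document_tokens": ["ordering", "items", "by", "relevance"],
         "relevance": [3]},
        {"document_tokens": ["a", "kind", "of", "sorting"],
         "relevance": [1]},
    ],
)

print(len(elwc.examples))          # 2
print(sorted(elwc.examples[0]))    # ['document_tokens', 'relevance']
```

The context features are stored once per list, while the label travels with each example, which is why `relevance` appears inside every document entry.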

Download and install the TensorFlow 2 package.

In [None]:
print('Installing TensorFlow 2.1.0. This will take a minute, ignore the warnings.')
!pip install -q tensorflow==2.1.0
import tensorflow as tf

# This is needed for tensorboard compatibility.
!pip uninstall -y grpcio
!pip install -q "grpcio>=1.24.3"

Uninstalling grpcio-1.27.2:
  Successfully uninstalled grpcio-1.27.2


### Dependencies and Global Variables

Let us start by importing libraries that will be used throughout this notebook. TensorFlow 2 runs in "eager execution" mode by default, which is convenient for demonstration purposes.

In [None]:
import six
import os
import numpy as np

try:
  import tensorflow_ranking as tfr
except ImportError:
    !pip install -q tensorflow_ranking
    import tensorflow_ranking as tfr

Here we define the train and test paths, along with model hyperparameters.



In [None]:
# Store the paths to files containing training and test instances.
_TRAIN_DATA_PATH = "/tmp/train.tfrecords"
_TEST_DATA_PATH = "/tmp/test.tfrecords"

# Store the vocabulary path for query and document tokens.
_VOCAB_PATH = "/tmp/vocab.txt"

# The maximum number of documents per query in the dataset.
# Document lists are padded or truncated to this size.
_LIST_SIZE = 50

# The document relevance label.
_LABEL_FEATURE = "relevance"

# Padding labels are set negative so that the corresponding examples can be
# ignored in loss and metrics.
_PADDING_LABEL = -1

# Learning rate for optimizer.
_LEARNING_RATE = 0.05

# Parameters to the scoring function.
_BATCH_SIZE = 32
_HIDDEN_LAYER_DIMS = ["64", "32", "16"]
_DROPOUT_RATE = 0.8
_GROUP_SIZE = 1  # Pointwise scoring.

# Location of model directory and number of training steps.
_MODEL_DIR = "/tmp/ranking_model_dir"
_NUM_TRAIN_STEPS = 15 * 1000

## Components of a Ranking Estimator



The overall components of a Ranking Estimator are shown below.

The key components of the library are:

1.   Input Reader
2.   Transform Function
3.   Scoring Function
4.   Ranking Losses
5.   Ranking Metrics
6.   Ranking Head
7.   Model Builder

These are described in more detail in the following sections.

### TensorFlow Ranking Architecture

![tf_ranking_arch](https://user-images.githubusercontent.com/3262617/60061785-5f107980-96ab-11e9-9849-ace2d117220f.png)


### Specifying Features via Feature Columns

[Feature Columns](https://www.tensorflow.org/guide/feature_columns) are TensorFlow abstractions that are used to capture rich information about each feature. They allow for easy transformations of a diverse range of raw features and for interfacing with Estimators.

Consistent with our input formats for ranking, such as ELWC format, we create feature columns for context features and example features.

In [None]:
_EMBEDDING_DIMENSION = 20

def context_feature_columns():
  """Returns context feature names to column definitions."""
  sparse_column = tf.feature_column.categorical_column_with_vocabulary_file(
      key="query_tokens",
      vocabulary_file=_VOCAB_PATH)
  query_embedding_column = tf.feature_column.embedding_column(
      sparse_column, _EMBEDDING_DIMENSION)
  return {"query_tokens": query_embedding_column}

def example_feature_columns():
  """Returns the example feature columns."""
  sparse_column = tf.feature_column.categorical_column_with_vocabulary_file(
      key="document_tokens",
      vocabulary_file=_VOCAB_PATH)
  document_embedding_column = tf.feature_column.embedding_column(
      sparse_column, _EMBEDDING_DIMENSION)
  return {"document_tokens": document_embedding_column}

### Reading Input Data using *input_fn*

The input reader reads in data from persistent storage to produce raw dense and sparse tensors of appropriate type for each feature. Example features are represented by 3-D tensors (where dimensions correspond to queries, examples and feature values). Context features are represented by 2-D tensors (where dimensions correspond to queries and feature values).

In [None]:
def input_fn(path, num_epochs=None):
  context_feature_spec = tf.feature_column.make_parse_example_spec(
        context_feature_columns().values())
  label_column = tf.feature_column.numeric_column(
        _LABEL_FEATURE, dtype=tf.int64, default_value=_PADDING_LABEL)
  example_feature_spec = tf.feature_column.make_parse_example_spec(
        list(example_feature_columns().values()) + [label_column])
  dataset = tfr.data.build_ranking_dataset(
        file_pattern=path,
        data_format=tfr.data.ELWC,
        batch_size=_BATCH_SIZE,
        list_size=_LIST_SIZE,
        context_feature_spec=context_feature_spec,
        example_feature_spec=example_feature_spec,
        reader=tf.data.TFRecordDataset,
        shuffle=False,
        num_epochs=num_epochs)
  features = tf.compat.v1.data.make_one_shot_iterator(dataset).get_next()
  label = tf.squeeze(features.pop(_LABEL_FEATURE), axis=2)
  label = tf.cast(label, tf.float32)

  return features, label

In [None]:
feat, labs = input_fn(_TRAIN_DATA_PATH)
print(_TRAIN_DATA_PATH)
print('label', labs.shape)
for k, item in feat.items():
  print ('feat', k, item.shape)
print ('first 5 labels', labs[0,:5].numpy())
queries_t = tf.sparse.to_dense(feat['query_tokens']) # sparse tensor to dense
doc_t = tf.sparse.to_dense(feat['document_tokens'])
#print ('indices', query_st.indices[0][0]) # which index holds the first value
print ('query values', queries_t[0])
# TODO: double-check the slicing notation
print ('document values', doc_t[0,:5,:10].numpy()) #first x answers, first y words


W0228 10:11:59.097604 140021051340672 deprecation.py:323] From /usr/local/lib/python2.7/dist-packages/tensorflow_ranking/python/data.py:799: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_deterministic`.


/tmp/train.tfrecords
('label', TensorShape([32, 50]))
('feat', 'query_tokens', TensorShape([32, 36]))
('feat', 'document_tokens', TensorShape([32, 50, 810]))
('first 5 labels', array([1., 2., 3., 2., 3.], dtype=float32))
('query values', <tf.Tensor: shape=(36,), dtype=string, numpy=
array(['why', 'do', 'human', 'bee', '##ing', 'h', '##v', 'to', 'bel',
       '##ive', 'in', 'god', '?', '', '', '', '', '', '', '', '', '', '',
       '', '', '', '', '', '', '', '', '', '', '', '', ''], dtype=object)>)
('document values', array([['they', 'don', "'", 't', 'have', 'to', '.', 'usually', 'people',
        'are'],
       ['it', "'", 's', 'not', 'that', 'we', 'have', 'to', 'believe',
        'in'],
       ['it', "'", 's', 'instinct', '!', 'for', 'survival', '.', '.',
        'du'],
       ['human', 'beings', 'don', "'", 't', 'have', 'to', 'believe',
        'in', 'god'],
       ['they', 'don', "'", 't', 'but', 'it', 'comfort', '##s', 'some',
        'people']], dtype=object))


**Batch size: 32
-> 32 examples with 50 documents each, which are labeled with relevance!** 

('label', TensorShape([32, 50]))

('feat', 'query_tokens', TensorShape([32, 36])) **-> 32 queries with 36 context features**

('feat', 'document_tokens', TensorShape([32, 50, 810])) **-> 32 queries with 50 documents with 810 features**

### Feature Transformations with *transform_fn*

The transform function takes in the raw dense or sparse features from the input reader and applies suitable transformations to return dense representations for each feature. This is important before passing these features to a neural network, as neural network layers usually take dense features as inputs.

The transform function handles any custom feature transformations defined by the user. For handling sparse features, like text data, we provide an easy utility to create shared embeddings, based on the feature columns.

In [None]:
def make_transform_fn():
  def _transform_fn(features, mode):
    """Defines transform_fn."""
    context_features, example_features = tfr.feature.encode_listwise_features(
        features=features,
        context_feature_columns=context_feature_columns(),
        example_feature_columns=example_feature_columns(),
        mode=mode,
        scope="transform_layer")

    return context_features, example_features
  return _transform_fn

In [None]:
transfunc = make_transform_fn()
cont_feats, ex_feats = transfunc(feat, tf.estimator.ModeKeys.EVAL)


In [None]:
cont_feat_t = cont_feats['query_tokens']
examp_feat_t = ex_feats['document_tokens']
print ('cont_feat_t', cont_feat_t.shape)
print ('examp_feat_t', examp_feat_t.shape)

print ('first 5 cont features', cont_feat_t[0,:5].numpy())
print ('first 5 features von 1 doc', examp_feat_t[0,0,:5].numpy())


('cont_feat_t', TensorShape([32, 20]))
('examp_feat_t', TensorShape([32, 50, 20]))
('first 5 cont features', array([ 0.00775362,  0.07860542, -0.05217513, -0.0778469 ,  0.01583379],
      dtype=float32))
('first 5 features von 1 doc', array([-0.00228992, -0.01075753,  0.02653058, -0.03984784, -0.04151261],
      dtype=float32))
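
The drop from 36 query tokens to a 20-dimensional vector in the shapes above comes from the embedding column, which looks up one vector per token and combines them. The sketch below imitates that in plain Python under the assumption that the combiner is a mean, which is the documented default of `tf.feature_column.embedding_column`. The helper `embed_tokens`, the tiny vocabulary and the random table are hypothetical; in the real model the table is learned.

```python
import random


def embed_tokens(tokens, vocab, table):
    """Average the embedding rows of all tokens found in the vocabulary.

    Padding ("") and out-of-vocabulary tokens are skipped.
    """
    rows = [table[vocab[t]] for t in tokens if t in vocab]
    dim = len(table[0])
    if not rows:
        return [0.0] * dim
    return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]


random.seed(0)
vocab = {"why": 0, "do": 1, "god": 2}
# One 4-dimensional row per vocabulary entry (20-dimensional in the notebook).
table = [[random.uniform(-0.1, 0.1) for _ in range(4)] for _ in vocab]

query = ["why", "do", "god", "", ""]   # "" marks padding
dense = embed_tokens(query, vocab, table)
print(len(dense))  # 4
```

Whatever the number of tokens, the result has a fixed length equal to the embedding dimension, which is what the dense layers of the scoring function require.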



### Feature Interactions using *scoring_fn*

Next, we turn to the scoring function which is arguably at the heart of a TF Ranking model. The idea is to compute a relevance score for a (set of) query-document pair(s). The TF-Ranking model will use training data to learn this function.

Here we formulate a scoring function using a feed forward network. The function takes the features of a single example (i.e., query-document pair) and produces a relevance score.

In [None]:
def make_score_fn():
  """Returns a scoring function to build `EstimatorSpec`."""

  def _score_fn(context_features, group_features, mode, params, config):
    """Defines the network to score a group of documents."""
    with tf.compat.v1.name_scope("input_layer"):
      context_input = [
          tf.compat.v1.layers.flatten(context_features[name])
          for name in sorted(context_feature_columns())
      ]
      group_input = [
          tf.compat.v1.layers.flatten(group_features[name])
          for name in sorted(example_feature_columns())
      ]
      input_layer = tf.concat(context_input + group_input, 1)

    is_training = (mode == tf.estimator.ModeKeys.TRAIN)
    cur_layer = input_layer
    cur_layer = tf.compat.v1.layers.batch_normalization(
      cur_layer,
      training=is_training,
      momentum=0.99)

    for i, layer_width in enumerate(int(d) for d in _HIDDEN_LAYER_DIMS):
      cur_layer = tf.compat.v1.layers.dense(cur_layer, units=layer_width)
      cur_layer = tf.compat.v1.layers.batch_normalization(
        cur_layer,
        training=is_training,
        momentum=0.99)
      cur_layer = tf.nn.relu(cur_layer)
      cur_layer = tf.compat.v1.layers.dropout(
          inputs=cur_layer, rate=_DROPOUT_RATE, training=is_training)
    logits = tf.compat.v1.layers.dense(cur_layer, units=_GROUP_SIZE)
    return logits

  return _score_fn

## Losses, Metrics and Ranking Head

### Evaluation Metrics

We have provided an implementation of several popular Information Retrieval evaluation metrics in the TF Ranking library, which are shown [here](https://github.com/tensorflow/ranking/blob/d8c2e2e64a92923f1448cf5302c92a80bb469a20/tensorflow_ranking/python/metrics.py#L32). The user can also define a custom evaluation metric, as shown in the description below.

In [None]:
def eval_metric_fns():
  """Returns a dict from name to metric functions.

  This can be customized as follows. Care must be taken when handling padded
  lists.

  def _auc(labels, predictions, features):
    is_label_valid = tf.reshape(tf.greater_equal(labels, 0.), [-1, 1])
    clean_labels = tf.boolean_mask(tf.reshape(labels, [-1, 1]), is_label_valid)
    clean_pred = tf.boolean_mask(tf.reshape(predictions, [-1, 1]), is_label_valid)
    return tf.compat.v1.metrics.auc(clean_labels, tf.sigmoid(clean_pred), ...)
  metric_fns["auc"] = _auc

  Returns:
    A dict mapping from metric name to a metric function with above signature.
  """
  metric_fns = {}
  metric_fns.update({
      "metric/ndcg@%d" % topn: tfr.metrics.make_ranking_metric_fn(
          tfr.metrics.RankingMetricKey.NDCG, topn=topn)
      for topn in [1, 3, 5, 10]
  })

  return metric_fns
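
To make the NDCG metric and the handling of padded lists concrete, here is a minimal pure-Python sketch for a single list. It assumes the common exponential gain `2^label - 1` and a `log2` rank discount, and it treats every entry with a negative label as padding. The function names are hypothetical, and this is not the TF-Ranking implementation.

```python
import math


def dcg(labels):
    """Discounted cumulative gain of labels given in ranked order."""
    return sum((2 ** label - 1) / math.log2(rank + 1)
               for rank, label in enumerate(labels, start=1))


def ndcg_at_k(labels, scores, k):
    """NDCG@k for one list; entries with a negative label are ignored."""
    valid = [(s, l) for s, l in zip(scores, labels) if l >= 0]
    ranked = [l for _, l in sorted(valid, key=lambda p: p[0], reverse=True)]
    ideal = sorted((l for _, l in valid), reverse=True)
    ideal_dcg = dcg(ideal[:k])
    return dcg(ranked[:k]) / ideal_dcg if ideal_dcg > 0 else 0.0


labels = [3, 1, 2, -1]                 # last entry is padding
print(ndcg_at_k(labels, [0.9, 0.2, 0.5, 5.0], k=3))  # 1.0
print(ndcg_at_k(labels, [0.1, 0.9, 0.5, 5.0], k=3))  # below 1.0
```

The padded entry has the highest score in both calls, yet it does not affect the result, because it is filtered out before ranking.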

### Ranking Losses

We provide several popular ranking loss functions as part of the library, which are shown [here](https://github.com/tensorflow/ranking/blob/d8c2e2e64a92923f1448cf5302c92a80bb469a20/tensorflow_ranking/python/losses.py#L35). The user can also define a custom loss function, similar to ones in tfr.losses.

In [None]:
# Define a loss function. To find a complete list of available
# loss functions or to learn how to add your own custom function
# please refer to the tensorflow_ranking.losses module.

_LOSS = tfr.losses.RankingLossKey.APPROX_NDCG_LOSS
loss_fn = tfr.losses.make_loss_fn(_LOSS)
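
The idea behind an ApproxNDCG-style loss is to replace the non-differentiable rank of each item with a smooth approximation built from sigmoids of score differences, and then to plug these approximate ranks into the DCG formula. The sketch below shows this for one list without padding. It is a simplified illustration rather than the library's code, and the function name and the `temperature` default are assumptions.

```python
import math


def approx_ndcg_loss(labels, scores, temperature=1.0):
    """Negative ApproxNDCG for one list without padding."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    n = len(scores)
    # Smooth rank of item i: 1 + sum over other items j of
    # sigmoid((s_j - s_i) / T). Items with higher scores push the rank up.
    approx_ranks = [
        1.0 + sum(sigmoid((scores[j] - scores[i]) / temperature)
                  for j in range(n) if j != i)
        for i in range(n)
    ]
    gains = [2 ** label - 1 for label in labels]
    dcg = sum(g / math.log2(1.0 + r) for g, r in zip(gains, approx_ranks))
    ideal = sum(g / math.log2(1.0 + rank)
                for rank, g in enumerate(sorted(gains, reverse=True), start=1))
    return -dcg / ideal if ideal > 0 else 0.0


good = approx_ndcg_loss([3, 1, 0], [3.0, 1.0, -1.0])   # scores agree with labels
bad = approx_ndcg_loss([3, 1, 0], [-1.0, 1.0, 3.0])    # scores reversed
print(good < bad)  # True
```

A lower temperature makes the approximate ranks closer to the true ranks, at the cost of sharper and less informative gradients.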

### Ranking Head

In the Estimator workflow, Head is an abstraction that encapsulates losses and corresponding metrics. Head easily interfaces with the Estimator, needing the user to define a scoring function and specify losses and metric computation.

In [None]:
optimizer = tf.compat.v1.train.AdagradOptimizer(
    learning_rate=_LEARNING_RATE)

def _train_op_fn(loss):
  """Defines train op used in ranking head."""
  update_ops = tf.compat.v1.get_collection(tf.compat.v1.GraphKeys.UPDATE_OPS)
  minimize_op = optimizer.minimize(
      loss=loss, global_step=tf.compat.v1.train.get_global_step())
  train_op = tf.group([update_ops, minimize_op])
  return train_op

ranking_head = tfr.head.create_ranking_head(
      loss_fn=loss_fn,
      eval_metric_fns=eval_metric_fns(),
      train_op_fn=_train_op_fn)

## Putting It All Together in a Model Builder

We are now ready to put all of the components above together and create an `Estimator` that can be used to train and evaluate a model.

In [None]:
model_fn = tfr.model.make_groupwise_ranking_fn(
          group_score_fn=make_score_fn(),
          transform_fn=make_transform_fn(),
          group_size=_GROUP_SIZE,
          ranking_head=ranking_head)

### Train and evaluate the ranker

In [None]:
def train_and_eval_fn():
  """Train and eval function used by `tf.estimator.train_and_evaluate`."""
  run_config = tf.estimator.RunConfig(
      save_checkpoints_steps=1000)
  ranker = tf.estimator.Estimator(
      model_fn=model_fn,
      model_dir=_MODEL_DIR,
      config=run_config)

  train_input_fn = lambda: input_fn(_TRAIN_DATA_PATH)
  eval_input_fn = lambda: input_fn(_TEST_DATA_PATH, num_epochs=1)

  train_spec = tf.estimator.TrainSpec(
      input_fn=train_input_fn, max_steps=_NUM_TRAIN_STEPS)
  eval_spec =  tf.estimator.EvalSpec(
          name="eval",
          input_fn=eval_input_fn,
          throttle_secs=15)
  return (ranker, train_spec, eval_spec)

In [None]:
! rm -rf "/tmp/ranking_model_dir"  # Clean up the model directory.
ranker, train_spec, eval_spec = train_and_eval_fn()
tf.estimator.train_and_evaluate(ranker, train_spec, eval_spec)

### Launch TensorBoard

In [None]:
%load_ext tensorboard
%tensorboard --logdir="/tmp/ranking_model_dir" --port 12345

A sample TensorBoard output is shown here, with the ranking metrics.

![tensorboard](https://user-images.githubusercontent.com/3262617/60866646-be0edc00-a1dd-11e9-9599-eefb734ce801.png)


## Generating Predictions

We show how to generate predictions over the features of a dataset. We assume that the label is not present and needs to be inferred using the ranking model.

Similar to the `input_fn` used for training and evaluation, `predict_input_fn` reads data in ELWC format, stored as TFRecords, to generate features. We set the number of epochs to 1, so that the generator stops iterating when it reaches the end of the dataset. The datapoints are also not shuffled while reading, so that the behavior of the `predict()` function is deterministic.

In [None]:
def predict_input_fn(path):
  context_feature_spec = tf.feature_column.make_parse_example_spec(
        context_feature_columns().values())
  example_feature_spec = tf.feature_column.make_parse_example_spec(
        list(example_feature_columns().values()))
  dataset = tfr.data.build_ranking_dataset(
        file_pattern=path,
        data_format=tfr.data.ELWC,
        batch_size=_BATCH_SIZE,
        list_size=_LIST_SIZE,
        context_feature_spec=context_feature_spec,
        example_feature_spec=example_feature_spec,
        reader=tf.data.TFRecordDataset,
        shuffle=False,
        num_epochs=1)
  features = tf.compat.v1.data.make_one_shot_iterator(dataset).get_next()
  return features

We generate predictions on the test dataset, where we only consider context and example features and predict the labels. The `predict_input_fn` yields the features for one batch of datapoints at a time. Batching allows us to iterate over large datasets which cannot be loaded in memory.

In [None]:
predictions = ranker.predict(input_fn=lambda: predict_input_fn("/tmp/test.tfrecords"))

`ranker.predict` returns a generator, which we can iterate over to create predictions, till the generator is exhausted.

In [None]:
x = next(predictions)
assert(len(x) == _LIST_SIZE)  ## Note that this includes padding.
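
Each prediction is a list of scores of length `_LIST_SIZE`, so turning it into a ranking means sorting the positions of the real documents by score. The sketch below assumes that padding sits at the end of the list and that the number of real documents per query is known from the data. The helper `rank_documents` and the example scores are hypothetical.

```python
def rank_documents(scores, num_real_docs):
    """Return the indices of the real documents, best score first."""
    real = range(min(num_real_docs, len(scores)))
    return sorted(real, key=lambda i: scores[i], reverse=True)


scores = [0.2, 1.7, -0.4, 0.0, 0.0]   # last two entries belong to padding
print(rank_documents(scores, num_real_docs=3))  # [1, 0, 2]
```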