In [1]:
!pip install -q tensorflow-recommenders
!pip install -q --upgrade tensorflow-datasets
!pip install -q tensorflow-ranking


# Package Versions

In [None]:
!pip freeze

absl-py==1.2.0
aiohttp==3.8.1
aiosignal==1.2.0
alabaster==0.7.12
albumentations==0.1.12
altair==4.2.0
appdirs==1.4.4
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
arviz==0.12.1
astor==0.8.1
astropy==4.3.1
astunparse==1.6.3
async-timeout==4.0.2
asynctest==0.13.0
atari-py==0.2.9
atomicwrites==1.4.1
attrs==21.4.0
audioread==2.1.9
autograd==1.4
Babel==2.10.3
backcall==0.2.0
beautifulsoup4==4.6.3
bleach==5.0.1
blis==0.7.8
bokeh==2.3.3
branca==0.5.0
bs4==0.0.1
CacheControl==0.12.11
cached-property==1.5.2
cachetools==4.2.4
catalogue==2.0.8
certifi==2022.6.15
cffi==1.15.1
cftime==1.6.1
chardet==3.0.4
charset-normalizer==2.1.0
click==7.1.2
clikit==0.6.2
cloudpickle==1.3.0
cmake==3.22.5
cmdstanpy==1.0.4
colorcet==3.0.0
colorlover==0.3.0
community==1.0.0b1
contextlib2==0.5.5
convertdate==2.4.0
coverage==3.7.1
coveralls==0.5
crashtest==0.3.1
crcmod==1.7
cufflinks==0.17.3
cvxopt==1.2.7
cvxpy==1.0.31
cycler==0.11.0
cymem==2.0.6
Cython==0.29.30
daft==0.0.4
dask==2.12.0
datascience==0.10.6
debugpy=

# Import Packages

In [2]:
import pprint

import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_ranking as tfr
import tensorflow_recommenders as tfrs
from tensorflow_ranking import utils





# Understanding tfr.keras.losses.ListMLELoss()

## Purpose of ListMLELoss

Given the following dataset: <br>
Note1: The score columns hold the scores estimated by the neural network. <br>
Note2: The order of preference is given. Think of it as the response variable Y, except that it is an ordered list rather than a label or a numeric value. <br>
<table>
  <thead>
    <tr>
      <td>Product A Score[See Note1]</td>
      <td>Product B Score[See Note1]</td>
      <td>Product C Score[See Note1]</td>
      <td>Order of Preference[See Note2]</td>
    </tr>
  </thead> 
  <tr>
      <td>0.5</td>
      <td>0.8</td>
      <td>0.4</td>
      <td>C > A > B or equivalently [2,1,3]</td>
  </tr>
  <tr>
      <td>-0.3</td>
      <td>0.2</td>
      <td>1.9</td>
      <td>B > A > C or equivalently [2,3,1] </td>
  </tr>
</table>

Our goal here is to compute the negative log-likelihood of seeing this dataset.
Taking row 1 as an example:
<table>
  <thead>
    <tr>
      <td>Product A Score</td>
      <td>Product B Score</td>
      <td>Product C Score</td>
      <td>Order of Preference</td>
    </tr>
  </thead> 
  <tr>
      <td>0.5</td>
      <td>0.8</td>
      <td>0.4</td>
      <td>C > A > B</td>
  </tr>
</table>

The probability of seeing C in 1st place, A in 2nd place and B in 3rd place is:

$$ P(C>A>B | Scores=[0.5,0.8,0.4]) = \dfrac{e^{0.4}}{e^{0.4} + e^{0.5} + e^{0.8}} \dfrac{e^{0.5}}{ e^{0.5} + e^{0.8}} \dfrac{e^{0.8}}{ e^{0.8}} $$

The ListMLELoss for this single sample is then the negative log of this probability:

$$ListMLELoss_{Row1} = -\log( P(C>A>B | Scores=[0.5,0.8,0.4]) )$$
$$= -\left(\log \dfrac{e^{0.4}}{e^{0.4} + e^{0.5} + e^{0.8}} + \log \dfrac{e^{0.5}}{e^{0.5} + e^{0.8}} + \log \dfrac{e^{0.8}}{e^{0.8}} \right)$$
$$ = - (\log(e^{0.4}) - \log(e^{0.4} + e^{0.5} + e^{0.8}) ) - (\log(e^{0.5}) - \log(e^{0.5} + e^{0.8}) ) - (\log(e^{0.8}) - \log( e^{0.8}) ) $$
$$ = \log(e^{0.4} + e^{0.5} + e^{0.8}) + \log(e^{0.5} + e^{0.8}) + \log( e^{0.8}) -(0.4 + 0.5 + 0.8)$$
$$ = 2.1344543 $$
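
The derivation above can also be checked by hand in plain Python. This is a minimal sketch, not part of TensorFlow Ranking: the helper name `plackett_luce_nll` is made up for illustration, and it assumes the scores are passed already arranged in preference order (most preferred first).

```python
import math

def plackett_luce_nll(scores_in_preference_order):
    """Negative log-probability of an ordering (hypothetical helper).

    At each position the chosen item competes against every item that has
    not been placed yet, so the denominator sums over the remaining tail.
    """
    nll = 0.0
    for k, score in enumerate(scores_in_preference_order):
        remaining = scores_in_preference_order[k:]
        log_denominator = math.log(sum(math.exp(s) for s in remaining))
        nll += log_denominator - score
    return nll

# Row 1: preference C > A > B, so the scores are listed as C, A, B.
row1_loss = plackett_luce_nll([0.4, 0.5, 0.8])
print(round(row1_loss, 4))  # 2.1345
```

The result agrees with the value 2.1344543 obtained in the derivation, up to rounding.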

Let's use TensorFlow to calculate this:

In [3]:
y_true1 = tf.constant([[2., 1.,3.]]) ##### this allows you to specify the order of preference; higher the value, higher the priority
y_pred1 = tf.constant([[0.5, 0.8,0.4] ])
loss = tfr.keras.losses.ListMLELoss()
l1 = loss(y_true1, y_pred1).numpy()
l1

2.1344543

Similarly, for Row 2 the ListMLELoss is:

In [4]:
y_true2 = tf.constant([[2., 3.,1.]]) ##### this allows you to specify the order of preference; higher the value, higher the priority
y_pred2 = tf.constant([[-.3, 0.2,1.9] ])
loss = tfr.keras.losses.ListMLELoss()
l2 = loss(y_true2, y_pred2).numpy()
l2

4.2624245

The average loss over the dataset is:

In [None]:
y_true = tf.constant([[2., 1.,3.],##### this allows you to specify the order of preference
                      [2., 3.,1.]]) 
y_pred = tf.constant([[0.5, 0.8,0.4],
                      [-.3, 0.2,1.9]])
loss = tfr.keras.losses.ListMLELoss()
l = loss(y_true, y_pred).numpy()
l , np.mean([l1,l2])

(3.1984394, 3.1984394)
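
The batch result can be reproduced without TensorFlow by computing each row's loss and averaging. The sketch below assumes there are no ties in the labels; `list_mle` is an illustrative helper name, not a library function.

```python
import math

def list_mle(labels, scores):
    """Per-list loss: sort by label (highest first), then apply the chain rule.

    Hypothetical helper; assumes the labels contain no ties.
    """
    order = sorted(range(len(labels)), key=lambda i: labels[i], reverse=True)
    ordered = [scores[i] for i in order]
    return sum(
        math.log(sum(math.exp(s) for s in ordered[k:])) - ordered[k]
        for k in range(len(ordered))
    )

y_true = [[2.0, 1.0, 3.0], [2.0, 3.0, 1.0]]
y_pred = [[0.5, 0.8, 0.4], [-0.3, 0.2, 1.9]]

per_row = [list_mle(t, p) for t, p in zip(y_true, y_pred)]
batch_loss = sum(per_row) / len(per_row)  # mean over the lists in the batch
print([round(v, 4) for v in per_row], round(batch_loss, 4))
# [2.1345, 4.2624] 3.1984
```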

As a side note, the order of preference only conveys the preferred ranking of the products. Internally, TensorFlow Ranking sorts the items by these values, so only their relative order matters, not their magnitude. Going back to the row 1 example:<br>
if we set `y_true1 = tf.constant([[22., 11.,33.]])` or<br> `y_true1 = tf.constant([[2.2, 1.1,3.3]])`,<br>
we still get the same result.

In [28]:
y_true1 = tf.constant([[22., 11.,33.]]) ##### this allows you to specify the order of preference; higher the value, higher the priority
y_pred1 = tf.constant([[0.5, 0.8,0.4] ])
loss = tfr.keras.losses.ListMLELoss()
l1 = loss(y_true1, y_pred1).numpy()
l1

2.1344543

In [29]:
y_true1 = tf.constant([[2.2, 1.1,3.3]]) ##### this allows you to specify the order of preference; higher the value, higher the priority
y_pred1 = tf.constant([[0.5, 0.8,0.4] ])
loss = tfr.keras.losses.ListMLELoss()
l1 = loss(y_true1, y_pred1).numpy()
l1

2.1344543
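
One way to see why these label vectors are interchangeable is to look at the ordering they induce. The sketch below uses a made-up helper, `preference_order`, which returns the item indices from most to least preferred. It only illustrates the idea and is not how the library exposes it.

```python
def preference_order(labels):
    """Indices of the items from most to least preferred (hypothetical helper)."""
    return sorted(range(len(labels)), key=lambda i: labels[i], reverse=True)

for labels in ([2.0, 1.0, 3.0], [22.0, 11.0, 33.0], [2.2, 1.1, 3.3]):
    print(labels, "->", preference_order(labels))
# Each line ends with [2, 0, 1], i.e. C, then A, then B.
```

Since the loss only depends on this ordering and on the predicted scores, all three label vectors give the same value.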

# Cautions - Order of Preference Tie Breaking
From the implementation, I discovered that ties in the order of preference result in random behavior. To see this in action, consider the following example:

<table>
  <thead>
    <tr>
      <td>Product A Score</td>
      <td>Product B Score</td>
      <td>Product C Score</td>
      <td>Product D Score</td>
      <td>Product E Score</td>
      <td>Product F Score</td>
      <td>Product G Score</td>
      <td>Product H Score</td>
      <td>Product I Score</td>
      <td>Product J Score</td>
      <td>Order of Preference</td>
    </tr>
  </thead> 
  <tr>
      <td>0.1</td>
      <td>0.2</td>
      <td>0.3</td>
      <td>0.4</td>
      <td>0.5</td>
      <td>0.6</td>
      <td>0.7</td>
      <td>0.8</td>
      <td>0.9</td>
      <td>1.0</td>
      <td>J > others or equivalently [0,...,0,1] </td>
  </tr>
</table>

Let's run this several times in TensorFlow:

In [31]:
for i in range(10):
  tf.random.set_seed(i)
  y_true = tf.constant([[0.,0.,0.,0.,0.,0.,0.,0.,0.,1.]]) 
  y_pred = tf.constant([ [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ] ])
  loss = tfr.keras.losses.ListMLELoss()
  l = loss(y_true, y_pred).numpy()
  print(f'trial#{i}',l)

trial#0 15.452932
trial#1 15.356228
trial#2 13.896395
trial#3 14.8929
trial#4 15.433755
trial#5 14.246873
trial#6 15.556316
trial#7 15.693258
trial#8 13.839419
trial#9 14.375983


This behavior is caused by the ties in the order of preference. TensorFlow Ranking shuffles tied items internally before computing the loss, so the result depends on the random seed.
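
To get a feel for how far the loss can move, the sketch below bounds it over every possible ordering of the nine tied items and then samples a few random orderings. It is a plain-Python illustration under stated assumptions: Python's `random` module stands in for the library's internal shuffling, so individual samples will not match the trial values above, and `nll_for_order` is a hypothetical helper.

```python
import math
import random

def nll_for_order(ordered_scores):
    """Loss for scores already arranged in preference order (hypothetical helper)."""
    return sum(
        math.log(sum(math.exp(s) for s in ordered_scores[k:])) - ordered_scores[k]
        for k in range(len(ordered_scores))
    )

top = 1.0                                              # Product J, the only untied item
tied = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]   # Products A..I, all labelled 0

# Placing high scores early shrinks every remaining denominator (smallest loss);
# placing them late does the opposite (largest loss).
lowest = nll_for_order([top] + sorted(tied, reverse=True))
highest = nll_for_order([top] + sorted(tied))

rng = random.Random(0)
samples = []
for _ in range(1000):
    shuffled = tied[:]
    rng.shuffle(shuffled)
    samples.append(nll_for_order([top] + shuffled))

print(f"bounds over all tie orders: [{lowest:.4f}, {highest:.4f}]")
print(f"range seen in 1000 samples: [{min(samples):.4f}, {max(samples):.4f}]")
```

Every sampled value falls between the two bounds, and the ten trial values printed above fall inside that interval as well.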

# Cautions - Negative Order of Preference Values

Another discovery is that if the order of preference takes on negative values, TensorFlow Ranking applies special handling that may yield unwanted or incorrect results. Consider the following example:

<table>
  <thead>
    <tr>
      <td>Product A Score</td>
      <td>Product B Score</td>
      <td>Product C Score</td>
      <td>Product D Score</td>
      <td>Product E Score</td>
      <td>Product F Score</td>
      <td>Product G Score</td>
      <td>Product H Score</td>
      <td>Product I Score</td>
      <td>Product J Score</td>
      <td>Order of Preference</td>
    </tr>
  </thead> 
  <tr>
      <td>0.1</td>
      <td>0.2</td>
      <td>0.3</td>
      <td>0.4</td>
      <td>0.5</td>
      <td>0.6</td>
      <td>0.7</td>
      <td>0.8</td>
      <td>0.9</td>
      <td>1.0</td>
      <td>J>A>B>C>...>I</td>
  </tr>
</table>

To describe the order J>A>B>C>...>I, we can use either of the following representations:
<br>
Option 1 (with negative values): [0,-1,-2,-3,-4,-5,-6,-7,-8,1]
<br>
Option 2 (without negative values): [9., 8., 7., 6., 5., 4., 3., 2., 1., 10. ]

Let's try it in TensorFlow:

In [45]:
tf.random.set_seed(123)
y_true = tf.constant([ [0., -1., -2., -3., -4., -5., -6., -7., -8., 1. ] ]) 
y_pred = tf.constant([ [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ] ])
loss = tfr.keras.losses.ListMLELoss()
l = loss(y_true, y_pred).numpy()
print('Option 1 representation:', l)

Option 1 representation: 10.945755


In [46]:
tf.random.set_seed(123)
y_true = tf.constant([ [9., 8., 7., 6., 5., 4., 3., 2., 1., 10. ] ]) 
y_pred = tf.constant([ [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ] ])
loss = tfr.keras.losses.ListMLELoss()
l = loss(y_true, y_pred).numpy()
print('Option 2 representation:', l)

Option 2 representation: 16.609795


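If you calculate this by hand, you will find that the Option 2 representation gives the correct result. Option 1 goes wrong because negative values are treated as invalid (padded) entries and masked out, so only the items labelled 0 and 1 take part in the ranking. Hence, as a word of advice, avoid negative values in the order of preference.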
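
If the preferences already exist in a form that contains negative numbers, one possible workaround is to shift them so that the smallest value becomes 0. This is only a sketch: `shift_to_nonnegative` is a made-up helper, and it assumes none of the negative values are meant to mark padding.

```python
def shift_to_nonnegative(labels):
    """Subtract the minimum so every label is >= 0 while the order is unchanged."""
    lowest = min(labels)
    return [y - lowest for y in labels]

option1 = [0.0, -1.0, -2.0, -3.0, -4.0, -5.0, -6.0, -7.0, -8.0, 1.0]
shifted = shift_to_nonnegative(option1)
print(shifted)  # [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0, 9.0]
```

The shifted labels describe the same order, J>A>B>...>I, as Option 2, because only the relative order of the values matters.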

# Playground: 
I have copied the TensorFlow Ranking implementation below so that you can examine its behavior in more detail.

In [42]:
from tensorflow_ranking import utils

#####function parameters
labels = tf.constant([ [0., -1., -2., -3., -4., -5., -6., -7., -8., 1. ] ])  ##### Order of Preference; higher value = higher priority
logits = tf.constant([ [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1. ] ]) ###### scores from neural network


######
_EPSILON = 1e-10
mask = None



#######tensorflow implementation
if mask is None:
  mask = utils.is_label_valid(labels)


# Reset the masked labels to 0 and reset the masked logits to a logit with
# ~= 0 contribution.
labels = tf.compat.v1.where(mask, labels, tf.zeros_like(labels))

logits = tf.compat.v1.where(mask, logits,
                            tf.math.log(_EPSILON) * tf.ones_like(logits))


scores = tf.compat.v1.where(
    mask, labels,
    tf.reduce_min(input_tensor=labels, axis=1, keepdims=True) -
    1e-6 * tf.ones_like(labels))


# Use a fixed ops-level seed and the randomness is controlled by the
# graph-level seed.
sorted_labels, sorted_logits = utils.sort_by_scores(
    scores, [labels, logits], shuffle_ties=True, seed=37)


raw_max = tf.reduce_max(input_tensor=sorted_logits, axis=1, keepdims=True)

sorted_logits = sorted_logits - raw_max

sums = tf.cumsum(tf.exp(sorted_logits), axis=1, reverse=True)

sums = tf.math.log(sums) - sorted_logits

negative_log_likelihood = tf.reduce_sum(
    input_tensor=sums, axis=1, keepdims=True)
negative_log_likelihood, tf.ones_like(negative_log_likelihood)

<tf.Tensor: shape=(1, 10), dtype=bool, numpy=
array([[ True, False, False, False, False, False, False, False, False,
         True]])>
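
The same steps can be traced in plain Python to see where the Option 1 value of 10.945755 comes from. This is a sketch under one assumption, consistent with the mask printed above: a label counts as valid only when it is greater than or equal to 0. The name `masked_list_mle` is illustrative, and tie shuffling is left out because the two valid labels in this example are distinct.

```python
import math

EPSILON = 1e-10

def masked_list_mle(labels, logits):
    # Assumption: a label is valid when it is >= 0, matching the mask printed above.
    valid = [y >= 0 for y in labels]
    labels = [y if ok else 0.0 for y, ok in zip(labels, valid)]
    logits = [s if ok else math.log(EPSILON) for s, ok in zip(logits, valid)]

    # Masked items get a sort key just below the smallest label, so they go last.
    floor = min(labels) - 1e-6
    keys = [y if ok else floor for y, ok in zip(labels, valid)]
    order = sorted(range(len(keys)), key=lambda i: keys[i], reverse=True)
    sorted_logits = [logits[i] for i in order]

    # Same stable steps as the cell above: shift by the max, then sum over tails.
    peak = max(sorted_logits)
    shifted = [s - peak for s in sorted_logits]
    total = 0.0
    for k in range(len(shifted)):
        tail = sum(math.exp(s) for s in shifted[k:])
        total += math.log(tail) - shifted[k]
    return total

labels = [0.0, -1.0, -2.0, -3.0, -4.0, -5.0, -6.0, -7.0, -8.0, 1.0]
logits = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
loss = masked_list_mle(labels, logits)

# Only J (1.0) and A (0.1) are ranked; the 8 masked items share one tiny logit,
# so together they contribute log(8!) regardless of how they are shuffled.
closed_form = math.log(1 + math.exp(0.1 - 1.0)) + math.log(math.factorial(8))
print(f"{loss:.3f} {closed_form:.3f}")
```

Both numbers should agree with the Option 1 output to within float32 rounding, which suggests that most of that loss is an artifact of the eight masked items rather than a statement about the intended ranking.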