Can someone please clarify how the NDCG metric is computed? Looking at ranking/tensorflow_ranking/python/metrics.py (line 457 in d8c2e2e):

> predictions: A `Tensor` with shape [batch_size, list_size]. Each value is ...
I see that a "ranking score" is used to order examples.
My first question is: how is this "ranking score" computed? Code examples that I found:
https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/handling_sparse_features.ipynb
or
https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/tf_ranking_tfrecord.py#L328
use

```python
logits = tf.compat.v1.layers.dense(cur_layer, units=_GROUP_SIZE)
```
This suggests that for each example (an answer to a query), the NN computes a softmax over _GROUP_SIZE logits to decide which rank (position within the group) is most appropriate for that example. When _GROUP_SIZE = 1, the output would just be a score. But what if _GROUP_SIZE > 1? How would a single score value per document be computed from the logits?
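My current guess, sketched in plain NumPy, is that each document appears in several groups, receives one logit per appearance, and its final ranking score is the average of those logits. To be clear, the group assembly and the averaging scheme here are my assumptions, not code taken from the library:

```python
import numpy as np

def aggregate_group_logits(group_indices, group_logits, num_docs):
    """Average the logits each document receives across its groups.

    group_indices: [num_groups, g] array of document ids in each group.
    group_logits:  [num_groups, g] array, one logit per slot of each group.
    Returns a per-document score of shape [num_docs].
    """
    totals = np.zeros(num_docs)
    counts = np.zeros(num_docs)
    for idx_row, logit_row in zip(group_indices, group_logits):
        for doc, logit in zip(idx_row, logit_row):
            totals[doc] += logit
            counts[doc] += 1
    # Guard against documents that never appeared in any group.
    return totals / np.maximum(counts, 1)

# Example: 3 documents scored in groups of size g = 2.
groups = np.array([[0, 1], [1, 2], [2, 0]])
logits = np.array([[0.4, 0.1], [0.3, 0.9], [0.7, 0.2]])
print(aggregate_group_logits(groups, logits, num_docs=3))
```

If that reading is right, _GROUP_SIZE > 1 still yields one scalar score per document, just estimated from several group-context evaluations instead of one.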
Next question: the same docstring line in ranking/tensorflow_ranking/python/metrics.py (line 457 in d8c2e2e),

> predictions: A `Tensor` with shape [batch_size, list_size]. Each value is ...

also suggests that predictions is a Tensor with shape [batch_size, list_size]. On which line of
https://github.com/tensorflow/ranking/blob/master/tensorflow_ranking/examples/tf_ranking_tfrecord.py
are such tensors formed? The tutorial
https://colab.research.google.com/github/tensorflow/ranking/blob/master/tensorflow_ranking/examples/handling_sparse_features.ipynb also does not mention grouping examples into such tensors.
If the data is indeed organized into such tensors, then I can see how a proper NDCG would be computed: it would be the average of the NDCG values computed for each row of this tensor, where each row is a set of list_size documents scored with the standard NDCG formula.
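To make the computation I have in mind concrete, here is a plain-NumPy sketch. The exponential gain (2^rel - 1) and log2 position discount are my assumptions about the exact formula; the per-row-then-average structure is the point:

```python
import numpy as np

def dcg(rels):
    # Discounted cumulative gain: sum over 0-based positions i of
    # (2^rel - 1) / log2(i + 2).
    return sum((2.0 ** r - 1.0) / np.log2(i + 2) for i, r in enumerate(rels))

def ndcg(predictions, labels):
    # predictions, labels: arrays of shape [batch_size, list_size].
    # For each row: order the labels by descending prediction score,
    # compute DCG, normalize by the ideal DCG, then average over rows.
    per_query = []
    for scores, rels in zip(predictions, labels):
        order = np.argsort(-scores)           # rank documents by score
        dcg_val = dcg(rels[order])
        ideal = dcg(np.sort(rels)[::-1])      # best possible ordering
        per_query.append(dcg_val / ideal if ideal > 0 else 0.0)
    return float(np.mean(per_query))

preds = np.array([[0.9, 0.2, 0.5],
                  [0.1, 0.8, 0.3]])
labels = np.array([[1.0, 0.0, 2.0],
                   [0.0, 1.0, 0.0]])
print(ndcg(preds, labels))
```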
What if _GROUP_SIZE < _LIST_SIZE? Would just the top _GROUP_SIZE documents for each query be used to compute NDCG?