In the rank objective, lambdas and hessians need to factor sigmoid_ into the computation. #2322
Conversation
… Additionally, the sigmoid function has an arbitrary factor of 2 in the exponent; it is not only non-standard, the gradients are also computed incorrectly.
I am also proposing to remove what looks like a heuristic that normalizes the gradients by the difference in document scores. This is not part of the LambdaMART algorithm and, at any rate, does not help with convergence rate or generalization. An alternative to this change is to make such heuristics optional. I have also trained models on the MSLR Web30k (Fold 1) and Yahoo! Learning-to-Rank Challenge (Set 1) datasets, and the results are neither better nor worse than a model trained from HEAD: NDCG@5 on Web30k before the change is 49.20 (+/- 0.07) and after the change it is 49.05 (+/- 0.09), with sigmoid_ set to 2 post-change.
Update: I wanted to share an updated NDCG@5 after much fine-tuning on validation sets: Web30k Fold 1: 49.39 (+/- 0.08). As noted in an earlier comment, queries with no relevant documents were discarded from the test set. Here are the hyperparameters I used, should you wish to reproduce the results, for Web30k (Yahoo): learning rate is 0.05 (0.05), num_leaves is 400 (400), min_data_in_leaf is 50 (100), min_sum_hessian_in_leaf is 200 (10), and sigmoid_ is set to 2.
- p_lambda *= -delta_pair_NDCG;
- p_hessian *= 2 * delta_pair_NDCG;
+ p_lambda *= -sigmoid_ * delta_pair_NDCG;
+ p_hessian *= sigmoid_ * sigmoid_ * delta_pair_NDCG;
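For context, here is a minimal, self-contained sketch of the full pairwise computation after the change. This is an illustration, not LightGBM's actual code: PairGrad, LambdaMartPair, and the score arguments are hypothetical names, and the rho term is the part that p_lambda and p_hessian already hold before the multiplications above.

```cpp
#include <cmath>

struct PairGrad {
  double lambda;   // first-order term for the pair
  double hessian;  // second-order term for the pair
};

// Hypothetical sketch: LambdaMART pair gradient with sigmoid_ factored in,
// scaled by |delta NDCG| from swapping the two documents.
PairGrad LambdaMartPair(double score_high, double score_low,
                        double delta_pair_ndcg, double sigmoid) {
  // rho = 1 / (1 + exp(sigmoid * (s_i - s_j)))
  const double rho =
      1.0 / (1.0 + std::exp(sigmoid * (score_high - score_low)));
  PairGrad g;
  g.lambda = -sigmoid * rho * delta_pair_ndcg;                          // dC/ds_i
  g.hessian = sigmoid * sigmoid * rho * (1.0 - rho) * delta_pair_ndcg;  // d2C/ds_i^2
  return g;
}
```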
I am thinking about eliminating sigmoid_ here. That is, no sigmoid_ in p_lambda, and only sigmoid_ in p_hessian. What is your opinion?
So the LambdaMART (implicit) cost function has the following form:
log(1 + exp(-sigmoid_ * (s_i - s_j)))
The first derivative of this expression carries a sigmoid_ factor, so mathematically p_lambda in the code must include sigmoid_ in its computation; the second derivative carries sigmoid_^2, as proposed in the code change.
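Spelled out, with sigma standing for sigmoid_ and rho for the pairwise probability term:

```latex
C_{ij} = \log\!\left(1 + e^{-\sigma (s_i - s_j)}\right)

\frac{\partial C_{ij}}{\partial s_i}
  = \frac{-\sigma}{1 + e^{\sigma (s_i - s_j)}}
  = -\sigma \rho,
  \qquad \rho \equiv \frac{1}{1 + e^{\sigma (s_i - s_j)}}

\frac{\partial^2 C_{ij}}{\partial s_i^2} = \sigma^2 \rho \,(1 - \rho)
```

Multiplying both derivatives by the |delta NDCG| of the pair yields exactly the -sigmoid_ and sigmoid_ * sigmoid_ factors in the diff above.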
For more details, please see Section 7 of this paper: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.180.634&rep=rep1&type=pdf
Thanks for accepting and merging the pull request. I'd certainly be happy to send another PR and add the normalization piece back, conditioned on a boolean; but a couple of questions: (1) Would it make sense to add a flag to the config file (e.g., lambdarank_normalize_gradients_by_score_diff)? (2) Could it be false by default in the public library and set to true in internal configuration?
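For illustration only, a minimal sketch of what gating the heuristic behind such a flag might look like; the flag name, the exact normalization formula, and all identifiers here are assumptions, not LightGBM's actual code:

```cpp
#include <cmath>

// Hypothetical config flag: false by default in the public library,
// flipped to true in an internal configuration.
bool lambdarank_normalize_gradients_by_score_diff = false;

// Assumed form of the heuristic: damp a pair's update by the score gap.
void MaybeNormalizePair(double* p_lambda, double* p_hessian,
                        double score_high, double score_low) {
  if (!lambdarank_normalize_gradients_by_score_diff) return;
  const double scale = 1.0 / (1.0 + std::fabs(score_high - score_low));
  *p_lambda *= scale;
  *p_hessian *= scale;
}
```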
…On Fri, Aug 16, 2019 at 8:48 PM Guolin Ke <***@***.***> wrote:
@SBrush <https://github.com/SBrush> would you mind creating another PR for the normalization? We can add a bool parameter for that function. And I think it is better to have it on by default, for behaviour consistent with before.