High Performance of Biased Methods #5
This is an open question, and I cannot give a strict answer. But I can share my understanding:
(1) NLL is more consistent with the training objective in our implementation: MSE loss and NLL are both point-wise, whereas AUC and NDCG are ranking metrics. There is a gap between AUC/NDCG and debiasing evaluation, so I think NLL is better suited than AUC and NDCG for evaluating the DEBIASING performance of debiasing methods. According to the results, the debiasing methods (IPS/DR/CausE) do exhibit better performance than MF_combine under NLL.
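The point-wise vs. ranking distinction can be made concrete with a toy sketch (not from the repo; all numbers are hypothetical): two models that rank the items identically get the same AUC, yet differ in NLL because NLL scores each predicted probability independently.

```python
import math

labels = [1, 0, 1, 0]             # hypothetical binary feedback
p_a = [0.9, 0.8, 0.7, 0.6]        # model A: overconfident probabilities
p_b = [0.6, 0.55, 0.5, 0.45]      # model B: same ordering, milder probabilities

def nll(y, p):
    # point-wise: each prediction is scored independently of the others
    return -sum(math.log(pi) if yi else math.log(1 - pi)
                for yi, pi in zip(y, p)) / len(y)

def auc(y, p):
    # ranking: only the relative order of the scores matters
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((pi > ni) + 0.5 * (pi == ni) for pi in pos for ni in neg)
    return wins / (len(pos) * len(neg))

print(auc(labels, p_a), auc(labels, p_b))  # identical: 0.75 and 0.75
print(nll(labels, p_a), nll(labels, p_b))  # different NLL values
```

So a method can improve NLL (better-calibrated, debiased probability estimates) without moving AUC at all, which is why the two metrics can disagree in the tables.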
(2) AUC and NDCG can be used to evaluate generalization ability, and to measure how much benefit the debiasing methods bring to the item-ranking task. Although MF_combine is not designed for debiasing, it is an intuitive and straightforward way to use the random data, and this intuitive approach may well be more effective at improving the generalization ability of the model.
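The "combine" idea described above can be sketched as simply summing the same point-wise loss over both data sources instead of reweighting the biased logs. This is a minimal illustration, not the repo's implementation; the function names, the `alpha` trade-off weight, and the toy batches are all assumptions.

```python
import math

def bce(batch):
    # mean binary cross-entropy over (prediction, label) pairs
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, y in batch) / len(batch)

def combined_loss(biased_batch, uniform_batch, alpha=1.0):
    # train on the biased logs plus the small uniformly-logged set;
    # alpha (hypothetical) trades off the uniform set against the logs
    return bce(biased_batch) + alpha * bce(uniform_batch)

biased = [(0.8, 1), (0.3, 0)]     # toy (prediction, label) pairs
uniform = [(0.6, 1), (0.5, 0)]
print(round(combined_loss(biased, uniform), 3))  # → 0.892
```

Setting `alpha=0` recovers plain biased MF, so MF_combine can be read as biased MF with extra, unbiased supervision mixed in.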
(3) Although AUC and NDCG are inconsistent with debiasing evaluation, they are consistent with the recommendation setting. They can therefore still be important metrics for evaluating debiasing in recommender systems.
The above is my understanding; I hope it helps.
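For context, the IPS objective discussed in this thread reweights each observed interaction by the inverse of its exposure probability. A generic sketch (not the repo's exact implementation; the propensity values here are hypothetical):

```python
import math

def ips_loss(batch, propensities):
    # Inverse-propensity-scored point-wise loss: each observed
    # (prediction, label) pair is reweighted by 1 / p(observed),
    # so rarely-exposed items count more in the objective.
    total = 0.0
    for (p, y), prop in zip(batch, propensities):
        bce = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += bce / prop
    return total / len(batch)

batch = [(0.8, 1), (0.3, 0)]   # toy (prediction, label) pairs
props = [0.9, 0.2]             # hypothetical exposure probabilities
print(round(ips_loss(batch, props), 3))  # → 1.016
```

With all propensities equal to 1 this reduces to the ordinary point-wise loss, which shows where IPS differs from biased MF: only in how the observed samples are weighted, which is exactly what NLL, not AUC/NDCG, is positioned to reward.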
XPBooster ***@***.***> wrote on Friday, 11 November 2022 at 15:17:
Hello, I am very happy to see the release of AutoDebias.
When I try to reproduce the MF_biased and MF_combine approaches, they exhibit much better performance than the IPS/DR/CausE approaches. The MF_combine approach, in particular, reaches an AUC of 0.735-0.737, which is competitive with AutoDebias. I'm very curious why the IPS/DR/CausE approaches fail in this implementation, and why the biased/combine approaches perform better than the debiased approaches. This issue is important to me because this strange result makes me question the validity of the fundamental causal approaches (IPS/DR) in practice.
Thanks!