Fix rankers' implementation and indexing errors #320

Closed
wants to merge 10 commits into from

Conversation

dipanshu124
Contributor

This PR is a follow-up to #278 and attempts to resolve the remaining issues.

VaibhavKansagara and others added 10 commits January 27, 2021 16:30
As the implementation is not correct, it is better to revisit
or rewrite this in the future.
Before, we were dealing with a single vector of FeatureVector objects, i.e. one
FeatureVector per entry in the training set. Now we have a separate vector
per query in the training set. Previously, query IDs were completely ignored.
For more information on why this change is needed, see the algorithm behind
ListNetRanker at
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-2007-40.pdf
(page 5), and for ListMleRanker see
http://icml2008.cs.helsinki.fi/papers/167.pdf (page 6). If you look closely,
you will notice that the current implementation doesn't take the query into
account, which is clearly wrong.
This change applies to both ListNetRanker and ListMleRanker.
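To illustrate why per-query grouping matters, here is a minimal Python sketch of a ListNet-style top-one loss computed separately for each query. It is not the code from this PR: the function names (`group_by_query`, `listnet_loss`), the row layout and the toy data are all assumptions made for illustration, and a linear scoring function is assumed.

```python
import math
from collections import defaultdict


def group_by_query(rows):
    """Group (query_id, label, features) rows into one list per query."""
    groups = defaultdict(list)
    for qid, label, features in rows:
        groups[qid].append((label, features))
    return dict(groups)


def softmax(values):
    """Numerically stable softmax over the documents of a single query."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(docs, weights):
    """Cross entropy between top-one distributions of labels and scores."""
    labels = [label for label, _ in docs]
    # Hypothetical linear scorer: dot product of weights and features.
    scores = [sum(w * x for w, x in zip(weights, feats)) for _, feats in docs]
    p_true = softmax(labels)
    p_pred = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))


# Toy training set (made-up values): two documents for q1, one for q2.
rows = [
    ("q1", 2.0, [1.0, 0.0]),
    ("q1", 0.0, [0.0, 1.0]),
    ("q2", 1.0, [0.5, 0.5]),
]
groups = group_by_query(rows)
losses = {qid: listnet_loss(docs, [0.1, -0.1]) for qid, docs in groups.items()}
```

The softmax is taken only over documents that share a query ID, so documents from different queries never compete with each other in the loss.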
The motivation for Xavier initialization in Neural Networks is to
initialize the weights of the network so that the neuron activation
functions are not starting out in saturated or dead regions. In other
words, we want to initialize the parameters with random values that are
not “too small” and not “too large.”
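As a rough sketch of the idea, the uniform variant of Xavier (Glorot) initialisation draws each weight from a range whose width depends on the layer's fan-in and fan-out. The function name `xavier_uniform` and the layer sizes below are assumptions for illustration; the PR may use a different variant.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Draw a fan_in x fan_out weight matrix from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]


# Hypothetical layer: 19 input features feeding a single output neuron.
rng = random.Random(0)
weights = xavier_uniform(19, 1, rng)
```

Scaling the range by the layer size keeps the initial weighted sums in a moderate range, so the activation function does not start out saturated.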
The gradient used to update the parameters for each query is divided by the
number of documents associated with that query; otherwise a query with more
documents would simply be given more weight.
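A minimal sketch of this normalisation, assuming plain gradient descent and one gradient vector per document; `normalised_update` and its parameters are hypothetical names, not identifiers from the PR.

```python
def normalised_update(weights, per_doc_grads, learning_rate):
    """Apply one query's update, averaging the gradient over its documents."""
    n_docs = len(per_doc_grads)
    # Sum the per-document gradients component by component.
    summed = [sum(column) for column in zip(*per_doc_grads)]
    return [w - learning_rate * g / n_docs for w, g in zip(weights, summed)]


# Two queries with the same per-document gradient but different sizes.
small_query = [[1.0, 2.0]] * 2
large_query = [[1.0, 2.0]] * 10
update_small = normalised_update([0.0, 0.0], small_query, 0.1)
update_large = normalised_update([0.0, 0.0], large_query, 0.1)
```

With the division in place both queries move the weights by the same amount; without it, the ten-document query would push five times as hard as the two-document one.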
In effect, a bias value allows you to shift the activation function
to the left or right, which may be critical for successful learning.
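The shift can be seen with a single sigmoid neuron: the input at which the output crosses 0.5 is `-bias / weight`, so changing the bias moves the curve along the input axis. The helper names below are assumptions for illustration only.

```python
import math


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, weight, bias):
    """Output of a single neuron with one input."""
    return sigmoid(weight * x + bias)


def midpoint(weight, bias):
    """Input value at which the neuron's output equals 0.5."""
    return -bias / weight


no_bias = neuron(0.0, 1.0, 0.0)     # midpoint sits at x = 0
with_bias = neuron(2.0, 1.0, -2.0)  # midpoint shifted right to x = 2
```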
This is due to the combined changes from fixing the ranker implementations,
fixing indexing errors, adding Xavier initialisation, adding gradient
normalisation and adding a bias. This also fixes the scorer test, which was
wrong earlier.
@ojwb
Contributor

ojwb commented Jun 29, 2022

Thanks for your work on this, and sorry for having dropped the ball on getting it reviewed and merged.

We've since switched CI from travis-ci to GHA, so I've rebased your branch onto current master, updated the new CI to remove the libsvm stuff, and opened a new PR: #324

Closing this, will review and try to actually get this merged via the new PR.

@ojwb ojwb closed this Jun 29, 2022