Bug in AdaGrad implementation #99

Open
bhargav opened this issue Feb 5, 2017 · 2 comments

Comments

@bhargav
Contributor

bhargav commented Feb 5, 2017

AdaGrad does not grow its weight vector while learning. The required weight vector dimension can increase when feature extraction on previously unseen training examples produces new features.

Cause:
https://github.com/IllinoisCogComp/lbjava/blob/master/lbjava/src/main/java/edu/illinois/cs/cogcomp/lbjava/learn/AdaGrad.java#L189

Example feature:

discrete MyTestFeature(MyData d) <- {
    return d.isCapitalized() ? "YES" : "NO";
}

For this example, the weight vector should have size 3: one weight each for YES and NO, plus the bias term. But exampleFeatures.length is only 1 here.

Compare with the implementation of StochasticGradientDescent.
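
To make the size mismatch concrete, here is a minimal sketch and not the actual LBJava code. It assumes the lexicon assigns index 0 to the first feature value seen and index 1 to the second; the class and method names are hypothetical. It contrasts a size derived from the length of one example's feature array with a size derived from the largest feature index.

```java
class WeightSizeSketch {
    /** Size taken from the number of active features in one example, plus a bias slot. */
    static int sizeFromExampleLength(int[] exampleFeatures) {
        return exampleFeatures.length + 1;
    }

    /** Size taken from the largest feature index in the example, plus a bias slot. */
    static int sizeFromMaxIndex(int[] exampleFeatures) {
        int max = -1;
        for (int index : exampleFeatures) {
            max = Math.max(max, index);
        }
        return max + 2; // slots for indices 0..max, then one bias slot
    }

    public static void main(String[] args) {
        // Assumed lexicon index of the second feature value seen (hypothetical).
        int[] second = {1};
        System.out.println(sizeFromExampleLength(second)); // one active feature + bias
        System.out.println(sizeFromMaxIndex(second));      // indices 0 and 1 + bias
    }
}
```

Under these assumptions, an example carrying only the second feature value still has length 1, so a length-based size leaves no dedicated slot for index 1.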

@danyaljj
Member

danyaljj commented Feb 6, 2017

I understand the issue ("For this example, weight vector should have size 3"), but I don't understand why it is caused by how AdaGrad is implemented. Growth in the number of features should be taken care of automatically by the lexicon. And it looks like the learn(.) function in AdaGrad (and its weight vector) is sized from the length of the input features. Not sure where the problem might be arising ...

@bhargav
Contributor Author

bhargav commented Feb 7, 2017

For the example that I provided, exampleFeatures.length would be equal to 1, since we only have one discrete feature per example. Hence the weight vector would be initialized with size 2 (1 for the feature + 1 for the bias term). Although the lexicon would have two features, the parameters passed to learn are just the features present in the current training example.
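
One possible direction, sketched under assumptions rather than taken from the LBJava source: grow the weight vector (and the AdaGrad accumulators) whenever an example carries a feature index beyond the current length. The class name, the separate bias field, and the squared-loss update are all illustrative choices, not LBJava's API.

```java
import java.util.Arrays;

class GrowableAdaGradSketch {
    private double[] weights = new double[0];     // one weight per lexicon index
    private double[] gradSquares = new double[0]; // per-weight sum of squared gradients
    private double bias = 0.0;                    // bias kept apart so growth never moves it
    private double biasGradSquares = 0.0;
    private final double learningRate;

    GrowableAdaGradSketch(double learningRate) {
        this.learningRate = learningRate;
    }

    /** Grows both arrays so every index in the example has a slot. */
    private void ensureCapacity(int[] exampleFeatures) {
        int needed = weights.length;
        for (int index : exampleFeatures) {
            needed = Math.max(needed, index + 1);
        }
        if (needed > weights.length) {
            weights = Arrays.copyOf(weights, needed);         // new slots start at 0.0
            gradSquares = Arrays.copyOf(gradSquares, needed);
        }
    }

    double score(int[] exampleFeatures, double[] exampleValues) {
        double s = bias;
        for (int i = 0; i < exampleFeatures.length; i++) {
            int index = exampleFeatures[i];
            if (index < weights.length) {
                s += weights[index] * exampleValues[i];
            }
        }
        return s;
    }

    /** One AdaGrad step on the squared loss 0.5 * (score - target)^2 (assumed loss). */
    void learn(int[] exampleFeatures, double[] exampleValues, double target) {
        ensureCapacity(exampleFeatures);
        double error = score(exampleFeatures, exampleValues) - target;
        for (int i = 0; i < exampleFeatures.length; i++) {
            int index = exampleFeatures[i];
            double g = error * exampleValues[i];
            gradSquares[index] += g * g;
            if (gradSquares[index] > 0) {
                weights[index] -= learningRate * g / Math.sqrt(gradSquares[index]);
            }
        }
        biasGradSquares += error * error;
        if (biasGradSquares > 0) {
            bias -= learningRate * error / Math.sqrt(biasGradSquares);
        }
    }

    /** Feature weights plus the bias term. */
    int size() {
        return weights.length + 1;
    }
}
```

With this sketch, learning on an example with index 0 and then on one with index 1 takes the size from 2 to 3, which matches the YES, NO, bias count expected above.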
