LSTM Learnable Hidden State #899

Merged: 4 commits merged into flairNLP:master on Jul 18, 2019

Conversation

@myaldiz commented Jul 17, 2019:

This adds a learnable hidden state feature. I did not thoroughly test whether it improves accuracy for the sequence_tagger.

@alanakbik I'm not sure how best to initialize the hidden state, so I just used rand for now.

(Forgive me if I skipped some steps of the PR process; I'd appreciate your guidance.)

- Fixes forward pass in models without learnable initial hidden state
- Fixes loading serialized models
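
For readers following along, here is a minimal sketch of what a learnable initial hidden state for an LSTM can look like in PyTorch. This is an illustration of the technique, not flair's actual implementation; the class and parameter names are hypothetical, and the rand initialization mirrors what the PR describes.

```python
import torch
import torch.nn as nn

class LSTMWithLearnableInit(nn.Module):
    """Sketch: an LSTM whose initial hidden/cell states are trainable
    parameters instead of the default zeros. Names are illustrative."""

    def __init__(self, input_size, hidden_size, num_layers=1,
                 bidirectional=True, learn_initial_state=False):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers,
                            bidirectional=bidirectional, batch_first=True)
        self.learn_initial_state = learn_initial_state
        num_directions = 2 if bidirectional else 1
        if learn_initial_state:
            # rand-based initialization, as described in this PR; shape is
            # (num_layers * num_directions, 1, hidden_size)
            shape = (num_layers * num_directions, 1, hidden_size)
            self.h0 = nn.Parameter(torch.rand(*shape))
            self.c0 = nn.Parameter(torch.rand(*shape))

    def forward(self, x):
        # x: (batch, seq_len, input_size)
        if self.learn_initial_state:
            batch_size = x.size(0)
            # expand the single learned state across the batch
            h0 = self.h0.expand(-1, batch_size, -1).contiguous()
            c0 = self.c0.expand(-1, batch_size, -1).contiguous()
            output, _ = self.lstm(x, (h0, c0))
        else:
            # without learned states, PyTorch uses zero-valued initial states
            output, _ = self.lstm(x)
        return output
```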
@alanakbik (Collaborator) commented:

I've fixed some small issues with the code, so the unit tests now run through. I've also done a quick experiment using learnable hidden states but I cannot say if this makes much of a difference.

@myaldiz (Author) commented Jul 18, 2019:

> I've fixed some small issues with the code, so the unit tests now run through. I've also done a quick experiment using learnable hidden states but I cannot say if this makes much of a difference.

I think it could be one of three reasons.

First, if you tried CoNLL-03, accuracy is already very high and the default network is shallow, so the benefit may simply be unobservable. How about trying harder tasks with larger datasets and higher-capacity networks?

Second, our initialization might be the problem; it could improve with some experimentation (a sketch of a few variants follows below).

Third, maybe it really just does not make much of a difference :)
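
On the initialization point, here are a few variants one could compare empirically as a starting place for that experimentation. These are hypothetical suggestions, not something tested in this PR.

```python
import torch
import torch.nn as nn

num_states, hidden_size = 2, 256  # e.g. num_layers * num_directions, hidden dim

# Uniform [0, 1): matches the rand call currently in the PR
h0_rand = nn.Parameter(torch.rand(num_states, 1, hidden_size))

# Zeros: starts identical to PyTorch's default, then drifts during training
h0_zeros = nn.Parameter(torch.zeros(num_states, 1, hidden_size))

# Small Gaussian: keeps early activations close to the zero-state regime
h0_small = nn.Parameter(torch.randn(num_states, 1, hidden_size) * 0.01)

# Xavier-style uniform, scaled by fan-in/fan-out
h0_xavier = nn.Parameter(nn.init.xavier_uniform_(torch.empty(num_states, 1, hidden_size)))
```

The zeros variant is probably the cleanest ablation, since any difference from the baseline then comes purely from the states being trainable rather than from their starting values.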

@alanakbik (Collaborator) commented:

That's true :) Would you like to experiment with initialization, or should we go ahead and merge this as it is?

@myaldiz (Author) commented Jul 18, 2019:

> That's true :) Would you like to experiment with initialization, or should we go ahead and merge this as it is?

Let's merge it as it is, and maybe change the default parameter to false so it doesn't cause problems. If I can find some time later, I will experiment and post the results here. In the meantime, we could add a comment in the code so that others can share their results too.

@alanakbik (Collaborator) commented:

👍

@yosipk (Collaborator) commented Jul 18, 2019:

👍

@yosipk merged commit 8b50e2e into flairNLP:master on Jul 18, 2019.