LSTM Learnable Hidden State #899
Conversation
- Fixes forward pass in models without learnable initial hidden state
- Fixes loading serialized models
I've fixed some small issues with the code, so the unit tests now pass. I've also run a quick experiment using learnable hidden states, but I cannot say whether it makes much of a difference.
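The idea under discussion can be sketched as a minimal PyTorch module (a hypothetical `LearnableInitLSTM`, not flair's actual implementation): instead of the default all-zero initial states, `h0` and `c0` are registered as `nn.Parameter`s so the optimizer updates them, and they are expanded over the batch dimension in the forward pass.

```python
import torch
import torch.nn as nn

class LearnableInitLSTM(nn.Module):
    """Hypothetical sketch of an LSTM with a learnable initial hidden state."""

    def __init__(self, input_size, hidden_size, learnable_init=True):
        super().__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        if learnable_init:
            # Learned initial states; shape (num_layers, 1, hidden_size),
            # broadcast over the batch dimension at forward time.
            self.h0 = nn.Parameter(torch.randn(1, 1, hidden_size))
            self.c0 = nn.Parameter(torch.randn(1, 1, hidden_size))
        else:
            self.h0 = self.c0 = None

    def forward(self, x):
        if self.h0 is not None:
            batch = x.size(0)
            init = (self.h0.expand(1, batch, self.hidden_size).contiguous(),
                    self.c0.expand(1, batch, self.hidden_size).contiguous())
        else:
            init = None  # PyTorch then falls back to zero initial states
        return self.lstm(x, init)
```

Because the initial states are ordinary parameters, they receive gradients like any other weight; with `learnable_init=False` the module behaves like a plain LSTM with zero initial states.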
I think it could be one of three reasons. First, if you tried CoNLL-03, accuracy is already very high and the default network is shallow, so any benefit may be unobservable. How about trying harder tasks with larger datasets and higher-capacity networks? Second, our initialization might be the problem; it could improve with some experimentation. Third, maybe it actually just does not make much difference :)
That's true :) Would you like to experiment with initialization, or should we go ahead and merge this as it is?
Let's just merge it as it is, and maybe change the default parameter to false to avoid causing problems. Later, if I can find some time, I will experiment and post the results here. In the meantime, we can add a comment to the code so that other people can post their results about it too.
👍 |
Regarding the learnable hidden state feature, I did not thoroughly test whether accuracy for the sequence_tagger gets better or not.
@alanakbik I'm not sure how to initialize the hidden state, though; I just used rand for now.
(Forgive me if I skipped some steps in the PR process; I'd appreciate your guidance.)
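On the initialization question: `torch.rand` draws uniformly from [0, 1), which gives strictly positive starting states. A few common alternatives could be compared in later experiments; the sketch below (with an assumed `hidden_size` of 8, purely for illustration) shows three candidates, all of which remain trainable afterwards.

```python
import torch
import torch.nn as nn

hidden_size = 8  # assumed size, for illustration only

# Uniform in [0, 1), as currently used in the PR.
h0_rand = nn.Parameter(torch.rand(1, 1, hidden_size))

# Small zero-mean normal values, a common alternative
# that starts closer to the default zero state.
h0_normal = nn.Parameter(torch.randn(1, 1, hidden_size) * 0.01)

# Zeros: identical to PyTorch's default at initialization,
# but trainable from the first update onward.
h0_zero = nn.Parameter(torch.zeros(1, 1, hidden_size))
```

Starting from zeros has the nice property that the model is exactly equivalent to the non-learnable baseline at step 0, so any later difference in accuracy can be attributed to the learned state.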