Question about alphabet #20
Comments
Sorry, maybe I have not described the problem clearly. The output size of hidden2tag is 5.
I think I get your point. I had the same concern while writing this framework. The "START" and "END" labels are used for the CRF calculation. In the CRF layer, I set some of the default transition scores to -10000 to prevent "START" and "END" from being decoded. To keep the code simple, "START" and "END" are also included in the model's softmax output. In theory, the model could decode the invalid labels "START"/"END", but this is almost impossible on real data: once the model is trained, it will not decode labels that never appear in the training data.
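The transition-score trick described above can be sketched as follows. This is a minimal, hypothetical illustration (the function name and initialization are not the repo's actual code), showing how a CRF transition matrix can assign -10000 to any transition into "START" or out of "END" so the Viterbi decoder never picks those paths:

```python
def init_transitions(label_size, start_idx, end_idx):
    # transitions[i][j] = score of moving from label i to label j.
    # 0.0 stands in for a learned parameter; in a real model these
    # entries would be trainable weights.
    transitions = [[0.0] * label_size for _ in range(label_size)]
    for i in range(label_size):
        transitions[i][start_idx] = -10000.0  # no label may transition INTO "START"
        transitions[end_idx][i] = -10000.0    # "END" may not transition into any label
    return transitions

t = init_transitions(7, start_idx=5, end_idx=6)
```

Because -10000 dominates any learned score, every decoded path that enters "START" or leaves "END" scores far below a valid path, so those labels are effectively unreachable during decoding.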
Thank you.
Exactly!
Hello, I have a question about the alphabet.
In the file alphabet.py, the function size() returns len(self.instances) + 1; I think that is because of the padding label /pad. But in the file seqmodel.py, why do we have to add two more labels for the downstream LSTM? Although we use the original label size for the CRF, the CRF model still adds "START" and "END" to the transition matrix, which confuses me. And if I do not use the CRF,
_, tag_seq = torch.max(outs, 1)
may lead to a wrong index. Thank you~
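For illustration, the sizes the question refers to can be sketched like this. The variable names are hypothetical, not the repo's actual API; only the "+1 for padding" and "+2 for START/END" arithmetic is taken from the discussion:

```python
# A toy label set for a 5-label tagging task.
labels = ["B-PER", "I-PER", "B-LOC", "I-LOC", "O"]

# alphabet.size() counts one extra slot for the padding index,
# mirroring len(self.instances) + 1 in alphabet.py.
alphabet_size = len(labels) + 1       # padding included

# The downstream layer reserves two more slots for the CRF's
# special "START" and "END" labels, giving the +2 in seqmodel.py.
model_label_size = alphabet_size + 2
start_idx, end_idx = alphabet_size, alphabet_size + 1
```

Under this layout, an argmax over all model_label_size outputs could in principle return start_idx or end_idx, which is the invalid-index concern raised above when the CRF layer is not used.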