difference between implementation and LCF-paper #10
Not sure if relevant, but the implementation at https://github.com/songyouwei/ABSA-PyTorch/blob/master/models/lcf_bert.py differs from both the implementation here and the description in the paper. It also uses only one self-attention (compared to the three mentioned in the paper), but applies it after linear_double (https://github.com/songyouwei/ABSA-PyTorch/blob/b925abcdf43c51e475a64ed04984d4911b88676d/models/lcf_bert.py#L116), whereas in this repository the self-attention is applied before linear_double (https://github.com/yangheng95/LC-ABSA/blob/c945a94e0f86116c5578245aa9ad36c46c7b9c4a/models/lc_apc/lcf_bert.py#L56).
Thank you for your concern. I modified the model when I cleaned and refactored the code, because I found that in some cases the model achieves better results without the last MHSA, and removing it also reduces computation. Since the paper has already been published, we are only able to revise minor problems in it.
Alright, thanks for letting me know. Since you mentioned the current implementation seems to perform better than the one described in the paper, I'll also stick with the current implementation. FYI: note the other issue opened in the ABSA-PyTorch repository, where there is also a difference between their implementation and your implementation here.
There is a difference between LCF-BERT as described in the paper and the implementation here. Specifically, the paper mentions three self-attentions (see, for example, Figure 3 in https://www.mdpi.com/2076-3417/9/16/3389/htm): one on the output of the local BERT, one on the output of the global BERT, and one in the feature interactive learning layer.
However, the implementation here contains only one self-attention, applied only to the local BERT output. The self-attention on the global BERT output and the one in the feature interactive learning layer are currently missing from the code. Could you let me know why that is?
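For anyone comparing the two: the structural difference can be sketched roughly as follows. This is a simplified, hypothetical PyTorch sketch, not the actual code from either repository; `LCFVariant`, the `hidden` size, and the use of a shared `nn.MultiheadAttention` module are illustrative assumptions.

```python
import torch
import torch.nn as nn


class LCFVariant(nn.Module):
    """Sketch contrasting the two orderings discussed above.

    mode="paper": three self-attentions -- one on the local BERT output,
    one on the global BERT output, and one after the feature interactive
    learning (concatenation + linear) layer.
    mode="repo":  a single self-attention, applied only to the local BERT
    output, before the concatenation + linear ("linear_double") layer.
    """

    def __init__(self, hidden=768, num_heads=8, mode="repo"):
        super().__init__()
        self.mode = mode
        # Shared MHSA module for brevity; the real models may use
        # separate attention layers for each position.
        self.attn = nn.MultiheadAttention(hidden, num_heads, batch_first=True)
        # "linear_double" projects the concatenated (local, global)
        # features back down to the hidden size.
        self.linear_double = nn.Linear(hidden * 2, hidden)

    def forward(self, local_out, global_out):
        if self.mode == "paper":
            local_out, _ = self.attn(local_out, local_out, local_out)
            global_out, _ = self.attn(global_out, global_out, global_out)
            x = self.linear_double(torch.cat([local_out, global_out], dim=-1))
            x, _ = self.attn(x, x, x)  # third MHSA, in the interactive layer
        else:  # "repo": one MHSA on the local output only, before linear_double
            local_out, _ = self.attn(local_out, local_out, local_out)
            x = self.linear_double(torch.cat([local_out, global_out], dim=-1))
        return x
```

Both variants consume the local and global BERT outputs of shape `(batch, seq_len, hidden)` and return a tensor of the same shape; the only difference is how many self-attentions are applied and where.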