Attention display bug: issues with Chinese characters #28
If the input is English, it displays correctly, but Chinese text is not displayed completely. |
Can you provide the link to your code / model / dataset so we can reproduce, if possible? |
I changed pretrained_lms.py to use a PyTorch Chinese BERT model and added attention to the output spec.
In pretrained_lm_demo.py, my model displays correctly with English input but not with Chinese input; the same happens with the gpt2 model. |
I also attempted to change attention_module.ts, but it didn't work. |
We will reproduce this locally and work on a fix. Thanks for discovering the issue! |
Same problem here, any news? @bigprince97 @jameswex Thanks |
Sorry for the lack of updates. The issue is that the attention_module rendering logic (https://github.com/PAIR-code/lit/blob/main/lit_nlp/client/modules/attention_module.ts#L109) assumes that, because the font is fixed-width, every character takes up the same fixed width in pixels, and places its lines based on that. But Chinese characters render wider even in a fixed-width font, so the math for placing the X positions of the attention lines is wrong. The correct number of attention lines is drawn, but they are squeezed into too small a space, the text gets cut off because of that, and the tokens shown don't line up with the lines they belong to. We'll work on fixing this. In the meantime, you could try changing the width setting at the line referenced above (and rebuilding the client) to see if you can get the spacing to look correct for your use case. But we'll fix it so it works correctly regardless of language. |
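To illustrate the problem described above, here is a small sketch (not the actual attention_module.ts code; all names are illustrative): attention-line anchors computed from measured per-token pixel widths stay correct, while the fixed per-character assumption drifts for CJK text. In a browser, the real measurement would come from a 2D canvas context's `measureText`; the `cjkAware` heuristic below merely stands in for that.

```typescript
// Sketch: compute the x-center of each token, given a function that
// measures a token's pixel width and a gap between adjacent tokens.
function tokenCenters(
    tokens: string[],
    measure: (tok: string) => number,  // pixel width of one token
    gap: number): number[] {
  const centers: number[] = [];
  let x = 0;
  for (const tok of tokens) {
    const w = measure(tok);
    centers.push(x + w / 2);  // anchor the attention line at the center
    x += w + gap;
  }
  return centers;
}

// Fixed-width assumption: every character is charWidth pixels wide.
// This is the assumption that breaks for CJK glyphs.
const fixedWidth = (charWidth: number) => (tok: string) =>
    tok.length * charWidth;

// Rough stand-in for canvas measureText(): treat characters above the
// basic Latin/punctuation range as double-width (a heuristic only).
const cjkAware = (charWidth: number) => (tok: string) => {
  let w = 0;
  for (const ch of tok) {
    w += ch.charCodeAt(0) > 0x2e7f ? 2 * charWidth : charWidth;
  }
  return w;
};
```

With Latin input the two measures agree, so the bug stays hidden; with Chinese input `fixedWidth` reports half the true width, which is why the lines are squeezed into too small a space.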
To rebuild the client, see https://github.com/PAIR-code/lit/#download-and-installation, specifically the "yarn && yarn build" command. |
I changed the width and rebuilt successfully. The attention graph changed with the different width, but it still looks wrong. Looking forward to your fix. Thanks. |
Was align=("tokens", "tokens") changed to this: lit_types.AttentionHeads(align_in="tokens", align_out="tokens")? |
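For reference, a minimal sketch of where that spec field lives, using the `lit_types.AttentionHeads` form quoted in the comment above. This assumes LIT's `lit_nlp.api.types` module is importable and that the model exposes `"tokens"` and `"attention"` output fields; treat it as an illustrative fragment, not the exact code in pretrained_lms.py.

```python
# Illustrative output-spec fragment (field names assumed, not verbatim).
from lit_nlp.api import types as lit_types

def output_spec():
  return {
      "tokens": lit_types.Tokens(),
      # Older LIT releases used align=("tokens", "tokens"); the form below
      # is the align_in/align_out variant referenced in this thread.
      "attention": lit_types.AttentionHeads(
          align_in="tokens", align_out="tokens"),
  }
```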
When I display gpt2 or bert attention, it's truncated and doesn't show the whole thing. How can I fix this?