Why does the pooled_output use just the first token to represent the whole sentence? #196
Comments
Because the first token is [CLS], which after fine-tuning is trained to represent the whole sentence. If you are interested in using (pretrained/fine-tuned) BERT for sentence encoding, please refer to my repo: https://github.com/hanxiao/bert-as-service and in particular its FAQ.
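For reference, this is all that pooled_output computes: the pooler in this repo's modeling.py applies a dense layer with a tanh activation to the hidden state at position 0 (the [CLS] token) and to nothing else. A minimal NumPy re-creation, with made-up shapes and random weights just for illustration:

```python
import numpy as np

# Hypothetical shapes: batch of 2 sentences, 5 tokens each, hidden size 768.
batch, seq_len, hidden = 2, 5, 768
sequence_output = np.random.randn(batch, seq_len, hidden).astype(np.float32)

# The pooler keeps only the hidden state of the first token ([CLS]) ...
first_token_tensor = sequence_output[:, 0, :]        # (batch, hidden)

# ... and passes it through a dense layer with tanh activation.
W = (np.random.randn(hidden, hidden) * 0.02).astype(np.float32)
b = np.zeros(hidden, dtype=np.float32)
pooled_output = np.tanh(first_token_tensor @ W + b)  # (batch, hidden)
```

So pooled_output is, by construction, a transform of the first token's representation only.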
btw, here is a visualization that may help you understand the different BERT layers: https://github.com/hanxiao/bert-as-service#q-so-which-layer-and-which-pooling-strategy-is-the-best
Why do you say that after fine-tuning, [CLS] (the first token) represents the whole sentence? Why can't it represent the sentence before fine-tuning?
Because BERT is bidirectional, [CLS] is encoded with representative information from all tokens through the multi-layer encoding procedure, so its representation differs from sentence to sentence.
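To make the "information from all tokens" point concrete, here is a toy single-head self-attention step in NumPy; the vectors and sizes are invented for illustration (one layer, no query/key/value projections), but it shows that the updated [CLS] vector is a weighted sum over every position:

```python
import numpy as np

# Toy self-attention over 4 token vectors; position 0 plays the role of [CLS].
np.random.seed(0)
H = np.random.randn(4, 8)                       # (tokens, hidden)

scores = H @ H.T / np.sqrt(8)                   # (4, 4) attention logits
weights = np.exp(scores)
weights /= weights.sum(axis=-1, keepdims=True)  # softmax over all tokens

# The new [CLS] vector mixes in every token's representation,
# so sentence-wide information flows into position 0 at each layer.
cls_out = weights[0] @ H                        # (hidden,)
```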
Hi, I want sentence representations for my downstream tasks. Any idea on how to do this?
BERT_BASE_DIR="/home/cuiyi/repos/bert/model/chinese_L-12_H-768_A-12"
python extract_features.py

Modify BERT_BASE_DIR to your new model path.
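Note that extract_features.py also expects the usual flags described in the repo README (e.g. --input_file, --output_file, --vocab_file, --bert_config_file, --init_checkpoint). Once it has run, here is a sketch for pooling its output into one vector per sentence; it assumes the script's default JSONL output format (one JSON object per input line, each token carrying a "layers" list with "index" and "values"), and the function name is made up:

```python
import json
import numpy as np

def sentence_vectors(path, layer_index=-1):
    """Pool extract_features.py output into one vector per input sentence."""
    vecs = []
    with open(path) as f:
        for line in f:
            record = json.loads(line)
            # Collect the chosen layer's vector for every token.
            token_vecs = [
                layer["values"]
                for feat in record["features"]
                for layer in feat["layers"]
                if layer["index"] == layer_index
            ]
            # Mean-pool over tokens; token_vecs[0] alone would be [CLS].
            vecs.append(np.mean(token_vecs, axis=0))
    return np.array(vecs)
```

sentence_vectors("output.jsonl")[0] would then be the vector for the first input sentence; pass e.g. layer_index=-2 to pool a different layer instead.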
Thanks a lot!! Have you trained a model and obtained sentence representations?
Not yet, but many people have used this as a basic step in their own work.
Hey, can you explain a little more how this captures the entire sentence's meaning? I want to use the last-layer representation of the [CLS] token to understand false positives. Everywhere it is mentioned that the [CLS] token representation works for the fine-tuned task. What do you think?
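One way to see why fine-tuning matters here: in this repo's run_classifier.py, the classification head is a single linear layer on top of pooled_output, so the training signal reaches the encoder only through the [CLS] vector, which pushes sentence-level, task-relevant information into it. A NumPy sketch with hypothetical shapes and random weights:

```python
import numpy as np

hidden, num_labels = 768, 2
pooled_output = np.random.randn(1, hidden).astype(np.float32)  # [CLS] after the pooler

# Linear classifier on pooled_output, as in run_classifier.py.
W = (np.random.randn(num_labels, hidden) * 0.02).astype(np.float32)
b = np.zeros(num_labels, dtype=np.float32)

logits = pooled_output @ W.T + b                               # (1, num_labels)
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
```

Before fine-tuning, [CLS] is shaped mainly by the next-sentence-prediction objective, so it is not guaranteed to be a good general sentence embedding; that is why pooling strategies other than [CLS] are often compared when extracting features.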