Model outputs tuples #22

yjiang18 · 2020-11-17T06:47:59Z

Hi, could you please explain how the tweet sentence embedding is generated? I checked the shape of the output from the example: `features = bertweet(input_ids)` seems to return the per-token embeddings in `features[0]` (e.g., shape [1, 20, 768]) and a tweet-level sentence embedding in `features[1]` (e.g., shape [1, 768]). If so, could you let me know how `features[1]` is computed? Is it based on the [CLS] token, or is it simply the average of all token embeddings? Thanks!


datquocnguyen · 2020-11-18T04:56:11Z

As far as I understand, it is based on the [CLS] token. However, I am not 100% sure.
You might ask the HuggingFace transformers team for final confirmation.
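
One way to check this locally is to compare the second output against the two candidates from the question. The sketch below is not BERTweet itself: it assumes BERTweet is loaded through the `RobertaModel` class in transformers, and it builds a tiny randomly initialised model with made-up sizes and token ids so that nothing has to be downloaded. In that class the pooler applies a dense layer followed by tanh to the hidden state of the first token (`<s>`, RoBERTa's counterpart of [CLS]).

```python
import torch
from transformers import RobertaConfig, RobertaModel

# Tiny RoBERTa-style model with random weights (hypothetical sizes, not BERTweet's).
config = RobertaConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=4,
    intermediate_size=64,
    max_position_embeddings=64,
)
torch.manual_seed(0)
model = RobertaModel(config)
model.eval()  # disable dropout so the comparison is deterministic

# Made-up token ids: 0 = <s>, 2 = </s> under the default RoBERTa config.
input_ids = torch.tensor([[0, 5, 6, 7, 8, 2]])

with torch.no_grad():
    outputs = model(input_ids, return_dict=True)
    token_embeddings = outputs.last_hidden_state  # features[0]: (batch, seq_len, hidden)
    pooled = outputs.pooler_output                # features[1]: (batch, hidden)

    # Candidate 1: dense + tanh applied to the first token's hidden state.
    manual = torch.tanh(model.pooler.dense(token_embeddings[:, 0]))
    # Candidate 2: plain average over all token embeddings.
    mean_pooled = token_embeddings.mean(dim=1)

matches_first_token = torch.allclose(pooled, manual, atol=1e-6)
matches_mean = torch.allclose(pooled, mean_pooled, atol=1e-6)
print(matches_first_token, matches_mean)
```

If the first comparison holds and the second does not, the second output is the transformed first-token state rather than an average. An averaged sentence embedding would have to be computed separately from `features[0]`.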

datquocnguyen closed this as completed Nov 18, 2020
