use Google Bert model to encode a sentence to vector.

lzphahaha/bert_encoder

# bert_encoder

Use the Google BERT model to encode a sentence into a vector.

## Usage

BERT-Base, Chinese: Chinese Simplified and Traditional, 12-layer, 768-hidden, 12-heads, 110M parameters

Download the model above, then unzip it into the current directory.

### How to encode a sentence?

```python
from bert_encoder import BertEncoder

be = BertEncoder()
embedding = be.encode("新年快乐,恭喜发财,万事如意!")
print(embedding)
print(embedding.shape)
```

Update: taking the vector at BERT's [CLS] position directly as a sentence embedding and computing similarity with it has been shown not to work well. Much follow-up work has studied this problem, and there are several ways to obtain usable BERT sentence embeddings; see for example: Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks.
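However the sentence vectors are obtained, comparing them usually means cosine similarity. Below is a minimal sketch with NumPy; the vectors are made-up 3-dimensional stand-ins for real encoder output (actual BERT-Base embeddings are 768-dimensional):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for encoder output; real vectors would come from be.encode(...).
v1 = np.array([0.2, 0.1, 0.7])
v2 = np.array([0.2, 0.1, 0.7])   # identical to v1 -> similarity ~1.0
v3 = np.array([-0.7, 0.1, 0.2])  # quite different direction

print(cosine_similarity(v1, v2))
print(cosine_similarity(v1, v3))
```

Note that, per the update above, cosine similarity over raw [CLS] vectors tends to be a poor semantic-similarity signal; the metric itself is only as good as the embeddings it is fed.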
