ylwangy/bert2vec

bert2vec

This repo trains static word embeddings by combining skip-gram training with BERT.

See our journal paper for details: Improving Skip-Gram Embeddings Using BERT (TASLP 2021)

1. Prepare your corpus (wiki) as 'data.txt_plain', with one lowercased, tokenized sentence per line, for example:

anarchism anarchism is a political philosophy that advocates self-governed societies based on voluntary institutions .

these are often described as stateless societies , although several authors have defined them more specifically as institutions based on non-hierarchical or free associations .

...
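
The corpus format above can be approximated with a small preprocessing script. This is a minimal sketch, not the preprocessing used for the paper: the helper names (`split_sentences`, `normalize`, `write_corpus`) are hypothetical, and the naive regex sentence splitter and punctuation rules are assumptions inferred from the sample lines.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def normalize(sentence):
    # Lowercase and surround punctuation with spaces, as in the sample lines.
    # Hyphens are left alone so words like "self-governed" stay intact.
    sentence = sentence.lower()
    sentence = re.sub(r"([.,!?;:()\"])", r" \1 ", sentence)
    # Collapse any repeated whitespace introduced above.
    return " ".join(sentence.split())


def write_corpus(raw_text, path="data.txt_plain"):
    # Write one normalized sentence per line and return the lines written.
    lines = [normalize(s) for s in split_sentences(raw_text)]
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")
    return lines
```

Calling `write_corpus(raw_text)` on a plain-text dump would then produce a `data.txt_plain` file in the working directory. A proper sentence tokenizer will handle abbreviations and quotations better than this regex.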

2. Train your word embeddings:

python word2vec.py

The final 300-dimensional word embeddings can be downloaded from the links below (Baidu Pan or Google Drive):

https://pan.baidu.com/s/11hV_SFO36XabzFf2SE4GLw (code: vham)

https://drive.google.com/file/d/1WIfJ7XgbPoRHBDfYdx-BhzhCPaxmNRxJ/view?usp=sharing
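
Once downloaded, the embeddings can be inspected with a few lines of Python. The sketch below assumes the file uses the common word2vec text layout (an optional `vocab_size dim` header, then one `word v1 v2 ...` entry per line). That layout is not confirmed by this README, so check the downloaded file first. The function names and the file name in the usage note are hypothetical.

```python
import math


def parse_line(line):
    # Return (word, vector) for an entry line, or None for a header or blank line.
    # Assumption: fields are separated by single spaces.
    parts = line.rstrip().split(" ")
    if len(parts) <= 2:
        return None  # an optional "vocab_size dim" header, or an empty line
    return parts[0], [float(x) for x in parts[1:]]


def load_embeddings(path):
    # Read every entry line into a dict mapping word -> list of floats.
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            entry = parse_line(line)
            if entry is not None:
                vectors[entry[0]] = entry[1]
    return vectors


def cosine(u, v):
    # Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

For example, with the vectors saved as `embeddings.txt` (a placeholder name), `vecs = load_embeddings("embeddings.txt")` followed by `cosine(vecs["king"], vecs["queen"])` gives a quick check, provided both words are in the vocabulary.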
