Skip to content

theeluwin/wpe

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

WPE

Word Pair Encoding (WPE) for semi-automatic meaningful-keywords generation.

Of course, just word-level version of Byte Pair Encoding (BPE) but quite optimized for word-level.

Most of implementation ideas were borrowed from subword-nmt. Which means fast.

from wpe import WPE
wpe = WPE(["a a b", "a a c"], min_freq=2)
wpe.preprocess()
wpe.encode()  # we get "a a", "b", "c" as keywords!

About

Word Pair Encoding (WPE) for semi-automatic meaningful-keywords generation.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages