Word Pair Encoding (WPE) for semi-automatic meaningful-keywords generation.
Of course, just word-level version of Byte Pair Encoding (BPE) but quite optimized for word-level.
Most of implementation ideas were borrowed from subword-nmt. Which means fast.
from wpe import WPE
wpe = WPE(["a a b", "a a c"], min_freq=2)
wpe.preprocess()
wpe.encode() # we get "a a", "b", "c" as keywords!