fastText with hierarchical softmax, implemented in TensorFlow.
The corpus can be found here.
Reference: Bag of Tricks for Efficient Text Classification.
- Python 3.4 or newer
- TensorFlow 0.12.0rc1
The Huffman tree must be constructed before training the model.
paths_length.npy: the length of the Huffman code of every label.
cooking.train: the training file, containing 12404 examples.
cooking.valid: the test file, containing 3000 examples.
labels.dict: the dictionary of labels.
words.dict: the dictionary of all words.
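
As a rough illustration of the Huffman step, the sketch below builds Huffman codes from label frequencies and derives the per-label code lengths, which is the kind of information paths_length.npy is described as holding. It is not this repository's script: the function names `huffman_codes` and `path_lengths` and the sample labels are hypothetical, and the actual on-disk layout of the .npy file is not specified here.

```python
import heapq
from itertools import count


def huffman_codes(label_counts):
    """Map each label to a '0'/'1' Huffman code, given {label: frequency}.

    Hypothetical helper for illustration; not taken from this repository.
    """
    if not label_counts:
        return {}
    if len(label_counts) == 1:
        # A single label still needs one decision node, so give it a 1-bit code.
        return {next(iter(label_counts)): "0"}

    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [
        (freq, next(tiebreak), {label: ""})
        for label, freq in sorted(label_counts.items())
    ]
    heapq.heapify(heap)

    while len(heap) > 1:
        # Merge the two least frequent subtrees under a new internal node.
        freq_left, _, left = heapq.heappop(heap)
        freq_right, _, right = heapq.heappop(heap)
        merged = {label: "0" + code for label, code in left.items()}
        merged.update({label: "1" + code for label, code in right.items()})
        heapq.heappush(heap, (freq_left + freq_right, next(tiebreak), merged))

    return heap[0][2]


def path_lengths(codes):
    """Number of binary decisions needed to reach each label from the root."""
    return {label: len(code) for label, code in codes.items()}


# Made-up frequencies: frequent labels end up with shorter paths.
example_counts = {"baking": 4, "bread": 2, "eggs": 1}
example_codes = huffman_codes(example_counts)
example_lengths = path_lengths(example_codes)
```

With these made-up counts, the most frequent label, "baking", gets a one-bit code and the other two get two-bit codes. This is why hierarchical softmax over a Huffman tree is cheap for common labels: each prediction evaluates only the binary classifiers along one root-to-leaf path.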