In [1]:
import fasttext

In [2]:
model = fasttext.train_supervised(input="cooking.train")

In [3]:
model.save_model("model_cooking.bin")

In [4]:
model.predict("Which baking dish is best to bake a banana bread ?")

(('__label__baking',), array([0.06580226]))

In [5]:
model.predict("Why not put knives in the dishwasher?")

(('__label__food-safety',), array([0.06431208]))

In [6]:
model.test("cooking.valid")

(3000, 0.14066666666666666, 0.061813388018163176)

The output is the number of samples (here 3000), the precision at one (0.141) and the recall at one (0.0618).

In [7]:
model.test("cooking.valid", k=5)

(3000, 0.0652, 0.14325472389043503)

The k=5 argument sets the number of labels predicted per example, so the output now reports the precision at five (0.0652) and the recall at five (0.143). Precision at k is the fraction of the top k predicted labels that are among the true labels. Recall at k is the fraction of the true labels that appear among the top k predictions.
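As a sanity check on the two test outputs above, the reported ratios can be turned back into counts. This is only a sketch: it assumes precision at k is computed as correct predictions divided by predictions made (k per example) and recall at k as correct predictions divided by the total number of true labels. The variable names are illustrative.

```python
# Values copied from the two model.test outputs above.
n, p_at_1, r_at_1 = 3000, 0.14066666666666666, 0.061813388018163176
_, p_at_5, r_at_5 = 3000, 0.0652, 0.14325472389043503

# Assumption: precision@k = correct / (k * n), recall@k = correct / total true labels.
correct_at_1 = round(p_at_1 * 1 * n)
correct_at_5 = round(p_at_5 * 5 * n)

# If the assumption holds, both runs imply the same number of true labels.
true_labels_from_k1 = round(correct_at_1 / r_at_1)
true_labels_from_k5 = round(correct_at_5 / r_at_5)

print("correct predictions:", correct_at_1, correct_at_5)
print("implied true labels:", true_labels_from_k1, true_labels_from_k5)
```

Under that assumption both runs imply the same total number of true labels. Asking for more labels per example therefore finds more correct tags (higher recall) but at a lower hit rate per prediction (lower precision).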

In [8]:
model.predict("Why not put knives in the dishwasher?", k=5)

(('__label__food-safety',
  '__label__baking',
  '__label__bread',
  '__label__substitutions',
  '__label__equipment'),
 array([0.06431208, 0.06355464, 0.03789969, 0.03471975, 0.03359836]))

On Stack Exchange, this sentence is labeled with three tags: equipment, cleaning and knives. 

Thus, one out of five labels predicted by the model is correct, giving a precision of 0.20. Out of the three real labels, only one is predicted by the model, giving a recall of 0.33.
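The arithmetic for this single example can be reproduced with a few lines of plain Python. The helper name precision_recall_at_k is hypothetical and not part of fastText; the label lists are copied from the prediction and the Stack Exchange tags above.

```python
def precision_recall_at_k(predicted, true_labels):
    """Return (precision, recall) for one example given its top-k predictions."""
    hits = len(set(predicted) & set(true_labels))
    return hits / len(predicted), hits / len(true_labels)

# Top 5 predictions from the model, with the __label__ prefix dropped.
predicted = ["food-safety", "baking", "bread", "substitutions", "equipment"]
# Tags attached to the question on Stack Exchange.
true_labels = ["equipment", "cleaning", "knives"]

precision, recall = precision_recall_at_k(predicted, true_labels)
print(f"precision@5 = {precision:.2f}, recall@5 = {recall:.2f}")
```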

Loading preprocessed text

In [9]:
model = fasttext.train_supervised(input="cooking_preprocessed.train")

In [10]:
model.test("cooking_preprocessed.valid")

(3000, 0.16966666666666666, 0.07337465763298255)
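The notebook does not show how cooking_preprocessed.train and cooking_preprocessed.valid were produced. A common normalization for this kind of text is to lowercase it and put spaces around punctuation, so that "dishwasher?" and "dishwasher" map to the same token. The sketch below is one possible way to do that. The function names and the punctuation set are assumptions, not necessarily the steps used for the files above.

```python
import re

# Assumed punctuation set; adjust to the data at hand.
_PUNCT = re.compile(r"([.!?,'/()])")

def normalize_line(line):
    """Lowercase a line and separate punctuation marks into their own tokens."""
    spaced = _PUNCT.sub(r" \1 ", line)
    # split/join collapses the extra whitespace introduced above.
    return " ".join(spaced.lower().split())

def normalize_file(src_path, dst_path):
    """Write a normalized copy of src_path to dst_path, one example per line."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(normalize_line(line) + "\n")

print(normalize_line("Why not put knives in the dishwasher?"))
```

Note that lowercasing the whole line also lowercases the __label__ prefixes and tag names, which is harmless only if the labels are already lowercase.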

Increasing the number of epochs

In [11]:
model = fasttext.train_supervised(input="cooking_preprocessed.train", epoch=25)

In [12]:
model.test("cooking_preprocessed.valid")

(3000, 0.5173333333333333, 0.22372783624044976)

Increasing the learning rate

In [13]:
model = fasttext.train_supervised(input="cooking_preprocessed.train", lr=1.0)

In [14]:
model.test("cooking_preprocessed.valid")

(3000, 0.5673333333333334, 0.2453510162894623)

Trying epochs and lr together

In [15]:
model = fasttext.train_supervised(input="cooking_preprocessed.train", lr=1.0, epoch=25)
model.test("cooking_preprocessed.valid")

(3000, 0.5836666666666667, 0.2524145884388064)

Word-N-Grams

In [16]:
model = fasttext.train_supervised(input="cooking_preprocessed.train", lr=1.0, epoch=25, wordNgrams=2)
model.test("cooking_preprocessed.valid")

(3000, 0.607, 0.26250540579501225)

With a few steps, we were able to go from a precision at one of 14.1% to 60.7%. Important steps included:

preprocessing the data;
changing the number of epochs (using the epoch argument, standard range [5 - 50]);
changing the learning rate (using the lr argument, standard range [0.1 - 1.0]);
using word n-grams (using the wordNgrams argument, standard range [1 - 5]).
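The steps above were tried by hand, one at a time. A small loop over a few values from those ranges can automate the comparison. This is a rough sketch rather than a recommended tuning procedure: the helper names candidate_settings and sweep and the specific grid values are assumptions, and only train_supervised and test are fastText calls already used in this notebook.

```python
from itertools import product

def candidate_settings():
    """Build a small grid of settings drawn from the ranges listed above."""
    epochs = [5, 25, 50]
    learning_rates = [0.1, 0.5, 1.0]
    word_ngrams = [1, 2]
    return [
        {"epoch": e, "lr": lr, "wordNgrams": n}
        for e, lr, n in product(epochs, learning_rates, word_ngrams)
    ]

def sweep(train_path, valid_path):
    """Train one model per setting and return the best (P@1, R@1, settings)."""
    import fasttext  # imported here so the grid helper works without fastText

    best = None
    for settings in candidate_settings():
        model = fasttext.train_supervised(input=train_path, **settings)
        _, p_at_1, r_at_1 = model.test(valid_path)
        if best is None or p_at_1 > best[0]:
            best = (p_at_1, r_at_1, settings)
    return best

print(len(candidate_settings()), "settings to try")
```

Calling sweep("cooking_preprocessed.train", "cooking_preprocessed.valid") would train one model per setting, so the total time grows with the size of the grid.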

Since we are training our model on a few thousand examples, training only takes a few seconds. But training models on larger datasets, with more labels, can become too slow. One way to speed up training is to use the hierarchical softmax instead of the regular softmax. This can be done with the argument loss='hs':

In [17]:
model = fasttext.train_supervised(input="cooking_preprocessed.train", lr=1.0, epoch=25, wordNgrams=2, bucket=200000, dim=50, loss='hs')

Training took just 2.4 seconds.
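To get a feel for why the hierarchical softmax helps as the number of labels grows, compare the work per training example. The regular softmax scores every label, while a tree over the labels only needs one binary decision per level. The sketch below assumes a balanced binary tree, which is an approximation since the actual tree need not be balanced. The label counts and function names are hypothetical.

```python
import math

def flat_softmax_scores(num_labels):
    """Regular softmax: one score per label for each training example."""
    return num_labels

def balanced_tree_decisions(num_labels):
    """Balanced binary tree (assumption): one decision per level, root to leaf."""
    return math.ceil(math.log2(num_labels))

# Hypothetical label counts, chosen only to show how the gap widens.
for num_labels in (1_000, 100_000, 1_000_000):
    print(num_labels,
          flat_softmax_scores(num_labels),
          balanced_tree_decisions(num_labels))
```

The speedup comes at the cost of an approximation, so it is worth re-running model.test after switching the loss.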