Merge pull request #1145 from RasaHQ/docs_format_typo
fix typo
tmbo committed Jun 11, 2018
2 parents 532b9fe + 9dfbeb9 commit c61904f
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/pipeline.rst
@@ -285,11 +285,11 @@ intent_featurizer_count_vectors
In order to teach an algorithm how to treat unknown words, some words in training data can be substituted by generic word ``OOV_token``.
In this case during prediction all unknown words will be treated as this generic word ``OOV_token``.

-For example, one might create separate intent ``outofscope`` in the training data containing messages of different number of ``OOV_token``s and
+For example, one might create separate intent ``outofscope`` in the training data containing messages of different number of ``OOV_token`` s and
maybe some additional general words. Then an algorithm will likely classify a message with unknown words as this intent ``outofscope``.

.. note::
-This featurizer creates bag-of-words representation by **counting** words, so a number of ``OOV_token``s might be important.
+This featurizer creates bag-of-words representation by **counting** words, so a number of ``OOV_token`` s might be important.

- ``OOV_token`` set a keyword for unseen words; if training data contains ``OOV_token`` as words in some messages,
during prediction the words that were not seen during training will be substituted with provided ``OOV_token``;
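
The substitution described in the changed docs can be illustrated with a minimal sketch. It is not Rasa's implementation: the token value ``oov``, the whitespace tokenization, and the helper names ``build_vocabulary`` and ``featurize`` are all assumptions made for illustration. It only shows why the number of ``OOV_token`` occurrences ends up in the bag-of-words counts.

```python
from collections import Counter

# Hypothetical placeholder; the real value is whatever ``OOV_token`` is set to.
OOV_TOKEN = "oov"


def build_vocabulary(training_messages):
    """Collect every lowercase, whitespace-separated word seen in training."""
    return {word for message in training_messages for word in message.lower().split()}


def featurize(message, vocabulary, oov_token=OOV_TOKEN):
    """Return bag-of-words counts, mapping unseen words to oov_token.

    Substitution only happens when oov_token itself occurred in training;
    otherwise unseen words are dropped (an assumption of this sketch).
    """
    counts = Counter()
    for word in message.lower().split():
        if word in vocabulary:
            counts[word] += 1
        elif oov_token in vocabulary:
            counts[oov_token] += 1
    return counts


training_messages = [
    "hello there",         # an ordinary intent example
    "oov oov",             # outofscope example with two OOV tokens
    "oov oov oov please",  # outofscope example with three OOV tokens
]
vocabulary = build_vocabulary(training_messages)

# Two unseen words become two occurrences of the OOV token.
known_and_unknown = featurize("hello xyzzy qwerty", vocabulary)

# Without the OOV token in training data, unseen words contribute nothing.
without_oov_in_training = featurize("hello xyzzy", build_vocabulary(["hello there"]))

print(dict(known_and_unknown))
print(dict(without_oov_in_training))
```

Because the features are counts, a message with two unknown words and a message with five produce different vectors, which is why the training examples for ``outofscope`` vary the number of OOV tokens.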