Plugin for applying BERT masking as a Sparv annotation.
First, install Sparv, as suggested:
pipx install sparv-pipeline
Then install sparv-sbx-word-prediction-kb-bert with:
pipx inject sparv-pipeline sparv-sbx-word-prediction-kb-bert
Depending on how many annotations you export explicitly, you can use this annotation exclusively by adding it as the only annotation to export under xml_export:
xml_export:
    annotations:
        - <token>:sbx_word_prediction_kb_bert.word-prediction--kb-bert
To use it together with other annotations, you might add it under export:
export:
    annotations:
        - <token>:sbx_word_prediction_kb_bert.word-prediction--kb-bert
        ...
You can configure how many neighbours (predicted words) this plugin generates for each token.
The number of neighbours defaults to 5 but can be changed in config.yaml:
sbx_word_prediction_kb_bert:
    num_neighbours: 5
The number of decimals defaults to 3 but can be changed in config.yaml:
sbx_word_prediction_kb_bert:
    num_decimals: 3
[!NOTE] This also controls the cut-off: every prediction whose score rounds to zero at the configured number of decimals (0.000 by default) is discarded.
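The interaction between the two settings can be illustrated with a small sketch. This is not the plugin's actual code: the function name `format_predictions` and the candidate words and scores are hypothetical, and the sketch only assumes that candidates are ranked by score, limited to the number of neighbours, rounded, and then filtered.

```python
from typing import List, Tuple


def format_predictions(
    scored: List[Tuple[str, float]],
    num_neighbours: int = 5,
    num_decimals: int = 3,
) -> List[Tuple[str, float]]:
    """Keep the top candidates, round their scores, drop those rounding to zero.

    Hypothetical helper for illustration; not part of the plugin's API.
    """
    # Highest score first, then keep at most num_neighbours candidates.
    top = sorted(scored, key=lambda pair: pair[1], reverse=True)[:num_neighbours]
    rounded = [(word, round(score, num_decimals)) for word, score in top]
    # A score that rounds to 0 at this precision is discarded (the cut-off).
    return [(word, score) for word, score in rounded if score > 0]


# Made-up candidate words and scores, for illustration only.
candidates = [("hus", 0.91234), ("hem", 0.0521), ("bil", 0.0004), ("katt", 0.00001)]
print(format_predictions(candidates))                  # default: 3 decimals
print(format_predictions(candidates, num_decimals=4))  # keeps one more candidate
```

With the default of 3 decimals, the two lowest scores round to zero and are dropped, so fewer than `num_neighbours` predictions may remain. Raising `num_decimals` lowers the cut-off.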
Type | HuggingFace Model | Revision |
---|---|---|
Model | KBLab/bert-base-swedish-cased | c710fb8dff81abb11d704cd46a8a1e010b2b022c |
Tokenizer | same as Model | same as Model |
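To try the same model and revision outside Sparv, a minimal sketch using the Hugging Face `transformers` library might look like the following. How the plugin itself loads the model is not described here, so the helper names `load_fill_mask` and `demo` and the example sentence are assumptions made for illustration. Running it requires `transformers` and a backend such as PyTorch, and it downloads the model on first use.

```python
MODEL_NAME = "KBLab/bert-base-swedish-cased"
MODEL_REVISION = "c710fb8dff81abb11d704cd46a8a1e010b2b022c"


def load_fill_mask():
    """Build a fill-mask pipeline pinned to the revision listed in the table.

    Hypothetical helper; the plugin may load the model differently.
    """
    # Imported lazily so the constants above are usable without transformers.
    from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, revision=MODEL_REVISION)
    model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME, revision=MODEL_REVISION)
    return pipeline("fill-mask", model=model, tokenizer=tokenizer)


def demo(num_neighbours: int = 5) -> None:
    """Print the top predictions for one masked token in an example sentence."""
    fill_mask = load_fill_mask()
    # Use the tokenizer's own mask token rather than hard-coding it.
    mask = fill_mask.tokenizer.mask_token
    for prediction in fill_mask(f"Stockholm är Sveriges {mask}.", top_k=num_neighbours):
        print(prediction["token_str"], prediction["score"])
```

Calling `demo()` prints up to `num_neighbours` candidate words with their raw, unrounded scores. Pinning `revision` keeps the results reproducible even if the model repository is updated later.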
This project keeps a changelog.