hladek/spacy-skmodel

Slovak spaCy Model

This is a spaCy model for the Slovak language.

Features

  • Requires spaCy 3.x.
  • Contains Floret word vectors.
  • The tagger module uses the Slovak National Corpus tagset.
  • The morphological analyzer uses the Universal Dependencies tagset and is trained on the Slovak Dependency Treebank.
  • The lemmatizer is trained on the Slovak Dependency Treebank.
  • The named entity recognizer is trained separately on the WikiAnn dataset.

Downloads

Version 3.4

  • spaCy 3.4, Dependencies.

    • Trained model for lemmatization, POS tagging, and dependency parsing.
    • Contains Floret word vectors trained on our web corpus.
    • Should be free of license issues.
  • spaCy 3.4, NER + Dependencies.

    • Includes the dependencies model.
    • Uses a separate fine-tuned model for named entity recognition.

Version 3.3

These models do not have word vectors.

Training

Requirements for training:

  • Anaconda virtual environment
  • spaCy 3
  • make
  • bash

Usage:

  1. Install the dependencies in the Conda environment:

    ./prepare-env.sh

  2. Download and prepare data:

    make

  3. Train the models:

    ./train.sh

Credits

Author:

Daniel Hládek (daniel.hladek@tuke.sk) and the Technical University of Košice

Sources:
