folia

Here is 1 public repository matching this topic...

This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears to be. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser writt…

  • Updated Oct 31, 2023
  • Cython
