Natural language processing using unsupervised vector representations.

⚠️ Kadot is no longer in development. The project had two branches: 0.x and 1.x (this one).

Kadot is a high-level open-source library to easily process text documents. It relies on vector representations of documents or words in order to solve NLP tasks such as summarization, spellchecking or classification.
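To make the idea of a vector representation concrete, here is a minimal pure-Python sketch that does not use Kadot. It turns documents into sparse bag-of-words count vectors and compares them with cosine similarity. The names `bag_of_words` and `cosine_similarity` are illustrative assumptions, not part of Kadot's API, and Kadot's own vectorizers may work differently.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Map a document to a sparse count vector (word -> frequency)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    # Counter returns 0 for missing words, so only shared words contribute.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical example documents, chosen only for illustration.
doc_a = bag_of_words("Kadot processes text documents")
doc_b = bag_of_words("Kadot processes text easily")
doc_c = bag_of_words("An unrelated sentence")

print(cosine_similarity(doc_a, doc_b))  # 3 shared words out of 4 each: 0.75
print(cosine_similarity(doc_a, doc_c))  # no shared words: 0.0
```

Storing only the words that actually occur is what keeps such vectors sparse: a document's memory footprint grows with its own vocabulary, not with the vocabulary of the whole corpus.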

# How to get n-grams using kadot.
>>> from kadot.tokenizers import regex_tokenizer
>>> hello_tokens = regex_tokenizer("Kadot just lets you process a text easily.")
>>> hello_tokens.ngrams(n=2)

[('Kadot', 'just'), ('just', 'lets'), ('lets', 'you'), ('you', 'process'), ('process', 'a'), ('a', 'text'), ('text', 'easily')]
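For readers who want to see what happens behind such a call, the following is a rough standalone sketch of regex tokenization followed by n-gram extraction. It is not Kadot's implementation: the helper names `tokenize` and `ngrams` and the `\w+` pattern are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Split text into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text)


def ngrams(tokens, n=2):
    """Return every run of n consecutive tokens as a tuple."""
    # A sliding window of width n; empty when there are fewer than n tokens.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("Kadot just lets you process a text easily.")
print(ngrams(tokens, n=2))
```

With the sentence above, this sketch yields the same seven bigrams as the Kadot example, from `('Kadot', 'just')` to `('text', 'easily')`.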

What's 🆕 in 1.0?

⚠️ These new features may not all be available on GitHub yet.

Vectorizers: We now offer the Word2Vec, Doc2Vec and state-of-the-art fastText algorithms through Gensim's powerful backend.
Performance: Thanks to a much more efficient algorithm, the new word vectorizer is up to 95% faster, and sparse vectors now take up to 94% less memory.
Models: Kadot now includes a text classifier, an automatic text summarizer and an entity labeler, which can be useful in many projects.
Bot Engine: Coming soon.
Dependencies 😞: To guarantee good performance without reinventing the wheel, we are adding Gensim and PyTorch to our list of dependencies. Although they are installed by default, these libraries will be optional: only NumPy and SciPy are strictly required to use Kadot.
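One common way to keep heavy libraries optional is to check for them at runtime and fail with a clear message only when a feature actually needs them. The sketch below shows that pattern with the standard library. It is an assumption about how optional backends could be handled, not a description of Kadot's code, and the helper names `is_available` and `require` are hypothetical.

```python
import importlib.util


def is_available(module_name):
    """Return True if the module can be imported in this environment."""
    # find_spec locates the module without importing it.
    return importlib.util.find_spec(module_name) is not None


# Assumed import names: "gensim" for Gensim and "torch" for PyTorch.
OPTIONAL_BACKENDS = {name: is_available(name) for name in ("gensim", "torch")}


def require(module_name):
    """Raise a clear error when an optional backend is missing."""
    if not is_available(module_name):
        raise ImportError(
            f"This feature needs the optional dependency '{module_name}'."
        )


print(OPTIONAL_BACKENDS)
```

A feature that depends on Gensim would call `require("gensim")` before importing it, so users with only NumPy and SciPy installed can still use everything else.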

⚖️ License

Kadot is released under the MIT license.

🚀 Contribute

Issues and pull requests are very welcome. Come help me!

I am not a native English speaker. If you see any language mistakes in this README or in the code (docstrings included), please open an issue.
