


mpavlovic/spacy-vectorizers


spacy-vectorizers

Scikit-learn compatible vectorizers built with the spaCy NLP framework.

This repo contains custom scikit-learn compatible classes and vectorizers inspired by CountVectorizer, with more accurate tokenization and lemmatization provided by the spaCy NLP framework. Simple Keras-like punctuation removal is also supported.
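To illustrate the general idea, here is a minimal sketch of a tokenizer callable that strips punctuation the way Keras does and then returns spaCy lemmas. It is not this repo's code: the names `strip_punctuation` and `SpacyLemmaTokenizer` are hypothetical, and the `nlp` argument is assumed to be a spaCy pipeline you have already loaded. The filter string is the default `filters` value of the Keras `Tokenizer`.

```python
# Default character filter of the Keras Tokenizer: punctuation, tab, newline.
KERAS_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def strip_punctuation(text, filters=KERAS_FILTERS):
    """Replace every character found in `filters` with a single space."""
    return text.translate(str.maketrans(filters, " " * len(filters)))


class SpacyLemmaTokenizer:
    """Hypothetical callable: punctuation removal followed by lemmatization.

    `nlp` is assumed to be a loaded spaCy pipeline, or any callable that
    returns tokens exposing `lemma_` and `is_space`.
    """

    def __init__(self, nlp, filters=KERAS_FILTERS):
        self.nlp = nlp
        self.filters = filters

    def __call__(self, text):
        doc = self.nlp(strip_punctuation(text, self.filters))
        # Keep the lowercased lemma of every non-whitespace token.
        return [tok.lemma_.lower() for tok in doc if not tok.is_space]
```

With scikit-learn and a spaCy model installed, an instance could be passed to the standard vectorizer as `CountVectorizer(tokenizer=SpacyLemmaTokenizer(nlp))`, where `nlp` is whatever `spacy.load` returns for the model you use. The classes in this repo may expose a different interface, so check the usage notebook for the actual API.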

Built on (prerequisites):

  • Python 3.5.4
  • scikit-learn 0.19.1
  • spaCy 2.0.4

Usage:

Please refer to the Usage Examples & Tests Jupyter notebook.
