v2.3.2: Improved Korean tokenizer speed, experimental character-based pretraining and bug fixes

@adrianeboyd released this 13 Jul 16:09

New features and improvements

  • Improve Korean tokenizer speed.
  • Add experimental character-based pretraining.

🔴 Bug fixes

  • Fix issue #5728: Fix French lemmatizer.
  • Fix issue #5729: Fix lemmatizer for Python 2.7.
  • Fix issue #5751: Fix meta serialization in train CLI.

👥 Contributors

Thanks to @graue70, @mikeizbicki, @jbesomi, @gandersen101 and @DeNeutoy for the pull requests and contributions.