v1.5.0: Alpha support for Swedish and Hungarian

@ines released this 27 Dec 21:20

✨ Major features and improvements

  • NEW: Alpha support for Swedish tokenization.
  • NEW: Alpha support for Hungarian tokenization.
  • Update language data for Spanish tokenization.
  • Speed up tokenization when no data is preloaded by caching the first 10,000 vocabulary items seen.
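
The caching idea in the last item can be illustrated with a minimal sketch. This is not spaCy's actual implementation: the `FirstNCache` class, the `analyse` function, and the small `max_size` used in the demo are all hypothetical names chosen for illustration. The sketch assumes the simplest reading of the note, namely that results are stored only until the cache holds a fixed number of entries, after which new items are computed on every call and never stored.

```python
class FirstNCache:
    """Cache results for only the first `max_size` distinct keys seen.

    Hypothetical illustration; not taken from the spaCy code base.
    """

    def __init__(self, compute, max_size=10000):
        self._compute = compute    # the expensive per-item function
        self._max_size = max_size  # upper bound on stored entries
        self._store = {}

    def get(self, key):
        # Fast path: the key was among the first items seen.
        if key in self._store:
            return self._store[key]
        value = self._compute(key)
        # Store the result only while the cache still has room;
        # later keys are recomputed on every call.
        if len(self._store) < self._max_size:
            self._store[key] = value
        return value

    def __len__(self):
        return len(self._store)


calls = []  # records every real (uncached) computation


def analyse(word):
    """Stand-in for an expensive per-word analysis step."""
    calls.append(word)
    return word.lower()


cache = FirstNCache(analyse, max_size=2)
for word in ["Hej", "Hej", "världen", "Szia", "Szia"]:
    cache.get(word)
```

With `max_size=2`, the repeated "Hej" is served from the cache, while "Szia" arrives after the cache is full and is recomputed each time. A bound like this keeps memory use predictable, and because frequent words tend to appear early in real text, the first items seen are likely to cover a large share of later lookups.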

🔴 Bug fixes

  • List the language_data package in setup.py.
  • Fix missing vec_path declaration that caused a failure when add_vectors was set.
  • Allow Vocab to load without serializer_freqs.

📖 Documentation and examples

  • NEW: spaCy Jupyter notebooks repo: ongoing collection of easy-to-run spaCy examples and tutorials.
  • Fix issue #657: Generalise dependency parsing annotation specs beyond English.
  • Fix various typos and inconsistencies.

👥 Contributors

Thanks to @oroszgy, @magnusburton, @jmizgajski, @aikramer2, @fnorf and @bhargavvader for the pull requests!