Releases: nipunsadvilkar/pySBD

v0.3.4: Fix trailing period/ellipses with spaces

11 Feb 16:42
  • 🐛 Fix trailing period/ellipses with spaces - #83
  • 🐛 Regex escape for parenthesis - #87
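
The parenthesis fix (#87) concerns text fragments that end up inside regular expressions. The following is a minimal standard-library sketch of the general problem, not pySBD's actual code; the helper name `replace_literal` is hypothetical and exists only for illustration.

```python
import re


def replace_literal(text: str, fragment: str, replacement: str) -> str:
    """Replace every literal occurrence of `fragment` in `text`.

    `fragment` is escaped first, so characters such as "(" or "."
    are matched literally instead of being parsed as regex syntax.
    """
    pattern = re.escape(fragment)
    # A callable replacement avoids backslash processing in `replacement`.
    return re.sub(pattern, lambda _match: replacement, text)


# Unescaped, a lone "(" is an invalid pattern and raises re.error.
try:
    re.sub("(", "[", "a (b")
    unescaped_failed = False
except re.error:
    unescaped_failed = True

print(unescaped_failed)                   # True
print(replace_literal("a (b", "(", "["))  # a [b
```

Escaping at the point where a fragment becomes a pattern keeps unbalanced or stray parentheses in the input from crashing the segmenter.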

v0.3.3: Better handling of consecutive periods and reserved special symbols

08 Oct 11:41
  • 🐛 Better handling of consecutive periods and reserved special symbols - allenai/scholarphi#114
  • Add CONTRIBUTING.md

v0.3.2: Enforce clean=True when doc_type="pdf"

11 Sep 09:33
91676b8
  • 🐛 ✅ Enforce clean=True when doc_type="pdf" - #75
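
One way such a constraint can be enforced is an up-front check on the option combination. The sketch below is an assumption about the shape of that check, not pySBD's implementation: the function name `validate_options` and the error message are hypothetical, and whether the library raises or silently coerces the flag is not stated in these notes.

```python
from typing import Optional


def validate_options(clean: bool, doc_type: Optional[str]) -> None:
    """Reject doc_type="pdf" without clean=True (hypothetical check)."""
    if doc_type == "pdf" and not clean:
        # Assumption: the constraint is enforced by raising an error.
        raise ValueError('doc_type="pdf" requires clean=True')


validate_options(clean=True, doc_type="pdf")   # accepted
validate_options(clean=False, doc_type=None)   # accepted

try:
    validate_options(clean=False, doc_type="pdf")
    rejected = False
except ValueError:
    rejected = True

print(rejected)  # True
```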

v0.3.1: Handle newline character & update tests

11 Aug 12:50
9069997
  • 🚑 ✅ Handle Newline character & update tests

v0.3.0: Multi-lang support & performance improvements

11 Aug 12:48
92362f7
  • ✨ 💫 Support Multiple languages - #2
  • 🏎⚡️💯 Benchmark across Segmentation Tools, Libraries and Algorithms
  • 🎨 ♻️ Update sentence char_span logic
  • ⚡️ Performance improvements - #41
  • ♻️🐛 Refactor AbbreviationReplacer

♻️ ✨ Refactoring for more language support & sent char_span fix

09 Jun 17:11
e0cdada
  • ✨ 💫 Sentence char_span via spaCy & regex approach - #63
  • ♻️ Refactoring to support multiple languages
  • ✨ 💫 Initial language support for Hindi, Marathi, Chinese, Spanish
  • ✅ Updated tests - more coverage & regression tests for issues
  • 👷👷🏻‍♀️ GitHub Actions for CI/CD
  • 💚☂️ Add code coverage with coverage.py and Codecov
  • 🐛 Fix incorrect text span & vanilla pysbd vs spaCy output discrepancy - #49, #53, #55, #59
  • 🐛 Fix NUMBERED_REFERENCE_REGEX for zero or one time - #58
  • 🔐 Fix security vulnerability in bleach - #62

Performance improvement in `abbreviation_replacer`

13 Nov 17:43
f7c640f

  • 🐛 Performance improvement in abbreviation_replacer by reducing re.sub calls - @danielkingai2 #50
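
The general idea of "reducing re.sub calls" can be sketched as follows. This is an illustration under stated assumptions, not the change made in #50: the abbreviation list, the `<prd>` placeholder, and both function names are hypothetical. Instead of one `re.sub` pass per abbreviation, all abbreviations are folded into a single alternation so the text is scanned once.

```python
import re
from typing import Sequence

ABBREVIATIONS = ["dr", "mr", "mrs", "prof"]  # illustrative list only
PLACEHOLDER = "<prd>"                        # hypothetical marker for a protected period


def protect_periods_loop(text: str, abbreviations: Sequence[str]) -> str:
    """Baseline: one re.sub call (and one scan of the text) per abbreviation."""
    for abbr in abbreviations:
        text = re.sub(
            rf"\b({re.escape(abbr)})\.", rf"\1{PLACEHOLDER}", text, flags=re.IGNORECASE
        )
    return text


def protect_periods_single(text: str, abbreviations: Sequence[str]) -> str:
    """Single pass: every abbreviation joined into one alternation."""
    alternation = "|".join(re.escape(abbr) for abbr in abbreviations)
    pattern = re.compile(rf"\b({alternation})\.", flags=re.IGNORECASE)
    return pattern.sub(rf"\1{PLACEHOLDER}", text)


sample = "Dr. Smith met Mrs. Jones."
print(protect_periods_loop(sample, ABBREVIATIONS))    # Dr<prd> Smith met Mrs<prd> Jones.
print(protect_periods_single(sample, ABBREVIATIONS))  # Dr<prd> Smith met Mrs<prd> Jones.
```

Both versions produce the same output here; the single-pattern version does the work in one call regardless of how long the abbreviation list grows.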

🐛 Fix unbalanced parenthesis

01 Nov 11:52
16e8683
  • 🐛 Fix unbalanced parenthesis - #47

✨ `pysbd` as a spaCy component

30 Oct 10:48
08ad60e
  • pysbd as a spaCy component through entry points

✨Add `char_span` functionality, pySBD as a spaCy component

25 Oct 11:08
a2bb451
  • ✨ Add char_span parameter (optional) to get sentence & its (start, end) char offsets from original text
  • ✨ pySBD as a spaCy component example
  • 🐛 Fix double question mark swallow bug - #39
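
To make the `char_span` idea concrete, here is a small standard-library sketch of what "(start, end) char offsets from original text" means. It is not pySBD's implementation: the `SentenceSpan` type and `attach_char_spans` helper are hypothetical, and the sentences are assumed to be already segmented and to appear verbatim, in order, in the original text.

```python
from typing import List, NamedTuple


class SentenceSpan(NamedTuple):
    sent: str   # sentence text
    start: int  # offset of the first character in the original text
    end: int    # offset one past the last character


def attach_char_spans(text: str, sentences: List[str]) -> List[SentenceSpan]:
    """Locate each sentence in `text`, scanning forward from the previous match."""
    spans = []
    cursor = 0
    for sent in sentences:
        start = text.find(sent, cursor)
        if start == -1:
            raise ValueError(f"sentence not found in text: {sent!r}")
        end = start + len(sent)
        spans.append(SentenceSpan(sent, start, end))
        cursor = end  # later sentences cannot start before this point
    return spans


text = "My name is Jonas E. Smith. Please turn to p. 55."
sentences = ["My name is Jonas E. Smith.", "Please turn to p. 55."]
for span in attach_char_spans(text, sentences):
    print(span.start, span.end, repr(text[span.start:span.end]))
```

The useful property is that `text[start:end]` reproduces the sentence exactly, which lets downstream tools map sentence boundaries back onto the untouched input.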