forked from nipunsadvilkar/pySBD

🐍💯 pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection module that works out-of-the-box.


pySBD: Python Sentence Boundary Disambiguation (SBD)


pySBD (Python Sentence Boundary Disambiguation) is a rule-based sentence boundary detection module that works out-of-the-box.

This project is a direct port of the Ruby gem Pragmatic Segmenter, which provides rule-based sentence boundary detection.

Install

Python

pip install pysbd

Usage

  • Currently, pySBD supports only English. Support for more languages will be released soon.
import pysbd
text = "My name is Jonas E. Smith. Please turn to p. 55."
seg = pysbd.Segmenter(language="en", clean=False)
print(seg.segment(text))
# ['My name is Jonas E. Smith.', 'Please turn to p. 55.']
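To see why rule-based handling matters for the text above, the following toy sketch contrasts a naive split on every period with a split that skips known abbreviations and single-letter initials. It uses only the standard library and is not pySBD's implementation: the names `ABBREVIATIONS`, `naive_split`, and `toy_rule_split` are hypothetical, and the abbreviation list is an assumption made for illustration.

```python
import re

# Hypothetical, tiny abbreviation list for illustration only; this is NOT
# pySBD's rule set, which is ported from Pragmatic Segmenter.
ABBREVIATIONS = {"p", "mr", "mrs", "dr"}


def naive_split(text):
    """Split after every period that is followed by whitespace."""
    return re.split(r"(?<=\.)\s+", text)


def toy_rule_split(text):
    """Split on a period plus whitespace, unless the period follows a known
    abbreviation or a single capital letter (an initial such as 'E.')."""
    sentences, start = [], 0
    for match in re.finditer(r"\.\s+", text):
        # Find the word immediately before this period.
        words = text[start:match.start()].split()
        word = words[-1] if words else ""
        is_initial = len(word) == 1 and word.isupper()
        if word.lower() in ABBREVIATIONS or is_initial:
            continue  # treat this period as part of the sentence
        sentences.append(text[start:match.start() + 1])
        start = match.end()
    if start < len(text):
        sentences.append(text[start:])  # remaining text is the last sentence
    return sentences


text = "My name is Jonas E. Smith. Please turn to p. 55."
print(naive_split(text))
# ['My name is Jonas E.', 'Smith.', 'Please turn to p.', '55.']
print(toy_rule_split(text))
# ['My name is Jonas E. Smith.', 'Please turn to p. 55.']
```

The naive version breaks at "E." and "p.", while the toy rules reproduce the segmentation shown for this one input. A two-rule sketch like this fails on many other inputs (for example, a sentence that ends in a single capital letter), which is why a fuller rule set is useful.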

Contributing

If you find text that pySBD segments incorrectly, please submit an issue.

  1. Fork it (https://github.com/nipunsadvilkar/pySBD/fork)
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Credit

This project wouldn't be possible without the great work done by the Pragmatic Segmenter team.
