lofoyet/punctuator

Punctuator by @ottokart

Bomoda reuses an existing model to restore missing punctuation in documents, especially radio transcripts. The original model is available at https://github.com/ottokart/punctuator2

How to use

First, install the required Python packages:

pip install -r requirements.txt

Then, in a Python session:

# Define your own tokenize function (here, NLTK's TweetTokenizer)
from nltk.tokenize import TweetTokenizer
tknzr = TweetTokenizer()

from lib.punctuator import Punctuator
P = Punctuator(
    tokenize_func=tknzr.tokenize
    )
P.load()
P.punctuate(u"hi this is the best-looking guy on globe why you laugh get lost")
# Returns something like: u"hi, this is the best-looking guy on globe, why you laugh get lost? "
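
If NLTK is not available, a small regex-based tokenizer may be enough. The sketch below assumes that `tokenize_func` only needs to be a callable that takes a string and returns a list of token strings, as `TweetTokenizer().tokenize` does; the name `simple_tokenize` and the regex are illustrative and not part of this repository.

```python
import re

# Hypothetical helper, not part of lib.punctuator.
# A word is a run of word characters, optionally joined by hyphens or
# apostrophes ("best-looking", "don't"); any other non-space character
# becomes its own token.
_TOKEN_RE = re.compile(r"\w+(?:[-']\w+)*|\S")


def simple_tokenize(text):
    """Split text into a list of word and symbol tokens."""
    return _TOKEN_RE.findall(text)


# Assumed usage, mirroring the example above:
# P = Punctuator(tokenize_func=simple_tokenize)
```

Keeping hyphenated words such as "best-looking" as a single token matches the input used in the example above; whether the model benefits from that choice has not been verified here.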

About

Simple script to add punctuation
