Cannot Import PunktWordTokenizer in nltk 3.3 #2122
Comments
Punkt is a sentence tokenization algorithm, not a word tokenizer. For word tokenization, you can use word_tokenize:
>>> from nltk import word_tokenize
>>> word_tokenize("This is a sentence, where foo bar is present.")
['This', 'is', 'a', 'sentence', ',', 'where', 'foo', 'bar', 'is', 'present', '.']
Also, please do take a look at http://www.nltk.org/book/ch03.html
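As an aside, the kind of split word_tokenize produces above can be roughly approximated with the standard library alone. This is only a sketch for illustration, not NLTK's actual Treebank-based implementation, and the helper name rough_word_tokenize is made up here:

```python
import re

def rough_word_tokenize(text):
    # Naive approximation: keep runs of word characters as tokens and
    # split off each punctuation mark as its own token.
    # NOT nltk.word_tokenize -- just a stdlib illustration of the idea.
    return re.findall(r"\w+|[^\w\s]", text)

print(rough_word_tokenize("This is a sentence, where foo bar is present."))
# ['This', 'is', 'a', 'sentence', ',', 'where', 'foo', 'bar', 'is', 'present', '.']
```

The real word_tokenize handles many more cases (contractions, quotes, ellipses), so prefer it whenever NLTK is available.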
Yeah, I usually use
Yes. If you're interested in improving Punkt, do take a look at #2008
Nope, it's available in version 3.3, but use the following import instead:
>>> from nltk.tokenize.punkt import PunktSentenceTokenizer
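For context, the sentence-boundary splitting that PunktSentenceTokenizer performs can be crudely mimicked with the standard library. This is only an illustration under stated assumptions (the helper name naive_sent_tokenize is invented here); the real Punkt tokenizer is an unsupervised, trainable model that learns abbreviations and collocations rather than splitting on punctuation alone:

```python
import re

def naive_sent_tokenize(text):
    # Crude stand-in for PunktSentenceTokenizer: split after '.', '!',
    # or '?' when followed by whitespace. Punkt is far smarter (it
    # handles abbreviations like "Dr." correctly); this sketch does not.
    return re.split(r"(?<=[.!?])\s+", text.strip())

print(naive_sent_tokenize("Punkt splits sentences. It learns abbreviations. This sketch does not."))
# ['Punkt splits sentences.', 'It learns abbreviations.', 'This sketch does not.']
```

This is why a naive splitter breaks on text like "Dr. Smith arrived.", and why Punkt exists in the first place.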
Yes, it does seem like PunktSentenceTokenizer has been re-exposed: https://github.com/nltk/nltk/blob/develop/nltk/tokenize/punkt.py#L1236 Closing this issue as resolved then =) Please do reopen the issue if it's still relevant/unresolved.
How do I use PunktWordTokenizer in nltk 3.3? Has it been deprecated or renamed?
>>> import nltk
>>> nltk.__version__
'3.3'
>>> from nltk.tokenize import PunktWordTokenizer
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'PunktWordTokenizer'
Any help/suggestion is highly appreciated.