PunktTokenizer does not use the correct version of the pickled model on Python 3.x #2250

BLKSerene · 2019-03-10T09:47:05Z

Hi, I'm trying to package my program with NLTK and nltk_data using PyInstaller. To minimize the size of the data files, I removed the zip file and all Python 2.x models in nltk_data/tokenizers/punkt, leaving only the PY3 folder.

However, PunktTokenizer always seems to use the Python 2.x version of the pickled model, regardless of the Python version I'm using. The error message says that it can't find tokenizers/punkt/english.pickle, rather than tokenizers/punkt/PY3/english.pickle.

Removing the PY3 folder instead causes no problem, so it seems that the Python 3 version of the pickled model is never used.
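To make the directory state described above easier to check before packaging, here is a minimal standard-library sketch that lists which pickled models remain at the top level of the punkt folder and which remain under PY3. The helper name `list_punkt_models` is hypothetical and not part of NLTK, and the layout (Python 2.x models at the top level, Python 3 models in a PY3 subfolder) is assumed from the description in this report.

```python
from pathlib import Path


def list_punkt_models(punkt_dir):
    """Group the .pickle files under a punkt directory by location.

    Hypothetical helper, not an NLTK API. Assumes the layout described
    in this report: Python 2.x models directly in the directory and
    Python 3 models in a PY3 subfolder.
    """
    root = Path(punkt_dir)
    py3 = root / "PY3"
    return {
        # Pickles directly inside the punkt directory.
        "top_level": sorted(p.name for p in root.glob("*.pickle"))
        if root.is_dir() else [],
        # Pickles inside the PY3 subfolder, if it exists.
        "PY3": sorted(p.name for p in py3.glob("*.pickle"))
        if py3.is_dir() else [],
    }


if __name__ == "__main__":
    # Path taken from the report; adjust to wherever nltk_data lives.
    print(list_punkt_models("nltk_data/tokenizers/punkt"))
```

With only the PY3 folder left, the "top_level" list would be expected to be empty, which matches the path named in the error message.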

OS: Windows 10 64-bit
Python version: 3.7.2 64-bit
NLTK version: 3.4


alvations added the nltk_data, pleaseverify, and tokenizer labels on May 7, 2019

BLKSerene closed this as completed Aug 21, 2022
