Resource punkt not found. Please use the NLTK Downloader to obtain the resource: #54

Closed
gemfield opened this issue Jul 15, 2021 · 1 comment

Comments

@gemfield
Contributor

Environment:

  • MLab HomePod 2.0 pro
  • Host OS: Ubuntu 20.04

Code and error:

>>> from nltk import word_tokenize
>>> word_tokenize("gemfield is a civilnet maintainer")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/please_cd_to/home/gemfield/.local/lib/python3.8/site-packages/nltk/tokenize/__init__.py", line 130, in word_tokenize
    sentences = [text] if preserve_line else sent_tokenize(text, language)
  File "/please_cd_to/home/gemfield/.local/lib/python3.8/site-packages/nltk/tokenize/__init__.py", line 107, in sent_tokenize
    tokenizer = load("tokenizers/punkt/{0}.pickle".format(language))
  File "/please_cd_to/home/gemfield/.local/lib/python3.8/site-packages/nltk/data.py", line 750, in load
    opened_resource = _open(resource_url)
  File "/please_cd_to/home/gemfield/.local/lib/python3.8/site-packages/nltk/data.py", line 875, in _open
    return find(path_, path + [""]).open()
  File "/please_cd_to/home/gemfield/.local/lib/python3.8/site-packages/nltk/data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')
  
  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/PY3/english.pickle

  Searched in:
    - '/please_cd_to/home/gemfield/nltk_data'
    - '/opt/conda/nltk_data'
    - '/opt/conda/share/nltk_data'
    - '/opt/conda/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
@gemfield
Contributor Author

You need to download the punkt resource:

>>> import nltk
>>> nltk.download('punkt')
[nltk_data] Downloading package punkt to /home/gemfield/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
True
>>> 

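However, this download is blocked in mainland China. You can download the package from outside the firewall and then manually copy its contents into any one of the directories below, so that the files end up under tokenizers/punkt/ inside that directory (the path the error message tried to load):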

  • '/please_cd_to/home/gemfield/nltk_data'
  • '/opt/conda/nltk_data'
  • '/opt/conda/share/nltk_data'
  • '/opt/conda/lib/nltk_data'
  • '/usr/share/nltk_data'
  • '/usr/local/share/nltk_data'
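
The manual copy can be scripted. The following is a minimal sketch, not part of NLTK: `install_punkt` is a hypothetical helper, and it assumes the archive you fetched is the punkt.zip that the downloader unpacks, with a single top-level punkt/ folder inside it.

```python
import zipfile
from pathlib import Path


def install_punkt(zip_path, nltk_data_dir):
    """Extract a manually downloaded punkt.zip into <nltk_data_dir>/tokenizers/.

    Hypothetical helper for illustration; it only uses the standard library.
    """
    target = Path(nltk_data_dir).expanduser() / "tokenizers"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive contains a top-level "punkt/" folder,
        # so the files land in <nltk_data_dir>/tokenizers/punkt/.
        archive.extractall(target)
    return target / "punkt"


# Example call (paths are placeholders; pick any directory from the list above):
# install_punkt("punkt.zip", "/usr/local/share/nltk_data")
```

System-wide directories such as /usr/share/nltk_data usually need root permissions; the per-user nltk_data directory under your home directory does not.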

After that, running the code again no longer raises the error:

>>> import nltk
>>> from nltk import word_tokenize
>>> word_tokenize("gemfield is a civilnet maintainer")
['gemfield', 'is', 'a', 'civilnet', 'maintainer']
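
If the LookupError still appears after a manual copy, it can help to check which of the searched directories actually contains tokenizers/punkt. This is a rough sketch using only the standard library; `locate_punkt` is a hypothetical helper rather than an NLTK function, and the directory list is simply copied from the error message in this issue.

```python
from pathlib import Path

# Directories copied from the "Searched in:" list in the error message.
SEARCH_DIRS = [
    "/please_cd_to/home/gemfield/nltk_data",
    "/opt/conda/nltk_data",
    "/opt/conda/share/nltk_data",
    "/opt/conda/lib/nltk_data",
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
]


def locate_punkt(search_dirs):
    """Return the first directory that holds tokenizers/punkt, or None."""
    for base in search_dirs:
        base_path = Path(base)
        if (base_path / "tokenizers" / "punkt").is_dir():
            return base_path
    return None


# Example call:
# print(locate_punkt(SEARCH_DIRS))
```

A result of None suggests the files were copied to the wrong level, for example directly into nltk_data instead of nltk_data/tokenizers/punkt.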
