collocations function returns error #2299

Closed
martinevanschouwenburg opened this issue May 15, 2019 · 12 comments

Comments

@martinevanschouwenburg

I was going through chapter 1 of the book and the collocations function returns an error. It seems like line 440 in text.py is redundant, since the collocation_list function has been introduced. I fixed the issue by rewriting the current line 440 and line 441 in text.py.

old code:
collocation_strings = [w1 + ' ' + w2 for w1, w2 in self.collocation_list(num, window_size)]
print(tokenwrap(collocation_strings, separator="; "))

new code:
print(tokenwrap(self.collocation_list(num, window_size), separator="; "))

martinevanschouwenburg changed the title from "collocations" to "collocations function returns error" on May 15, 2019
@alvations
Contributor

alvations commented May 16, 2019

Thanks @martinevanschouwenburg for raising the bug!

Yes it looks like the collocation list is needed. To replicate the bug:

$ python3
Python 3.6.4rc1 (v3.6.4rc1:3398dcb14f, Dec  5 2017, 00:58:30) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text4.collocations()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nltk/text.py", line 440, in collocations
    collocation_strings = [w1 + ' ' + w2 for w1, w2 in self.collocation_list(num, window_size)]
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/nltk/text.py", line 440, in <listcomp>
    collocation_strings = [w1 + ' ' + w2 for w1, w2 in self.collocation_list(num, window_size)]
ValueError: too many values to unpack (expected 2)
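For anyone wondering why this is a ValueError: the message is what Python raises when a two-name unpacking receives something longer than two items. The sketch below reproduces it without NLTK, assuming (based on the traceback, not on reading the released source) that collocation_list() in the affected release already returns joined strings such as "United States". The name join_pairs is a hypothetical stand-in for the failing line in text.py.

```python
# Stand-in for what collocation_list() appears to return in the affected
# release: already-joined strings (an assumption based on the traceback).
joined = ["United States", "fellow citizens"]


def join_pairs(items):
    # Same shape as the failing line in text.py: unpack each item into two words.
    return [w1 + " " + w2 for w1, w2 in items]


try:
    join_pairs(joined)
    error_message = None
except ValueError as exc:
    # A string is unpacked character by character, so it yields far more than 2 values.
    error_message = str(exc)

print(error_message)

# With real (w1, w2) tuples the same comprehension works.
pairs = [("United", "States"), ("fellow", "citizens")]
print(join_pairs(pairs))
```

If that assumption holds, either collocation_list() has to return tuples or collocations() has to stop unpacking, which matches the two kinds of fix proposed in this thread.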

@networkjr

I am still seeing this error as well when going through chapter 1 of the book.

*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
Traceback (most recent call last):
  File "c:\Users\Adam\.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "c:\Users\Adam\.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\lib\python\ptvsd\__main__.py", line 434, in main
    run()
  File "c:\Users\Adam\.vscode\extensions\ms-python.python-2019.6.24221\pythonFiles\lib\python\ptvsd\__main__.py", line 312, in run_file
    runpy.run_path(target, run_name='__main__')
  File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "c:\users\adam\appdata\local\programs\python\python37-32\Lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "c:\Users\Adam\Documents\code\python\natlang\natlang.py", line 4, in <module>
    text4.collocations()
  File "C:\Users\Adam\.virtualenvs\natlang-9ek-vNym\lib\site-packages\nltk\text.py", line 444, in collocations
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
  File "C:\Users\Adam\.virtualenvs\natlang-9ek-vNym\lib\site-packages\nltk\text.py", line 444, in <listcomp>
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
ValueError: too many values to unpack (expected 2)

@rdbliss

rdbliss commented Aug 5, 2019

@networkjr I can confirm that too. Maybe the fix in #2227 hasn't been pushed to PyPI yet?

@callumskeet

@networkjr it's the same with the Anaconda package

@george-carlin

I'm working through the NLTK book, am completely new to NLTK and fairly new to Python - and I'm getting this same error.

$ python
Python 3.7.2 (default, Feb 14 2019, 11:13:53) 
[Clang 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text4.collocations()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/george/code/nltk/py3env/lib/python3.7/site-packages/nltk/text.py", line 444, in collocations
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
  File "/Users/george/code/nltk/py3env/lib/python3.7/site-packages/nltk/text.py", line 444, in <listcomp>
    w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)
ValueError: too many values to unpack (expected 2)

According to my Pipfile.lock I'm using NLTK 3.4.5 which I believe is the most recent release.

Is there a fix for this issue?

@alvations
Contributor

This has been fixed in #2377 and should be included in the next NLTK release.

Otherwise, if you can't wait =)

pip install -U https://github.com/nltk/nltk/archive/develop.zip

@xiaomiaoright

I still have the same error after updating nltk with
pip install -U https://github.com/nltk/nltk/archive/develop.zip

Current nltk version: '3.4.5'

How can I fix it?

Many thanks.
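One thing worth checking in a situation like this is which copy of the package the interpreter actually picks up, since a pip install into one environment does not affect another. Below is a minimal sketch using only the standard library (Python 3.8+); describe_install is a hypothetical helper name, not part of NLTK. Note that the reported version string alone may not tell a development snapshot apart from the last release, so the file location is the more useful half.

```python
import importlib.util
from importlib import metadata  # available in Python 3.8+


def describe_install(package):
    """Return (version, location) for a package; either value may be None."""
    try:
        # Version recorded by the installer (pip, conda, ...), if any.
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = None
    # Where the interpreter would import the package from, without importing it.
    spec = importlib.util.find_spec(package)
    location = spec.origin if spec is not None else None
    return version, location


print(describe_install("nltk"))
```

If the printed location is not inside the environment where pip ran, the upgrade went somewhere else.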

thomaskrause added a commit to thomaskrause/nlp-mit-python that referenced this issue Nov 14, 2019
@dwanneruchi

Also still having issues with .collocations(), but .collocation_list() works.

@TSanthoshi

Replace line 444 in nltk/text.py:
collocation_strings = [w1 + " " + w2 for w1, w2 in self.collocation_list(num, window_size)]

with the following:
collocation_strings = list(self.collocation_list(num, window_size))
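If you would rather not edit files under site-packages, the same effect can be had from your own script, since collocation_list() is reported to work. A minimal sketch follows; format_collocations is a hypothetical helper, and it accepts both joined strings and (w1, w2) tuples because the return type of collocation_list() appears to differ between releases.

```python
def format_collocations(items, separator="; "):
    """Join collocations given either as 'w1 w2' strings or as (w1, w2) tuples."""
    return separator.join(
        item if isinstance(item, str) else " ".join(item) for item in items
    )


# Hypothetical usage with the book's texts:
# from nltk.book import text4
# print(format_collocations(text4.collocation_list()))

# Both input shapes produce the same line of output.
print(format_collocations(["United States", "fellow citizens"]))
print(format_collocations([("United", "States"), ("fellow", "citizens")]))
```

Unlike the one-line patch above, this keeps working after an upgrade, whichever shape the installed release returns.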

@KutapiAise

Same here. Working through the NLTK book, collocations() gives an error whereas collocation_list() works.

@tomaarsen
Member

I am unable to reproduce this error on the develop branch:

$ python
Python 3.9.4 (tags/v3.9.4:1f2e308, Apr  6 2021, 13:40:21) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from nltk.book import *
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
>>> text4.collocations()
United States; fellow citizens; four years; years ago; Federal
Government; General Government; American people; Vice President; God
bless; Chief Justice; Old World; Almighty God; Fellow citizens; Chief
Magistrate; every citizen; one another; fellow Americans; Indian
tribes; public debt; foreign nations
>>>

I believe this has been resolved, and this issue ought to be closed accordingly.

@stevenbird
Member

Thanks @tomaarsen
