FreqDist.keys() produces map object (dictionary?) and not list #390

Closed
sonofmun opened this issue Apr 17, 2013 · 10 comments

@sonofmun

After importing NLTK and * from nltk.book, I run the following code and get the following results:

>>> fdist = FreqDist(text1)
>>> fdist
<FreqDist with 19317 samples and 260819 outcomes>
>>> FreqDist.keys(fdist)
<map object at 0x03DD21D0>

The help for FreqDist says that keys(self) should produce a list of the keys but it appears that it produces another dictionary.
I can get a list of the keys by typing:

>>> list(FreqDist.keys(fdist))

Is the behavior of the code as it should be and the help simply not updated? Or is this a problem with the code?

@kmike
Member

kmike commented Apr 17, 2013

Do you use Python 3.x?

In Python 3.x, dict.keys() returns a view object rather than a list; while porting to Python 3.x I decided to prefer idiomatic behavior over backwards compatibility in such cases. NLTK 3 is still in alpha, so we could revert this decision. If you have arguments in favor of keys() returning a list, they are welcome.

Also, in Python you usually write

>>> fdist=FreqDist(text1)
>>> list(fdist.keys())   # not FreqDist.keys(fdist)
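
For reference, a minimal sketch of the Python 3 behavior, using a plain dict as a stand-in for FreqDist. The `counts` name and the sample data are made up for illustration; with NLTK installed, `fdist.keys()` would presumably need the same `list(...)` call.

```python
# Plain dict standing in for a FreqDist; the sample data is hypothetical.
counts = {"whale": 3, "sea": 2, "ship": 1}

keys = counts.keys()        # a view object in Python 3, not a list
key_list = list(keys)       # materialize the keys when a real list is needed

print(type(keys).__name__)  # dict_keys
print(key_list)             # ['whale', 'sea', 'ship'] (insertion order, Python 3.7+)
```

Wrapping the call in `list(...)` works on both Python 2 and Python 3, so it is a reasonable default when following examples written for 2.x.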

@kmike
Member

kmike commented Apr 17, 2013

If you want to follow the book, it is better to install Python 2.7, because the book has not been updated for NLTK 3 / Python 3.x yet.

@heatherleaf
Contributor

My suggestion is to keep the current behavior and update the docstrings. (The same also holds for .values())

@sonofmun
Author

Hi kmike. Thanks for the information. I figured it was something like that. I have no problem with the new behavior.
And I know that the book was done for Python 2.7, but I like the challenge of trying to get the same behavior out of Python 3.x as described in the book. So far, doing it this way has been a very good learning activity for me.

@kmike
Member

kmike commented Apr 17, 2013

@sonofmun I like your approach 👍 But be prepared: many pre-trained models won't work (I think including nltk.pos_tag) because we haven't fixed that yet. You'll have to train them yourself, and the book does not always describe how to train models.

@heatherleaf Do you have any ideas on how to fix those docstrings? "Return list in Python 2.x and iterable in Python 3.x" is a bit cumbersome.

@heatherleaf
Contributor

No, I think that's fine. But perhaps "iterator" is better than "iterable", since that's what the documentation for map() in py3 says.

"Returns a list in Python 2.x, and an iterator in Python 3.x"
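
To illustrate the iterator/iterable distinction in that wording, here is a small sketch using only the standard library. The `words` list is hypothetical sample data, and `map()` stands in for whatever keys() returns in the alpha.

```python
from collections.abc import Iterable, Iterator

words = ["whale", "sea", "ship"]    # hypothetical sample data

upper = map(str.upper, words)       # map() returns an iterator in Python 3

print(isinstance(upper, Iterator))  # True
print(isinstance(words, Iterator))  # False: a list is iterable, but not an iterator
print(isinstance(words, Iterable))  # True

first_pass = list(upper)            # consumes the iterator
second_pass = list(upper)           # nothing is left the second time
print(first_pass)                   # ['WHALE', 'SEA', 'SHIP']
print(second_pass)                  # []
```

The one-shot behavior is the practical difference users would notice, which is an argument for saying "iterator" in the docstring.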

@kmike closed this as completed in 600f936 May 25, 2013

@undertherain

The documentation at
https://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.FreqDist-class.html
says "Use FreqDist.keys(), or iterate over the FreqDist to get its samples in sorted order (most frequent first)".
However, neither the keys() method nor iteration over the object itself yields the elements sorted by frequency. I guess this should be updated.

@pasky

pasky commented Aug 24, 2014

Yes, I think this should be reopened per the previous comment. The main use of FreqDist seems to me to be getting samples in frequency order, but that's not what happens in Python 3, so imho this is quite broken.

@kmike
Member

kmike commented Aug 24, 2014

FreqDist in NLTK 3 is a thin wrapper around (a subclass of) collections.Counter; Counter provides the most_common() method to return items in frequency order. The FreqDist.keys() method comes from the standard library and is not overridden. I think it is good that we're becoming more compatible with the stdlib.
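
A minimal sketch of the difference, using collections.Counter directly so it runs without NLTK. The `tokens` sample is made up; assuming FreqDist inherits from Counter as described above, `fdist.most_common()` should behave the same way.

```python
from collections import Counter

tokens = "the whale and the sea and the ship".split()  # hypothetical sample text
counts = Counter(tokens)

# keys() follows insertion order, not frequency order.
in_key_order = list(counts.keys())

# most_common() returns (sample, count) pairs, highest count first.
ranked = counts.most_common()
top_two = [word for word, _ in counts.most_common(2)]

print(in_key_order)  # ['the', 'whale', 'and', 'sea', 'ship']
print(ranked[0])     # ('the', 3)
print(top_two)       # ['the', 'and']
```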

@undertherain docs at googlecode are very old, they are from 2011. More up-to-date docs can be found on http://nltk.org website.

@pasky

pasky commented Aug 25, 2014

Oh, I see! I just got confused by the documentation linked by @undertherain then. Thanks.
