Chapt 5 Section 2.5 Verbs section correction #191

Open
AncientZygote opened this issue May 15, 2017 · 1 comment

Comments

@AncientZygote

There is an error in the NLTK Book (Natural Language Processing with Python, updated for Python 3 and NLTK 3), Chapter 5, "Categorizing and Tagging Words", Section 2.5, "Verbs":

"To clarify the distinction between VBD (past tense) and VBN (past participle), let's find words which can be both VBD and VBN, and see some surrounding text:

[w for w in cfd1.conditions() if 'VBD' in cfd1[w] and 'VBN' in cfd1[w]]
['Asked', 'accelerated', 'accepted', 'accused', 'acquired', 'added', 'adopted', ...]"

The list comprehension above returns an empty list because cfd1 was built from the treebank.tagged_words() corpus with the universal tagset and must be regenerated with the standard tagset. Insert the following line before the comprehension:

cfd1 = nltk.ConditionalFreqDist(wsj)

The corpus variable wsj was reassigned to the standard tagset just before this example, so this one additional line is all that is needed to rebuild the conditional frequency distribution. The events 'VBD' and 'VBN' can then be found in the distribution, instead of only 'VERB'.

A minor additional detail: the example result will not be in alphabetical order (as shown in the book text) unless the comprehension is wrapped in the sorted() function.
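To illustrate the behaviour described here without downloading the treebank corpus, the following is a minimal sketch. It rests on assumptions: the toy word list, the `to_universal` mapping, and the `build_cfd` and `both_vbd_vbn` helpers are all hypothetical, and `build_cfd` is a plain-Python stand-in for nltk.ConditionalFreqDist, not the NLTK class itself.

```python
from collections import Counter, defaultdict

def build_cfd(pairs):
    """Hypothetical stand-in for nltk.ConditionalFreqDist:
    maps each condition to a Counter of its events."""
    cfd = defaultdict(Counter)
    for condition, event in pairs:
        cfd[condition][event] += 1
    return cfd

# Toy (word, tag) pairs standing in for treebank.tagged_words(); not real corpus data.
wsj_fine = [
    ("asked", "VBD"), ("asked", "VBN"),
    ("accepted", "VBN"), ("accepted", "VBD"),
    ("Asked", "VBD"), ("Asked", "VBN"),
    ("ran", "VBD"), ("taken", "VBN"),
    ("yield", "VB"), ("yield", "NN"),
]

# Hypothetical coarse mapping imitating tagged_words(tagset='universal').
to_universal = {"VBD": "VERB", "VBN": "VERB", "VB": "VERB", "NN": "NOUN"}
wsj_universal = [(word, to_universal[tag]) for word, tag in wsj_fine]

def both_vbd_vbn(cfd):
    """The comprehension from the book, applied to a word -> tag distribution."""
    return [w for w in cfd if "VBD" in cfd[w] and "VBN" in cfd[w]]

stale = both_vbd_vbn(build_cfd(wsj_universal))  # distribution built from coarse tags
rebuilt = both_vbd_vbn(build_cfd(wsj_fine))     # distribution rebuilt from fine-grained tags

print(stale)            # empty: only 'VERB' and 'NOUN' exist as events
print(rebuilt)          # words in first-seen order
print(sorted(rebuilt))  # alphabetical, capitalised words first
```

With real data, the same effect should come from reassigning cfd1 as in the one-line fix above and wrapping the comprehension in sorted().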

pjhinton commented Sep 12, 2019

Another possible approach might be to just use set and its operators to do the work, using the ConditionalFreqDist created and stored in cfd2.

sorted(set(cfd2['VBN']) & set(cfd2['VBD']))
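As a rough check that the set-based approach finds the same words, here is a small sketch on toy data. The word list is hypothetical, and the `defaultdict(Counter)` structure is only a stand-in for the tag-keyed ConditionalFreqDist stored in cfd2.

```python
from collections import Counter, defaultdict

# Toy (word, tag) pairs standing in for treebank.tagged_words(); not real corpus data.
wsj_fine = [
    ("asked", "VBD"), ("asked", "VBN"),
    ("accepted", "VBN"), ("accepted", "VBD"),
    ("Asked", "VBD"), ("Asked", "VBN"),
    ("ran", "VBD"), ("taken", "VBN"),
]

# Stand-in for cfd2: the tag is the condition and the word is the event.
cfd2 = defaultdict(Counter)
for word, tag in wsj_fine:
    cfd2[tag][word] += 1

# Words seen with both tags: intersect the two event sets, then sort.
both = sorted(set(cfd2["VBN"]) & set(cfd2["VBD"]))
print(both)
```

Because cfd2 is keyed by tag, this variant does not depend on cfd1 being rebuilt at all.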
