Chapt 5 Section 2.5 Verbs section correction #191

AncientZygote · 2017-05-15T01:54:25Z

There is an error in the NLTK Book updated for Python 3 and NLTK 3, Natural Language Processing with Python; Chapter 5. Categorizing and Tagging Words; Section 2.5 Verbs:

"To clarify the distinction between VBD (past tense) and VBN (past participle), let's find words which can be both VBD and VBN, and see some surrounding text:

[w for w in cfd1.conditions() if 'VBD' in cfd1[w] and 'VBN' in cfd1[w]]
['Asked', 'accelerated', 'accepted', 'accused', 'acquired', 'added', 'adopted', ...]"

The list comprehension above returns an empty list, because cfd1 was built while wsj still held treebank.tagged_words() with the universal tagset. cfd1 must be regenerated from the standard tagset. Insert the following line before the comprehension:

cfd1 = nltk.ConditionalFreqDist(wsj)

The corpus variable wsj was reassigned to the standard tagset just before this example, so this one additional line is all that is required. It rebuilds the conditional frequency distribution with the standard tagset, so that the events 'VBD' and 'VBN' can be found in the distribution (instead of only 'VERB').

A minor additional detail: the result will not be in alphabetical order (as shown in the book text) unless the comprehension is wrapped in sorted().
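To make the cause easier to see without downloading the corpus, here is a minimal sketch that uses only the standard library. The toy word list, the simplified `to_universal` mapping, and the helpers `build_cfd` and `both_vbd_vbn` are all hypothetical stand-ins, not the book's code; `build_cfd` only imitates the word-to-tag-counts shape of `nltk.ConditionalFreqDist(wsj)`.

```python
from collections import Counter, defaultdict

# Hypothetical toy data standing in for treebank.tagged_words();
# these are not real corpus entries or counts.
standard = [
    ("asked", "VBD"), ("asked", "VBN"),
    ("Asked", "VBN"), ("Asked", "VBD"),
    ("added", "VBD"), ("added", "VBN"),
    ("yield", "NN"), ("said", "VBD"),
]

# Hypothetical, simplified mapping to universal tags (assumption for this sketch).
to_universal = {"VBD": "VERB", "VBN": "VERB", "NN": "NOUN"}
universal = [(word, to_universal[tag]) for word, tag in standard]


def build_cfd(pairs):
    """Map each word (the condition) to a Counter of the tags seen with it."""
    cfd = defaultdict(Counter)
    for word, tag in pairs:
        cfd[word][tag] += 1
    return cfd


def both_vbd_vbn(cfd):
    """Words that occur with both the VBD and the VBN tag."""
    return [w for w in cfd if "VBD" in cfd[w] and "VBN" in cfd[w]]


# Built from universal tags: only 'VERB' and 'NOUN' exist, so nothing matches.
stale = both_vbd_vbn(build_cfd(universal))

# Rebuilt from standard tags: the VBD/VBN distinction is available again.
rebuilt = both_vbd_vbn(build_cfd(standard))

# sorted() gives the alphabetical order; capitalized words sort first.
ordered = sorted(rebuilt)

print(stale)
print(rebuilt)
print(ordered)
```

In this sketch the unsorted result follows insertion order, which is why wrapping the comprehension in sorted() is what puts 'Asked' ahead of the lowercase words.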

pjhinton · 2019-09-12T20:18:31Z

Another possible approach might be to just use set and its operators to do the work, using the ConditionalFreqDist created and stored in cfd2.

sorted(set(cfd2['VBN']) & set(cfd2['VBD']))
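As a rough check that the set-based approach and the comprehension agree, here is a small standard-library sketch. The toy data and the names `by_word` and `by_tag` are hypothetical; they only imitate the shapes of cfd1 (word to tags) and cfd2 (tag to words), both built from standard-tagset pairs.

```python
from collections import Counter, defaultdict

# Hypothetical toy data with standard (Penn Treebank style) tags;
# not real corpus entries or counts.
tagged = [
    ("Asked", "VBD"), ("Asked", "VBN"),
    ("added", "VBN"), ("added", "VBD"),
    ("said", "VBD"), ("been", "VBN"),
    ("yield", "NN"),
]

by_word = defaultdict(Counter)  # imitates cfd1: word -> tag counts
by_tag = defaultdict(Counter)   # imitates cfd2: tag -> word counts
for word, tag in tagged:
    by_word[word][tag] += 1
    by_tag[tag][word] += 1

# Comprehension over the word-keyed distribution.
via_comprehension = sorted(
    w for w in by_word if "VBD" in by_word[w] and "VBN" in by_word[w]
)

# Set intersection over the tag-keyed distribution.
via_sets = sorted(set(by_tag["VBN"]) & set(by_tag["VBD"]))

print(via_comprehension)
print(via_sets)
```

Both routes select the words seen with both tags; the set version reads each tag's word list once instead of testing every word in the vocabulary.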
