PP ATTACHMENT CORPUS

Adwait Ratnaparkhi

ftp://ftp.cis.upenn.edu/pub/adwait/PPattachData/

This directory contains the data used for the model described in:

Ratnaparkhi, Adwait (1994). A Maximum Entropy Model for Prepositional
Phrase Attachment.  Proceedings of the ARPA Human Language Technology
Conference.  [http://www.cis.upenn.edu/~adwait/papers/hlt94.ps]

CONTENTS

training:   training data
devset:     development test set,
            used for debugging and algorithm development.
test:       test set, used to report results.
bitstrings: word classes derived from Mutual Information
            Clustering for the Wall Street Journal.

training, devset, and test are in the format:
  <source sentence#> V N1 P N2 <attachment>
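
The record format above can be read with a few lines of Python. This is a minimal sketch, not the author's code. It assumes one record per line, six whitespace-separated fields, and an attachment label of V or N; the README does not spell out any of these. The names PPInstance and parse_line are hypothetical, and the sample line is made up for illustration.

```python
from typing import NamedTuple


class PPInstance(NamedTuple):
    """One record: <source sentence#> V N1 P N2 <attachment>."""
    sentence: str    # source sentence number, kept as a string
    verb: str        # V
    noun1: str       # N1
    prep: str        # P
    noun2: str       # N2
    attachment: str  # attachment label (assumed to be "V" or "N")


def parse_line(line: str) -> PPInstance:
    """Split one line into its six fields (assumes whitespace separators)."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    return PPInstance(*fields)


# Made-up sample line for illustration; not copied from the corpus.
instance = parse_line("12 ate pizza with fork V")
print(instance.verb, instance.prep, instance.attachment)
```

To read a whole file, apply parse_line to each non-empty line of training, devset, or test.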

Distributed with NLTK with the permission of the author.