
g000001/tagger


This directory contains release 1.2 of the Xerox Part-of-Speech tagger. For more information, print the file doc/tagger/tagger.ps.


Until this project is added to the Quicklisp repository, it must be installed manually in several steps (assuming Quicklisp is already installed):

  1. Download the project sources, either by cloning the repository or by downloading an archive.
  2. Unpack them into a directory and note its path.
  3. cd to the ~/quicklisp/local-projects directory.
  4. Create symbolic links there to the .asd files in the directory from step 2.
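
The steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name link_asd_files is hypothetical, the clone URL is inferred from the repository name, and the helper searches subdirectories because this README does not say where the .asd files live in the source tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of the tagger distribution):
# link every .asd file found under SRC into DEST.
link_asd_files() {
    # Resolve SRC to an absolute path so the symlinks stay valid
    # regardless of the directory they are read from.
    src=$(cd "$1" && pwd) || return 1
    dest=$2
    mkdir -p "$dest" || return 1
    # Assumption: .asd files may sit in subdirectories, so search recursively.
    find "$src" -type f -name '*.asd' | while IFS= read -r asd; do
        ln -sf "$asd" "$dest/"
    done
}

# Example usage (URL and paths are assumptions, adjust to your setup):
#   git clone https://github.com/g000001/tagger.git "$HOME/src/tagger"
#   link_asd_files "$HOME/src/tagger" "$HOME/quicklisp/local-projects"
```

Using ln -sf means the helper can be rerun after updating the sources without failing on links that already exist.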

Now the application can be loaded, either in parts or entirely:

(ql:quickload "tagger")

When the loading is complete, you can run some simple queries:

(tag-analysis:tag-string "I saw the man on the hill with the telescope.")

I saw the man on the hill with the telescope.
ppss/2 vbd/3 at nn in at nn in/2 at nn/2

(The number following the tag is the arity of the ambiguity class assigned by the lexicon. Words without a number are unambiguous.)
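
To make the notation concrete, the following shell sketch splits a tag line like the one above into one tag and arity per line. The function name annotate_tags is hypothetical, and reporting an arity of 1 for tags without a number is an assumption based on the statement that such words are unambiguous.

```shell
#!/bin/sh
# Hypothetical helper: print "tag arity" for each entry in a tag line.
# Assumption: a tag with no "/N" suffix is unambiguous, reported as arity 1.
annotate_tags() {
    printf '%s\n' "$1" | tr ' ' '\n' |
        awk -F/ 'NF { print $1, (NF > 1 ? $2 : 1) }'
}

annotate_tags "ppss/2 vbd/3 at nn"
# prints:
#   ppss 2
#   vbd 3
#   at 1
#   nn 1
```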

Programmatic Tagging

To use the tagger in a program, create a tagging-ts and use the values of calls to the generic function next-token. Note that reinitialize-instance redirects tagging to a new text with minimal initialization overhead.

For example, the following function, my-tag-files, calls my-process-token-and-tag on each token/tag pair generated by tagging each file in the argument files:

(use-package :tdb)
(use-package :tag-analysis)

(defun my-tag-files (files)
  ;; One token stream is created and reused for every file.
  (let ((token-stream (make-instance 'tagging-ts)))
    (dolist (file files)
      (with-open-file (char-stream file)
        ;; Redirect the existing stream to the next file.
        (reinitialize-instance token-stream :char-stream char-stream)
        ;; next-token returns a token and its tag; stop when no token is left.
        (loop (multiple-value-bind (token tag)
                  (next-token token-stream)
                (unless token (return))
                (my-process-token-and-tag token tag)))))))
