dr-kd/jspos

ABOUT:

jspos is a JavaScript port of Mark Watson's FastTag part-of-speech tagger, which
was itself based on Eric Brill's trained rule set and English lexicon.
jspos also includes a basic lexer that can be used to extract words and other
tokens from text strings.

LICENSE:

jspos is licensed under the GNU LGPLv3

FILES:

lexicon.js_ - JavaScript version of Eric Brill's English lexicon
lexer.js - Lexer to break a sentence into taggable tokens (e.g. words)
POSTagger.js - the Part of Speech tagger

You'll typically need to include all 3 files.

USAGE:

// Split the text into tokens, then tag each token.
var words = new Lexer().lex("This is some sample text. This text can contain multiple sentences.");
var taggedWords = new POSTagger().tag(words);
// Each entry is a [word, tag] pair.
for (var i = 0; i < taggedWords.length; i++) {
    var taggedWord = taggedWords[i];
    var word = taggedWord[0];
    var tag = taggedWord[1];
}
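
Once the text is tagged, a common next step is to pick out words by tag. The sketch below assumes the [word, tag] pair shape shown above. The helper name wordsWithTagPrefix and the sample data are hypothetical: the pairs are written by hand for illustration and are not actual POSTagger output.

```javascript
// Hand-written [word, tag] pairs in the shape used by the loop above.
// Illustrative only; not real output from POSTagger.
var sampleTagged = [
    ["The", "DT"], ["dogs", "NNS"], ["ate", "VBD"],
    ["a", "DT"], ["big", "JJ"], ["bone", "NN"], [".", "."]
];

// Hypothetical helper: return the words whose tag starts with `prefix`.
// A prefix such as "NN" matches NN, NNS, NNP and NNPS.
function wordsWithTagPrefix(taggedWords, prefix) {
    var result = [];
    for (var i = 0; i < taggedWords.length; i++) {
        var word = taggedWords[i][0];
        var tag = taggedWords[i][1];
        if (tag.indexOf(prefix) === 0) {
            result.push(word);
        }
    }
    return result;
}

var nouns = wordsWithTagPrefix(sampleTagged, "NN"); // ["dogs", "bone"]
var verbs = wordsWithTagPrefix(sampleTagged, "VB"); // ["ate"]
```

Matching on a prefix rather than an exact tag keeps the call sites short, at the cost of lumping singular, plural and proper nouns together.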

ACKNOWLEDGEMENTS:

Thanks to Mark Watson for writing FastTag, which served as the basis for jspos.

TAGS:

CC Coord. conjunction       and,but,or
CD Cardinal number          one,two
DT Determiner               the,some
EX Existential there        there
FW Foreign Word             mon dieu
IN Preposition              of,in,by
JJ Adjective                big
JJR Adj., comparative       bigger
JJS Adj., superlative       biggest
LS List item marker         1,One
MD Modal                    can,should
NN Noun, sing. or mass      dog
NNP Proper noun, sing.      Edinburgh
NNPS Proper noun, plural    Smiths
NNS Noun, plural            dogs
POS Possessive ending       's
PDT Predeterminer           all, both
PRP$ Possessive pronoun     my,one's
PRP Personal pronoun        I,you,she
RB Adverb                   quickly
RBR Adverb, comparative     faster
RBS Adverb, superlative     fastest
RP Particle                 up,off
SYM Symbol                  +,%,&
TO "to"                     to
UH Interjection             oh, oops
VB Verb, base form          eat
VBD Verb, past tense        ate
VBG Verb, gerund            eating
VBN Verb, past part         eaten
VBP Verb, non-3sg present   eat
VBZ Verb, 3sg present       eats
WDT Wh-determiner           which,that
WP Wh pronoun               who,what
WP$ Possessive-Wh           whose
WRB Wh-adverb               how,where
, Comma                     ,
. Sent-final punct          . ! ?
: Mid-sent punct.           : ; --
$ Dollar sign               $
# Pound sign                #
" quote                     "
( Left paren                (
) Right paren               )
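
Many of the tags above share a prefix within a word class (NN* for nouns, VB* for verbs, JJ* for adjectives, RB* for adverbs). As a minimal sketch, a hypothetical helper (coarseClass is not part of jspos) can collapse a tag into its class:

```javascript
// Hypothetical helper, not part of jspos: map a tag from the table
// above to a coarse word class based on its prefix.
function coarseClass(tag) {
    if (tag.indexOf("NN") === 0) return "noun";      // NN, NNS, NNP, NNPS
    if (tag.indexOf("VB") === 0) return "verb";      // VB, VBD, VBG, VBN, VBP, VBZ
    if (tag.indexOf("JJ") === 0) return "adjective"; // JJ, JJR, JJS
    if (tag.indexOf("RB") === 0) return "adverb";    // RB, RBR, RBS
    return "other";                                  // everything else, e.g. DT, MD, WRB
}

var classes = [coarseClass("NNPS"), coarseClass("VBZ"), coarseClass("JJR"),
               coarseClass("RBS"), coarseClass("WRB")];
```

Note that prefix matching deliberately leaves WRB, RP and MD in "other"; whether those belong with adverbs or verbs depends on the application.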

AUTHOR:

Percy Wegmann: http://www.percywegmann.com/

The original of this code is available at http://code.google.com/p/jspos/

Kieren Diment <zarquon@cpan.org> added the demo.html and main.js files.

The next step is to add noun phrase extraction routines and other utility functions (see the Perl Module Lingua::EN::Tagger: http://search.cpan.org/perldoc?Lingua::EN::Tagger ).
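
As a rough illustration of what noun phrase extraction could look like on top of the tagger's [word, tag] pairs, here is a minimal sketch. It is an assumption for illustration only: it is not existing jspos code and not the algorithm used by Lingua::EN::Tagger. It treats a noun phrase as an optional determiner (DT), any number of adjectives (JJ*), and one or more nouns (NN*). The function name extractNounPhrases and the sample pairs are hypothetical, and the pairs are hand-written rather than real tagger output.

```javascript
// Hypothetical sketch, not part of jspos: collect runs matching
//   DT? JJ* NN+
// from an array of [word, tag] pairs.
function extractNounPhrases(taggedWords) {
    var phrases = [];
    var i = 0;
    while (i < taggedWords.length) {
        var j = i;
        // Optional determiner.
        if (taggedWords[j][1] === "DT") j++;
        // Zero or more adjectives.
        while (j < taggedWords.length && taggedWords[j][1].indexOf("JJ") === 0) j++;
        // One or more nouns.
        var nounStart = j;
        while (j < taggedWords.length && taggedWords[j][1].indexOf("NN") === 0) j++;
        if (j > nounStart) {
            var words = [];
            for (var k = i; k < j; k++) words.push(taggedWords[k][0]);
            phrases.push(words.join(" "));
            i = j;      // continue after the phrase
        } else {
            i = i + 1;  // no noun found; move on by one token
        }
    }
    return phrases;
}

// Hand-written sample pairs; illustrative only.
var npSample = [
    ["The", "DT"], ["dogs", "NNS"], ["ate", "VBD"],
    ["a", "DT"], ["big", "JJ"], ["bone", "NN"], [".", "."]
];
var nounPhrases = extractNounPhrases(npSample); // ["The dogs", "a big bone"]
```

A real implementation would need to handle more patterns (possessives, conjunctions, prepositional attachments), which this sketch leaves out.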

Fork of the Google Code project jspos, an English part-of-speech tagger in JavaScript.
