POS_Tagger
===============

Generic POS tagger for text documents. The folders used for reading and writing should be at the same level as the 'default' package.

python3 postagger.py --input <input-folder-location> --output <output-folder-location>
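
The two flags in the usage line could be parsed roughly as follows. This is a minimal sketch, assuming the script relies on the standard `argparse` module; only `--input` and `--output` come from this README, while `parse_args` and the help strings are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the two folder arguments from the usage line (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Generic POS tagger for text documents")
    # Both folders are assumed to be mandatory; the README does not say otherwise.
    parser.add_argument("--input", required=True, help="folder with the documents to tag")
    parser.add_argument("--output", required=True, help="folder where tagged files are written")
    return parser.parse_args(argv)


# Example call with an explicit argument list instead of sys.argv.
args = parse_args(["--input", "files/in", "--output", "files/out"])
print(args.input, args.output)
```

Passing `argv` explicitly keeps the helper easy to try from an interpreter without touching the real command line.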

Requires the following libraries:
- nltk
- stop_words
- inflection

UPDATES:

[2018-12-22]

Stopwords now come from NLTK. The list from PyPI is still an option; no clear difference observed.
Concatenation of NOUNS working. It has to be done before ADJ chains for more sound results.
removeLastRepWord(word) in case the word is part of a previous concatenation. Important to run removePunctuation first.
Concatenation of ADJ*NOUN+ is not working for consecutive adjectives (to do).
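
The NOUNS+ concatenation described in this entry can be pictured with the sketch below. It is an illustration under stated assumptions, not the project's code: the input is a list of `(word, pos_tag)` pairs using Penn Treebank tags (as returned by `nltk.pos_tag()`), nouns are detected by the `NN` prefix, and the function name `concatenate_nouns` and the `_` separator are hypothetical.

```python
def concatenate_nouns(tagged, sep="_"):
    """Merge runs of consecutive noun tokens into a single (word, tag) pair."""
    merged = []
    for word, tag in tagged:
        # True when the previously emitted token is also a noun.
        prev_is_noun = bool(merged) and merged[-1][1].startswith("NN")
        if tag.startswith("NN") and prev_is_noun:
            prev_word, _ = merged[-1]
            # Extend the previous noun; the merged token keeps the latest tag.
            merged[-1] = (prev_word + sep + word, tag)
        else:
            merged.append((word, tag))
    return merged


tagged = [("the", "DT"), ("neural", "JJ"), ("network", "NN"),
          ("model", "NN"), ("works", "VBZ")]
print(concatenate_nouns(tagged))
```

Adjectives are left untouched here, which matches the note that noun chains should be built before ADJ chains.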

[2018-12-20]

Implemented NOUNS+ concatenation
Better punctuation removal
Holy Grail math-<?>

[2018-12-17]

Fixed all filters to ignore math-<?> tokens. Concatenation also skips math-<?> tokens.
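
A filter that leaves math-<?> tokens untouched might look like the sketch below. It assumes such tokens can be recognised by a literal `math-` prefix, which this README does not confirm; `is_math_token` and `make_lower_case` are hypothetical names used only for illustration.

```python
def is_math_token(word):
    """Assumption: protected tokens start with the literal prefix 'math-'."""
    return word.startswith("math-")


def make_lower_case(tagged):
    """Lowercase every word except math tokens, keeping the POS tags unchanged."""
    return [(word if is_math_token(word) else word.lower(), tag)
            for word, tag in tagged]


sample = [("The", "DT"), ("math-X1", "NN"), ("Holds", "VBZ")]
print(make_lower_case(sample))
```

The same guard can be placed at the top of any other filter so that math tokens pass through unchanged.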

[2018-12-14]

Added one extra filter option to just lowercase words without any tag
cleanText calls all the filter functions; comment out or uncomment the ones desired
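
The idea of toggling filters by commenting them in or out can be sketched as a simple list of functions applied in order. The filter names and bodies below are hypothetical stand-ins working on plain word lists; only the overall pattern is taken from this entry.

```python
import string


def remove_punctuation(words):
    """Strip leading/trailing punctuation and drop tokens that become empty."""
    stripped = (w.strip(string.punctuation) for w in words)
    return [w for w in stripped if w]


def make_lower_case(words):
    return [w.lower() for w in words]


def remove_short_words(words, min_len=3):
    return [w for w in words if len(w) >= min_len]


def clean_text(words):
    """Apply the enabled filters in order; comment a line out to disable it."""
    filters = [
        remove_punctuation,
        make_lower_case,
        # remove_short_words,  # disabled in this sketch
    ]
    for apply_filter in filters:
        words = apply_filter(words)
    return words


print(clean_text(["Hello,", "World!", "--", "It"]))
```

Keeping the filters in one list makes the order explicit, which matters for steps that depend on earlier ones, such as removing punctuation before other processing.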

[2018-12-07]

removePlurals() using inflection
Included all POS tags from NLTK at the end of file_manipulation.py
Simple refactoring
Fixed relative imports from the default package

[2018-12-04]

Major changes in the code to work with (word,pos_tag)
Major refactor of all functions
Concatenate words with NOUNS and ADJECTIVE tags in nltk.pos_tag()

[2018-11-29]

makeLowerCase and applyStemmer ignore words beginning with

[2018-11-28]

Several pre-processing functions implemented
NLTK tagger working
General refactorings

[2018-11-14]

cleanString implemented to get rid of specific chars

[2018-10-29]

Reading/writing files line by line
Text clean and stopwords removal
POS tagger using NLTK prototype

[2018-10-26]

Project creation
File/Folder structure implementation
Command line arguments added
