Luke's Latin Tagger and (under construction) Corpus
LICENSE.md initial commit Dec 12, 2017
README.md bug acknowledgement Dec 12, 2017
example initial commit Dec 12, 2017
latin_tag.py initial commit Dec 12, 2017
prayer bug acknowledgement Dec 12, 2017
prayer.tagged bug acknowledgement Dec 12, 2017

Corpus Latinum Lucae

This project will provide tools for building a searchable Latin corpus from the texts on theLatinLibrary.com.

Right now, I've finished a part-of-speech tagger, latin_tag.py, that uses Whitaker's Words to tag text documents.

latin_tag.py

The tagger. Feed it one or more text files as command-line arguments and it will produce a tagged equivalent of each in FILENAME.tagged.
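
To make the input/output convention concrete, here is a minimal sketch of the file handling only. It is not the code in latin_tag.py: `tag_text`, `tagged_path` and `tag_files` are hypothetical names, and `tag_text` is a placeholder that marks every word as unknown instead of looking it up in Words. The only thing taken from the repository is the naming rule, where an input called prayer yields prayer.tagged.

```python
from pathlib import Path


def tag_text(text: str) -> str:
    """Hypothetical stand-in for the real tagger: marks every word as unknown."""
    return " ".join(f"{word}/UNK" for word in text.split())


def tagged_path(path: str) -> Path:
    """Append '.tagged' to the input name, e.g. 'prayer' -> 'prayer.tagged'."""
    return Path(str(path) + ".tagged")


def tag_files(paths):
    """Tag each input file, write the result next to it, and return the output paths."""
    outputs = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        out = tagged_path(path)
        out.write_text(tag_text(text), encoding="utf-8")
        outputs.append(out)
    return outputs
```

Passing `sys.argv[1:]` to `tag_files` would hook this up to the command line, so that one call handles however many files are given.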

Dependencies:

Known bugs

  • Can't handle text containing semicolons or square brackets ([]). Will fix soon.
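
Until that is fixed, one possible workaround is to clean a text before handing it to the tagger. The sketch below is an assumption rather than part of latin_tag.py: the helper name `strip_unsupported` is made up, and it assumes that replacing these characters with spaces is acceptable for your text (any punctuation information they carried is lost).

```python
import re

# Characters the tagger is reported not to handle: semicolons and square brackets.
UNSUPPORTED = re.compile(r"[;\[\]]")


def strip_unsupported(text: str) -> str:
    """Replace each unsupported character with a space so adjacent words stay separate."""
    return UNSUPPORTED.sub(" ", text)
```

Replacing with a space rather than deleting keeps words such as `amo;amas` from being fused into one token.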

Next on the list:

  • system for generating the tagged corpus
  • way to search the corpus