CopticScriptorium/lexical-taggers

lexical taggers for Sahidic Coptic

Includes a tagger for language of origin. (The lemmatizer is now integrated into the part-of-speech tagger at https://github.com/CopticScriptorium/tagger-part-of-speech)

The _enrich.pl script is used with the lexicon file in each subdirectory (e.g., lexicon.txt in the languagetagger subdirectory for language-of-origin tagging).

Usage: _enrich.pl [optional args] <IN_FILE>

Options and arguments:

-h print this [h]elp message and quit

-l [l]exicon file (required). Defaults to lexicon.txt in same directory.

<IN_FILE> A text file with one category per line; only the text up to the first tab is used for lexicon lookup.

example: _enrich.pl -l language-tagger/lexicon.txt my_file.txt
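
The lookup rule described above (only the text before the first tab is used as the key) can be sketched in Python. This is not the Perl script itself, only an illustration under stated assumptions: the lexicon is assumed to be a tab-separated file of key and annotation, and the result is assumed to be the input line with the annotation appended as a new tab-separated column. The function names `load_lexicon` and `enrich` and the sample entries are hypothetical.

```python
def load_lexicon(lines):
    """Build a key -> annotation dict.

    Assumption: each lexicon line is "key<TAB>annotation".
    Lines without a tab are skipped.
    """
    lexicon = {}
    for line in lines:
        line = line.rstrip("\n")
        if "\t" not in line:
            continue
        key, value = line.split("\t", 1)
        lexicon[key] = value
    return lexicon


def enrich(lines, lexicon, default=""):
    """Yield each input line with the looked-up annotation appended.

    Only the text up to the first tab is used as the lookup key.
    Assumption: entries missing from the lexicon get `default`.
    """
    for line in lines:
        line = line.rstrip("\n")
        key = line.split("\t", 1)[0]
        yield f"{line}\t{lexicon.get(key, default)}"


if __name__ == "__main__":
    # Placeholder entries, not taken from the real lexicon.
    lexicon = load_lexicon(["alpha\tGreek\n"])
    for row in enrich(["alpha\tN\n", "beta\tV\n"], lexicon):
        print(row)
```

Any columns after the first tab are carried through unchanged in this sketch, so a file that already has other annotations can still be enriched.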

The language tagger now includes lexical entries provided by the Database and Dictionary of Greek Loanwords in Coptic (DDGLC). We thank the DDGLC and its director, Dr. Tonio Sebastian Richter, for this collaboration.

Perl script Copyright 2013-16 Amir Zeldes, Caroline T. Schroeder. The Perl program is free software. You may copy or redistribute the script under the same terms as Perl itself.

Additional material copyright 2013-16 Amir Zeldes, Caroline T. Schroeder, Elizabeth Davidson: this is free software distributed under the GNU General Public License v. 3 (http://www.gnu.org/licenses/gpl.html). You are welcome to distribute it under the conditions outlined in the license.
