forked from odaata/HisTEI

A framework for Oxygen XML Editor allowing researchers to transcribe historical documents in TEI

oxygenxml/HisTEI

 
 


HisTEI

A framework add-on for Oxygen XML Editor that lets researchers transcribe historical documents in TEI. More information is available at http://www.histei.info/p/home.html.

Compilation

Compile the project using IntelliJ. Before building, update the build.properties file so that it points to the correct locations of the various modules.

JDK

The preferred JDK is Oracle's. To install it on an Ubuntu-based system:

sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java7-installer

Then make sure java -version outputs something along the lines of:

java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)
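
The version check above can also be scripted. The following is a minimal sketch, not part of HisTEI: the function names parse_java_version, is_java7 and installed_java_version are hypothetical, and the sample text is the output quoted above. It assumes the legacy "1.7.0_55" version format shown there.

```python
import re
import subprocess

# Sample text copied from the expected output shown above.
SAMPLE_OUTPUT = '''java version "1.7.0_55"
Java(TM) SE Runtime Environment (build 1.7.0_55-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.55-b03, mixed mode)'''


def parse_java_version(version_output):
    """Return the quoted version string from `java -version` output."""
    match = re.search(r'version "([^"]+)"', version_output)
    if match is None:
        raise ValueError("no version string found in output")
    return match.group(1)


def is_java7(version_output):
    """Return True if the output reports a 1.7.x (Java 7) runtime."""
    return parse_java_version(version_output).startswith("1.7.")


def installed_java_version():
    """Run `java -version` and parse it (hypothetical helper).

    `java -version` writes its report to stderr, not stdout.
    """
    result = subprocess.run(
        ["java", "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_java_version(result.stderr)
```

Calling is_java7(SAMPLE_OUTPUT) checks the quoted output without needing Java installed; installed_java_version() checks the local machine.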

Python scripts

The python folder contains two Python scripts:

  • extractglosses.py allows you to extract all elements from your HisTEI-XML file (requires lxml).
  • xmltokenize.py allows you to train a sentence tokenizer (uses the NLTK platform).
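
To illustrate the kind of extraction the first script performs, here is a minimal sketch. It is not the code of extractglosses.py, which is not reproduced here: it uses the standard library's xml.etree.ElementTree as a stand-in for lxml, the function name extract_element_texts is hypothetical, and the sample document and the choice of the TEI gloss element are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace; elements in a TEI document are qualified with it.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample document, invented for this sketch.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p>The <term>huys</term> <gloss>house</gloss> stood by the
       <term>kerck</term> <gloss>church</gloss>.</p>
  </body></text>
</TEI>"""


def extract_element_texts(xml_text, local_name):
    """Return the text content of every TEI element named local_name."""
    root = ET.fromstring(xml_text)
    tag = "{%s}%s" % (TEI_NS, local_name)
    # itertext() joins the element's own text with that of its descendants,
    # but excludes the tail text that follows the element.
    return ["".join(el.itertext()).strip() for el in root.iter(tag)]
```

With the sample above, extract_element_texts(SAMPLE, "gloss") collects the two gloss strings in document order. The same approach carries over to lxml, whose etree module offers a compatible fromstring and iter.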

