Software for preprocessing textual data in multiple languages for textual analysis.

README.md

txtorg

txtorg is a Python-based utility that uses Apache Lucene for text preprocessing and management. It exports processed text in a variety of formats for use in a wide range of analytical software, including (but not limited to) the structural topic model. It scales to large corpora and has a graphical user interface that anyone can use. Because it builds on Lucene, txtorg can support a wide range of languages.
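
To give a rough sense of what "preprocessing text for analysis" means, here is a minimal sketch in plain Python. It is not txtorg's code or API and does not use Lucene: the function names (`tokenize`, `preprocess`, `term_document_matrix`) and the tiny stopword list are assumptions made up for illustration. It lowercases each document, splits it into tokens, drops stopwords, and builds the kind of term-document count matrix that many analysis tools accept as input.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is"}


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    In Python 3, \\w matches Unicode word characters, so this also
    handles many non-English scripts in a crude way.
    """
    return re.findall(r"\w+", text.lower())


def preprocess(text, stopwords=STOPWORDS):
    """Return the tokens of `text` with stopwords removed."""
    return [tok for tok in tokenize(text) if tok not in stopwords]


def term_document_matrix(docs):
    """Build a sorted vocabulary and a list of per-document count rows."""
    counts = [Counter(preprocess(doc)) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix
```

Calling `term_document_matrix(["The cat sat.", "A cat and a dog."])` returns the vocabulary `['cat', 'dog', 'sat']` and the rows `[[1, 0, 1], [1, 1, 0]]`. A real pipeline would typically add language-specific steps such as stemming, which is the kind of work Lucene's analyzers are designed for.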

For more information, including installation instructions, see http://txtorg.org/.