
condor-ir

Access the docs here: home.condor-ir.co

Access the roadmap here: roadmap.

condor-ir is a program for working with examples of Latent Semantic Analysis (LSA) search engines. It accepts FROAC XML documents as input, as well as plain text records from the ISI Web of Knowledge.

You can find more information about FROAC repositories at http://froac.manizales.unal.edu.co/froac/ and about ISI Web of Knowledge text files at the Thomson Reuters website.
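As background on the technique: LSA starts from a term document matrix, which counts how often each term occurs in each document, and then reduces that matrix with a truncated singular value decomposition. The sketch below covers only the first step, using the standard library. The whitespace tokenizer and the function name `term_document_matrix` are assumptions made for illustration and are not taken from condor-ir's code.

```python
from collections import Counter


def term_document_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term i in document j."""
    # Naive tokenization: lowercase and split on whitespace. This is an
    # illustrative assumption, not condor-ir's actual preprocessing.
    counts = [Counter(doc.lower().split()) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    # One row per term, one column per document; missing terms count as 0.
    matrix = [[c[term] for c in counts] for term in vocabulary]
    return vocabulary, matrix


docs = [
    "semantic search engine",
    "latent semantic analysis",
    "search engine index",
]
vocabulary, matrix = term_document_matrix(docs)
for term, row in zip(vocabulary, matrix):
    print(f"{term:10s} {row}")
```

Each row is a term and each column a document, so the row for "semantic" shows that it occurs in the first two documents and not in the third.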

Installing the condor-ir package

Download and install the program from its PyPI repository:

pip install -U condor-ir

The -U flag upgrades the package to the latest version, which is recommended for an unstable package.

Language support requires the enchant engine as well as some dictionaries, which you can install using your package manager or an external tool:

# Arch
sudo pacman -S enchant \
               aspell-es aspell-en aspell-fr \
               aspell-it aspell-pt aspell-de
# Ubuntu
sudo apt-get install enchant \
                     aspell-es aspell-en aspell-fr \
                     aspell-it aspell-pt aspell-de

Furthermore, stemming and stop word removal require part of the nltk data package:

python -m nltk.downloader snowball_data stopwords
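To illustrate what stop word removal and stemming do to a piece of text, here is a minimal sketch. The stop word set and the suffix-stripping stemmer are toy stand-ins, labeled hypothetical, so that the example runs without nltk. The function name `preprocess` is also an assumption and does not describe how condor-ir itself is implemented.

```python
def preprocess(text, stop_words, stem):
    """Lowercase, drop stop words, and stem the remaining tokens."""
    tokens = text.lower().split()
    return [stem(token) for token in tokens if token not in stop_words]


# Hypothetical stand-ins so the sketch runs without nltk installed.
TOY_STOP_WORDS = {"the", "of", "a", "and", "in"}


def toy_stem(token):
    """Strip a few common English suffixes; far cruder than Snowball."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


print(preprocess("The indexing of documents", TOY_STOP_WORDS, toy_stem))
```

With nltk and the data above installed, the stand-ins could be swapped for `set(stopwords.words("english"))` from `nltk.corpus` and `SnowballStemmer("english").stem` from `nltk.stem`.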

Finally, to prepare the database, or to reset it in preparation for a new version of condor-ir, run the database preparation script:

condor utils preparedb

CLI Interface

After installing the program you will have three basic commands at your disposal, for handling bibliography sets, term document matrices and engines. The CLI gives you most CRUD operations in a hierarchical manner.

condor triggers the main program; you can get top-level help by running condor --help.

condor bibliography namespaces the commands related to bibliography sets; you can list them and get help about them using condor bibliography --help.

condor model is a shortcut that offers the condor model create subcommand, which creates both a term document matrix and an LSA search engine; get help on models using condor model --help.

condor query <string...> is a non-CRUD command that searches a bibliography set using a previously created search engine. The search engine can be targeted; find out how using condor query --help.

Feel free to check detailed descriptions of these commands using their --help flag.
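As a quick way to confirm the installation and browse the help pages in one go, the sketch below runs each of the --help invocations mentioned above. It assumes the condor executable is on your PATH. The helper name `show_help` is hypothetical, and the script only wraps commands already described in this README.

```python
import shutil
import subprocess

# Help invocations taken from the command descriptions in this README.
HELP_COMMANDS = [
    ["condor", "--help"],
    ["condor", "bibliography", "--help"],
    ["condor", "model", "--help"],
    ["condor", "query", "--help"],
]


def show_help(commands=HELP_COMMANDS):
    """Run each help command; return False if condor is not on PATH."""
    if shutil.which("condor") is None:
        print("condor not found on PATH; is condor-ir installed?")
        return False
    for command in commands:
        print("$", " ".join(command))
        # check=False: keep going even if one command exits non-zero.
        subprocess.run(command, check=False)
    return True


if __name__ == "__main__":
    show_help()
```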