This project is composed of four components:
- Hephaestus: Workers and background tasks
- Aphrodite: Web application
- Chaos: Core models and libs
- Aeolus: JS app
- Ruby 2.1.2
- Bundler (
- Nokogiri dependencies
- MongoDB server
- Redis server
- FreeLing 3.1
- Poppler 0.20+
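Before installing anything, it can help to see which of the tools above are already present. The following is a minimal sketch, not part of the project: `missing_commands` is a hypothetical helper, and the binary names in the usage comment are assumptions that may differ on your distribution.

```shell
# Report which of the given commands are missing from PATH.
# Prints one name per line and returns non-zero if anything is missing.
missing_commands() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            missing=1
        fi
    done
    return $missing
}

# Usage (binary names are assumptions, adjust as needed):
# missing_commands ruby bundle mongod redis-server pdftotext
```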
$ \curl -sSL https://get.rvm.io | bash -s stable --ruby
$ source ~/.bashrc
$ rvm install ruby-2.1.2
$ rvm use 2.1.2 --default
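To confirm that the shell picked up the expected Ruby after the RVM steps, a small check along these lines may be useful. It is only a sketch: `ruby_version_matches` is a hypothetical helper that does a simple prefix match on the line printed by `ruby -v`.

```shell
# Succeeds when a `ruby -v` style line reports the expected version.
# $1: version line, e.g. "ruby 2.1.2p95 (...)"
# $2: expected version, e.g. "2.1.2"
ruby_version_matches() {
    case "$1" in
        "ruby $2"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage (requires ruby on PATH):
# ruby_version_matches "$(ruby -v)" 2.1.2 || echo "unexpected Ruby version" >&2
```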
# apt-get install libxslt-dev libxml2-dev
MongoDB and Redis servers
On Debian / Ubuntu machines, install from the package manager:
# apt-get install mongodb mongodb-server redis-server
You need Java 6 (or newer) to run ElasticSearch. If on Debian / Ubuntu, you can install OpenJDK JRE from the package manager:
# apt-get install openjdk-7-jre
Then, download and install the .deb package:
$ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.3.2.deb
# dpkg -i elasticsearch-1.3.2.deb
Keep in mind that ElasticSearch produces a large amount of logs, so it's a good idea to set up logrotate for it. ElasticSearch also needs to keep many files open simultaneously, so you'll probably need to raise the open-file limit for the user that runs it:
$ ulimit -n 32000
For more information about this issue, please read.
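To check whether the open-file limit is already high enough for the current shell, something like the following could be used. This is a sketch with a hypothetical helper name, `nofile_at_least`. It only inspects the soft limit of the shell it runs in, not the limit of a running ElasticSearch process.

```shell
# Succeeds when the soft open-file limit of this shell is at least $1.
nofile_at_least() {
    current=$(ulimit -n)
    # "unlimited" always satisfies the requirement.
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$1" ]
}

# Usage:
# nofile_at_least 32000 || echo "open-file limit is below 32000" >&2
```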
There are alternative downloads here.
$ curl https://raw.github.com/creationix/nvm/v0.3.0/install.sh | sh
$ source ~/.bashrc
$ nvm install 0.10
$ nvm alias default 0.10
Install Docsplit dependencies
# apt-get install -y graphicsmagick poppler-utils poppler-data ghostscript pdftk libreoffice
Detailed dependencies listed in Docsplit documentation.
Download the tarball of Poppler 0.20.1 and extract it somewhere. Run apt-get build-dep poppler-utils to make sure you have all of its dependencies. Then, just execute make install as usual.
The NER module currently uses FreeLing, an open source suite of language analyzers written in C++.
Compile and Install
For compiling the source, you need a build toolchain plus the Boost and libicu libraries. On Debian / Ubuntu machines, you can run:
# apt-get install build-essential libboost-dev libboost-filesystem-dev \
    libboost-program-options-dev libboost-regex-dev \
    libicu-dev
Then, just execute make install as usual.
Clone the repository
# apt-get install git
$ git clone firstname.lastname@example.org:hhba/mapa76.git
Run bundle install in each Ruby application to install all gem dependencies.
$ cd mapa76
$ cd aphrodite
$ bundle install
$ cd ../hephaestus
$ bundle install
$ cd ../aeolus
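The per-application steps above can also be wrapped in a small loop. The sketch below assumes the checkout layout described in this README (aphrodite and hephaestus as sibling directories); `for_each_app` is a hypothetical helper, not something shipped with the project.

```shell
# Run a command inside each Ruby application directory of a checkout.
# $1: path to the mapa76 checkout; remaining arguments: the command to run.
# Stops at the first directory where the command fails.
for_each_app() {
    root=$1
    shift
    for app in aphrodite hephaestus; do
        (cd "$root/$app" && "$@") || return 1
    done
}

# Usage:
# for_each_app mapa76 bundle install
```

Running each command in a subshell keeps the caller's working directory unchanged.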
Setup your config files
Both aphrodite and hephaestus are Ruby applications, and each has its own configuration files. They live in ./config and have .yml extensions. Adjust them to your workstation's needs. For easy setup, just rename
If the servers will be running on the same machine as Mapa76, you don't need to change anything.
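When everything runs on one machine, a quick way to confirm that the servers are actually listening is to probe their ports. The sketch below assumes bash (it relies on bash's /dev/tcp feature) and the upstream default ports for MongoDB, Redis and ElasticSearch. `port_open` is a hypothetical helper, and your configuration may use different ports.

```shell
# Requires bash: /dev/tcp is a bash feature, not POSIX sh.
# Succeeds when a TCP connection to host $1, port $2 can be opened.
port_open() {
    (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Usage, with the default ports (MongoDB 27017, Redis 6379, ElasticSearch 9200):
# for port in 27017 6379 9200; do
#     port_open 127.0.0.1 "$port" || echo "nothing listening on $port" >&2
# done
```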
Install Aeolus dependencies
Just follow the instructions here.
Up and Running
You will need to run the aeolus file watcher:
$ cd aeolus
$ grunt w
Fire up the Rails app:
$ cd aphrodite
$ rails s
To start workers for document processing, you need to run at least one Resque worker:
$ cd hephaestus
$ QUEUE=* bundle exec rake resque:work
You can run multiple workers with the resque:workers task:
$ COUNT=2 QUEUE=* bundle exec rake resque:workers
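The choice between the two rake tasks can be captured in a small helper. This is only a sketch: `resque_command` is a hypothetical function that prints the command line for a given worker count, using the two invocations shown above.

```shell
# Print the Resque command line for the requested number of workers.
# $1: worker count (defaults to 1).
resque_command() {
    count=${1:-1}
    if [ "$count" -le 1 ]; then
        echo 'QUEUE=* bundle exec rake resque:work'
    else
        echo "COUNT=$count QUEUE=* bundle exec rake resque:workers"
    fi
}

# Usage, from the hephaestus directory:
# eval "$(resque_command 2)"
```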
You also need to run FreeLing as a server. The start-freeling.sh file only works on OS X, but it shouldn't be hard to make it work on Ubuntu:
$ cd hephaestus
$ sh ./start-freeling.sh
- Split workers from web app.
- Upgrade to Rails 4.