Merge pull request #5 from johnb30/updates
Updates
johnb30 committed Nov 4, 2014
2 parents 8438a98 + 7567ba5 commit 86404c6
Showing 3 changed files with 29 additions and 5 deletions.
16 changes: 16 additions & 0 deletions README.md
@@ -62,6 +62,8 @@ For those familiar with `git`, a `git clone` should work fine. For those
unfamiliar with `git`, it is possible to download the repository as a zip file
as shown in the picture below.

*Note: We've tested this setup on Vagrant 1.6.5*

[![Github][git]][git]

Once this file is downloaded and unzipped, you should use the command line to cd into the
@@ -93,12 +95,26 @@ explored in the [Vagrant documentation](https://docs.vagrantup.com/v2/getting-st
Due to the way Vagrant sets up the virtual machine, it is necessary to prepend nearly
every command with `sudo`.

The filepaths in the config file for the `stanford_pipeline` need to be changed
to use absolute paths. For example:

```
cd ~/stanford_pipeline
sudo vim default_config.ini
```

Once in the config, change the `~/` characters to `/home/vagrant/`.
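
The same substitution can be scripted rather than made by hand in an editor. The following is only a minimal sketch, assuming that every `~/` in the file refers to the `vagrant` user's home directory. `absolutize_paths` is a hypothetical helper name and is not part of this repository.

```shell
# Hypothetical helper: rewrite every "~/" in a config file to an absolute
# home directory (default: /home/vagrant, the path named above).
# The original file is kept next to it as <file>.bak.
absolutize_paths() {
    config_file="$1"
    home_dir="${2:-/home/vagrant}"
    # "|" is used as the sed delimiter because the replacement contains "/".
    sed -i.bak "s|~/|${home_dir}/|g" "$config_file"
}
```

For example, `absolutize_paths ~/stanford_pipeline/default_config.ini` edits the file in place. If the file is not writable by the current user, the `sed` line would need to be run with `sudo`.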

The `bootstrap.sh` script is specifically configured for use with the Vagrant
box, but with slight modifications can be used on any Linux box (it's what we
use to bootstrap our machines). This means that the script can serve as the
basis for setting up a high-performance computer running EL:DIABLO, an
individual's laptop, etc.

Currently the virtual machine takes up 4GB of RAM. Less than this doesn't
really work since the shift-reduce parser needs a fair amount of memory to
operate.
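
To confirm from inside the virtual machine that the memory setting took effect, something like the sketch below could be used. `has_enough_memory` is a hypothetical helper, and the threshold passed to it is up to the caller: the kernel typically reports somewhat less than the amount assigned to the machine, so a threshold a little below 4096 MB is likely more useful than 4096 itself.

```shell
# Hypothetical helper: succeed if total memory is at least $1 MB.
# Reads a /proc/meminfo-style file (second argument, default /proc/meminfo).
has_enough_memory() {
    min_mb="$1"
    meminfo="${2:-/proc/meminfo}"
    # MemTotal is reported in kB; convert to MB before comparing.
    total_kb=$(awk '/^MemTotal:/ {print $2}' "$meminfo")
    [ $((total_kb / 1024)) -ge "$min_mb" ]
}
```

A call such as `has_enough_memory 3900 || echo "VM has less memory than expected"` uses an illustrative threshold, not a tested requirement of the parser.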

For the two Github repositories, `scraper` and `phoenix_pipeline`, each time
`vagrant up` is run the most recent version of the code is pulled from Github.
If you have a long-running virtual machine and wish to obtain the latest code,
7 changes: 5 additions & 2 deletions Vagrantfile
@@ -5,7 +5,10 @@
VAGRANTFILE_API_VERSION = "2"

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
-config.vm.box = "precise64"
-config.vm.box_url = "http://files.vagrantup.com/precise64.box"
+config.vm.provider "virtualbox" do |vb|
+vb.customize ["modifyvm", :id, "--memory", "4096"]
+end
+config.vm.box = "ubuntu/trusty64"
+config.vm.box_url = "https://vagrantcloud.com/ubuntu/trusty64"
config.vm.provision :shell, :path => "bootstrap.sh"
end
11 changes: 8 additions & 3 deletions bootstrap.sh
@@ -3,9 +3,11 @@
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' |
tee /etc/apt/sources.list.d/mongodb.list

sudo apt-get update

echo "Installing base packages..."
sudo apt-get install zlib1g-dev
sudo apt-get install git <<-EOF
yes
EOF
@@ -45,9 +47,12 @@ sudo pip install git+https://github.com/openeventdata/petrarch.git
cd

echo "Downloading CoreNLP..."
-sudo wget http://www-nlp.stanford.edu/software/stanford-corenlp-full-2013-06-20.zip
-sudo unzip stanford-corenlp-full-2013-06-20.zip
-mv stanford-corenlp-full-2013-06-20 /home/vagrant/stanford-corenlp
+sudo wget http://nlp.stanford.edu/software/stanford-corenlp-full-2014-06-16.zip
+sudo unzip stanford-corenlp-full-2014-06-16.zip
+mv stanford-corenlp-full-2014-06-16 /home/vagrant/stanford-corenlp
+cd /home/vagrant/stanford-corenlp
+echo "Downloading shift-reduce parser..."
+sudo wget http://nlp.stanford.edu/software/stanford-srparser-2014-07-01-models.jar

echo "Downloading NLTK data..."
sudo mkdir -p nltk_data/tokenizers
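
The CoreNLP archive and the shift-reduce parser models above are fetched unconditionally, so re-running the script would presumably download them again. One possible way to avoid that is sketched below. `fetch_once` is a hypothetical helper that is not part of this commit, and the `.part` suffix is an arbitrary choice for the temporary file name.

```shell
# Hypothetical helper: download $1 to $2 only if $2 does not already exist.
fetch_once() {
    url="$1"
    dest="$2"
    if [ -f "$dest" ]; then
        echo "Skipping download, $dest already exists"
        return 0
    fi
    # Download to a temporary name first so that an interrupted transfer
    # is not mistaken for a complete file on the next run.
    wget -O "$dest.part" "$url" && mv "$dest.part" "$dest"
}
```

In the script this could replace a plain `wget` call, for example `fetch_once http://nlp.stanford.edu/software/stanford-srparser-2014-07-01-models.jar stanford-srparser-2014-07-01-models.jar`, prefixed with `sudo` handling as the surrounding script requires.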