Command Line Usage
Gistable provides a Node.js command-line interface that can be installed globally using npm and git.
npm i -g git+https://github.com/gistable/gistable.git
If Gistable has already been cloned, it can also be linked with
Display help with
Running Inference Harness
Gistable includes the Vagrant, Terraform, and Ansible configuration required to replicate gist inference. The list
of scraped gists is stored in the YAML file
The provided Terraform configuration describes the resources needed to orchestrate the inference harness on
DigitalOcean. A DigitalOcean authentication token is required. A public/private key pair named
playbooks/keys/do-test-harness(.pub) must exist; it is used to create the DigitalOcean SSH key. All IP addresses
are displayed as output after orchestration. To orchestrate with Terraform, run
terraform apply. On the first run,
it is also necessary to run
The provided Vagrant configuration orchestrates an inference harness locally using VirtualBox. To orchestrate using
Vagrant, run
vagrant up. The machines have the following local IP addresses.
Orchestration requires Ansible to be present on the local system and configures the following machines:
|Control||A control machine with Ansible installed, used to manage Nomad machines.|
|Nomad1||Nomad server used to schedule and manage jobs.|
Once orchestration is complete, SSH into the control machine and
cd ~/harness. Then bootstrap the Nomad
cluster, install Nomad, and schedule jobs using the following commands.
ansible-playbook -i inventory playbooks/bootstrap-cluster.yml
ansible-playbook -i inventory playbooks/install-nomad.yml
ansible-playbook -i inventory playbooks/run-gist-jobs.yml
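The three playbooks have to run in the order given, and a later step is unlikely to be useful if an earlier one failed. As a minimal sketch (not part of Gistable), the sequence could be wrapped in a small Python 3 helper that stops at the first failure. The names `build_command` and `run_all` are hypothetical; the playbook paths and the `inventory` argument are copied from the commands above.

```python
import subprocess

# Playbook paths copied from the commands above; the order matters.
PLAYBOOKS = [
    "playbooks/bootstrap-cluster.yml",
    "playbooks/install-nomad.yml",
    "playbooks/run-gist-jobs.yml",
]


def build_command(playbook, inventory="inventory"):
    """Return the ansible-playbook argument list for one playbook."""
    return ["ansible-playbook", "-i", inventory, playbook]


def run_all(playbooks=PLAYBOOKS, runner=subprocess.check_call):
    """Run each playbook in order.

    subprocess.check_call raises CalledProcessError on a non-zero exit
    status, so a failed playbook stops the sequence. `runner` is a
    hypothetical hook that allows a dry run without Ansible installed.
    """
    for playbook in playbooks:
        runner(build_command(playbook))
```

Calling `run_all()` from `~/harness` on the control machine would run the real playbooks; `run_all(runner=print)` only prints the commands.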
Scheduling and running jobs takes approximately 8-10 hours. Once the jobs are complete, gists and results can be scraped from the inference harness using the provided script.
python2 scripts/scrape_consul_gists.py http://<nomad1 ip addr>:8500/
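Before starting a scrape, it can help to confirm that the endpoint is reachable. The sketch below is not part of Gistable and rests on an assumption: that port 8500 on the Nomad server exposes the standard Consul HTTP API, as the script name suggests. It queries Consul's `/v1/status/leader` endpoint. The helper names `leader_url` and `consul_is_up` are hypothetical, and the code is Python 3 even though the scrape script itself is invoked with `python2`.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def leader_url(base_url):
    """Build the URL of Consul's status/leader endpoint from a base URL."""
    return urljoin(base_url.rstrip("/") + "/", "v1/status/leader")


def consul_is_up(base_url, timeout=5):
    """Return True if the endpoint answers with HTTP 200.

    Assumes base_url points at a Consul HTTP API. Network errors and HTTP
    error statuses (URLError, HTTPError, timeouts) are subclasses of
    OSError and are reported as False.
    """
    try:
        with urlopen(leader_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        return False
```

For example, `consul_is_up("http://<nomad1 ip addr>:8500/")` with the address filled in would return `False` if the server cannot be reached.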
scrape_consul_gists.py generates three outputs. The first is a
gists/ directory containing all gists that either
succeeded on the first trial run or failed due to an
ImportError on the first run but not the second. All of
these gists have an accompanying Dockerfile for building and running them.
gists-other/ contains all gists which
failed for any other reason.
gists.csv is a CSV file containing gist metadata and inference execution results.
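The column layout of gists.csv is not described here, so the following Python 3 sketch makes no assumption about it: it reads the header from the file itself and, optionally, counts the values in one column chosen by the caller. The function name `summarize` is hypothetical and not part of Gistable.

```python
import csv
from collections import Counter


def summarize(csv_path, column=None):
    """Return (header, row_count, value_counts) for a CSV file.

    value_counts is a Counter over the values of `column` when a column
    name is given; otherwise it is empty. The header is read from the
    first line of the file, so no column names are assumed.
    """
    counts = Counter()
    rows = 0
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        for row in reader:
            rows += 1
            if column is not None:
                counts[row[column]] += 1
        header = reader.fieldnames
    return header, rows, counts
```

Calling `summarize("gists.csv")` first shows which columns exist; passing one of those names as `column` then gives a quick tally of its values.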
Finally, as a convenience, the Ansible playbook
playbooks/purge-nomad-jobs.yml is provided to purge Nomad jobs
if inference is to be rerun.