Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP
JavaScript CSS Python CoffeeScript Shell

Fetching latest commit…

Cannot retrieve the latest commit at this time

Failed to load latest commit information.
client
documents @ c7e29a9
finalips @ 093d1ac
listings @ 626c8d9
listings-2 @ 24a9f54
logs @ cfa2fad
migrations
private @ 806edbf
reader
server
.gitmodules
deploy
desk.crontab
readme.md
restore_from_log.py
serve
server.crontab

readme.md

Scott's goal is to automate the tedious parts of Scott's job so that he has time to do the important parts.

How to

Clone with submodules

git clone git@github.com:tlevine/scott.git --recursive

Install dependencies

# For the server
(
  cd server
  npm install
)

# For the reader
(
  cd reader
  sudo pip2 install -r requirements.txt
)

# In Arch
sudo pacman -S python2-lxml python2-cssselect \
  tesseract-data-eng tesseract coffee-script \
  poppler imagemagick git

# In Ubuntu 12.04
echo Install node 0.8 from source. Then,
sudo apt-get install python-lxml tesseract-ocr tesseract-ocr-eng \
  coffeescript git poppler-utils imagemagick

server contains a restify web server, and client contains a Backbone application. The server also serves the client files. Serve the site like so

./serve

During development, it expects a database to be in /tmp/wetlands.db. This often ram, so you might need to put one there.

Useful stuff

Network of servers

This is run across two computers.

  1. One (server) serves the node application, the downloader/reader, the database and the primary documents
  2. A second (desk) coordinates backups; it periodically pulls the documents and listings submodules and the logs directory and then pushes those to external git repository hosting services.

Downloading old documents

You can see all the previous permit applications from Scott's manual spreadsheet parsings here (except for a couple that went missing).

If you want to integrate them with your csv file, switch everything in the respective urls before the final (sixth) slash for "https://raw.github.com/tlevine/mvn-www2-backup/master/documents"

For example, http://www2.mvn.usace.army.mil/ops/regulatory/pdf/document2011-09-09-103911.pdf becomes https://raw.github.com/tlevine/mvn-www2-backup/master/documents/document2011-09-09-103911.pdf

You can also download them all at once here

Something went wrong with that request. Please try again.