This repository has been archived by the owner on Aug 5, 2020. It is now read-only.

tinta/ALLegalsScraper

Alabama Legals Scraper

Setup

  1. Install the Electron system requirements (needed by nightmare.js)

     apt-get install -y libgtk2.0-0 libgconf-2-4 \
         libasound2 libxtst6 libxss1 libnss3 xvfb

  2. Run npm install

  3. Set up the database

     $ bin/setup_db.sh
     $ mysql -u scraper alabamalegals -p < bin/create_tables.sql

  4. Set environment variables

     export AL_GOOGLE_CLIENT_SECRET=supersecretclientsecret
     export NODE_ENV=dev
     export SCRAPERPASS=thedbpassword

  5. Start the app with npm start
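
A missing environment variable tends to surface later as a confusing database or login error, so it can help to check for all three before starting the app. The following is a minimal sketch, not part of this repository: the function name `missingEnvVars` is hypothetical, and the only names taken from the steps above are the three variables.

```javascript
// Variable names come from the setup steps; everything else here is illustrative.
const REQUIRED_VARS = ['AL_GOOGLE_CLIENT_SECRET', 'NODE_ENV', 'SCRAPERPASS'];

// Return the names of required variables that are unset or empty in `env`.
function missingEnvVars(env) {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Report anything missing from the current shell environment.
const missing = missingEnvVars(process.env);
if (missing.length > 0) {
  console.error('Missing environment variables: ' + missing.join(', '));
}
```

Saved under any file name and run with node before npm start, it prints the variables that still need to be exported and prints nothing when all three are set.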

Usage

Serve the data interface on port 3000:

npm run start

Run the scraper (takes a couple of minutes):

npm run scraper

Production

forever start app/index.js

Debugging

Electron and nightmare have a lot of dependencies and are sometimes buggy. To debug a broken installation, set the DEBUG environment variable so nightmare logs what it is doing.

DEBUG=nightmare* node scraper/index.js

If node is throwing unexpected errors, check which version is running and switch to the one the project expects.

node --version
nvm use
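
The same check can be scripted. This is a rough sketch that assumes the expected version is pinned in a `.nvmrc` file in the current directory (which is where `nvm use` looks when called without arguments). The helper names `majorVersion` and `matchesPinned` are hypothetical, and only the major version is compared.

```javascript
const fs = require('fs');

// Extract the major version from strings like "v12.18.3" or "12".
// Returns null for non-numeric pins such as "lts/*".
function majorVersion(version) {
  const match = /^v?(\d+)/.exec(String(version).trim());
  return match ? Number(match[1]) : null;
}

// True when the running version has the same major version as the pinned one.
function matchesPinned(running, pinned) {
  const want = majorVersion(pinned);
  return want !== null && majorVersion(running) === want;
}

// Assumption: the pinned version lives in .nvmrc in the current directory.
if (fs.existsSync('.nvmrc')) {
  const pinned = fs.readFileSync('.nvmrc', 'utf8').trim();
  // Skip aliases like "lts/*" that this sketch cannot interpret.
  if (majorVersion(pinned) !== null && !matchesPinned(process.version, pinned)) {
    console.warn('Running ' + process.version + ', but .nvmrc pins ' + pinned);
  }
}
```

It prints a warning only when the major versions differ, and stays silent when there is no `.nvmrc` or the pin is an alias.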

About

This is basically a project for my mom, and likely not useful to anyone else on the internet, except as a case study in writing frontend scrapers for shitty websites.