Scrape Bandon river water level from Bandon FEWS site and save to Google Fusion Tables using Node.js
.gitignore
README.md
bandonfews.conf
config-example.json
main.js
manualinsert.js
package.json
server.js
tokens-example.json

Bandon FEWS Small Data Scraper in Node.js

Introduction

Cork County Council has a site called Bandon FEWS (Bandon Flood Early Warning System). When the Bandon river hits certain levels near Bandon town, it alerts registered users via SMS in case they need to take emergency measures. It's a very useful service. However, the historical river level data is not available in any useful form, and making it available is the point of this project.

In November 2011, I created a simple Python script which scrapes the site every 15 minutes and saves the river level to a Google Fusion Tables "spreadsheet". This now holds (with a few interruptions) a lot of data which anyone can query, re-use, slice and dice, or mash up with weather info. Not that anyone has done this :-)

I rewrote it in Node.js in 2014.
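
The scrape-and-save loop described above could be sketched roughly as follows. This is not the code in main.js: the function names, the column names (Timestamp, Level) and the ISO timestamp format are all assumptions for illustration, and the statement is modelled on the SQL-style INSERT that the Fusion Tables query API accepted.

```javascript
// Poll interval: every 15 minutes, as described above.
const SCRAPE_INTERVAL_MS = 15 * 60 * 1000;

// Builds an SQL-style INSERT statement for one reading.
// Column names and the timestamp format are hypothetical;
// adjust them to match the columns of your own table.
function buildInsertSql(tableId, timestamp, level) {
  if (!/^[A-Za-z0-9_-]+$/.test(tableId)) {
    throw new Error('Unexpected table id: ' + tableId);
  }
  if (!(timestamp instanceof Date) || Number.isNaN(timestamp.getTime())) {
    throw new Error('timestamp must be a valid Date');
  }
  if (!Number.isFinite(level)) {
    throw new Error('level must be a finite number');
  }
  return (
    'INSERT INTO ' + tableId + ' (Timestamp, Level) VALUES (' +
    "'" + timestamp.toISOString() + "', " + level + ')'
  );
}

// Runs scrapeOnce immediately, then every 15 minutes.
// Returns a function that stops the loop.
function startScraping(scrapeOnce) {
  scrapeOnce();
  const timer = setInterval(scrapeOnce, SCRAPE_INTERVAL_MS);
  return () => clearInterval(timer);
}
```

In practice a process manager such as Supervisor keeps the process alive, so the loop only needs to be started once.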

To use it for something else on Fusion Tables, the main thing you need to do is set up an app in the Google API Console, enable Fusion Tables API access, and use those keys in place of the placeholders in the config files.

Files

  • main.js - The main code
  • bandonfews.conf - Supervisor config file. Copy it to /etc/supervisor/conf.d/bandonfews.conf, then run sudo supervisorctl reread and sudo supervisorctl update
  • tokens-example.json - Rename to tokens.json. It will be auto-generated by the code on first run in any case
  • config-example.json - Rename to config.json and fill out the Google OAuth API key/secret, the URL of the FEWS site and the id of the Fusion Table
  • index.html - Placeholder for upcoming landing page
  • server.js - Placeholder for upcoming API server
  • package.json - Required Node.js packages. Run npm install to install everything
  • .gitignore - Stuff that shouldn't go under Git control
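
As a rough illustration of the config.json step in the list above, a loader might check that every placeholder has been filled in before the scraper starts. The key names used here (clientId, clientSecret, fewsUrl, tableId) are hypothetical; use whatever keys config-example.json actually defines.

```javascript
const fs = require('fs');

// Hypothetical key names; the real config-example.json may differ.
const REQUIRED_KEYS = ['clientId', 'clientSecret', 'fewsUrl', 'tableId'];

// Throws if any required key is missing or empty, otherwise returns the config.
function validateConfig(config) {
  const missing = REQUIRED_KEYS.filter(
    (key) => typeof config[key] !== 'string' || config[key].trim() === ''
  );
  if (missing.length > 0) {
    throw new Error('config.json is missing: ' + missing.join(', '));
  }
  return config;
}

// Reads and validates a JSON config file from disk.
function loadConfig(path) {
  return validateConfig(JSON.parse(fs.readFileSync(path, 'utf8')));
}
```

Calling loadConfig('config.json') at startup would then fail fast with a readable message instead of a confusing OAuth error later on.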

TO-DO

  • The first-time run is currently broken for the OAuth credentials.
  • Token refresh is not handled yet but has never been required.
  • It should have a simple Express-based API so others can avoid the pain of dealing with the Fusion Tables API for basic queries
  • It should have a simple landing page showing the latest water level and maybe a graph
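
For the token refresh item above, one possible starting point is a check that decides whether the stored tokens are close to expiring. This is only a sketch: it assumes tokens.json stores an expiry_date field as a millisecond timestamp, which may not match what the code actually writes, and the function name is made up for illustration.

```javascript
// Returns true when the stored tokens should be refreshed before use.
// Assumption: tokens.expiry_date is a Unix timestamp in milliseconds.
// Tokens without a usable expiry are treated as needing a refresh.
function needsRefresh(tokens, nowMs = Date.now(), marginMs = 60 * 1000) {
  if (!tokens || typeof tokens.expiry_date !== 'number') {
    return true;
  }
  // Refresh slightly early so a request never goes out with a token
  // that expires while the request is in flight.
  return tokens.expiry_date - nowMs <= marginMs;
}
```

The scraper could call this before each save and only run the refresh flow when it returns true.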

Changelog

  • 04/01/2015 - First public usable version and blogpost
  • 05/01/2015 - Add error handling for parsing issues in case FEWS site changes and breaks the scraping
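
The parsing error handling mentioned in the last entry could look something like the sketch below. The page format is an assumption: it pretends the level appears as text like "Current level: 1.23 m", which is not taken from the real FEWS site, and the function names are hypothetical.

```javascript
// Assumed page format, for illustration only: "Current level: 1.23 m".
const LEVEL_PATTERN = /Current level:\s*(-?\d+(?:\.\d+)?)\s*m\b/i;

// Extracts the river level from the page HTML, or throws a descriptive
// error if the expected text is not found.
function parseLevel(html) {
  const match = LEVEL_PATTERN.exec(html);
  if (!match) {
    throw new Error('No river level found in page; has the FEWS site changed?');
  }
  return Number.parseFloat(match[1]);
}

// Wraps parseLevel so a layout change is reported as a result object
// instead of crashing the scraper.
function safeParseLevel(html) {
  try {
    return { ok: true, level: parseLevel(html) };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}
```

With this shape, a failed parse can be logged and skipped, and the next scheduled scrape simply tries again.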