waterWatch

A self-hosted Node.js backend that ingests Alberta river conditions (water level and discharge) from Environment and Climate Change Canada's (ECCC) public hydrometric real-time feed — the same data behind rivers.alberta.ca — and caches them behind a storage abstraction for later querying.

What's here now: the province-wide ingestion pipeline plus the multi-user service layered on top of it — a map web UI (public/), passwordless (magic-link) auth, per-station favorites with threshold alert rules, and breach email delivery. It can run as one combined process or split into separate API and ingest processes that coordinate only through Postgres. See docs/ingestion-backend-plan.md for the ingestion backend and docs/plan-highlevel.md for the multi-user email-alerts epic.

Requirements

Node.js 20+ (uses native fetch and the built-in test runner)
npm 10+

Quick start

npm install
npm start

The server boots on http://localhost:3000 by default. Verify it is healthy:

curl http://localhost:3000/health
# {"status":"ok","service":"waterwatch","env":"development","uptimeSeconds":0,"time":"..."}

For local development with auto-reload:

npm run dev

Storage (Postgres)

waterWatch stores everything in Postgres — there is a single storage path in every environment (dev, test, prod); there is no SQLite fallback. Provide the connection via DATABASE_URL (or the discrete PG* variables) in your env file; see Configuration.

For local development and tests, a committed docker-compose.yml stands up a throwaway Postgres 17 matching the .env.example defaults:

docker compose up -d   # Postgres on localhost:5432 (waterwatch/waterwatch)
npm test               # uses the docker-compose default connection

To point the tests at a different database, set TEST_DATABASE_URL:

TEST_DATABASE_URL=postgres://user:pass@host:5432/db npm test

The schema is created automatically on first connect from the consolidated src/storage/schema.sql.

Run with Docker

A multi-stage Dockerfile is provided for self-hosted deployment. The builder stage installs production dependencies against the lockfile and the slim runtime stage ships only those plus the app, running as the non-root node user.

Build the image:

docker build -t waterwatch:latest .

Run it. The container binds 0.0.0.0:3000 and starts ingesting on boot. Provide configuration with --env-file (copy .env.example to .env first) and point DATABASE_URL at a reachable Postgres:

cp .env.example .env   # then edit DATABASE_URL to your Postgres
docker run -d --name waterwatch \
  -p 3000:3000 \
  --env-file .env \
  waterwatch:latest

The image declares a HEALTHCHECK that polls /health, so docker ps shows the container's health. Verify and inspect it like any other deployment:

curl http://localhost:3000/health
curl http://localhost:3000/ingestion/status
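
If you script this check, a small retry loop avoids racing the container's startup. The following is a minimal sketch, not code from the repository: `waitForHealthy` and its options are hypothetical names, and it assumes only the `/health` response shape shown earlier (`"status":"ok"`).

```javascript
// Hypothetical helper: polls /health until it reports "ok" or attempts run out.
// `fetchImpl` defaults to Node's built-in fetch (Node 20+).
async function waitForHealthy(
  baseUrl,
  { attempts = 10, delayMs = 1000, fetchImpl = fetch } = {},
) {
  for (let i = 0; i < attempts; i += 1) {
    try {
      const res = await fetchImpl(new URL('/health', baseUrl));
      if (res.ok && (await res.json()).status === 'ok') return true;
    } catch {
      // Connection refused while the container is still booting; retry below.
    }
    // Wait between attempts, but not after the last one.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}

// Usage (not executed here):
// waitForHealthy('http://localhost:3000').then((ok) => process.exit(ok ? 0 : 1));
```

Passing `fetchImpl` explicitly keeps the helper testable without a running server.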

Stop it gracefully. The entrypoint uses exec form, so node is PID 1 and receives SIGTERM directly; it then stops the poller, drains in-flight requests, and closes the Postgres pool before exiting:

docker stop waterwatch
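
The shutdown order described above can be sketched as follows. This is an illustration under assumptions, not the project's actual entrypoint: `createShutdown` and `poller.stop()` are hypothetical names, while `server.close()` and `pool.end()` mirror the promise-returning methods that Fastify and `pg` expose.

```javascript
// Sketch only: `poller`, `server`, and `pool` stand in for whatever the app wires up.
function createShutdown({ poller, server, pool, log = console }) {
  let shuttingDown = false;
  return async function shutdown(signal) {
    if (shuttingDown) return false; // ignore repeated signals
    shuttingDown = true;
    log.info(`received ${signal}, shutting down`);
    await poller.stop(); // 1. schedule no further poll cycles
    await server.close(); // 2. stop accepting and drain in-flight HTTP requests
    await pool.end(); // 3. release Postgres connections last
    return true;
  };
}

// Usage (not executed here):
// const shutdown = createShutdown({ poller, server, pool });
// process.once('SIGTERM', () => shutdown('SIGTERM').then(() => process.exit(0)));
```

Closing the pool last matters in this sketch: both the poller and in-flight requests may still need a connection while they finish.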

Inspection / debug endpoints

These endpoints exist only to confirm that province-wide ingestion is working — they are operational/debug aids, not a consumer query API. A rich query surface (history ranges, filtering, consumer pagination) is intentionally deferred to the later monitoring/alerting work (see docs/ingestion-backend-plan.md, Phase 5).

Endpoint	Purpose
`GET /health`	Liveness check (process is up).
`GET /ingestion/status`	Last successful poll time, current watermark, total readings, distinct stations.
`GET /stations`	Count + station ids/names — confirms province-wide coverage.
`GET /stations/:stationId/latest`	Most recent reading for one station (spot-check, e.g. the test station).

# Is data landing and fresh?
curl http://localhost:3000/ingestion/status
# {"lastSuccessAt":"...","watermark":"...","totalReadings":600,"distinctStations":227,...}

# How many stations do we know about?
curl http://localhost:3000/stations
# {"count":1104,"stations":[{"stationId":"05AA004","name":"..."}, ...]}

# Spot-check the test station's latest reading.
curl http://localhost:3000/stations/05BL004/latest
# {"stationId":"05BL004","timestamp":"...","level":3.01,"discharge":177,...}
# Unknown station -> 404 {"error":"not_found","stationId":"..."}
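
For automated monitoring, the `lastSuccessAt` field from `/ingestion/status` can be turned into a simple staleness check. The sketch below is hypothetical (`isIngestionFresh` is not part of the codebase), and the "two poll intervals" threshold is an assumption chosen for illustration, not a documented rule.

```javascript
// Hypothetical check: treat ingestion as fresh if the last successful poll
// happened within two poll intervals (assumed threshold).
function isIngestionFresh(
  status,
  { pollIntervalMinutes = 10, now = Date.now() } = {},
) {
  const last = Date.parse(status.lastSuccessAt);
  if (Number.isNaN(last)) return false; // never polled, or unparseable timestamp
  const maxAgeMs = 2 * pollIntervalMinutes * 60_000;
  return now - last <= maxAgeMs;
}

// Usage (not executed here):
// const res = await fetch('http://localhost:3000/ingestion/status');
// console.log(isIngestionFresh(await res.json()));
```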

npm scripts

Script	Purpose
`npm start`	Start the server (`node src/index.js`)
`npm run dev`	Start with `--watch` auto-reload
`npm test`	Run the test suite (`node --test`)
`npm run lint`	Lint with ESLint
`npm run lint:fix`	Lint and auto-fix
`npm run format`	Format with Prettier
`npm run format:check`	Check formatting without writing

Configuration

All configuration is read from environment variables once, at startup (src/config/index.js), validated, and exposed as an immutable object. Copy .env.example to .env and adjust as needed. Defaults:

Variable	Default	Description
`NODE_ENV`	`development`	Runtime environment.
`LOG_LEVEL`	`debug` (dev) / `info` (prod)	Minimum log level: `trace,debug,info,warn,error,fatal,silent`.
`HOST`	`0.0.0.0`	HTTP bind address.
`PORT`	`3000`	HTTP port.
`PROVINCE`	`AB`	Province/territory filter for the ECCC feed (province-wide ingest).
`ECCC_BASE_URL`	`https://api.weather.gc.ca`	ECCC OGC API base URL.
`TEST_STATION`	`05BL004`	Station id used for manual spot-checks (not a scope limit).
`CONTACT_EMAIL`	`info@<APP_BASE_URL host>`	Contact address sent in the ECCC API User-Agent (MSC usage policy); defaults from `APP_BASE_URL`.
`POLL_INTERVAL_MINUTES`	`10`	Ingestion poll interval (source updates ~every 5 min).
`INITIAL_BACKFILL_HOURS`	`24`	Cold-start backfill window when the store is empty.
`POLL_OVERLAP_MINUTES`	`15`	Re-query window before the watermark to catch late stragglers.
`RETENTION_DAYS`	`30`	Retention window for pruning old readings (the app keeps only the last 30 days).
`STATION_ACTIVE_WINDOW_DAYS`	`30`	Stations are shown on the map / in search only if they reported within this many days; hides dead/discontinued gauges. `0` disables; per-request opt-out via `?includeInactive=true`.
`DATABASE_URL`	(docker-compose default)	Postgres connection string. Wins over the discrete `PG*` vars.
`PGHOST`/`PGPORT`	`localhost`/`5432`	Discrete Postgres host/port (libpq names) if no `DATABASE_URL`.
`PGDATABASE`	`waterwatch`	Postgres database name.
`PGUSER`/`PGPASSWORD`	`waterwatch`/`waterwatch`	Postgres credentials.
`PG_POOL_SIZE`	`10`	Max connections in the `pg` pool.
`AUTH_ADMIN_EMAILS`	(empty)	Comma-separated admin allow-list (case-insensitive); admin is config-derived, never stored.
`APP_BASE_URL`	`http://localhost:3000`	Public base URL for magic-link callbacks and post-login redirect.
`MAGIC_LINK_TTL_MINUTES`	`15`	Single-use magic-link token lifetime (minutes).
`SESSION_TTL_DAYS`	`30`	Session lifetime after login (days).
`AUTH_COOKIE_NAME`	`ww_session`	Name of the session cookie.
`AUTH_COOKIE_SECURE`	`true` (prod) / `false` (dev)	Set the `Secure` flag on the session cookie.
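
The read-once, validate, freeze pattern can be sketched like this. It is a simplified illustration covering three of the variables above, not the contents of src/config/index.js; `loadConfig` and the exact validation rules are assumptions.

```javascript
// Hypothetical loader: reads env once, validates, and returns an immutable object.
function loadConfig(env = process.env) {
  const port = Number(env.PORT ?? 3000);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer in 1-65535, got "${env.PORT}"`);
  }

  const pollIntervalMinutes = Number(env.POLL_INTERVAL_MINUTES ?? 10);
  if (!(pollIntervalMinutes > 0)) {
    throw new Error('POLL_INTERVAL_MINUTES must be a positive number');
  }

  // Comma-separated allow-list, normalized to lowercase for case-insensitive matching.
  const adminEmails = (env.AUTH_ADMIN_EMAILS ?? '')
    .split(',')
    .map((email) => email.trim().toLowerCase())
    .filter(Boolean);

  return Object.freeze({
    port,
    pollIntervalMinutes,
    adminEmails: Object.freeze(adminEmails),
  });
}
```

Failing fast at startup means a typo in the env file surfaces immediately rather than at the first poll or request.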

Postgres is the only storage backend. A connection (DATABASE_URL or PGDATABASE) is required in every environment; src/config/index.js rejects startup without one.
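
The precedence rule (DATABASE_URL wins, otherwise the discrete PG* variables, otherwise refuse to start) might look like the sketch below. `resolvePgConfig` is a hypothetical name; the returned keys (`connectionString`, `host`, `port`, `database`, `user`, `password`, `max`) are standard `pg` pool options, but how the project actually builds its pool is not shown here.

```javascript
// Hypothetical resolver: produces options suitable for `new Pool(options)` from `pg`.
function resolvePgConfig(env = process.env) {
  const max = Number(env.PG_POOL_SIZE ?? 10);

  // DATABASE_URL takes precedence over every discrete PG* variable.
  if (env.DATABASE_URL) {
    return { connectionString: env.DATABASE_URL, max };
  }

  // Without a URL, a database name is the minimum needed to connect.
  if (!env.PGDATABASE) {
    throw new Error('Set DATABASE_URL or PGDATABASE: Postgres is required');
  }

  return {
    host: env.PGHOST ?? 'localhost',
    port: Number(env.PGPORT ?? 5432),
    database: env.PGDATABASE,
    user: env.PGUSER,
    password: env.PGPASSWORD,
    max,
  };
}
```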

Data-growth caveat: province-wide ingestion at ~5-minute cadence across hundreds of active stations accumulates quickly. RETENTION_DAYS bounds the table: the ingestion scheduler prunes readings older than the window on every poll cycle.
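
To make the time windows concrete, here is a minimal sketch of how INITIAL_BACKFILL_HOURS, POLL_OVERLAP_MINUTES, and RETENTION_DAYS could translate into timestamps. The function names are hypothetical and the real scheduler may compute these differently; the sketch also assumes writes are idempotent, since the overlap window deliberately re-fetches readings that were already stored.

```javascript
const MINUTE_MS = 60_000;
const HOUR_MS = 60 * MINUTE_MS;
const DAY_MS = 24 * HOUR_MS;

// Start of the next poll query: backfill on a cold start, otherwise
// step back from the watermark to catch late-arriving readings.
function pollWindowStart({ watermark, now, overlapMinutes = 15, backfillHours = 24 }) {
  if (watermark == null) {
    return new Date(now.getTime() - backfillHours * HOUR_MS);
  }
  return new Date(watermark.getTime() - overlapMinutes * MINUTE_MS);
}

// Readings with a timestamp before this instant are eligible for pruning.
function retentionCutoff({ now, retentionDays = 30 }) {
  return new Date(now.getTime() - retentionDays * DAY_MS);
}

// Usage (not executed here):
// const since = pollWindowStart({ watermark: null, now: new Date() });
```

As a rough sizing illustration only: a station reporting every 5 minutes yields 288 readings per day, so the 227 distinct stations in the sample status output above would produce about 65,376 rows per day, or roughly 1.96 million rows over a 30-day retention window.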

Project structure

src/
  config/       # Centralized env-based configuration (single source of truth)
  http/         # Fastify server + routes (/health + Phase 5 inspection endpoints)
    routes/
  lib/          # Cross-cutting utilities (structured logger)
  datasource/   # ECCC OGC API client + normalization (Phase 2)
  storage/      # Repository interface + Postgres implementation + migrations
  ingestion/    # Scheduled polling service (Phase 4)
  index.js      # Entrypoint: wires config + logger + server, lifecycle
test/           # Test suite (node:test)
docs/ingestion-backend-plan.md  # Implementation plan and phase status (this backend)
docs/plan-highlevel.md          # High-level plan for the multi-user email-alerts epic
docker-compose.yml  # Local/CI Postgres for dev + tests
Dockerfile      # Multi-stage container build
.dockerignore   # Keeps the build context small / image clean
.env.example    # Sample env file (copy to .env)
ansible/        # Single-node deploy: provision (playbook.yml) + redeploy (deploy.yml)

Deployment

waterWatch deploys as a single node: one host runs Postgres plus the combined app container (node src/index.js = HTTP API + static web UI + ingestion + alerts). The ansible/ directory automates the whole thing — it builds the image on the host from source (no registry, no git on the box), provisions Postgres, renders the env file, and runs the app as a systemd service.

First-time provisioning of a fresh Debian/Ubuntu host (run in order):

cd ansible
ansible-galaxy collection install -r requirements.yml
cp inventory.example.ini inventory.ini
cp group_vars/all.example.yml group_vars/all.yml
cp secrets.example.yml secrets.yml      # set DB password and API tokens
$EDITOR inventory.ini                    # point at your host
$EDITOR group_vars/all.yml               # domain, admin emails, deploy defaults
ansible-playbook firstrun.yml            # timezone/NTP, upgrades, firewall
ansible-playbook playbook.yml            # Postgres + build/run the app
ansible-playbook proxy.yml               # (optional) nginx + Let's Encrypt TLS

Routine code updates — after a host is provisioned, this is the everyday "push my latest code" command. It ships whatever source is currently on disk (committed or not), rebuilds the image, and restarts only if something changed:

cd ansible
ansible-playbook deploy.yml

deploy.yml deliberately does not touch Postgres, the firewall, or backups — it is just the fast app redeploy. Full details (what each playbook does, TLS, outbound email, backups, security notes) live in ansible/README.md; a high-level overview of the moving parts is in docs/DEPLOY.md.

Running by hand (no Ansible)

You can also build and run the container yourself against your own Postgres — see Run with Docker above and docs/DEPLOY.md.

Note

Don't edge-cache the dynamic endpoints. If you front the app with a CDN (e.g. Cloudflare), keep caching off for the JSON API (/api/*, /stations/*, /ingestion/status) — caching them serves stale readings even when ingestion is current. Static assets under public/ are safe to cache.
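
One way to make that split explicit is to classify paths and emit a matching Cache-Control value at the origin or in a CDN rule. This is a sketch under assumptions: `isDynamicPath` and `cacheControlFor` are hypothetical helpers, the one-hour max-age for static assets is an arbitrary example, and nothing here claims the app sends these headers today.

```javascript
// Path prefixes taken from the note above; bare "/stations" is included as an
// assumption because it also returns live data.
const DYNAMIC_PREFIXES = ['/api', '/stations', '/ingestion/status'];

function isDynamicPath(pathname) {
  return DYNAMIC_PREFIXES.some(
    (prefix) => pathname === prefix || pathname.startsWith(`${prefix}/`),
  );
}

// Dynamic JSON must never be cached; static assets may be.
function cacheControlFor(pathname) {
  return isDynamicPath(pathname) ? 'no-store' : 'public, max-age=3600';
}
```

Matching on a trailing slash boundary keeps an unrelated static path such as `/stations-map.css` from being treated as dynamic.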

Future-upgrade path

Except where marked as built, these are deliberately out of scope for this deliverable (see the Scope boundary in docs/ingestion-backend-plan.md). They are the intended evolution, and the design keeps each change isolated to one seam:

Alternative storage engines. Storage lives behind the Repository interface (src/storage/repository.js); the Postgres implementation is built by createRepository() in src/storage/index.js. A different engine would slot in there behind the same interface with no change to ingestion or the HTTP layer, and is validated by reusing the existing storage contract test suite (runRepositoryContract). Queries already use real SQL so semantics carry over.
AMQP / Sarracenia push ingestion. The scheduled poller (src/ingestion/scheduler.js) is the only component that decides when new readings arrive. Swapping the poll loop for ECCC's AMQP push notifications changes just that module — it still normalizes via the Phase 2 client and writes through the same repository, so storage and reads are untouched.
Ingestion / read-API process split. (Built.) The app can run combined (src/index.js) or as two processes, the read/API server (src/api.js) and the ingest scheduler (src/ingest.js), that coordinate only through Postgres (writer vs. readers). All three entrypoints reuse the same wiring factories (src/app/wiring.js), so the storage interface stays the seam. Scaling the pieces onto separate hosts is the remaining step.
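
To illustrate the storage seam from the first item, here is a toy in-memory engine written against an interface shape. Every method name (`upsertReadings`, `latestForStation`, `pruneOlderThan`) is hypothetical; the real contract is whatever src/storage/repository.js defines, and a real alternative engine would be validated by the existing contract suite rather than by this sketch.

```javascript
// Hypothetical repository: same async shape a SQL-backed engine could expose.
// Timestamps are assumed to be ISO 8601 UTC strings, which sort lexicographically.
function createInMemoryRepository() {
  const readings = new Map(); // key: `${stationId}|${timestamp}`

  return {
    // Idempotent write: re-ingesting the same reading overwrites, never duplicates.
    async upsertReadings(rows) {
      for (const row of rows) {
        readings.set(`${row.stationId}|${row.timestamp}`, { ...row });
      }
      return rows.length;
    },

    async latestForStation(stationId) {
      let latest = null;
      for (const row of readings.values()) {
        if (row.stationId !== stationId) continue;
        if (latest === null || row.timestamp > latest.timestamp) latest = row;
      }
      return latest;
    },

    // Removes readings strictly older than the cutoff; returns how many were removed.
    async pruneOlderThan(cutoffIso) {
      let removed = 0;
      for (const [key, row] of readings) {
        if (row.timestamp < cutoffIso) {
          readings.delete(key);
          removed += 1;
        }
      }
      return removed;
    },
  };
}
```

Because callers only see the async methods, ingestion and HTTP code written against this shape would not need to know which engine sits behind it.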

Status

All six phases in docs/ingestion-backend-plan.md are complete: backend scaffolding, the ECCC client, storage, the ingestion scheduler, the minimal inspection endpoints, and packaging/ops/run docs (this Docker build + run guide). The same document tracks per-phase status.

License

MIT
