UpTime

A self-hosted uptime and synthetic monitoring platform built with Go, Gin, PostgreSQL, Redis, worker pools, Prometheus metrics, and webhook notifications.

UpTime started as a small uptime-check API. This rebuild turns the same idea into a backend-first portfolio project with real persistence, scheduler/worker separation, incident handling, API-key auth, metrics, and Docker Compose.

Features

Gin REST API with GET /health, legacy GET /health-check, and legacy POST /ping-endpoint
HTTP, keyword, TCP, DNS, and TLS checks
HTTP timing details through httptrace: DNS, TCP connect, TLS handshake, first byte, total duration
PostgreSQL tables for monitors, check results, incidents, notification channels, API keys, and audit logs
Worker process with goroutines, channels, context cancellation, duplicate-check avoidance, and graceful shutdown
Incident lifecycle: opens after failureThreshold consecutive failures and resolves on recovery
Webhook notification channels for incident open/resolve events
API key authentication with hashed stored keys and a bootstrap admin key
Prometheus metrics for API requests, checks, incidents, and worker jobs
Docker Compose stack with API, worker, Postgres, Redis, Prometheus, and Grafana

Screenshots

These screenshots were captured from the live Docker Compose stack.

Architecture

flowchart LR
  User[User / API Client] --> API[Go Gin API]
  API --> Postgres[(PostgreSQL)]
  API --> Redis[(Redis)]
  Worker[Go Worker Pool] --> Postgres
  Worker --> Redis
  Worker --> Targets[Websites / TCP / DNS / TLS Targets]
  Worker --> Notify[Webhook Notifications]
  Prometheus --> API
  Prometheus --> Worker
  Grafana --> Prometheus

Tech Stack

Go 1.22+
Gin
PostgreSQL via pgx
Redis
Prometheus client library
Structured logging with slog
Docker Compose

Local Setup

Run the full stack:

make docker-up

API: http://localhost:8008

Prometheus: http://localhost:9090

Grafana: http://localhost:3000 (login: admin / admin)

Run the Go processes without Docker (the connection strings below assume PostgreSQL and Redis are already reachable on localhost):

export DATABASE_URL='postgres://uptime:uptime@localhost:5432/uptime?sslmode=disable'
export REDIS_URL='redis://localhost:6379/0'
export UPTIME_BOOTSTRAP_API_KEY='dev_admin_key'

make migrate
go run ./cmd/api
go run ./cmd/worker

Environment

Variable	Default	Description
`APP_ENV`	`development`	Runtime environment (`production` enforces stricter defaults)
`APP_PORT`	`8008`	API port
`METRICS_PORT`	`8009`	Worker Prometheus metrics port
`DATABASE_URL`	local Postgres	PostgreSQL connection string (`postgres://` or `postgresql://`)
`REDIS_URL`	local Redis	Redis connection string (`redis://` or `rediss://`)
`UPTIME_BOOTSTRAP_API_KEY`	`dev_admin_key` (dev only)	Bootstrap bearer token. Required in production; must be ≥ 16 chars
`ALLOW_PRIVATE_TARGETS`	`false`	Allow localhost/private targets for checks/webhooks (forbidden in production)
`CHECK_WORKER_COUNT`	`10`	Worker goroutine count (1–1024)
`DEFAULT_CHECK_TIMEOUT_SECONDS`	`10`	Default check timeout (1–300)
`SCHEDULER_TICK_SECONDS`	`5`	How often the scheduler polls for due monitors (1–60)
`LOG_LEVEL`	`info`	`debug`, `info`, `warn`, or `error`
`TLS_EXPIRY_WARN_DAYS`	`14`	Days before expiry that TLS checks report `degraded`
`WEBHOOK_SIGNING_SECRET`	empty	If set, webhook bodies are HMAC-SHA256 signed in `X-UpTime-Signature`
`WEBHOOK_TIMEOUT_SECONDS`	`10`	Per-attempt webhook timeout
`WEBHOOK_MAX_RETRIES`	`3`	Additional webhook attempts after the first failure (0–10)
`SHUTDOWN_TIMEOUT_SECONDS`	`15`	Graceful shutdown deadline
`API_READ_HEADER_TIMEOUT_SECONDS`	`5`	API `http.Server` read header timeout
`API_WRITE_TIMEOUT_SECONDS`	`30`	API `http.Server` write timeout
`MAX_REQUEST_BODY_BYTES`	`1048576`	Maximum accepted request body size in bytes
`MIGRATIONS_DIR`	`migrations`	Directory containing `*.up.sql` files
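
When WEBHOOK_SIGNING_SECRET is set, a webhook receiver can verify the X-UpTime-Signature header before trusting the payload. The sketch below assumes the signature is the hex-encoded HMAC-SHA256 of the raw request body; the README does not state the encoding, so check the actual header format before relying on it. The function names are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
)

// signBody returns the hex-encoded HMAC-SHA256 of body under secret.
// Hex encoding is an assumption made for this sketch.
func signBody(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySignature reports whether got (the X-UpTime-Signature header value)
// matches the signature computed over body. hmac.Equal compares in
// constant time, which avoids leaking how many bytes matched.
func verifySignature(secret, body []byte, got string) bool {
	gotBytes, err := hex.DecodeString(got)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(gotBytes, mac.Sum(nil))
}
```

The receiver should compute the signature over the exact bytes it received, before any JSON decoding or re-encoding.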

API Examples

Health:

curl http://localhost:8008/health

Manual legacy check:

curl -X POST http://localhost:8008/ping-endpoint \
  -H "Content-Type: application/json" \
  -d '{"endpoint":"https://example.com"}'

Create a monitor:

curl -X POST http://localhost:8008/api/v1/monitors \
  -H "Authorization: Bearer dev_admin_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Example Website",
    "type": "http",
    "target": "https://example.com",
    "method": "GET",
    "expectedStatus": 200,
    "timeoutSeconds": 10,
    "intervalSeconds": 60,
    "failureThreshold": 3,
    "enabled": true
  }'

Run a monitor now:

curl -X POST http://localhost:8008/api/v1/monitors/00000000-0000-0000-0000-000000000101/check-now \
  -H "Authorization: Bearer dev_admin_key"

Create an API key:

curl -X POST http://localhost:8008/api/v1/api-keys \
  -H "Authorization: Bearer dev_admin_key" \
  -H "Content-Type: application/json" \
  -d '{"name":"local dev"}'

Scheduler And Worker

cmd/worker periodically loads enabled monitors from PostgreSQL. It schedules checks by intervalSeconds, skips monitors already in flight, and fans jobs out to a fixed goroutine pool. Each job uses context timeouts, stores a check result, updates monitor status, and applies incident rules.

Redis is part of the local stack and health reporting. The current worker uses local in-process scheduling; Redis-backed distributed locks/queues are a natural next step for multiple worker replicas.

Check Types

http: validates the URL, blocks private targets by default, and supports GET/HEAD methods, an expected status code, body snippets, and timing breakdowns. Redirects are not followed.
keyword: HTTP check plus expected keyword matching.
tcp: checks host:port reachability with net.Dialer.
dns: resolves a hostname with Go's resolver.
tls: connects to a TLS endpoint and marks certificates near expiry as degraded.

Incident Lifecycle

Check results are stored in check_results. A monitor opens an incident only after failureThreshold consecutive failures, and the next successful check resolves the open incident. Webhook notifications are sent on both transitions, and each delivery attempt is recorded in notification_events.
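
The open/resolve rule can be expressed as a small state machine. The sketch below keeps the state in memory for illustration; the type and function names are hypothetical, and the real worker presumably derives the same state from PostgreSQL.

```go
package main

// Transition describes what one check result did to the incident state.
type Transition int

const (
	None Transition = iota
	Opened
	Resolved
)

// IncidentState is a hypothetical in-memory view of one monitor.
type IncidentState struct {
	ConsecutiveFailures int
	Open                bool
}

// Apply folds one check result into the state and reports the transition.
// An incident opens once ConsecutiveFailures reaches failureThreshold and
// resolves on the first success while it is open.
func (s *IncidentState) Apply(success bool, failureThreshold int) Transition {
	if success {
		s.ConsecutiveFailures = 0
		if s.Open {
			s.Open = false
			return Resolved
		}
		return None
	}
	s.ConsecutiveFailures++
	if !s.Open && s.ConsecutiveFailures >= failureThreshold {
		s.Open = true
		return Opened
	}
	return None
}
```

With failureThreshold set to 3, the sequence fail, fail, success, fail never opens an incident, because the success resets the failure counter. A webhook would be sent only when Apply returns Opened or Resolved.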

Observability

GET /metrics exposes API metrics. The worker exposes metrics on :8009/metrics.

Prometheus scrapes both services, and Grafana is provisioned with a starter UpTime dashboard.

Worker Dashboard

A minimal job UI is served by the API at GET /workers. It polls GET /api/v1/workers/status every 2 seconds and shows, per worker instance: host, started/last-seen, active and queued jobs, in-flight monitor IDs, and the most recent 50 check results. Workers write their state into worker_heartbeats every 5 seconds, so the same view also reflects crashed or restarting instances (rows older than ~20 seconds are flagged stale).

The HTML page is unauthenticated; it prompts for an API key client-side and uses it as a Bearer token for the protected status XHR.

Security

/api/v1/* endpoints require an Authorization: Bearer <key> header or an X-API-Key header
Raw generated API keys are shown once; only SHA-256 hashes are stored
Check targets and webhook URLs pointing at localhost, private, or link-local addresses are blocked unless ALLOW_PRIVATE_TARGETS=true
Checks use context timeouts and bounded response snippets
Logs avoid raw API keys and webhook payload secrets

Testing

make test
make check

The test suite covers HTTP checker success, timeout, expected-status mismatch, SSRF blocking, TCP success/failure, DNS success/failure, TLS expiry classification, API key hashing, and incident open/resolve rules.

Roadmap

Redis-backed distributed queue and locks
Slack, Discord, and SMTP notification channels
Public status pages
Multi-tenant organisations
Remote monitoring agents
Optional React/Next.js dashboard
Elasticsearch/Kibana analytics as a future optional integration
