Track release stability for OpenClaw, Hermes, and any other open-source project. Built on Cloudflare Workers + D1.
Live: https://agentwatch.aicompass.dev
- Cron every 20 minutes pulls the 15 most-recent releases from GitHub for each configured project, plus issues updated since the last poll.
- An LLM (OpenAI-compatible) classifies each issue with project-aware rules. The system prompt embeds an explicit core-vs-niche rubric per project (openclaw, hermes-agent), so a provider/channel/backend-specific bug is correctly tagged as `niche` instead of inflated to `core+broad+critical`. Output fields: sentiment, target release, severity, impact scope, functionality, affected user share, duplicate cluster size, workaround status, and a one-line summary.
- Stability score (0 = unstable → 10 = stable) per version blends:
  - Impact-weighted issue risk, with a per-issue cap so one over-tagged report can't tank the score, and a niche-total cap (1.0) so any number of niche/integration/provider issues contribute at most 1.0 to the risk index
  - Core-blocker floor (6.0): if the release has zero `core+critical|high` negatives, the score never drops below "Mostly stable" — vocal but bounded niche failures don't make working software look broken
  - Peer-median floor (5.5): a release whose weighted negative signal is at or below the project's own historical median is held to "Mixed" or better
  - Stronger positive signal: positive issues / "works for me" comments offset roughly 2× more than before
  - User star ratings (1–10) blend in with up to 60% weight at saturation
- New versions (< 3 hours old) display a grey 5 with `analyzing…`.
- Color coding is interpolated:
  - Lower scores shade red because they indicate higher observed release risk
  - `= 5` grey means neutral or insufficient signal
  - Higher scores shade green because they indicate lower observed release risk
- Confidence label (`low`/`medium`/`high`) reflects how many independent signals (negatives + positives + ratings) backed the score, so a low score from a single report is visibly distinct from a low score from many corroborating ones.
- Login with GitHub or Google to add your own 1–10 rating with optional comment.
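The caps and floors above combine as follows — a minimal sketch, not the repo's actual code. The 1.0 niche cap, 6.0 core-blocker floor, and 5.5 peer-median floor come from the description above; the per-issue cap (0.8), the 2-points-per-risk-unit slope, and the 0.2 positive offset are illustrative assumptions:

```typescript
// Illustrative reimplementation of the stability-score blending.
// Constants other than 1.0 / 6.0 / 5.5 are assumptions, not the repo's values.
type IssueSignal = {
  scope: "core" | "niche";
  severity: "critical" | "high" | "medium" | "low";
  weight: number; // impact-weighted risk contribution from the LLM analysis
};

const PER_ISSUE_CAP = 0.8;    // assumed: one over-tagged report can't tank the score
const NICHE_TOTAL_CAP = 1.0;  // all niche/integration/provider issues together ≤ 1.0
const CORE_BLOCKER_FLOOR = 6.0;
const PEER_MEDIAN_FLOOR = 5.5;

function stabilityScore(
  negatives: IssueSignal[],
  positiveCount: number,
  peerMedianRisk: number, // project's historical median weighted negative signal
): number {
  let coreRisk = 0;
  let nicheRisk = 0;
  for (const issue of negatives) {
    const w = Math.min(issue.weight, PER_ISSUE_CAP); // per-issue cap
    if (issue.scope === "core") coreRisk += w;
    else nicheRisk += w;
  }
  nicheRisk = Math.min(nicheRisk, NICHE_TOTAL_CAP); // niche-total cap

  // Positive "works for me" signals offset risk (0.2/positive is an assumption).
  const risk = Math.max(0, coreRisk + nicheRisk - 0.2 * positiveCount);
  let score = Math.max(0, 10 - 2 * risk); // assumed linear slope

  // Core-blocker floor: no core critical/high negatives → at worst "Mostly stable".
  const hasCoreBlocker = negatives.some(
    (i) => i.scope === "core" && (i.severity === "critical" || i.severity === "high"),
  );
  if (!hasCoreBlocker) score = Math.max(score, CORE_BLOCKER_FLOOR);

  // Peer-median floor: at-or-below the project's historical median → "Mixed" or better.
  if (coreRisk + nicheRisk <= peerMedianRisk) score = Math.max(score, PEER_MEDIAN_FLOOR);

  return Math.min(10, score);
}
```

Note how twenty niche reports still score 8 here: the niche-total cap keeps bounded integration noise from making working software look broken, exactly the behavior the bullets above describe.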
Local development:

```sh
npm install
cp .env.example .dev.vars   # populate at minimum LLM_API_KEY and GITHUB_TOKEN

# Validate everything once before deploy
npm run test                # typecheck + sql validation + score smoke tests

# Run the worker locally
npm run db:migrate:local
npm run dev
```

`npm run dev` serves on http://localhost:8787 with the static frontend mounted under `/`.
To trigger a one-off poll locally:

```sh
curl -X POST http://localhost:8787/cron/run -H "x-admin-token: $SESSION_SECRET"
```

All config is via environment variables — see `.env.example` for the full list.
| Variable | Purpose | Example |
|---|---|---|
| `PROJECTS` | Comma-separated `slug=owner/repo` for projects to monitor | `openclaw=openclaw/openclaw,hermes=nousresearch/hermes-agent` |
| `DEFAULT_PROJECT` | Slug shown on the homepage by default | `openclaw` |
| `PUBLIC_BASE_URL` | Origin used for OAuth `redirect_uri` | `https://agentwatch.aicompass.dev` |
| `GITHUB_TOKEN` | Bumps GitHub API rate limit from 60 → 5000 requests/hour | `ghp_…` |
| `LLM_BASE_URL` | OpenAI-compatible endpoint (works for OpenAI, Anthropic via proxy, third-party) | `https://api.openai.com/v1` |
| `LLM_MODEL_NAME` | Model used for issue analysis | `gpt-4o-mini` |
| `LLM_API_KEY` | API key for the LLM provider | `sk-…` |
| `GITHUB_OAUTH_CLIENT_ID` / `_SECRET` | GitHub login app credentials | — |
| `GOOGLE_OAUTH_CLIENT_ID` / `_SECRET` | Google login app credentials | — |
| `SESSION_SECRET` | Random ≥32-char string; also gates the `/cron/run` admin endpoint | — |
| `PUBLIC_GA_MEASUREMENT_ID` | Optional Google Analytics 4 measurement ID; leave empty to disable GA | `G-…` |
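Putting the table together, a minimal `.dev.vars` for local development might look like this (all values are placeholders; OAuth and GA variables are omitted here for brevity):

```ini
PROJECTS=openclaw=openclaw/openclaw,hermes=nousresearch/hermes-agent
DEFAULT_PROJECT=openclaw
PUBLIC_BASE_URL=http://localhost:8787
GITHUB_TOKEN=ghp_replace-me
LLM_BASE_URL=https://api.openai.com/v1
LLM_MODEL_NAME=gpt-4o-mini
LLM_API_KEY=sk-replace-me
SESSION_SECRET=replace-with-a-random-string-of-32-plus-chars
```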
When registering OAuth apps, use these callback URLs:

- GitHub → `https://agentwatch.aicompass.dev/auth/github/callback`
- Google → `https://agentwatch.aicompass.dev/auth/google/callback`
`LLM_BASE_URL` accepts any OpenAI-Chat-Completions-compatible endpoint. Tested with:

- OpenAI (`https://api.openai.com/v1`)
- Anthropic via proxy (e.g. `https://api.anthropic.com/v1` with a compat shim)
- Self-hosted (`https://your-host/v1`)
If `LLM_API_KEY` is unset, the worker still polls and stores issues, but every analysis defaults to neutral / confidence 0, so versions show 5 / grey.
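The endpoint contract and the no-key fallback can be sketched like this. This is an illustrative function (name, prompt, and return shape are assumptions, not the repo's `analyzeIssue`); it only relies on the standard OpenAI Chat Completions request shape:

```typescript
// Hypothetical sketch of calling any OpenAI-compatible endpoint.
// Env variable names match the table above; everything else is illustrative.
type LlmEnv = {
  LLM_BASE_URL: string;
  LLM_MODEL_NAME: string;
  LLM_API_KEY?: string;
};

async function classifyIssue(env: LlmEnv, issueBody: string) {
  if (!env.LLM_API_KEY) {
    // No key configured: default to neutral / confidence 0,
    // which renders as the grey 5 described above.
    return { sentiment: "neutral", confidence: 0 };
  }
  const res = await fetch(`${env.LLM_BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${env.LLM_API_KEY}`,
    },
    body: JSON.stringify({
      model: env.LLM_MODEL_NAME,
      messages: [
        { role: "system", content: "Classify this issue. Reply with a JSON object." },
        { role: "user", content: issueBody },
      ],
    }),
  });
  if (!res.ok) throw new Error(`LLM request failed: ${res.status}`);
  const data = (await res.json()) as { choices: { message: { content: string } }[] };
  return JSON.parse(data.choices[0].message.content);
}
```

Because only the `/chat/completions` path and `Bearer` auth are assumed, the same code works against OpenAI, a proxy, or a self-hosted server by swapping `LLM_BASE_URL`.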
```sh
# 1. Create the D1 database (capture the printed ID)
npx wrangler d1 create agent-watch
# → paste database_id into wrangler.jsonc

# 2. Create the KV namespace
npx wrangler kv namespace create CACHE
# → paste id into wrangler.jsonc

# 3. Apply migrations
npm run db:migrate        # remote
npm run db:migrate:local  # local emulator

# 4. Set secrets (NEVER commit these)
npx wrangler secret put GITHUB_TOKEN
npx wrangler secret put LLM_API_KEY
npx wrangler secret put GITHUB_OAUTH_CLIENT_ID
npx wrangler secret put GITHUB_OAUTH_CLIENT_SECRET
npx wrangler secret put GOOGLE_OAUTH_CLIENT_ID
npx wrangler secret put GOOGLE_OAUTH_CLIENT_SECRET
npx wrangler secret put SESSION_SECRET

# 5. Build & deploy
npm run deploy
```

`wrangler.jsonc` already declares a custom-domain route. Make sure:
- The `aicompass.dev` zone exists in your Cloudflare account.
- After the first `wrangler deploy`, attach the route in Workers → agent-watch → Settings → Domains & Routes (or wrangler will provision the certificate automatically when the route block above is present).
- Update `PUBLIC_BASE_URL` to match — and update the OAuth app callback URLs.
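For orientation, the `wrangler.jsonc` fragments that steps 1–2 and the route refer to have roughly this shape (the binding names `DB` and `CACHE` and other values are illustrative; keep whatever the repo's file already declares):

```jsonc
{
  "name": "agent-watch",
  // Step 1: paste the printed database_id here
  "d1_databases": [
    { "binding": "DB", "database_name": "agent-watch", "database_id": "<id from step 1>" }
  ],
  // Step 2: paste the printed namespace id here
  "kv_namespaces": [
    { "binding": "CACHE", "id": "<id from step 2>" }
  ],
  // Custom-domain route — must match PUBLIC_BASE_URL
  "routes": [
    { "pattern": "agentwatch.aicompass.dev", "custom_domain": true }
  ],
  // 20-minute poll schedule
  "triggers": { "crons": ["*/20 * * * *"] }
}
```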
```
┌──────────────┐   cron (20 min)   ┌──────────────────────┐
│   GitHub     │ ◀──────────────── │ Workers scheduled()  │
│   REST API   │                   │      (poll.ts)       │
└──────────────┘                   └─────────┬────────────┘
                                             │
                                             ▼
┌──────────────┐                   ┌──────────────────────┐
│  LLM (any    │ ◀──────────────── │    analyzeIssue()    │
│  OAI-compat) │                   │   sentiment + tag    │
└──────────────┘                   └─────────┬────────────┘
                                             │
                                             ▼
                                   ┌──────────────────────┐
                                   │     D1 (sqlite)      │
                                   │  versions / issues   │
                                   │ analyses / ratings   │
                                   └─────────┬────────────┘
                                             │ HTTP API
                                             ▼
                                   ┌──────────────────────┐
                                   │ static SPA dashboard │
                                   │ (Cloudflare Assets)  │
                                   └──────────────────────┘
```
MIT