See what Google sees.
Self-hosted indexing diagnostics for your sitemap, in one docker compose up.
Quick Start · What It Does · Prerequisites · Troubleshooting · beacon.brianonai.com
Built by Brian Diamond — fractional CAIO, builder of Onaro and The CAIO Brief.
Beacon answers a question Google Search Console makes harder than it should be:
"Of the URLs in my sitemap, which ones is Google actually indexing — and which ones isn't it?"
Point Beacon at your domain. It will:
- Auto-discover your sitemap from `robots.txt` (or accept a paste)
- Authenticate against your own Google Search Console via OAuth
- Inspect each URL through the official URL Inspection API
- Optionally fetch each URL with HTTP to detect dead pages, redirects, and stale content
- Show you a clean breakdown: Indexed, Unknown, Discovered, Crawled-Not-Indexed, Excluded, Errors
No data ever leaves your machine. No SaaS account. No subscription. Your Google credentials, your data, your hardware.
Completed scans for the same property and sitemap pair reload instantly from your browser's local cache. The optional Re-scan clears the cached result and runs a fresh scan over an SSE stream.
The SEO industry has spent twenty years building tools for the gap between Indexed and Ranking: keyword research, content optimization, backlink analysis, technical audits. All valuable. All predicated on Google having indexed your URLs in the first place.
Beacon is the prior-question tool. It checks stage zero before you spend on stage three.
| Stage | What it means |
|---|---|
| Published | The URL exists on your server. Returns 200. |
| Submitted | The URL is in your sitemap and GSC knows about it. |
| Crawled | Google has fetched the page at least once. |
| Indexed | Google has decided the page is worth showing in results. |
Beacon shows you which stage every URL in your sitemap is at. That's the whole tool.
- You manage a site and want to know if Google has the URLs from your sitemap. GSC tells you per-URL but doesn't roll it up — Beacon does.
- You're an agency or freelancer auditing a client. Run Beacon locally, share the export. No vendor lock-in, no client data in third-party SaaS.
- You're diagnosing why traffic is flat. Beacon surfaces the gap between what you think you've published and what Google has crawled and indexed.
- You don't want to pay $99-200/mo for an SEO suite to see this one screen. Fair.
- Not a replacement for Search Console (use both)
- Not a keyword research tool
- Not a backlink analyzer
- Not a content optimizer
- Not a ranking tracker
Beacon does one thing: shows you the delta between your sitemap and Google's index.
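As a rough illustration of that delta, the sketch below rolls per-URL states up into counts and lists the sitemap URLs that are not indexed. The function name `summarize`, the state labels, and the example URLs are assumptions made for this example, not Beacon's internal code.

```python
from collections import Counter


def summarize(sitemap_urls, states):
    """Roll per-URL states up into counts plus the list of URLs not indexed.

    `states` maps URL -> state label. URLs missing from it count as "Unknown".
    (Hypothetical helper for illustration only.)
    """
    labels = {url: states.get(url, "Unknown") for url in sitemap_urls}
    counts = Counter(labels.values())
    not_indexed = sorted(url for url, label in labels.items() if label != "Indexed")
    return counts, not_indexed


sitemap = [
    "https://example.com/",
    "https://example.com/a",
    "https://example.com/b",
]
states = {
    "https://example.com/": "Indexed",
    "https://example.com/a": "Crawled-not-indexed",
}
counts, delta = summarize(sitemap, states)
print(dict(counts))
print(delta)
```

The second return value is the part Search Console does not give you in one place: every sitemap URL that Google has not indexed.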
- Docker with Compose v2 (`docker compose version`)
- A Google account with access to at least one Search Console property
- ~10 minutes for first-time OAuth setup
You need Docker, a Google account, and ~10 minutes for first-time setup. Once configured, scans take 30–120 seconds for typical sites.
Beacon runs in Docker, which means you don't have to install Python, Node.js, or any other dependencies.
After installing, open Docker Desktop once to make sure it's running. You'll see a whale icon in your menu bar (Mac) or system tray (Windows) when active.
Verify with:

    docker --version
    docker compose version

If both return version numbers, you're ready.

    git clone https://github.com/brianonai/beacon.git
    cd beacon

The repo includes a template called .env.example. Copy it to .env, which is where you'll put your actual credentials. The .env file is gitignored, so your secrets stay on your machine.
PowerShell (Windows):

    Copy-Item .env.example .env

Bash (Mac/Linux):

    cp .env.example .env

Open .env in any text editor. Fill in GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, and SESSION_SECRET. The walkthrough is in Google OAuth Setup below.
This repository’s Docker Compose uses host ports 13000 (web) and 18080 (API) so Windows users avoid Hyper-V reserved port ranges on :8000. The template already sets:

    OAUTH_REDIRECT_URI=http://localhost:18080/auth/google/callback
    POST_LOGIN_REDIRECT=http://localhost:13000/
    ALLOWED_ORIGIN=http://localhost:13000

OAUTHLIB_INSECURE_TRANSPORT=1 must stay on for local HTTP. Never enable that on a public server.

    docker compose up -d

The first run takes a minute or two while Docker pulls images. Subsequent starts are nearly instant.
Open http://localhost:13000, click Connect Search Console, authorize Beacon, and run your first scan.
This is a one-time setup (~5 minutes). You're creating an OAuth client that lets Beacon read your Search Console data on your behalf. Beacon never stores your Google password — only OAuth tokens in a signed session cookie after you authorize.
- Go to https://console.cloud.google.com/
- Click the project dropdown at the top → New Project
- Name it anything (e.g., "Beacon Local")
- Click Create and wait ~30 seconds
- Make sure the new project is selected in the project dropdown
- In the left sidebar: APIs & Services → Library
- Search for "Google Search Console API"
- Click it, then click Enable
Without this, you may get a confusing error at scan time instead of during OAuth.
- APIs & Services → OAuth consent screen
- Choose External → Create
- Fill in App name, User support email, Developer contact
- Click Save and Continue through scopes (the defaults are fine). Under Test users, add your Google account if the app stays in testing.
- Optionally Publish App if you want to remove the 100-user test cap (Beacon is still local-only).
- APIs & Services → Credentials
- Click + Create Credentials → OAuth client ID
- Application type: Web application
- Name: Beacon Local (or anything)
- Authorized redirect URIs: add exactly this value (it matches this repo's Compose defaults): `http://localhost:18080/auth/google/callback`
- Click Create and copy the Client ID and Client Secret
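To show why the redirect URI has to match character for character, here is a minimal sketch of how an OAuth authorization URL is assembled. The endpoint is Google's standard OAuth 2.0 authorization endpoint. The read-only Search Console scope and the helper name `build_auth_url` are assumptions, and Beacon's own implementation may request different parameters.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
# Assumption: a read-only Search Console scope is sufficient for URL Inspection.
READONLY_SCOPE = "https://www.googleapis.com/auth/webmasters.readonly"


def build_auth_url(client_id, redirect_uri, state):
    """Return the URL the browser is sent to for consent (illustrative helper)."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,  # must equal an Authorized redirect URI exactly
        "response_type": "code",
        "scope": READONLY_SCOPE,
        "state": state,  # opaque value echoed back to the callback
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"


url = build_auth_url(
    "your-client-id-here.apps.googleusercontent.com",
    "http://localhost:18080/auth/google/callback",
    "random-state-value",
)
# The redirect URI travels in the query string and is compared by Google verbatim.
print(parse_qs(urlsplit(url).query)["redirect_uri"][0])
```

Because Google compares the `redirect_uri` parameter against the registered value as a plain string, a different port or a trailing slash is enough to make the request fail.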

    GOOGLE_CLIENT_ID=your-client-id-here.apps.googleusercontent.com
    GOOGLE_CLIENT_SECRET=GOCSPX-your-secret-here
    SESSION_SECRET=your-random-32-char-string
    OAUTHLIB_INSECURE_TRANSPORT=1
    OAUTH_REDIRECT_URI=http://localhost:18080/auth/google/callback
    POST_LOGIN_REDIRECT=http://localhost:13000/
    ALLOWED_ORIGIN=http://localhost:13000

Save the file.
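SESSION_SECRET needs to be a random string of at least 32 characters. One way to produce it, assuming Python 3 is available on your machine (it is not required to run Beacon itself), is the standard library's `secrets` module. The helper name `new_session_secret` is made up for this example.

```python
import secrets


def new_session_secret(num_bytes=32):
    """Return a URL-safe random string suitable for signing cookies."""
    # 32 random bytes encode to 43 URL-safe characters, above the 32-char minimum.
    return secrets.token_urlsafe(num_bytes)


print(f"SESSION_SECRET={new_session_secret()}")
```

Paste the printed line into .env in place of the placeholder value.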
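
    docker compose restart

Open http://localhost:13000 and sign in again. If the new values do not take effect, recreate the containers with docker compose up -d, since a restart reuses the environment the containers were created with.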
You may see Google hasn't verified this app. Click Advanced → Go to … (unsafe). Expected for a personal OAuth client.
Beacon uses the Search Console URL Inspection API (read-only) for per-URL indexing state. Beacon does not use the Indexing API to request crawls in V1.
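For orientation, the sketch below builds a single URL Inspection request and reads the coverage state out of a response. The endpoint and the `inspectionUrl` / `siteUrl` body fields come from Google's public API; the helper names and the choice of `urllib` are assumptions for this example, not a description of Beacon's code. Nothing is sent over the network here.

```python
import json
from urllib.request import Request

INSPECT_ENDPOINT = "https://searchconsole.googleapis.com/v1/urlInspection/index:inspect"


def build_inspection_request(site_url, page_url, access_token):
    """Build (but do not send) one URL Inspection call. Illustrative helper."""
    body = json.dumps({"inspectionUrl": page_url, "siteUrl": site_url}).encode("utf-8")
    return Request(
        INSPECT_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def coverage_state(response_json):
    """Pull the human-readable coverage state out of a parsed response, if present."""
    result = response_json.get("inspectionResult", {})
    return result.get("indexStatusResult", {}).get("coverageState")


request = build_inspection_request(
    "sc-domain:example.com", "https://example.com/pricing", "ACCESS_TOKEN"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` with a valid OAuth access token would perform the call and count against the property's daily inspection quota.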
URL Inspection is rate-limited to roughly 2,000 inspections per day per property (confirm current limits in Google's docs). Each Beacon scan inspects every URL in your chosen sitemap once, so scanning a 500-URL sitemap uses 500 of that daily budget.
| Sitemap size | Rough daily scan budget |
|---|---|
| Up to ~100 URLs | Many re-scans |
| Up to ~500 URLs | Fewer |
| Near quota limit | Plan partial sitemaps or stagger days |
For large sites, prefer segment sitemaps (blog vs products). See the Chilistation write-up.
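The budget arithmetic can be sketched as follows. The 2,000 figure is the approximate limit quoted above and should be treated as an assumption, and the function names are invented for this example.

```python
import math

DAILY_QUOTA = 2000  # assumption: approximate per-property daily inspection limit


def full_scans_per_day(url_count, daily_quota=DAILY_QUOTA):
    """How many complete scans of one sitemap fit into a day's quota."""
    if url_count <= 0:
        raise ValueError("url_count must be positive")
    return daily_quota // url_count


def days_needed(url_count, daily_quota=DAILY_QUOTA):
    """How many days one pass takes when the sitemap exceeds the quota."""
    return math.ceil(url_count / daily_quota)


def stagger(urls, daily_quota=DAILY_QUOTA):
    """Split a URL list into per-day batches that each fit within the quota."""
    return [urls[i:i + daily_quota] for i in range(0, len(urls), daily_quota)]


print(full_scans_per_day(100))  # 20
print(full_scans_per_day(500))  # 4
print(days_needed(5000))        # 3
```

A 5,000-URL sitemap therefore needs three days for one full pass, which is why segment sitemaps are easier to work with.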
- Pick a property from the dropdown (sites verified in your Search Console).
- Sitemap discovery: Beacon checks `robots.txt` for `Sitemap:`, then common paths. Paste a URL if auto-discovery misses.
- Run scan: the sitemap is parsed (including one-level sitemap indexes), each URL is inspected via GSC, then HTTP-checked.
- Progress: two phases, GSC inspection and page checks (hidden when loading from cache).
- Results: states include Indexed, Unknown, Discovered-not-indexed, Crawled-not-indexed, Excluded, Errors, and phantom URLs (404 while not indexed).
- Filters: chips above the table; row links open GSC URL Inspection where available.
- Export CSV: for analysis or client handoff.
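The discovery and parsing steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Beacon's code: the function names are invented, and `fetch` stands in for whatever HTTP client retrieves a child sitemap.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemaps_from_robots(robots_txt):
    """Return every URL declared on a `Sitemap:` line of a robots.txt body."""
    found = []
    for line in robots_txt.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def parse_sitemap(xml_text):
    """Return ("index" or "urlset", list of <loc> values) for one sitemap document."""
    root = ET.fromstring(xml_text)
    kind = "index" if root.tag == f"{SITEMAP_NS}sitemapindex" else "urlset"
    locs = [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]
    return kind, locs


def urls_in_sitemap(xml_text, fetch):
    """Collect page URLs, expanding a sitemap index by one level only."""
    kind, locs = parse_sitemap(xml_text)
    if kind == "urlset":
        return locs
    urls = []
    for child_url in locs:
        child_kind, child_locs = parse_sitemap(fetch(child_url))
        if child_kind == "urlset":  # nested indexes are skipped in this sketch
            urls.extend(child_locs)
    return urls
```

Keeping `fetch` as a parameter means the parsing logic can be exercised with in-memory documents before any network request is made.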
| Pattern | Likely meaning | What to do |
|---|---|---|
| Lots of Unknown | New site or sitemap not crawled yet | Wait, re-scan, submit sitemap in GSC |
| Lots of Discovered / Crawled-not-indexed | Quality/selectivity signals | Content, internal links, differentiation |
| Phantom URLs | Sitemap lists dead pages | Remove from sitemap |
| High Excluded | Often canonical / noindex | Verify intentional |
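To make the result states concrete, here is one plausible way to bucket a URL from its GSC coverage state and HTTP status. The mapping rules, the `classify` name, and the "Phantom" label are assumptions for illustration. The coverage strings are examples of what Search Console reports, and Beacon's actual rules (including its Errors state, omitted here) may differ.

```python
def classify(coverage_state, http_status=None):
    """Map a GSC coverage string and optional HTTP status to a display bucket."""
    state = (coverage_state or "").lower()
    if not state or "unknown" in state:
        bucket = "Unknown"
    elif state.startswith("discovered"):
        bucket = "Discovered"
    elif state.startswith("crawled"):
        bucket = "Crawled-not-indexed"
    elif "indexed" in state and "not indexed" not in state:
        bucket = "Indexed"
    else:
        bucket = "Excluded"
    # A sitemap URL that returns 404 and is not indexed is a dead entry.
    if http_status == 404 and bucket != "Indexed":
        return "Phantom"
    return bucket


print(classify("Submitted and indexed", 200))
print(classify("Crawled - currently not indexed", 200))
print(classify("Discovered - currently not indexed", 404))
```

The order of the checks matters, because "Crawled - currently not indexed" also contains the word "indexed".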
| Variable | Required? | Default | Purpose |
|---|---|---|---|
| `GOOGLE_CLIENT_ID` | Yes | (none) | OAuth client ID |
| `GOOGLE_CLIENT_SECRET` | Yes | (none) | OAuth client secret |
| `SESSION_SECRET` | Yes | (none) | Signs session cookies (32+ random chars) |
| `OAUTHLIB_INSECURE_TRANSPORT` | Yes (local HTTP) | `1` | Localhost only |
| `OAUTH_REDIRECT_URI` | Yes (Docker defaults) | see `.env.example` | Must match Google Console |
| `POST_LOGIN_REDIRECT` | Yes | `http://localhost:13000/` | After OAuth callback |
| `ALLOWED_ORIGIN` | Yes | `http://localhost:13000` | CORS for the web app (no trailing slash) |
| `INSPECT_CONCURRENCY` | No | `10` | Parallel GSC inspections |
| `INSPECT_JITTER_MS` | No | `100` | Delay jitter between batches |
| `STALE_DAYS` | No | `180` | UI "stale" threshold |
| `PAGE_CHECK_CONCURRENCY` | No | `5` | Parallel HTTP page fetches |
| `USER_AGENT` | No | `BeaconBot/0.1…` | UA for page checks |
| `MAX_REDIRECTS` | No | `5` | Max redirects per URL |
| `MICROLINK_API_KEY` | No | (none) | Optional Microlink Pro key for previews |
| `BEACON_TELEMETRY` | No | `false` | Reserved for optional install ping |
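The concurrency and jitter settings describe a common pattern: cap the number of in-flight requests and add a small random delay so calls do not arrive in lockstep. The sketch below shows that pattern with `asyncio`. It applies jitter per request rather than per batch, and the names `run_limited` and `worker` are assumptions, so treat it as an illustration of the idea rather than Beacon's scheduler.

```python
import asyncio
import random


async def run_limited(items, worker, concurrency=10, jitter_ms=100):
    """Run `worker(item)` for every item with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(item):
        async with semaphore:
            # Random delay of up to jitter_ms spreads requests out in time.
            await asyncio.sleep(random.uniform(0, jitter_ms) / 1000)
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))
```

Calling `asyncio.run(run_limited(urls, inspect_one))`, where `inspect_one` is your own coroutine, would process the list with the defaults from the table above.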
| Service | Host port | Container | Override |
|---|---|---|---|
| Web (Next.js) | 13000 | 3000 | docker-compose.yml ports mapping |
| API (FastAPI) | 18080 | 8000 | Same — OAuth redirect must use 18080 on the host |
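If you remap the host ports, three .env values have to change together. The sketch below derives them from the two port numbers and reports any mismatch. The helper names are invented for this example, and the URL shapes are taken from the defaults shown in this README.

```python
def local_urls(web_port=13000, api_port=18080):
    """Derive the three URL settings from the host ports."""
    return {
        "OAUTH_REDIRECT_URI": f"http://localhost:{api_port}/auth/google/callback",
        "POST_LOGIN_REDIRECT": f"http://localhost:{web_port}/",
        "ALLOWED_ORIGIN": f"http://localhost:{web_port}",  # origins have no trailing slash
    }


def check_env(env, web_port=13000, api_port=18080):
    """Return a list of human-readable mismatches between env and the expected values."""
    expected = local_urls(web_port, api_port)
    return [
        f"{key}: expected {value}, found {env.get(key)}"
        for key, value in expected.items()
        if env.get(key) != value
    ]


# Example: the web port was moved to 3001 but .env still holds the defaults.
for problem in check_env(local_urls(), web_port=3001):
    print(problem)
```

The same OAUTH_REDIRECT_URI value also has to be registered as an Authorized redirect URI in Google Cloud Console.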
Short fixes live here; expanded steps: docs/TROUBLESHOOTING.md.
Add OAUTHLIB_INSECURE_TRANSPORT=1 to .env, then restart with docker compose restart.
The Authorized redirect URI in Google Cloud Console must match OAUTH_REDIRECT_URI exactly, including http, port 18080, the path /auth/google/callback, and no trailing slash.
Your Google user must be an owner or full user of at least one Search Console property.
The usual cause is a new site or the wrong property variant (www vs apex, http vs https). Pick the property that matches your sitemap.
Change host ports in docker-compose.yml, then update OAUTH_REDIRECT_URI, ALLOWED_ORIGIN, POST_LOGIN_REDIRECT, and the Google Cloud redirect URI to stay consistent.
Open a GitHub issue with OS, Docker version, Beacon version, and redacted docker compose logs api / web.
- Runs on your hardware; scan payloads stay in your browser session and optional localStorage cache (same machine).
- OAuth tokens live in a signed HTTP-only session cookie (`SESSION_SECRET`).
- API traffic goes to Google and to URLs you scan (HTTP checks). Optional Microlink for site previews.
- `BEACON_TELEMETRY`: recognized in config; no telemetry request is sent by the app today (endpoint reserved).
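For readers unfamiliar with signed cookies, the sketch below shows the general idea using an HMAC: the server can detect tampering because only it knows the secret. This is a generic illustration with invented helper names. Beacon's actual cookie format is not documented here and very likely differs.

```python
import base64
import hashlib
import hmac
import json


def sign(payload, secret):
    """Serialize a dict and append an HMAC-SHA256 signature."""
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode()).decode()
    mac = hmac.new(secret.encode(), body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{mac}"


def verify(cookie, secret):
    """Return the payload if the signature checks out, otherwise None."""
    body, sep, mac = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(secret.encode(), body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(body))


cookie = sign({"user": "me@example.com"}, "a-32-character-or-longer-secret!")
print(verify(cookie, "a-32-character-or-longer-secret!"))
print(verify(cookie, "wrong-secret"))
```

Signing proves integrity but does not hide the contents, which is one reason the secret should be long, random, and kept out of version control.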
Full detail: docs/PRIVACY.md.
Beacon is built and maintained by Brian Diamond.
If you find it useful:
- 📬 The CAIO Brief — AI governance & technical diligence
- 🛠️ Contact — fractional CAIO, governance, indexing at scale
- ⭐ Star the repo on GitHub
- ✅ V1: sitemap vs GSC delta, page checks, CSV export, local scan cache
- 🚧 V2: scan history, week-over-week deltas, scheduled scans + email
- 💭 Beacon Cloud: waitlist
MIT.
- `docs/INSTALLATION.md`: deep install walkthrough
- `docs/USAGE.md`: end-to-end scan guide
- `docs/TROUBLESHOOTING.md`: errors and OS quirks
- `docs/PRIVACY.md`: data, cookies, cache, Google visibility