
observability: indexing dashboard polish — realm column, static gauge, longest-jobs move#4821

Merged
habdelra merged 3 commits into main from worktree-grafana-indexing-realm-column
May 14, 2026

@habdelra
Contributor

Summary

  • Realm column fix. Active Indexing and Per-realm indexing status used to leave the realm cell blank for root-of-host (published) realms because their realmURL has no path segment after the host. The column now falls back to the first host label (e.g. https://buckpublishedsep8.staging.boxel.build/ becomes buckpublishedsep8), so every row has a realm.
  • Queued Indexing Jobs gains a leading realm column (with (all realms) for full-reindex). The pre-existing concurrency_group column stays for operators that lean on it.
  • Static gauge in Active Indexing. The percent gauge cell now hides its inline value (valueDisplayMode: hidden), so the bar bounds stay at full cell width instead of shrinking around the label. The percentage moves to a small adjacent left-aligned column with a blank header so the digits sit flush against the bar.
  • Longest indexing jobs (24h) move. The two Longest from-scratch-index / incremental-index jobs (24h) panels move from the Job Queue dashboard to the Indexing dashboard (the more natural home), with their realm extraction unified to the same fallback used elsewhere in this dashboard.
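
The realm fallback described above can be sketched outside SQL. The following is a minimal Python approximation of the `COALESCE(NULLIF(...), REGEXP_REPLACE(...))` expression used in the dashboard queries, not the dashboard's actual implementation. The function name `realm_label` and the `example.test` URL are hypothetical, and the `(all realms)` branch applies only to the panels whose queries include the `full-reindex` case.

```python
import re


def realm_label(realm_url, job_type="incremental-index"):
    """Approximate the dashboard's realm column (illustrative only)."""
    if job_type == "full-reindex":
        return "(all realms)"
    url = realm_url or ""  # mirrors COALESCE(args->>'realmURL', '')
    # Strip scheme and host, then trailing slashes, leaving only the path.
    path = re.sub(r"^https?://[^/]+/", "", url, count=1).rstrip("/")
    if path:
        return path
    # Empty path (root-of-host realm): fall back to the first host label.
    return re.sub(r"^https?://([^./:]+).*$", r"\1", url, count=1)


print(realm_label("https://buckpublishedsep8.staging.boxel.build/"))  # buckpublishedsep8
print(realm_label("https://example.test/user/realm/"))  # user/realm
```

Python's `re` and PostgreSQL's regex engine differ in details, so this is a reading aid for the fallback logic rather than a drop-in equivalent.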

Test plan

  • Apply to staging: pnpm --filter @cardstack/observability apply --env staging (or local equivalent) and open the Indexing dashboard.
  • Active Indexing — every row shows a realm, including a published / root-of-host realm; the gauge track is the same width on every row; the digits next to the gauge change but the bar bounds don't.
  • Per-realm indexing status — no blank realm cells.
  • Queued Indexing Jobs — realm column populated for from-scratch-index / incremental-index; shows (all realms) for a queued full-reindex.
  • Longest from-scratch / incremental jobs (24h) — present on the Indexing dashboard (and no longer on Job Queue).
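
When checking the Active Indexing digits by hand, it may help to recompute the value from `files_completed` and `total_files`. The sketch below approximates the query's `percent` expression, including its guard against a zero total. The function name `percent_complete` is hypothetical, and the one-decimal formatting is an assumption based on the `decimals: 1` override on the `pct` column.

```python
def percent_complete(files_completed, total_files):
    """Return progress on a 0-100 scale, or 0 when the total is missing or zero."""
    completed = files_completed or 0  # mirrors COALESCE(files_completed, 0)
    total = total_files or 0          # mirrors COALESCE(total_files, 0)
    if total > 0:
        return completed / total * 100
    return 0.0


# Format to one decimal place, as assumed for the pct column.
print(f"{percent_complete(3, 7):.1f}%")
```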

🤖 Generated with Claude Code

…, longest-jobs move

- Active Indexing & Per-realm indexing status: realm column derives from
  args.realmURL with a host-subdomain fallback so root-of-host (published)
  realms no longer render blank (e.g. `buckpublishedsep8`).
- Queued Indexing Jobs: add a leading `realm` column (`(all realms)` for
  full-reindex). The pre-existing `concurrency_group` column stays for
  operators that lean on it.
- Active Indexing percent column: gauge cell now hides its inline value
  (`valueDisplayMode: hidden`) so the bar bounds stay at full cell width
  instead of shrinking around the label. The percentage moves to a small
  adjacent left-aligned column with a blank header so the digits sit
  flush against the bar.
- Longest from-scratch-index / incremental-index jobs (24h) panels move
  from the Job Queue dashboard to the Indexing dashboard (the more
  natural home), with their realm extraction unified to the same
  fallback used elsewhere in this dashboard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

github-actions Bot commented May 13, 2026

Observability diff (vs staging)

diff --git a/tmp/remote-canon.8rrtDx/dashboards/boxel-status/indexing.json b/tmp/committed-canon.AZBm7q/dashboards/boxel-status/indexing.json
index 86c9743..378bd40 100644
--- a/tmp/remote-canon.8rrtDx/dashboards/boxel-status/indexing.json
+++ b/tmp/committed-canon.AZBm7q/dashboards/boxel-status/indexing.json
@@ -69,6 +69,10 @@
           "uid": "cef5v5sl9k7i8f"
         },
         "description": "System-wide operator action: queue a full reindex across every realm. The button disables itself while a `full-reindex` orchestration job is already pending or running. Per-realm reindex moved to the Realms dashboard. Click POSTs with `Authorization: Bearer ${grafana_secret}` (substituted from SSM at apply time, CS-10929).",
+        "fieldConfig": {
+          "defaults": {},
+          "overrides": []
+        },
         "gridPos": {
           "h": 8,
           "w": 24,
@@ -629,7 +633,8 @@
                   "id": "custom.cellOptions",
                   "value": {
                     "mode": "gradient",
-                    "type": "gauge"
+                    "type": "gauge",
+                    "valueDisplayMode": "hidden"
                   }
                 },
                 {
@@ -646,6 +651,42 @@
                 }
               ]
             },
+            {
+              "matcher": {
+                "id": "byName",
+                "options": "pct"
+              },
+              "properties": [
+                {
+                  "id": "unit",
+                  "value": "percent"
+                },
+                {
+                  "id": "decimals",
+                  "value": 1
+                },
+                {
+                  "id": "displayName",
+                  "value": " "
+                },
+                {
+                  "id": "custom.align",
+                  "value": "left"
+                },
+                {
+                  "id": "custom.minWidth",
+                  "value": 70
+                },
+                {
+                  "id": "custom.width",
+                  "value": 70
+                },
+                {
+                  "id": "custom.filterable",
+                  "value": false
+                }
+              ]
+            },
             {
               "matcher": {
                 "id": "byName",
@@ -766,7 +807,7 @@
             "editorMode": "code",
             "format": "table",
             "rawQuery": true,
-            "rawSql": "SELECT\n  j.id AS job_id,\n  RTRIM(REGEXP_REPLACE(j.concurrency_group, '^indexing:https?://[^/]+/', ''), '/') AS realm,\n  COALESCE(j.args->>'realmURL','') AS realm_url,\n  j.job_type,\n  COALESCE(jp.files_completed, 0) AS files_completed,\n  COALESCE(jp.total_files, 0) AS total_files,\n  CASE WHEN COALESCE(jp.total_files, 0) > 0\n    THEN (jp.files_completed::float / jp.total_files) * 100\n    ELSE 0\n  END AS percent,\n  EXTRACT(EPOCH FROM (NOW() - jr.created_at)) AS elapsed_seconds,\n  jr.created_at AS started_at,\n  jr.worker_id,\n  jr.id AS reservation_id\n FROM jobs j\n JOIN job_reservations jr ON jr.job_id = j.id\n   AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n LEFT JOIN job_progress jp ON jp.job_id = j.id\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n   AND j.finished_at IS NULL\n ORDER BY jr.created_at DESC;",
+            "rawSql": "SELECT\n  job_id,\n  realm,\n  realm_url,\n  job_type,\n  files_completed,\n  total_files,\n  percent,\n  percent AS pct,\n  elapsed_seconds,\n  started_at,\n  worker_id,\n  reservation_id\nFROM (\n  SELECT\n    j.id AS job_id,\n    COALESCE(\n      NULLIF(RTRIM(REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://[^/]+/', ''), '/'), ''),\n      REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://([^./:]+).*$', '\\1')\n    ) AS realm,\n    COALESCE(j.args->>'realmURL','') AS realm_url,\n    j.job_type,\n    COALESCE(jp.files_completed, 0) AS files_completed,\n    COALESCE(jp.total_files, 0) AS total_files,\n    CASE WHEN COALESCE(jp.total_files, 0) > 0\n      THEN (jp.files_completed::float / jp.total_files) * 100\n      ELSE 0\n    END AS percent,\n    EXTRACT(EPOCH FROM (NOW() - jr.created_at)) AS elapsed_seconds,\n    jr.created_at AS started_at,\n    jr.worker_id,\n    jr.id AS reservation_id\n  FROM jobs j\n  JOIN job_reservations jr ON jr.job_id = j.id\n    AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n  LEFT JOIN job_progress jp ON jp.job_id = j.id\n  WHERE j.job_type IN ('from-scratch-index','incremental-index')\n    AND j.finished_at IS NULL\n) active\nORDER BY started_at DESC;",
             "refId": "A"
           }
         ],
@@ -898,7 +939,7 @@
             "editorMode": "code",
             "format": "table",
             "rawQuery": true,
-            "rawSql": "SELECT\n  RTRIM(REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://[^/]+/', ''), '/') AS realm,\n  COALESCE(j.args->>'realmURL','') AS realm_url,\n  COUNT(*) FILTER (WHERE j.status = 'unfulfilled' AND jr.id IS NULL) AS pending,\n  COUNT(*) FILTER (WHERE j.status = 'unfulfilled' AND jr.id IS NOT NULL) AS in_flight,\n  MAX(j.finished_at) AS last_completed_at,\n  EXTRACT(EPOCH FROM (NOW() - MIN(j.created_at)\n    FILTER (WHERE j.status = 'unfulfilled' AND jr.id IS NULL))) AS oldest_pending_seconds\n FROM jobs j\n LEFT JOIN job_reservations jr ON j.id = jr.job_id\n   AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n GROUP BY j.args->>'realmURL'\n ORDER BY pending DESC, in_flight DESC, last_completed_at DESC NULLS LAST\n LIMIT 200;",
+            "rawSql": "SELECT\n  COALESCE(\n    NULLIF(RTRIM(REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://[^/]+/', ''), '/'), ''),\n    REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://([^./:]+).*$', '\\1')\n  ) AS realm,\n  COALESCE(j.args->>'realmURL','') AS realm_url,\n  COUNT(*) FILTER (WHERE j.status = 'unfulfilled' AND jr.id IS NULL) AS pending,\n  COUNT(*) FILTER (WHERE j.status = 'unfulfilled' AND jr.id IS NOT NULL) AS in_flight,\n  MAX(j.finished_at) AS last_completed_at,\n  EXTRACT(EPOCH FROM (NOW() - MIN(j.created_at)\n    FILTER (WHERE j.status = 'unfulfilled' AND jr.id IS NULL))) AS oldest_pending_seconds\n FROM jobs j\n LEFT JOIN job_reservations jr ON j.id = jr.job_id\n   AND jr.completed_at IS NULL AND jr.locked_until > NOW()\n WHERE j.job_type IN ('from-scratch-index','incremental-index')\n GROUP BY j.args->>'realmURL'\n ORDER BY pending DESC, in_flight DESC, last_completed_at DESC NULLS LAST\n LIMIT 200;",
             "refId": "A"
           }
         ],
@@ -1034,7 +1075,7 @@
             "editorMode": "code",
             "format": "table",
             "rawQuery": true,
-            "rawSql": "SELECT \n  j.id, \n  j.priority, \n  j.job_type, \n  CASE \n    WHEN j.concurrency_group ~ '^indexing:https://[^/]+/.+' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://[^/]+/', '') \n    WHEN j.concurrency_group ~ '^indexing:https://[^/]+/?$' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://', '') \n    ELSE j.concurrency_group \n  END AS concurrency_group, \n  j.status AS status, \n  j.created_at AS created_at, \n\n\n  -- Wait time in seconds\n    CASE \n      WHEN jr.created_at IS NOT NULL \n        THEN EXTRACT(EPOCH FROM (jr.created_at - j.created_at))\n      ELSE \n        EXTRACT(EPOCH FROM (NOW() - j.created_at))\n    END\n    AS wait_seconds,\n j.id as job_id\n\nFROM \n  jobs j\n  \nLEFT JOIN \n  job_reservations jr ON j.id = jr.job_id\n\nWHERE\njr.job_id IS NULL AND j.status = 'unfulfilled' AND j.job_type IN ('from-scratch-index','incremental-index','full-reindex') \n  \nORDER BY \n  j.created_at ASC\nLIMIT 500;",
+            "rawSql": "SELECT \n  j.id, \n  j.priority, \n  j.job_type, \n  CASE\n    WHEN j.job_type = 'full-reindex' THEN '(all realms)'\n    ELSE COALESCE(\n      NULLIF(RTRIM(REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://[^/]+/', ''), '/'), ''),\n      REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://([^./:]+).*$', '\\1')\n    )\n  END AS realm,\n  CASE \n    WHEN j.concurrency_group ~ '^indexing:https://[^/]+/.+' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://[^/]+/', '') \n    WHEN j.concurrency_group ~ '^indexing:https://[^/]+/?$' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://', '') \n    ELSE j.concurrency_group \n  END AS concurrency_group, \n  j.status AS status, \n  j.created_at AS created_at, \n\n\n  -- Wait time in seconds\n    CASE \n      WHEN jr.created_at IS NOT NULL \n        THEN EXTRACT(EPOCH FROM (jr.created_at - j.created_at))\n      ELSE \n        EXTRACT(EPOCH FROM (NOW() - j.created_at))\n    END\n    AS wait_seconds,\n j.id as job_id\n\nFROM \n  jobs j\n  \nLEFT JOIN \n  job_reservations jr ON j.id = jr.job_id\n\nWHERE\njr.job_id IS NULL AND j.status = 'unfulfilled' AND j.job_type IN ('from-scratch-index','incremental-index','full-reindex') \n  \nORDER BY \n  j.created_at ASC\nLIMIT 500;",
             "refId": "A",
             "sql": {
               "columns": [
@@ -1057,6 +1098,280 @@
         ],
         "title": "Queued Indexing Jobs",
         "type": "table"
+      },
+      {
+        "datasource": {
+          "type": "grafana-postgresql-datasource",
+          "uid": "cef5v5sl9k7i8f"
+        },
+        "description": "Top 10 from-scratch-index jobs that finished in the past 24 hours, ranked by run duration. `run_seconds` measures from the final reservation's `created_at` to `finished_at`, so a job that was retried is timed on the attempt that completed it. Use this to spot realms whose full reindex is regressing.",
+        "fieldConfig": {
+          "defaults": {
+            "color": {
+              "mode": "thresholds"
+            },
+            "custom": {
+              "align": "left",
+              "cellOptions": {
+                "type": "auto"
+              },
+              "filterable": true,
+              "inspect": false,
+              "minWidth": 150
+            },
+            "mappings": [],
+            "thresholds": {
+              "mode": "absolute",
+              "steps": [
+                {
+                  "color": "green"
+                },
+                {
+                  "color": "red",
+                  "value": 80
+                }
+              ]
+            }
+          },
+          "overrides": [
+            {
+              "matcher": {
+                "id": "byName",
+                "options": "run_seconds"
+              },
+              "properties": [
+                {
+                  "id": "unit",
+                  "value": "s"
+                }
+              ]
+            },
+            {
+              "matcher": {
+                "id": "byName",
+                "options": "worker_id"
+              },
+              "properties": [
+                {
+                  "id": "links",
+                  "value": [
+                    {
+                      "targetBlank": true,
+                      "title": "View logs",
+                      "url": "/d/fetquzizsej28b?${__url_time_range}&var-job_id=${__data.fields.id}.${__data.fields.reservation_id}&orgId=1&viewPanel=3"
+                    }
+                  ]
+                },
+                {
+                  "id": "mappings",
+                  "value": [
+                    {
+                      "options": {
+                        "pattern": "^(.{6}).*$",
+                        "result": {
+                          "index": 0,
+                          "text": "View logs ($1)"
+                        }
+                      },
+                      "type": "regex"
+                    }
+                  ]
+                }
+              ]
+            },
+            {
+              "matcher": {
+                "id": "byName",
+                "options": "reservation_id"
+              },
+              "properties": [
+                {
+                  "id": "custom.hidden",
+                  "value": true
+                }
+              ]
+            }
+          ]
+        },
+        "gridPos": {
+          "h": 10,
+          "w": 12,
+          "x": 0,
+          "y": 54
+        },
+        "id": 20,
+        "options": {
+          "cellHeight": "sm",
+          "footer": {
+            "countRows": false,
+            "enablePagination": false,
+            "fields": "",
+            "reducer": [
+              "sum"
+            ],
+            "show": false
+          },
+          "showHeader": true,
+          "sortBy": [
+            {
+              "desc": true,
+              "displayName": "run_seconds"
+            }
+          ]
+        },
+        "pluginVersion": "10.4.1",
+        "targets": [
+          {
+            "datasource": {
+              "type": "grafana-postgresql-datasource",
+              "uid": "cef5v5sl9k7i8f"
+            },
+            "editorMode": "code",
+            "format": "table",
+            "rawQuery": true,
+            "rawSql": "SELECT\n  j.id,\n  lr.reservation_id,\n  CASE\n    WHEN j.job_type = 'full-reindex' THEN '(all realms)'\n    ELSE COALESCE(\n      NULLIF(RTRIM(REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://[^/]+/', ''), '/'), ''),\n      REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://([^./:]+).*$', '\\1')\n    )\n  END AS realm,\n  j.status,\n  lr.started_at,\n  j.finished_at,\n  EXTRACT(EPOCH FROM (j.finished_at - lr.started_at)) AS run_seconds,\n  lr.worker_id\nFROM jobs j\nJOIN LATERAL (\n  SELECT jr.id AS reservation_id, jr.created_at AS started_at, jr.worker_id\n  FROM job_reservations jr\n  WHERE jr.job_id = j.id\n  ORDER BY jr.created_at DESC\n  LIMIT 1\n) lr ON TRUE\nWHERE j.job_type = 'from-scratch-index'\n  AND j.finished_at IS NOT NULL\n  AND j.finished_at > NOW() - INTERVAL '24 hours'\nORDER BY run_seconds DESC NULLS LAST\nLIMIT 10;",
+            "refId": "A"
+          }
+        ],
+        "title": "Longest from-scratch-index jobs (24h)",
+        "type": "table"
+      },
+      {
+        "datasource": {
+          "type": "grafana-postgresql-datasource",
+          "uid": "cef5v5sl9k7i8f"
+        },
+        "description": "Top 10 incremental-index jobs that finished in the past 24 hours, ranked by run duration. `run_seconds` measures from the final reservation's `created_at` to `finished_at`, so a job that was retried is timed on the attempt that completed it. Outliers here usually mean a heavy invalidation fan-out from a single edit.",
+        "fieldConfig": {
+          "defaults": {
+            "color": {
+              "mode": "thresholds"
+            },
+            "custom": {
+              "align": "left",
+              "cellOptions": {
+                "type": "auto"
+              },
+              "filterable": true,
+              "inspect": false,
+              "minWidth": 150
+            },
+            "mappings": [],
+            "thresholds": {
+              "mode": "absolute",
+              "steps": [
+                {
+                  "color": "green"
+                },
+                {
+                  "color": "red",
+                  "value": 80
+                }
+              ]
+            }
+          },
+          "overrides": [
+            {
+              "matcher": {
+                "id": "byName",
+                "options": "run_seconds"
+              },
+              "properties": [
+                {
+                  "id": "unit",
+                  "value": "s"
+                }
+              ]
+            },
+            {
+              "matcher": {
+                "id": "byName",
+                "options": "worker_id"
+              },
+              "properties": [
+                {
+                  "id": "links",
+                  "value": [
+                    {
+                      "targetBlank": true,
+                      "title": "View logs",
+                      "url": "/d/fetquzizsej28b?${__url_time_range}&var-job_id=${__data.fields.id}.${__data.fields.reservation_id}&orgId=1&viewPanel=3"
+                    }
+                  ]
+                },
+                {
+                  "id": "mappings",
+                  "value": [
+                    {
+                      "options": {
+                        "pattern": "^(.{6}).*$",
+                        "result": {
+                          "index": 0,
+                          "text": "View logs ($1)"
+                        }
+                      },
+                      "type": "regex"
+                    }
+                  ]
+                }
+              ]
+            },
+            {
+              "matcher": {
+                "id": "byName",
+                "options": "reservation_id"
+              },
+              "properties": [
+                {
+                  "id": "custom.hidden",
+                  "value": true
+                }
+              ]
+            }
+          ]
+        },
+        "gridPos": {
+          "h": 10,
+          "w": 12,
+          "x": 12,
+          "y": 54
+        },
+        "id": 21,
+        "options": {
+          "cellHeight": "sm",
+          "footer": {
+            "countRows": false,
+            "enablePagination": false,
+            "fields": "",
+            "reducer": [
+              "sum"
+            ],
+            "show": false
+          },
+          "showHeader": true,
+          "sortBy": [
+            {
+              "desc": true,
+              "displayName": "run_seconds"
+            }
+          ]
+        },
+        "pluginVersion": "10.4.1",
+        "targets": [
+          {
+            "datasource": {
+              "type": "grafana-postgresql-datasource",
+              "uid": "cef5v5sl9k7i8f"
+            },
+            "editorMode": "code",
+            "format": "table",
+            "rawQuery": true,
+            "rawSql": "SELECT\n  j.id,\n  lr.reservation_id,\n  CASE\n    WHEN j.job_type = 'full-reindex' THEN '(all realms)'\n    ELSE COALESCE(\n      NULLIF(RTRIM(REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://[^/]+/', ''), '/'), ''),\n      REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://([^./:]+).*$', '\\1')\n    )\n  END AS realm,\n  j.status,\n  lr.started_at,\n  j.finished_at,\n  EXTRACT(EPOCH FROM (j.finished_at - lr.started_at)) AS run_seconds,\n  lr.worker_id\nFROM jobs j\nJOIN LATERAL (\n  SELECT jr.id AS reservation_id, jr.created_at AS started_at, jr.worker_id\n  FROM job_reservations jr\n  WHERE jr.job_id = j.id\n  ORDER BY jr.created_at DESC\n  LIMIT 1\n) lr ON TRUE\nWHERE j.job_type = 'incremental-index'\n  AND j.finished_at IS NOT NULL\n  AND j.finished_at > NOW() - INTERVAL '24 hours'\nORDER BY run_seconds DESC NULLS LAST\nLIMIT 10;",
+            "refId": "A"
+          }
+        ],
+        "title": "Longest incremental-index jobs (24h)",
+        "type": "table"
       }
     ],
     "refresh": "5s",
diff --git a/tmp/remote-canon.8rrtDx/dashboards/boxel-status/job-queue.json b/tmp/committed-canon.AZBm7q/dashboards/boxel-status/job-queue.json
index a2e16cc..d2de711 100644
--- a/tmp/remote-canon.8rrtDx/dashboards/boxel-status/job-queue.json
+++ b/tmp/committed-canon.AZBm7q/dashboards/boxel-status/job-queue.json
@@ -1112,280 +1112,6 @@
         ],
         "title": "Finished Jobs (limit 500)",
         "type": "table"
-      },
-      {
-        "datasource": {
-          "type": "grafana-postgresql-datasource",
-          "uid": "cef5v5sl9k7i8f"
-        },
-        "description": "Top 10 from-scratch-index jobs that finished in the past 24 hours, ranked by run duration. `run_seconds` measures from the final reservation's `created_at` to `finished_at`, so a job that was retried is timed on the attempt that completed it. Use this to spot realms whose full reindex is regressing.",
-        "fieldConfig": {
-          "defaults": {
-            "color": {
-              "mode": "thresholds"
-            },
-            "custom": {
-              "align": "left",
-              "cellOptions": {
-                "type": "auto"
-              },
-              "filterable": true,
-              "inspect": false,
-              "minWidth": 150
-            },
-            "mappings": [],
-            "thresholds": {
-              "mode": "absolute",
-              "steps": [
-                {
-                  "color": "green"
-                },
-                {
-                  "color": "red",
-                  "value": 80
-                }
-              ]
-            }
-          },
-          "overrides": [
-            {
-              "matcher": {
-                "id": "byName",
-                "options": "run_seconds"
-              },
-              "properties": [
-                {
-                  "id": "unit",
-                  "value": "s"
-                }
-              ]
-            },
-            {
-              "matcher": {
-                "id": "byName",
-                "options": "worker_id"
-              },
-              "properties": [
-                {
-                  "id": "links",
-                  "value": [
-                    {
-                      "targetBlank": true,
-                      "title": "View logs",
-                      "url": "/d/fetquzizsej28b?${__url_time_range}&var-job_id=${__data.fields.id}.${__data.fields.reservation_id}&orgId=1&viewPanel=3"
-                    }
-                  ]
-                },
-                {
-                  "id": "mappings",
-                  "value": [
-                    {
-                      "options": {
-                        "pattern": "^(.{6}).*$",
-                        "result": {
-                          "index": 0,
-                          "text": "View logs ($1)"
-                        }
-                      },
-                      "type": "regex"
-                    }
-                  ]
-                }
-              ]
-            },
-            {
-              "matcher": {
-                "id": "byName",
-                "options": "reservation_id"
-              },
-              "properties": [
-                {
-                  "id": "custom.hidden",
-                  "value": true
-                }
-              ]
-            }
-          ]
-        },
-        "gridPos": {
-          "h": 10,
-          "w": 12,
-          "x": 0,
-          "y": 58
-        },
-        "id": 20,
-        "options": {
-          "cellHeight": "sm",
-          "footer": {
-            "countRows": false,
-            "enablePagination": false,
-            "fields": "",
-            "reducer": [
-              "sum"
-            ],
-            "show": false
-          },
-          "showHeader": true,
-          "sortBy": [
-            {
-              "desc": true,
-              "displayName": "run_seconds"
-            }
-          ]
-        },
-        "pluginVersion": "10.4.1",
-        "targets": [
-          {
-            "datasource": {
-              "type": "grafana-postgresql-datasource",
-              "uid": "cef5v5sl9k7i8f"
-            },
-            "editorMode": "code",
-            "format": "table",
-            "rawQuery": true,
-            "rawSql": "SELECT\n  j.id,\n  lr.reservation_id,\n  CASE\n    WHEN j.concurrency_group ~ '^indexing:https://[^/]+/.+' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://[^/]+/', '')\n    WHEN j.concurrency_group ~ '^indexing:https://[^/]+/?$' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://', '')\n    ELSE j.concurrency_group\n  END AS realm,\n  j.status,\n  lr.started_at,\n  j.finished_at,\n  EXTRACT(EPOCH FROM (j.finished_at - lr.started_at)) AS run_seconds,\n  lr.worker_id\nFROM jobs j\nJOIN LATERAL (\n  SELECT jr.id AS reservation_id, jr.created_at AS started_at, jr.worker_id\n  FROM job_reservations jr\n  WHERE jr.job_id = j.id\n  ORDER BY jr.created_at DESC\n  LIMIT 1\n) lr ON TRUE\nWHERE j.job_type = 'from-scratch-index'\n  AND j.finished_at IS NOT NULL\n  AND j.finished_at > NOW() - INTERVAL '24 hours'\nORDER BY run_seconds DESC NULLS LAST\nLIMIT 10;",
-            "refId": "A"
-          }
-        ],
-        "title": "Longest from-scratch-index jobs (24h)",
-        "type": "table"
-      },
-      {
-        "datasource": {
-          "type": "grafana-postgresql-datasource",
-          "uid": "cef5v5sl9k7i8f"
-        },
-        "description": "Top 10 incremental-index jobs that finished in the past 24 hours, ranked by run duration. `run_seconds` measures from the final reservation's `created_at` to `finished_at`, so a job that was retried is timed on the attempt that completed it. Outliers here usually mean a heavy invalidation fan-out from a single edit.",
-        "fieldConfig": {
-          "defaults": {
-            "color": {
-              "mode": "thresholds"
-            },
-            "custom": {
-              "align": "left",
-              "cellOptions": {
-                "type": "auto"
-              },
-              "filterable": true,
-              "inspect": false,
-              "minWidth": 150
-            },
-            "mappings": [],
-            "thresholds": {
-              "mode": "absolute",
-              "steps": [
-                {
-                  "color": "green"
-                },
-                {
-                  "color": "red",
-                  "value": 80
-                }
-              ]
-            }
-          },
-          "overrides": [
-            {
-              "matcher": {
-                "id": "byName",
-                "options": "run_seconds"
-              },
-              "properties": [
-                {
-                  "id": "unit",
-                  "value": "s"
-                }
-              ]
-            },
-            {
-              "matcher": {
-                "id": "byName",
-                "options": "worker_id"
-              },
-              "properties": [
-                {
-                  "id": "links",
-                  "value": [
-                    {
-                      "targetBlank": true,
-                      "title": "View logs",
-                      "url": "/d/fetquzizsej28b?${__url_time_range}&var-job_id=${__data.fields.id}.${__data.fields.reservation_id}&orgId=1&viewPanel=3"
-                    }
-                  ]
-                },
-                {
-                  "id": "mappings",
-                  "value": [
-                    {
-                      "options": {
-                        "pattern": "^(.{6}).*$",
-                        "result": {
-                          "index": 0,
-                          "text": "View logs ($1)"
-                        }
-                      },
-                      "type": "regex"
-                    }
-                  ]
-                }
-              ]
-            },
-            {
-              "matcher": {
-                "id": "byName",
-                "options": "reservation_id"
-              },
-              "properties": [
-                {
-                  "id": "custom.hidden",
-                  "value": true
-                }
-              ]
-            }
-          ]
-        },
-        "gridPos": {
-          "h": 10,
-          "w": 12,
-          "x": 12,
-          "y": 58
-        },
-        "id": 21,
-        "options": {
-          "cellHeight": "sm",
-          "footer": {
-            "countRows": false,
-            "enablePagination": false,
-            "fields": "",
-            "reducer": [
-              "sum"
-            ],
-            "show": false
-          },
-          "showHeader": true,
-          "sortBy": [
-            {
-              "desc": true,
-              "displayName": "run_seconds"
-            }
-          ]
-        },
-        "pluginVersion": "10.4.1",
-        "targets": [
-          {
-            "datasource": {
-              "type": "grafana-postgresql-datasource",
-              "uid": "cef5v5sl9k7i8f"
-            },
-            "editorMode": "code",
-            "format": "table",
-            "rawQuery": true,
-            "rawSql": "SELECT\n  j.id,\n  lr.reservation_id,\n  CASE\n    WHEN j.concurrency_group ~ '^indexing:https://[^/]+/.+' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://[^/]+/', '')\n    WHEN j.concurrency_group ~ '^indexing:https://[^/]+/?$' THEN REGEXP_REPLACE(j.concurrency_group, '^indexing:https://', '')\n    ELSE j.concurrency_group\n  END AS realm,\n  j.status,\n  lr.started_at,\n  j.finished_at,\n  EXTRACT(EPOCH FROM (j.finished_at - lr.started_at)) AS run_seconds,\n  lr.worker_id\nFROM jobs j\nJOIN LATERAL (\n  SELECT jr.id AS reservation_id, jr.created_at AS started_at, jr.worker_id\n  FROM job_reservations jr\n  WHERE jr.job_id = j.id\n  ORDER BY jr.created_at DESC\n  LIMIT 1\n) lr ON TRUE\nWHERE j.job_type = 'incremental-index'\n  AND j.finished_at IS NOT NULL\n  AND j.finished_at > NOW() - INTERVAL '24 hours'\nORDER BY run_seconds DESC NULLS LAST\nLIMIT 10;",
-            "refId": "A"
-          }
-        ],
-        "title": "Longest incremental-index jobs (24h)",
-        "type": "table"
       }
     ],
     "refresh": "5s",

(Run: https://github.com/cardstack/boxel/actions/runs/25835910315)

@habdelra habdelra requested a review from Copilot May 13, 2026 20:40

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f65383ce09


Comment thread packages/observability/grafanactl/resources/dashboards/boxel-status/indexing.json Outdated
Contributor

Copilot AI left a comment


Pull request overview

Polish pass on the Grafana Indexing dashboard: ensure the realm column is always populated (even for root-of-host published realms), make the Active Indexing percent gauge render at full cell width with a separate numeric column, give Queued Indexing Jobs a dedicated realm column, and relocate the two "Longest …-index jobs (24h)" tables from the Job Queue dashboard to the Indexing dashboard. Also normalizes a number of \u2014 JSON escapes to literal em-dashes.

Changes:

  • Update SQL realm extraction to fall back to the host's first label when the path is empty, and add '(all realms)' for full-reindex.
  • Hide the gauge's inline value and add a new pct column with a blank header for the percentage display.
  • Move the two longest-indexing-jobs (24h) tables from job-queue.json into indexing.json.
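
The realm fallback in the first bullet can be sketched outside SQL. The following is a minimal Python approximation of the `CASE` expression used in the relocated panels, not the dashboard's actual code: `realm_label` is a hypothetical helper name, the second example URL is made up, and Python's `re` module stands in for PostgreSQL's `REGEXP_REPLACE`.

```python
import re
from typing import Optional


def realm_label(job_type: str, realm_url: Optional[str]) -> str:
    """Approximate the dashboard's realm CASE expression (hypothetical helper)."""
    # A full-reindex job is not tied to a single realm.
    if job_type == "full-reindex":
        return "(all realms)"

    url = realm_url or ""  # mirrors COALESCE(j.args->>'realmURL', '')

    # Strip scheme and host, then trailing slashes (REGEXP_REPLACE + RTRIM).
    path = re.sub(r"^https?://[^/]+/", "", url, count=1).rstrip("/")
    if path:  # mirrors NULLIF(..., '') being non-NULL
        return path

    # Root-of-host realm: fall back to the first label of the hostname.
    return re.sub(r"^https?://([^./:]+).*$", r"\1", url, count=1)


print(realm_label("incremental-index", "https://buckpublishedsep8.staging.boxel.build/"))
print(realm_label("from-scratch-index", "https://example.test/user/realm/"))
print(realm_label("full-reindex", None))
```

Under these assumptions, a published realm URL with no path yields its subdomain, a path-based realm yields its path, and a full reindex yields the `(all realms)` placeholder.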

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
packages/observability/grafanactl/resources/dashboards/boxel-status/job-queue.json Removes the two Longest-jobs panels; replaces \u2014 escapes with em-dash characters.
packages/observability/grafanactl/resources/dashboards/boxel-status/indexing.json Adds the relocated panels, updates Active Indexing/Per-realm/Queued Indexing queries with the realm fallback, configures the static gauge and adjacent pct column.
Comments suppressed due to low confidence (1)

packages/observability/grafanactl/resources/dashboards/boxel-status/indexing.json:1369

  • Same backreference-escaping bug as in the from-scratch panel: '\\\\1' decodes to SQL '\\1', which regexp_replace treats as a literal backslash + 1 rather than as a backreference. Use '\\1' (matching the form used at lines 810/942/1077) so the root-of-host fallback emits the actual hostname first label.
            "rawSql": "SELECT\n  j.id,\n  lr.reservation_id,\n  CASE\n    WHEN j.job_type = 'full-reindex' THEN '(all realms)'\n    ELSE COALESCE(\n      NULLIF(RTRIM(REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://[^/]+/', ''), '/'), ''),\n      REGEXP_REPLACE(COALESCE(j.args->>'realmURL',''), '^https?://([^./:]+).*$', '\\\\1')\n    )\n  END AS realm,\n  j.status,\n  lr.started_at,\n  j.finished_at,\n  EXTRACT(EPOCH FROM (j.finished_at - lr.started_at)) AS run_seconds,\n  lr.worker_id\nFROM jobs j\nJOIN LATERAL (\n  SELECT jr.id AS reservation_id, jr.created_at AS started_at, jr.worker_id\n  FROM job_reservations jr\n  WHERE jr.job_id = j.id\n  ORDER BY jr.created_at DESC\n  LIMIT 1\n) lr ON TRUE\nWHERE j.job_type = 'incremental-index'\n  AND j.finished_at IS NOT NULL\n  AND j.finished_at > NOW() - INTERVAL '24 hours'\nORDER BY run_seconds DESC NULLS LAST\nLIMIT 10;",


Comment thread packages/observability/grafanactl/resources/dashboards/boxel-status/indexing.json Outdated
Comment thread packages/observability/grafanactl/resources/dashboards/boxel-status/indexing.json Outdated
…edupe percent

- Longest from-scratch-index / incremental-index jobs (24h): the realm
  fallback regex was using a JSON `'\\\\1'` (4 backslashes), which decodes
  to SQL `'\\1'` and emits a literal `\1` instead of the captured host
  label. Fix to `'\\1'` so root-of-host (published) realms render their
  subdomain — matching the form used in the other panels.
- Active Indexing: the `percent` (gauge) and `pct` (left-aligned number)
  columns held duplicate CASE expressions. Wrap the main select in a
  subquery and project `percent AS pct` from a single source expression
  so the two fields can't drift.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
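
The backslash fix described in the commit message above comes down to counting escape layers: one for JSON, one for the regex replacement text. The sketch below illustrates this in Python under two assumptions: the dashboard JSON is decoded once before the SQL reaches PostgreSQL, and Python's `re.sub` behaves like `REGEXP_REPLACE` for this replacement (both treat `\\` as a literal backslash and `\1` as the first capture group). The constant and variable names are illustrative only.

```python
import json
import re

HOST_LABEL = r"^https?://([^./:]+).*$"
URL = "https://buckpublishedsep8.staging.boxel.build/"

# The replacement as written in the dashboard file, before JSON decoding.
buggy = json.loads(r'"\\\\1"')  # four backslashes decode to two
fixed = json.loads(r'"\\1"')    # two backslashes decode to one

# Two backslashes mean "literal backslash", so the "1" is plain text.
buggy_out = re.sub(HOST_LABEL, buggy, URL, count=1)
# One backslash followed by "1" is a backreference to the first group.
fixed_out = re.sub(HOST_LABEL, fixed, URL, count=1)

print(repr(buggy), "->", repr(buggy_out))
print(repr(fixed), "->", repr(fixed_out))
```

With the four-backslash form the whole URL is replaced by a literal backslash followed by `1`; with the two-backslash form it is replaced by the captured host label.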
@habdelra habdelra requested a review from a team May 13, 2026 20:45
@backspace
Contributor

  • Active Indexing — every row shows a realm, including a published / root-of-host realm; the gauge track is the same width on every row; the digits next to the gauge change but the bar bounds don't.
  • Per-realm indexing status — no blank realm cells.
  • Queued Indexing Jobs — realm column populated for from-scratch-index / incremental-index; shows (all realms) for a queued full-reindex.
  • Longest from-scratch / incremental jobs (24h) — present on the Indexing dashboard (and no longer on Job Queue).
[Screenshots: CleanShot 2026-05-13 at 20 34 12@2x, CleanShot 2026-05-13 at 20 36 02@2x]

Thanks for adding the preview @lukemelia, so much easier to review properly 🎉

@habdelra habdelra merged commit 7856d2c into main May 14, 2026
29 checks passed