6 changes: 6 additions & 0 deletions AGENTS.md
@@ -0,0 +1,6 @@
# Documentation

Project documentation has moved to [README.md](README.md) and `docs/`.

- Read first: `README.md`
- Design docs: `docs/plans/`
98 changes: 98 additions & 0 deletions HACKATHON_SUBMISSION.md
@@ -0,0 +1,98 @@
# Agent Flight Recorder

## Problem Discovered

NullWatch already provides the observability layer for the nullclaw ecosystem:
run summaries, spans, evals, OTLP ingest, cost, token usage, and failure context.
It also exports a NullHub-compatible manifest. NullHub already provides the
operator UI and orchestration pages, but it did not register NullWatch or expose
its tracing/eval data in the UI.

## Chosen Solution

Add a local-first Observability cockpit to NullHub:

- register `nullwatch` as a known component
- proxy `/api/observability/*` to a managed NullWatch instance
- add a Flight Recorder page for runs, spans, evals, cost, tokens, and errors
- document the local demo flow through NullHub's managed install path
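
A minimal sketch of the path mapping such a proxy could perform, assuming the `/api/observability` prefix is stripped before the request is forwarded. The function name `toUpstreamUrl` and the prefix-stripping behavior are illustrative assumptions, not taken from `src/api/observability.zig`:

```typescript
// Hypothetical helper: maps a NullHub request path to an upstream NullWatch URL.
// Assumption: the proxy strips the "/api/observability" prefix before forwarding.
const PROXY_PREFIX = "/api/observability";

function toUpstreamUrl(requestPath: string, upstreamBase: string): string | null {
  // Only the prefix itself or paths below it are forwarded.
  const isProxied =
    requestPath === PROXY_PREFIX || requestPath.startsWith(PROXY_PREFIX + "/");
  if (!isProxied) {
    return null;
  }
  // Keep everything after the prefix; a bare prefix maps to the upstream root.
  const rest = requestPath.slice(PROXY_PREFIX.length) || "/";
  // Trim trailing slashes on the base so the join never produces "//".
  return upstreamBase.replace(/\/+$/, "") + rest;
}
```

Under that assumption, `toUpstreamUrl("/api/observability/v1/spans", "http://127.0.0.1:7710")` yields `http://127.0.0.1:7710/v1/spans`, while a path outside the prefix yields `null` and would be left to other NullHub handlers.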

## Why This Idea Was Chosen

This is stronger than a single CLI preflight because it connects multiple parts
of the ecosystem into a visible agent platform story: execution, orchestration,
task tracking, observability, and operations. It is still hackathon-sized because
it uses existing NullWatch APIs and NullHub UI patterns instead of changing core
agent runtime behavior.

## What Was Implemented

- NullWatch component registration in the NullHub registry.
- Observability reverse proxy with optional bearer token forwarding.
- Sidebar entry and `/observability` UI page.
- API client methods for NullWatch summary, runs, spans, evals, and health.
- README documentation for the proxy and local demo setup.

## Files Changed

- `src/installer/registry.zig`
- `src/api/observability.zig`
- `src/api/proxy.zig`
- `src/api/components.zig`
- `src/api/meta.zig`
- `src/root.zig`
- `src/server.zig`
- `ui/src/lib/api/client.ts`
- `ui/src/lib/components/Sidebar.svelte`
- `ui/src/routes/observability/+page.svelte`
- `README.md`
- `HACKATHON_SUBMISSION.md`

## How To Test Or Demo

Start NullHub:

```bash
zig build run -- serve --no-open
```

Install NullWatch from NullHub:

1. Open the web UI.
2. Go to `Install Component`.
3. Select `NullWatch`.
4. Keep or set the API port to `7710`.
5. Finish the wizard. The installer starts the NullWatch instance and NullHub
discovers it automatically.

Optional sample data can be ingested through the NullHub proxy:

```bash
curl -X POST http://127.0.0.1:19800/api/observability/v1/spans \
  -H 'Content-Type: application/json' \
  -d '{"run_id":"demo-run-1","trace_id":"trace-demo-1","span_id":"span-1","source":"nullclaw","operation":"tool.call","status":"error","started_at_ms":1710000000000,"ended_at_ms":1710000001500,"tool_name":"shell","error_message":"tool call failed: command timed out","attributes_json":"{\"exit_code\":124}"}'

curl -X POST http://127.0.0.1:19800/api/observability/v1/evals \
  -H 'Content-Type: application/json' \
  -d '{"run_id":"demo-run-1","eval_key":"tool_success","scorer":"deterministic","score":0.0,"verdict":"fail","dataset":"demo","notes":"The tool call timed out."}'
```
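
The span body in the first request carries `attributes_json` as a JSON document encoded inside a string, which is easy to get wrong when escaping by hand. The following sketch builds the same body programmatically. The field names come from the request above; the `SpanPayload` interface and `buildDemoSpan` function are illustrative names, not part of the NullWatch or NullHub code:

```typescript
// Illustrative type: field names mirror the demo request; the type itself is an assumption.
interface SpanPayload {
  run_id: string;
  trace_id: string;
  span_id: string;
  source: string;
  operation: string;
  status: string;
  started_at_ms: number;
  ended_at_ms: number;
  tool_name: string;
  error_message: string;
  attributes_json: string; // JSON encoded as a string, not a nested object
}

function buildDemoSpan(): SpanPayload {
  return {
    run_id: "demo-run-1",
    trace_id: "trace-demo-1",
    span_id: "span-1",
    source: "nullclaw",
    operation: "tool.call",
    status: "error",
    started_at_ms: 1710000000000,
    ended_at_ms: 1710000001500,
    tool_name: "shell",
    error_message: "tool call failed: command timed out",
    // Serialize the attributes separately so the outer JSON escapes them correctly.
    attributes_json: JSON.stringify({ exit_code: 124 }),
  };
}

const demoSpan = buildDemoSpan();
// Request body, equivalent to the -d argument of the first curl command.
const demoSpanBody = JSON.stringify(demoSpan);
// Span duration in milliseconds, derived from the two timestamps.
const demoSpanDurationMs = demoSpan.ended_at_ms - demoSpan.started_at_ms;
```

Here `demoSpanDurationMs` is 1500, matching the 1.5 second gap between the two timestamps in the demo request.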

Open `/observability` in NullHub and inspect the NullWatch runs.

## Screenshots

Flight Recorder overview:

![NullHub Observability overview](docs/screenshots/nullhub-observability-overview.png)

Failure detail with tool-call error context:

![NullHub Observability failure detail](docs/screenshots/nullhub-observability-failure.png)

## Limitations And Future Improvements

- `NULLWATCH_URL` remains useful for pointing NullHub at an external NullWatch
instance, but the default demo path uses a managed NullWatch install.
- The first UI version renders a compact timeline, not a full waterfall chart.
- Run correlation with NullBoiler orchestration pages can be added as a follow-up
when both systems share stable run ids.
48 changes: 46 additions & 2 deletions README.md
@@ -7,21 +7,22 @@ Management hub for the nullclaw ecosystem.

`NullHub` is a single Zig binary with an embedded Svelte web UI for installing,
configuring, monitoring, and updating ecosystem components (NullClaw, NullBoiler,
NullTickets).
NullTickets, NullWatch).

## Features

- **Install wizard** -- manifest-driven guided setup with component-aware flows and local `NullTickets -> NullBoiler` linking
- **Process supervision** -- start, stop, restart, crash recovery with backoff
- **Health monitoring** -- periodic HTTP health checks, dashboard status cards
- **Cross-component linking** -- auto-connect `NullTickets -> NullBoiler`, generate native tracker config, and inspect queue/orchestrator status from one UI
- **Config management** -- structured editors for `NullClaw`, `NullBoiler`, and `NullTickets`, with raw JSON fallback when needed
- **Config management** -- structured editors for `NullClaw`, `NullBoiler`, `NullTickets`, and `NullWatch`, with raw JSON fallback when needed
- **Log viewing** -- tail and live SSE streaming per instance
- **One-click updates** -- download, migrate config, rollback on failure
- **Multi-instance** -- run multiple instances of the same component side by side
- **Web UI + CLI** -- browser dashboard for humans, CLI for automation
- **Managed instance admin API** -- instance-scoped status, config, models, cron, channels, and skills routes for managed NullClaw installs
- **Orchestration UI** -- workflow editor, poll-based run monitoring, checkpoint forking, encoded workflow/run/store links, and key-value store browser (proxied to NullTickets through NullHub)
- **Observability cockpit** -- local NullWatch run summaries, span timelines, eval results, token usage, cost, and error context through a NullHub proxy

## Quick Start

@@ -119,6 +120,47 @@ to the local orchestration stack. Most routes go to NullBoiler's REST API via
`/api/orchestration/store/*` is proxied to NullTickets via `NULLTICKETS_URL` and
optional `NULLTICKETS_TOKEN`.

**Observability proxy** -- requests to `/api/observability/*` are reverse-proxied
to the managed NullWatch instance installed in NullHub. `NULLWATCH_URL` can
still override the target for an external NullWatch instance, and
`NULLWATCH_TOKEN` overrides the managed instance token when set. The built-in
Observability page uses this proxy to display run summaries, spans, evals,
latency, cost, and failure context without sending data to hosted services.
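
The override rules above can be sketched as a small resolution function. This is an illustration under stated assumptions, not the logic in `src/api/observability.zig`: the names `resolveNullWatchTarget`, `ManagedInstance`, and `ProxyEnv` are hypothetical, and the choice not to send the managed token to an external URL is an assumption the text above does not specify.

```typescript
// Hypothetical types for illustration only.
interface ManagedInstance {
  url: string;
  token?: string;
}

interface ProxyEnv {
  NULLWATCH_URL?: string;
  NULLWATCH_TOKEN?: string;
}

interface ProxyTarget {
  url: string;
  headers: Record<string, string>;
}

function resolveNullWatchTarget(
  env: ProxyEnv,
  managed: ManagedInstance | null,
): ProxyTarget | null {
  // NULLWATCH_URL, when set, takes precedence over the managed instance.
  const externalUrl = env.NULLWATCH_URL;
  const url = externalUrl || managed?.url;
  if (!url) {
    return null; // nothing to proxy to
  }
  // Assumption: the managed token is only a fallback for the managed URL.
  const fallbackToken = externalUrl ? undefined : managed?.token;
  // NULLWATCH_TOKEN, when set, overrides the managed instance token.
  const token = env.NULLWATCH_TOKEN || fallbackToken;
  const headers: Record<string, string> = {};
  if (token) {
    headers["Authorization"] = `Bearer ${token}`;
  }
  return { url, headers };
}
```

With an empty environment and a managed instance on port `7710`, the sketch returns the managed URL and, if the instance has a token, an `Authorization: Bearer` header. With no managed instance and no `NULLWATCH_URL`, it returns `null`.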

Local NullWatch setup:

1. Start NullHub:

```bash
zig build run -- serve --no-open
```

2. In the web UI, open **Install Component**, select **NullWatch**, keep or set
the API port to `7710`, and finish the wizard. The installer starts the
NullWatch instance and the observability proxy discovers it automatically.

3. Optional demo data can be ingested through the NullHub proxy:

```bash
curl -X POST http://127.0.0.1:19800/api/observability/v1/spans \
  -H 'Content-Type: application/json' \
  -d '{"run_id":"demo-run-1","trace_id":"trace-demo-1","span_id":"span-1","source":"nullclaw","operation":"tool.call","status":"error","started_at_ms":1710000000000,"ended_at_ms":1710000001500,"tool_name":"shell","error_message":"tool call failed: command timed out","attributes_json":"{\"exit_code\":124}"}'

curl -X POST http://127.0.0.1:19800/api/observability/v1/evals \
  -H 'Content-Type: application/json' \
  -d '{"run_id":"demo-run-1","eval_key":"tool_success","scorer":"deterministic","score":0.0,"verdict":"fail","dataset":"demo","notes":"The tool call timed out."}'
```

### Observability Screenshots

Flight Recorder overview:

![NullHub Observability overview](docs/screenshots/nullhub-observability-overview.png)

Failure detail with tool-call error context:

![NullHub Observability failure detail](docs/screenshots/nullhub-observability-failure.png)

## Development

Testing strategy and roadmap live in [TESTING.md](TESTING.md).
@@ -159,12 +201,14 @@ src/
  auth.zig                # Optional bearer token auth
  api/                    # REST endpoints (components, instances, wizard, ...)
    orchestration.zig     # Reverse proxy to NullBoiler orchestration API
    observability.zig     # Reverse proxy to NullWatch tracing/eval API
  core/                   # Manifest parser, state, platform, paths
  installer/              # Download, build, UI module fetching
  supervisor/             # Process spawn, health checks, manager
ui/src/
  routes/                 # SvelteKit pages
    orchestration/        # Orchestration pages (dashboard, workflows, runs, store)
    observability/        # NullWatch flight recorder page
  lib/components/         # Reusable Svelte components
    orchestration/        # GraphViewer, StateInspector, RunEventLog, InterruptPanel,
                          # CheckpointTimeline, WorkflowJsonEditor, NodeCard, SendProgressBar
2 changes: 1 addition & 1 deletion docs/superpowers/specs/2026-03-18-report-command-design.md
@@ -14,7 +14,7 @@ Create GitHub issues with pre-filled system data from CLI and Web UI.

Order is fixed (as listed above) in both CLI and Web UI selectors.

The target list is hardcoded in `report.zig`, independent of `known_components` in `registry.zig`. nullwatch exists as a repo but is not yet in the component registry.
The target list is hardcoded in `report.zig`, independent of `known_components` in `registry.zig`. `nullwatch` is also a known installable component, so report target metadata should stay aligned with the registry entry.

## Report types and labels

18 changes: 14 additions & 4 deletions src/api/components.zig
@@ -1,5 +1,6 @@
const std = @import("std");
const std_compat = @import("compat");
const builtin = @import("builtin");
const registry = @import("../installer/registry.zig");
const paths_mod = @import("../core/paths.zig");
const state_mod = @import("../core/state.zig");
@@ -21,7 +22,12 @@ pub fn deriveDisplayName(allocator: std.mem.Allocator, name: []const u8) ![]cons

/// Check if a component has a standalone installation at ~/.{component}/config.json
fn hasStandaloneInstall(allocator: std.mem.Allocator, component: []const u8) bool {
    const home = std_compat.process.getEnvVarOwned(allocator, "HOME") catch return false;
    const home = std_compat.process.getEnvVarOwned(allocator, "HOME") catch blk: {
        if (builtin.os.tag == .windows) {
            break :blk std_compat.process.getEnvVarOwned(allocator, "USERPROFILE") catch return false;
        }
        return false;
    };
    defer allocator.free(home);
    const dot_name = std.fmt.allocPrint(allocator, ".{s}", .{component}) catch return false;
    defer allocator.free(dot_name);
@@ -176,7 +182,7 @@ test "deriveDisplayName capitalizes first letter" {
try std.testing.expectEqualStrings("", name3);
}

test "handleList returns valid JSON with all 3 known components" {
test "handleList returns valid JSON with all known components" {
const allocator = std.testing.allocator;
var fixture = try test_helpers.TempPaths.init(allocator);
defer fixture.deinit();
@@ -192,30 +198,34 @@ test "handleList returns valid JSON with all 3 known components" {
try std.testing.expect(std.mem.startsWith(u8, json, "{\"components\":["));
try std.testing.expect(std.mem.endsWith(u8, json, "]}"));

// Verify all 3 components are present
// Verify all components are present
try std.testing.expect(std.mem.indexOf(u8, json, "\"nullclaw\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"nullboiler\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"nulltickets\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"nullwatch\"") != null);

// Verify display names
try std.testing.expect(std.mem.indexOf(u8, json, "\"NullClaw\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"NullBoiler\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"NullTickets\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"NullWatch\"") != null);

// Verify descriptions are present
try std.testing.expect(std.mem.indexOf(u8, json, "Autonomous AI agent runtime") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "DAG-based workflow orchestrator") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "Task and issue tracker") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "Headless observability") != null);

// Verify repo fields
try std.testing.expect(std.mem.indexOf(u8, json, "\"nullclaw/nullclaw\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"nullclaw/NullBoiler\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"nullclaw/nulltickets\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"nullclaw/nullwatch\"") != null);

// Verify structural fields
try std.testing.expect(std.mem.indexOf(u8, json, "\"alpha\"") != null);
try std.testing.expectEqual(@as(usize, 2), std.mem.count(u8, json, "\"alpha\":true"));
try std.testing.expectEqual(@as(usize, 1), std.mem.count(u8, json, "\"alpha\":false"));
try std.testing.expectEqual(@as(usize, 2), std.mem.count(u8, json, "\"alpha\":false"));
try std.testing.expect(std.mem.indexOf(u8, json, "\"installed\"") != null);
try std.testing.expect(std.mem.indexOf(u8, json, "\"instance_count\"") != null);
}