feat: add humanize monitor web dashboard for RLCR sessions#63
feat: add humanize monitor web dashboard for RLCR sessions#63SihaoLiu merged 30 commits intoPolyArch:betafrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6771bda603
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9aa75a59bf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
615fcde to
1b575fe
Compare
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
16dbf50 to
a1f3614
Compare
|
@codex review this PR as entirety |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1b575fe70a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
c6c5e63 to
a43cfec
Compare
a43cfec to
3039e7e
Compare
450d91e to
be29c41
Compare
Introduces the `humanize monitor web` command and its Flask + SPA dashboard for RLCR sessions, plus the accumulated review-phase fixes and UX polish. Feature - Per-project session discovery and summary (state.md, goal tracker, round summaries, review results, methodology reports). - Pipeline canvas with snake-path node layout, SVG connectors, pan / zoom, and per-round flyout details. - Session-detail page carries a below-canvas live-log panel with a three-state toggle (collapsed / normal / expanded) driven by --log-h and a single grid-template-rows declaration. Clicking a pipeline node auto-collapses the log; closing the flyout restores the prior state. - Incremental pipeline update (_updatePipelineIncremental) appends or replaces individual nodes and redraws only the SVG connectors, preserving zoom / pan across round additions. - Diff-based home refresh: only cards whose rendered HTML changed are replaced; unchanged cards stay untouched. - Per-session SSE live-log streaming from ~/.cache/humanize with snapshot / append / resync / eof / retained-replay protocol. - Active-log selection mirrors the CLI monitor: pick the top round's codex-run log, with codex-review / gemini-run / gemini-review fallbacks. - REST + WebSocket + SSE transports. Localhost binds skip auth; remote binds require a bearer token (Authorization header for fetch, ?token= query for SSE) plus origin-matched CSRF gating on every mutating route. - Ops-menu Preview Issue flow: calls local Claude CLI to generate a sanitized methodology report (humanize issue taxonomy), shows a preview modal, and submits via `gh issue create --repo humania-org/humanize`. Falls back to a copy-to-clipboard path when gh is unauthenticated. - Active-session dynamic progress visualization, methodology report generation via local Claude CLI, and a comprehensive frontend test suite. Correctness and safety - broadcast_message uses set.difference_update to avoid the UnboundLocalError from Python treating `-=` as a local rebind. - Session path validation rejects `..`, dotfiles, symlink escapes; plan-file reads are constrained to the project / session dir. - Remote-mode auth fail-closed when no token is configured. - CSRF layer enforced on every mutating route, including wildcard binds and IPv6 literals. - Parser accepts table-form acceptance criteria and legacy / single-letter criterion ids; multi-id table cells are split correctly; COMPLETE verdict only fires on terminal lines. - Finalize-phase classification scoped to the live round, and skip-impl sessions correctly classify round 0 as review-only. Transport and watcher - watcher.py starts a cache-log observer per active session that broadcasts `round_added` when a new round-N-*.log file appears, so the dashboard switches the live-log pane to the new round without waiting for the next state.md write. - SessionWatcher.on_session_created hook primes cache-log observers for brand-new sessions; every event retries the hook so the state-dir-before-cache-dir race eventually heals. - Non-destructive toggleTheme (no DOM rebuild). The analytics route is the only surface whose charts bake CSS vars into SVG at render time, so that route alone re-renders on theme change. - Remote-mode slow polling (10s) uses the same targeted refresh helpers as the WS path so card counters and pipeline nodes catch up without a full-page reload. - Session-detail refresh race guard: after awaiting /api/sessions/<id>, re-check the active route + layout skeleton before mutating DOM. Install and CLI - Preflight codex_hooks and idempotent hook merge when paths contain spaces. - Relaxed runtime detection and tolerate unwritable default shim paths. - `humanize monitor web` foreground mode passes --host explicitly, IPv6 brackets in the printed URL, readiness probe is fail-closed. Parser and API - /api/sessions summary carries every field the terminal `humanize monitor rlcr` status bar needs (push_every_round, ask_codex_question, agent_teams, drift, stall count, ultimate_goal, tasks_active / deferred, active_log_path, git_status, etc.). - cache_logs entries carry round / tool / role / path / basename for deterministic client-side selection. Version: 1.17.0. Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
be29c41 to
c0c4835
Compare
|
@codex review this PR as entirety against branch dev |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c0c4835c59
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
`scripts/humanize.sh` is meant to be sourced into the user's interactive shell (the README shows `source humanize.sh` followed by `humanize monitor web`). Using `exec` to hand control to the Flask process replaced that shell, so pressing Ctrl+C — or any server exit — took the whole interactive session down with it. Replace the `exec` with a plain subprocess invocation so the function returns normally when the server stops and the user's shell prompt stays alive. Daemon mode still delegates to `viz/scripts/viz-start.sh` and is unaffected. Fixes Codex review P1 on PR PolyArch#63 (PolyArch#63 (comment)). Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
viz-restart.sh advertised `viz-restart.sh [--project <path>]` in
its usage string but ran `PROJECT_DIR="${1:-.}"`, which treated
the literal flag name `--project` as a directory. Calling the
documented form would fail at `cd --project` before the restart
could happen.
Mirror the flag-parsing loop used by viz-start.sh and viz-stop.sh
(positional path plus `--project <path>` named form), and forward
`--project` to both helpers so the whole chain uses the same
resolved absolute path.
Fixes Codex review P2 on PR PolyArch#63
(PolyArch#63 (comment)).
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
The six Chart.js analytics panels on /#/analytics (rounds per session, avg round duration, verdict distribution, P0-P9 issues, first COMPLETE round, BitLesson growth) were dropped in the home-analytics refactor; the home page now carries the same summary numbers as inline stat cards plus a lightweight inline- SVG sparkline, and the analytics route surfaces the verdict timeline + session comparison table without any Chart.js panels. Clean up the remaining dead references that Codex flagged: - Delete viz/static/js/charts.js (no longer imported). - Drop the Chart.js CDN <script> from index.html. - Drop the /js/charts.js <script> tag from index.html. - Remove the six chart-panel i18n keys from i18n.js — no other caller references them. - Update tests/test-viz.sh's static-file presence check to match. Fixes Codex review P2 on PR PolyArch#63 (PolyArch#63 (comment)). Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
|
@codex review this PR as entirety against branch dev |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4ebd4ee2d9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
The forbidden-token pattern for imports was
`\b(?:import|require|from)\s+\w+`, which matches ordinary English
prose like "drifted from the original plan structure". That one
phrase appears verbatim in the built-in `plan_execution`
methodology observation text, so every sanitized issue payload
that picked `plan_execution` as a dominant category was marked as
having warnings, and `/api/sessions/<id>/github-issue` rejected the
submission with a 400 even after the outbound body had already
been assembled from the constrained taxonomy.
Replace the single pattern with three code-anchored variants:
- `^\s*import\s+[\w.]+` (Python `import x.y` at line start)
- `^\s*from\s+[\w.]+\s+import\b` (Python `from x import y` at line start)
- `\brequire\s*\(` (JS / Node `require(` call syntax)
These still catch real import fragments that leak into a
methodology report (bare `import os`, indented ` import sys`,
`from viz.server import app`, `require("fs")`) but leave natural
prose using the preposition "from" alone.
Fixes Codex review P1 on PR PolyArch#63
(PolyArch#63 (comment)).
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
`read_plan_file()` gated its open() on `os.path.exists(candidate_real)`, which is True for directories as well as files. A state.md that carried `plan_file: .` (or any directory under the project / session that happened to resolve inside the allowed prefix) would therefore pass validation, skip the backup fallback, and drop into `open(candidate_real, 'r')` which raises IsADirectoryError. The exception propagated out through /api/sessions/<id>/plan as an uncaught 500 instead of surfacing the controlled fallback to the session-local plan.md (or the intended 404 when no backup exists). Swap the existence check for `os.path.isfile`, which is directory- safe and also returns False for broken symlinks. Directories and dangling symlinks now fall through to the existing backup branch. Fixes Codex review P2 on PR PolyArch#63 (PolyArch#63 (comment)). Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
CI route-backed truncation test (tests/test-app-routes-live.sh Group 6)
is prone to a watcher/writer race: when the writer does
``with open(path, 'wb') as f: f.write(data)`` the file is truncated
to 0 bytes before the second write lands. The per-cache watcher's
callback can fire on that first IN_MODIFY event and invoke
``_poll_locked`` while the file is still empty, so the resync path's
inline ``_snapshot_locked`` emits zero snapshot events and the content
subsequently arrives as ``append`` — violating the protocol contract's
resync -> snapshot sequencing that downstream clients rely on.
Track a ``_resync_pending`` flag that every resync path sets whenever
its immediate snapshot attempt returned empty (file was 0 bytes at
the time). The next poll that observes ``size > self._offset`` while
the flag is set emits the new content via ``_snapshot_locked`` instead
of ``_append``, and clears the flag once caught up. Normal write paths
are unaffected: the flag stays False when the resync snapshot sees
content immediately, so the append fast-path still covers steady-state
streaming.
Fixes the GitHub Actions ``run-all-tests`` failure on Round 20:
FAIL: route-backed truncation event stream incomplete:
snapshots=1 resync_truncated=True eof=True
The three existing resync sites (recreated, rotated, truncated) share
the same pattern so the fix is uniform across all of them.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
|
@codex review this PR as entirety against dev branch |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 86f0de9b00
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Round 21 Codex P1 (PRRT_kwDOQ4a3IM57_efh). `_TERMINAL_STATE_FILES` listed `methodology-analysis-state.md` and `finalize-state.md` among the markers that tell the SSE generator to emit `eof` and close the stream. Both markers are actually *active* states — `_is_terminal_status` excludes `analyzing` / `finalizing`, and the dashboard still treats those sessions as cancellable — so `_session_is_terminal_cheap` was mis-classifying live sessions as finished. The observable impact was that as soon as a session entered finalize or methodology-analysis, every live SSE client saw an immediate `eof` event and the log pane stopped updating, even though the backing codex-run / codex-review log files were still being written. Trim the marker list to the five truly-terminal states (complete, cancel, stop, maxiter, unexpected) so SSE now keeps delivering append events throughout the analyzing/finalizing phases. A short comment pins this list to `_is_terminal_status` so future markers don't drift out of sync again. Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
Round 21 Codex P2 (PRRT_kwDOQ4a3IM57_efi). `parse_git_status` in `viz/server/parser.py` checked `x == 'M' or y == 'M'` before the index-side `A` branch. That made a porcelain status like `AM` (new file added to the index, then modified in the worktree — the common "stage a new file, then tweak it" workflow) count as `modified` and never as `added`, so the dashboard git summary drifted out of parity with `humanize_parse_git_status` in `scripts/humanize.sh`, which explicitly maps `AM` to `added`. Reorder the branches so any index-side `A` wins first (covers `A `, `AM`, `AD`), followed by `R` (rename → modified) then `D` (deletion), and only then the `M` fallback. Matches the shell helper's explicit priority exactly and stops the dashboard header from undercounting added files whenever the user has a staged-new-file with an in-progress edit. Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
|
@codex review this PR as entirety against dev branch |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 08ab1d4547
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Round 22 Codex P1 (PRRT_kwDOQ4a3IM57_u12). ``LogStreamRegistry.get_or_create`` had no matching delete path, so every ``(session_id, basename)`` ever opened lived until the dashboard process exited. Each ``LogStream`` keeps a 256-event retention deque (often large base64 snapshot/append chunks), so browsing many sessions/round logs grew memory without bound on long-running viz servers. Extend the registry with explicit ``acquire`` / ``release`` helpers that ref-count active SSE consumers, and expose ``LogStream.eof_emitted`` for the release path to query. Eviction policy: - refcount > 0: keep the stream (clients still reading). - refcount == 0 AND stream has emitted EOF: drop from the registry and free the retention deque. - refcount == 0 AND session still active (no EOF yet): keep the stream so future reconnects still land inside the 256-event ``Last-Event-Id`` replay window the streaming contract mandates. Wire the SSE route to ``_acquire_log_stream`` / ``_release_log_stream``: the release now sits in the same generator ``finally`` that already releases the cache watcher, so normal EOF, client disconnect, and exception paths all balance the refcount. ``get_or_create`` stays callable (tests exercise it directly without refcount semantics). Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
Round 22 Codex P2 (PRRT_kwDOQ4a3IM57_u13). The SPA fallback route probed ``os.path.isfile(os.path.join(STATIC_DIR, path))`` before handing off to ``send_from_directory``. ``os.path.join`` does not reject traversal segments, so a request like ``GET /..%2f..%2fetc%2fpasswd`` differentiated its response based on whether the traversal target existed on disk: - existing target → ``send_from_directory`` branch → 404 - missing target → SPA fallback → 200 ``index.html`` Because the static route is intentionally unauthenticated, that turned the dashboard into an unauthenticated filesystem-existence oracle for any path readable by the server process. Swap ``os.path.join`` for ``werkzeug.utils.safe_join``: it returns ``None`` whenever the resolved path would escape ``STATIC_DIR``, so traversal attempts skip the ``os.path.isfile`` probe entirely and fall straight through to the SPA fallback. Legitimate static assets keep working (they resolve to an absolute path inside STATIC_DIR and pass both checks). No new CDN dep is introduced — ``werkzeug.utils`` is already imported transitively via Flask. Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
|
@codex review this PR as entirety against dev branch |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bfec5afa83
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Round 23 Codex P2 (PRRT_kwDOQ4a3IM57_zC9). The earlier eviction path only dropped a stream when ``refcount`` reached zero AND ``eof_emitted`` was already true. A client that disconnected before its session reached a terminal state therefore left the stream (and its 256-event retention deque, often full of base64 snapshot/append chunks) resident for the process lifetime — no later code path marks EOF on an unpolled stream whose session quietly ended. On long-lived dashboard instances where users briefly browse many active sessions, that accumulated memory without bound. Add a second eviction axis: an idle-TTL timer recorded whenever a release drops the refcount to zero without EOF. Every ``release`` doubles as an opportunistic sweep — entries whose idle timestamp is older than ``IDLE_STREAM_TTL_SECONDS`` (5 min) and still have refcount=0 are dropped regardless of EOF status. ``acquire`` clears the idle timestamp so normal reconnect-within-window flows keep retention (the streaming contract's 256-event ``Last-Event-Id`` replay window), while late reconnects fall through to the existing ``resync(overflow)`` + fresh-snapshot path. The constructor accepts an override so tests can drive eviction with a deterministic short TTL without waiting 5 minutes of wall-clock time. Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
Round 23 Codex P1 (PRRT_kwDOQ4a3IM57_zC8). ``/api/analytics`` fed the
raw output of ``list_sessions(PROJECT_DIR)`` into ``compute_analytics``
without the ``_is_safe_session_id`` filter that ``/api/sessions``
already applies. A crafted on-disk session directory whose name
contained quote / JS metacharacters could therefore reach the
Analytics comparison table, where ``buildCmpTable`` interpolated
``s.session_id`` directly into an inline
``onclick="navigate('#/session/${id}')"`` handler plus cell HTML —
the same XSS vector the rest of the API had already closed.
Two layers of defence:
1. Backend: drop sessions whose id fails ``_is_safe_session_id``
before handing them to the analyzer, so the Analytics payload
cannot carry an unsafe id at all.
2. Frontend: stop interpolating the id through an inline
``onclick`` string literal. ``buildCmpTable`` now emits a
``data-session-id`` attribute (escaped via ``_esc``) and binds
navigation through a delegated click listener; every other cell
value is also escaped for consistency. Even if a future backend
regression lets an unsafe id slip past the filter, the value
flows only through ``dataset`` (a DOM-level string that is never
re-parsed as JS) and ``window.location`` — neither evaluates
markup.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0cde0f712a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Round 24 Codex P2 (PRRT_kwDOQ4a3IM57_5AV). ``compute_analytics`` read ``s['current_round']`` as if it were a round count, but the field is a 0-based *index* — a session that finished round 0 reports ``current_round=0`` even though its ``rounds`` list has one entry. Two knock-on effects on the Analytics UI: - ``overview.average_rounds`` undercounted every non-empty session by one. The prior filter ``current_round > 0`` also excluded single-round sessions entirely, so the average silently skipped the shortest runs instead of representing them. - ``session_stats[*].rounds`` (the comparison table's ``Rounds`` column) reported ``0`` for sessions whose only round was round 0 and ``N-1`` for sessions with N rounds completed. Switch both sites to ``len(s['rounds'])``. The parser already constructs ``rounds`` from ``range(max_disk_round + 1)`` so its length is the authoritative completed-round count, and it naturally covers sessions where on-disk review files exceed ``current_round`` (the parser's own ``max_disk_round`` rule). Drop the ``> 0`` filter in favour of ``n > 0`` on the derived length so single-round sessions now participate in the average. Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
|
@codex review this PR as entirety against dev branch |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 53f5b98493
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Round 25 Codex P2 (PRRT_kwDOQ4a3IM58A-pN). The GitHub-issue generator
rendered ``A {current_round}-round RLCR session ...`` where
``current_round`` is a 0-based *index*. A session whose only round
was round 0 reported ``0-round`` in the outbound issue text, and
every other session lost one round in the summary.
Swap to ``len(s['rounds'])``, matching the Round 24 fix in
``compute_analytics``. The parsed ``rounds`` list is the authoritative
count: the parser builds it from ``range(max_disk_round + 1)`` so it
covers both the normal ``current_round + 1`` case and the edge case
where on-disk review files exceed ``current_round``. The sanitized
taxonomy-only issue body now reflects the real session size.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
Round 25 Codex P2 (PRRT_kwDOQ4a3IM58A-pP). The Markdown exporter put
``session['current_round']`` into the ``Rounds`` overview cell, which
is a 0-based round index. A session whose only round was round 0
therefore exported ``| Rounds | 0 |`` and every other session lost
one round in the generated report. Downstream consumers (users
archiving session summaries, anything grepping the export) saw
incorrect session totals.
Use ``len(session.get('rounds') or [])`` to match the analytics and
sanitized-issue fixes from Round 24/25: the parsed rounds list is
the authoritative count regardless of how ``current_round`` relates
to ``max_disk_round`` (they can diverge when on-disk review files
exceed ``current_round``). The guard against a missing ``rounds``
key also hardens the exporter so a partially-parsed session cannot
raise AttributeError mid-render.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
|
@codex review this PR as entirety against dev branch |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a3655147ae
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Round 26 Codex P2 (PRRT_kwDOQ4a3IM58BEtU). The sanitized GitHub issue
body's ``Quantitative Summary`` table still read ``| Total rounds |
{s["current_round"]} |``. ``current_round`` is a 0-based round
index, so the table under-reported every session by one (``0`` for
a single-round session, ``N-1`` for an N-round session) even though
the Context line above already switched to ``round_total`` in
Round 25.
Reuse the ``round_total = len(s.get('rounds') or [])`` value already
computed earlier in the same function so the Quantitative Summary
and the Context paragraph agree on the session size. Downstream
issue readers no longer see conflicting counts between the two
sections of the generated report.
Signed-off-by: Chao Liu <chao.liu.zevorn@gmail.com>
|
@codex review this PR as entirety against dev branch |
|
Codex Review: Didn't find any major issues. Already looking forward to the next diff. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review this PR as entirety one more time |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6812053a13
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| exit 3 | ||
| } | ||
|
|
||
| SESSION_DIR="$PROJECT_ROOT/.humanize/rlcr/$SESSION_ID" |
There was a problem hiding this comment.
Validate session id before building cancel target path
SESSION_ID is interpolated directly into SESSION_DIR, so values like ../foo escape .humanize/rlcr and let this helper rename state files outside the intended session tree. In practice, cancel-rlcr-session.sh --project <proj> --session-id ../foo will operate on <proj>/.humanize/foo/state.md if present, which breaks the session-scoped safety contract and can mutate unrelated project state if any caller forwards unvalidated ids.
Useful? React with 👍 / 👎.
| self._idle_since[key] = time.monotonic() | ||
| self._sweep_idle_streams_locked() |
There was a problem hiding this comment.
Evict idle streams without requiring a later release call
Idle stream eviction only runs synchronously inside release(), so a stream that drops to refcount 0 without EOF is never removed if no subsequent release() happens after TTL. That leaves LogStream entries (including retained event buffers) resident for process lifetime after a one-off disconnect on an active session, which is a real memory leak in long-lived dashboard processes with low churn.
Useful? React with 👍 / 👎.
Summary
humanize monitor websubcommand toscripts/humanize.sh— a per-project Flask + SPA dashboard that reads the same.humanize/rlcr/data the terminal monitors use, runs in the foreground by default, and serves a live pipeline graph + sidebar + streaming cache-log pane.PolyArch/humanizeviagh issue create— all reachable from the session-detail ops menu./humanize:vizClaude command; its skill and slash-command entries are removed in favor of the CLI subcommand (foreground first;--daemonopt-in for tmux-backed daemon mode).How to use
Open
http://localhost:18000in a browser. The home page lists every session under the project; click an active card to enter the session detail page. The bottom live-log panel streams the most recent round'sround-N-codex-run.log; it auto-switches when a new round starts. A three-state toggle (collapsed / normal / expanded) and a node-click auto-fold keep the pipeline canvas legible.From the ops menu (
⋯top-right on the detail page), Preview Issue invokes local Claude to generate the methodology report (if missing), renders a sanitized preview modal, and can submit the resulting issue toPolyArch/humanizeviagh issue create(requiresgh auth loginonce).Architecture
Backend (Python/Flask, auto-managed venv in
.humanize/viz-venv):viz/server/app.py— REST + WebSocket + SSE endpoints; bearer-token auth (header for fetch,?token=for SSE); same-origin CSRF gate on every mutating route; cancel endpoint routed through project-scoped helper; Claude-CLI-driven report generation;gh issue createwrapper.viz/server/parser.py— RLCR session parser (YAML frontmatter, goal-tracker markdown, round summaries, review results, methodology reports). Accepts list-form and table-form acceptance criteria plus legacy / dashless / nested id forms.viz/server/watcher.py— watchdog observer on.humanize/rlcr/plus a per-session cache-dir observer that broadcastsround_addedthe momentround-N-*.logappears, so the frontend switches the live-log pane without waiting for the next state.md write.viz/server/log_streamer.py— per-session SSE log stream: snapshot / append / resync / eof with 256-event Last-Event-Id replay.viz/server/rlcr_sources.py— cache-dir derivation and filename parsing; shared by the backend and tests.Frontend (vanilla HTML/CSS/JS, zero build step):
Integration
scripts/humanize.sh— dispatchesmonitor webto_humanize_monitor_web, which auto-provisions the venv, finds a free port in 18000–18099, probes the configured bind for liveness, fails closed on a non-loopback bind without--auth-token, and eitherexecs Flask in the foreground or delegates to the opt-in tmux launcher.viz/scripts/viz-start.sh / viz-stop.sh / viz-status.sh— the opt-in--daemonpath uses per-project tmux session names (SHA-1 of the project path) withtmux -t =nameexact-match so prefix collisions cannot touch another project's dashboard. Startup probe fails closed and cleans upviz.port/viz.urlwhen readiness never lands.Security posture
--auth-token(orHUMANIZE_VIZ_TOKEN) and refuses to start without one.Host/Origin(handles--host 0.0.0.0wildcard binds and IPv6 literals)..., dotfiles, and symlink escapes; plan-file reads are constrained to the project or session directory.Test plan
humanize monitor web --project <path>launches in the foreground and serves the dashboard on the chosen port.humanize monitor web --host 0.0.0.0 --port 18000 --auth-token <tok>serves remotely and rejects unauthenticated requests.PolyArch/humanizewhenghis authenticated.test-viz.sh,test-viz-isolation.sh,test-streaming.sh,test-app-auth.sh,test-app-routes-live.sh,test-cancel-session.sh,test-frontend-migration.sh,test-rlcr-sources.sh,test-style-compliance.shall pass.test-template-loader.sh,test-bash-validator-patterns.sh, robustness/*, etc.).