Skip to content

Harden CodexBar cold-start menu readiness#1150

Closed
AmrMohamad wants to merge 8 commits into
steipete:mainfrom
AmrMohamad:enhance/UI-performance-issues
Closed

Harden CodexBar cold-start menu readiness#1150
AmrMohamad wants to merge 8 commits into
steipete:mainfrom
AmrMohamad:enhance/UI-performance-issues

Conversation

@AmrMohamad
Copy link
Copy Markdown
Contributor

Summary

This PR hardens CodexBar’s first-launch menu readiness path after macOS boot. The issue we investigated was an unstable first-open Codex tab where expensive adjunct UI, especially charts and hosted submenu content, could remain absent until the user manually refreshed or switched tabs.

The core fix is not a visual redesign. It keeps the UI shape intact and improves the lifecycle behind it: menu readiness invalidation, hosted submenu hydration, cached token data availability, duplicate-launch protection, and cold-start validation.

Why this changed

CodexBar’s menu is an AppKit NSStatusItem / NSMenu surface with SwiftUI-hosted rows and lazily hydrated submenus. On first launch after boot, several data sources can become ready after the menu has already been opened:

  • OpenAI dashboard attachment and usage data
  • credits state
  • Codex token-cost cache
  • plan utilization history
  • hosted chart submenu data

Before this change, the menu invalidation path was too coarse for those adjunct readiness transitions. A parent menu could remain open with stale structure, and a hosted submenu could stay stuck on its placeholder even after the backing data became available.

What changed

  • Adds an explicit menu adjunct readiness signature so open menus can refresh when dashboard, credits, token cache, or plan-history readiness changes.
  • Rebuilds open hosted submenu content from current store state instead of only resizing stale hosted views.
  • Requests an OpenAI dashboard refresh when the Codex parent menu opens and the chart history is stale.
  • Hydrates cached Codex token snapshots at startup so local token charts can appear without waiting for a fresh scanner pass.
  • Tracks plan utilization history revisions so menu readiness can respond to newly recorded samples.
  • Adds a single-instance launch guard so duplicate app launches do not race menu-bar state after packaging or login.
  • Makes Login Item registration idempotent to avoid unnecessary ServiceManagement churn.
  • Reduces avoidable UI cost by caching DateFormatter instances used in menu/chart rendering.
  • Avoids a keychain prompt lock inversion by serializing the alert on the main actor instead of holding the prompt lock before the main-queue hop.
  • Adds a cold-start validation harness that can install a one-shot LaunchAgent, disable/restore the normal Login Item, launch the packaged app after reboot, capture AX/screenshot artifacts, and verify first-open menu readiness.
  • Tightens package/run scripts so validation uses the freshly packaged app and detects duplicate launched instances.

Validation

Source and focused behavior checks:

  • swift test --filter StatusMenuHostedSubmenuRefreshTests
  • swift test --filter StatusMenuOpenRefreshTests
  • swift test --filter LaunchAtLoginManagerTests
  • make check
  • git diff --check
  • Focused AppKit smell scan on touched menu/keychain files

Runtime proof:

  • Built and validated .build/package/CodexBar.app
  • Verified code signing with codesign --verify --deep --strict --verbose=2 .build/package/CodexBar.app
  • Installed a one-shot cold-start LaunchAgent
  • Disabled the normal CodexBar Login Item before reboot
  • Rebooted the Mac and let the LaunchAgent perform the first-launch validation
  • Restored the normal CodexBar Login Item and removed the one-shot LaunchAgent afterward

Cold-start artifact:

  • /Users/amrmohamad/.codex/artifacts/CodexBar/cold-start-20260526-025603

Key artifact evidence:

  • post_boot_first_launch_candidate=true
  • existing_codexbar_process_count=0
  • codexbar_login_item_present=false at runner start
  • uptime_seconds=319
  • immediate parent menu status was complete
  • immediate parent AX included Buy Credits..., Cost | submenu=true, Usage Dashboard, Status Page, and Refresh
  • settled Cost submenu AX included Cost | submenu=true and Cost submenu item count=1
  • Scripts/prepare_next_boot_cold_start_validation.sh check-result passed against the captured artifact

Notes

The LaunchAgent’s original summary recorded an older checker failure because the validator used the stdlib PNG decoder fallback instead of Pillow metadata. The artifact itself is valid and immutable; after correcting the checker to accept image_decoder=stdlib_png, the same first-boot artifact validates successfully.

This PR intentionally does not change the visible UI design. The changes are lifecycle, data-readiness, validation, and performance hardening around the existing menu hierarchy.

Copy link
Copy Markdown
Contributor Author

@steipete I could not request you through GitHub's reviewer API from this account, so tagging you here for review.

@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 26, 2026

Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 6:10 PM ET / 22:10 UTC.

Summary
The PR hardens cold-start menu readiness, hosted submenu refresh, cached Codex token hydration, duplicate launch handling, Login Item idempotence, keychain prompt serialization, packaging checks, and cold-start validation scripts/tests.

Reproducibility: no. this read-only review did not run a local cold-boot/AppKit bundle reproduction. The PR body provides credible packaged-app reboot proof, and source/tests support the cold-start readiness path, but exact first-after-boot behavior remains macOS-runtime-dependent.

Review metrics: 2 noteworthy metrics.

  • Changed surface: 32 files; 4,074 added, 66 removed. The patch spans runtime menu code, startup behavior, packaging scripts, validation scripts, and tests, so maintainer review should cover more than unit-level correctness.
  • Executable validation scripts: 3 added. The new scripts manipulate LaunchAgents/Login Items and capture runtime artifacts, which makes them compatibility-sensitive operator tools.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Get maintainer acceptance or adjustment of the duplicate-launch policy.
  • Confirm the LaunchAgent/Login Item validation harness should be supported in the repository.

Mantis proof suggestion
A desktop visual proof could independently show the first-open menu and hydrated Cost submenu behavior that this PR changes. A maintainer can ask Mantis to capture proof by posting a new PR comment that starts with the OpenClaw Mantis account mention, followed by:

visual task: verify CodexBar cold-start first-open Codex menu shows Usage Dashboard, Status Page, Refresh, and a hydrated Cost submenu without manual refresh.

Risk before merge

  • [P1] The new single-instance guard can terminate a freshly launched newer bundle while leaving an older already-running CodexBar process active, so explicit relaunch, package-test, and some upgrade workflows need maintainer acceptance or adjustment.
  • [P1] The cold-start validation harness is useful but operator-facing: when invoked it can install a user LaunchAgent, disable/restore the CodexBar Login Item, launch the packaged app after reboot, and capture AX/screenshot artifacts.
  • [P1] The patch is broad for startup/menu hardening, spanning runtime menu code, startup behavior, keychain prompt coordination, packaging scripts, validation scripts, and tests.

Maintainer options:

  1. Accept Oldest-Process Single Instance
    Maintainers can explicitly accept that later same-bundle-ID launches exit when an older CodexBar process is already running and rely on dev/package flows to kill stale instances first.
  2. Preserve Explicit Relaunches
    Before merge, adjust the guard so an explicitly launched newer or different bundle can replace or out-prioritize an older running bundle when bundle paths or executable hashes differ.
  3. Split Validation Harness
    If the LaunchAgent/Login Item harness is too much supported operator surface for this PR, split it from the runtime menu-readiness fix.

Next step before merge

  • [P2] The remaining action is maintainer acceptance or adjustment of compatibility-sensitive startup and validation-script behavior, not a narrow automated repair.

Security
Cleared: No concrete supply-chain, dependency, secret-handling, or permission-broadening vulnerability was found; the new scripts remain a compatibility and operator-surface review risk rather than a security finding.

Review details

Best possible solution:

Land the menu-readiness hardening after maintainers either accept the oldest-process duplicate-launch policy and validation harness as supported behavior or narrow those parts before merge.

Do we have a high-confidence way to reproduce the issue?

No; this read-only review did not run a local cold-boot/AppKit bundle reproduction. The PR body provides credible packaged-app reboot proof, and source/tests support the cold-start readiness path, but exact first-after-boot behavior remains macOS-runtime-dependent.

Is this the best way to solve the issue?

Unclear as a final product decision. The menu-readiness implementation is focused and covered by tests, but the duplicate-launch policy and LaunchAgent/Login Item harness need maintainer acceptance or narrowing before merge.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 01528f5ee6eb.

Label changes

Label justifications:

  • P2: This is normal-priority app hardening for cold-start menu correctness with limited user blast radius but non-trivial startup and script review needs.
  • merge-risk: 🚨 compatibility: Merging changes duplicate-launch and validation/setup behavior in ways that can affect existing relaunch, package-test, Login Item, and upgrade workflows.
  • rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body includes after-fix real behavior proof from a packaged CodexBar app after reboot, with LaunchAgent validation, code-sign verification, AX/menu readiness observations, and checker output.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes after-fix real behavior proof from a packaged CodexBar app after reboot, with LaunchAgent validation, code-sign verification, AX/menu readiness observations, and checker output.
Evidence reviewed

What I checked:

  • Repository policy read: The repository AGENTS.md was read fully and is relevant because this PR touches Swift app code, tests, scripts, UI/runtime validation, and keychain-prompt-sensitive behavior. (AGENTS.md:1, 01528f5ee6eb)
  • Current main does not already implement the central change: Current main has no SingleInstanceLaunchGuard, menu adjunct readiness signature, cached token hydration startup helper, or cold-start validation scripts, so the PR is not obsolete on main. (01528f5ee6eb)
  • PR scope: The GitHub API reports an open external PR with 32 changed files, 4,074 additions, 66 deletions, mergeable=true, and current labels including proof: sufficient and merge-risk compatibility. (fd116d39ef19)
  • Duplicate-launch behavior: The new guard chooses the lowest active same-bundle-ID process and terminates the current launch when an older process exists, without considering bundle path or executable hash. (Sources/CodexBar/SingleInstanceLaunchGuard.swift:21, fd116d39ef19)
  • Validation harness operator surface: The next-boot validation script can snapshot, delete, and restore the user's CodexBar Login Item and prepares a LaunchAgent runner for first-after-boot validation. (Scripts/prepare_next_boot_cold_start_validation.sh:166, fd116d39ef19)
  • Hosted submenu refresh path: The PR rebuilds open hosted submenus from current store state and preserves chart/provider identity when refreshing Cost, usage history, storage, and Zai chart submenus. (Sources/CodexBar/StatusItemController+HostedSubmenus.swift:97, fd116d39ef19)

Likely related people:

  • steipete: Local blame points the current menu refresh and LaunchAtLoginManager code to Peter Steinberger, and GitHub history shows recent central work on menu rebuilds, script packaging, and cost scanning. (role: current-main area contributor and repository owner; confidence: high; commits: f83fb70b7f24, 529e25538a20, 487a78ceda7d; files: Sources/CodexBar/StatusItemController.swift, Sources/CodexBar/StatusItemController+MenuTracking.swift, Sources/CodexBar/LaunchAtLoginManager.swift)
  • ratulsarna: GitHub history shows adjacent work on hosted submenu provider context, compile/run script behavior, and keychain prompt churn in the same broad surface this PR changes. (role: recent adjacent contributor; confidence: medium; commits: a4852cec29a1, f8b9fd2280e4, 37dc64170fa6; files: Sources/CodexBar/StatusItemController+HostedSubmenus.swift, Scripts/compile_and_run.sh, Sources/CodexBar/KeychainPromptCoordinator.swift)
  • LeoLin990405: GitHub history shows recent provider cost-card unification work in the hosted submenu and cost rendering paths that this PR refreshes during cold-start menu readiness. (role: recent adjacent cost-menu contributor; confidence: medium; commits: 8fccb454acbb; files: Sources/CodexBar/StatusItemController+HostedSubmenus.swift, Sources/CodexBar/MenuCardView+Costs.swift)
  • ngutman: GitHub history shows prior hosted submenu width alignment and cost-history work that overlaps with the menu rendering and Cost submenu areas touched here. (role: adjacent menu and cost-history contributor; confidence: medium; commits: a5100c3f4ea3, c23edb9dc048; files: Sources/CodexBar/StatusItemController+HostedSubmenus.swift, Sources/CodexBarCore/CostUsageFetcher.swift)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5ee755640c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

try register()
} else {
try service.unregister()
guard status() == .enabled else { return }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Unregister pending login item when disabling autostart

Treating setEnabled(false) as a no-op unless status() == .enabled leaves the login item registered when Service Management reports .requiresApproval. In that state the service is already registered but awaiting user approval, so users who toggle Launch at Login off will not actually remove it and can still see the pending/background item in System Settings on next login. This regression was introduced by the new status guard and should also call unregister() for .requiresApproval (or for any state except .notRegistered).

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 31a37207: setEnabled(false) now calls unregister() for any ServiceManagement state except .notRegistered, so pending .requiresApproval login items are removed instead of left registered. Added a focused regression test for the .requiresApproval disable path and verified with swift test --filter LaunchAtLoginManagerTests plus git diff --check.

@clawsweeper clawsweeper Bot added proof: sufficient Contributor real behavior proof is sufficient. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 26, 2026
Copy link
Copy Markdown
Contributor Author

Fixed the hosted submenu provider-identity regression in 63f36891.

What changed:

  • Preserved provider.rawValue on hydrated Cost, OpenAI API usage, and Subscription Utilization hosted chart menu items in both rendering and non-rendering branches.
  • Added a focused regression test that opens an already-hydrated provider-specific Cost submenu, refreshes it while open, and verifies it remains provider-tagged and hydrated instead of falling back to No data available.

Verification run:

  • swift test --filter StatusMenuHostedSubmenuRefreshTests
  • swift test --filter AppDelegateTests
  • swift test --filter LaunchAtLoginManagerTests
  • git diff --check
  • make check

Duplicate-launch note: I left SingleInstanceLaunchGuard unchanged in this patch. Its current behavior intentionally terminates later duplicate launches by bundle identifier to avoid the user-visible double menu-bar instance; dev/package flows should kill stale instances before relaunch. That compatibility tradeoff remains ready for maintainer acceptance or adjustment.

@clawsweeper re-review

@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 26, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@AmrMohamad AmrMohamad force-pushed the enhance/UI-performance-issues branch from 63f3689 to 351dd19 Compare May 26, 2026 00:43
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. P2 Normal priority bug or improvement with limited blast radius. merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 26, 2026
@AmrMohamad AmrMohamad force-pushed the enhance/UI-performance-issues branch from 351dd19 to 090937e Compare May 29, 2026 17:25
@AmrMohamad
Copy link
Copy Markdown
Contributor Author

@clawsweeper re-review

@clawsweeper
Copy link
Copy Markdown

clawsweeper Bot commented May 29, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@steipete
Copy link
Copy Markdown
Owner

Thanks @AmrMohamad. I could not update this PR branch directly because maintainer edits are disabled, so I extracted the low-risk cold-start menu readiness subset and landed it in #1231 / d7db992.

The landed change refreshes open Codex parent and hosted adjunct menus when dashboard, credits, token-cost, and plan-history data become ready after launch, and adds cached Codex token hydration with root-fingerprint validation and cached Pi session merging.

I left the duplicate launch guards, launch validation harnesses, keychain/login-item changes, docs churn, and broad cleanup out because they need separate review and proof. Please enable Allow edits by maintainers and split those pieces into smaller PRs if you want to continue them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merge-risk: 🚨 compatibility 🚨 Merging this PR could break existing users, config, migrations, defaults, or upgrades. P2 Normal priority bug or improvement with limited blast radius. proof: sufficient Contributor real behavior proof is sufficient. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants