test: integration tests for first-run dispatch (issue #157) by greynewell · Pull Request #164 · supermodeltools/cli

greynewell · 2026-04-30T14:36:45Z

Closes #157.

Adds cmd/first_run_integration_test.go covering the non-PTY paths of the first-run dispatch matrix introduced in issue #151 and implemented in PR #152:

- non-TTY + no key → exits 1 with "not authenticated" error
- non-TTY + valid key env var → enters watch path; error is NOT "not authenticated"
- version subcommand → exits 0, prints version (noConfigCommand bypass)
- completion bash → exits 0, prints completion script (noConfigCommand bypass)
- analyze without key → exits 1 with "run 'supermodel setup'" error (persistentPreRunE path)
- config with empty api_key field → treated same as no key (edge case for config parsing)

PTY-based tests (wizard launch via setup.Run) require hardware and are left for manual testing.
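For orientation, the matrix above can be read as a small decision function. The following is only a sketch under assumptions: `dispatch` and the `action` labels are hypothetical names, not identifiers from the CLI. The non-TTY rows follow the list above, the "TTY + no key → wizard" row is inferred from the PTY note, and the "TTY + key → watch" row is an assumption that these tests do not cover.

```go
package main

import "fmt"

// action is a hypothetical label for this sketch, not a type from the CLI.
type action string

const (
	actionWizard  action = "launch setup wizard"
	actionWatch   action = "enter watch path"
	actionAuthErr action = "exit 1: not authenticated"
)

// dispatch picks a first-run action from the two inputs in the matrix.
// An empty key (for example `api_key: ""` in the config) counts as no key.
// Assumption: a key selects the watch path regardless of TTY state.
func dispatch(isTTY bool, apiKey string) action {
	switch {
	case apiKey != "":
		return actionWatch
	case isTTY:
		return actionWizard
	default:
		return actionAuthErr
	}
}

func main() {
	cases := []struct {
		tty bool
		key string
	}{
		{false, ""},
		{false, "sm_integration_test_key"},
		{true, ""},
		{true, "sm_integration_test_key"},
	}
	for _, c := range cases {
		fmt.Printf("tty=%-5t key=%-26q -> %s\n", c.tty, c.key, dispatch(c.tty, c.key))
	}
}
```

The real dispatch lives in the root command and may consult more state than this; the sketch only shows which cell each integration test is meant to hit.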

The tests build the binary once via sync.Once in buildBinary to avoid redundant compilation, use a clean temp HOME, and pipe /dev/null to stdin to guarantee non-interactive mode across CI environments.
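The build-once idea could look roughly like the sketch below. It is not the PR's `buildBinary`: `compile`, `binaryPath`, and the returned path are stand-ins chosen for illustration, and the real helper presumably shells out to `go build` where `compile` is.

```go
package main

import (
	"fmt"
	"sync"
)

var (
	buildOnce  sync.Once
	builtPath  string
	buildErr   error
	buildCalls int // counts compile invocations, for demonstration only
)

// compile is a stand-in for the expensive `go build` step.
// The returned path is a placeholder, not the location the PR uses.
func compile() (string, error) {
	buildCalls++
	return "/tmp/supermodel-test-bin", nil
}

// binaryPath returns the cached binary path and compiles on first use only.
// Every caller sees the same path and the same error.
func binaryPath() (string, error) {
	buildOnce.Do(func() {
		builtPath, buildErr = compile()
	})
	return builtPath, buildErr
}

func main() {
	for i := 0; i < 3; i++ {
		p, err := binaryPath()
		fmt.Println(p, err)
	}
	fmt.Println("compile calls:", buildCalls)
}
```

Caching the error alongside the path means a failed build is reported to every test rather than retried by each one.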

🤖 Generated with Claude Code

Summary by CodeRabbit

Tests
- Added integration tests for CLI authentication behavior and subcommand validation, including checks for root command, version, completion, and analyze functionality with and without API credentials.

Note: This release contains no user-facing changes; it consists solely of test infrastructure improvements to ensure product quality.

Adds integration tests covering the non-PTY dispatch matrix defined in issue #151 and implemented in PR #152:

- non-TTY + no key → exits 1 with "not authenticated" error
- non-TTY + key → enters watch path (error is NOT "not authenticated")
- version subcommand → exits 0, prints version (noConfigCommand bypass)
- completion bash → exits 0, prints completion script (noConfigCommand)
- analyze without key → exits 1 with "run 'supermodel setup'" error
- config with empty api_key field → treated same as no key

PTY-based tests (wizard launch) require hardware and are left for manual testing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai · 2026-04-30T14:36:57Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5ff49e95-7a93-43ec-bc6c-54632d160354

📥 Commits

Reviewing files that changed from the base of the PR and between b9d8336 and 5972c7b.

📒 Files selected for processing (1)

cmd/first_run_integration_test.go

Walkthrough

Adds an integration test file that validates the supermodel CLI's first-run behavior across multiple scenarios. The test suite uses shared helpers to build the binary once and execute it in a sandboxed environment (isolated HOME, no API key by default, stdin redirected). It verifies that authentication checks, subcommand behavior, and configuration handling work correctly at startup.
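One way to get the sandboxing described here is to build the child environment from scratch instead of inheriting the parent's. The sketch below assumes a hypothetical helper named `sandboxEnv`; the PR's own helper may be shaped differently. Because nothing is copied from `os.Environ()`, an ambient `SUPERMODEL_API_KEY` on the CI machine cannot leak into the run unless a test passes it explicitly.

```go
package main

import (
	"fmt"
	"os"
)

// sandboxEnv returns a minimal environment for the child process:
// an isolated HOME, the caller's PATH, and any explicitly requested extras.
// Variables not listed here are simply absent in the child.
func sandboxEnv(home string, extra ...string) []string {
	env := []string{
		"HOME=" + home,
		"PATH=" + os.Getenv("PATH"),
	}
	return append(env, extra...)
}

func main() {
	// "/tmp/fake-home" is a placeholder; a test would normally use t.TempDir().
	for _, kv := range sandboxEnv("/tmp/fake-home") {
		fmt.Println(kv)
	}
}
```

The result would be assigned to `cmd.Env` before running the binary, with `cmd.Stdin` pointed at a file opened from `os.DevNull` to keep the run non-interactive.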

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Integration Test Suite `cmd/first_run_integration_test.go` | New test file with helper functions to build the CLI binary and run it in a sanitized environment. Includes test cases validating: root command auth behavior (fails without key), subcommand-specific handling (version and completion succeed without auth, analyze fails gracefully), API key environment variable processing, and config file api_key handling. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🧪 A fresh binary is born, once built to last,
Sandboxed tests dance through scenarios fast,
No home, no keys—does it fail just right?
Each command checked from install to sight. ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title clearly summarizes the main change: adding integration tests for first-run dispatch behavior linked to issue `#157`. |
| Description check | ✅ Passed | Description covers all template sections with detailed test scenarios, implementation approach, and context about PTY-based exclusions. |
| Linked Issues check | ✅ Passed | PR implements 4 of 6 requested test scenarios from `#157`: non-TTY paths (items 2, 3, 5, 6) are covered; PTY-based tests (items 1, 4) are appropriately deferred as noted. |
| Out of Scope Changes check | ✅ Passed | All changes are focused on adding integration tests for the first-run dispatch matrix; no unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/first_run_integration_test.go`:
- Around line 66-99: The subprocess execution in runSupermodel (and the similar
helper around lines 157-171) uses cmd.CombinedOutput() without a timeout; change
to create a context with a hard timeout (e.g., context.WithTimeout) and use
exec.CommandContext to start the process, then call CombinedOutput (or Wait) on
that command so the context can kill the process on timeout; detect
context.DeadlineExceeded and call t.Fatalf with a clear timeout message
including the timeout duration and command args, and ensure the context cancel
is deferred to release resources.
- Around line 122-257: Add two tests to close the non-PTY gaps: implement
TestFirstRun_Setup_WithKey that uses freshHome and writeConfig (or
SUPERMODEL_API_KEY env) to ensure running runSupermodel with args ["setup"]
still launches setup even when an API key exists (assert exit code indicates
setup path and output contains "supermodel setup" or the setup prompt), and
implement TestFirstRun_BadAPIKey_Path that invokes the binary (via runSupermodel
or exec.Command like in TestFirstRun_NoTTY_WithKey) with a deliberately invalid
key (e.g. SUPERMODEL_API_KEY=sm_bad_key or a config file with a bad key) and
asserts the process surfaces an actionable invalid-key/401-style message (not
the generic "not authenticated") and exits non-zero; use existing helpers
freshHome, runSupermodel, writeConfig, buildBinary and mirror the style and
assertions in TestFirstRun_NoTTY_WithKey and TestFirstRun_Analyze_NoKey.
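The timeout suggestion in the first inline comment could be sketched as follows. This is an illustration, not the change that landed: `runWithTimeout` and `demoTimeout` are hypothetical names, and the demo runs `sleep`, which is assumed to exist (Unix-like systems), in place of the CLI binary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// demoTimeout is deliberately short so the demo finishes quickly.
const demoTimeout = 200 * time.Millisecond

// runWithTimeout runs name with args and kills the process if it outlives
// timeout. It returns the combined output, whether the deadline was hit,
// and the error reported by the command.
func runWithTimeout(timeout time.Duration, name string, args ...string) (string, bool, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel() // release the timer even when the command exits early

	out, err := exec.CommandContext(ctx, name, args...).CombinedOutput()
	timedOut := errors.Is(ctx.Err(), context.DeadlineExceeded)
	return string(out), timedOut, err
}

func main() {
	// Assumption: `sleep` is on PATH. A test would pass the built binary instead.
	_, timedOut, err := runWithTimeout(demoTimeout, "sleep", "5")
	fmt.Println("timed out:", timedOut, "err:", err)
}
```

In a test helper, the `timedOut` case would typically become a `t.Fatalf` that names the timeout duration and the command arguments, as the comment asks.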

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 229e7d07-2ec7-411a-8d28-260f8b202eb2

📥 Commits

Reviewing files that changed from the base of the PR and between 037157d and b9d8336.

📒 Files selected for processing (1)

cmd/first_run_integration_test.go

coderabbitai · 2026-04-30T14:39:54Z

+// ---------------------------------------------------------------------------
+// Tests
+// ---------------------------------------------------------------------------
+
+// TestFirstRun_NoTTY_NoKey verifies that running bare `supermodel` in a
+// non-interactive context without an API key exits 1 with a clear "not
+// authenticated" message (issue #151 dispatch matrix, bottom-left cell).
+func TestFirstRun_NoTTY_NoKey(t *testing.T) {
+	home := freshHome(t)
+	out, code := runSupermodel(t, nil, home)
+
+	if code != 1 {
+		t.Errorf("expected exit code 1, got %d\noutput: %s", code, out)
+	}
+	if !strings.Contains(out, "not authenticated") {
+		t.Errorf("expected output to contain %q\nfull output: %s", "not authenticated", out)
+	}
+}
+
+// TestFirstRun_NoTTY_WithKey verifies that a non-interactive caller that
+// supplies a valid-format API key enters the watch path (not the auth-error
+// path). The watch daemon will fail quickly in the test environment (no real
+// API, no project cache), but the error must NOT be the "not authenticated"
+// message — proving the dispatch selected runWatch, not errNotAuthenticated.
+func TestFirstRun_NoTTY_WithKey(t *testing.T) {
+	home := freshHome(t)
+	// Create a project sub-directory so the binary doesn't refuse to run in
+	// HOME (which the watch daemon rejects as a safety measure).
+	projectDir := filepath.Join(home, "project")
+	if err := os.MkdirAll(projectDir, 0o755); err != nil {
+		t.Fatalf("create project dir: %v", err)
+	}
+
+	// Use an env var key. The format must look plausible; any non-empty
+	// value causes the dispatch to pick runWatch.
+	bin := buildBinary(t)
+	cmd := exec.Command(bin, "--dir", projectDir, "--notify-port", "0")
+	cmd.Env = []string{
+		"HOME=" + home,
+		"PATH=" + os.Getenv("PATH"),
+		"SUPERMODEL_API_KEY=sm_integration_test_key",
+	}
+	devNull, err := os.Open(os.DevNull)
+	if err != nil {
+		t.Fatalf("open /dev/null: %v", err)
+	}
+	defer devNull.Close()
+	cmd.Stdin = devNull
+
+	out, execErr := cmd.CombinedOutput()
+	combined := string(out)
+
+	// The process exits non-zero quickly (API call fails, port conflict, etc.)
+	// — that's expected. What we must NOT see is the "not authenticated" error.
+	if strings.Contains(combined, "not authenticated") {
+		t.Errorf("got 'not authenticated' error, but expected the watch path to be entered with a valid API key\noutput: %s", combined)
+	}
+	// Confirm the binary did exit (one way or another) — don't hang.
+	if execErr == nil {
+		// Unexpected clean exit — the daemon should not stop immediately.
+		// This isn't necessarily a failure, but log it for visibility.
+		t.Logf("binary exited 0 unexpectedly (may need a real project dir or API)\noutput: %s", combined)
+	}
+}
+
+// TestFirstRun_VersionSubcommand_NoKey ensures `supermodel version` exits 0
+// and prints a version string without requiring an API key. This exercises
+// the noConfigCommands bypass in persistentPreRunE.
+func TestFirstRun_VersionSubcommand_NoKey(t *testing.T) {
+	home := freshHome(t)
+	out, code := runSupermodel(t, []string{"version"}, home)
+
+	if code != 0 {
+		t.Errorf("expected exit code 0, got %d\noutput: %s", code, out)
+	}
+	if !strings.Contains(out, "supermodel") {
+		t.Errorf("version output should contain the word 'supermodel'\nfull output: %s", out)
+	}
+}
+
+// TestFirstRun_CompletionBash_NoKey checks that the nested `completion bash`
+// subcommand (a noConfigCommand) exits 0 and emits a bash completion script
+// without requiring an API key.
+func TestFirstRun_CompletionBash_NoKey(t *testing.T) {
+	home := freshHome(t)
+	out, code := runSupermodel(t, []string{"completion", "bash"}, home)
+
+	if code != 0 {
+		t.Errorf("expected exit code 0, got %d\noutput: %s", code, out)
+	}
+	// Bash completion scripts always begin with a comment or a function
+	// declaration. Both contain "supermodel" as the command name.
+	if !strings.Contains(out, "supermodel") {
+		t.Errorf("completion output should reference 'supermodel'\nfull output: %s", out)
+	}
+}
+
+// TestFirstRun_Analyze_NoKey verifies that `supermodel analyze` without an
+// API key exits 1 with an actionable "run 'supermodel setup'" message (via
+// persistentPreRunE), NOT the generic "not authenticated" error reserved for
+// the bare root command.
+func TestFirstRun_Analyze_NoKey(t *testing.T) {
+	home := freshHome(t)
+	out, code := runSupermodel(t, []string{"analyze"}, home)
+
+	if code != 1 {
+		t.Errorf("expected exit code 1, got %d\noutput: %s", code, out)
+	}
+	// persistentPreRunE should fire before analyze's own RequireAPIKey(),
+	// producing the "setup" guidance rather than the root-command error.
+	if !strings.Contains(out, "supermodel setup") {
+		t.Errorf("expected output to contain %q\nfull output: %s", "supermodel setup", out)
+	}
+	// Specifically must NOT surface the bare-root "not authenticated" error.
+	if strings.Contains(out, "not authenticated") {
+		t.Errorf("got 'not authenticated' (root-command error) but expected 'supermodel setup' guidance\nfull output: %s", out)
+	}
+}
+
+// TestFirstRun_EmptyAPIKeyInConfig verifies that a config file with an
+// explicit empty api_key field is treated the same as having no key at all —
+// i.e. the binary exits 1 with a "not authenticated" error in non-TTY mode.
+// This is an edge case that could silently break if config parsing changes.
+func TestFirstRun_EmptyAPIKeyInConfig(t *testing.T) {
+	home := freshHome(t)
+	writeConfig(t, home, "api_key: \"\"\n")
+
+	out, code := runSupermodel(t, nil, home)
+
+	if code != 1 {
+		t.Errorf("expected exit code 1 with empty api_key in config, got %d\noutput: %s", code, out)
+	}
+	if !strings.Contains(out, "not authenticated") {
+		t.Errorf("expected 'not authenticated' error for empty api_key config\nfull output: %s", out)
+	}
+}


🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Coverage still misses two non-PTY objectives from #157

Given this PR is positioned to close #157, tests for these non-PTY scenarios are still absent:

- An explicit `supermodel setup` should launch setup even when a key exists (regression guard).
- The bad-key path should surface actionable invalid-key handling (401-style behavior).

Please add those cases here (or update PR scope/closure note).

If helpful, I can draft both test cases using the same binary/home helpers.

The binary built by buildBinary must have a .exe extension on Windows to be executable. Use runtime.GOOS to choose the correct name. Also skip `completion bash` on Windows (bash is not available by default) and use os.DevNull for cross-platform /dev/null handling.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
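The platform-dependent binary name described in this commit message might be chosen along these lines. This is a sketch only: `binaryName` is a hypothetical helper, and taking the OS as a parameter (rather than reading `runtime.GOOS` inside) is a design choice made here so both branches can be exercised on any machine.

```go
package main

import (
	"fmt"
	"runtime"
)

// binaryName appends ".exe" when the target OS is Windows and returns
// the base name unchanged everywhere else.
func binaryName(base, goos string) string {
	if goos == "windows" {
		return base + ".exe"
	}
	return base
}

func main() {
	// runtime.GOOS is the OS the running program was compiled for.
	fmt.Println(binaryName("supermodel", runtime.GOOS))
}
```

The same `runtime.GOOS == "windows"` check can guard a `t.Skip` for the `completion bash` test.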

Address CodeRabbit review: wrap all exec.Command calls in exec.CommandContext with a 20s hard timeout so that if the watch daemon behaviour changes, CI cannot hang indefinitely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

coderabbitai Bot reviewed Apr 30, 2026

greynewell and others added 2 commits April 30, 2026 10:41

greynewell marked this pull request as ready for review April 30, 2026 14:49

greynewell requested a review from jonathanpopham as a code owner April 30, 2026 14:49

greynewell merged commit 518c2bc into main Apr 30, 2026
6 of 7 checks passed

greynewell deleted the test/first-run-integration branch April 30, 2026 14:49

greynewell merged 3 commits into main from test/first-run-integration
