WIP: CodeQL integration — Phase 1 subphases + harness cleanup + CodeQL install scaffold by pruiz · Pull Request #29 · pruiz/CodeCome

pruiz · 2026-05-29T16:08:21Z

CodeQL Integration — groundwork PR

Branch: wip/codeql-integration-plan
Status: WIP — implementation in progress

Changes in this PR

Standardize workflow entrypoints through the harness and keep make opencode-raw as the explicit non-workflow debug path.
Split Phase 1 into subphases 1a / 1b / 1c with gates and dedicated prompts.
Refactor tools/gate-check.py into a thin CLI wrapper delegating to package logic in tools/phases/.
Add managed CodeQL installer/config/check scaffolding under tools/codeql/.
Rename make venv to make init (keeping venv as alias) and make it optionally install CodeQL.
Add CI coverage for make init, with full CodeQL install on one macOS and one Ubuntu job and skip-install coverage on the remaining matrix jobs.
Add focused tests for CodeQL config/install behavior and updated gate tests.

Phase 1 subphase orchestration

Phase 1 now runs as three subphases (1a / 1b / 1c) with gates and CodeQL analysis between 1a and 1b. The harness (tools/codecome/harness.py) delegates Phase 1 to dedicated subphase orchestration in tools/codecome/phase_1.py, which manages its own server lifecycle and retry loops.

Not in this PR

No actual CodeQL run subcommand yet.
No SARIF normalization yet.
No Phase 2 candidate generation from CodeQL yet.
No sweep context injection yet.

Context

Full integration plan: .project/codeql-integration-plan.md

Remaining PRs in sequence

Pack catalog and resolver
CodeQL run and SARIF normalization
Phase 1 deterministic CodeQL integration
Phase 2 candidates
Sweep context

Summary by CodeRabbit

New Features
- First-class CodeQL integration: managed install, pack/profile resolution, end-to-end runs with SARIF normalization, artifact gating, risk-index enrichment, candidate generation, and Phase‑1 split into gated substeps with automated repair/resume.
Enhanced UI / Events
- Raw SSE event recording, improved transcript flushing, model metadata in UI, patch/file-change renderers, and inflight write-file tracking.
Documentation
- New phase prompts, templates, repair guidance, and implementation checklist.
Chores
- Repo init target, non-workflow raw run target, cleanup target, CI init step.
Tests
- Large expanded test coverage across CodeQL flows, pack resolution, artifacts, normalization, pipeline, gates, and rendering.

coderabbitai · 2026-05-29T16:08:35Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

I can rebuild the full stack, but generating a correct one-shot mapping of every rangeId here is very large and error-prone. Please confirm you want a single-cohort hidden stack (all rangeIds placed in one reviewer-friendly cohort/layer) and I will produce the complete hidden review artifact in my next message.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch wip/codeql-integration-plan

github-actions · 2026-05-29T16:09:52Z

Coverage Report

Metric	Value
Line Coverage	71.7%
Lines Covered	0 / 0

Download detailed HTML coverage reports per OS/Python from the workflow artifacts.

Generated by pytest-cov on 2026-05-31T23:59:08.554Z

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Makefile`:
- Around line 194-196: The Makefile currently only checks PROMPT_FILE for
non-empty but not for existence/readability, so update the prerequisite checks
(the lines guarding AGENT and PROMPT_FILE) to validate PROMPT_FILE is a readable
file (e.g., use a shell test like -r or -f && -r) before calling opencode; if
the readability test fails, print the same usage/error and exit 1. Ensure the
opencode invocation that uses $(PROMPT_FILE) (the opencode run --agent
"$(AGENT)" ... "$$(cat "$(PROMPT_FILE)")") remains safe by keeping the
readable-file guard in place so cat won't be invoked on a missing/unreadable
file.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 45ce00ad-a012-4aa4-95af-df4488dc4585

📥 Commits

Reviewing files that changed from the base of the PR and between f3b49fb and 2a37f39.

📒 Files selected for processing (2)

.project/codeql-integration-plan.md
Makefile

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tools/gate-check.py (1)
352-617: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Move the Phase 1 gate implementation out of the root script.

This PR adds more orchestration and validation logic directly into tools/gate-check.py, but the root tools/ scripts are supposed to stay as thin CLI wrappers. Putting the implementation in a package module would also remove the need for the script-path import workaround in tools/codecome/phase_1.py.

As per coding guidelines, Standalone scripts at the tools/ root (e.g., create-finding.py, list-findings.py, run-agent.py) should be thin wrappers that delegate to their respective packages. Implementation must live in the package, not the script.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/gate-check.py` around lines 352 - 617, The Phase 1 gate logic
(functions _notes_exist, _count_findings_since, check_phase_1a, check_phase_1b,
check_phase_1c and any helper logic) must be moved out of the root script into a
package module (e.g., tools.codecome.phase_1) so the root tools/gate-check.py
remains a thin CLI wrapper; create a new module exposing
check_phase_1a/check_phase_1b/check_phase_1c (and any helpers they need) and
update the root script to simply parse args and call into those exported
functions, removing the embedded implementation and the script-path import
workaround in tools/codecome/phase_1.py.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/codecome/cli.py`:
- Around line 69-73: The check in cli.py incorrectly exempts the literal "None"
from requiring prompt_file, which causes tools/codecome/harness.py to crash on
ROOT / args.prompt_file; update the condition that builds required (the list
manipulation around args.phase) so that prompt_file remains required for the
"None" phase (e.g., remove "None" from the tuple check or change to if
str(args.phase) != "1") ensuring args.prompt_file is validated, or alternatively
add explicit early handling for args.phase == "None" before dispatch; reference
args.phase, prompt_file, required and the harness.py usage ROOT /
args.prompt_file when making the change.

In `@tools/codecome/phase_1.py`:
- Around line 155-229: The loop retains finish_warning across retries causing a
later successful attempt to be treated as failed; reset finish_warning at the
start of each iteration. Inside the while True retry loop (after attempt_number
increment and/or right after calling _reset_subagent_state()) set finish_warning
= None so the subsequent logic that inspects last_finish_reason,
last_permission_error, any_step_finish_seen, etc., recomputes state cleanly;
ensure the variable is in scope for the later checks that potentially set it and
for the final "if finish_warning is not None" branch that sets returncode = 2.
- Around line 361-364: In _run_subphase, remove the unused and invalid import
"from tools.gate_check import _count_findings_since" and the surrounding
conditional block that only contains a no-op pass (the branch "if
findings_snapshot is not None and returncode == 0: ..."); this import crashes
because gate_check is loaded dynamically elsewhere by run_phase_1, and the
symbol isn't used—simply delete that conditional and its contents in the
_run_subphase function so findings_snapshot continues to be handled by the
caller/orchestrator.

In `@tools/gate-check.py`:
- Around line 472-501: The code only validates when data is a dict, so
non-mapping YAML (list/scalar) silently passes; modify the validation around
data to explicitly reject non-mapping values: if not isinstance(data, dict) then
call _emit(console, "fail", "file-risk-index.yml: must be a mapping") and return
1 (instead of falling through), so the subsequent logic that expects
data.get("files") only runs for dict data; update the check that currently reads
if isinstance(data, dict): ... to handle the else branch by emitting the fail
and returning 1.

---

Outside diff comments:
In `@tools/gate-check.py`:
- Around line 352-617: The Phase 1 gate logic (functions _notes_exist,
_count_findings_since, check_phase_1a, check_phase_1b, check_phase_1c and any
helper logic) must be moved out of the root script into a package module (e.g.,
tools.codecome.phase_1) so the root tools/gate-check.py remains a thin CLI
wrapper; create a new module exposing
check_phase_1a/check_phase_1b/check_phase_1c (and any helpers they need) and
update the root script to simply parse args and call into those exported
functions, removing the embedded implementation and the script-path import
workaround in tools/codecome/phase_1.py.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 7dcbb152-75dc-4f1f-be06-9d6d977ce878

📥 Commits

Reviewing files that changed from the base of the PR and between 2a37f39 and d89971c.

📒 Files selected for processing (9)

Makefile
prompts/phase-1a-profile.md
prompts/phase-1b-codeql-recon.md
prompts/phase-1c-sandbox.md
templates/codeql-plan.yml
tools/codecome/cli.py
tools/codecome/harness.py
tools/codecome/phase_1.py
tools/gate-check.py

✅ Files skipped from review due to trivial changes (1)

templates/codeql-plan.yml

🚧 Files skipped from review as they are similar to previous changes (1)

Makefile

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tools/gate-check.py (1)
302-363: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Move remaining gate logic out of the root CLI script.

tools/gate-check.py is still a full implementation module; this entrypoint should delegate to package code instead of containing gate/business logic.

As per coding guidelines, "Standalone scripts at the tools/ root ... should be thin wrappers that delegate to their respective packages. Implementation must live in the package, not the script."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/gate-check.py` around lines 302 - 363, The CLI contains business logic
in main() and build_parser() (calling
check_phase_1a/check_phase_1b/check_phase_1c and gate_phase_1..gate_phase_6)
instead of delegating to package code; refactor by moving all gate_* and
check_phase_* implementations into the package (e.g., package.gates or
package.gate_check module) and replace the body of main() in tools/gate-check.py
with a thin wrapper that builds/forwards parsed args to a single package
entrypoint (for example import package.gate_check and call
package.gate_check.run_from_cli(args) or package.gate_check.main(args)); ensure
the CLI only handles argument parsing (build_parser) and error/exit handling
while all gate/business logic lives in the package module and is imported/used
by the wrapper.

🧹 Nitpick comments (1)

tools/phases/phase_1_gates.py (1)

127-130: ⚡ Quick win

Narrow broad exception handling in YAML validation paths.

Prefer catching specific parse/read exceptions instead of Exception in both YAML load blocks to avoid masking unrelated runtime/control-flow errors.

Proposed change

-        except Exception as exc:
+        except (yaml.YAMLError, OSError) as exc:
             _emit(console, "fail", f"codeql-plan.yml is not valid YAML: {exc}")
             return 1
...
-        except Exception as exc:
+        except (yaml.YAMLError, OSError) as exc:
             _emit(console, "fail", f"file-risk-index.yml is not valid YAML: {exc}")
             return 1

Also applies to: 196-199

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/phases/phase_1_gates.py` around lines 127 - 130, The broad except
Exception around YAML loading should be narrowed: replace the generic except
block that catches errors from plan = yaml.safe_load(plan_path.read_text(...))
(and the similar block at lines 196-199) with a specific exception tuple such as
(yaml.YAMLError, OSError, UnicodeDecodeError) to only handle YAML parse/read
errors; keep the same handling body (_emit(console, "fail", ...); return 1) and
import yaml if not already present. Ensure both occurrences (the plan YAML load
and the other YAML load at the referenced block) are updated.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@codecome.yml`:
- Around line 88-115: The static_analysis.codeql block is currently placed under
the wrong parent; move the entire static_analysis: { codeql: {...} } section out
of the audit namespace and place it under validation so the runtime reads
validation.static_analysis.codeql; ensure you preserve all keys (enabled,
fail_policy, pack_catalog, install, output_dir, database_dir, cache_dir,
phase_1, phase_2, sweep and their values) and any indentation/formatting so the
YAML structure remains valid and references like static_analysis, codeql,
phase_1, phase_2, and sweep still match the original configuration.

In `@tools/codeql/config.py`:
- Around line 145-149: The env parsing in _get currently returns a raw string
when int(raw) fails, which later causes an unguarded int(CODEQL_MAX_CANDIDATES)
cast to raise ValueError; update either _get or the consumer to ensure numeric
conversion is safe: modify _get (the helper that tries int(raw)) to return None
or raise a controlled error on invalid integer, or change the
CODEQL_MAX_CANDIDATES usage to validate/parse with a safe conversion (wrap
int(...) in try/except and fall back to a default or None) so the unhandled
ValueError cannot abort config resolution; reference the _get helper and the
CODEQL_MAX_CANDIDATES int(...) cast where it is used and add a clear
fallback/default on parse failure.
- Around line 69-76: The CodeQL config loader currently only reads nested path
data["audit"]["static_analysis"]["codeql"] (function returning cq) which misses
configs placed under top-level "static_analysis.codeql" per docs; update the
loader to also check data.get("static_analysis") and use its "codeql" dict if
present (keep existing audit path as fallback) so the function that returns cq
can find either location. Also harden resolve_config() parsing of
CODEQL_MAX_CANDIDATES: instead of relying on _get(..., coerce=int) that may
return the raw string on ValueError, explicitly retrieve the env value (via _get
or os.getenv), then convert with a try/except around int(...) and fall back to a
safe default (or None) before calling max_candidates=int(max_candidates);
reference resolve_config and the CODEQL_MAX_CANDIDATES handling to add the safe
int conversion and avoid crashing.

In `@tools/codeql/install.py`:
- Around line 33-41: The _detect_platform function currently maps unknown
systems to "linux64", causing wrong asset selection; update _detect_platform to
explicitly handle known platforms (e.g., keep "darwin" -> "osx64", "linux" ->
"linux64", add "windows" -> "win64" if needed) and if the system value is not
recognized, raise a clear exception (e.g., RuntimeError or ValueError) including
the detected system and machine values so callers fail fast with an informative
message instead of silently using "linux64".
- Around line 125-126: The current use of
zipfile.ZipFile(...).extractall(dest_dir) is vulnerable to path traversal;
replace the extractall call by iterating over zf.infolist() (or zf.namelist()),
compute each target path with os.path.join(dest_dir, member.filename), normalize
with os.path.abspath, and ensure the resulting path starts with the absolute
dest_dir path (plus a path separator) before extracting; only then extract that
member (e.g., via zf.extract or by opening the member and writing bytes) so that
zipfile.ZipFile, zf.infolist()/zf.namelist(), and any existing extraction logic
around dest_dir are hardened against ../ escapes.

---

Outside diff comments:
In `@tools/gate-check.py`:
- Around line 302-363: The CLI contains business logic in main() and
build_parser() (calling check_phase_1a/check_phase_1b/check_phase_1c and
gate_phase_1..gate_phase_6) instead of delegating to package code; refactor by
moving all gate_* and check_phase_* implementations into the package (e.g.,
package.gates or package.gate_check module) and replace the body of main() in
tools/gate-check.py with a thin wrapper that builds/forwards parsed args to a
single package entrypoint (for example import package.gate_check and call
package.gate_check.run_from_cli(args) or package.gate_check.main(args)); ensure
the CLI only handles argument parsing (build_parser) and error/exit handling
while all gate/business logic lives in the package module and is imported/used
by the wrapper.

---

Nitpick comments:
In `@tools/phases/phase_1_gates.py`:
- Around line 127-130: The broad except Exception around YAML loading should be
narrowed: replace the generic except block that catches errors from plan =
yaml.safe_load(plan_path.read_text(...)) (and the similar block at lines
196-199) with a specific exception tuple such as (yaml.YAMLError, OSError,
UnicodeDecodeError) to only handle YAML parse/read errors; keep the same
handling body (_emit(console, "fail", ...); return 1) and import yaml if not
already present. Ensure both occurrences (the plan YAML load and the other YAML
load at the referenced block) are updated.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f4059406-db16-428d-a294-87f1bf4b8704

📥 Commits

Reviewing files that changed from the base of the PR and between d89971c and 41204f9.

📒 Files selected for processing (11)

.gitignore
Makefile
codecome.yml
tools/codecome/cli.py
tools/codecome/phase_1.py
tools/codeql.py
tools/codeql/__init__.py
tools/codeql/config.py
tools/codeql/install.py
tools/gate-check.py
tools/phases/phase_1_gates.py

✅ Files skipped from review due to trivial changes (1)

tools/codeql/init.py

🚧 Files skipped from review as they are similar to previous changes (2)

tools/codecome/cli.py
Makefile

Copilot

Pull request overview

This PR lays groundwork for CodeQL integration by adding a managed CodeQL CLI installer/config layer and introducing Phase 1 subphase gates/orchestration, alongside Makefile/CI wiring to standardize running everything through the harness.

Changes:

Added Phase 1a/1b/1c gate implementations and refactored gate-check.py into a thin CLI wrapper.
Introduced CodeQL tooling scaffold (tools/codeql.py, tools/codeql/config.py, tools/codeql/install.py) plus make init to set up venv + (optionally) install CodeQL.
Updated workflow entrypoints (Makefile, CI) and added tests for the new gates/config/install helpers.

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
tools/phases/phase_1_gates.py	Implements reusable gates for Phase 1 subphases (1a/1b/1c).
tools/phases/gates.py	Centralizes existing phase gates and adds subphase dispatch.
tools/gate-check.py	Slims CLI wrapper to call the gates module; adds subphase support.
tools/codeql/install.py	Adds managed CodeQL CLI download/extract/install + “current” symlink.
tools/codeql/config.py	Adds env/YAML/default resolution for CodeQL settings.
tools/codeql.py	Adds `install`/`check` wrapper CLI for CodeQL toolchain.
tools/codecome/phase_1.py	Adds Phase 1 subphase orchestration and CodeQL placeholder step.
tools/codecome/harness.py	Adds special-case harness path for Phase 1 orchestration/server lifecycle.
tools/codecome/cli.py	Adjusts required args so Phase 1 doesn’t require `prompt_file`.
Makefile	Renames `venv` → `init` (keeps alias), removes wrapper branching, adds `opencode-raw`.
codecome.yml	Adds CodeQL configuration block under `audit.static_analysis.codeql`.
templates/codeql-plan.yml	Adds CodeQL plan template used by Phase 1a output.
prompts/phase-1a-profile.md	Adds Phase 1a prompt.
prompts/phase-1b-codeql-recon.md	Adds Phase 1b prompt.
prompts/phase-1c-sandbox.md	Adds Phase 1c prompt.
.github/workflows/tests.yml	Runs `make init` during CI.
.gitignore	Ignores `.tools/` and `.cache/codeql/` for managed install/cache.
tests/test_gate_check.py	Updates tests to target `tools/phases/gates.py` implementation.
tests/test_codeql_install.py	Adds unit tests for zip extraction and binary path resolution.
tests/test_codeql_config.py	Adds unit tests for config resolution / env handling.
.project/codeql-integration-plan.md	Adds/updates the CodeQL integration plan document.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tools/codeql/install.py (1)
233-285: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Recompute binary_path after extraction to avoid a stale legacy path.

binary_path is resolved at Line 233 from the pre-extraction state of version_dir. On a force reinstall over an older nested install, _codeql_binary returns the legacy version_dir/codeql/codeql. After the rmtree + _extract writes the new flattened layout (version_dir/codeql), _verify(binary_path) still checks the now-missing legacy path and reports failure even though the install succeeded. Re-derive it after _extract.
🐛 Proposed fix
         _extract(zip_path, version_dir)
     finally:
         shutil.rmtree(tmp_dir, ignore_errors=True)
 
     # --- Create current symlink ---
     _ensure_symlink(version_dir, current_link)
 
     # --- Verify ---
+    binary_path = _codeql_binary(version_dir)
     return _verify(binary_path)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codeql/install.py` around lines 233 - 285, binary_path is computed from
version_dir before extraction which can point to a legacy nested path; after
removing the old install and calling _extract(version_dir) you must re-compute
binary_path by calling _codeql_binary(version_dir) again before calling
_verify(binary_path). Update the flow in the install routine to call
_codeql_binary(version_dir) (reassigning binary_path) after _extract(...) and
before _ensure_symlink(...)/_verify(...) so verification checks the newly
extracted layout; keep existing calls to _ensure_symlink and _verify but pass
the re-derived binary_path.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@tools/codeql/install.py`:
- Around line 233-285: binary_path is computed from version_dir before
extraction which can point to a legacy nested path; after removing the old
install and calling _extract(version_dir) you must re-compute binary_path by
calling _codeql_binary(version_dir) again before calling _verify(binary_path).
Update the flow in the install routine to call _codeql_binary(version_dir)
(reassigning binary_path) after _extract(...) and before
_ensure_symlink(...)/_verify(...) so verification checks the newly extracted
layout; keep existing calls to _ensure_symlink and _verify but pass the
re-derived binary_path.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 271b2d06-4ca9-4e5d-ae3d-ae15dfdce5c4

📥 Commits

Reviewing files that changed from the base of the PR and between 39b8691 and c4a220b.

📒 Files selected for processing (12)

.github/workflows/tests.yml
templates/codeql-plan.yml
tests/test_codeql_config.py
tests/test_codeql_install.py
tests/test_gate_check.py
tools/codecome/phase_1.py
tools/codeql.py
tools/codeql/config.py
tools/codeql/install.py
tools/gate-check.py
tools/phases/gates.py
tools/phases/phase_1_gates.py

🚧 Files skipped from review as they are similar to previous changes (5)

templates/codeql-plan.yml
tools/phases/phase_1_gates.py
tools/codeql/config.py
tools/codeql.py
tools/codecome/phase_1.py

coderabbitai

🧹 Nitpick comments (1)

tools/codeql/packs.py (1)
128-137: 💤 Low value

allow_precreate defaults to true for any profile not listed in candidate_policy.

Both the no-policy and malformed-policy paths fall back to True, so a new source added to packs but forgotten in candidate_policy will auto-allow precreation. For a security-gating policy this is fail-open. If that default is intentional (the catalog already enumerates most sources as true), this is fine — otherwise consider defaulting to False for unknown profiles.

Please confirm the intended default for profiles omitted from candidate_policy (fail-open true vs fail-closed false), since no downstream consumer exists yet in this PR to disambiguate intent.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codeql/packs.py` around lines 128 - 137, The allow_precreate function
currently treats missing or malformed candidate_policy entries as True
(fail-open); change it to fail-closed by returning False when the catalog lacks
a proper "candidate_policy" dict, when the profile_name key is missing or not a
dict, or when "allow_precreate" is absent or not a bool. Update
allow_precreate(profile_name: str, catalog: dict[str, Any]) so it explicitly
validates candidate_policy and policy types and only returns the stored boolean
when isinstance(value, bool), otherwise return False.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tools/codeql/packs.py`:
- Around line 128-137: The allow_precreate function currently treats missing or
malformed candidate_policy entries as True (fail-open); change it to fail-closed
by returning False when the catalog lacks a proper "candidate_policy" dict, when
the profile_name key is missing or not a dict, or when "allow_precreate" is
absent or not a bool. Update allow_precreate(profile_name: str, catalog:
dict[str, Any]) so it explicitly validates candidate_policy and policy types and
only returns the stored boolean when isinstance(value, bool), otherwise return
False.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 394ec9fc-29e2-49df-a56b-b9fd15ab39ce

📥 Commits

Reviewing files that changed from the base of the PR and between c4a220b and f1eff95.

📒 Files selected for processing (4)

templates/codeql-packs.yml
tests/test_codeql_packs.py
tools/codeql.py
tools/codeql/packs.py

…d opencode-raw target - Remove CODECOME_USE_WRAPPER branches from all phase targets; always use harness - Remove CHAT/WRAPPER_ARGS and CODECOME_OPENCODE_ENV_EXPORT dead vars - Add opencode-raw debug target for non-workflow opencode runs - Rename make venv -> make init with CodeQL install stub (kept venv alias) - Update help text, .PHONY declarations, and venv-check error messages

- Add prompts/phase-1a-profile.md (target profile + codeql plan) - Add prompts/phase-1b-codeql-recon.md (CodeQL-assisted recon notes) - Add prompts/phase-1c-sandbox.md (sandbox bootstrap, extracted from old prompt) - Add templates/codeql-plan.yml template for model to fill - Add codecome/phase_1.py subphase orchestration module (1a→1b→1c) - Add check_phase_1a/1b/1c gate functions to gate-check.py - Update harness.py: Phase 1 dispatches to phase_1 module with own server lifecycle - Update cli.py: --prompt-file not required for Phase 1 - Update Makefile: phase-1 target simplified

Also add 'make init' step to CI test job so managed CodeQL install is exercised on every run.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (3)

tools/codecome/phase_1.py (2)
84-91: 💤 Low value

findings_snapshot is threaded through but never updated — align docstring with behavior.

The docstring states the snapshot "is updated after the session completes," but _run_subphase returns findings_snapshot unchanged, and run_phase_1 discards the returned value (rc, _ = _run_subphase(...)), relying instead on the snapshot it computed at Line 395. As written the parameter and its return are inert. Either drop the threading or update the docstring to reflect the pass-through (future-use) intent so the contract isn't misleading.

Also applies to: 357-357
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codecome/phase_1.py` around lines 84 - 91, The docstring for
_run_subphase and the function signature claim findings_snapshot is "updated
after the session completes," but _run_subphase currently passes
findings_snapshot through unchanged and run_phase_1 ignores the returned
snapshot (rc, _ = _run_subphase(...)), so update the contract: either remove the
findings_snapshot parameter/return from _run_subphase and callers (e.g.,
run_phase_1) if it's unused, or implement the intended update behavior by having
_run_subphase compute and return an updated findings_snapshot that run_phase_1
uses; reference the symbols _run_subphase, run_phase_1, and findings_snapshot
and make the docstring consistent with whichever approach you choose.
228-233: 💤 Low value

Consider a timeout on the frontmatter validator subprocess.

subprocess.run([...check-frontmatter.py]) has no timeout, so a hung validator would block the consume/orchestration path indefinitely. A bounded timeout (with the timeout treated as a validation failure) keeps the retry loop resilient.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codecome/phase_1.py` around lines 228 - 233, The subprocess.run call
that assigns validation_result (the call to subprocess.run([...
"tools/check-frontmatter.py"])) needs a bounded timeout and timeout handling:
add a timeout parameter (e.g., timeout=30) to subprocess.run, wrap the call in a
try/except for subprocess.TimeoutExpired, and on timeout synthesize a
validation_result-like failure (non-zero return code and stderr/message
indicating timeout) so the existing retry/validation logic treats timeouts as
validation failures; keep the variable name validation_result and ensure any
caught TimeoutExpired does not block the orchestration loop.
tests/test_rendering_events.py (1)
102-109: 💤 Low value

Tests no longer assert the "skip" behavior they're named for.

These now confirm render(...) returns True, but don't verify that empty/whitespace/missing text produces no output — so a regression that returns True while emitting stray output would pass. TestReasoningEventRenderer.test_skips_empty_text (Line 132-135) pairs the True assertion with an empty-output check; consider doing the same here.
💡 Add output assertion
-    def test_skips_empty_text(self):
+    def test_skips_empty_text(self, capsys):
         r = TextEventRenderer(_ctx("plain"))
         assert r.render({"part": {"text": ""}}) is True
         assert r.render({"part": {"text": "   \n\t  "}}) is True
+        assert capsys.readouterr().out == ""
 
-    def test_skips_missing_text(self):
+    def test_skips_missing_text(self, capsys):
         r = TextEventRenderer(_ctx("plain"))
         assert r.render({"part": {}}) is True
+        assert capsys.readouterr().out == ""
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_rendering_events.py` around lines 102 - 109, The tests only assert
render(...) returns True but don't verify no output was produced; update
test_skips_empty_text and test_skips_missing_text to also assert the renderer
emitted no output (mirror TestReasoningEventRenderer.test_skips_empty_text).
After calling r.render(...) assert the renderer's output container is empty
(e.g., check r.output == "" or r.events == [] depending on this renderer's API)
so empty/whitespace/missing text truly produces no output.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/chat/app.py`:
- Line 422: The chat SSE events are being written twice because
tools/chat/app.py installs EventRecorder.record as the chat loop raw recorder
(self.chat_loop.set_raw_event_recorder(self.event_recorder.record)) and
tools/events/chat_loop.py/_consumer_worker also causes _chat_render_and_log to
call self.event_recorder.record(event); fix by preventing the duplicate write:
either stop installing the raw recorder in tools/chat/app.py or add a guard in
_chat_render_and_log to skip calling EventRecorder.record when raw recording is
enabled (detect via a flag or by checking whether the chat loop has a
raw_event_recorder set). Update EventRecorder or ChatLoop to expose a boolean
(e.g., raw_recording_enabled or raw_event_recorder attribute) and change
_chat_render_and_log to only call self.event_recorder.record(event) when that
flag is false (or when no raw recorder is installed), referencing
EventRecorder.record, chat_loop.set_raw_event_recorder,
tools/events/chat_loop.py::_consumer_worker, and _chat_render_and_log.

---

Nitpick comments:
In `@tests/test_rendering_events.py`:
- Around line 102-109: The tests only assert render(...) returns True but don't
verify no output was produced; update test_skips_empty_text and
test_skips_missing_text to also assert the renderer emitted no output (mirror
TestReasoningEventRenderer.test_skips_empty_text). After calling r.render(...)
assert the renderer's output container is empty (e.g., check r.output == "" or
r.events == [] depending on this renderer's API) so empty/whitespace/missing
text truly produces no output.

In `@tools/codecome/phase_1.py`:
- Around line 84-91: The docstring for _run_subphase and the function signature
claim findings_snapshot is "updated after the session completes," but
_run_subphase currently passes findings_snapshot through unchanged and
run_phase_1 ignores the returned snapshot (rc, _ = _run_subphase(...)), so
update the contract: either remove the findings_snapshot parameter/return from
_run_subphase and callers (e.g., run_phase_1) if it's unused, or implement the
intended update behavior by having _run_subphase compute and return an updated
findings_snapshot that run_phase_1 uses; reference the symbols _run_subphase,
run_phase_1, and findings_snapshot and make the docstring consistent with
whichever approach you choose.
- Around line 228-233: The subprocess.run call that assigns validation_result
(the call to subprocess.run([... "tools/check-frontmatter.py"])) needs a bounded
timeout and timeout handling: add a timeout parameter (e.g., timeout=30) to
subprocess.run, wrap the call in a try/except for subprocess.TimeoutExpired, and
on timeout synthesize a validation_result-like failure (non-zero return code and
stderr/message indicating timeout) so the existing retry/validation logic treats
timeouts as validation failures; keep the variable name validation_result and
ensure any caught TimeoutExpired does not block the orchestration loop.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: f71129f4-d3ed-4f73-a195-b0778e28497e

📥 Commits

Reviewing files that changed from the base of the PR and between f1eff95 and a01752a.

📒 Files selected for processing (10)

tests/test_codecome_runner.py
tests/test_events_loops.py
tests/test_rendering_events.py
tools/chat/app.py
tools/codecome/phase_1.py
tools/codecome/runner.py
tools/codecome/transcript.py
tools/events/chat_loop.py
tools/events/phase_loop.py
tools/rendering/events/text.py

✅ Files skipped from review due to trivial changes (1)

tools/codecome/transcript.py

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

Makefile (1)

108-109: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix stale phase references for sandbox bootstrap (1b → 1c).

The help section and SANDBOX_SCRIPT_HINT still reference Phase 1b, but sandbox bootstrap is now Phase 1c.

Proposed fix

-	`@printf` "  $(BOLD)$(CYAN)Sandbox bootstrap (Phase 1b):$(RESET)\n"
+	`@printf` "  $(BOLD)$(CYAN)Sandbox bootstrap (Phase 1c):$(RESET)\n"

-SANDBOX_SCRIPT_HINT := "No sandbox helper script found. Run 'make phase-1' (sub-stage 1b) to bootstrap sandbox/ from templates/sandboxes/, or place the helper script under sandbox/scripts/ manually."
+SANDBOX_SCRIPT_HINT := "No sandbox helper script found. Run 'make phase-1' (sub-stage 1c) to bootstrap sandbox/ from templates/sandboxes/, or place the helper script under sandbox/scripts/ manually."

Also applies to: 314-314

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` around lines 108 - 109, Help text and SANDBOX_SCRIPT_HINT still
call out "Phase 1b" even though sandbox bootstrap is now Phase 1c; update all
occurrences to "Phase 1c". Specifically, change the printf/echo in the Makefile
that prints "Sandbox bootstrap (Phase 1b)" and update the SANDBOX_SCRIPT_HINT
constant and any help section strings that mention "Phase 1b" so they
consistently read "Phase 1c" (search for SANDBOX_SCRIPT_HINT and the "Sandbox
bootstrap (Phase 1b)" literal to locate all places to edit).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.project/codeql-integration-plan.md:
- Around line 3-6: The Scope header in the plan (the "Scope:" line referencing
the branch `wip/codeql-integration-plan`) is stale because it says "planning
only; no implementation changes in this branch" while the PR now contains
implementation and test changes; update the "Scope:" header text to accurately
reflect the branch state (e.g., "planning + implementation and tests" or
similar) so the document matches the current PR contents and remove or replace
the "no implementation changes" phrase.

In `@prompts/phase-1c-sandbox.md`:
- Around line 63-64: The documentation is inconsistent about overwriting a
user-managed sandbox: the description near `apply` and `--force` (mentioning
movement to `sandbox/.backup-<timestamp>/` when the user accepts loss) conflicts
with the later guidance that says to halt; update both occurrences so they match
by explicitly documenting that `apply` will refuse to overwrite a user-managed
`sandbox/` unless the user re-runs with `--force`, and when `--force` is
provided the prior content is moved to `sandbox/.backup-<timestamp>/` (reference
`CODECOME-GENERATED.md`, `sandbox/`, `apply`, and `--force`), replacing the
halting wording at the later location to describe the forced-overwrite behavior.

In `@tests/test_codecome_runner.py`:
- Line 74: The test unpacks values from runner._run_single_attempt into
variables named res and path but never uses them; rename those two unpacked
variables to throwaway names (e.g., _res and _path or _ , _path) in the call to
runner._run_single_attempt so lint stops flagging unused variables while keeping
the test behavior unchanged.

In `@tools/codeql/install.py`:
- Around line 219-232: The code uses install_version (assigned to variable
version) directly to build filesystem paths (tools_dir / version → version_dir),
allowing path traversal; validate and sanitize install_version before using it:
in the install routine (where install_version, version, tools_dir, version_dir,
current_link and _fetch_latest_version are used) reject or normalize values
containing path separators or "..", permit only a safe pattern (e.g. semantic
versioning or a whitelist like /^[0-9A-Za-z\.-]+$/ or stricter semver regex),
and after constructing version_dir resolve it and assert that version_dir is a
subpath of tools_dir (e.g. compare resolved parents) before performing any
filesystem operations; on invalid input, log an error and return a non-zero exit
code instead of proceeding.

---

Outside diff comments:
In `@Makefile`:
- Around line 108-109: Help text and SANDBOX_SCRIPT_HINT still call out "Phase
1b" even though sandbox bootstrap is now Phase 1c; update all occurrences to
"Phase 1c". Specifically, change the printf/echo in the Makefile that prints
"Sandbox bootstrap (Phase 1b)" and update the SANDBOX_SCRIPT_HINT constant and
any help section strings that mention "Phase 1b" so they consistently read
"Phase 1c" (search for SANDBOX_SCRIPT_HINT and the "Sandbox bootstrap (Phase
1b)" literal to locate all places to edit).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 47a1ef1d-d256-419b-b3f2-f003c3ab207f

📥 Commits

Reviewing files that changed from the base of the PR and between a01752a and e900054.

📒 Files selected for processing (34)

.github/workflows/tests.yml
.gitignore
.project/codeql-integration-plan.md
Makefile
codecome.yml
prompts/phase-1a-profile.md
prompts/phase-1b-codeql-recon.md
prompts/phase-1c-sandbox.md
templates/codeql-packs.yml
templates/codeql-plan.yml
tests/test_codecome_runner.py
tests/test_codeql_config.py
tests/test_codeql_install.py
tests/test_codeql_packs.py
tests/test_events_loops.py
tests/test_gate_check.py
tests/test_rendering_events.py
tools/chat/app.py
tools/codecome/cli.py
tools/codecome/harness.py
tools/codecome/phase_1.py
tools/codecome/runner.py
tools/codecome/transcript.py
tools/codeql.py
tools/codeql/__init__.py
tools/codeql/config.py
tools/codeql/install.py
tools/codeql/packs.py
tools/events/chat_loop.py
tools/events/phase_loop.py
tools/gate-check.py
tools/phases/gates.py
tools/phases/phase_1_gates.py
tools/rendering/events/text.py

✅ Files skipped from review due to trivial changes (4)

templates/codeql-packs.yml
.gitignore
tools/codeql/init.py
templates/codeql-plan.yml

🚧 Files skipped from review as they are similar to previous changes (22)

codecome.yml
tools/chat/app.py
.github/workflows/tests.yml
tools/events/phase_loop.py
tools/codecome/transcript.py
tools/rendering/events/text.py
tools/events/chat_loop.py
tests/test_rendering_events.py
tools/codecome/harness.py
tests/test_events_loops.py
tests/test_codeql_install.py
tools/codecome/cli.py
tools/codecome/runner.py
tests/test_codeql_config.py
tools/codecome/phase_1.py
tests/test_gate_check.py
tools/phases/gates.py
tools/codeql/packs.py
tools/gate-check.py
tests/test_codeql_packs.py
tools/codeql/config.py
tools/phases/phase_1_gates.py

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

tests/test_chat_app.py (1)
39-72: ⚡ Quick win

Add a regression test for the raw recorder wiring.

This now only exercises the pure render helper. Since chat persistence moved to ChatEventLoop.set_raw_event_recorder(...), the suite no longer covers the "record each SSE event exactly once" contract. A focused test around the callback wired in on_mount() would catch both missing writes and a reintroduction of the earlier double-write bug.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_chat_app.py` around lines 39 - 72, The test only exercises the
render helper and misses verifying that SSE events are recorded exactly once;
add a focused regression test that mounts ChatEventLoop (or simulates
on_mount()) and asserts the callback wired via
ChatEventLoop.set_raw_event_recorder(...) writes each raw event exactly once to
the raw recorder (and does not double-write), by installing a fake/raw recorder
spy, invoking the mounted callback with a sample SSE event, and asserting the
spy received a single write per event and the existing render behavior still
occurs.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/findings/checks_entry.py`:
- Around line 23-67: Add an optional ctx: Optional[FindingsContext] = None
parameter to run_frontmatter_validation and default it to
FindingsContext.default(); then pass ctx through to all helper calls that
resolve paths (replace direct globals/paths used in run_frontmatter_validation
with ctx-backed equivalents) — specifically use ctx when calling
iter_all_finding_files(ctx), validate_file_risk_index(ctx), and
validate_finding(path, ctx) and use ctx-provided
FILE_RISK_INDEX_PATH/FILE_RISK_INDEX_REL/ROOT equivalents instead of module
globals so callers can inject a temporary FindingsContext for tests and reuse.
- Around line 47-50: The branch that handles "if not paths" returns early and
skips appending the final summary/footer, so when FILE_RISK_INDEX_PATH exists
but has issues you never add the "Found N frontmatter error(s)." line; update
the logic in checks_entry.py so the "if not paths" branch does not return
directly but instead performs the same final-summary generation used by the
normal path (use the same footer construction that references total_errors) and
then return the tuple (0 if total_errors == 0 else 1, out.getvalue());
specifically modify the early-return in the "if not paths" block around
FILE_RISK_INDEX_PATH to fall through to the shared exit path that appends the
footer rather than returning immediately.

---

Nitpick comments:
In `@tests/test_chat_app.py`:
- Around line 39-72: The test only exercises the render helper and misses
verifying that SSE events are recorded exactly once; add a focused regression
test that mounts ChatEventLoop (or simulates on_mount()) and asserts the
callback wired via ChatEventLoop.set_raw_event_recorder(...) writes each raw
event exactly once to the raw recorder (and does not double-write), by
installing a fake/raw recorder spy, invoking the mounted callback with a sample
SSE event, and asserting the spy received a single write per event and the
existing render behavior still occurs.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: a21e2c57-7177-4db3-abf1-3b477cb07c16

📥 Commits

Reviewing files that changed from the base of the PR and between e900054 and bad575a.

📒 Files selected for processing (7)

tests/test_chat_app.py
tests/test_rendering_events.py
tools/chat/app.py
tools/codecome/harness.py
tools/codecome/phase_1.py
tools/codeql/install.py
tools/findings/checks_entry.py

🚧 Files skipped from review as they are similar to previous changes (3)

tests/test_rendering_events.py
tools/codecome/harness.py
tools/codeql/install.py

…events + add PatchRenderer - runner._run_single_attempt: add phase_override/label_override params to thread subphase ids (1a/1b/1c) into the render context instead of the CLI-level phase number (1). - phase_1._run_subphase: pass phase_override=phase_id, label_override=label at the _run_single_attempt call site. - state_tracker: add explicit 'patch' branch that normalizes patch events (hash + files) from OpenCode. Normalize the unknown pass-through so UnknownEventRenderer receives a top-level 'part' key instead of <missing>. - rendering/events/patch.py: new PatchRenderer (mirrors ApplyPatchRenderer panel style) showing hash (8 chars), file count, and relative file list (truncated by CODECOME_APPLY_PATCH_MAX_FILES). Re-reads affected files in SnapshotCache so subsequent Write/Edit renderers see correct diffs. - Register PatchRenderer in dispatch.py and events/__init__.py.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

tools/codeql/pipeline.py (1)

79-93: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Persist the final manifest after warning mutations.

run-manifest.yml is written before normalization/risk-import warnings are appended, so the on-disk manifest can be stale relative to returned data and codeql-summary.md.

Suggested fix

-    # Step 5: write summary
+    # Persist final manifest after post-processing warnings
+    write_manifest(manifest, output_dir)
+    _progress(progress, "CodeQL: manifest updated")
+
+    # Step 5: write summary
     write_summary(manifest, normalized_dir, output_dir)
     _progress(progress, "CodeQL: summary written")

Also applies to: 96-99

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/codeql/pipeline.py` around lines 79 - 93, Manifest is mutated with
appended warnings after normalization and risk import (see
manifest.setdefault(...) inside the try/except around normalization and the
import_risk(signals_path, risk_path) block), but the on-disk run-manifest.yml is
written earlier and becomes stale; after any code path that mutates manifest
(both the SARIF normalization exception handling and the Risk import exception
handling, as well as the similar block at lines 96-99), persist/overwrite the
run-manifest.yml on disk again using the same manifest-writing routine so the
file reflects the updated manifest returned by the function.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/codecome.py`:
- Around line 353-450: The check_codeql_status function (and its dependent logic
calling resolve_config, check_artifacts, load_codeql_plan and helper
_phase_1_notes_exist) implements domain logic that should be moved out of the
top-level CLI; extract the entire implementation into a package module (e.g., a
new function in tools.codecome.checks such as check_codeql_status or
check_codeql_status_impl) and keep the top-level check_codeql_status in the
script as a one-line wrapper that imports and calls the package function; ensure
the new module imports codeql.config.resolve_config,
codeql.artifacts.check_artifacts, codeql.packs.load_codeql_plan and preserves
behavior/return value and logging via C/ROOT, then update the script to only
perform the thin CLI wiring (import and call) so all domain logic lives in the
package module.

In `@tools/codeql/capabilities.py`:
- Around line 21-23: supported_build_modes currently returns the original set
value from BUILD_MODES_BY_LANGUAGE which exposes mutable global state; change
the return to a copy (e.g. return set(BUILD_MODES_BY_LANGUAGE.get(language_id,
set()))) or return an immutable
frozenset(BUILD_MODES_BY_LANGUAGE.get(language_id, set())) and update the
function signature accordingly so callers cannot mutate the module-level set.

---

Outside diff comments:
In `@tools/codeql/pipeline.py`:
- Around line 79-93: Manifest is mutated with appended warnings after
normalization and risk import (see manifest.setdefault(...) inside the
try/except around normalization and the import_risk(signals_path, risk_path)
block), but the on-disk run-manifest.yml is written earlier and becomes stale;
after any code path that mutates manifest (both the SARIF normalization
exception handling and the Risk import exception handling, as well as the
similar block at lines 96-99), persist/overwrite the run-manifest.yml on disk
again using the same manifest-writing routine so the file reflects the updated
manifest returned by the function.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 861ea28d-2ccf-4457-aaea-5b08b00f4e98

📥 Commits

Reviewing files that changed from the base of the PR and between 3538558 and 523e8c2.

📒 Files selected for processing (20)

Makefile
prompts/phase-1a-profile.md
templates/codeql-plan.yml
tests/test_codecome_check_codeql.py
tests/test_codecome_runner.py
tests/test_codeql_normalize.py
tests/test_codeql_packs.py
tests/test_codeql_pipeline.py
tests/test_codeql_runner.py
tests/test_phase_1_gates.py
tools/codecome.py
tools/codecome/phase_1.py
tools/codeql.py
tools/codeql/capabilities.py
tools/codeql/config.py
tools/codeql/normalize.py
tools/codeql/packs.py
tools/codeql/pipeline.py
tools/codeql/runner.py
tools/phases/phase_1_gates.py

✅ Files skipped from review due to trivial changes (2)

tests/test_phase_1_gates.py
prompts/phase-1a-profile.md

🚧 Files skipped from review as they are similar to previous changes (10)

tests/test_codeql_normalize.py
tests/test_codecome_runner.py
tests/test_codeql_packs.py
tools/codecome/phase_1.py
tools/codeql/config.py
tools/codeql/runner.py
tests/test_codeql_runner.py
tools/codeql/normalize.py
tools/codeql.py
Makefile

…estimates

…heck for empty languages

- runner.py: wire exclude_patterns into codeql database create via --codescanning-config (silent data loss bug) - pipeline.py: normalize SARIF for soft-failed runs too - normalize.py: use file-level running counter for monotonic alert IDs across multi-run SARIF files - pipeline.py: re-write manifest after post-processing warnings so on-disk manifest stays current - capabilities.py: return set() copy to prevent mutation of global BUILD_MODES_BY_LANGUAGE - docs: update stale scope headers in codeql-integration-plan.md and codeql-plan.yml

pruiz

Addressed review comments in the latest commit:

Comment	Fix
`runner.py` — exclude_patterns silently ignored	Wire into `codeql database create` via `--codescanning-config` temp YAML
`pipeline.py` — SARIF not normalized for soft-fail runs	Accept `status in ("completed", "soft-failed")`
`normalize.py` — alert IDs duplicate across multi-run SARIF	Use `start_index + len(alerts) + 1` file-level counter
`pipeline.py` — manifest stale after warning mutations	Re-write manifest after risk import step
`capabilities.py` — returns mutable global state	Return `set()` copy from `supported_build_modes()`
`codeql-integration-plan.md` — stale scope header	Updated to "Implementation complete"
`codeql-plan.yml` — misleading comment	Changed to "Consumed by CodeQL run orchestration"

~20 other comments were already addressed in 5d92fc6 or were false positives (already guarded in current code). All threads now resolved.

Phase 1a gate now checks fail_policy: unsupported languages emit a warning with soft policy (continue), fail with hard policy. Plan resolver (resolve_plan_packs) accepts skip_unsupported flag; run_codeql passes it when fail_policy is soft, so supported languages still run even when some are unsupported. PackResolverError catch now uses _tool_failure_status instead of hardcoded 'failed', producing 'soft-failed' with soft policy. _check_codeql_artifacts second status==failed block now guarded by fail_policy==hard; soft policy prints 'continuing'. 8 regression tests cover all four branches.

_run_codeql always returned 0 after fail_policy enforcement was moved to _check_codeql_artifacts. Change return type to None and remove the two unreachable 'if rc != 0: return rc' guards that depended on it.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_codeql_runner.py`:
- Around line 254-257: The test unpacks two values from _create_database into
ok, msg but never uses msg; change the call to ignore the unused second value to
silence linter warnings—e.g., call _create_database and unpack into ok, _ (or
assign ok = _create_database(...)[0]) in tests/test_codeql_runner.py so only the
used symbol ok remains referenced while still calling _create_database with the
same arguments.

In `@tools/codecome.py`:
- Around line 433-439: The import block inside check_codeql_status() currently
catches all exceptions and returns 0, masking runtime errors; change the except
clause to only catch ImportError (e.g., "except ImportError as exc") so missing
CodeQL packages are handled gracefully while letting other exceptions bubble up
(or be handled as failures), and keep the existing logging (print(C.warn(...)))
but only for ImportError so unexpected initialization errors don't report the
check as healthy.

In `@tools/codeql/pipeline.py`:
- Around line 84-95: The import of file-signals.yml can pull in stale data if
normalization failed; guard the import by tracking whether normalization
actually succeeded and only call import_risk when that flag is true. Concretely,
set a boolean (e.g., normalization_succeeded) to True at the end of the
normalization step (or False on exception) and change the block that computes
signals_path and calls import_risk(import_risk, signals_path, risk_path,
_progress) to be conditional on that flag; update manifest warnings as before
when skipping import due to failed normalization.

In `@tools/codeql/runner.py`:
- Around line 198-204: The check for build_mode/flags silently ignores an
explicit "manual" setting when build_command is missing; update the logic around
the build_mode, build_command and cmd assembly (the block handling build_mode ==
"none"/"manual"/"autobuild") to fail fast: when build_mode == "manual" and
build_command is falsy, raise a clear exception (e.g., ValueError) or log an
error and exit instead of falling through, otherwise continue to append
["--build-mode=manual", "-c", build_command]; keep the existing branches for
"none" and "autobuild" unchanged and reference the variables build_mode,
build_command and cmd to locate the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 96283bf0-cc8c-4411-a786-47fa975070e3

📥 Commits

Reviewing files that changed from the base of the PR and between 3538558 and 13d29ce.

📒 Files selected for processing (26)

.project/codeql-integration-plan.md
Makefile
prompts/phase-1-codeql-repair.md
prompts/phase-1a-profile.md
templates/codeql-plan.yml
tests/test_codecome_check_codeql.py
tests/test_codecome_runner.py
tests/test_codeql_artifacts.py
tests/test_codeql_normalize.py
tests/test_codeql_packs.py
tests/test_codeql_pipeline.py
tests/test_codeql_runner.py
tests/test_phase_1_codeql_plan_repair.py
tests/test_phase_1_gates.py
tools/codecome.py
tools/codecome/phase_1.py
tools/codeql.py
tools/codeql/artifacts.py
tools/codeql/capabilities.py
tools/codeql/config.py
tools/codeql/normalize.py
tools/codeql/packs.py
tools/codeql/pipeline.py
tools/codeql/runner.py
tools/phases/completion.py
tools/phases/phase_1_gates.py

✅ Files skipped from review due to trivial changes (2)

templates/codeql-plan.yml
prompts/phase-1a-profile.md

🚧 Files skipped from review as they are similar to previous changes (8)

tools/codeql/artifacts.py
tests/test_codeql_artifacts.py
tests/test_codeql_normalize.py
tests/test_codecome_runner.py
tools/codeql.py
.project/codeql-integration-plan.md
Makefile
tools/codeql/normalize.py

…k, strengthen CodeQL integration - Rename venv-check -> env-check; add CodeQL binary presence gate with .tools/codeql/.disabled marker managed by make init. - Add sandbox status section to 'make check' (state, gate, capabilities) using importlib to access sandbox-bootstrap module. - Add build_command shape validation rejecting shell operators, multiline, comments, and bash -c in CodeQL build commands. - Add _ensure_query_packs_available() for pack preflight/download with policy-aware failure handling (official=fail, optional=skip under soft). - Guard against false 'completed' when all profiles are skipped. - Update --no-sarif-add-query-help to --sarif-include-query-help=never. - Add AGENTS.md rule 12: no subprocess for internal module communication.

… test

… hallucination

All CodeQL subprocess calls (database create, database analyze, resolve queries, pack download, and the check CLI) now pass --common-caches=<workspace>/.cache/codeql so that CodeQL stores downloaded packs in the workspace-local cache instead of the global ~/.codeql/packages/ cache. This prevents a corrupted global cache entry (e.g. ~/.codeql/packages/codeql/python-queries/1.8.3) from breaking database creation in any workspace that shares the same user profile. Changes: - runner.py: add cache_dir param to _create_database and _run_analyze; all resolve/pack-download calls now go through _codeql_pack_cmd() which injects --common-caches before the '--' separator - tools/codeql.py: _cmd_check resolve qlpacks also uses the workspace cache - test_codeql_runner.py: 4 new tests covering cache propagation in db create, analyze, resolve queries, and lazy pack download

- Narrow except Exception to ImportError in check_codeql_status() - Guard stale risk signal import with normalized_ok flag in pipeline - Fail fast on manual build_mode without build_command in runner - Align conflicting overwrite policy wording in phase-1c-sandbox prompt - Add TODO markers for deferred refactors (#30, #31) - Update README: add CodeQL integration section, make init, phase-1 subphases

coderabbitai Bot reviewed May 29, 2026

View reviewed changes

Comment thread Makefile

coderabbitai Bot reviewed May 29, 2026

View reviewed changes

Comment thread tools/codecome/cli.py

Comment thread tools/codecome/phase_1.py

Comment thread tools/codecome/phase_1.py Outdated

Comment thread tools/gate-check.py Outdated

pruiz changed the title ~~WIP: CodeQL integration — PR 1: Harness simplification + init rename~~ WIP: CodeQL integration — Phase 1 subphases + harness cleanup + CodeQL install scaffold May 29, 2026

coderabbitai Bot reviewed May 29, 2026

View reviewed changes

Comment thread codecome.yml

Comment thread tools/codeql/config.py

Comment thread tools/codeql/config.py

Comment thread tools/codeql/install.py Outdated

Comment thread tools/codeql/install.py Outdated

pruiz requested a review from Copilot May 30, 2026 15:06

Copilot started reviewing on behalf of pruiz May 30, 2026 15:06 View session

Copilot AI reviewed May 30, 2026

View reviewed changes

Comment thread tools/codeql/install.py

Comment thread tools/codeql.py

Comment thread tools/codecome/phase_1.py Outdated

Comment thread .github/workflows/tests.yml Outdated

Comment thread tests/test_gate_check.py

Comment thread templates/codeql-plan.yml

Comment thread tools/codecome/harness.py

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

pruiz added 12 commits May 30, 2026 18:29

docs: add CodeQL integration implementation plan

e4ff714

feat: add CodeQL install scaffold

00a113a

refactor: flatten CodeQL install layout and add verbose init output

2a22ad8

fix: use shutil.move in _extract to correctly flatten CodeQL bundle

2b133ad

Also add 'make init' step to CI test job so managed CodeQL install is exercised on every run.

fix: extract CodeQL bundle without post-processing

e2caf7a

refactor: move gate checks into phase modules

a9d5d42

fix: authenticate CodeQL release API requests

b9c55cf

refactor: keep one canonical CodeQL config path

221bc5d

fix: tighten CodeQL follow-up checks and CI

b24099f

feat: add CodeQL pack catalog resolver

396e6fe

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

Comment thread tools/chat/app.py

fix: record raw transcripts and split phase-1 logs

e900054

pruiz force-pushed the wip/codeql-integration-plan branch from a01752a to e900054 Compare May 30, 2026 16:39

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

Comment thread .project/codeql-integration-plan.md Outdated

Comment thread prompts/phase-1c-sandbox.md

Comment thread tests/test_codecome_runner.py Outdated

Comment thread tools/codeql/install.py

fix: review bot follow-up cleanups

bad575a

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

Comment thread tools/findings/checks_entry.py Outdated

Comment thread tools/findings/checks_entry.py Outdated

pruiz added 2 commits May 31, 2026 00:31

feat: harden CodeQL phase integration

9ea504e

fix: mark unused runner test session id

523e8c2

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

Comment thread tools/codecome.py

Comment thread tools/codeql/capabilities.py Outdated

pruiz added 5 commits May 31, 2026 03:20

feat: stream CodeQL stderr live, add configurable timeouts with plan …

11e9674

…estimates

fix: return skipped when no CodeQL languages resolve, skip artifact c…

34e6615

…heck for empty languages

fix: add Recording tools header separator in make check output

549de46

feat: add Phase progress summary to make check output

6564ab5

pruiz commented May 31, 2026

View reviewed changes

pruiz added 10 commits May 31, 2026 04:46

fix: add CodeQL build repair retry

a254751

fix: address CodeQL review findings

4267c57

fix: skip non-CodeQL analysis units

58ab5a6

fix: auto-repair invalid CodeQL plans

861e113

fix: retry CodeQL manual build repairs

815e6da

fix: remove duplicate CodeQL hard-fail branch

68ffd9e

fix(phase-1): correctly handle codeql repair loop and hard-fail gate

67c0834

refactor: make _run_codeql return None, remove dead rc checks

e24502b

_run_codeql always returned 0 after fail_policy enforcement was moved to _check_codeql_artifacts. Change return type to None and remove the two unreachable 'if rc != 0: return rc' guards that depended on it.

fix: let soft CodeQL repair failures continue

13d29ce

coderabbitai Bot reviewed May 31, 2026

View reviewed changes

Comment thread tests/test_codeql_runner.py

Comment thread tools/codecome.py

Comment thread tools/codeql/pipeline.py Outdated

Comment thread tools/codeql/runner.py

pruiz added 4 commits May 31, 2026 22:32

fix(codeql): prevent early abort on soft policy and rename misleading…

4357645

… test

feat(codeql): expose check-codeql-plan in cli to fix agent validation…

7027bf4

… hallucination

This was referenced May 31, 2026

Refactor: move CodeQL check logic to tools/codecome/checks.py #30

Open

Refactor: dedup event recording in chat TUI #31

Open

pruiz added 4 commits May 31, 2026 23:48

fix: report fresh sandbox as pending

d2caf0d

fix(phase-1): accept completed subphase artifacts without finish event

c40491c

fix(phase-1): preserve retry transcripts and guard resume

7d22780

Conversation

pruiz commented May 29, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodeQL Integration — groundwork PR

Changes in this PR

Phase 1 subphase orchestration

Not in this PR

Context

Remaining PRs in sequence

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Uh oh!

github-actions Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Coverage Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

pruiz left a comment

Choose a reason for hiding this comment

Uh oh!

pruiz commented May 29, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 29, 2026 •

edited

Loading

github-actions Bot commented May 29, 2026 •

edited

Loading