Improved ci prompt testing suite #270
Conversation
…container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions
…obust readiness wait; use game-ci @v2 actions
…atterns, keep return-license @v2 for linter
…l_args array for -manualLicenseFile
…undant EBL artifact copy; drop job-level if and unused UNITY_VERSION
… longer timeout); make license return non-blocking
… + host socket probe
…e); workflow_dispatch trigger + ASCII step names
…nnel startup; only match true activation/return failures
- Increase timeout from 600s to 900s for Unity startup
- Add 'bound' to readiness pattern to catch more bridge signals
- Refine error detection to focus only on license failures
- Remove non-license error patterns that could cause false failures
- Improve error reporting with descriptive messages
- Fix regex escaping for unity port parsing
- Fix case sensitivity in sed commands
- Add project warm-up step to pre-import Library before bridge startup
- Expand license mounts to capture full Unity config and local-share directories
- Update bridge container to use expanded directory mounts instead of narrow license paths
- Provide ULF licenses in both legacy and standard local-share paths
- Improve EBL activation to capture complete Unity authentication context
- Update verification logic to check full config directories for entitlements

These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability.
- Make EBL verification conditional on ULF presence to allow ULF-only runs
- Remove read-only mounts from warm-up container for Unity user directories
- Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only)
- Keep return-license action at v2 (latest available version)

These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps.
…nsing-readiness-flepwk Improve Unity bridge wait-gate resilience
…-in-reports Normalize NL/T JUnit names and robust summary
- Add detailed XML format requirements with exact specifications
- Emphasize NO prologue, epilogue, code fences, or extra characters
- Add specific instructions for T-D and T-J tests to write fragments immediately
- Include exact XML template and TESTID requirements
- Should fix T-D and T-J test failures in CI by ensuring proper fragment format
- Replace unsafe regex substitution that could create malformed names
- New approach: preserve correctly formatted names, extract titles safely
- Prevents edge cases where double processing could corrupt test names
- Uses proper em dash (—) separator consistently
- More robust handling of various input formats
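The safer canonicalization described above can be sketched as follows. This is a minimal illustration, not the suite's actual script: the `ID_RE` pattern and `canonicalize` helper are assumptions based on the test IDs and separator mentioned in this PR.

```python
import re

# Test ids used by the suite (NL-0..NL-4, T-A..T-J), per this PR's commit messages.
ID_RE = re.compile(r"^(NL-\d+|T-[A-J])")

def canonicalize(name: str, title: str) -> str:
    """Return 'ID — Title'; leave already well-formed names untouched to avoid double processing."""
    m = ID_RE.match(name)
    if m and " — " in name:
        return name  # already canonical; do not rewrite
    test_id = m.group(1) if m else name
    return f"{test_id} — {title}"

assert canonicalize("T-A — Anchor insert", "ignored") == "T-A — Anchor insert"
assert canonicalize("T-B", "Validation level") == "T-B — Validation level"
```

The key design choice is the early return for names that already match the canonical shape, which is what prevents the corruption that a blind regex substitution could cause.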
…ingle-testcase guard; tighten prompt emissions; disallow Bash
…ct-file-id fix: keep file ID when canonicalizing test names
…eep placeholder at original site
…ir; prompt: standardize T-B validation to level=standard
…(incl. failures), add T-F..T-J XML templates
…ep canonicalization + backfill
…ports/) to support multi-edit reporting
…ports/) to support multi-edit reporting
…aceholder detection
…egacy payload shapes
…st emissions, staging, guard)
…present; dedupe promotion step
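Several of the commits above harden the bridge readiness wait (longer timeout, an extra "bound" marker). The underlying pattern is polling a log for marker regexes under a deadline; a minimal sketch, with illustrative function and marker names:

```python
import re
import time

def wait_for_marker(read_log, patterns, timeout_s=900, poll_s=0.1):
    """Poll read_log() until any regex in patterns matches, or raise TimeoutError."""
    compiled = [re.compile(p) for p in patterns]
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        text = read_log()
        if any(rx.search(text) for rx in compiled):
            return True
        time.sleep(poll_s)
    raise TimeoutError(f"no readiness marker within {timeout_s}s")

# e.g. wait_for_marker(lambda: open("bridge.log").read(), [r"listening", r"bound"])
```

Matching several alternative markers (rather than one exact string) is what makes the wait resilient to wording differences between bridge versions.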
Walkthrough

Removes legacy NL mini and full NL/T prompts; adds separate NL and T suite prompts, a permissions config, pruning and validation scripts, and a large CI workflow rewrite (licensing, two-pass NL→T execution, staging/backfill, canonicalization, reporting). Updates server tools (read_resource, find_in_file, read_console, manage_script) and adds a C# delimiter-balance guard. New tests cover these behaviors.

Changes
Sequence Diagram(s)

sequenceDiagram
autonumber
participant GH as GitHub Actions
participant Unity as Unity Container
participant Bridge as Unity MCP Bridge
participant Claude as Claude Runner
participant Repo as Repository (reports)
rect rgb(235,245,255)
note over GH: Licensing decision (ULF vs EBL)
GH->>GH: Decide license source
alt ULF available
GH->>Unity: Stage .ulf and start
else EBL path
GH->>Unity: Login (serial/user+pw), write entitlements
end
GH->>Unity: Warm up Unity environment
end
rect rgb(245,235,255)
note over GH,Bridge: Start and wait for bridge
GH->>Bridge: Start bridge
GH-->>Bridge: Wait on port/log markers (timeout-aware)
Bridge-->>GH: Ready status
end
rect rgb(235,255,235)
note over Claude: Two-pass NL then T
GH->>Claude: Run NL pass (nl-unity-suite-nl.md)
Claude->>Bridge: Use AllowedTools per .claude/settings.json
Claude-->>Repo: Emit per-test NL-# XML fragments (reports/)
GH->>Claude: Run T pass (nl-unity-suite-t.md)
Claude->>Bridge: Apply additive edits, validate, emit per-test T-# fragments
end
rect rgb(255,245,235)
note over GH: Reporting and staging
GH->>GH: Backfill missing fragments / canonicalize names
GH->>GH: Merge JUnit, generate Markdown, upload artifacts
end
sequenceDiagram
autonumber
participant Client as CI/Prompt
participant Srv as UnityMcpServer Tools
participant Editor as ManageScript (C#)
note over Client,Srv: get_sha flow
Client->>Srv: get_sha(uri)
Srv->>Editor: request file hash
Editor-->>Srv: {sha256, lengthBytes}
Srv-->>Client: {"success":True,"data":{"sha256":...,"lengthBytes":...}}
note over Client,Srv: validate_script flow
Client->>Srv: validate_script(uri, level)
Srv->>Editor: validate
Editor-->>Srv: diagnostics[]
Srv-->>Client: {"success":True,"data":{"warnings":N,"errors":M}}
note over Client,Srv: read_resource / find_in_file
Client->>Srv: read_resource(uri, include_text=false)
Srv-->>Client: {"success":True,"data":{"sha256":...,"lengthBytes":...}}
Client->>Srv: find_in_file(uri, pattern, max_results=1)
Srv-->>Client: {"success":True,"data":{"matches":[{"startLine",...,"endCol":...}]}}
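The tool responses in the diagram above all share one minimal success envelope. A sketch of a client-side check, assuming only the shapes shown in the diagram (the `check_envelope` helper name is illustrative, not part of the server API):

```python
def check_envelope(resp: dict) -> dict:
    """Return resp['data'] if resp matches the minimal success envelope, else raise."""
    # Illustrative check only; the real server may attach extra fields.
    if not (isinstance(resp, dict) and resp.get("success") is True):
        raise ValueError(f"unexpected response: {resp!r}")
    data = resp.get("data")
    if not isinstance(data, dict):
        raise ValueError("success response missing 'data' dict")
    return data

# Example: the get_sha shape from the diagram above.
resp = {"success": True, "data": {"sha256": "abc123", "lengthBytes": 42}}
data = check_envelope(resp)
assert data["lengthBytes"] == 42
```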
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
prune_tool_results.py (1)
58-59: Do not execute on import; add a proper entrypoint guard.

Running main() at import time can break consumers and tests.

```diff
-main()
+if __name__ == "__main__":
+    main()
```
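A minimal illustration of why the guard matters, using a hypothetical `main` that stands in for the pruner's entrypoint: with the guard, importing the module triggers no side effects, while running it as a script still works.

```python
def main() -> int:
    # Stand-in for prune_tool_results' real work.
    print("pruning")
    return 0

if __name__ == "__main__":  # skipped when the module is imported by tests/consumers
    main()
```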
🧹 Nitpick comments (36)
UnityMcpBridge/Editor/Tools/ManageScript.cs (1)
1350-1353: Good guard; align error payload with text-edit path by adding evidenceWindow

Add a small evidence window like the text-edits path to aid recovery and keep response shapes consistent.

```diff
-            // Guard against structural imbalance before validation
-            if (!CheckBalancedDelimiters(working, out int lineBal, out char expectedBal))
-                return Response.Error("unbalanced_braces", new { status = "unbalanced_braces", line = lineBal, expected = expectedBal.ToString() });
+            // Guard against structural imbalance before validation
+            if (!CheckBalancedDelimiters(working, out int lineBal, out char expectedBal))
+            {
+                int startLine = Math.Max(1, lineBal - 5);
+                int endLine = lineBal + 5;
+                return Response.Error(
+                    "unbalanced_braces",
+                    new { status = "unbalanced_braces", line = lineBal, expected = expectedBal.ToString(), evidenceWindow = new { startLine, endLine } }
+                );
+            }
```

scripts/validate-nlt-coverage.sh (1)
1-12: Make expected IDs configurable and fail early if reports/ is missing

Parameterize the ID list and add a reports/ existence check for clearer CI failures.

```diff
 #!/usr/bin/env bash
 set -euo pipefail
 cd "$(git rev-parse --show-toplevel)"
-missing=()
-for id in NL-0 NL-1 NL-2 NL-3 NL-4 T-A T-B T-C T-D T-E T-F T-G T-H T-I T-J; do
-  [[ -s "reports/${id}_results.xml" ]] || missing+=("$id")
-done
+EXPECT_IDS=${EXPECT_IDS:-"NL-0 NL-1 NL-2 NL-3 NL-4 T-A T-B T-C T-D T-E T-F T-G T-H T-I T-J"}
+[[ -d reports ]] || { echo "Missing reports/ directory." >&2; exit 2; }
+missing=()
+for id in $EXPECT_IDS; do
+  [[ -s "reports/${id}_results.xml" ]] || missing+=("$id")
+done
 if (( ${#missing[@]} )); then
-  echo "Missing fragments: ${missing[*]}"
+  echo "Missing fragments: ${missing[*]}" >&2
   exit 2
 fi
 echo "All NL/T fragments present."
```

UnityMcpBridge/UnityMcpServer~/src/tools/manage_script.py (4)
398-405: Docstring mismatch with new return shape

Function now returns counts, not full diagnostics. Update the description to prevent consumer confusion.

```diff
     @mcp.tool(description=(
-        "Validate a C# script and return diagnostics.\n\n"
+        "Validate a C# script and return counts of warnings and errors.\n\n"
         "Args: uri, level=('basic'|'standard').\n"
         "- basic: quick syntax checks.\n"
         "- standard: deeper checks (performance hints, common pitfalls).\n"
     ))
```
586-589: Docstring promises more fields than returned

Either return the documented fields (uri, path, lastModifiedUtc) or narrow the docstring to the minimal shape you now emit.

```diff
     @mcp.tool(description=(
-        "Get SHA256 and metadata for a Unity C# script without returning file contents.\n\n"
-        "Args: uri (unity://path/Assets/... or file://... or Assets/...).\n"
-        "Returns: {sha256, lengthBytes, lastModifiedUtc, uri, path}."
-    ))
+        "Get SHA256 and size for a Unity C# script without returning file contents.\n\n"
+        "Args: uri (unity://path/Assets/... or file://... or Assets/...).\n"
+        "Returns: {sha256, lengthBytes}."
+    ))
```
596-604: Consider backward-compatible pass-through of extra metadata

Keeping sha256/lengthBytes minimal is fine; if low-risk, include uri/path/lastModifiedUtc when present to ease consumers migrating.

```diff
-    if isinstance(resp, dict) and resp.get("success"):
-        data = resp.get("data", {})
-        return {
-            "success": True,
-            "data": {
-                "sha256": data.get("sha256"),
-                "lengthBytes": data.get("lengthBytes"),
-            },
-        }
+    if isinstance(resp, dict) and resp.get("success"):
+        data = resp.get("data", {}) or {}
+        minimal = {"sha256": data.get("sha256"), "lengthBytes": data.get("lengthBytes")}
+        # Optionally pass through extras if available
+        for k in ("uri", "path", "lastModifiedUtc"):
+            if k in data:
+                minimal[k] = data[k]
+        return {"success": True, "data": minimal}
```
419-424: Defensive diagnostic severity extraction

Use a helper that returns an empty string for non-dict entries to avoid
AttributeError
ifdiagnostics
contains strings:if isinstance(resp, dict) and resp.get("success"): diags = resp.get("data", {}).get("diagnostics", []) or [] + def _sev(x: Any) -> str: + return x.get("severity", "").lower() if isinstance(x, dict) else "" warnings = sum(_sev(d) == "warning" for d in diags) errors = sum(_sev(d) in ("error", "fatal") for d in diags) return {"success": True, "data": {"warnings": warnings, "errors": errors}}No other code references
data['diagnostics']
.tests/test_get_sha.py (1)
74-76: Good contract check; consider asserting types too.

Equality on the trimmed payload enforces the new shape. Add a quick type check for lengthBytes to avoid regressions on serialization.

```diff
     assert resp["success"] is True
-    assert resp["data"] == {"sha256": "abc", "lengthBytes": 1}
+    assert resp["data"] == {"sha256": "abc", "lengthBytes": 1}
+    assert isinstance(resp["data"]["lengthBytes"], int)
```

tests/test_validate_script_summary.py (1)
49-68: Assert tool routing params for stronger coverage.

You validate the counts; also assert that we send the expected action/name/path/level to the bridge.

```diff
 def test_validate_script_returns_counts(monkeypatch):
     tools = setup_tools()
     validate_script = tools["validate_script"]

-    def fake_send(cmd, params):
+    captured = {}
+
+    def fake_send(cmd, params):
+        captured["cmd"] = cmd
+        captured["params"] = params
         return {
             "success": True,
             "data": {
                 "diagnostics": [
                     {"severity": "warning"},
                     {"severity": "error"},
                     {"severity": "fatal"},
                 ]
             },
         }

     monkeypatch.setattr(manage_script, "send_command_with_retry", fake_send)
     resp = validate_script(None, uri="unity://path/Assets/Scripts/A.cs")
     assert resp == {"success": True, "data": {"warnings": 1, "errors": 2}}
+    assert captured["cmd"] == "manage_script"
+    assert captured["params"]["action"] == "validate"
+    assert captured["params"]["level"] == "basic"
+    assert captured["params"]["name"] == "A"
+    assert captured["params"]["path"].endswith("Assets/Scripts")
```

tests/test_read_console_truncate.py (2)
49-72: Add param assertions (action/types/format) to lock API defaults.

You already assert count/includeStacktrace; add defaults to prevent regressions.

```diff
     resp = read_console(ctx=None, count=10)
     assert resp == {
         "success": True,
         "data": {"lines": [{"level": "error", "message": "oops", "stacktrace": "trace", "time": "t"}]},
     }
     assert captured["params"]["count"] == 10
     assert captured["params"]["includeStacktrace"] is True
+    assert captured["params"]["action"] == "get"
+    assert captured["params"]["types"] == ["error"]
+    assert captured["params"]["format"] == "json"
```
74-93: Mirror the same defaults in the truncated path.

Keep the two code paths equally exercised.

```diff
     resp = read_console(ctx=None, count=10, include_stacktrace=False)
     assert resp == {"success": True, "data": {"lines": [{"level": "error", "message": "oops"}]}}
     assert captured["params"]["includeStacktrace"] is False
+    assert captured["params"]["action"] == "get"
+    assert captured["params"]["types"] == ["error"]
+    assert captured["params"]["format"] == "json"
```

UnityMcpBridge/UnityMcpServer~/src/tools/read_console.py (4)
4-9: Remove unused imports.

time and config are not referenced.

```diff
-from typing import List, Dict, Any
-import time
+from typing import List, Dict, Any
 from mcp.server.fastmcp import FastMCP, Context
 from unity_connection import get_unity_connection, send_command_with_retry
-from config import config
```
40-42: Avoid unused variable; keep early connection touch if desired.

Rename to underscore to silence linters while preserving side-effects.

```diff
-    # Get the connection instance
-    bridge = get_unity_connection()
+    # Touch the connection early to fail fast if unavailable
+    _ = get_unity_connection()
```
69-75: Dead branch: count can't be None after defaulting to 50.

Either allow None to mean "all" or drop this block. Given current defaulting, drop it.

```diff
-    # Remove None values unless it's 'count' (as None might mean 'all')
-    params_dict = {k: v for k, v in params_dict.items() if v is not None or k == 'count'}
-
-    # Add count back if it was None, explicitly sending null might be important for C# logic
-    if 'count' not in params_dict:
-        params_dict['count'] = None
+    # Remove None values
+    params_dict = {k: v for k, v in params_dict.items() if v is not None}
```
95-96: Rename ambiguous loop var.

Fixes E741 and improves readability.

```diff
-    trimmed = [_entry(l) for l in (lines or [])]
+    trimmed = [_entry(line) for line in (lines or [])]
```

prune_tool_results.py (4)
5-9: Catch JSONDecodeError instead of a blind Exception.

Narrow the exception for better signal and auditability.

```diff
 def summarize(txt):
     try:
         obj = json.loads(txt)
-    except Exception:
+    except json.JSONDecodeError:
         return f"tool_result: {len(txt)} bytes"
```
2-2: Drop unused import.

```diff
-import sys, json, re
+import sys, json
```
28-33: Minor: use a clearer loop var and guard unknown levels.

```diff
-        lines = data["lines"] or []
-        lvls = {"info":0,"warning":0,"error":0}
-        for L in lines:
-            lvls[L.get("level","" ).lower()] = lvls.get(L.get("level","" ).lower(),0)+1
+        lines = data["lines"] or []
+        lvls = {"info": 0, "warning": 0, "error": 0}
+        for line in lines:
+            key = (line.get("level") or "").lower()
+            lvls[key] = lvls.get(key, 0) + 1
```
36-49: Preserve non-text tool_result chunks (optional).

Current pruning drops non-text chunks. If that's intentional, ignore; otherwise, keep non-text as-is.
Would you like me to update prune_message to pass through non-text chunks unchanged under tool_result while still summarizing text?
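A minimal sketch of that pass-through behavior, assuming tool_result content is a list of typed chunks (the `prune_chunks` helper and chunk shapes are illustrative; `summarize_text` stands in for the script's existing summarizer):

```python
def prune_chunks(chunks, summarize_text):
    """Summarize text chunks inside a tool_result; pass other chunk types through unchanged."""
    pruned = []
    for chunk in chunks:
        if isinstance(chunk, dict) and chunk.get("type") == "text":
            pruned.append({"type": "text", "text": summarize_text(chunk.get("text", ""))})
        else:
            pruned.append(chunk)  # e.g. image chunks survive untouched
    return pruned
```

Passing unknown chunk types through by identity keeps the pruner forward-compatible with new content types instead of silently dropping them.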
UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py (3)
296-299: Avoid loading entire files into memory when only metadata is needed

For large assets, p.read_bytes() copies the whole file just to hash/measure. Consider streaming hashing to reduce peak memory usage.

Example (conceptual):

```python
sha256 = hashlib.sha256()
length = 0
with p.open("rb") as f:
    for chunk in iter(lambda: f.read(1024 * 1024), b""):
        sha256.update(chunk)
        length += len(chunk)
sha = sha256.hexdigest()
```

You can then read raw only if want_text is True.
325-333: Include lengthBytes with text for symmetry; also address TRY300 by using else

Align metadata fields whether or not text is returned and satisfy the static analyzer.

Apply:

```diff
-        return {
-            "success": True,
-            "data": {"text": text, "metadata": {"sha256": sha}},
-        }
-    return {
-        "success": True,
-        "data": {"metadata": {"sha256": sha, "lengthBytes": length}},
-    }
+        return {
+            "success": True,
+            "data": {"text": text, "metadata": {"sha256": sha, "lengthBytes": length}},
+        }
+    else:
+        return {
+            "success": True,
+            "data": {"metadata": {"sha256": sha, "lengthBytes": length}},
+        }
```
368-378: Capture multiple matches per line (and respect max_results across them)

Current code records only the first match on a line. Use finditer to avoid missing additional matches.

Apply:

```diff
-            m = rx.search(line)
-            if m:
-                start_col, end_col = m.span()
-                results.append(
-                    {
-                        "startLine": i,
-                        "startCol": start_col + 1,
-                        "endLine": i,
-                        "endCol": end_col + 1,
-                    }
-                )
+            for m in rx.finditer(line):
+                start_col, end_col = m.span()
+                results.append(
+                    {
+                        "startLine": i,
+                        "startCol": start_col + 1,
+                        "endLine": i,
+                        "endCol": end_col + 1,
+                    }
+                )
+                if max_results and len(results) >= max_results:
+                    break
```

tests/test_read_resource_minimal.py (2)
56-64: Simplify async usage with pytest.mark.asyncio

Avoid manual loop management for readability.

Example:

```python
import pytest

@pytest.mark.asyncio
async def test_read_resource_minimal_metadata_only(resource_tools, tmp_path):
    ...
    resp = await read_resource(uri="unity://path/Assets/A.txt", ctx=None, project_root=str(proj))
    assert resp["success"] is True
```
65-70: Stronger sha assertion (optional)

Optionally verify the sha value deterministically.

Example:

```python
import hashlib

assert meta["sha256"] == hashlib.sha256(content.encode("utf-8")).hexdigest()
```

tests/test_find_in_file_minimal.py (2)
3-3: Remove unused import

Apply:

```diff
-import importlib.util
```
37-45: Optional: use pytest.mark.asyncio to avoid manual loop

Example:

```python
@pytest.mark.asyncio
async def test_find_in_file_returns_positions(resource_tools, tmp_path):
    ...
    resp = await find_in_file(uri="unity://path/Assets/A.txt", pattern="world", ctx=None, project_root=str(proj))
    assert resp["success"] is True
    assert resp["data"]["matches"] == [{"startLine": 1, "startCol": 7, "endLine": 1, "endCol": 12}]
```

.github/workflows/claude-nl-suite.yml (4)
80-87: actionlint SC2129: consolidate GITHUB_OUTPUT writes.

Group echoes to avoid multiple appends per ShellCheck.

```diff
-          echo "use_ulf=$use_ulf" >> "$GITHUB_OUTPUT"
-          echo "use_ebl=$use_ebl" >> "$GITHUB_OUTPUT"
-          echo "has_serial=$([[ -n "${UNITY_SERIAL:-}" ]] && echo true || echo false)" >> "$GITHUB_OUTPUT"
+          {
+            echo "use_ulf=$use_ulf"
+            echo "use_ebl=$use_ebl"
+            echo "has_serial=$([[ -n "${UNITY_SERIAL:-}" ]] && echo true || echo false)"
+          } >> "$GITHUB_OUTPUT"
```
347-366: jq dependency assumption.

The step uses jq; ubuntu-latest typically has it, but not guaranteed. Add a quick install to harden (braces group the install so it only runs when jq is absent).

```diff
       - name: Verify Unity bridge status/port
         run: |
-          set -euxo pipefail
+          set -euxo pipefail
+          command -v jq >/dev/null || { sudo apt-get update && sudo apt-get install -y jq; }
```
221-221: Trim trailing spaces and fix minor YAML style.

YAMLlint flags trailing spaces/indent around these lines. Clean to keep workflows lint‑clean.
Also applies to: 326-326, 366-369, 391-391
302-325
: Align printed AllowedTools line with enforced settings.

The prompts print an AllowedTools list including “Write”, but the workflow restricts to
Edit/MultiEdit
via settings/allowed_tools. Either allowWrite
in the action or update the prompt’s printed list.Would you like me to update
.claude/settings.json
and the prompts to keep them consistent?.claude/prompts/nl-unity-suite-t.md (2)
4-6
: Update the “AllowedTools” banner to match action settings.The workflow allows
Edit(reports/**)
andMultiEdit(reports/**)
;Write
is not allowed there. Adjust the printed line or allow Write in the action to avoid confusion.-AllowedTools: Write,mcp__unity__manage_editor,mcp__unity__list_resources,mcp__unity__read_resource,mcp__unity__apply_text_edits,mcp__unity__script_apply_edits,mcp__unity__validate_script,mcp__unity__find_in_file,mcp__unity__read_console,mcp__unity__get_sha +AllowedTools: Edit,MultiEdit,mcp__unity__manage_editor,mcp__unity__list_resources,mcp__unity__read_resource,mcp__unity__apply_text_edits,mcp__unity__script_apply_edits,mcp__unity__validate_script,mcp__unity__find_in_file,mcp__unity__read_console,mcp__unity__get_sha
248-256
: Clarify that the XML templates are examples only (no fences in output).You already forbid fences; add a one‑liner reminder above templates to prevent agents from copying fences into fragments.
## XML Fragment Templates (T-F .. T-J) -Use these skeletons verbatim as a starting point. Replace the bracketed placeholders with your evidence. Ensure each file contains exactly one `<testcase>` element and that the `name` begins with the exact test id. +Use these skeletons verbatim as a starting point (templates shown inside markdown fences here only; emitted files must contain a single `<testcase>` element with no fences or prologue). Replace placeholders; ensure the `name` starts with the exact test id.README-DEV.md (3)
21-21
: Fix typo: “roote” → “root”.-Run this from the unity-mcp repo, not your game's roote directory. Use `mcp_source.py` to quickly switch between different MCP for Unity package sources: +Run this from the unity-mcp repo, not your game's root directory. Use `mcp_source.py` to quickly switch between different MCP for Unity package sources:
102-106
: Fix list indentation and “GitHun” typo (markdownlint MD007).-***To run it*** - - Trigger: In GitHun "Actions" for the repo, trigger `workflow dispatch` (`Claude NL/T Full Suite (Unity live)`). - - Image: `UNITY_IMAGE` (UnityCI) pulled by tag; the job resolves a digest at runtime. Logs are sanitized. - - Execution: single pass with immediate per‑test fragment emissions (strict single `<testcase>` per file). A placeholder guard fails fast if any fragment is a bare ID. Staging (`reports/_staging`) is promoted to `reports/` to reduce partial writes. - - Reports: JUnit at `reports/junit-nl-suite.xml`, Markdown at `reports/junit-nl-suite.md`. - - Publishing: JUnit is normalized to `reports/junit-for-actions.xml` and published; artifacts upload all files under `reports/`. +***To run it*** +- Trigger: In GitHub "Actions" for the repo, trigger `workflow dispatch` (`Claude NL/T Full Suite (Unity live)`). +- Image: `UNITY_IMAGE` (UnityCI) pulled by tag; the job resolves a digest at runtime. Logs are sanitized. +- Execution: single pass with immediate per‑test fragment emissions (strict single `<testcase>` per file). A placeholder guard fails fast if any fragment is a bare ID. Staging (`reports/_staging`) is promoted to `reports/` to reduce partial writes. +- Reports: JUnit at `reports/junit-nl-suite.xml`, Markdown at `reports/junit-nl-suite.md`. +- Publishing: JUnit is normalized to `reports/junit-for-actions.xml` and published; artifacts upload all files under `reports/`.
65-76
: Tighten prune_tool_results.py description with stdin/stdout emphasis (consistency).Minor wording to match actual usage example.
-Compacts large `tool_result` blobs in conversation JSON into concise one-line summaries. +Compacts large `tool_result` blobs in a conversation JSON stream into concise one‑line summaries (reads stdin, writes stdout)..claude/prompts/nl-unity-suite-nl.md (2)
5-7
: Align “AllowedTools” banner with action settings.Mirror the same adjustment as the T prompt to avoid implying
Write
is permitted when the workflow denies it.-AllowedTools: Write,mcp__unity__manage_editor,mcp__unity__list_resources,mcp__unity__read_resource,mcp__unity__apply_text_edits,mcp__unity__script_apply_edits,mcp__unity__validate_script,mcp__unity__find_in_file,mcp__unity__read_console,mcp__unity__get_sha +AllowedTools: Edit,MultiEdit,mcp__unity__manage_editor,mcp__unity__list_resources,mcp__unity__read_resource,mcp__unity__apply_text_edits,mcp__unity__script_apply_edits,mcp__unity__validate_script,mcp__unity__find_in_file,mcp__unity__read_console,mcp__unity__get_sha
122-124
: Be explicit: “scan from EOF” technique.

Add a short regex hint (already implied) to prevent agents from naively picking the first brace.
-- Match the final class-closing brace by scanning from EOF (e.g., last `^\s*}\s*$`) +- Match the final class-closing brace by scanning from EOF (e.g., last `^\s*}\s*$` via a reverse scan or by taking the final brace match from `find_in_file` results)
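The EOF-anchored technique can be sketched as follows; this is an illustration using local regex, whereas the prompt's agents would obtain the same positions via `find_in_file` (the sample source string is hypothetical):

```python
import re

src = "public class A\n{\n    void M()\n    {\n    }\n}\n"
# Take the LAST whole-line closing brace: that is the class-closing brace.
matches = list(re.finditer(r"^\s*}\s*$", src, flags=re.MULTILINE))
last = matches[-1]
line_no = src.count("\n", 0, last.start()) + 1
assert line_no == 6  # the method's brace on line 5 is skipped
```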
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- .claude/prompts/nl-unity-claude-tests-mini.md (0 hunks)
- .claude/prompts/nl-unity-suite-full.md (0 hunks)
- .claude/prompts/nl-unity-suite-nl.md (8 hunks)
- .claude/prompts/nl-unity-suite-t.md (1 hunks)
- .claude/settings.json (1 hunks)
- .github/workflows/claude-nl-suite-mini.yml (0 hunks)
- .github/workflows/claude-nl-suite.yml (12 hunks)
- README-DEV.md (3 hunks)
- UnityMcpBridge/Editor/Tools/ManageScript.cs (1 hunks)
- UnityMcpBridge/UnityMcpServer~/src/tools/manage_script.py (2 hunks)
- UnityMcpBridge/UnityMcpServer~/src/tools/read_console.py (2 hunks)
- UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py (4 hunks)
- prune_tool_results.py (1 hunks)
- scripts/validate-nlt-coverage.sh (1 hunks)
- tests/test_find_in_file_minimal.py (1 hunks)
- tests/test_get_sha.py (1 hunks)
- tests/test_read_console_truncate.py (1 hunks)
- tests/test_read_resource_minimal.py (1 hunks)
- tests/test_validate_script_summary.py (1 hunks)
💤 Files with no reviewable changes (3)
- .claude/prompts/nl-unity-suite-full.md
- .claude/prompts/nl-unity-claude-tests-mini.md
- .github/workflows/claude-nl-suite-mini.yml
🧰 Additional context used
🧬 Code graph analysis (6)
tests/test_validate_script_summary.py (1)
UnityMcpBridge/UnityMcpServer~/src/tools/manage_script.py (3)
manage_script (432-557), register_manage_script_tools (9-607), validate_script (403-424)
UnityMcpBridge/Editor/Tools/ManageScript.cs (1)
UnityMcpBridge/Editor/Helpers/Response.cs (2)
Response (10-61), Error (41-60)
tests/test_find_in_file_minimal.py (1)
UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py (1)
find_in_file (337-384)
tests/test_read_resource_minimal.py (1)
UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py (1)
read_resource (177-334)
tests/test_read_console_truncate.py (1)
UnityMcpBridge/UnityMcpServer~/src/tools/read_console.py (2)
register_read_console_tools (10-97), read_console (14-97)
UnityMcpBridge/UnityMcpServer~/src/tools/read_console.py (1)
UnityMcpBridge/UnityMcpServer~/src/unity_connection.py (1)
send_command_with_retry (388-407)
🪛 Ruff (0.12.2)
tests/test_validate_script_summary.py
38-38: Unused method argument: args
(ARG002)
38-38: Unused method argument: kwargs
(ARG002)
53-53: Unused function argument: cmd
(ARG001)
53-53: Unused function argument: params
(ARG001)
68-68: Use of assert
detected
(S101)
tests/test_find_in_file_minimal.py
18-18: Unused method argument: args
(ARG002)
18-18: Unused method argument: kwargs
(ARG002)
44-44: Use of assert
detected
(S101)
45-45: Use of assert
detected
(S101)
tests/test_read_resource_minimal.py
34-34: Unused method argument: args
(ARG002)
34-34: Unused method argument: kwargs
(ARG002)
65-65: Use of assert
detected
(S101)
67-67: Use of assert
detected
(S101)
69-69: Use of assert
detected
(S101)
70-70: Use of assert
detected
(S101)
tests/test_get_sha.py
74-74: Use of assert
detected
(S101)
tests/test_read_console_truncate.py
38-38: Unused method argument: args
(ARG002)
38-38: Unused method argument: kwargs
(ARG002)
55-55: Unused function argument: cmd
(ARG001)
66-66: Use of assert
detected
(S101)
70-70: Use of assert
detected
(S101)
71-71: Use of assert
detected
(S101)
80-80: Unused function argument: cmd
(ARG001)
91-91: Use of assert
detected
(S101)
92-92: Use of assert
detected
(S101)
UnityMcpBridge/UnityMcpServer~/src/tools/read_console.py
95-95: Ambiguous variable name: l
(E741)
prune_tool_results.py
7-7: Do not catch blind exception: Exception
(BLE001)
37-37: Multiple statements on one line (colon)
(E701)
UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py
329-332: Consider moving this statement to an else
block
(TRY300)
🪛 actionlint (1.7.7)
.github/workflows/claude-nl-suite.yml
79-79: shellcheck reported issue in this script: SC2129:style:5:1: Consider using { cmd1; cmd2; } >> file instead of individual redirects
(shellcheck)
🪛 YAMLlint (1.37.1)
.github/workflows/claude-nl-suite.yml
[warning] 75-75: wrong indentation: expected 14 but found 12
(indentation)
[warning] 75-75: too many spaces after colon
(colons)
[warning] 76-76: too many spaces after colon
(colons)
[warning] 78-78: too many spaces after colon
(colons)
[warning] 92-92: wrong indentation: expected 14 but found 12
(indentation)
[warning] 122-122: wrong indentation: expected 14 but found 12
(indentation)
[warning] 122-122: too many spaces after colon
(colons)
[warning] 123-123: too many spaces after colon
(colons)
[warning] 125-125: too many spaces after colon
(colons)
[warning] 166-166: wrong indentation: expected 14 but found 12
(indentation)
[warning] 196-196: wrong indentation: expected 14 but found 12
(indentation)
[error] 221-221: trailing spaces
(trailing-spaces)
[error] 326-326: trailing spaces
(trailing-spaces)
[error] 366-366: trailing spaces
(trailing-spaces)
[error] 368-368: trailing spaces
(trailing-spaces)
[error] 369-369: trailing spaces
(trailing-spaces)
[warning] 376-376: wrong indentation: expected 14 but found 12
(indentation)
[error] 391-391: trailing spaces
(trailing-spaces)
[warning] 398-398: wrong indentation: expected 14 but found 12
(indentation)
[warning] 424-424: wrong indentation: expected 14 but found 12
(indentation)
[warning] 949-949: wrong indentation: expected 14 but found 12
(indentation)
[warning] 949-949: too many spaces after colon
(colons)
[warning] 951-951: too many spaces after colon
(colons)
🪛 markdownlint-cli2 (0.17.2)
README-DEV.md
102-102: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
103-103: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
104-104: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
105-105: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
106-106: Unordered list indentation
Expected: 0; Actual: 1
(MD007, ul-indent)
🔇 Additional comments (3)
UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py (2)
343-347: LGTM: safer default for max_results

Defaulting max_results to 1 reduces payload/CPU in CI; coordinates contract is clear.
186-186: Audit read_resource call sites after default change

Ripgrep search only surfaced the minimal test in tests/test_read_resource_minimal.py calling read_resource without include_text; no other overrides were found. Verify all internal consumers, scripts, and CI pipelines accommodate the new metadata-only default or explicitly pass include_text=True where full text is required.
tests/test_read_resource_minimal.py (1)
48-71: Good coverage of metadata-only behavior

Test correctly asserts no text by default and validates sha length and byte count.
```json
{
  "permissions": {
    "allow": [
      "mcp__unity",
      "Edit(reports/**)",
      "MultiEdit(reports/**)"
    ],
    "deny": [
      "Bash",
      "WebFetch",
      "WebSearch",
      "Task",
      "TodoWrite",
      "NotebookEdit",
      "NotebookRead"
    ]
  }
}
```
💡 Verification agent
🧩 Analysis chain
Constrain and verify permissions alignment with workflows/tools
- Confirm that the tool IDs and operation names exactly match what the runners expect (e.g., "mcp__unity", "Edit(...)", "MultiEdit(...)"), and that deny precedence is enforced by the runner.
- If the executor requires explicit read permission in addition to write, consider adding a "Read(reports/**)" allow.
- Ensure that no broader edit capability is implicitly granted outside reports/**.
Run to cross-check usage and names:
🏁 Script executed:
```bash
#!/usr/bin/env bash
set -euo pipefail
rg -n --hidden -S "mcp__unity|Edit\(|MultiEdit\(" -C2
rg -n --hidden -S "reports/" -g ".github/**" -g ".claude/**" -C2
```
Length of output: 35004
Sync NL-suite permissions with settings and workflows
- In .github/workflows/claude-nl-suite.yml (around lines 308–313), the JSON `permissions.allow` block only includes `"mcp__unity"` and `"Edit(reports/**)"` but omits `"MultiEdit(reports/**)"` (present in .claude/settings.json and the `allowed_tools` flag). Add `"MultiEdit(reports/**)"`.
- If the runner requires explicit read permissions, consider adding `"Read(reports/**)"` to the allow list.
🤖 Prompt for AI Agents
In .claude/settings.json lines 1-18, the permissions.allow list differs from the
workflow’s allowed tools: update the workflow
(.github/workflows/claude-nl-suite.yml around lines 308-313) to include
"MultiEdit(reports/**)" in the JSON permissions.allow array to match
.claude/settings.json and the allowed_tools flag, and if the runner needs
explicit read access also add "Read(reports/**)" to the allow list; ensure the
entries are comma-separated and follow existing quoting/formatting conventions.
```bash
st="$(docker inspect -f '{{.State.Status}} {{.State.ExitCode}}' unity-mcp 2>/dev/null || true)"
case "$st" in exited*|dead*) docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'; exit 1;; esac
```
🛠️ Refactor suggestion
Broaden log redaction to include API keys/tokens.
Current sed omits `api_key`/`apikey` and Bearer tokens. Extend the pattern to reduce leakage risk.
```diff
- case "$st" in exited*|dead*) docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'; exit 1;; esac
+ case "$st" in exited*|dead*) docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*)/[REDACTED]/Ig'; exit 1;; esac
@@
- echo "$logs" | tail -n 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'
+ echo "$logs" | tail -n 200 | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*)/[REDACTED]/Ig'
@@
- if [[ "$st" != "running" ]]; then
-   echo "::error::Unity container exited during wait"; docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'
+ if [[ "$st" != "running" ]]; then
+   echo "::error::Unity container exited during wait"; docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*)/[REDACTED]/Ig'
    exit 1
  fi
@@
- docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'
+ docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*)/[REDACTED]/Ig'
@@
- docker logs --tail 400 unity-mcp | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/ig' || true
+ docker logs --tail 400 unity-mcp | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*)/[REDACTED]/ig' || true
```
Also applies to: 257-259, 263-266, 272-274, 941-941
🤖 Prompt for AI Agents
.github/workflows/claude-nl-suite.yml around lines 231-233 (and similarly update
257-259, 263-266, 272-274, 941): the sed redaction pattern currently only
matches email/serial/license/password/token and misses api_key/apikey and Bearer
auth strings; update the sed regex to also match common API key parameter names
(api_key, apikey, api-key, apiKey, case-insensitive) and Bearer tokens (e.g.,
"Bearer <token>") so logs redact those values, and apply the same improved
pattern to the other indicated line ranges to ensure consistent redaction across
the workflow.
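The broadened pattern can be sanity-checked outside the workflow. Below is a small Python sketch of an equivalent regex (a translation of the proposed sed pattern, with POSIX `[[:space:]]` rendered as `\s`; the sample line is illustrative, not from the repo):

```python
import re

# Python translation of the proposed extended sed pattern.
REDACT = re.compile(
    r"((email|serial|license|password|token|api[_-]?key|authorization)\S*"
    r"|[Bb]earer\s+[A-Za-z0-9._~-]+)",
    re.IGNORECASE,
)

def redact(line: str) -> str:
    """Replace any credential-looking token with [REDACTED]."""
    return REDACT.sub("[REDACTED]", line)

sample = "email=a@b.com serial=XY-123 Bearer abc.DEF_123 api_key=k9"
print(redact(sample))  # prints: [REDACTED] [REDACTED] [REDACTED] [REDACTED]
```

Note the keyword alternative greedily consumes everything up to the next whitespace, so `key=value` pairs are redacted whole, while the separate `Bearer` alternative catches space-separated tokens.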
```python
if "matches" in data:
    m = data["matches"] or []
    if m:
        first = m[0]
        return f"find_in_file: {len(m)} match(es) first@{first.get('line',0)}:{first.get('col',0)}"
    return "find_in_file: 0 matches"
```
🛠️ Refactor suggestion
Adapt to new find_in_file schema (startLine/startCol).
The PR moved to coordinate ranges; current summary will show 0:0. Handle both old and new keys.
```diff
  if "matches" in data:
      m = data["matches"] or []
      if m:
          first = m[0]
-         return f"find_in_file: {len(m)} match(es) first@{first.get('line',0)}:{first.get('col',0)}"
+         line = first.get("startLine") if isinstance(first, dict) else None
+         col = first.get("startCol") if isinstance(first, dict) else None
+         # Back-compat with older shape
+         line = first.get("line", 0) if line is None else line
+         col = first.get("col", 0) if col is None else col
+         return f"find_in_file: {len(m)} match(es) first@{line}:{col}"
      return "find_in_file: 0 matches"
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
if "matches" in data:
    m = data["matches"] or []
    if m:
        first = m[0]
        # Prefer the new schema keys if present
        line = first.get("startLine") if isinstance(first, dict) else None
        col = first.get("startCol") if isinstance(first, dict) else None
        # Back-compat with the old schema
        line = first.get("line", 0) if line is None else line
        col = first.get("col", 0) if col is None else col
        return f"find_in_file: {len(m)} match(es) first@{line}:{col}"
    return "find_in_file: 0 matches"
```
🤖 Prompt for AI Agents
In prune_tool_results.py around lines 21 to 26, the summary uses
first.get('line',0) and first.get('col',0) which breaks with the new
find_in_file schema that uses startLine/startCol (coordinate ranges); update the
logic to prefer the new keys but fall back to the old ones: extract line =
first.get('startLine') or first.get('line') or 0 and col = first.get('startCol')
or first.get('col') or 0 (or their snake/camel variants if present), then use
those values in the returned string so both old and new schemas are supported.
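The back-compat extraction described above can be condensed into a single helper. A minimal sketch (function name hypothetical; key names follow the two schemas discussed in the comment):

```python
def summarize_matches(data: dict) -> str:
    """Summarize find_in_file results, preferring new startLine/startCol keys
    and falling back to the legacy line/col keys."""
    matches = data.get("matches") or []
    if not matches:
        return "find_in_file: 0 matches"
    first = matches[0] if isinstance(matches[0], dict) else {}
    line = first.get("startLine", first.get("line", 0))
    col = first.get("startCol", first.get("col", 0))
    return f"find_in_file: {len(matches)} match(es) first@{line}:{col}"

print(summarize_matches({"matches": [{"startLine": 4, "startCol": 7}]}))
print(summarize_matches({"matches": [{"line": 2, "col": 5}]}))
```

Using `dict.get` with a nested default keeps the new keys authoritative while old-shaped payloads still render a real position instead of `0:0`.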
```python
from tools.resource_tools import register_resource_tools  # type: ignore
```
Import will fail without mcp stub; mirror the other test’s stub
resource_tools imports mcp.server.fastmcp at import time. This test doesn’t stub it, so it can raise ImportError in environments without mcp installed.
Apply:
```diff
 sys.path.insert(0, str(SRC))
-from tools.resource_tools import register_resource_tools  # type: ignore
+# Stub mcp.server.fastmcp to satisfy imports without full package
+mcp_pkg = types.ModuleType("mcp")
+server_pkg = types.ModuleType("mcp.server")
+fastmcp_pkg = types.ModuleType("mcp.server.fastmcp")
+class _Dummy:
+    pass
+fastmcp_pkg.FastMCP = _Dummy
+fastmcp_pkg.Context = _Dummy
+server_pkg.fastmcp = fastmcp_pkg
+mcp_pkg.server = server_pkg
+sys.modules.setdefault("mcp", mcp_pkg)
+sys.modules.setdefault("mcp.server", server_pkg)
+sys.modules.setdefault("mcp.server.fastmcp", fastmcp_pkg)
+
+from tools.resource_tools import register_resource_tools  # type: ignore
```
📝 Committable suggestion
```python
sys.path.insert(0, str(SRC))

# Stub mcp.server.fastmcp to satisfy imports without full package
mcp_pkg = types.ModuleType("mcp")
server_pkg = types.ModuleType("mcp.server")
fastmcp_pkg = types.ModuleType("mcp.server.fastmcp")
class _Dummy:
    pass
fastmcp_pkg.FastMCP = _Dummy
fastmcp_pkg.Context = _Dummy
server_pkg.fastmcp = fastmcp_pkg
mcp_pkg.server = server_pkg
sys.modules.setdefault("mcp", mcp_pkg)
sys.modules.setdefault("mcp.server", server_pkg)
sys.modules.setdefault("mcp.server.fastmcp", fastmcp_pkg)

from tools.resource_tools import register_resource_tools  # type: ignore
```
🤖 Prompt for AI Agents
In tests/test_find_in_file_minimal.py around lines 12-13, the test imports
tools.resource_tools which imports mcp.server.fastmcp at import time; since this
test does not stub mcp the import can fail in environments without mcp. Before
importing tools.resource_tools, create a minimal stub for the mcp.server.fastmcp
module (e.g., inject a dummy module object into sys.modules["mcp"],
sys.modules["mcp.server"], and sys.modules["mcp.server.fastmcp"] or use
unittest.mock to patch sys.modules) mirroring the other test’s setup so the
import succeeds; then proceed to import register_resource_tools.
```python
want_text = (
    bool(include_text)
    or (head_bytes is not None and head_bytes >= 0)
    or (tail_lines is not None and tail_lines > 0)
    or (start_line is not None and line_count is not None)
)
```
🛠️ Refactor suggestion
Validate window arguments to prevent surprising behavior on negatives
Negative values currently fall through and can return the whole file. Reject invalid inputs early.
Apply:
```diff
  length = len(raw)
+ # Validate window args (reject negatives)
+ if (start_line is not None and start_line < 1) or \
+    (line_count is not None and line_count < 0) or \
+    (tail_lines is not None and tail_lines < 0) or \
+    (head_bytes is not None and head_bytes < 0):
+     return {"success": False, "error": "Invalid window arguments (negative values not allowed)"}
+
  want_text = (
      bool(include_text)
      or (head_bytes is not None and head_bytes >= 0)
      or (tail_lines is not None and tail_lines > 0)
      or (start_line is not None and line_count is not None)
  )
```
📝 Committable suggestion
```python
length = len(raw)
# Validate window args (reject negatives)
if (start_line is not None and start_line < 1) or \
   (line_count is not None and line_count < 0) or \
   (tail_lines is not None and tail_lines < 0) or \
   (head_bytes is not None and head_bytes < 0):
    return {"success": False, "error": "Invalid window arguments (negative values not allowed)"}

want_text = (
    bool(include_text)
    or (head_bytes is not None and head_bytes >= 0)
    or (tail_lines is not None and tail_lines > 0)
    or (start_line is not None and line_count is not None)
)
```
🤖 Prompt for AI Agents
In UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py around lines 300
to 305, the window-argument logic allows negative numeric inputs to "fall
through" and effectively return whole files; add explicit validation that
rejects negative values for any of head_bytes, tail_lines, start_line, or
line_count when they are not None. Implement an early check that raises a
ValueError (or equivalent) with a clear message if any of these arguments are
provided and < 0, and keep the existing boolean logic for want_text but rely on
the validated non-negative values thereafter.
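A standalone validator in the spirit of the suggestion can be exercised in isolation. This is a sketch (function name illustrative; the dict-shaped error mirrors the response convention seen in the suggested diff):

```python
def validate_window_args(start_line=None, line_count=None,
                         tail_lines=None, head_bytes=None):
    """Return an error payload for invalid window arguments, else success.

    start_line is 1-based, so values below 1 are rejected; the other
    window arguments must simply be non-negative when provided.
    """
    if start_line is not None and start_line < 1:
        return {"success": False, "error": "start_line must be >= 1"}
    for name, value in (("line_count", line_count),
                        ("tail_lines", tail_lines),
                        ("head_bytes", head_bytes)):
        if value is not None and value < 0:
            return {"success": False, "error": f"{name} must be non-negative"}
    return {"success": True}
```

Rejecting early keeps the downstream `want_text` logic free to assume non-negative values, which is exactly what the review comment asks for.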
…)); respect _staging in probe
Actionable comments posted: 1
♻️ Duplicate comments (2)
.github/workflows/claude-nl-suite.yml (2)
420-461: Good fix: explicit T coverage probe unblocks retry. This replaces failure()-gating with a concrete “missing fragments” check and properly conditions the retry. Matches prior feedback; looks correct.
231-233: Broaden log redaction to cover API keys and Bearer tokens. Current sed misses api_key/apikey/authorization headers and “Bearer …” tokens. Expand the pattern everywhere logs are echoed.

```diff
- docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'
+ docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*|[Bb]earer[[:space:]]+[A-Za-z0-9._~-]+)/[REDACTED]/Ig'
@@
- echo "$logs" | tail -n 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'
+ echo "$logs" | tail -n 200 | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*|[Bb]earer[[:space:]]+[A-Za-z0-9._~-]+)/[REDACTED]/Ig'
@@
- echo "::error::Unity container exited during wait"; docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'
+ echo "::error::Unity container exited during wait"; docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*|[Bb]earer[[:space:]]+[A-Za-z0-9._~-]+)/[REDACTED]/Ig'
@@
- docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/Ig'
+ docker logs unity-mcp --tail 200 | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*|[Bb]earer[[:space:]]+[A-Za-z0-9._~-]+)/[REDACTED]/Ig'
@@
- docker logs --tail 400 unity-mcp | sed -E 's/((email|serial|license|password|token)[^[:space:]]*)/[REDACTED]/ig' || true
+ docker logs --tail 400 unity-mcp | sed -E 's/((email|serial|license|password|token|api[_-]?key|authorization)[^[:space:]]*|[Bb]earer[[:space:]]+[A-Za-z0-9._~-]+)/[REDACTED]/ig' || true
```

Also applies to: 255-259, 263-266, 272-274, 958-959
🧹 Nitpick comments (3)
.github/workflows/claude-nl-suite.yml (3)
239-247: Make status JSON probing robust to missing files (avoid noisy jq errors). Here, jq is called against a glob that may not match; errors are suppressed, but you can mirror the safer pattern you already use later (Lines 353–366) to avoid spurious stderr and edge cases.

```diff
- port="$(jq -r '.unity_port // empty' "$RUNNER_TEMP"/unity-status/unity-mcp-status-*.json 2>/dev/null | head -n1 || true)"
+ shopt -s nullglob
+ files=("$RUNNER_TEMP"/unity-status/unity-mcp-status-*.json)
+ if ((${#files[@]})); then
+   port="$(jq -r '.unity_port // empty' "${files[@]}" 2>/dev/null | head -n1 || true)"
+ else
+   port=""
+ fi
```
221-221: Clean up YAML lint issues (indentation, extra spaces, trailing whitespace). YAMLlint flags trailing spaces (e.g., Lines 221, 326, 366, 368–369, 391) and colon-spacing/indentation at several env lines (75–79, 92, 122–126, 166, 196, 376, 398, 441, 966, 968). Quick whitespace fixes will quiet the linter.
Also applies to: 326-326, 366-366, 368-369, 391-391, 75-79, 92-92, 122-126, 166-166, 196-196, 376-376, 398-398, 441-441, 966-966, 968-968
117-157: Optional: proactively mask secrets at runtime. Even with sed redaction, consider adding add-mask for critical vars before any echo to ensure GitHub scrubs them globally.

```diff
       run: |
         set -euxo pipefail
+        printf '%s\n' "::add-mask::${UNITY_EMAIL:-}" "::add-mask::${UNITY_PASSWORD:-}" "::add-mask::${UNITY_SERIAL:-}" || true
```

Also applies to: 161-183, 193-221
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/claude-nl-suite.yml
(12 hunks)
🧰 Additional context used
🪛 YAMLlint (1.37.1)
.github/workflows/claude-nl-suite.yml
[warning] 75-75: wrong indentation: expected 14 but found 12
(indentation)
[warning] 75-75: too many spaces after colon
(colons)
[warning] 76-76: too many spaces after colon
(colons)
[warning] 78-78: too many spaces after colon
(colons)
[warning] 92-92: wrong indentation: expected 14 but found 12
(indentation)
[warning] 122-122: wrong indentation: expected 14 but found 12
(indentation)
[warning] 122-122: too many spaces after colon
(colons)
[warning] 123-123: too many spaces after colon
(colons)
[warning] 125-125: too many spaces after colon
(colons)
[warning] 166-166: wrong indentation: expected 14 but found 12
(indentation)
[warning] 196-196: wrong indentation: expected 14 but found 12
(indentation)
[error] 221-221: trailing spaces
(trailing-spaces)
[error] 326-326: trailing spaces
(trailing-spaces)
[error] 366-366: trailing spaces
(trailing-spaces)
[error] 368-368: trailing spaces
(trailing-spaces)
[error] 369-369: trailing spaces
(trailing-spaces)
[warning] 376-376: wrong indentation: expected 14 but found 12
(indentation)
[error] 391-391: trailing spaces
(trailing-spaces)
[warning] 398-398: wrong indentation: expected 14 but found 12
(indentation)
[warning] 441-441: wrong indentation: expected 14 but found 12
(indentation)
[warning] 966-966: wrong indentation: expected 14 but found 12
(indentation)
[warning] 966-966: too many spaces after colon
(colons)
[warning] 968-968: too many spaces after colon
(colons)
🔇 Additional comments (2)
.github/workflows/claude-nl-suite.yml (2)
70-116: ULF/EBL dual-path licensing flow reads solid. Nice handling of base64 vs. plain ULF, mis-pasted entitlement XML, and conditional outputs; host-mounted entitlement pathing is consistent with later runs.
373-389: NL/T step wiring looks coherent (models, timeouts, tool scopes). Per-step tool allowlists align with the intended two-pass flow; once the MultiEdit deny is fixed, this should work as designed.
Also applies to: 393-417
```yaml
      - name: Pin Claude tool permissions (.claude/settings.json)
        run: |
          set -eux
          mkdir -p .claude
          cat > .claude/settings.json <<'JSON'
          {
            "permissions": {
              "allow": [
                "mcp__unity",
                "Edit(reports/**)"
              ],
              "deny": [
                "Bash",
                "MultiEdit",
                "WebFetch",
                "WebSearch",
                "Task",
                "TodoWrite",
                "NotebookEdit",
                "NotebookRead"
              ]
            }
          }
          JSON
```
🛠️ Refactor suggestion
Fix MultiEdit denial vs. step-level allowlist (blocks preferred batch writes).
`.claude/settings.json` denies MultiEdit globally, but both NL/T steps ask the agent to “Prefer a single MultiEdit(reports/**)”. As written, MultiEdit calls will be rejected, degrading behavior and possibly causing partial output.
Apply this diff to scope the deny and allow the reports-only MultiEdit:
```diff
-          cat > .claude/settings.json <<'JSON'
+          cat > .claude/settings.json <<'JSON'
           {
             "permissions": {
               "allow": [
                 "mcp__unity",
-                "Edit(reports/**)"
+                "Edit(reports/**)",
+                "MultiEdit(reports/**)"
               ],
               "deny": [
                 "Bash",
-                "MultiEdit",
+                "MultiEdit(/!(reports/**))",
                 "WebFetch",
                 "WebSearch",
                 "Task",
                 "TodoWrite",
                 "NotebookEdit",
                 "NotebookRead"
               ]
             }
           }
           JSON
```
📝 Committable suggestion
```yaml
      - name: Pin Claude tool permissions (.claude/settings.json)
        run: |
          set -eux
          mkdir -p .claude
          cat > .claude/settings.json <<'JSON'
          {
            "permissions": {
              "allow": [
                "mcp__unity",
                "Edit(reports/**)",
                "MultiEdit(reports/**)"
              ],
              "deny": [
                "Bash",
                "MultiEdit(/!(reports/**))",
                "WebFetch",
                "WebSearch",
                "Task",
                "TodoWrite",
                "NotebookEdit",
                "NotebookRead"
              ]
            }
          }
          JSON
```
🤖 Prompt for AI Agents
.github/workflows/claude-nl-suite.yml around lines 302 to 325: the current
.claude/settings.json denies "MultiEdit" globally which blocks step-level
requests to "Prefer a single MultiEdit(reports/**)"; remove "MultiEdit" from the
deny list and instead add the scoped entry "MultiEdit(reports/**)" to the allow
list so only reports/* multi-edits are permitted while other dangerous
multi-edit capabilities remain denied.
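This kind of allow/deny drift could also be caught mechanically. A sketch of a small consistency check (helper name and the CI wiring are hypothetical; the inline JSON mirrors the pre-fix settings above):

```python
import json

def check_permissions(settings_text: str) -> list[str]:
    """Flag deny entries that blanket-block a tool the allow list scopes."""
    perms = json.loads(settings_text)["permissions"]
    allow, deny = perms.get("allow", []), perms.get("deny", [])
    problems = []
    for scoped in allow:
        tool = scoped.split("(", 1)[0]  # "MultiEdit(reports/**)" -> "MultiEdit"
        if tool in deny:
            problems.append(f"{tool} is allowed as {scoped!r} but denied globally")
    return problems

settings = ('{"permissions": {"allow": ["mcp__unity", "Edit(reports/**)", '
            '"MultiEdit(reports/**)"], "deny": ["Bash", "MultiEdit"]}}')
print(check_permissions(settings))
```

Run against the generated .claude/settings.json in CI, a non-empty result would fail fast instead of surfacing as silently rejected MultiEdit calls mid-suite.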
…emetry

* commit '3e83f993bfe632034bf7302d4319e3cd16353eb8':
  Improved ci prompt testing suite (CoplayDev#270)
  chore: bump version to 3.3.2
  Fix: Unity Editor reload crash + debug-noise reduction (CoplayDev#266)
  Revise README for improved clarity and organization
  docs: install uv via official installer (curl/winget)
  Update README.md
  docs: fix Windows uv path to use WinGet shim, keep macOS AppSupport symlink path
  docs: update README.md with improved installation paths, documentation, and logo
  fix: Update README installation paths to match ServerInstaller.cs
Summary by CodeRabbit
New Features
Changes
Removals
Documentation
Tests