fix: add missing cli v2 parser src #36

O1ahmad · 2026-01-20T20:23:04Z

Note

Introduces a robust, AI-assisted CLI parsing and binary discovery toolchain with Docker/host support, plus tests and developer tooling.

New modules: src/binary_finder.py (find executables in containers/host with caching and timeouts) and src/command_executor.py (help/version execution with fallbacks)
CLI tools: src/discover_binaries.py to auto-discover/verify binaries in images and update configs; src/process_cli_tools_v2.py to batch-parse tools (sequential/parallel) using the unified src/parser.py
Parser overhaul: Replaces parser_v2.py with enhanced src/parser.py (AI-driven help/version parsing, recursive subcommand extraction); updates imports to use it
Tests: Adds tests/test_parser.py with mocked subprocess/OpenAI to validate AWS CLI parsing
Tooling: Adds .pre-commit-config.yaml; Makefile clean stops deleting result.json; minor setup.py formatting tweaks

^{Written by Cursor Bugbot for commit 72e8228. This will update automatically on new commits. Configure here.}

Summary by CodeRabbit

New Features
- Binary discovery and execution tools for host/container environments, plus a CLI to discover and validate binaries inside Docker images with ranking and safe config updates.
- AI-driven CLI help/version parsing to generate structured command documentation via a new parsing CLI.
- Pre-commit hooks configuration added.
Bug Fixes
- Safer handling when image tags are missing to prevent failures.
Chores
- Simplified clean target to stop removing result.json.
Removed
- Legacy parser implementation replaced/removed.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2026-01-20T20:23:23Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

📝 Walkthrough

Walkthrough

Adds three new discovery modules and a CLI to locate/verify executables on host and in Docker images; replaces parser_v2 with an AI-driven parser in src/parser.py; modifies processing logic, tests, pre-commit config, setup formatting, and simplifies the Makefile clean target.

Changes

Cohort / File(s)	Summary
Build config `Makefile`, `setup.py`	Simplified `clean` target (removed `result.json`). Minor formatting/quote style changes in `setup.py`.
Pre-commit `.pre-commit-config.yaml`	New pre-commit configuration added with standard hooks (trailing-whitespace, check-yaml, debug-statements, etc.).
Binary discovery core `src/binary_finder.py`, `src/command_executor.py`	New `BinaryFinder` for host/container discovery (quick lookups, full filesystem scan with caching/timeouts, candidate generation/matching/scoring, optional help verification). New `CommandExecutor` to run commands on host or in-container with help/version variations, timeouts, and validators.
Discovery orchestrator / CLI `src/discover_binaries.py`	New `BinaryDiscoveryTool` CLI: loads/saves config (dry-run/backup), checks/pulls images, orchestrates discovery and validation, applies thresholds, selects primary/alternates, and updates config.
Parser migration `src/parser.py`, `src/parser_v2.py`	`src/parser_v2.py` removed. `src/parser.py` rewritten to an AI-driven parser using `BinaryFinder`/`CommandExecutor` with `analyze_with_ai`, `parse_command`, `parse_binary`, and CLI entry.
Tools processing tweaks `src/process_cli_tools_v2.py`	Defensive handling for empty `image_tags` (defaults to `["latest"]`), category detection tweaks, and stylistic/quoting normalization.
Tests `tests/test_parser.py`	Tests updated to use `parse_binary`, broadened subprocess/OpenAI mocks, and adjusted expectations for AI-driven parse outputs.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Discovery as BinaryDiscoveryTool
    participant Finder as BinaryFinder
    participant Executor as CommandExecutor
    participant Docker
    participant Host

    User->>Discovery: process_tools(config, filters)
    Discovery->>Docker: check_docker_available()
    Docker-->>Discovery: available

    Discovery->>Docker: pull_image(image)
    Docker-->>Discovery: image_ready

    Discovery->>Finder: discover_binaries_for_tool(image, tool_name)
    Finder->>Docker: which / command -v (quick lookup)
    Docker-->>Finder: path_or_none

    rect rgba(100, 150, 200, 0.5)
    Note over Finder,Docker: Full filesystem scan if quick lookup misses
    Finder->>Docker: docker run -- find / -executable (with timeout)
    Docker-->>Finder: executables_list
    end

    Finder->>Finder: generate_candidates(tool_name)
    Finder->>Finder: match_executables_to_candidates(...)

    rect rgba(150, 100, 200, 0.5)
    Note over Discovery,Executor: Validate top candidates
    Discovery->>Executor: test_help_variations(candidate_path, image)
    Executor->>Docker: docker run candidate --help (or Host execution)
    Docker-->>Executor: stdout/stderr/exit_code
    Executor-->>Discovery: validation_result
    end

    Discovery->>Discovery: select_primary_and_alternates()
    Discovery->>Discovery: save_config(updated_tool) (dry-run/backup handled)
    Discovery-->>User: print_summary()

Estimated Code Review Effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐰 I hop through images and folders wide,

sniffing binaries where secrets hide.
Help texts hum and tests sing true,
I choose the best paths and leave crumbs for you.
A tiny rabbit cheers: discovery, woohoo!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Title check	⚠️ Warning	The PR title 'fix: add missing cli v2 parser src' is vague and misleading. The PR does not simply fix missing code; it comprehensively refactors the CLI parsing system by adding new modules (binary_finder, command_executor, discover_binaries, parser), removing parser_v2, and reorganizing the architecture.	Revise the title to accurately reflect the scope: 'refactor: consolidate CLI parser with binary discovery and AI-assisted parsing' or 'feat: implement unified CLI parser with binary finder and command executor'.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	Docstring coverage is 95.35% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In `@src/binary_finder.py`:
- Around line 188-224: The current loop in verify_executable_responds_to_help
calls `docker run` without overriding the image entrypoint, so it invokes the
image's default entrypoint instead of the target binary; change the command to
use Docker's --entrypoint to run the target executable (use the existing
binary_path or binary_name) and pass the help_arg as the container argument
(e.g. replace ['docker','run','--rm', docker_image, help_arg] with
['docker','run','--rm','--entrypoint', binary_path, docker_image, help_arg]),
keeping the timeout, capture_output and returncode checks the same.
- Around line 92-104: The current subprocess.run builds a shell command with
f'command -v {binary_name}', which allows shell injection; update the
subprocess.run invocation to pass the binary name as a positional parameter to
sh instead of interpolating it: replace the f-string with a constant command
like 'command -v "$1"' (or similar) and add the binary_name as a separate
argument after the image (e.g., using '--' then binary_name) so the container's
sh receives the binary name as $1; keep the same subprocess.run options
(capture_output, text, timeout) and preserve the existing return check
(result.returncode == 0 and result.stdout.strip()) and the returned tag
"command_v".

In `@src/command_executor.py`:
- Around line 62-73: In the Docker branch, remove the unused import os and the
unused binary_name computation, and build the docker run command to explicitly
set the container entrypoint to the target binary using --entrypoint; i.e., when
docker_image is set, construct cmd to include '--entrypoint' followed by
binary_path (then docker_image and args) so the intended binary is executed
inside the container instead of relying on the image entrypoint (update
references to docker_image, binary_path, cmd, and args accordingly).

In `@src/discover_binaries.py`:
- Around line 179-183: The code assumes image_tags is non-empty; change the
lookup to provide a default when the key is present but empty by using a
fallback (e.g., set image_tags = tool.get('image_tags') or ['latest']) so both
missing and empty lists yield ['latest']; then keep the existing selection logic
for image_tag ('latest' if present else image_tags[0]) and the construction of
docker_image; apply the same change where image_tags is read in
src/process_cli_tools_v2.py to avoid IndexError.

🧹 Nitpick comments (3)

src/binary_finder.py (2)

21-22: Annotate the class-level cache for typing clarity.

This cache is shared across instances, so marking it as ClassVar improves type checking and intent.

♻️ Proposed typing tweak

-from typing import Optional, List, Tuple, Set
+from typing import Optional, List, Tuple, Set, Dict, ClassVar
@@
-    _container_cache = {}
+    _container_cache: ClassVar[Dict[str, List[str]]] = {}

251-273: Consider case-insensitive candidate matching.

Candidate words are lowercased, but executable names aren’t, which can miss matches for mixed-case binaries.

🔧 Case-insensitive matching

-            exe_name = os.path.basename(exe_path)
+            exe_name = os.path.basename(exe_path)
+            exe_name_lower = exe_name.lower()
@@
-            for word in candidates:
-                if word in exe_name:
+            for word in candidates:
+                if word in exe_name_lower:
                     # Simple confidence based on match quality
-                    if exe_name == word:
+                    if exe_name_lower == word:
                         confidence = 1.0
                         match_type = 'exact'
-                    elif exe_name.startswith(word):
+                    elif exe_name_lower.startswith(word):
                         confidence = 0.9
                         match_type = 'starts_with'

src/command_executor.py (1)

21-38: Annotate class-level command lists with ClassVar.

These are class constants (not per-instance), so adding ClassVar improves typing clarity and intent.
♻️ Proposed typing tweaks
-from typing import Optional, Dict, List, Tuple
+from typing import Optional, Dict, List, Tuple, ClassVar
@@
-    HELP_VARIATIONS = [
+    HELP_VARIATIONS: ClassVar[List[List[str]]] = [
@@
-    VERSION_VARIATIONS = [
+    VERSION_VARIATIONS: ClassVar[List[List[str]]] = [

coderabbitai · 2026-01-20T20:34:21Z

src/binary_finder.py

+        # Try command -v (works in more minimal containers)
+        try:
+            result = subprocess.run(
+                ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, 
+                 '-c', f'command -v {binary_name}'],
+                capture_output=True,
+                text=True,
+                timeout=30
+            )
+            if result.returncode == 0 and result.stdout.strip():
+                return result.stdout.strip(), "command_v"
+        except subprocess.TimeoutExpired:
+            pass


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the binary_finder.py file find . -name "binary_finder.py" -type f

Repository: O1ahmad/cli-parser

Length of output: 84

🏁 Script executed:

# Check the content around lines 92-104 to see the exact code if [ -f "src/binary_finder.py" ]; then cat -n src/binary_finder.py | sed -n '85,110p' fi

Repository: O1ahmad/cli-parser

Length of output: 1180

🏁 Script executed:

# Look for where binary_name comes from and if there's any validation if [ -f "src/binary_finder.py" ]; then rg -n "binary_name" src/binary_finder.py -B 2 -A 2 fi

Repository: O1ahmad/cli-parser

Length of output: 1769

🏁 Script executed:

# Check function signature and parameter origin to understand if binary_name is user-controlled if [ -f "src/binary_finder.py" ]; then cat -n src/binary_finder.py | sed -n '1,95p' fi

Repository: O1ahmad/cli-parser

Length of output: 3773

🏁 Script executed:

# Find where find_in_container is called rg -n "find_in_container" src/binary_finder.py -B 2 -A 2

Repository: O1ahmad/cli-parser

Length of output: 604

🏁 Script executed:

# Check for other uses of sh -c in the file rg -n "sh.*-c" src/binary_finder.py -B 2 -A 2

Repository: O1ahmad/cli-parser

Length of output: 44

🏁 Script executed:

# Look for any validation/sanitization of binary_name rg -n "binary_name" src/binary_finder.py | head -20

Repository: O1ahmad/cli-parser

Length of output: 638

🏁 Script executed:

# Check for callers of find_in_container outside this file rg -n "find_in_container" --type py

Repository: O1ahmad/cli-parser

Length of output: 402

🏁 Script executed:

# Check if binary_name is validated anywhere before being passed if [ -f "src/binary_finder.py" ]; then cat -n src/binary_finder.py | sed -n '100,210p' fi

Repository: O1ahmad/cli-parser

Length of output: 5025

🏁 Script executed:

# Check parser_v2.py to understand the other call site rg -n "find_in_container" src/parser_v2.py -B 5 -A 5

Repository: O1ahmad/cli-parser

Length of output: 503

🏁 Script executed:

# Verify the syntax of the proposed fix is valid python3 << 'EOF' import subprocess # Test that the proposed syntax is valid (without executing) # When using a list with subprocess.run, shell=False (the default) # The arguments are passed directly without shell interpretation proposed_command = [ 'echo', 'test', # Simulated docker run command '-c', 'command -v -- "$1"', 'sh', 'test-binary' ] # This demonstrates the list structure is valid print("Proposed command structure (as list):") print(proposed_command) print("\nWhen passed to subprocess.run with list (shell=False):") print("- Arguments 0-2: echo test -c") print("- Argument 3: 'command -v -- \"$1\"' (literal string, no shell interpretation)") print("- Argument 4: 'sh'") print("- Argument 5: 'test-binary' (passed as $1 to sh, no interpolation)") print("\nConclusion: The proposed syntax is correct for safe parameter passing") EOF

Repository: O1ahmad/cli-parser

Length of output: 148

Fix shell injection vulnerability in container command check.

The f'command -v {binary_name}' string passed to sh -c allows command injection. If binary_name contains shell metacharacters (e.g., test; rm -rf / or $(malicious)), arbitrary commands execute in the container. Pass the name as a positional parameter instead.

🔒 Safer parameter passing

- result = subprocess.run( - ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, - '-c', f'command -v {binary_name}'], + result = subprocess.run( + ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, + '-c', 'command -v -- "$1"', 'sh', binary_name], capture_output=True, text=True, timeout=30 )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Try command -v (works in more minimal containers)

try:

result = subprocess.run(

['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image,

'-c', f'command -v {binary_name}'],

capture_output=True,

text=True,

timeout=30

)

if result.returncode == 0 and result.stdout.strip():

return result.stdout.strip(), "command_v"

except subprocess.TimeoutExpired:

pass

# Try command -v (works in more minimal containers)

try:

result = subprocess.run(

['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image,

'-c', 'command -v -- "$1"', 'sh', binary_name],

capture_output=True,

text=True,

timeout=30

)

if result.returncode == 0 and result.stdout.strip():

return result.stdout.strip(), "command_v"

except subprocess.TimeoutExpired:

pass

🧰 Tools

🪛 Ruff (0.14.13)

94-94: subprocess call: check for execution of untrusted input

(S603)

95-96: Starting a process with a partial executable path

(S607)

🤖 Prompt for AI Agents

In `@src/binary_finder.py` around lines 92 - 104, The current subprocess.run builds a shell command with f'command -v {binary_name}', which allows shell injection; update the subprocess.run invocation to pass the binary name as a positional parameter to sh instead of interpolating it: replace the f-string with a constant command like 'command -v "$1"' (or similar) and add the binary_name as a separate argument after the image (e.g., using '--' then binary_name) so the container's sh receives the binary name as $1; keep the same subprocess.run options (capture_output, text, timeout) and preserve the existing return check (result.returncode == 0 and result.stdout.strip()) and the returned tag "command_v".

src/binary_finder.py

src/command_executor.py

src/discover_binaries.py

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

^{Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.}

src/parser.py

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@src/parser.py`:
- Around line 192-196: The preview printing of subcommands assumes every entry
in parsed['subcommands'] has a 'name' key and will KeyError on malformed AI
output; update the join expression in the block that prints the first
subcommands (using parsed, parsed['subcommands'], num_subcmds) to safely extract
names (e.g., use s.get('name', '<unknown>') after verifying s is a dict, coerce
to str, and optionally filter None/empty) so the print never raises and still
shows a sensible placeholder for missing names.
- Around line 293-307: When docker_image is truthy,
BinaryFinder.find_in_container(docker_image, binary_name) may return binary_path
as None; add a guard right after that call to check if binary_path is falsy and
if so log/print the same "not found" message and return None to avoid downstream
failures in parse_command (which calls os.path.basename on binary_path). Update
the block handling docker_image to mirror the host-path error path and return
early when binary_path is missing so parse_command and any subsequent use of
binary_path don't crash.

In `@tests/test_parser.py`:
- Around line 45-47: The test test_parse_aws depends on analyze_with_ai which
early-returns if OPENAI_API_KEY is missing; modify the test_parse_aws to set the
env var using the test fixture monkeypatch (e.g.,
monkeypatch.setenv("OPENAI_API_KEY", "test-key")) before calling analyze_with_ai
so the function proceeds and your mocked requests.post is exercised; ensure the
monkeypatch call happens at the start of the test to guarantee reliability
across CI environments.

♻️ Duplicate comments (1)

src/binary_finder.py (1)

92-99: Avoid shell injection in command -v invocation.

This still interpolates binary_name into sh -c, allowing injection. Use positional parameters instead.

🔒 Safer parameter passing

-            result = subprocess.run(
-                ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image,
-                 '-c', f'command -v {binary_name}'],
+            result = subprocess.run(
+                ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image,
+                 '-c', 'command -v -- "$1"', 'sh', binary_name],
                 capture_output=True,
                 text=True,
                 timeout=30
             )

🧹 Nitpick comments (3)

src/discover_binaries.py (3)

28-33: Prefer package-relative imports over sys.path mutation.
Line 28–33 modifies sys.path, which can shadow installed modules and makes packaging/testing harder. Consider switching to package-relative imports and running via python -m ... instead.

♻️ Suggested refactor

-# Add src to path
-sys.path.insert(0, os.path.dirname(__file__))
-
-from binary_finder import BinaryFinder
-from command_executor import CommandExecutor
+from .binary_finder import BinaryFinder
+from .command_executor import CommandExecutor

123-135: Surface image-inspect failures instead of silently swallowing them.
Line 124–134 uses a blanket except: pass, which hides daemon/permission issues and makes troubleshooting harder. Consider logging the error (at least WARN/DEBUG) and narrowing exception types.

🔧 Suggested change

         try:
             result = subprocess.run(
                 ['docker', 'image', 'inspect', image],
                 capture_output=True,
                 timeout=10
             )
             if result.returncode == 0:
-                self.log(f"  → Image available locally")
+                self.log("  → Image available locally")
                 return True
-        except Exception:
-            pass
+        except FileNotFoundError as e:
+            self.log(f"  → Docker not found: {e}", "WARN")
+        except subprocess.TimeoutExpired as e:
+            self.log(f"  → Timeout checking image: {e}", "WARN")
+        except Exception as e:
+            self.log(f"  → Error inspecting image: {e}", "WARN")

469-480: Make --update and --dry-run mutually exclusive.
Allowing both flags can confuse users; a mutually exclusive group avoids ambiguous intent and simplifies dry-run logic.

♻️ Suggested refactor

-    parser.add_argument('--update', action='store_true', help='Update config file with discoveries')
-    parser.add_argument('--dry-run', action='store_true', help='Dry run (no config updates)')
+    mode = parser.add_mutually_exclusive_group()
+    mode.add_argument('--update', action='store_true', help='Update config file with discoveries')
+    mode.add_argument('--dry-run', action='store_true', help='Dry run (no config updates)')
@@
-    dry_run = args.dry_run or not args.update
+    dry_run = not args.update

coderabbitai · 2026-01-20T21:46:15Z

src/parser.py

+    # Show first few subcommands and options
+    if num_subcmds > 0:
+        print(
+            f"  → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}"
+        )


⚠️ Potential issue | 🟠 Major

Guard against malformed AI output in subcommand preview.

If the model returns any subcommand missing name, this line will raise KeyError and abort parsing. Add a safe extraction path to keep parsing resilient.

💡 Proposed fix

- if num_subcmds > 0: - print( - f" → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}" - ) + if num_subcmds > 0: + preview_names = [ + s.get("name") + for s in parsed["subcommands"][:5] + if isinstance(s, dict) and s.get("name") + ] + if preview_names: + print( + f" → First subcommands: {', '.join(preview_names)}{' ...' if num_subcmds > 5 else ''}" + )

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Show first few subcommands and options

if num_subcmds > 0:

print(

f" → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}"

)

# Show first few subcommands and options

if num_subcmds > 0:

preview_names = [

s.get("name")

for s in parsed["subcommands"][:5]

if isinstance(s, dict) and s.get("name")

]

if preview_names:

print(

f" → First subcommands: {', '.join(preview_names)}{' ...' if num_subcmds > 5 else ''}"

)

🤖 Prompt for AI Agents

In `@src/parser.py` around lines 192 - 196, The preview printing of subcommands assumes every entry in parsed['subcommands'] has a 'name' key and will KeyError on malformed AI output; update the join expression in the block that prints the first subcommands (using parsed, parsed['subcommands'], num_subcmds) to safely extract names (e.g., use s.get('name', '<unknown>') after verifying s is a dict, coerce to str, and optionally filter None/empty) so the print never raises and still shows a sensible placeholder for missing names.

src/parser.py

coderabbitai · 2026-01-20T21:46:16Z

tests/test_parser.py

 def test_parse_aws(mock_subprocess_run):
+    """Test parsing AWS CLI with mocked subprocess and OpenAI API calls"""
    # Mock outputs for aws commands


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

git ls-files | head -30

Repository: O1ahmad/cli-parser

Length of output: 1067

🏁 Script executed:

# Find the main code and test files fd -t f -e py | grep -E "(test_|analyze|parser)" | head -20

Repository: O1ahmad/cli-parser

Length of output: 96

🏁 Script executed:

# Look for analyze_with_ai function rg "def analyze_with_ai" -A 10

Repository: O1ahmad/cli-parser

Length of output: 520

🏁 Script executed:

# Check the test file around lines 45-47 cat -n tests/test_parser.py | sed -n '30,60p'

Repository: O1ahmad/cli-parser

Length of output: 1629

🏁 Script executed:

# Search for OPENAI_API_KEY usage rg "OPENAI_API_KEY" -B 2 -A 2

Repository: O1ahmad/cli-parser

Length of output: 470

🏁 Script executed:

# Get more of the test_parse_aws function cat -n tests/test_parser.py | sed -n '45,150p'

Repository: O1ahmad/cli-parser

Length of output: 4679

🏁 Script executed:

# Check for conftest.py and any global fixtures find . -name "conftest.py" -exec cat -n {} \;

Repository: O1ahmad/cli-parser

Length of output: 44

🏁 Script executed:

# Check for mock_subprocess_run fixture definition rg "def mock_subprocess_run" -B 2 -A 10

Repository: O1ahmad/cli-parser

Length of output: 653

🏁 Script executed:

# Check if analyze_with_ai is called within the parse function rg "analyze_with_ai" -B 3 -A 3

Repository: O1ahmad/cli-parser

Length of output: 1032

🏁 Script executed:

# Check the CI configuration for env vars cat -n .github/workflows/CI.yml

Repository: O1ahmad/cli-parser

Length of output: 1009

🏁 Script executed:

# Get the rest of test_parse_aws to see what it actually does cat -n tests/test_parser.py | sed -n '150,200p'

Repository: O1ahmad/cli-parser

Length of output: 2330

🏁 Script executed:

# Check if there's any mocking of OpenAI API or requests rg "mock|patch|Mock" tests/test_parser.py -B 1 -A 1

Repository: O1ahmad/cli-parser

Length of output: 859

🏁 Script executed:

# Check the actual parse function to understand the flow cat -n src/parser.py | sed -n '1,100p'

Repository: O1ahmad/cli-parser

Length of output: 5168

Set OPENAI_API_KEY in the test using monkeypatch to ensure reliability.

analyze_with_ai checks for the env var at the start and returns None if missing, before any API calls are made. Even though requests.post is mocked, the test will fail if OPENAI_API_KEY is not set, creating a fragile dependency on CI secrets configuration.

💡 Proposed fix

-def test_parse_aws(mock_subprocess_run): +def test_parse_aws(mock_subprocess_run, monkeypatch): + monkeypatch.setenv("OPENAI_API_KEY", "test-key") """Test parsing AWS CLI with mocked subprocess and OpenAI API calls"""

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def test_parse_aws(mock_subprocess_run):

"""Test parsing AWS CLI with mocked subprocess and OpenAI API calls"""

# Mock outputs for aws commands

def test_parse_aws(mock_subprocess_run, monkeypatch):

"""Test parsing AWS CLI with mocked subprocess and OpenAI API calls"""

monkeypatch.setenv("OPENAI_API_KEY", "test-key")

# Mock outputs for aws commands

🤖 Prompt for AI Agents

In `@tests/test_parser.py` around lines 45 - 47, The test test_parse_aws depends on analyze_with_ai which early-returns if OPENAI_API_KEY is missing; modify the test_parse_aws to set the env var using the test fixture monkeypatch (e.g., monkeypatch.setenv("OPENAI_API_KEY", "test-key")) before calling analyze_with_ai so the function proceeds and your mocked requests.post is exercised; ensure the monkeypatch call happens at the start of the test to guarantee reliability across CI environments.

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@src/command_executor.py`:
- Around line 196-197: The return type annotation for test_help_variations is
incorrect: change Dict[str, any] to Dict[str, Any] and add Any to the typing
imports (alongside Dict/Optional) so the annotation uses the proper typing.Any
type; update the import statement (e.g., add "Any") and replace the annotation
on the static method test_help_variations accordingly.
- Around line 100-113: The help_indicators list in command_executor.py contains
a duplicate "usage:" entry; edit the help_indicators definition (variable
help_indicators) to remove the redundant "usage:" string (the third occurrence)
so each indicator is unique (keep "usage:" and "usage :" as you prefer),
ensuring the list no longer contains duplicate entries.

♻️ Duplicate comments (3)

tests/test_parser.py (1)
45-46: Set OPENAI_API_KEY to ensure test reliability.

The test depends on analyze_with_ai which early-returns None if OPENAI_API_KEY is not set (before any API calls are made). Even though requests.post is mocked, the test will fail in environments without this env var set, creating CI fragility.
🐛 Fix: Add monkeypatch fixture
-def test_parse_aws(mock_subprocess_run):
+def test_parse_aws(mock_subprocess_run, monkeypatch):
+    monkeypatch.setenv("OPENAI_API_KEY", "test-key")
     """Test parsing AWS CLI with mocked subprocess and OpenAI API calls"""
src/binary_finder.py (1)
92-104: Shell injection vulnerability in container command check.

The f'command -v {binary_name}' string passed to sh -c allows command injection. If binary_name contains shell metacharacters (e.g., test; rm -rf /), arbitrary commands execute in the container.
🔒 Safer parameter passing
         try:
             result = subprocess.run(
                 ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image,
-                 '-c', f'command -v {binary_name}'],
+                 '-c', 'command -v -- "$1"', 'sh', binary_name],
                 capture_output=True,
                 text=True,
                 timeout=30
             )
This passes binary_name as a positional parameter $1 to the shell, avoiding interpolation.
src/parser.py (1)
188-192: Guard against malformed AI output in subcommand/option preview.

If the AI returns a subcommand without a name key, line 189 will raise KeyError and abort parsing. Use safe access to maintain resilience.
🐛 Safer extraction
     if num_subcmds > 0:
-        print(f"  → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}")
+        preview_names = [s.get('name', '<unnamed>') for s in parsed['subcommands'][:5] if isinstance(s, dict)]
+        if preview_names:
+            print(f"  → First subcommands: {', '.join(preview_names)}{' ...' if num_subcmds > 5 else ''}")
     if num_opts > 0:
-        print(f"  → First options: {', '.join([o.get('option', o.get('shortcut', '?')) for o in parsed['options'][:5]])}{' ...' if num_opts > 5 else ''}")
+        preview_opts = [o.get('option', o.get('shortcut', '?')) for o in parsed['options'][:5] if isinstance(o, dict)]
+        if preview_opts:
+            print(f"  → First options: {', '.join(preview_opts)}{' ...' if num_opts > 5 else ''}")

🧹 Nitpick comments (4)

src/process_cli_tools_v2.py (1)

322-335: Variable parser shadows the imported parser module.

The argparse.ArgumentParser instance at line 322 shadows the parser module imported at line 24. While this doesn't cause a runtime issue in main() since it doesn't call parser.parse_binary after this point, it's confusing and error-prone.

♻️ Suggested fix: rename the argument parser

 def main():
     """CLI entry point"""
-    parser = argparse.ArgumentParser(
+    arg_parser = argparse.ArgumentParser(
         description="Process multiple CLI tools from configuration",
         formatter_class=argparse.RawDescriptionHelpFormatter,
     )

-    parser.add_argument("config", help="Path to config JSON file")
-    parser.add_argument("--output-dir", default="data/results", help="Base output directory")
-    parser.add_argument("--max-depth", type=int, default=20, help="Maximum recursion depth (default: 20)")
-    parser.add_argument("--only", nargs="+", help="Only process these tools (by name)")
-    parser.add_argument("--parallel", action="store_true", help="Process tools in parallel")
-    parser.add_argument("--workers", type=int, default=4, help="Number of parallel workers (default: 4)")
-    parser.add_argument("--no-skip", action="store_true", help="Reprocess even if output exists")
+    arg_parser.add_argument("config", help="Path to config JSON file")
+    arg_parser.add_argument("--output-dir", default="data/results", help="Base output directory")
+    arg_parser.add_argument("--max-depth", type=int, default=20, help="Maximum recursion depth (default: 20)")
+    arg_parser.add_argument("--only", nargs="+", help="Only process these tools (by name)")
+    arg_parser.add_argument("--parallel", action="store_true", help="Process tools in parallel")
+    arg_parser.add_argument("--workers", type=int, default=4, help="Number of parallel workers (default: 4)")
+    arg_parser.add_argument("--no-skip", action="store_true", help="Reprocess even if output exists")

-    args = parser.parse_args()
+    args = arg_parser.parse_args()

src/command_executor.py (1)

186-193: Move import re to module level and simplify condition.

The import re inside the function (line 189) is executed on every call, adding unnecessary overhead. Also, len(output) > 0 is redundant when output is already truthy.
♻️ Suggested fix

Move import to top of file:
import re
import subprocess
from typing import Dict, List, Optional, Tuple
Then simplify the condition:
-            if output and (exit_code in [0, 1]) and len(output) > 0:
+            if output and exit_code in [0, 1]:
                 # Check if it looks like version output (has numbers)
-                import re
-
                 if re.search(r"\d+\.\d+", output):
                     return output

src/parser.py (1)

87-96: Consider lowering temperature for more consistent parsing.

A temperature of 0.7 can produce varied outputs. For structured JSON parsing tasks, a lower temperature (0.0–0.3) typically yields more deterministic and consistent results.
         'response_format': {'type': "json_object"},
-        'temperature': 0.7,
+        'temperature': 0.1,
     }

src/binary_finder.py (1)

204-222: LGTM on --entrypoint usage; consider logging silenced exceptions.

The Docker command now correctly uses --entrypoint to test the specific binary. However, the bare except at line 220 silently swallows all exceptions - consider logging these for debugging.
-            except (subprocess.TimeoutExpired, Exception):
+            except subprocess.TimeoutExpired:
+                continue
+            except Exception as e:
+                # Log for debugging but continue trying other variations
+                print(f"  → Help variation failed: {e}")
                 continue

coderabbitai · 2026-01-20T21:54:00Z

src/command_executor.py

+        help_indicators = [
+            "usage:",
+            "usage :",
+            "usage:",
+            "options:",
+            "flags:",
+            "commands:",
+            "subcommands:",
+            "help",
+            "examples:",
+            "arguments:",
+            "synopsis",
+            "description:",
+        ]


⚠️ Potential issue | 🟡 Minor

Duplicate entry in help_indicators list.

Lines 101 and 103 both contain "usage:". Remove the duplicate.

🐛 Fix

help_indicators = [ "usage:", - "usage :", "usage:", + "usage :", "options:",

Wait, looking again - line 101 is "usage:", line 102 is "usage :" (with space), and line 103 is "usage:" again. Remove line 103:

help_indicators = [ "usage:", "usage :", - "usage:", "options:",

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

help_indicators = [

"usage:",

"usage :",

"usage:",

"options:",

"flags:",

"commands:",

"subcommands:",

"help",

"examples:",

"arguments:",

"synopsis",

"description:",

]

help_indicators = [

"usage:",

"usage :",

"options:",

"flags:",

"commands:",

"subcommands:",

"help",

"examples:",

"arguments:",

"synopsis",

"description:",

]

🤖 Prompt for AI Agents

In `@src/command_executor.py` around lines 100 - 113, The help_indicators list in command_executor.py contains a duplicate "usage:" entry; edit the help_indicators definition (variable help_indicators) to remove the redundant "usage:" string (the third occurrence) so each indicator is unique (keep "usage:" and "usage :" as you prefer), ensuring the list no longer contains duplicate entries.

coderabbitai · 2026-01-20T21:54:00Z

src/command_executor.py

+    @staticmethod
+    def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]:


⚠️ Potential issue | 🟡 Minor

Fix type annotation: use Any instead of any.

any (lowercase) is a built-in function, not a type. The return type annotation should use Any from the typing module.

🐛 Fix

Update the import at line 12:

-from typing import Dict, List, Optional, Tuple +from typing import Any, Dict, List, Optional, Tuple

Fix the annotation:

- def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]: + def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, Any]:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@staticmethod

def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]:

from typing import Any, Dict, List, Optional, Tuple

Suggested change

@staticmethod

def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]:

`@staticmethod`

def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, Any]:

🤖 Prompt for AI Agents

In `@src/command_executor.py` around lines 196 - 197, The return type annotation for test_help_variations is incorrect: change Dict[str, any] to Dict[str, Any] and add Any to the typing imports (alongside Dict/Optional) so the annotation uses the proper typing.Any type; update the import statement (e.g., add "Any") and replace the annotation on the static method test_help_variations accordingly.

fix: add missing cli v2 parser src

c079357

This comment was marked as outdated.

Sign in to view

coderabbitai bot reviewed Jan 20, 2026

View reviewed changes

incorporate CR feedback

0716558

This comment was marked as outdated.

Sign in to view

O1ahmad force-pushed the ahmad/add_missing_src branch from 4865555 to a2811bd Compare January 20, 2026 21:35

cursor bot reviewed Jan 20, 2026

View reviewed changes

src/parser.py Show resolved Hide resolved

coderabbitai bot reviewed Jan 20, 2026

View reviewed changes

add pre-commit and fix tests

72e8228

O1ahmad force-pushed the ahmad/add_missing_src branch from a2811bd to 72e8228 Compare January 20, 2026 21:48

coderabbitai bot reviewed Jan 20, 2026

View reviewed changes

O1ahmad merged commit 206f3bb into main Jan 20, 2026
4 checks passed

-    # Show first few subcommands and options
-    if num_subcmds > 0:
-        print(
-            f"  → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}"
-        )
+    # Show first few subcommands and options
+    if num_subcmds > 0:
+        preview_names = [
+            s.get("name")
+            for s in parsed["subcommands"][:5]
+            if isinstance(s, dict) and s.get("name")
+        ]
+        if preview_names:
+            print(
+                f"  → First subcommands: {', '.join(preview_names)}{' ...' if num_subcmds > 5 else ''}"
+            )

		@staticmethod
		def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]:

	@staticmethod
	def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]:
	from typing import Any, Dict, List, Optional, Tuple

fix: add missing cli v2 parser src #36

fix: add missing cli v2 parser src #36

Uh oh!

Conversation

O1ahmad commented Jan 20, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated Code Review Effort

Poem

Uh oh!

This comment was marked as outdated.

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as outdated.

Uh oh!

cursor bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 20, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

O1ahmad commented Jan 20, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 20, 2026 •

edited

Loading