-
Notifications
You must be signed in to change notification settings - Fork 0
fix: add missing cli v2 parser src #36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds three new discovery modules and a CLI to locate/verify executables on host and in Docker images; replaces parser_v2 with an AI-driven parser in Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Discovery as BinaryDiscoveryTool
participant Finder as BinaryFinder
participant Executor as CommandExecutor
participant Docker
participant Host
User->>Discovery: process_tools(config, filters)
Discovery->>Docker: check_docker_available()
Docker-->>Discovery: available
Discovery->>Docker: pull_image(image)
Docker-->>Discovery: image_ready
Discovery->>Finder: discover_binaries_for_tool(image, tool_name)
Finder->>Docker: which / command -v (quick lookup)
Docker-->>Finder: path_or_none
rect rgba(100, 150, 200, 0.5)
Note over Finder,Docker: Full filesystem scan if quick lookup misses
Finder->>Docker: docker run -- find / -executable (with timeout)
Docker-->>Finder: executables_list
end
Finder->>Finder: generate_candidates(tool_name)
Finder->>Finder: match_executables_to_candidates(...)
rect rgba(150, 100, 200, 0.5)
Note over Discovery,Executor: Validate top candidates
Discovery->>Executor: test_help_variations(candidate_path, image)
Executor->>Docker: docker run candidate --help (or Host execution)
Docker-->>Executor: stdout/stderr/exit_code
Executor-->>Discovery: validation_result
end
Discovery->>Discovery: select_primary_and_alternates()
Discovery->>Discovery: save_config(updated_tool) (dry-run/backup handled)
Discovery-->>User: print_summary()
Estimated Code Review Effort🎯 5 (Critical) | ⏱️ ~120 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@src/binary_finder.py`:
- Around line 188-224: The current loop in verify_executable_responds_to_help
calls `docker run` without overriding the image entrypoint, so it invokes the
image's default entrypoint instead of the target binary; change the command to
use Docker's --entrypoint to run the target executable (use the existing
binary_path or binary_name) and pass the help_arg as the container argument
(e.g. replace ['docker','run','--rm', docker_image, help_arg] with
['docker','run','--rm','--entrypoint', binary_path, docker_image, help_arg]),
keeping the timeout, capture_output and returncode checks the same.
- Around line 92-104: The current subprocess.run builds a shell command with
f'command -v {binary_name}', which allows shell injection; update the
subprocess.run invocation to pass the binary name as a positional parameter to
sh instead of interpolating it: replace the f-string with a constant command
like 'command -v "$1"' (or similar) and add the binary_name as a separate
argument after the image (e.g., using '--' then binary_name) so the container's
sh receives the binary name as $1; keep the same subprocess.run options
(capture_output, text, timeout) and preserve the existing return check
(result.returncode == 0 and result.stdout.strip()) and the returned tag
"command_v".
In `@src/command_executor.py`:
- Around line 62-73: In the Docker branch, remove the unused import os and the
unused binary_name computation, and build the docker run command to explicitly
set the container entrypoint to the target binary using --entrypoint; i.e., when
docker_image is set, construct cmd to include '--entrypoint' followed by
binary_path (then docker_image and args) so the intended binary is executed
inside the container instead of relying on the image entrypoint (update
references to docker_image, binary_path, cmd, and args accordingly).
In `@src/discover_binaries.py`:
- Around line 179-183: The code assumes image_tags is non-empty; change the
lookup to provide a default when the key is present but empty by using a
fallback (e.g., set image_tags = tool.get('image_tags') or ['latest']) so both
missing and empty lists yield ['latest']; then keep the existing selection logic
for image_tag ('latest' if present else image_tags[0]) and the construction of
docker_image; apply the same change where image_tags is read in
src/process_cli_tools_v2.py to avoid IndexError.
🧹 Nitpick comments (3)
src/binary_finder.py (2)
21-22: Annotate the class-level cache for typing clarity.This cache is shared across instances, so marking it as
ClassVarimproves type checking and intent.♻️ Proposed typing tweak
-from typing import Optional, List, Tuple, Set +from typing import Optional, List, Tuple, Set, Dict, ClassVar @@ - _container_cache = {} + _container_cache: ClassVar[Dict[str, List[str]]] = {}
251-273: Consider case-insensitive candidate matching.Candidate words are lowercased, but executable names aren’t, which can miss matches for mixed-case binaries.
🔧 Case-insensitive matching
- exe_name = os.path.basename(exe_path) + exe_name = os.path.basename(exe_path) + exe_name_lower = exe_name.lower() @@ - for word in candidates: - if word in exe_name: + for word in candidates: + if word in exe_name_lower: # Simple confidence based on match quality - if exe_name == word: + if exe_name_lower == word: confidence = 1.0 match_type = 'exact' - elif exe_name.startswith(word): + elif exe_name_lower.startswith(word): confidence = 0.9 match_type = 'starts_with'src/command_executor.py (1)
21-38: Annotate class-level command lists withClassVar.These are class constants (not per-instance), so adding
ClassVarimproves typing clarity and intent.♻️ Proposed typing tweaks
-from typing import Optional, Dict, List, Tuple +from typing import Optional, Dict, List, Tuple, ClassVar @@ - HELP_VARIATIONS = [ + HELP_VARIATIONS: ClassVar[List[List[str]]] = [ @@ - VERSION_VARIATIONS = [ + VERSION_VARIATIONS: ClassVar[List[List[str]]] = [
| # Try command -v (works in more minimal containers) | ||
| try: | ||
| result = subprocess.run( | ||
| ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, | ||
| '-c', f'command -v {binary_name}'], | ||
| capture_output=True, | ||
| text=True, | ||
| timeout=30 | ||
| ) | ||
| if result.returncode == 0 and result.stdout.strip(): | ||
| return result.stdout.strip(), "command_v" | ||
| except subprocess.TimeoutExpired: | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# First, locate and examine the binary_finder.py file
find . -name "binary_finder.py" -type fRepository: O1ahmad/cli-parser
Length of output: 84
🏁 Script executed:
# Check the content around lines 92-104 to see the exact code
if [ -f "src/binary_finder.py" ]; then
cat -n src/binary_finder.py | sed -n '85,110p'
fiRepository: O1ahmad/cli-parser
Length of output: 1180
🏁 Script executed:
# Look for where binary_name comes from and if there's any validation
if [ -f "src/binary_finder.py" ]; then
rg -n "binary_name" src/binary_finder.py -B 2 -A 2
fiRepository: O1ahmad/cli-parser
Length of output: 1769
🏁 Script executed:
# Check function signature and parameter origin to understand if binary_name is user-controlled
if [ -f "src/binary_finder.py" ]; then
cat -n src/binary_finder.py | sed -n '1,95p'
fiRepository: O1ahmad/cli-parser
Length of output: 3773
🏁 Script executed:
# Find where find_in_container is called
rg -n "find_in_container" src/binary_finder.py -B 2 -A 2Repository: O1ahmad/cli-parser
Length of output: 604
🏁 Script executed:
# Check for other uses of sh -c in the file
rg -n "sh.*-c" src/binary_finder.py -B 2 -A 2Repository: O1ahmad/cli-parser
Length of output: 44
🏁 Script executed:
# Look for any validation/sanitization of binary_name
rg -n "binary_name" src/binary_finder.py | head -20Repository: O1ahmad/cli-parser
Length of output: 638
🏁 Script executed:
# Check for callers of find_in_container outside this file
rg -n "find_in_container" --type pyRepository: O1ahmad/cli-parser
Length of output: 402
🏁 Script executed:
# Check if binary_name is validated anywhere before being passed
if [ -f "src/binary_finder.py" ]; then
cat -n src/binary_finder.py | sed -n '100,210p'
fiRepository: O1ahmad/cli-parser
Length of output: 5025
🏁 Script executed:
# Check parser_v2.py to understand the other call site
rg -n "find_in_container" src/parser_v2.py -B 5 -A 5Repository: O1ahmad/cli-parser
Length of output: 503
🏁 Script executed:
# Verify the syntax of the proposed fix is valid
python3 << 'EOF'
import subprocess
# Test that the proposed syntax is valid (without executing)
# When using a list with subprocess.run, shell=False (the default)
# The arguments are passed directly without shell interpretation
proposed_command = [
'echo', 'test', # Simulated docker run command
'-c', 'command -v -- "$1"', 'sh', 'test-binary'
]
# This demonstrates the list structure is valid
print("Proposed command structure (as list):")
print(proposed_command)
print("\nWhen passed to subprocess.run with list (shell=False):")
print("- Arguments 0-2: echo test -c")
print("- Argument 3: 'command -v -- \"$1\"' (literal string, no shell interpretation)")
print("- Argument 4: 'sh'")
print("- Argument 5: 'test-binary' (passed as $1 to sh, no interpolation)")
print("\nConclusion: The proposed syntax is correct for safe parameter passing")
EOFRepository: O1ahmad/cli-parser
Length of output: 148
Fix shell injection vulnerability in container command check.
The f'command -v {binary_name}' string passed to sh -c allows command injection. If binary_name contains shell metacharacters (e.g., test; rm -rf / or $(malicious)), arbitrary commands execute in the container. Pass the name as a positional parameter instead.
🔒 Safer parameter passing
- result = subprocess.run(
- ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image,
- '-c', f'command -v {binary_name}'],
+ result = subprocess.run(
+ ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image,
+ '-c', 'command -v -- "$1"', 'sh', binary_name],
capture_output=True,
text=True,
timeout=30
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Try command -v (works in more minimal containers) | |
| try: | |
| result = subprocess.run( | |
| ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, | |
| '-c', f'command -v {binary_name}'], | |
| capture_output=True, | |
| text=True, | |
| timeout=30 | |
| ) | |
| if result.returncode == 0 and result.stdout.strip(): | |
| return result.stdout.strip(), "command_v" | |
| except subprocess.TimeoutExpired: | |
| pass | |
| # Try command -v (works in more minimal containers) | |
| try: | |
| result = subprocess.run( | |
| ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, | |
| '-c', 'command -v -- "$1"', 'sh', binary_name], | |
| capture_output=True, | |
| text=True, | |
| timeout=30 | |
| ) | |
| if result.returncode == 0 and result.stdout.strip(): | |
| return result.stdout.strip(), "command_v" | |
| except subprocess.TimeoutExpired: | |
| pass |
🧰 Tools
🪛 Ruff (0.14.13)
94-94: subprocess call: check for execution of untrusted input
(S603)
95-96: Starting a process with a partial executable path
(S607)
🤖 Prompt for AI Agents
In `@src/binary_finder.py` around lines 92 - 104, The current subprocess.run
builds a shell command with f'command -v {binary_name}', which allows shell
injection; update the subprocess.run invocation to pass the binary name as a
positional parameter to sh instead of interpolating it: replace the f-string
with a constant command like 'command -v "$1"' (or similar) and add the
binary_name as a separate argument after the image (e.g., using '--' then
binary_name) so the container's sh receives the binary name as $1; keep the same
subprocess.run options (capture_output, text, timeout) and preserve the existing
return check (result.returncode == 0 and result.stdout.strip()) and the returned
tag "command_v".
4865555 to
a2811bd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@src/parser.py`:
- Around line 192-196: The preview printing of subcommands assumes every entry
in parsed['subcommands'] has a 'name' key and will KeyError on malformed AI
output; update the join expression in the block that prints the first
subcommands (using parsed, parsed['subcommands'], num_subcmds) to safely extract
names (e.g., use s.get('name', '<unknown>') after verifying s is a dict, coerce
to str, and optionally filter None/empty) so the print never raises and still
shows a sensible placeholder for missing names.
- Around line 293-307: When docker_image is truthy,
BinaryFinder.find_in_container(docker_image, binary_name) may return binary_path
as None; add a guard right after that call to check if binary_path is falsy and
if so log/print the same "not found" message and return None to avoid downstream
failures in parse_command (which calls os.path.basename on binary_path). Update
the block handling docker_image to mirror the host-path error path and return
early when binary_path is missing so parse_command and any subsequent use of
binary_path don't crash.
In `@tests/test_parser.py`:
- Around line 45-47: The test test_parse_aws depends on analyze_with_ai which
early-returns if OPENAI_API_KEY is missing; modify the test_parse_aws to set the
env var using the test fixture monkeypatch (e.g.,
monkeypatch.setenv("OPENAI_API_KEY", "test-key")) before calling analyze_with_ai
so the function proceeds and your mocked requests.post is exercised; ensure the
monkeypatch call happens at the start of the test to guarantee reliability
across CI environments.
♻️ Duplicate comments (1)
src/binary_finder.py (1)
92-99: Avoid shell injection incommand -vinvocation.This still interpolates
binary_nameintosh -c, allowing injection. Use positional parameters instead.🔒 Safer parameter passing
- result = subprocess.run( - ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, - '-c', f'command -v {binary_name}'], + result = subprocess.run( + ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, + '-c', 'command -v -- "$1"', 'sh', binary_name], capture_output=True, text=True, timeout=30 )
🧹 Nitpick comments (3)
src/discover_binaries.py (3)
28-33: Prefer package-relative imports oversys.pathmutation.
Line 28–33 modifiessys.path, which can shadow installed modules and makes packaging/testing harder. Consider switching to package-relative imports and running viapython -m ...instead.♻️ Suggested refactor
-# Add src to path -sys.path.insert(0, os.path.dirname(__file__)) - -from binary_finder import BinaryFinder -from command_executor import CommandExecutor +from .binary_finder import BinaryFinder +from .command_executor import CommandExecutor
123-135: Surface image-inspect failures instead of silently swallowing them.
Line 124–134 uses a blanketexcept: pass, which hides daemon/permission issues and makes troubleshooting harder. Consider logging the error (at least WARN/DEBUG) and narrowing exception types.🔧 Suggested change
try: result = subprocess.run( ['docker', 'image', 'inspect', image], capture_output=True, timeout=10 ) if result.returncode == 0: - self.log(f" → Image available locally") + self.log(" → Image available locally") return True - except Exception: - pass + except FileNotFoundError as e: + self.log(f" → Docker not found: {e}", "WARN") + except subprocess.TimeoutExpired as e: + self.log(f" → Timeout checking image: {e}", "WARN") + except Exception as e: + self.log(f" → Error inspecting image: {e}", "WARN")
469-480: Make--updateand--dry-runmutually exclusive.
Allowing both flags can confuse users; a mutually exclusive group avoids ambiguous intent and simplifies dry-run logic.♻️ Suggested refactor
- parser.add_argument('--update', action='store_true', help='Update config file with discoveries') - parser.add_argument('--dry-run', action='store_true', help='Dry run (no config updates)') + mode = parser.add_mutually_exclusive_group() + mode.add_argument('--update', action='store_true', help='Update config file with discoveries') + mode.add_argument('--dry-run', action='store_true', help='Dry run (no config updates)') @@ - dry_run = args.dry_run or not args.update + dry_run = not args.update
src/parser.py
Outdated
| # Show first few subcommands and options | ||
| if num_subcmds > 0: | ||
| print( | ||
| f" → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guard against malformed AI output in subcommand preview.
If the model returns any subcommand missing name, this line will raise KeyError and abort parsing. Add a safe extraction path to keep parsing resilient.
💡 Proposed fix
- if num_subcmds > 0:
- print(
- f" → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}"
- )
+ if num_subcmds > 0:
+ preview_names = [
+ s.get("name")
+ for s in parsed["subcommands"][:5]
+ if isinstance(s, dict) and s.get("name")
+ ]
+ if preview_names:
+ print(
+ f" → First subcommands: {', '.join(preview_names)}{' ...' if num_subcmds > 5 else ''}"
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Show first few subcommands and options | |
| if num_subcmds > 0: | |
| print( | |
| f" → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}" | |
| ) | |
| # Show first few subcommands and options | |
| if num_subcmds > 0: | |
| preview_names = [ | |
| s.get("name") | |
| for s in parsed["subcommands"][:5] | |
| if isinstance(s, dict) and s.get("name") | |
| ] | |
| if preview_names: | |
| print( | |
| f" → First subcommands: {', '.join(preview_names)}{' ...' if num_subcmds > 5 else ''}" | |
| ) |
🤖 Prompt for AI Agents
In `@src/parser.py` around lines 192 - 196, The preview printing of subcommands
assumes every entry in parsed['subcommands'] has a 'name' key and will KeyError
on malformed AI output; update the join expression in the block that prints the
first subcommands (using parsed, parsed['subcommands'], num_subcmds) to safely
extract names (e.g., use s.get('name', '<unknown>') after verifying s is a dict,
coerce to str, and optionally filter None/empty) so the print never raises and
still shows a sensible placeholder for missing names.
| def test_parse_aws(mock_subprocess_run): | ||
| """Test parsing AWS CLI with mocked subprocess and OpenAI API calls""" | ||
| # Mock outputs for aws commands |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
git ls-files | head -30Repository: O1ahmad/cli-parser
Length of output: 1067
🏁 Script executed:
# Find the main code and test files
fd -t f -e py | grep -E "(test_|analyze|parser)" | head -20Repository: O1ahmad/cli-parser
Length of output: 96
🏁 Script executed:
# Look for analyze_with_ai function
rg "def analyze_with_ai" -A 10Repository: O1ahmad/cli-parser
Length of output: 520
🏁 Script executed:
# Check the test file around lines 45-47
cat -n tests/test_parser.py | sed -n '30,60p'Repository: O1ahmad/cli-parser
Length of output: 1629
🏁 Script executed:
# Search for OPENAI_API_KEY usage
rg "OPENAI_API_KEY" -B 2 -A 2Repository: O1ahmad/cli-parser
Length of output: 470
🏁 Script executed:
# Get more of the test_parse_aws function
cat -n tests/test_parser.py | sed -n '45,150p'Repository: O1ahmad/cli-parser
Length of output: 4679
🏁 Script executed:
# Check for conftest.py and any global fixtures
find . -name "conftest.py" -exec cat -n {} \;Repository: O1ahmad/cli-parser
Length of output: 44
🏁 Script executed:
# Check for mock_subprocess_run fixture definition
rg "def mock_subprocess_run" -B 2 -A 10Repository: O1ahmad/cli-parser
Length of output: 653
🏁 Script executed:
# Check if analyze_with_ai is called within the parse function
rg "analyze_with_ai" -B 3 -A 3Repository: O1ahmad/cli-parser
Length of output: 1032
🏁 Script executed:
# Check the CI configuration for env vars
cat -n .github/workflows/CI.ymlRepository: O1ahmad/cli-parser
Length of output: 1009
🏁 Script executed:
# Get the rest of test_parse_aws to see what it actually does
cat -n tests/test_parser.py | sed -n '150,200p'Repository: O1ahmad/cli-parser
Length of output: 2330
🏁 Script executed:
# Check if there's any mocking of OpenAI API or requests
rg "mock|patch|Mock" tests/test_parser.py -B 1 -A 1Repository: O1ahmad/cli-parser
Length of output: 859
🏁 Script executed:
# Check the actual parse function to understand the flow
cat -n src/parser.py | sed -n '1,100p'Repository: O1ahmad/cli-parser
Length of output: 5168
Set OPENAI_API_KEY in the test using monkeypatch to ensure reliability.
analyze_with_ai checks for the env var at the start and returns None if missing, before any API calls are made. Even though requests.post is mocked, the test will fail if OPENAI_API_KEY is not set, creating a fragile dependency on CI secrets configuration.
💡 Proposed fix
-def test_parse_aws(mock_subprocess_run):
+def test_parse_aws(mock_subprocess_run, monkeypatch):
+ monkeypatch.setenv("OPENAI_API_KEY", "test-key")
"""Test parsing AWS CLI with mocked subprocess and OpenAI API calls"""📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_parse_aws(mock_subprocess_run): | |
| """Test parsing AWS CLI with mocked subprocess and OpenAI API calls""" | |
| # Mock outputs for aws commands | |
| def test_parse_aws(mock_subprocess_run, monkeypatch): | |
| """Test parsing AWS CLI with mocked subprocess and OpenAI API calls""" | |
| monkeypatch.setenv("OPENAI_API_KEY", "test-key") | |
| # Mock outputs for aws commands |
🤖 Prompt for AI Agents
In `@tests/test_parser.py` around lines 45 - 47, The test test_parse_aws depends
on analyze_with_ai which early-returns if OPENAI_API_KEY is missing; modify the
test_parse_aws to set the env var using the test fixture monkeypatch (e.g.,
monkeypatch.setenv("OPENAI_API_KEY", "test-key")) before calling analyze_with_ai
so the function proceeds and your mocked requests.post is exercised; ensure the
monkeypatch call happens at the start of the test to guarantee reliability
across CI environments.
a2811bd to
72e8228
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/command_executor.py`:
- Around line 196-197: The return type annotation for test_help_variations is
incorrect: change Dict[str, any] to Dict[str, Any] and add Any to the typing
imports (alongside Dict/Optional) so the annotation uses the proper typing.Any
type; update the import statement (e.g., add "Any") and replace the annotation
on the static method test_help_variations accordingly.
- Around line 100-113: The help_indicators list in command_executor.py contains
a duplicate "usage:" entry; edit the help_indicators definition (variable
help_indicators) to remove the redundant "usage:" string (the third occurrence)
so each indicator is unique (keep "usage:" and "usage :" as you prefer),
ensuring the list no longer contains duplicate entries.
♻️ Duplicate comments (3)
tests/test_parser.py (1)
45-46: SetOPENAI_API_KEYto ensure test reliability.The test depends on
analyze_with_aiwhich early-returnsNoneifOPENAI_API_KEYis not set (before any API calls are made). Even thoughrequests.postis mocked, the test will fail in environments without this env var set, creating CI fragility.🐛 Fix: Add monkeypatch fixture
-def test_parse_aws(mock_subprocess_run): +def test_parse_aws(mock_subprocess_run, monkeypatch): + monkeypatch.setenv("OPENAI_API_KEY", "test-key") """Test parsing AWS CLI with mocked subprocess and OpenAI API calls"""src/binary_finder.py (1)
92-104: Shell injection vulnerability in container command check.The
f'command -v {binary_name}'string passed tosh -callows command injection. Ifbinary_namecontains shell metacharacters (e.g.,test; rm -rf /), arbitrary commands execute in the container.🔒 Safer parameter passing
try: result = subprocess.run( ['docker', 'run', '--rm', '--entrypoint', 'sh', docker_image, - '-c', f'command -v {binary_name}'], + '-c', 'command -v -- "$1"', 'sh', binary_name], capture_output=True, text=True, timeout=30 )This passes
binary_nameas a positional parameter$1to the shell, avoiding interpolation.src/parser.py (1)
188-192: Guard against malformed AI output in subcommand/option preview.If the AI returns a subcommand without a
namekey, line 189 will raiseKeyErrorand abort parsing. Use safe access to maintain resilience.🐛 Safer extraction
if num_subcmds > 0: - print(f" → First subcommands: {', '.join([s['name'] for s in parsed['subcommands'][:5]])}{' ...' if num_subcmds > 5 else ''}") + preview_names = [s.get('name', '<unnamed>') for s in parsed['subcommands'][:5] if isinstance(s, dict)] + if preview_names: + print(f" → First subcommands: {', '.join(preview_names)}{' ...' if num_subcmds > 5 else ''}") if num_opts > 0: - print(f" → First options: {', '.join([o.get('option', o.get('shortcut', '?')) for o in parsed['options'][:5]])}{' ...' if num_opts > 5 else ''}") + preview_opts = [o.get('option', o.get('shortcut', '?')) for o in parsed['options'][:5] if isinstance(o, dict)] + if preview_opts: + print(f" → First options: {', '.join(preview_opts)}{' ...' if num_opts > 5 else ''}")
🧹 Nitpick comments (4)
src/process_cli_tools_v2.py (1)
322-335: Variableparsershadows the importedparsermodule.The
argparse.ArgumentParserinstance at line 322 shadows theparsermodule imported at line 24. While this doesn't cause a runtime issue inmain()since it doesn't callparser.parse_binaryafter this point, it's confusing and error-prone.♻️ Suggested fix: rename the argument parser
def main(): """CLI entry point""" - parser = argparse.ArgumentParser( + arg_parser = argparse.ArgumentParser( description="Process multiple CLI tools from configuration", formatter_class=argparse.RawDescriptionHelpFormatter, ) - parser.add_argument("config", help="Path to config JSON file") - parser.add_argument("--output-dir", default="data/results", help="Base output directory") - parser.add_argument("--max-depth", type=int, default=20, help="Maximum recursion depth (default: 20)") - parser.add_argument("--only", nargs="+", help="Only process these tools (by name)") - parser.add_argument("--parallel", action="store_true", help="Process tools in parallel") - parser.add_argument("--workers", type=int, default=4, help="Number of parallel workers (default: 4)") - parser.add_argument("--no-skip", action="store_true", help="Reprocess even if output exists") + arg_parser.add_argument("config", help="Path to config JSON file") + arg_parser.add_argument("--output-dir", default="data/results", help="Base output directory") + arg_parser.add_argument("--max-depth", type=int, default=20, help="Maximum recursion depth (default: 20)") + arg_parser.add_argument("--only", nargs="+", help="Only process these tools (by name)") + arg_parser.add_argument("--parallel", action="store_true", help="Process tools in parallel") + arg_parser.add_argument("--workers", type=int, default=4, help="Number of parallel workers (default: 4)") + arg_parser.add_argument("--no-skip", action="store_true", help="Reprocess even if output exists") - args = parser.parse_args() + args = arg_parser.parse_args()src/command_executor.py (1)
186-193: Moveimport reto module level and simplify condition.The
import reinside the function (line 189) is executed on every call, adding unnecessary overhead. Also,len(output) > 0is redundant whenoutputis already truthy.♻️ Suggested fix
Move import to top of file:
import re import subprocess from typing import Dict, List, Optional, TupleThen simplify the condition:
- if output and (exit_code in [0, 1]) and len(output) > 0: + if output and exit_code in [0, 1]: # Check if it looks like version output (has numbers) - import re - if re.search(r"\d+\.\d+", output): return outputsrc/parser.py (1)
87-96: Consider lowering temperature for more consistent parsing.A temperature of 0.7 can produce varied outputs. For structured JSON parsing tasks, a lower temperature (0.0–0.3) typically yields more deterministic and consistent results.
'response_format': {'type': "json_object"}, - 'temperature': 0.7, + 'temperature': 0.1, }src/binary_finder.py (1)
204-222: LGTM on--entrypointusage; consider logging silenced exceptions.The Docker command now correctly uses
--entrypointto test the specific binary. However, the bareexceptat line 220 silently swallows all exceptions - consider logging these for debugging.- except (subprocess.TimeoutExpired, Exception): + except subprocess.TimeoutExpired: + continue + except Exception as e: + # Log for debugging but continue trying other variations + print(f" → Help variation failed: {e}") continue
| help_indicators = [ | ||
| "usage:", | ||
| "usage :", | ||
| "usage:", | ||
| "options:", | ||
| "flags:", | ||
| "commands:", | ||
| "subcommands:", | ||
| "help", | ||
| "examples:", | ||
| "arguments:", | ||
| "synopsis", | ||
| "description:", | ||
| ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate entry in help_indicators list.
Lines 101 and 103 both contain "usage:". Remove the duplicate.
🐛 Fix
help_indicators = [
"usage:",
- "usage :",
"usage:",
+ "usage :",
"options:",Wait, looking again - line 101 is "usage:", line 102 is "usage :" (with space), and line 103 is "usage:" again. Remove line 103:
help_indicators = [
"usage:",
"usage :",
- "usage:",
"options:",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| help_indicators = [ | |
| "usage:", | |
| "usage :", | |
| "usage:", | |
| "options:", | |
| "flags:", | |
| "commands:", | |
| "subcommands:", | |
| "help", | |
| "examples:", | |
| "arguments:", | |
| "synopsis", | |
| "description:", | |
| ] | |
| help_indicators = [ | |
| "usage:", | |
| "usage :", | |
| "options:", | |
| "flags:", | |
| "commands:", | |
| "subcommands:", | |
| "help", | |
| "examples:", | |
| "arguments:", | |
| "synopsis", | |
| "description:", | |
| ] |
🤖 Prompt for AI Agents
In `@src/command_executor.py` around lines 100 - 113, The help_indicators list in
command_executor.py contains a duplicate "usage:" entry; edit the
help_indicators definition (variable help_indicators) to remove the redundant
"usage:" string (the third occurrence) so each indicator is unique (keep
"usage:" and "usage :" as you prefer), ensuring the list no longer contains
duplicate entries.
| @staticmethod | ||
| def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix type annotation: use Any instead of any.
any (lowercase) is a built-in function, not a type. The return type annotation should use Any from the typing module.
🐛 Fix
Update the import at line 12:
-from typing import Dict, List, Optional, Tuple
+from typing import Any, Dict, List, Optional, TupleFix the annotation:
- def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]:
+ def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, Any]:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @staticmethod | |
| def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]: | |
| from typing import Any, Dict, List, Optional, Tuple |
| @staticmethod | |
| def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, any]: | |
| `@staticmethod` | |
| def test_help_variations(binary_path: str, docker_image: Optional[str] = None) -> Dict[str, Any]: |
🤖 Prompt for AI Agents
In `@src/command_executor.py` around lines 196 - 197, The return type annotation
for test_help_variations is incorrect: change Dict[str, any] to Dict[str, Any]
and add Any to the typing imports (alongside Dict/Optional) so the annotation
uses the proper typing.Any type; update the import statement (e.g., add "Any")
and replace the annotation on the static method test_help_variations
accordingly.
Note
Introduces a robust, AI-assisted CLI parsing and binary discovery toolchain with Docker/host support, plus tests and developer tooling.
src/binary_finder.py(find executables in containers/host with caching and timeouts) andsrc/command_executor.py(help/version execution with fallbacks)src/discover_binaries.pyto auto-discover/verify binaries in images and update configs;src/process_cli_tools_v2.pyto batch-parse tools (sequential/parallel) using the unifiedsrc/parser.pyparser_v2.pywith enhancedsrc/parser.py(AI-driven help/version parsing, recursive subcommand extraction); updates imports to use ittests/test_parser.pywith mocked subprocess/OpenAI to validate AWS CLI parsing.pre-commit-config.yaml; Makefilecleanstops deletingresult.json; minorsetup.pyformatting tweaksWritten by Cursor Bugbot for commit 72e8228. This will update automatically on new commits. Configure here.
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Removed
✏️ Tip: You can customize this high-level summary in your review settings.