-
Notifications
You must be signed in to change notification settings - Fork 7.2k
[CODEX-4427] improve parsed commands #8933
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@codex fix lint annd tests |
|
Summary
Testing
|
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b5e59ff1a1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
codex-rs/core/src/parse_command.rs
Outdated
| let args_no_connector = trim_at_connector(args); | ||
| let non_flags: Vec<&String> = args_no_connector | ||
| .iter() | ||
| .filter(|p| !p.starts_with('-')) | ||
| .collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Skip grep option values before extracting query
The new parse_grep_like only drops args starting with -, so option values that are positional (e.g. -m 1, -C 3) are treated as the search pattern. Per git grep -h, flags like -m, --max-count <n> and -C <n> take a value, so a command such as git grep -m 1 TODO src will be summarized as searching for 1 with path TODO instead of pattern TODO and path src. Now that git/egrep/fgrep parsing routes here, this mislabels common usages and should skip flag values similarly to skip_flag_values (and respect --).
Useful? React with 👍 / 👎.
| if let Some(path) = single_non_flag_operand(tail, &[]) { | ||
| let name = short_display_path(&path); | ||
| ParsedCommand::Read { | ||
| cmd: shlex_join(main_cmd), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve operands after
-- for cat reads
single_non_flag_operand filters out any operand that starts with -, which means cat -- -weird (a standard way to read a file whose name begins with -) is now parsed as Unknown. The previous logic handled cat -- <file> without rejecting dash-prefixed filenames, so this change regresses support for that valid CLI form. This also affects other readers using the same helper (e.g., bat/less/more).
Useful? React with 👍 / 👎.
|
@codex review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b5e59ff1a1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
codex-rs/core/src/parse_command.rs
Outdated
| fn parse_grep_like(main_cmd: &[String], args: &[String]) -> ParsedCommand { | ||
| let args_no_connector = trim_at_connector(args); | ||
| let non_flags: Vec<&String> = args_no_connector | ||
| .iter() | ||
| .filter(|p| !p.starts_with('-')) | ||
| .collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle grep patterns/paths after
--
The new parse_grep_like drops every argument that starts with - and does not treat -- as the end of options. That means valid commands like grep -- -foo src or git grep -- -foo src (where the pattern or path intentionally starts with -) are parsed as query="src" with no path, which is incorrect. This regression only affects grep-like commands that rely on -- to pass dash-prefixed operands. Consider honoring -- and including subsequent operands even if they start with - (e.g., via skip_flag_values and/or explicit -- handling).
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should take a second look at this function given the dual comments from codex here
| Some((head, tail)) if head == "cat" => { | ||
| // Support both `cat <file>` and `cat -- <file>` forms. | ||
| let effective_tail: &[String] = if tail.first().map(String::as_str) == Some("--") { | ||
| &tail[1..] | ||
| if let Some(path) = single_non_flag_operand(tail, &[]) { | ||
| let name = short_display_path(&path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve
cat -- -file path parsing
The cat parser now uses single_non_flag_operand, which filters out operands starting with - even after a -- sentinel. As a result, a valid invocation like cat -- -weird (file named -weird) is no longer recognized as a Read, whereas it was previously supported via the -- handling. This only affects dash-prefixed filenames, but it is a real regression for standard cat semantics. Consider special-casing -- (or avoiding the starts_with('-') filter after it) for cat.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a fairly specific edge case, but this does indeed seem like a regression @aibrahim-oai
codex-rs/core/src/parse_command.rs
Outdated
| fn parse_grep_like(main_cmd: &[String], args: &[String]) -> ParsedCommand { | ||
| let args_no_connector = trim_at_connector(args); | ||
| let non_flags: Vec<&String> = args_no_connector | ||
| .iter() | ||
| .filter(|p| !p.starts_with('-')) | ||
| .collect(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should take a second look at this function given the dual comments from codex here
| } | ||
| } | ||
| } | ||
| Some((head, tail)) if is_python_command(head) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@codex does head here strip the path to the binary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answer
No—head is the raw first token from main_cmd.split_first(), and is_python_command(head) is evaluated directly against that string, so any leading path (e.g., /usr/bin/python3) is not stripped before the check at the call site. If the command token is a path, is_python_command will not match unless you add explicit path-stripping logic elsewhere. codex-rs/core/src/parse_command.rsL2388-L2415
The only normalization that happens prior to this is normalize_tokens, which removes simple yes/no pipeline prefixes and bash/zsh -c/-lc wrappers; it does not strip command paths or basenames. codex-rs/core/src/parse_command.rsL1502-L1522
Commands run
nl -ba /workspace/codex/codex-rs/core/src/parse_command.rs | sed -n '2388,2415p'codex-rs/core/src/parse_command.rsL2388-L2415nl -ba /workspace/codex/codex-rs/core/src/parse_command.rs | sed -n '1495,1538p'codex-rs/core/src/parse_command.rsL1495-L1538
| Some((head, tail)) if head == "cat" => { | ||
| // Support both `cat <file>` and `cat -- <file>` forms. | ||
| let effective_tail: &[String] = if tail.first().map(String::as_str) == Some("--") { | ||
| &tail[1..] | ||
| if let Some(path) = single_non_flag_operand(tail, &[]) { | ||
| let name = short_display_path(&path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a fairly specific edge case, but this does indeed seem like a regression @aibrahim-oai
|
@codex address all feedback above |
|
Summary
Testing
|
b5e59ff to
ccb61a9
Compare
|
@codex review this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c4a4666409
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Do not shorten the query: grep patterns may legitimately contain slashes | ||
| // and should be preserved verbatim. Only paths should be shortened. | ||
| let query = operands.first().cloned().map(String::from); | ||
| let path = operands.get(1).map(|s| short_display_path(s)); | ||
| ParsedCommand::Search { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preserve search path when grep uses -f pattern files
When -f/--file is used (e.g., grep -f patterns.txt src or git grep -f patterns.txt src), parse_grep_like skips the pattern-file argument and then treats the first remaining operand as the query ("src") and omits the path entirely. This produces incorrect summaries for a common grep invocation because the actual search path becomes the “query” and no path is reported. Consider treating -f/--file as supplying the query (or setting query to the pattern file) and then using the first remaining operand as the path.
Useful? React with 👍 / 👎.
|
@codex review this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bcd381226a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| fn sed_read_path(args: &[String]) -> Option<String> { | ||
| let args_no_connector = trim_at_connector(args); | ||
| if !args_no_connector.iter().any(|arg| arg == "-n") { | ||
| return None; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid treating in-place sed as a read
The new sed_read_path helper only checks for -n and a range script and never filters out in‑place edit flags (e.g., -i/--in-place). As a result, commands like sed -i -n 10p file will be summarized as ParsedCommand::Read even though -i rewrites the file (and with -n can truncate it to just the printed range). This mislabels mutating commands as reads, which can hide real file modifications in the command summaries.
Useful? React with 👍 / 👎.
make command summaries more accurate by distinguishing list/search/read operations across common CLI tools.