@aibrahim-oai
Collaborator

@aibrahim-oai aibrahim-oai commented Jan 8, 2026

Make command summaries more accurate by distinguishing list, search, and read operations across common CLI tools.

  • Added parsing helpers to centralize operand/flag handling (cd_target, sed_read_path, first_non_flag_operand, single_non_flag_operand, parse_grep_like, awk_data_file_operand, python_walks_files, is_python_command) and reused them in summarize_main_tokens/shell parsing.
  • Newly parsed list-files commands: git ls-files, rg --files (incl. rga/ripgrep-all), eza/exa, tree, du, python -c file-walks, plus fd/find map to ListFiles when no query.
  • Newly parsed search commands: git grep, grep/egrep/fgrep, ag/ack/pt, rg/rga files-with-matches flags (-l/-L, --files-with-matches, --files-without-match), with improved flag skipping to avoid misclassifying args as paths.
  • Newly parsed read commands: bat/batcat, less, more, awk, and more flexible sed -n range + file detection.
  • Refined “small formatting command” detection for awk/sed, handled cd with -- or multiple operands, and kept pipeline summaries focused on the primary command.
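
As a rough illustration of the list/search/read split described above, here is a simplified sketch. It is not the PR's code: the `Kind` enum and `classify` function are hypothetical names, and the real parser returns `ParsedCommand` values and handles many more flags and tools.

```rust
/// Hypothetical, simplified stand-in for the summary categories.
#[derive(Debug, PartialEq)]
enum Kind {
    ListFiles,
    Search,
    Read,
    Unknown,
}

/// Classify an already-tokenized command by its head and a few key flags.
fn classify(tokens: &[&str]) -> Kind {
    match tokens {
        ["git", "ls-files", ..] => Kind::ListFiles,
        ["git", "grep", ..] => Kind::Search,
        // `rg --files` lists files; other rg invocations search.
        ["rg" | "rga", rest @ ..] if rest.contains(&"--files") => Kind::ListFiles,
        ["rg" | "rga" | "grep" | "egrep" | "fgrep" | "ag" | "ack" | "pt", ..] => Kind::Search,
        ["eza" | "exa" | "tree" | "du", ..] => Kind::ListFiles,
        // Pagers and viewers count as reads only when given an operand.
        ["bat" | "batcat" | "less" | "more", _, ..] => Kind::Read,
        _ => Kind::Unknown,
    }
}
```

For example, `classify(&["rg", "--files"])` yields `Kind::ListFiles`, while `classify(&["rg", "-l", "TODO"])` yields `Kind::Search`.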

@edward-bayes edward-bayes changed the title from "improve parsed commands" to "[CODEX-4427] improve parsed commands" on Jan 9, 2026
@aibrahim-oai
Collaborator Author

@codex fix lint and tests

@chatgpt-codex-connector
Contributor

Summary

Testing

  • just fmt
  • just fix -p codex-core
  • ⚠️ cargo test -p codex-core (fails in this environment: loopback 403 responses, /bin/sh incompatibility, and exec timeout test still failing)
  • cargo test -p codex-core exec::tests::kill_child_process_group_kills_grandchildren_on_timeout -- --nocapture
  • ⚠️ Not run (requires approval): cargo test --all-features

@aibrahim-oai
Collaborator Author

@codex review

Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5e59ff1a1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines 1625 to 1647
let args_no_connector = trim_at_connector(args);
let non_flags: Vec<&String> = args_no_connector
    .iter()
    .filter(|p| !p.starts_with('-'))
    .collect();
Contributor

P2 Badge Skip grep option values before extracting query

The new parse_grep_like only drops args starting with -, so option values that are positional (e.g. -m 1, -C 3) are treated as the search pattern. Per git grep -h, flags like -m, --max-count <n> and -C <n> take a value, so a command such as git grep -m 1 TODO src will be summarized as searching for 1 with path TODO instead of pattern TODO and path src. Now that git/egrep/fgrep parsing routes here, this mislabels common usages and should skip flag values similarly to skip_flag_values (and respect --).
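
To make the suggestion concrete, here is a minimal sketch of skipping option values while collecting operands. The function name and the `flags_with_values` parameter are hypothetical and are not the repository's `skip_flag_values`; the sketch also ignores `--`, which is a separate concern.

```rust
/// Hypothetical helper: collect positional operands, dropping flags and the
/// value token that follows any flag listed in `flags_with_values`.
fn operands_skipping_flag_values(args: &[String], flags_with_values: &[&str]) -> Vec<String> {
    let mut operands = Vec::new();
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if flags_with_values.contains(&arg.as_str()) {
            // The next token is the flag's value (e.g. the `1` in `-m 1`).
            iter.next();
        } else if arg.starts_with('-') && arg.len() > 1 {
            // A flag with no separate value token (e.g. `-n`, `--max-count=1`).
            continue;
        } else {
            operands.push(arg.clone());
        }
    }
    operands
}
```

With `-m`, `--max-count`, and `-C` listed as value-taking flags, the arguments of `git grep -m 1 TODO src` reduce to the operands `TODO` and `src`, so the pattern and path are no longer shifted by one.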

Comment on lines +2240 to 2308
if let Some(path) = single_non_flag_operand(tail, &[]) {
    let name = short_display_path(&path);
    ParsedCommand::Read {
        cmd: shlex_join(main_cmd),
Contributor

P2 Badge Preserve operands after -- for cat reads

single_non_flag_operand filters out any operand that starts with -, which means cat -- -weird (a standard way to read a file whose name begins with -) is now parsed as Unknown. The previous logic handled cat -- <file> without rejecting dash-prefixed filenames, so this change regresses support for that valid CLI form. This also affects other readers using the same helper (e.g., bat/less/more).

@aibrahim-oai
Collaborator Author

@codex review

Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5e59ff1a1

Comment on lines 1624 to 1647
fn parse_grep_like(main_cmd: &[String], args: &[String]) -> ParsedCommand {
    let args_no_connector = trim_at_connector(args);
    let non_flags: Vec<&String> = args_no_connector
        .iter()
        .filter(|p| !p.starts_with('-'))
        .collect();
Contributor

P2 Badge Handle grep patterns/paths after --

The new parse_grep_like drops every argument that starts with - and does not treat -- as the end of options. That means valid commands like grep -- -foo src or git grep -- -foo src (where the pattern or path intentionally starts with -) are parsed as query="src" with no path, which is incorrect. This regression only affects grep-like commands that rely on -- to pass dash-prefixed operands. Consider honoring -- and including subsequent operands even if they start with - (e.g., via skip_flag_values and/or explicit -- handling).
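
The `--` handling requested here could look roughly like the sketch below. The function name is hypothetical, and option values are deliberately not handled so that the end-of-options logic stays visible.

```rust
/// Hypothetical helper: before `--`, dash-prefixed tokens are treated as
/// flags and dropped; after `--`, every token is kept as an operand.
fn operands_honoring_double_dash(args: &[String]) -> Vec<String> {
    let mut operands = Vec::new();
    let mut after_double_dash = false;
    for arg in args {
        if after_double_dash {
            // End of options was seen: keep the token even if it starts with `-`.
            operands.push(arg.clone());
        } else if arg == "--" {
            after_double_dash = true;
        } else if !arg.starts_with('-') {
            operands.push(arg.clone());
        }
    }
    operands
}
```

For the arguments of `grep -- -foo src`, this keeps `-foo` as the first operand (the pattern) and `src` as the second (the path), instead of discarding `-foo`.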

Collaborator

I think we should take a second look at this function given the dual comments from codex here

Comment on lines 2171 to +2238
Some((head, tail)) if head == "cat" => {
// Support both `cat <file>` and `cat -- <file>` forms.
let effective_tail: &[String] = if tail.first().map(String::as_str) == Some("--") {
&tail[1..]
if let Some(path) = single_non_flag_operand(tail, &[]) {
let name = short_display_path(&path);
Contributor

P3 Badge Preserve cat -- -file path parsing

The cat parser now uses single_non_flag_operand, which filters out operands starting with - even after a -- sentinel. As a result, a valid invocation like cat -- -weird (file named -weird) is no longer recognized as a Read, whereas it was previously supported via the -- handling. This only affects dash-prefixed filenames, but it is a real regression for standard cat semantics. Consider special-casing -- (or avoiding the starts_with('-') filter after it) for cat.

Collaborator

I think this is a fairly specific edge case, but this does indeed seem like a regression @aibrahim-oai

}
}
}
Some((head, tail)) if is_python_command(head) => {
Collaborator

@codex does head here strip the path to the binary?

Contributor

Answer

No. head is the raw first token from main_cmd.split_first(), and is_python_command(head) is evaluated directly against that string, so any leading path (e.g., /usr/bin/python3) is not stripped before the check at the call site. If the command token is a path, is_python_command will not match unless you add explicit path-stripping logic elsewhere. (codex-rs/core/src/parse_command.rs, L2388-L2415)

The only normalization that happens prior to this is normalize_tokens, which removes simple yes/no pipeline prefixes and bash/zsh -c/-lc wrappers; it does not reduce command paths to their basenames. (codex-rs/core/src/parse_command.rs, L1502-L1522)
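
If path stripping were wanted, one possible approach is to compare against the basename of the command token. The sketch below is only an assumption about how that could be done: `command_basename` and `is_python_like` are hypothetical names, and the set of interpreter names accepted here is illustrative rather than what `is_python_command` actually checks.

```rust
/// Hypothetical helper: reduce a command token such as `/usr/bin/python3`
/// to its final path component, falling back to the token itself.
fn command_basename(token: &str) -> &str {
    std::path::Path::new(token)
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or(token)
}

/// Hypothetical check performed on the basename rather than the raw token.
/// The accepted names are an assumption made for this example.
fn is_python_like(token: &str) -> bool {
    let name = command_basename(token);
    name == "python" || name == "python3" || name.starts_with("python3.")
}
```

Under this sketch, both `python3` and `/usr/bin/python3` would be recognized, while `/usr/bin/perl` would not.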

@aibrahim-oai
Collaborator Author

@codex address all feedback above

@chatgpt-codex-connector
Contributor

Summary

  • Added positional operand parsing that respects -- and skips option values, then reused it for first/single operand helpers and grep-like parsing to correctly handle dash-prefixed operands and option arguments. (codex-rs/core/src/parse_command.rs, L1606-L1686)

Testing

  • just fmt
  • just fix -p codex-core
  • cargo test -p codex-core (failed: exec::tests::kill_child_process_group_kills_grandchildren_on_timeout, unified_exec::tests::reusing_completed_process_returns_unknown_process, unified_exec::tests::multi_unified_exec_sessions; unified_exec_persists_across_requests hung and was interrupted with Ctrl+C).

@aibrahim-oai aibrahim-oai force-pushed the improve/parsed/commands branch from b5e59ff to ccb61a9 on January 13, 2026 18:05
@aibrahim-oai
Collaborator Author

@codex review this

Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c4a4666409

Comment on lines 1707 to 1711
// Do not shorten the query: grep patterns may legitimately contain slashes
// and should be preserved verbatim. Only paths should be shortened.
let query = operands.first().cloned().map(String::from);
let path = operands.get(1).map(|s| short_display_path(s));
ParsedCommand::Search {
Contributor

P2 Badge Preserve search path when grep uses -f pattern files

When -f/--file is used (e.g., grep -f patterns.txt src or git grep -f patterns.txt src), parse_grep_like skips the pattern-file argument and then treats the first remaining operand as the query ("src") and omits the path entirely. This produces incorrect summaries for a common grep invocation because the actual search path becomes the “query” and no path is reported. Consider treating -f/--file as supplying the query (or setting query to the pattern file) and then using the first remaining operand as the path.
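
One way to read this suggestion is sketched below: when a pattern file is given, report it as the query and treat the first remaining operand as the path. `GrepSummary` and `summarize_grep_args` are hypothetical names, and option values other than `-f`/`--file` are ignored to keep the example short.

```rust
/// Hypothetical, simplified summary of a grep-like invocation.
#[derive(Debug, PartialEq)]
struct GrepSummary {
    query: Option<String>,
    path: Option<String>,
}

/// Sketch: if `-f FILE` / `--file FILE` / `--file=FILE` is present, the
/// pattern file stands in for the query and the first operand is the path.
fn summarize_grep_args(args: &[String]) -> GrepSummary {
    let mut pattern_file = None;
    let mut operands = Vec::new();
    let mut iter = args.iter();
    while let Some(arg) = iter.next() {
        if arg == "-f" || arg == "--file" {
            // The following token names the pattern file.
            pattern_file = iter.next().cloned();
        } else if let Some(value) = arg.strip_prefix("--file=") {
            pattern_file = Some(value.to_string());
        } else if !arg.starts_with('-') {
            operands.push(arg.clone());
        }
    }
    match pattern_file {
        Some(file) => GrepSummary { query: Some(file), path: operands.first().cloned() },
        None => GrepSummary { query: operands.first().cloned(), path: operands.get(1).cloned() },
    }
}
```

For the arguments of `grep -f patterns.txt src`, this reports `patterns.txt` as the query and `src` as the path, rather than promoting `src` to the query.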

@aibrahim-oai
Collaborator Author

@codex review this

Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bcd381226a

Comment on lines +1477 to +1481
fn sed_read_path(args: &[String]) -> Option<String> {
    let args_no_connector = trim_at_connector(args);
    if !args_no_connector.iter().any(|arg| arg == "-n") {
        return None;
    }
Contributor

P2 Badge Avoid treating in-place sed as a read

The new sed_read_path helper only checks for -n and a range script and never filters out in‑place edit flags (e.g., -i/--in-place). As a result, commands like sed -i -n 10p file will be summarized as ParsedCommand::Read even though -i rewrites the file (and with -n can truncate it to just the printed range). This mislabels mutating commands as reads, which can hide real file modifications in the command summaries.
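
A guard along these lines could be placed ahead of the `-n` check. The sketch below uses hypothetical function names and is intentionally conservative: any short-flag token containing `i` is treated as in-place, which can over-report (for example an attached value such as `-fscript.sed`) but should not label a mutating command as a read.

```rust
/// Hypothetical guard: does this sed invocation request in-place editing?
/// Covers `-i`, `-i.bak`, clustered short flags such as `-ni`, and the long
/// forms `--in-place` / `--in-place=SUFFIX`.
fn sed_edits_in_place(args: &[String]) -> bool {
    args.iter().any(|arg| {
        arg == "--in-place"
            || arg.starts_with("--in-place=")
            // Short-flag token: conservatively treat any `i` as in-place.
            || (arg.starts_with('-') && !arg.starts_with("--") && arg[1..].contains('i'))
    })
}

/// Sketch of the combined condition: `-n` present and no in-place flag.
fn sed_looks_like_read(args: &[String]) -> bool {
    args.iter().any(|arg| arg == "-n") && !sed_edits_in_place(args)
}
```

With this guard, the arguments of `sed -i -n 10p file` are rejected as a read, while those of `sed -n 10p file` still qualify.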

@aibrahim-oai aibrahim-oai merged commit c96c26c into main Jan 14, 2026
32 checks passed
@aibrahim-oai aibrahim-oai deleted the improve/parsed/commands branch January 14, 2026 00:59
@github-actions github-actions bot locked and limited conversation to collaborators Jan 14, 2026