🐛 git-delta munges filenames with hyphens #1259

Closed
nickurak opened this issue Dec 14, 2022 · 4 comments

Comments

@nickurak

Reproduction steps:

Setup:

cd $(mktemp -d git-delta-bug-XXXX)
git init
echo hello > some-file
git add some-file
git commit -m some-file

Symptoms: the "-" in the file name is replaced with ":".

git grep hello; git grep -l hello

returns:

some:file:hello
some:file

instead of the expected no-pager output of:

git --no-pager grep hello; git --no-pager grep -l hello

which yields:

some-file:hello
some-file

Version:

$ delta --version
delta 0.15.1

@matttbe

matttbe commented May 4, 2023

Hello,

It looks like the hyphens are also replaced in the branch names, not just the filenames, e.g.:

$ git --no-pager grep mptcp_ t/upstream-net
t/upstream-net:<file>:<match>  ## without Delta
(...)

$ git grep mptcp_ t/upstream-net
t/upstream:net:<file>:<match>  ## with Delta
(...)

(with Delta, we have t/upstream:net instead of t/upstream-net for the branch name)

@dandavison
Owner

Hi @nickurak @matttbe, ultimately the problem here is that traditional grep output is not unambiguously parseable, because the characters it uses as delimiters can also occur in file names. That said, it may well be possible to fix the bugs you're highlighting. Check out the (not nice) code, which explains the problem and attempts to solve it by trying various regexes:

delta/src/handlers/grep.rs

Lines 320 to 325 in ce41a39

fn make_grep_line_regex(regex_variant: GrepLineRegex) -> Regex {
    // Grep tools such as `git grep` and `rg` emit lines like the following,
    // where "xxx" represents arbitrary code. Note that there are 3 possible
    // "separator characters": ':', '-', '='.
    // The format is ambiguous, but we attempt to parse it.

If you can see how to improve that code and make it work for a larger set of inputs I'd be happy to accept changes: the code does at least have decent unit tests so you can experiment with different approaches fairly easily.
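
To make the ambiguity concrete, here is a minimal, std-only sketch. It is not delta's actual parser: `Candidate` and `candidate_parses` are hypothetical names invented for illustration, and the only thing taken from the code above is the set of three separator characters. It simply lists every way a line could be split into a path and the remainder at one of those separators.

```rust
/// The three separator characters named in the comment quoted above.
const SEPARATORS: [char; 3] = [':', '-', '='];

/// One hypothetical reading of a grep output line as `<path><sep><rest>`.
/// (Illustrative type; not part of delta.)
#[derive(Debug, PartialEq)]
struct Candidate<'a> {
    path: &'a str,
    separator: char,
    rest: &'a str,
}

/// Enumerate every split of `line` at a separator character.
/// A line with more than one candidate cannot be parsed unambiguously
/// without extra knowledge (for example, which files actually exist).
fn candidate_parses(line: &str) -> Vec<Candidate<'_>> {
    line.char_indices()
        // Skip index 0 so the path is never empty.
        .filter(|&(i, c)| i > 0 && SEPARATORS.contains(&c))
        .map(|(i, c)| Candidate {
            path: &line[..i],
            separator: c,
            rest: &line[i + c.len_utf8()..],
        })
        .collect()
}

fn main() {
    for line in ["some-file:hello", "somefile:hello"] {
        let parses = candidate_parses(line);
        println!("{:?} has {} candidate parse(s):", line, parses.len());
        for p in &parses {
            println!("  path={:?} sep={:?} rest={:?}", p.path, p.separator, p.rest);
        }
    }
}
```

For `some-file:hello` the sketch finds two candidates: path `some` with separator `-`, and path `some-file` with separator `:`. Nothing in the line itself says which one is right, so any regex-based parser has to guess.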

Personally, I would say the best solution is to use rg --json | delta, which has no parsing ambiguities because the output is structured JSON. However, I understand that git grep provides some functionality that ripgrep doesn't.

@rossburton

I just hit this and got incredibly confused when git grep was telling me about files that don't exist.

@dandavison
Owner

@rossburton is it possible for you to use rg --json instead? The delta manual contains an entry addressing this:

If you don't need special features of git grep, then for best results pipe rg --json output to delta: this avoids parsing ambiguities that are inevitable with the output of git grep and grep.

Copying from #1631

See the ~500 lines of unit tests here: https://github.com/dandavison/delta/blob/main/src/handlers/grep.rs#L653-L1177. If someone can improve the parsing while keeping all those tests passing (and hopefully adding a test for what you're fixing) that would be fantastic.

I'm going to close this since I have personally taken a fairly large stab at it and also implemented rg --json support for a fully robust solution.
