Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-ASCII file paths not correctly extracted in changedFilesNames function #9114

Closed
anda3 opened this issue May 23, 2024 · 1 comment · Fixed by #9115
Closed

Non-ASCII file paths not correctly extracted in changedFilesNames function #9114

anda3 opened this issue May 23, 2024 · 1 comment · Fixed by #9115
Labels
bug Something isn't working gh-pr relating to the gh pr command help wanted Contributions welcome p3 Affects a small number of users or is largely cosmetic

Comments

@anda3
Copy link
Contributor

anda3 commented May 23, 2024

Describe the bug

The changedFilesNames function in diff.go does not correctly extract file paths that contain non-ASCII characters when the core.quotePath git parameter is set to true.

Version of GitHub CLI: gh version 2.49.2 (2024-05-13)

Steps to reproduce the behavior

  1. Set core.quotePath git parameter to true.
  2. Create a file with non-ASCII characters in its path.
  3. Run a git diff that includes the file.
  4. Pass the diff to the changedFilesNames function.

Expected vs actual behavior

Expected behavior: The function should correctly extract and output the file path, even if it contains non-ASCII characters and is quoted due to the core.quotePath git parameter being set to true.

Actual behavior: The function does not correctly extract the file path if it contains non-ASCII characters and is quoted.

Additional context

The issue seems to be with the regular expression used to extract the file paths from the diff. It does not account for quoted paths.

Possible solution

Update the regular expression to correctly handle quoted paths.

@anda3 anda3 added the bug Something isn't working label May 23, 2024
@cliAutomation cliAutomation added the needs-triage needs to be reviewed label May 23, 2024
@williammartin williammartin added the gh-pr relating to the gh pr command label May 23, 2024
@williammartin
Copy link
Member

Initial Exploration

Firstly, understanding what core.quotePath does:

➜ touch "hello-😀-"world"

➜ git config --global core.quotepath false

➜ git status
On branch wm/9114-triage
Your branch is up to date with 'origin/wm/9114-triage'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        hello-😀-world

nothing added to commit but untracked files present (use "git add" to track)

➜ git config --global core.quotepath true

➜  git status
On branch wm/9114-triage
Your branch is up to date with 'origin/wm/9114-triage'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        "hello-\360\237\230\200-world"

nothing added to commit but untracked files present (use "git add" to track)

I can see that core.quotePath turned on results in the file path being quoted and the emoji replaced by its octal representation.

What's the actual bug?

Firstly, thank you for the bug report, the detailed instructions and the PR. In future, it would be useful if you provided the gh commands as part of the reproduction as it took me a moment to figure out what you were actually doing since I am not familiar with this part of the code.

By default the pr diff command shows a git diff of the changes in a Pull Request. We can see that it works correctly:

➜ git config --global core.quotepath true
➜ gh pr diff

diff --git "a/hello-\360\237\230\200-world" "b/hello-\360\237\230\200-world"
new file mode 100644
index 0000000..e69de29

However the function changedFilesNames is called from a different path, that is hit when the --name-only flag is provided:

cli/pkg/cmd/pr/diff/diff.go

Lines 148 to 150 in 105bafd

if opts.NameOnly {
return changedFilesNames(opts.IO.Out, diff)
}

We can see that in this case, the file is not listed:

➜ gh pr diff --name-only | wc -l
       0

However, as far as I can tell this doesn't have anything to do with core.quotePath set to true as it also exhibits this behaviour when configured to false. For example, here is git show with core.quotePath set to false:

commit d8137798e51e2368ac9263bfb8c82ffa90e73317 (HEAD -> 9114-triage, origin/9114-triage)
Author: William Martin <williammartin@github.com>
Date:   Thu May 23 11:11:29 2024 +0200

    Add emoji file

diff --git a/hello-😀-world b/hello-😀-world
new file mode 100644
index 0000000..e69de29

But when we request the patch from GitHub it returns the path as if it were quoted regardless:

➜ gh api -H 'Accept: application/vnd.github.v3.diff' /repos/williammartin-test-org/test-repo/pulls/39 | cat

diff --git "a/hello-\360\237\230\200-world" "b/hello-\360\237\230\200-world"
new file mode 100644
index 0000000..e69de29

So all in all I think that there is a bug here, it's just unrelated to the local configuration and entirely related to what the GitHub API returns when a diff is requested and that diff contains files that are quoted.

@williammartin williammartin added p3 Affects a small number of users or is largely cosmetic help wanted Contributions welcome and removed needs-triage needs to be reviewed labels May 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working gh-pr relating to the gh pr command help wanted Contributions welcome p3 Affects a small number of users or is largely cosmetic
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants