-
-
Notifications
You must be signed in to change notification settings - Fork 385
Description
Current behavior 😯
This issue is for tracking the public vulnerability CVE-2024-43785 (GHSA-88g2-r9rw-g55h, RUSTSEC-2024-0364).
The
gix
andein
commands write pathnames and other metadata literally to terminals, even if they contain characters terminals treat specially, including ANSI escape sequences. This sometimes allows an untrusted repository to misrepresent its contents and to alter or concoct error messages.
Further details, including detailed instructions to reproduce the main effect described, are in the advisory, available in:
- CVE-2024-43785 - GitHub Advisory Database
- GHSA-88g2-r9rw-g55h - repository local advisory
- RUSTSEC-2024-0364 - RUSTSEC Advisory Database
As noted, this vulnerability is low-risk. It was decided, following a coordinated disclosure, that informing users would be beneficial, even before a patch is available. Contributions are welcome.
Expected behavior 🤔
General expectations
When sufficiently capable of misleading the user or (though less of a threat) interfering with the operation of the terminal, special characters should be escaped in the output of gix
and ein
commands, except when raw output has been requested or is otherwise expected.
In addition, for paths, when displaying them in human-readable (as opposed to JSON) text in a terminal, there is little to no disadvantage to quoting them using an unambiguous scheme. I think this should happen in the situations where git
always quotes paths, i.e., for paths that git
quotes even if core.quotePath
has been set to false
.
This does not necessarily mean they need to be quoted in the same way that git
quotes them, nor that core.quotePath
itself needs to be implemented.
Ideas for quoting
I am hoping it may be feasible to use the type system not just to implement specific display behavior, but to distinguish between text that may need escaping and text that is known not to need it, so that there would be fewer places, including in code that may be added in the future, where forgetting to calling a sanitization function or construct an safe-displaying object, or where using the wrong format specifier, would result in unintentionally outputting text that may contain terminal escape sequences. I don’t know if that’s feasible or if it’s a good idea.
Text from repositories is often &BStr
, including when it is a path, as is the case for example in gixoxide_core::repository::tree::format_entry()
:
That is not the only place where paths sometimes need to be quoted.
Although the obvious way to express that something is a path is to represent it as a Path
or PathBuf
, I don't think that is the best approach here. If I understand correctly, at least on Windows there’s no safe guaranteed-to-succeed conversion from &[u8]
or &BStr
to &OsStr
or Path
, nor a safe guaranteed-to-succeed way to construct an OsString
or PathBuf
from arbitrary bytes. Furthermore, if we had a Path
, it would still need custom printing to neutralize escape sequences.
One approach may be to do this, though this Rust code may be better understood as pseudocode in that a faster approach with fewer allocations should probably be preferred:
let plain = format!("{filename}");
let quoted = format!("{filename:?}");
if format!("\"{plain}\"") == quoted { plain } else { quoted }
That is, if quoting would keep it the same except the double quote marks around it, then show it verbatim not bothering with those quotation marks, and otherwise show it with the debug quoting provided by the BStr
implementation.
This quoting is sufficient to neutralize terminal escape sequences because it turns the escape character, most commonly represented as \e
, \033
, or \x1B
, into this literal sequence (which I think is at least as good as those representations):
\u{1b}
It seems to me that this may be beneficial even outside of the issue of escape characters. For example, if a path stored in a Git repository has pieces that aren’t valid UTF-8, then it would probably be better to show the escape sequences for those bytes as BStr
has Debug
do, rather than to show the Unicode substitution character as BStr
has Display
do.
This is a lossless encoding. It is reasonably easy to parse, in case anyone wants to do that, because which of the two forms was used for the output is discernible by whether it has a leading "
character. In particular, even if the original had a "
character, then the debug representation escapes it while also adding more quotes, and is therefore never equal to the original with quotes added, and thus would always be used rather than the original with quotes added.
Git behavior
Paths
git
always quotes escape characters in paths when not told to do otherwise such as by -z
, and will perform additional quoting if core.quotePath
is true
, which it is by default.
The following rehashes a fragment of the advisory to compare the behavior of git
and gix
, but it is not a substitute for the advisory.
I created a file whose name I specified in bash
using the $'
'
notation (a more portable and fully described approach is presented in the advisory) as:
$'\033]0;Boo!\007\033[2K\r\033[91mError: Repository is corrupted. Run \033[96mEVIL_COMMAND\033[91m to attempt recovery.\033[0m'
Running git ls-tree HEAD
shows this, which is identical to what it really outputs:
100644 blob 4b94b710bc5f8670d781e8a8db1a8db9d73f036f "\033]0;Boo!\a\033[2K\r\033[91mError: Repository is corrupted. Run \033[96mEVIL_COMMAND\033[91m to attempt recovery.\033[0m"
In contrast, running gix tree entries
changes the terminal title (until it is rewritten, which some shells prompts may do), and it shows this, in bright red and bright cyan as described above, such that it wrongly appears to be the entire output of the command:
Error: Repository is corrupted. Run EVIL_COMMAND to attempt recovery.
Here's a screenshot showing the appearance of both git
and gix
commands:
Non-path data
git
actually sometimes allows escape sequences from a repository through, such as in author and committer information shown in the output of git log
, as well as in changed blob contents shown in the output of git diff
. However, it seems to avoid it in situations where it could be seriously misleading, or where it would interfere with the operation of the terminal.
Characters that could be especially misleading are represented symbolically, such as the backspace and carriage return characters. Escape sequences that could be especially misleading are simply not let through, such as attempts to reposition the cursor in the terminal, except that those are let through when the output device is not a terminal. Colorization, when allowed to change, seems always to be restored, though I have not exhaustively verified that. Attempts to conceal or mimic leading +
, -
, or space characters in diffs do not seem to succeed.
I believe the Git behavior is not vulnerable, even outside of the treatment of paths.
Steps to reproduce 🕹
See the "PoC" (proof of concept) section in CVE-2024-43785 (GHSA-88g2-r9rw-g55h, RUSTSEC-2024-0364).
See also the "Git behavior" section above, which includes both a brief description of reproducing this, a screenshot that is not currently in the advisory, as well as showing that Git is not affected.