CVE-2024-43785: gitoxide-core does not neutralize special characters for terminals #1534

@EliahKagan

Current behavior 😯

This issue is for tracking the public vulnerability CVE-2024-43785 (GHSA-88g2-r9rw-g55h, RUSTSEC-2024-0364).

The gix and ein commands write pathnames and other metadata literally to terminals, even if they contain characters terminals treat specially, including ANSI escape sequences. This sometimes allows an untrusted repository to misrepresent its contents and to alter or concoct error messages.

Further details, including detailed instructions to reproduce the main effect described, are in the advisory, which is published as GHSA-88g2-r9rw-g55h and RUSTSEC-2024-0364.

As noted, this vulnerability is low-risk. It was decided, following a coordinated disclosure, that informing users would be beneficial, even before a patch is available. Contributions are welcome.

Expected behavior 🤔

General expectations

When sufficiently capable of misleading the user or (though less of a threat) interfering with the operation of the terminal, special characters should be escaped in the output of gix and ein commands, except when raw output has been requested or is otherwise expected.

In addition, for paths, when displaying them in human-readable (as opposed to JSON) text in a terminal, there is little to no disadvantage to quoting them using an unambiguous scheme. I think this should happen in the situations where git always quotes paths, i.e., for paths that git quotes even if core.quotePath has been set to false.

This does not necessarily mean they need to be quoted in the same way that git quotes them, nor that core.quotePath itself needs to be implemented.

Ideas for quoting

I am hoping it may be feasible to use the type system not just to implement specific display behavior, but also to distinguish between text that may need escaping and text that is known not to need it. That would leave fewer places, including in code that may be added in the future, where forgetting to call a sanitization function or construct a safe-displaying object, or using the wrong format specifier, would unintentionally output text that may contain terminal escape sequences. I don’t know if that’s feasible or if it’s a good idea.
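As a rough illustration of the type-system idea, here is a minimal sketch that uses only the standard library. The `Untrusted` wrapper and the `write_escaped` helper are hypothetical names, not anything that exists in gitoxide. The point is only that the wrapped bytes have no `Display` path that skips escaping.

```rust
use std::fmt::{self, Write as _};

/// Hypothetical wrapper for bytes that came from a repository.
/// The field is private, so formatting goes through the escaping `Display` impl.
pub struct Untrusted<'a>(&'a [u8]);

impl<'a> Untrusted<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        Untrusted(bytes)
    }
}

/// Write `text`, spelling out control characters (ESC becomes `\u{1b}`).
fn write_escaped(f: &mut fmt::Formatter<'_>, text: &str) -> fmt::Result {
    for ch in text.chars() {
        if ch.is_control() {
            write!(f, "{}", ch.escape_debug())?;
        } else {
            f.write_char(ch)?;
        }
    }
    Ok(())
}

impl fmt::Display for Untrusted<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut rest = self.0;
        while !rest.is_empty() {
            match std::str::from_utf8(rest) {
                Ok(valid) => {
                    write_escaped(f, valid)?;
                    break;
                }
                Err(err) => {
                    // Emit the valid prefix, then the offending bytes as `\xNN`.
                    let (valid, after) = rest.split_at(err.valid_up_to());
                    write_escaped(f, std::str::from_utf8(valid).expect("validated prefix"))?;
                    let bad_len = err.error_len().unwrap_or(after.len());
                    for byte in &after[..bad_len] {
                        write!(f, "\\x{byte:02X}")?;
                    }
                    rest = &after[bad_len..];
                }
            }
        }
        Ok(())
    }
}
```

With something like this, `println!("{}", Untrusted::new(path_bytes))` cannot emit a raw escape character. The output is not meant to be unambiguous or lossless, since backslashes are left as they are.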

Text from repositories is often &BStr, including when it is a path, as is the case for example in gitoxide_core::repository::tree::format_entry():

https://github.com/Byron/gitoxide/blob/25a3f1b0b07c01dd44df254f46caa6f78a4d3014/gitoxide-core/src/repository/tree.rs#L181-L202

That is not the only place where paths sometimes need to be quoted.

Although the obvious way to express that something is a path is to represent it as a Path or PathBuf, I don't think that is the best approach here. If I understand correctly, at least on Windows there’s no safe guaranteed-to-succeed conversion from &[u8] or &BStr to &OsStr or &Path, nor a safe guaranteed-to-succeed way to construct an OsString or PathBuf from arbitrary bytes. Furthermore, if we had a Path, it would still need custom printing to neutralize escape sequences.
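To make the platform difference concrete, here is a small sketch using only the standard library. The function name `bytes_to_path` is hypothetical, and the non-Unix branch shows just one possible fallback (accepting only valid UTF-8), not what gitoxide does.

```rust
use std::path::Path;

/// Borrow `bytes` as a path without copying, if the platform allows it.
#[cfg(unix)]
pub fn bytes_to_path(bytes: &[u8]) -> Option<&Path> {
    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;
    // On Unix an `OsStr` is just bytes, so this conversion cannot fail.
    Some(Path::new(OsStr::from_bytes(bytes)))
}

/// Elsewhere (for example on Windows) this sketch falls back to a fallible
/// conversion: it only accepts valid UTF-8 and reports failure otherwise.
#[cfg(not(unix))]
pub fn bytes_to_path(bytes: &[u8]) -> Option<&Path> {
    std::str::from_utf8(bytes).ok().map(Path::new)
}
```

Even where the conversion succeeds, `Path::display()` does not escape control characters, so the resulting path would still need its own sanitizing formatter.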

One approach may be to do this, though this Rust code may be better understood as pseudocode in that a faster approach with fewer allocations should probably be preferred:

let plain = format!("{filename}");
let quoted = format!("{filename:?}");
if format!("\"{plain}\"") == quoted { plain } else { quoted }

That is, if quoting would keep it the same except for the double quote marks around it, then show it verbatim, without those quotation marks, and otherwise show it with the debug quoting provided by the BStr implementation.
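A runnable variant of the same idea, with one allocation instead of three, might look like the following. It is a sketch under two assumptions: it takes `&str` rather than `&BStr` so that it needs only the standard library, and `display_name` is a hypothetical name.

```rust
use std::borrow::Cow;

/// Return `name` verbatim when its `Debug` form adds nothing but the
/// surrounding quotes; otherwise return the quoted, escaped `Debug` form.
pub fn display_name(name: &str) -> Cow<'_, str> {
    let quoted = format!("{name:?}");
    // `Debug` for `str` wraps its output in exactly one pair of double quotes,
    // so stripping one byte from each end yields the escaped body.
    if &quoted[1..quoted.len() - 1] == name {
        Cow::Borrowed(name)
    } else {
        Cow::Owned(quoted)
    }
}
```

Comparing the escaped body against the original avoids building the two extra strings, and returning a `Cow` means the common case of an ordinary filename borrows the input.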

This quoting is sufficient to neutralize terminal escape sequences because it turns the escape character, most commonly represented as \e, \033, or \x1B, into this literal sequence (which I think is at least as good as those representations):

\u{1b}

It seems to me that this may be beneficial even outside of the issue of escape characters. For example, if a path stored in a Git repository has pieces that aren’t valid UTF-8, then it would probably be better to show the escape sequences for those bytes as BStr has Debug do, rather than to show the Unicode substitution character as BStr has Display do.
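The difference between the two styles can be seen with standard-library stand-ins. This is only an analogy: `String::from_utf8_lossy` and `<[u8]>::escape_ascii` are not what BStr uses, `escape_ascii` escapes every non-ASCII byte (even in valid UTF-8), and the function names are hypothetical.

```rust
/// Lossy style: each invalid sequence becomes U+FFFD, so the original bytes
/// cannot be recovered from the output.
pub fn show_lossy(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Escaping style: bytes that are not printable ASCII are spelled out as
/// `\xNN` (or a short escape like `\n`), so distinct inputs stay distinct.
pub fn show_escaped(bytes: &[u8]) -> String {
    bytes.escape_ascii().to_string()
}
```

Two different invalid names, such as `a\xFF` and `a\xFE`, look identical in the lossy style but remain distinguishable in the escaping style.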

This is a lossless encoding. It is reasonably easy to parse, in case anyone wants to do that, because which of the two forms was used for the output is discernible by whether it has a leading " character. In particular, even if the original had a " character, then the debug representation escapes it while also adding more quotes, and is therefore never equal to the original with quotes added, and thus would always be used rather than the original with quotes added.

Git behavior

Paths

git always quotes escape characters in paths when not told to do otherwise such as by -z, and will perform additional quoting if core.quotePath is true, which it is by default.

The following rehashes a fragment of the advisory to compare the behavior of git and gix, but it is not a substitute for the advisory.

I created a file whose name I specified in bash using the $' ' notation (a more portable and fully described approach is presented in the advisory) as:

$'\033]0;Boo!\007\033[2K\r\033[91mError: Repository is corrupted. Run \033[96mEVIL_COMMAND\033[91m to attempt recovery.\033[0m'

Running git ls-tree HEAD shows this, which is identical to what it really outputs:

100644 blob 4b94b710bc5f8670d781e8a8db1a8db9d73f036f    "\033]0;Boo!\a\033[2K\r\033[91mError: Repository is corrupted. Run \033[96mEVIL_COMMAND\033[91m to attempt recovery.\033[0m"

In contrast, running gix tree entries changes the terminal title (until it is rewritten, which some shell prompts may do), and it shows this, in bright red and bright cyan as described above, such that it wrongly appears to be the entire output of the command:

Error: Repository is corrupted. Run EVIL_COMMAND to attempt recovery.

Here's a screenshot showing the appearance of both git and gix commands:

Screenshot of a terminal showing how git presents the unusual characters symbolically while gix lets them through and allows them to misleadingly rewrite and color the output
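For comparison, the quoting visible in the git ls-tree output above can be approximated with a short C-style quoting routine. This is a sketch written from the observed output, not a port of Git's implementation. The name `quote_c_style_approx` is hypothetical, and passing bytes at or above 0x80 through unchanged is an assumption that corresponds to core.quotePath being false.

```rust
/// Approximate the C-style path quoting seen in the `git ls-tree` output.
/// Names that need no escaping are returned unchanged and unquoted.
pub fn quote_c_style_approx(name: &[u8]) -> Vec<u8> {
    let mut body = Vec::with_capacity(name.len());
    let mut needs_quotes = false;
    for &byte in name {
        // Bytes that have a short, single-letter escape.
        let short = match byte {
            0x07 => Some(b'a'),
            0x08 => Some(b'b'),
            b'\t' => Some(b't'),
            b'\n' => Some(b'n'),
            0x0B => Some(b'v'),
            0x0C => Some(b'f'),
            b'\r' => Some(b'r'),
            b'"' => Some(b'"'),
            b'\\' => Some(b'\\'),
            _ => None,
        };
        if let Some(letter) = short {
            body.extend_from_slice(&[b'\\', letter]);
            needs_quotes = true;
        } else if byte < 0x20 || byte == 0x7F {
            // Other control bytes become three-digit octal, e.g. ESC -> \033.
            body.extend_from_slice(format!("\\{byte:03o}").as_bytes());
            needs_quotes = true;
        } else {
            body.push(byte);
        }
    }
    if !needs_quotes {
        return body;
    }
    let mut out = Vec::with_capacity(body.len() + 2);
    out.push(b'"');
    out.extend_from_slice(&body);
    out.push(b'"');
    out
}
```

As noted earlier, gix and ein would not have to quote in this same way. The sketch only shows that neutralizing the escape character takes a small, byte-oriented transformation.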

Non-path data

git actually sometimes allows escape sequences from a repository through, such as in author and committer information shown in the output of git log, as well as in changed blob contents shown in the output of git diff. However, it seems to avoid it in situations where it could be seriously misleading, or where it would interfere with the operation of the terminal.

Characters that could be especially misleading are represented symbolically, such as the backspace and carriage return characters. Escape sequences that could be especially misleading are simply not let through, such as attempts to reposition the cursor in the terminal, except that those are let through when the output device is not a terminal. Colorization, when allowed to change, seems always to be restored, though I have not exhaustively verified that. Attempts to conceal or mimic leading +, -, or space characters in diffs do not seem to succeed.

I believe the Git behavior is not vulnerable, even outside of the treatment of paths.

Steps to reproduce 🕹

See the "PoC" (proof of concept) section in CVE-2024-43785 (GHSA-88g2-r9rw-g55h, RUSTSEC-2024-0364).

See also the "Git behavior" section above, which includes a brief description of how to reproduce this and a screenshot that is not currently in the advisory, and which shows that Git is not affected.

Labels: acknowledged (an issue is accepted as shortcoming to be fixed), help wanted (Extra attention is needed)
