Effect of refreshing with a relative path is not obvious #1831

Closed
EliahKagan opened this issue Feb 19, 2024 · 4 comments · Fixed by #1839

@EliahKagan
Contributor

When git.refresh or git.cmd.Git.refresh (which git.refresh calls) is passed a relative path as an explicit path argument, the path is taken relative to the current working directory of the process GitPython is running in at the time the refresh occurs. However, if one of those refresh functions is instead called with no argument and the value of the GIT_PYTHON_GIT_EXECUTABLE environment variable is a relative path, that value is not resolved to an absolute path; instead, it is looked up every time git is run. (The default of git is likewise not resolved.)

GitPython/git/cmd.py

Lines 365 to 370 in afa5754

    # Discern which path to refresh with.
    if path is not None:
        new_git = os.path.expanduser(path)
        new_git = os.path.abspath(new_git)
    else:
        new_git = os.environ.get(cls._git_exec_env_var, cls.git_exec_name)
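To make the difference concrete, here is a minimal sketch that mimics only the path-selection step quoted above. It does not import GitPython; the function name pick_git_command and its environ parameter are made up for illustration.

```python
import os


def pick_git_command(path=None, environ=None, default="git"):
    """Simplified stand-in for the path-selection step quoted above."""
    environ = os.environ if environ is None else environ
    if path is not None:
        # Explicit argument: expanded and made absolute right away, so a
        # relative path is pinned to the current working directory.
        return os.path.abspath(os.path.expanduser(path))
    # No argument: the environment variable (or the default) is kept as-is
    # and is only looked up later, each time the command is run.
    return environ.get("GIT_PYTHON_GIT_EXECUTABLE", default)


explicit = pick_git_command("git")
from_env = pick_git_command(environ={"GIT_PYTHON_GIT_EXECUTABLE": "git"})

print(explicit)  # an absolute path ending in "git", under the current directory
print(from_env)  # the bare name "git", unchanged
```

So the same string, "git", names a file in the current directory in one case and a command to be searched for in the other.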

This appears intentional, and in 8dc8eb9 (#1815) I added tests that assert this behavior. But this should also be clarified for users, by documenting it explicitly in the docstring of at least one of the refresh functions. I am unsure how best to do this, because ideally the difference should be explained, and I don't know if there is any good reason for the two cases to work differently, other than avoiding a breaking change within the same major version of the library.

If this is only for compatibility, then it might make sense to have git.refresh and git.cmd.Git.refresh accept a second optional resolve argument to indicate if the first argument is supposed to be eagerly resolved, and issue a DeprecationWarning when the resolve argument is not passed (i.e., one-argument git.refresh calls would be deprecated). This would not substitute for adding an explanation to the docstring.
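A rough sketch of how such a resolve argument might behave, assuming the warning is issued only when a path is given without resolve. The name refresh_sketch is hypothetical and this is not GitPython's API; it returns the chosen command rather than setting any state.

```python
import os
import warnings


def refresh_sketch(path=None, resolve=None):
    """Hypothetical two-argument variant; not GitPython's actual API."""
    if path is None:
        # No explicit path: keep the environment value or default unresolved.
        return os.environ.get("GIT_PYTHON_GIT_EXECUTABLE", "git")
    if resolve is None:
        warnings.warn(
            "passing a path without `resolve` is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        resolve = True  # Preserve the current behavior for one-argument calls.
    expanded = os.path.expanduser(path)
    return os.path.abspath(expanded) if resolve else expanded
```

With this shape, refresh_sketch("git", resolve=False) would leave "git" as a bare command name, while refresh_sketch("git", resolve=True) would pin it to the current directory without a warning.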

Security implications

A user who is confused about this behavior may write code like git.refresh("git"), perhaps with the intention of undoing the effect of a previous refresh. If this is done when the current working directory is the working tree of an untrusted repository that contains a malicious git executable (or a malicious executable otherwise named the same as the command passed to refresh), then GitPython will use that command as git, which would be a situation like CVE-2023-40590 or CVE-2024-22190.
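For code that wants to guard against this mistake on its own side, one option is to reject relative paths before they ever reach refresh. The helper below is a hypothetical sketch, not part of GitPython.

```python
import os


def require_absolute(path):
    """Hypothetical guard: reject paths that depend on the current directory."""
    expanded = os.path.expanduser(path)
    if not os.path.isabs(expanded):
        raise ValueError(f"refusing relative executable path: {path!r}")
    return expanded


# Intended use (assumes GitPython is installed and `candidate` is the path
# the caller wants to use):
#     git.refresh(require_absolute(candidate))
```

A caller who only wants to return to the unresolved default could instead call git.refresh() with no argument, which, as described above, does not resolve the value it uses.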

However, I am inclined to consider improving how this is documented to be a security enhancement, but not a fix for an existing security vulnerability in GitPython. I think this is not really a vulnerability in GitPython for three reasons. In decreasing order of significance:

  • Such code would typically be identified readily, because a git or other such executable inside a repository would not ordinarily occur in testing or normal usage, and an unexpected GitCommandNotFound would be raised and observed. In particular, for the typical case of calling git.refresh early on, such a mistake would be identified immediately. This differs from those vulnerabilities, where the current directory was searched but then the expected places were searched, and also differs in that this is about a small likelihood of software that uses GitPython introducing its own vulnerability, rather than GitPython itself having inherently vulnerable behavior.
  • The behavior of GitPython need not change to fix this, since it is mainly a matter of documentation.
  • Searching for uses of git.refresh suggests this is not often used at all, and didn't turn up any incorrect uses of relative paths (though this does not guarantee there are no such incorrect uses).

Integration considerations

With #1791, the case for documenting this inconsistency becomes stronger, because that will add another refresh-related function, refresh_bash, which never resolves the path. Unlike git, bash is often not needed by GitPython at all, or not needed until a hook has to run on Windows, so it is more likely that a wrong call to refresh_bash would go undetected. Therefore, I very much agree with the decision there not to resolve the path, on security grounds:

GitPython/git/cmd.py

Lines 439 to 446 in 8200ad1

    # Discern which path to refresh with.
    if path is not None:
        new_bash = os.path.expanduser(path)
        # new_bash = os.path.abspath(new_bash)
    else:
        new_bash = os.environ.get(cls._bash_exec_env_var)
        if not new_bash:
            new_bash = cls._get_default_bash_path()

Because after #1791 this will be a behavioral difference between the refresh functions and refresh_bash, this will be a further reason to document the subtlety.

This could possibly be included in the docstring modifications there, which would avoid a conflict, but I am somewhat inclined not to request unnecessary enhancements there.

@Byron
Member

Byron commented Feb 19, 2024

Thanks for investigating this issue in such depth!

It sounds like documenting this behaviour properly would be a possible remedy for the function's alternative use, which fortunately enough doesn't seem to be used much at all.

> This could possibly be included in the docstring modifications there, which would avoid a conflict, but I am somewhat inclined not to request unnecessary enhancements there.

I agree, on the grounds that the PR has been stalling with the feedback as is and more feedback wouldn't improve that. Maybe at some point similar functionality (as in #1791) will supersede the PR as well.

@EliahKagan
Contributor Author

> It sounds like documenting this behaviour properly would be a possible remedy for the function's alternative use, which fortunately enough doesn't seem to be used much at all.

Is it known why refresh(path) resolves the path and refresh() doesn't? If so, I'd probably want to include that, as mentioned above. If not, I can just make a PR that adds a description of the difference to the docstring.

@Byron
Member

Byron commented Feb 21, 2024

No, sorry, I don't know anymore.

@EliahKagan
Contributor Author

That's all right, I'll describe the behavior as-is, and if anyone discovers or figures it out later then the description can be expanded or otherwise adjusted accordingly.


2 participants