-
Notifications
You must be signed in to change notification settings - Fork 145
Some defensive programming #1890
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
On the off chance that `lookup_decoration()` cannot find anything, let `leave_one_treesame_to_parent()` return gracefully instead of crashing. Pointed out by CodeQL. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `lookup_commit_reference()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `parse_object()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `lookup_commit()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `lookup_commit()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `parse_tree_indirect()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `lookup_commit()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `branch_get()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `branch_get()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `lookup_commit_reference()` can return NULL values. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
CodeQL points out that `branch_get()` can return NULL values. Note that the error path in this instance calls `BUG()`, not `die()`, for two reasons: 1. The code lives in `libgit.a` and calling `die()` from within those library functions is a bad practice that needs to be reduced, rather than increased. 2. The `inherit_tracking()` function really should only be called with the name of an existing branch, therefore a `NULL` return value would indeed constitute a bug in Git's code. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As pointed out by CodeQL, it could be NULL and we usually check for that. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On the off-chance that it's NULL... Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
As pointed out by CodeQL, `lookup_commit()` can return NULL. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
7187789
to
17e9e9a
Compare
/submit |
Submitted as pull.1890.git.1747313139.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
@@ -3359,6 +3359,9 @@ static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit *co | |||
struct commit_list *p; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> On the off chance that `lookup_decoration()` cannot find anything, let
> `leave_one_treesame_to_parent()` return gracefully instead of crashing.
But wouldn't it be a BUG("") worthy event for the treesame
decoration not to exist for the commit object in question at this
point of the code? Is it really defensive to silently pretend that
nothing bad happened and to move forward?
> Pointed out by CodeQL.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> revision.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/revision.c b/revision.c
> index c4390f0938cb..59eae4eb8ba8 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -3359,6 +3359,9 @@ static int leave_one_treesame_to_parent(struct rev_info *revs, struct commit *co
> struct commit_list *p;
> unsigned n;
>
> + if (!ts)
> + return 0;
> +
> for (p = commit->parents, n = 0; p; p = p->next, n++) {
> if (ts->treesame[n]) {
> if (p->item->object.flags & TMP_MARK) {
@@ -1106,7 +1106,7 @@ static enum get_oid_result get_parent(struct repository *r, | |||
if (ret) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> CodeQL points out that `lookup_commit_reference()` can return NULL
> values.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> object-name.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/object-name.c b/object-name.c
> index 76749fbfe652..ca54dad2f4c8 100644
> --- a/object-name.c
> +++ b/object-name.c
> @@ -1106,7 +1106,7 @@ static enum get_oid_result get_parent(struct repository *r,
> if (ret)
> return ret;
> commit = lookup_commit_reference(r, &oid);
> - if (repo_parse_commit(r, commit))
> + if (!commit || repo_parse_commit(r, commit))
> return MISSING_OBJECT;
Most of the time, the check for "ret" we see in the pre-context,
which is a result of get_oid_1(), would prevent an oid that is not a
valid name for a committish to even reach this code, I would think,
but with possible repository corruption, we may fail to "lookup" the
commit, so this is a good correction, I would think.
Thanks.
> if (!idx) {
> oidcpy(result, &commit->object.oid);
@@ -155,7 +155,7 @@ static struct commit *deref_without_lazy_fetch(const struct object_id *oid, | |||
struct tag *tag = (struct tag *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> CodeQL points out that `parse_object()` can return NULL values.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> fetch-pack.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 1ed5e11dd568..4cbcb0c14c48 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -155,7 +155,7 @@ static struct commit *deref_without_lazy_fetch(const struct object_id *oid,
> struct tag *tag = (struct tag *)
> parse_object(the_repository, oid);
>
> - if (!tag->tagged)
> + if (!tag || !tag->tagged)
> return NULL;
The "oid" can come from corruptible sources like commit graph file,
so I agree with your analysis that it may name a missing object in a
corrupt repository, leading "tag" being NULL. Looks correct.
> if (mark_tags_complete_and_check_obj_db)
> tag->object.flags |= COMPLETE;
@@ -1106,7 +1106,7 @@ static enum get_oid_result get_parent(struct repository *r, | |||
if (ret) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Jeff King wrote (reply to this):
On Thu, May 15, 2025 at 12:45:27PM +0000, Johannes Schindelin via GitGitGadget wrote:
> CodeQL points out that `lookup_commit_reference()` can return NULL
> values.
> [...]
> commit = lookup_commit_reference(r, &oid);
> - if (repo_parse_commit(r, commit))
> + if (!commit || repo_parse_commit(r, commit))
> return MISSING_OBJECT;
Sure, but repo_parse_commit() can also handle NULL values. It returns
"-1" in that case. I think CodeQL is not smart enough to know that.
-Peff
User |
This patch series was integrated into seen via git@d36a872. |
This branch is now known as |
This patch series was integrated into seen via git@20c7768. |
There was a status update in the "New Topics" section about the branch Assorted changes that please CodeQL. Comments? source: <pull.1890.git.1747313139.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@a9e9d2a. |
This patch series was integrated into seen via git@5091f3e. |
This patch series was integrated into seen via git@e4fa727. |
There was a status update in the "Cooking" section about the branch Assorted changes that please CodeQL. Comments? source: <pull.1890.git.1747313139.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@7a2bbc5. |
This patch series was integrated into seen via git@edbfd57. |
This patch series was integrated into seen via git@3087553. |
There was a status update in the "Cooking" section about the branch Assorted changes that please CodeQL. Comments? source: <pull.1890.git.1747313139.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@d5c8251. |
There was a status update in the "Cooking" section about the branch Assorted changes that please CodeQL. Comments? source: <pull.1890.git.1747313139.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@4df18b3. |
This patch series was integrated into seen via git@27f8825. |
This patch series was integrated into seen via git@2e06034. |
There was a status update in the "Cooking" section about the branch Assorted changes that please CodeQL. Comments? source: <pull.1890.git.1747313139.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@c12b2a0. |
This patch series was integrated into seen via git@762369a. |
This patch series was integrated into seen via git@4063091. |
This patch series was integrated into seen via git@46b4ef2. |
This patch series was integrated into seen via git@4b9646d. |
There was a status update in the "Cooking" section about the branch Assorted changes that please CodeQL. Will discard, as no strong support seems to be there in the thread. source: <pull.1890.git.1747313139.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@4d7b6f6. |
This patch series was integrated into seen via git@58d3894. |
There was a status update in the "Cooking" section about the branch Assorted changes that please CodeQL. Will discard, as no strong support seems to be there in the thread. source: <pull.1890.git.1747313139.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@e4c67d3. |
There was a status update in the "Discarded" section about the branch Assorted changes that please CodeQL. Will discard, as no strong support seems to be there in the thread. source: <pull.1890.git.1747313139.gitgitgadget@gmail.com> |
No support from upstream for these changes. I'll just suppress the CodeQL queries that pointed out these issues. |
This PR adds a [CodeQL](https://codeql.github.com/) workflow to this repository. CodeQL is touted as a "semantic code analysis engine", i.e. its intention isn't really a full industry-grade static code analyzer like Coverity (which we [already run in our CI builds](https://github.com/microsoft/git/actions/workflows/coverity.yml)). This shows in the number of queries we have to suppress because they would result in an unwieldy number of false positives. Or, one might argue, it shows how convoluted part of Git's logic is, which may not only confuse CodeQL but also human readers. The latter might not be a _direct_ security issue, but as Tony Hoare [said](https://en.wikiquote.org/wiki/C._A._R._Hoare#The_Emperor's_Old_Clothes): > There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult. I tried to make the code simpler with gitgitgadget#1888 and with gitgitgadget#1890, but the discussion went astray from my original purpose, instead veering off into the direction of arguing in complicated lines of reasoning that the current code is actually correct. Nevertheless, I do include these patches here, structured via merge commits to make it easier to drop them should too many merge conflicts in rebases to upstream Git versions make that advisable. However, for most of the alerts, the approach I took is exclude the queries wholesale, when they caused alerts. I could have taken the approach to suppress the alerts in a fine-grained way (via `codeql[<query-name>]`) so that true positives aren't suppressed in addition to false positives. However, I am loathe to take that approach because a voice inside me fully expects backlash on the Git mailing list along the lines of :"You're littering the code just to appease CodeQL!". Theoretically, an alternative exists: To develop modified versions of those CodeQL queries, e.g. to ignore paths inside the `.git/` directory in the `cpp/toctou-race-condition` query. However, CodeQL is a language that is not only (intentionally?) difficult to develop in by virtue of being declarative instead of imperative, its debugging facilities are pretty much non-existent. Given all of the above, the obvious question begs itself: Why bother with CodeQL at all? The truthful answer is: It is mandated by the Secure Future Initiative. And who knows, maybe CodeQL will come in handy in the future? After all, it is a framework more than a solution, and should in principle be able to help with answering questions like: "Which call paths into `libgit.a` could result in `die()` being called?".
These patches implement some defensive programming to address complaints some static analyzers might have.
cc: Jeff King peff@peff.net