Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Release v0.27.8 #4942

Merged
merged 26 commits into from
Jan 28, 2019
Merged

Release v0.27.8 #4942

merged 26 commits into from
Jan 28, 2019

Conversation

ethomson
Copy link
Member

Let's release v0.27.8! I've backported a bunch of fixes.

tiennou and others added 15 commits January 18, 2019 22:39
Previously, an error in `git_config_next` would be mistaken as a
successful load, because the previous call would have succeeded.
Coverity saw the subsequent check for a completed iteration as dead, so
let's make it useful again.

CID 1391374
We have no need to take a non-const reference. This does involve some other work
to make sure we don't mix const and non-const variables, but by splitting what
we want each variable to do we can also simplify the logic for when we do want
to free a new reference we might have allocated.
We want to allow the creation of annotated commits out of annotated tags and for
that we have to peel the reference all the way to the commit instead of stopping
at the first id it provides.
When git_filter_apply_fn callback returns a error while smudging proxy_stream_close
ends up returning without closing the stream. This is turn makes blob_content_to_file
crash as it asserts the stream being closed whether there are errors or not.

Closing the target stream on error fixes this problem.
If the routine exits with error before stream or hash_ctx is initialized, the
program will segfault when trying to free them.
When we try to add a configuration file with `git_config_add_file_ondisk`, we
treat nonexisting files as empty. We do this by performing a stat call, ignoring
ENOENT errors. This works just fine in case the file or any of its parents
simply does not exist, but there is also the case where any of the parent
directories is not a directory, but a file. So e.g. trying to add a
configuration file "/dev/null/.gitconfig" will fail, as `errno` will be ENOTDIR
instead of ENOENT.

Catch ENOTDIR in addition to ENOENT to fix the issue. Add a test that verifies
we are able to add configuration files with such an invalid path file just fine.
While commit objects usually should have only one author field, our commit
parser actually handles the case where a commit has multiple author fields
because some tools that exist in the wild actually write them. Detection of
those additional author fields is done by using a simple `git__prefixcmp`,
checking whether the current line starts with the string "author ". In case
where we are handed a non-NUL-terminated string that ends directly after the
space, though, we may have an out-of-bounds read of one byte when trying to
compare the expected final NUL byte.

Fix the issue by using `git__prefixncmp` instead of `git_prefixcmp`.
Unfortunately, a test cannot be easily written to catch this case. While we
could test the last error message and verify that it didn't in fact fail parsing
a signature (because that would indicate that it has in fact tried to parse the
additional "author " field, which it shouldn't be able to detect in the first
place), this doesn't work as the next line needs to be the "committer" field,
which would error out with the same error message even if we hadn't done an
out-of-bounds read.

As objects read from the object database are always NUL terminated, this issue
cannot be triggered in normal code and thus it's not security critical.
Handle two null argument cases that occur in the unit tests.
One is in library code, the other is in test code.

Detected by running unit tests with undefined behavior sanitizer:
```bash
 # build
mkdir build && cd build
cmake -DBUILD_CLAR=ON -DCMAKE_C_FLAGS="-fsanitize=address \
-fsanitize=undefined -fstack-usage -static-libasan" ..
cmake --build .

 # run with asan
ASAN_OPTIONS="allocator_may_return_null=1" ./libgit2_clar
...
............../libgit2/src/apply.c:316:3: runtime error: null pointer \
passed as argument 1, which is declared to never be null
...................../libgit2/tests/apply/fromfile.c:46:3: runtime \
error: null pointer passed as argument 1, which is declared to never be null
```
When parsing a signature's timezone offset, we first check whether there
is a timezone at all by verifying that there are still bytes left to
read following the time itself. The check thus looks like `time_end + 1
< buffer_end`, which is actually correct in this case. After setting the
timezone's start pointer to that location, we compute the remaining
bytes by using the formula `buffer_end - tz_start + 1`, re-using the
previous `time_end + 1`. But this is in fact missing the braces around
`(tz_start + 1)`, thus leading to an overestimation of the remaining
bytes by a length of two. In case of a non-NUL terminated buffer, this
will result in an overflow.

The function `git_signature__parse` is only used in two locations. First
is `git_signature_from_buffer`, which only accepts a string without a
length. The string thus necessarily has to be NUL terminated and cannot
trigger the issue.

The other function is `git_commit__parse_raw`, which can in fact trigger
the error as it may receive non-NUL terminated commit data. But as
objects read from the ODB are always NUL-terminated by us as a
cautionary measure, it cannot trigger the issue either.

In other words, this error does not have any impact on security.
@pks-t
Copy link
Member

pks-t commented Jan 23, 2019

I've added the backport-0.27.8 label to all the PRs that you have proposed in here.

@pks-t
Copy link
Member

pks-t commented Jan 23, 2019

Additional proposals:

@pks-t
Copy link
Member

pks-t commented Jan 23, 2019

And #4670: Fix negative gitignore rules with leading directories

@ethomson ethomson force-pushed the ethomson/v0.27.8 branch 2 times, most recently from 23ebbc7 to 5646095 Compare January 25, 2019 10:52
@ethomson
Copy link
Member Author

FYI - I brought in parts of #4374 to satisfy some of the tests in #4871.

@ethomson
Copy link
Member Author

@pks-t I tried cherry-picking #4871, and then bringing in dependencies, but I'm not having success. Help?

@pks-t
Copy link
Member

pks-t commented Jan 25, 2019

Oh, yeah. I didn't remember #4871 depending on the raw functions of #4374, but in fact they are only needed for the tests. I'm not feeling too comfortable with cherry-picking them from #4374, so I'd propose to skip d4ad658 and the newly introduced tree tests.

I've pushed a branch "pks/v0.27.8-without-odb-raw" to my fork. Diff:

diff --git a/src/blob.c b/src/blob.c
index 666dd232a..3396fe74f 100644
--- a/src/blob.c
+++ b/src/blob.c
@@ -19,54 +19,34 @@
 const void *git_blob_rawcontent(const git_blob *blob)
 {
 	assert(blob);
-	if (blob->raw)
-		return blob->data.raw.data;
-	else
-		return git_odb_object_data(blob->data.odb);
+	return git_odb_object_data(blob->odb_object);
 }
 
 git_off_t git_blob_rawsize(const git_blob *blob)
 {
 	assert(blob);
-	if (blob->raw)
-		return blob->data.raw.size;
-	else
-		return (git_off_t)git_odb_object_size(blob->data.odb);
+	return (git_off_t)git_odb_object_size(blob->odb_object);
 }
 
 int git_blob__getbuf(git_buf *buffer, git_blob *blob)
 {
 	return git_buf_set(
 		buffer,
-		git_blob_rawcontent(blob),
-		git_blob_rawsize(blob));
+		git_odb_object_data(blob->odb_object),
+		git_odb_object_size(blob->odb_object));
 }
 
-void git_blob__free(void *_blob)
+void git_blob__free(void *blob)
 {
-	git_blob *blob = (git_blob *) _blob;
-	if (!blob->raw)
-		git_odb_object_free(blob->data.odb);
+	git_odb_object_free(((git_blob *)blob)->odb_object);
 	git__free(blob);
 }
 
-int git_blob__parse_raw(void *_blob, const char *data, size_t size)
+int git_blob__parse(void *blob, git_odb_object *odb_obj)
 {
-	git_blob *blob = (git_blob *) _blob;
-	assert(blob);
-	blob->raw = 1;
-	blob->data.raw.data = data;
-	blob->data.raw.size = size;
-	return 0;
-}
-
-int git_blob__parse(void *_blob, git_odb_object *odb_obj)
-{
-	git_blob *blob = (git_blob *) _blob;
 	assert(blob);
 	git_cached_obj_incref((git_cached_obj *)odb_obj);
-	blob->raw = 0;
-	blob->data.odb = odb_obj;
+	((git_blob *)blob)->odb_object = odb_obj;
 	return 0;
 }
 
@@ -392,8 +372,8 @@ int git_blob_is_binary(const git_blob *blob)
 
 	assert(blob);
 
-	git_buf_attach_notowned(&content, git_blob_rawcontent(blob),
-		min(git_blob_rawsize(blob),
+	git_buf_attach_notowned(&content, blob->odb_object->buffer,
+		min(blob->odb_object->cached.size,
 		GIT_FILTER_BYTES_TO_CHECK_NUL));
 	return git_buf_text_is_binary(&content);
 }
diff --git a/src/blob.h b/src/blob.h
index f644ec583..3f1f97719 100644
--- a/src/blob.h
+++ b/src/blob.h
@@ -16,20 +16,11 @@
 
 struct git_blob {
 	git_object object;
-
-	union {
-		git_odb_object *odb;
-		struct {
-			const char *data;
-			git_off_t size;
-		} raw;
-	} data;
-	unsigned int raw:1;
+	git_odb_object *odb_object;
 };
 
 void git_blob__free(void *blob);
 int git_blob__parse(void *blob, git_odb_object *obj);
-int git_blob__parse_raw(void *blob, const char *data, size_t size);
 int git_blob__getbuf(git_buf *buffer, git_blob *blob);
 
 extern int git_blob__create_from_paths(
diff --git a/src/commit.c b/src/commit.c
index 80b4c3959..4fee8c8da 100644
--- a/src/commit.c
+++ b/src/commit.c
@@ -382,11 +382,11 @@ int git_commit_amend(
 	return error;
 }
 
-int git_commit__parse_raw(void *_commit, const char *data, size_t size)
+int git_commit__parse(void *_commit, git_odb_object *odb_obj)
 {
 	git_commit *commit = _commit;
-	const char *buffer_start = data, *buffer;
-	const char *buffer_end = buffer_start + size;
+	const char *buffer_start = git_odb_object_data(odb_obj), *buffer;
+	const char *buffer_end = buffer_start + git_odb_object_size(odb_obj);
 	git_oid parent_id;
 	size_t header_len;
 	git_signature dummy_sig;
@@ -476,13 +476,6 @@ bad_buffer:
 	return -1;
 }
 
-int git_commit__parse(void *_commit, git_odb_object *odb_obj)
-{
-	return git_commit__parse_raw(_commit,
-		git_odb_object_data(odb_obj),
-		git_odb_object_size(odb_obj));
-}
-
 #define GIT_COMMIT_GETTER(_rvalue, _name, _return) \
 	_rvalue git_commit_##_name(const git_commit *commit) \
 	{\
diff --git a/src/commit.h b/src/commit.h
index 9137a8fad..781809d70 100644
--- a/src/commit.h
+++ b/src/commit.h
@@ -35,6 +35,5 @@ struct git_commit {
 
 void git_commit__free(void *commit);
 int git_commit__parse(void *commit, git_odb_object *obj);
-int git_commit__parse_raw(void *commit, const char *data, size_t size);
 
 #endif
diff --git a/src/object.c b/src/object.c
index c1f3ea919..48f561384 100644
--- a/src/object.c
+++ b/src/object.c
@@ -12,7 +12,6 @@
 #include "repository.h"
 
 #include "commit.h"
-#include "hash.h"
 #include "tree.h"
 #include "blob.h"
 #include "oid.h"
@@ -20,86 +19,38 @@
 
 bool git_object__strict_input_validation = true;
 
-extern int git_odb_hash(git_oid *out, const void *data, size_t len, git_otype type);
-
 typedef struct {
 	const char	*str;	/* type name string */
 	size_t		size;	/* size in bytes of the object structure */
 
 	int  (*parse)(void *self, git_odb_object *obj);
-	int  (*parse_raw)(void *self, const char *data, size_t size);
 	void (*free)(void *self);
 } git_object_def;
 
 static git_object_def git_objects_table[] = {
 	/* 0 = GIT_OBJ__EXT1 */
-	{ "", 0, NULL, NULL, NULL },
+	{ "", 0, NULL, NULL },
 
 	/* 1 = GIT_OBJ_COMMIT */
-	{ "commit", sizeof(git_commit), git_commit__parse, git_commit__parse_raw, git_commit__free },
+	{ "commit", sizeof(git_commit), git_commit__parse, git_commit__free },
 
 	/* 2 = GIT_OBJ_TREE */
-	{ "tree", sizeof(git_tree), git_tree__parse, git_tree__parse_raw, git_tree__free },
+	{ "tree", sizeof(git_tree), git_tree__parse, git_tree__free },
 
 	/* 3 = GIT_OBJ_BLOB */
-	{ "blob", sizeof(git_blob), git_blob__parse, git_blob__parse_raw, git_blob__free },
+	{ "blob", sizeof(git_blob), git_blob__parse, git_blob__free },
 
 	/* 4 = GIT_OBJ_TAG */
-	{ "tag", sizeof(git_tag), git_tag__parse, git_tag__parse_raw, git_tag__free },
+	{ "tag", sizeof(git_tag), git_tag__parse, git_tag__free },
 
 	/* 5 = GIT_OBJ__EXT2 */
-	{ "", 0, NULL, NULL, NULL },
+	{ "", 0, NULL, NULL },
 	/* 6 = GIT_OBJ_OFS_DELTA */
-	{ "OFS_DELTA", 0, NULL, NULL, NULL },
+	{ "OFS_DELTA", 0, NULL, NULL },
 	/* 7 = GIT_OBJ_REF_DELTA */
-	{ "REF_DELTA", 0, NULL, NULL, NULL },
+	{ "REF_DELTA", 0, NULL, NULL },
 };
 
-int git_object__from_raw(
-	git_object **object_out,
-	const char *data,
-	size_t size,
-	git_otype type)
-{
-	git_object_def *def;
-	git_object *object;
-	size_t object_size;
-	int error;
-
-	assert(object_out);
-	*object_out = NULL;
-
-	/* Validate type match */
-	if (type != GIT_OBJ_BLOB && type != GIT_OBJ_TREE && type != GIT_OBJ_COMMIT && type != GIT_OBJ_TAG) {
-		giterr_set(GITERR_INVALID, "the requested type is invalid");
-		return GIT_ENOTFOUND;
-	}
-
-	if ((object_size = git_object__size(type)) == 0) {
-		giterr_set(GITERR_INVALID, "the requested type is invalid");
-		return GIT_ENOTFOUND;
-	}
-
-	/* Allocate and initialize base object */
-	object = git__calloc(1, object_size);
-	GITERR_CHECK_ALLOC(object);
-	object->cached.flags = GIT_CACHE_STORE_PARSED;
-	object->cached.type = type;
-	git_odb_hash(&object->cached.oid, data, size, type);
-
-	/* Parse raw object data */
-	def = &git_objects_table[type];
-	assert(def->free && def->parse_raw);
-
-	if ((error = def->parse_raw(object, data, size)) < 0)
-		def->free(object);
-
-	git_cached_obj_incref(object);
-	*object_out = object;
-
-	return 0;
-}
-
 int git_object__from_odb_object(
 	git_object **object_out,
 	git_repository *repo,
diff --git a/src/object.h b/src/object.h
index f5cbbf763..e46c9cafa 100644
--- a/src/object.h
+++ b/src/object.h
@@ -22,17 +22,6 @@ struct git_object {
 /* fully free the object; internal method, DO NOT EXPORT */
 void git_object__free(void *object);
 
-/*
- * Parse object from raw data. Note that the resulting object is
- * tied to the lifetime of the data, as some objects simply point
- * into it.
- */
-int git_object__from_raw(
-	git_object **object_out,
-	const char *data,
-	size_t size,
-	git_otype type);
-
 int git_object__from_odb_object(
 	git_object **object_out,
 	git_repository *repo,
diff --git a/src/tag.c b/src/tag.c
index 093a73e76..1ec823731 100644
--- a/src/tag.c
+++ b/src/tag.c
@@ -159,11 +159,6 @@ static int tag_parse(git_tag *tag, const char *buffer, const char *buffer_end)
 	return 0;
 }
 
-int git_tag__parse_raw(void *_tag, const char *data, size_t size)
-{
-	return tag_parse(_tag, data, data + size);
-}
-
 int git_tag__parse(void *_tag, git_odb_object *odb_obj)
 {
 	git_tag *tag = _tag;
diff --git a/src/tag.h b/src/tag.h
index 734770abd..8aae37840 100644
--- a/src/tag.h
+++ b/src/tag.h
@@ -26,6 +26,5 @@ struct git_tag {
 
 void git_tag__free(void *tag);
 int git_tag__parse(void *tag, git_odb_object *obj);
-int git_tag__parse_raw(void *tag, const char *data, size_t size);
 
 #endif
diff --git a/src/tree.c b/src/tree.c
index 4b2782f38..d1c1d7769 100644
--- a/src/tree.c
+++ b/src/tree.c
@@ -418,16 +418,18 @@ static int parse_mode(uint16_t *mode_out, const char *buffer, size_t buffer_len,
 	return 0;
 }
 
-int git_tree__parse_raw(void *_tree, const char *data, size_t size)
+int git_tree__parse(void *_tree, git_odb_object *odb_obj)
 {
 	git_tree *tree = _tree;
 	const char *buffer;
 	const char *buffer_end;
 
-	buffer = data;
-	buffer_end = buffer + size;
+	if (git_odb_object_dup(&tree->odb_obj, odb_obj) < 0)
+		return -1;
+
+	buffer = git_odb_object_data(tree->odb_obj);
+	buffer_end = buffer + git_odb_object_size(tree->odb_obj);
 
-	tree->odb_obj = NULL;
 	git_array_init_to_size(tree->entries, DEFAULT_TREE_SIZE);
 	GITERR_CHECK_ARRAY(tree->entries);
 
@@ -470,21 +472,6 @@ int git_tree__parse_raw(void *_tree, const char *data, size_t size)
 	return 0;
 }
 
-int git_tree__parse(void *_tree, git_odb_object *odb_obj)
-{
-	git_tree *tree = _tree;
-
-	if ((git_tree__parse_raw(tree,
-	    git_odb_object_data(odb_obj),
-	    git_odb_object_size(odb_obj))) < 0)
-		return -1;
-
-	if (git_odb_object_dup(&tree->odb_obj, odb_obj) < 0)
-		return -1;
-
-	return 0;
-}
-
 static size_t find_next_dir(const char *dirname, git_index *index, size_t start)
 {
 	size_t dirlen, i, entries = git_index_entrycount(index);
diff --git a/src/tree.h b/src/tree.h
index 31b6bf04f..00f4b06eb 100644
--- a/src/tree.h
+++ b/src/tree.h
@@ -43,7 +43,6 @@ extern int git_tree_entry_icmp(const git_tree_entry *e1, const git_tree_entry *e
 
 void git_tree__free(void *tree);
 int git_tree__parse(void *tree, git_odb_object *obj);
-int git_tree__parse_raw(void *_tree, const char *data, size_t size);
 
 /**
  * Lookup the first position in the tree with a given prefix.
diff --git a/tests/object/tree/parse.c b/tests/object/tree/parse.c
deleted file mode 100644
index 83d22193d..000000000
--- a/tests/object/tree/parse.c
+++ /dev/null
@@ -1,164 +0,0 @@
-#include "clar_libgit2.h"
-#include "tree.h"
-#include "object.h"
-
-#define OID1_HEX \
-	"\xae\x90\xf1\x2e\xea\x69\x97\x29\xed\x24" \
-	"\x55\x5e\x40\xb9\xfd\x66\x9d\xa1\x2a\x12"
-#define OID1_STR "ae90f12eea699729ed24555e40b9fd669da12a12"
-
-#define OID2_HEX \
-	"\xe8\xbf\xe5\xaf\x39\x57\x9a\x7e\x48\x98" \
-	"\xbb\x23\xf3\xa7\x6a\x72\xc3\x68\xce\xe6"
-#define OID2_STR "e8bfe5af39579a7e4898bb23f3a76a72c368cee6"
-
-typedef struct {
-	const char *filename;
-	uint16_t attr;
-	const char *oid;
-} expected_entry;
-
-static void assert_tree_parses(const char *data, size_t datalen,
-	expected_entry *expected_entries, size_t expected_nentries)
-{
-	git_tree *tree;
-	size_t n;
-
-	if (!datalen)
-		datalen = strlen(data);
-	cl_git_pass(git_object__from_raw((git_object **) &tree, data, datalen, GIT_OBJ_TREE));
-
-	cl_assert_equal_i(git_tree_entrycount(tree), expected_nentries);
-
-	for (n = 0; n < expected_nentries; n++) {
-		expected_entry *expected = expected_entries + n;
-		const git_tree_entry *entry;
-		git_oid oid;
-
-		cl_git_pass(git_oid_fromstr(&oid, expected->oid));
-
-		cl_assert(entry = git_tree_entry_byname(tree, expected->filename));
-		cl_assert_equal_s(expected->filename, entry->filename);
-		cl_assert_equal_i(expected->attr, entry->attr);
-		cl_assert_equal_oid(&oid, entry->oid);
-	}
-
-	git_object_free(&tree->object);
-}
-
-static void assert_tree_fails(const char *data, size_t datalen)
-{
-	git_object *object;
-	if (!datalen)
-		datalen = strlen(data);
-	cl_git_fail(git_object__from_raw(&object, data, datalen, GIT_OBJ_TREE));
-}
-
-void test_object_tree_parse__single_blob_parses(void)
-{
-	expected_entry entries[] = {
-		{ "foo", 0100644, OID1_STR },
-	};
-	const char data[] = "100644 foo\x00" OID1_HEX;
-
-	assert_tree_parses(data, ARRAY_SIZE(data) - 1, entries, ARRAY_SIZE(entries));
-}
-
-void test_object_tree_parse__single_tree_parses(void)
-{
-	expected_entry entries[] = {
-		{ "foo", 040000, OID1_STR },
-	};
-	const char data[] = "040000 foo\x00" OID1_HEX;
-
-	assert_tree_parses(data, ARRAY_SIZE(data) - 1, entries, ARRAY_SIZE(entries));
-}
-
-void test_object_tree_parse__leading_filename_spaces_parse(void)
-{
-	expected_entry entries[] = {
-		{ "       bar", 0100644, OID1_STR },
-	};
-	const char data[] = "100644        bar\x00" OID1_HEX;
-
-	assert_tree_parses(data, ARRAY_SIZE(data) - 1, entries, ARRAY_SIZE(entries));
-}
-
-void test_object_tree_parse__multiple_entries_parse(void)
-{
-	expected_entry entries[] = {
-		{ "bar", 0100644, OID1_STR },
-		{ "foo", 040000,  OID2_STR },
-	};
-	const char data[] =
-	    "100644 bar\x00" OID1_HEX
-	    "040000 foo\x00" OID2_HEX;
-
-	assert_tree_parses(data, ARRAY_SIZE(data) - 1, entries, ARRAY_SIZE(entries));
-}
-
-void test_object_tree_parse__invalid_mode_fails(void)
-{
-	const char data[] = "10x644 bar\x00" OID1_HEX;
-	assert_tree_fails(data, ARRAY_SIZE(data) - 1);
-}
-
-void test_object_tree_parse__missing_mode_fails(void)
-{
-	const char data[] = " bar\x00" OID1_HEX;
-	assert_tree_fails(data, ARRAY_SIZE(data) - 1);
-}
-
-void test_object_tree_parse__mode_doesnt_cause_oob_read(void)
-{
-	const char data[] = "100644 bar\x00" OID1_HEX;
-	assert_tree_fails(data, 2);
-	/*
-	 * An oob-read would correctly parse the filename and
-	 * later fail to parse the OID with a different error
-	 * message
-	 */
-	cl_assert_equal_s(giterr_last()->message, "failed to parse tree: missing space after filemode");
-}
-
-void test_object_tree_parse__unreasonably_large_mode_fails(void)
-{
-	const char data[] = "10000000000000000000000000 bar\x00" OID1_HEX;
-	assert_tree_fails(data, ARRAY_SIZE(data) - 1);
-}
-
-void test_object_tree_parse__missing_filename_separator_fails(void)
-{
-	const char data[] = "100644bar\x00" OID1_HEX;
-	assert_tree_fails(data, ARRAY_SIZE(data) - 1);
-}
-
-void test_object_tree_parse__missing_filename_terminator_fails(void)
-{
-	const char data[] = "100644 bar" OID1_HEX;
-	assert_tree_fails(data, ARRAY_SIZE(data) - 1);
-}
-
-void test_object_tree_parse__empty_filename_fails(void)
-{
-	const char data[] = "100644 \x00" OID1_HEX;
-	assert_tree_fails(data, ARRAY_SIZE(data) - 1);
-}
-
-void test_object_tree_parse__trailing_garbage_fails(void)
-{
-	const char data[] = "100644 bar\x00" OID1_HEX "x";
-	assert_tree_fails(data, ARRAY_SIZE(data) - 1);
-}
-
-void test_object_tree_parse__leading_space_fails(void)
-{
-	const char data[] = " 100644 bar\x00" OID1_HEX;
-	assert_tree_fails(data, ARRAY_SIZE(data) - 1);
-}
-
-void test_object_tree_parse__truncated_oid_fails(void)
-{
-	const char data[] = " 100644 bar\x00" OID1_HEX;
-	assert_tree_fails(data, ARRAY_SIZE(data) - 2);
-}

The `git__strntol` family of functions accepts leading spaces and will
simply skip them. The skipping will not honor the provided buffer's
length, though, which may lead it to read outside of the provided
buffer's bounds if it is not a simple NUL-terminated string.
Furthermore, if leading space is trimmed, the function will further
advance the pointer but not update the number of remaining bytes, which
may also lead to out-of-bounds reads.

Fix the issue by properly paying attention to the buffer length and
updating it when stripping leading whitespace characters. Add a test
that verifies that we won't read past the provided buffer length.
When parsing a tree entry's mode, we will eagerly parse until we hit a
character that is not in the accepted set of octal digits '0' - '7'. If
the provided buffer is not a NUL terminated one, we may thus read
out-of-bounds.

Fix the issue by passing the buffer length to `parse_mode` and paying
attention to it. Note that this is not a vulnerability in our usual code
paths, as all object data read from the ODB is NUL terminated.
The `git__strntol` family of functions has the ability to auto-detect
a number's base if the string has either the common '0x' prefix for
hexadecimal numbers or '0' prefix for octal numbers. The detection of
such prefixes and following handling has two major issues though that are
being fixed in one go now.

- We do not do any bounds checking previous to verifying the '0x' base.
  While we do verify that there is at least one digit available
  previously, we fail to verify that there are two digits available and
  thus may do an out-of-bounds read when parsing this
  two-character-prefix.

- When skipping the prefix of such numbers, we only update the pointer
  length without also updating the number of remaining bytes. Thus if we
  try to parse a number '0x1' of total length 3, we will first skip the
  first two bytes and then try to read 3 bytes starting at '1'.

Fix both issues by disentangling the logic. Instead of doing the
detection and skipping of such prefixes in one go, we will now first try
to detect the base while also honoring how many bytes are left. Only if
we have a valid base that is either 8 or 16 and have one of the known
prefixes, we will now advance the pointer and update the remaining bytes
in one step.

Add some tests that verify that no out-of-bounds parsing happens and
that autodetection works as advertised.
The `parse_mode` option uses an open-coded octal number parser. The
parser is quite naive in that it simply parses until hitting a character
that is not in the accepted range of '0' - '7', completely ignoring the
fact that we can at most accept a 16 bit unsigned integer as filemode.
If the filemode is bigger than UINT16_MAX, it will thus overflow and
provide an invalid filemode for the object entry.

Fix the issue by using `git__strntol32` instead and doing a bounds
check. As this function already handles overflows, it neatly solves the
problem.

Note that previously, `parse_mode` was also skipping the character
immediately after the filemode. In proper trees, this should be a simple
space, but in fact the parser accepted any character and simply skipped
over it. As a consequence of using `git__strntol32`, we now need to an
explicit check for a trailing whitespace after having parsed the
filemode. Because of the newly introduced error message, the test
object::tree::parse::mode_doesnt_cause_oob_read needs adjustment to its
error message check, which in fact is a good thing as it demonstrates
that we now fail looking for the whitespace immediately following the
filemode.

Add a test that shows that we will fail to parse such invalid filemodes
now.
When parsing a number, we accept a leading plus or minus sign to return
a positive or negative number. When the parsed string has such a leading
sign, we set up a flag indicating that the number is negative and
advance the pointer to the next character in that string. This misses
updating the number of bytes in the string, though, which is why the
parser may later on do an out-of-bounds read.

Fix the issue by correctly updating both the pointer and the number of
remaining bytes. Furthermore, we need to check whether we actually have
any bytes left after having advanced the pointer, as otherwise the
auto-detection of the base may do an out-of-bonuds access. Add a test
that detects the out-of-bound read.

Note that this is not actually security critical. While there are a lot
of places where the function is called, all of these places are guarded
or irrelevant:

- commit list: this operates on objects from the ODB, which are always
  NUL terminated any may thus not trigger the off-by-one OOB read.

- config: the configuration is NUL terminated.

- curl stream: user input is being parsed that is always NUL terminated

- index: the index is read via `git_futils_readbuffer`, which always NUL
  terminates it.

- loose objects: used to parse the length from the object's header. As
  we check previously that the buffer contains a NUL byte, this is safe.

- rebase: this parses numbers from the rebase instruction sheet. As the
  rebase code uses `git_futils_readbuffer`, the buffer is always NUL
  terminated.

- revparse: this parses a user provided buffer that is NUL terminated.

- signature: this parser the header information of objects. As objects
  read from the ODB are always NUL terminated, this is a non-issue. The
  constructor `git_signature_from_buffer` does not accept a length
  parameter for the buffer, so the buffer needs to be NUL terminated, as
  well.

- smart transport: the buffer that is parsed is NUL terminated

- tree cache: this parses the tree cache from the index extension. The
  index itself is read via `git_futils_readbuffer`, which always NUL
  terminates it.

- winhttp transport: user input is being parsed that is always NUL
  terminated
pks-t and others added 6 commits January 25, 2019 17:43
The function `parse_number` was replaced by `git_parse_advance_digit`
which is provided by the parser interface in commit 252f2ee (parse:
implement and use `git_parse_advance_digit`, 2017-07-14). As there are
no remaining callers, remove it.
…files

When computing whether a file is ignored, we simply search for the first
matching rule and return whether it is a positive ignore rule (the file
is really ignored) or whether it is a negative ignore rule (the file is
being unignored). Each rule has a set of flags which are being passed to
`fnmatch`, depending on what kind of rule it is. E.g. in case it is a
negative ignore we add a flag `GIT_ATTR_FNMATCH_NEGATIVE`, in case it
contains a glob we set the `GIT_ATTR_FNMATCH_HASGLOB` flag.

One of these flags is the `GIT_ATTR_FNMATCH_LEADINGDIR` flag, which is
always set in case the pattern has a trailing "/*" or in case the
pattern is negative. The flag causes the `fnmatch` function to return a
match in case a string is a leading directory of another, e.g. "dir/"
matches "dir/foo/bar.c". In case of negative patterns, this is wrong in
certain cases.

Take the following simple example of a gitignore:

    dir/
    !dir/

The `LEADINGDIR` flag causes "!dir/" to match "dir/foo/bar.c", and we
correctly unignore the directory. But take this example:

    *.test
    !dir/*

We expect everything in "dir/" to be unignored, but e.g. a file in a
subdirectory of dir should be ignored, as the "*" does not cross
directory hierarchies. With `LEADINGDIR`, though, we would just see that
"dir/" matches and return that the file is unignored, even if it is
contained in a subdirectory. Instead, we want to ignore leading
directories here and check "*.test". Afterwards, we have to iterate up
to the parent directory and do the same checks.

To fix the issue, disallow matching against leading directories in
gitignore files. This can be trivially done by just adding the
`GIT_ATTR_FNMATCH_NOLEADINGDIR` to the spec passed to
`git_attr_fnmatch__parse`. Due to a bug in that function, though, this
flag is being ignored for negative patterns, which is fixed in this
commit, as well. As a last fix, we need to ignore rules that are
supposed to match a directory when our path itself is a file.

All together, these changes fix the described error case.
When checking whether a rule negates another rule, we were checking
whether a rule had the `GIT_ATTR_FNMATCH_LEADINGDIR` flag set and, if
so, added a "/*" to its end before passing it to `fnmatch`. Our code now
sets `GIT_ATTR_FNMATCH_NOLEADINGDIR`, thus the `LEADINGDIR` flag shall
never be set. Furthermore, due to the `NOLEADINGDIR` flag, trailing
globs do not get consumed by our ignore parser anymore.

Clean up code by just dropping this now useless logic.
@ethomson
Copy link
Member Author

Hmm, I don't see pks/v0.27.8-without-odb-raw, but I've redone this without d4ad658 and the tests. LMK if I missed something.

@pks-t
Copy link
Member

pks-t commented Jan 28, 2019 via email

@ethomson ethomson merged commit 8d1b36a into maint/v0.27 Jan 28, 2019
@pks-t
Copy link
Member

pks-t commented Jan 28, 2019 via email

@ethomson ethomson deleted the ethomson/v0.27.8 branch February 12, 2019 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants