Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

patch_parse: fix parsing addition/deletion of file with space #5035

Merged
merged 3 commits into from
Apr 4, 2019

Conversation

pks-t
Copy link
Member

@pks-t pks-t commented Mar 29, 2019

The diff header format is a strange beast in that it is inherently
unparseable in an unambiguous way. While parsing

a/file.txt b/file.txt

is obvious and trivially doable, parsing a diff header of

a/file b/file ab.txt b/file b/file ab.txt

is not (but in fact valid and created by git.git).

Due to that, we have relaxed our diff header parser in commit 80226b5
(patch_parse: allow parsing ambiguous patch headers, 2017-09-22), so
that we started to bail out when seeing diff headers with spaces in
their file names. Instead, we try to use the "---" and "+++" lines,
which are unambiguous.

In some cases, though, we neither have a useable file name from the
header nor from the "---" or "+++" lines. This is the case when we have
a deletion or addition of a file with spaces: the header is unparseable
and the other lines will simply show "/dev/null". This trips our parsing
logic when we try to extract the prefix (the "a/" part) that is being
used in the path line, where we unconditionally try to dereference a
NULL pointer in such a scenario.

We can fix this by simply not trying to parse the prefix in cases where
we have no useable path name. That'd leave the parsed patch without
either old_prefix or new_prefix populated. But in fact such cases
are already handled by users of the patch object, which simply opt to
use the default prefixes in that case.

pks-t and others added 3 commits March 29, 2019 11:59
The diff header format is a strange beast in that it is inherently
unparseable in an unambiguous way. While parsing

    a/file.txt b/file.txt

is obvious and trivially doable, parsing a diff header of

    a/file b/file ab.txt b/file b/file ab.txt

is not (but in fact valid and created by git.git).

Due to that, we have relaxed our diff header parser in commit 80226b5
(patch_parse: allow parsing ambiguous patch headers, 2017-09-22), so
that we started to bail out when seeing diff headers with spaces in
their file names. Instead, we try to use the "---" and "+++" lines,
which are unambiguous.

In some cases, though, we neither have a useable file name from the
header nor from the "---" or "+++" lines. This is the case when we have
a deletion or addition of a file with spaces: the header is unparseable
and the other lines will simply show "/dev/null". This trips our parsing
logic when we try to extract the prefix (the "a/" part) that is being
used in the path line, where we unconditionally try to dereference a
NULL pointer in such a scenario.

We can fix this by simply not trying to parse the prefix in cases where
we have no useable path name. That'd leave the parsed patch without
either `old_prefix` or `new_prefix` populated. But in fact such cases
are already handled by users of the patch object, which simply opt to
use the default prefixes in that case.
Add a test that verifies that we are able to parse patches which add a
new file that has spaces in its path.
@ethomson
Copy link
Member

ethomson commented Apr 4, 2019

Thanks @eaigner for reporting the issue and @pks-t for the fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants