Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add exceptions to TAR extended attribute constructors to disallow '=' and '\n' characters in keys and values. #82812

Merged
merged 5 commits into from Apr 24, 2023

Conversation

carlossanlop
Copy link
Member

Follow up of #82810
Related to #81699

This change adds exceptions that prevent creating a PAX entry with extended attributes that contain:

  • An '=' character in a key.
  • A '\n' character in either a key or a value.

The TAR spec indicates that the '=' character is used to divide a key from a value, and the '\n' character is used to divide a key-value-pair from another.

Since the keys and values are strings, the user could easily pass one of these characters where they shouldn't, and without proper internal detection, we end up writing them in the extended attributes. This prevents a roundtrip to properly read the extended attribute entries, and we either fail silently (the extended attributes show up as empty) or show corrupt data.

@carlossanlop
Copy link
Member Author

I compared the extraction behavior of the gnu tar tool that comes installed with Ubuntu, and tested reading '=' or '\n' characters in extended attributes.

The tool is quite permissive: As long as the reported size of the data section is correct, once an error happens when reading an extended attribute due to finding those characters where they shouldn't be, it simply skips the whole extended attributes section and continues reading the next file.

This behavior aligns to what we already do: we abort the while loop if there are any issues, but continue reading the subsequent entries if possible:

while (TryGetNextExtendedAttribute(ref buffer, out string? key, out string? value))

Also, the tar tool only cares about extended attributes that contain data that needs to be overriden on the extracted file: mtime, atime, ctime, uname, gname, etc. If one of those keys has an invalid value, it does not get overriden. Any other extended attributes that can't be overriden in the file, get ignored by the tool, and a message is printed to console telling the user that the attribute was ignored due to not being recognized.

carlossanlop and others added 4 commits April 4, 2023 09:41
…tesTarEntry constructors that take an extended attributes dictionary, throw when a disallowed character is found.
…ert them to the lazily created dictionary if they pass the validations.
{
throw new ArgumentException(SR.Format(SR.TarExtAttrDisallowedKeyChar, kvp.Key, kvp.Key[index] == '\n' ? "\\n" : kvp.Key[index]));
}
if (kvp.Value.IndexOf('\n') != -1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readability:

Suggested change
if (kvp.Value.IndexOf('\n') != -1)
if (kvp.Value.Contains('\n'))

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's more readable, but I am unsure if Contains is more or less performant compared to IndexOf. Let me find out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Contains is strictly better in this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually have an analyzer for this:

# CA2249: Consider using 'string.Contains' instead of 'string.IndexOf'
dotnet_diagnostic.CA2249.severity = warning

Do we know why it's not firing?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea. @carlossanlop can you please check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The analyzer has a bug: it only runs when using == or >=. My code is using !=.

I opened an issue: dotnet/roslyn-analyzers#6587

@carlossanlop carlossanlop merged commit d9e8673 into dotnet:main Apr 24, 2023
104 of 107 checks passed
@carlossanlop carlossanlop deleted the TarDisallowCharsExtAttr branch April 24, 2023 23:24
@carlossanlop
Copy link
Member Author

I would like to backport this to 7.0, @ViktorHofer @ericstj. The exception would protect users from improper use of extended attributes, which could cause data loss. Do you have any concerns or objections?

@danmoseley
Copy link
Member

Tactics will generally not approve speculative ports. They'll ask for evidence this is being hit.

@ghost ghost locked as resolved and limited conversation to collaborators May 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants