Ensure that non-ASCII strings match themselves #16
In the wildmatch code, we escape the strings in a pattern by iterating over each byte and then appending the result of each iteration of the loop to a new string. If the character is not part of an escape sequence, we append it by casting it to a string.
However, this does not produce the expected results. When converting an integral value, such as a byte, to a string, Go interprets the integral value as a rune, that is, a Unicode code point. Consequently, each byte with a value greater than 127 in the original string is turned into the UTF-8 sequence for the code point with that value, which is the character that byte represents in Latin-1.
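To illustrate the conversion behavior described above, here is a minimal sketch. The helper name `convertByteAsRune` and the sample byte `0xE4` are chosen for illustration only and are not taken from the wildmatch code.

```go
package main

import "fmt"

// convertByteAsRune is a hypothetical helper that performs the problematic
// conversion: string(b) treats b as a Unicode code point, not as a raw byte.
func convertByteAsRune(b byte) string {
	return string(b)
}

func main() {
	// 0xE4 is greater than 127, so it is encoded as the two-byte UTF-8
	// sequence for U+00E4 rather than kept as a single byte.
	fmt.Printf("% x\n", convertByteAsRune(0xE4)) // prints "c3 a4"

	// ASCII bytes are unaffected, which is why the problem only shows up
	// with non-ASCII input.
	fmt.Printf("% x\n", convertByteAsRune('a')) // prints "61"
}
```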
Because a string which is overencoded does not match the original string, any attempt to have a non-ASCII string match itself would fail. Solve this by taking our byte value and first creating a byte slice, and then casting that to a string.
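The difference between the two approaches can be sketched as follows. This is a simplified illustration, not the actual wildmatch code: `copyBuggy` and `copyFixed` are hypothetical names, escape-sequence handling is omitted, and the sample string is an assumption rather than the one used in the tests.

```go
package main

import "fmt"

// copyBuggy rebuilds s byte by byte using string(b), which interprets each
// byte as a rune and so re-encodes every byte above 127 as two bytes.
func copyBuggy(s string) string {
	var out string
	for i := 0; i < len(s); i++ {
		out += string(s[i])
	}
	return out
}

// copyFixed rebuilds s byte by byte via a one-element byte slice, which
// preserves the raw byte value.
func copyFixed(s string) string {
	var out string
	for i := 0; i < len(s); i++ {
		out += string([]byte{s[i]})
	}
	return out
}

func main() {
	in := "你好" // illustrative non-Latin-1 input; 6 bytes in UTF-8

	fmt.Println(copyBuggy(in) == in) // prints "false"
	fmt.Println(copyFixed(in) == in) // prints "true"

	// Every byte of the input is above 127, so the buggy copy doubles in size.
	fmt.Println(len(in), len(copyBuggy(in)), len(copyFixed(in))) // prints "6 12 6"
}
```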
For the curious, Google Translate reports that the string used in the tests is Chinese for "hello world". Using a string that is not in Latin-1 is preferable to a Latin-1 string because it makes it less likely that we're getting things right by accident.
/cc @yunshan as reporter
Fixes git-lfs/git-lfs#3794.