Add a Clean/Smudge Filter for Windows UTF-16 files #113
Comments
Just out of curiosity, what Windows program stores text files in UTF-16 by default? The only occasion where I saw this recently for the first time is when you pipe e.g. the output of "wmic process get executablepath,processid" to a file. Also, is there a risk of this filter breaking the user experience for non-UTF-16 text files?
Regedit.
Well, yeah. If you choose to export to .txt files instead of the usual .reg files (note that the suggested filter only applies to .txt files). Quite an academic example.
@sschuberth: Regarding breaking the user experience, the filter would be off by default. The user would have to manually configure their gitattributes to handle UTF-16 files as required.
As discussed in issue #113 it is useful to have a method for converting between unicode and utf-8 for writing smudge/clean filters. To assist in this we shall include a functional iconv.exe in the release. A side effect is we now also ship libintl-8.dll. Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
@zachriggle No, they are not, at least not in Visual Studio 2008.
@kismert since you did almost all the work (apart from @patthoyts' patch to include |
The native encoding for Microsoft Windows is little-endian UTF-16 (UTF-16LE, historically UCS-2). The Windows distribution of msysGit does not handle this format as text, but instead treats it as binary.
As a fix, msysGit Google group members suggested a clean/smudge filter, which does a very good job of handling UTF-16. This would take little effort to include in msysGit:
1. Include iconv.exe and support files in \Program Files\Git\bin for the Windows version.
2. Add the filter definition to ~\Git\etc\gitconfig.
3. Add filter entries to the global ~/Git/etc/gitattributes or local ~/.gitattributes files.

I think this would be a valuable enhancement for msysGit that would allow Windows users to quickly configure their repositories to better work with the increasingly common UTF-16 format.
Note that Mercurial and Bazaar do not handle UTF-16 properly either, and this would give msysGit-based products a leg up on their peers.
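For illustration, the gitconfig and gitattributes changes proposed above might look roughly like the following. This is only a sketch, not the exact configuration from the mailing list: the filter name `utf16` is a placeholder, and it assumes `iconv` is on the PATH and that the affected files are little-endian UTF-16. Byte-order-mark handling may need adjusting for a given iconv build.

```ini
# gitconfig: hypothetical filter named "utf16" (the name is an assumption)
[filter "utf16"]
	# clean: convert the working-tree UTF-16LE file to UTF-8 before it is staged
	clean = iconv -f UTF-16LE -t UTF-8
	# smudge: convert the stored UTF-8 content back to UTF-16LE on checkout
	smudge = iconv -f UTF-8 -t UTF-16LE
```

A repository would then opt in with an attributes line such as `*.txt filter=utf16` in its gitattributes file. Without such an entry the filter stays inactive, so files that are not UTF-16 are left alone.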
Thanks,
kismert