
fix #196 pre-commit hook improvements for UTF-8 / binary files

The previous commit ed322f3 fixed a problem for UTF-8 files but introduced
new issues for files without a trailing newline and for binary files.
This commit addresses both problems.

Close #249.
1 parent 61e1f0e, commit 34c3bd8c6ac42d0867a31726092312a462873b13. jakub-g committed with Fabio Crisci, Nov 28, 2012
Showing with 20 additions and 5 deletions.
  1. +20 −5 hooks/pre-commit
@@ -9,7 +9,7 @@ else
fi
-# Find files with trailing whitespace
+# 1. Find files with trailing whitespace and strip it.
for FILE in `exec git diff-index --check --cached $against -- | sed '/^[+-]/d' | sed -r 's/:[0-9]+:.*//' | uniq` ; do
# echo "$FILE whitespace cleaned" >> ./pre-commit.log
# Fix them!
@@ -18,11 +18,26 @@ for FILE in `exec git diff-index --check --cached $against -- | sed '/^[+-]/d' |
# as we are on Windows, we want CRLF back (see the next loop)
done
+# 2. For non-binary modified files, add trailing newline if not there, and convert LF to CRLF
for FILE in `exec git diff --cached --name-only $against`; do
- if [ -f "$FILE" ]; then # filter out files that are on "to be deleted" list
- # echo "$FILE dos2unixed" >> ./pre-commit.log
- # do not use 'dos2unix -D' as it misbehaves when input contains UTF-8 characters
- sed -i 's/$/\r/' "$FILE" # better LF -> CRLF; remove if you're not on Windows
+ # filter out 1) files on the "to be deleted" list and 2) binary files
+ if [[ -f $FILE && $FILE != *.gif && $FILE != *.jpg && $FILE != *.jpeg && $FILE != *.png && $FILE != *.bmp && $FILE != *.swf ]]; then
+ # do not use 'dos2unix -D' as it misbehaves when input contains UTF-8 characters (duplicates the newlines)
+
+ # -i = make changes in place;
+ # First, add trailing newline if not present yet: http://unix.stackexchange.com/a/31955/10745
+ # Then, perform newline normalization LF/CRLF to CRLF.
+ # That way, each file is guaranteed to have CRLF only and to have trailing CRLF.
+
+ # The only pathological case not handled here is CR-only line endings, which should
+ # never occur unless you insert them manually (Alt-0013) or use an ancient Mac.
+
+ # Important note: it is tempting to combine the two commands below into one,
+ # but for mixed input (some LF, some CRLF) the result is not as expected!
+
+ # echo -e "\nLF to CRLF conversion with SED on $FILE\n"
+ sed -i -e '$a\' "$FILE" # add trailing newline if not present
+ sed -i -e 's/$/\r/' "$FILE" # match end of line, and append CR; LF is added automatically by sed.
git add "$FILE"
fi
done
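
The two `sed` steps added in the second loop can be tried in isolation, outside the hook. The following is a minimal sketch, assuming GNU sed (for `-i`, the one-line `$a\` form, and `\r` in the replacement) and a scratch file from `mktemp`. The function name `normalize_to_crlf` and the sample content are illustrative only and are not part of the hook. The sample input is LF-only with no trailing newline; other inputs are not covered here.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU sed. normalize_to_crlf is a hypothetical helper name.
normalize_to_crlf() {
    local file=$1
    sed -i -e '$a\' "$file"      # step 1: append a newline only if the last line lacks one
    sed -i -e 's/$/\r/' "$file"  # step 2: append CR at the end of every line
}

tmp=$(mktemp)
trap 'rm -f "$tmp"' EXIT          # remove the scratch file when the script exits

printf 'first\nsecond' > "$tmp"   # LF-only input, no trailing newline
normalize_to_crlf "$tmp"
od -c "$tmp"                      # dump the bytes to inspect the line endings
```

Keeping the steps as two separate `sed -i` invocations mirrors the hook, whose comments warn against merging them.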
