
do not elide UTF-8 characters from multi-line comments

Before this change, uncrustify would remove non-ASCII UTF-8 characters
from multi-line comments.  For example, here it removes the two-byte
a-acute (bytes \303\241) from a 2-line comment:

    $ printf '/*\303\241\n*/\n' > utf8.c
    $ src/uncrustify -q utf8.c; wc -c utf8.c utf8.c.unc*
     8 utf8.c
     6 utf8.c.uncrustify

This fixes it and adds a test to exercise the fix.

* src/output.cpp (output_comment_multi_simple): Correct the type of
"ch": use "int", not "char", to avoid sign extension when char is signed.
* tests/c.test: New file.
* tests/input/c/cmt_multi_utf8.c: New file.
* tests/output/c/02423-cmt_multi_utf8.c: New file.
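
A minimal sketch of the failure mode, not the actual uncrustify code. It
assumes the comment text is available as int values in the range 0..255
and that the output routine ignores values outside that range. The names
emit, copy_via_signed_char, and copy_via_int are hypothetical and exist
only for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an output routine that takes an int and
// only emits values that fit in one byte (0..255).
static void emit(std::string &out, int ch)
{
    if (ch >= 0 && ch <= 255)
    {
        out.push_back(static_cast<char>(ch));
    }
}

// Old behavior (assumed): each value passes through a signed char
// temporary, so bytes >= 0x80 become negative before reaching emit().
static std::string copy_via_signed_char(const std::vector<int> &text)
{
    std::string out;
    for (int value : text)
    {
        signed char ch = static_cast<signed char>(value);
        emit(out, ch);   // ch is promoted to int with sign extension
    }
    return out;
}

// New behavior (assumed): the temporary is an int, so the value is
// passed through unchanged.
static std::string copy_via_int(const std::vector<int> &text)
{
    std::string out;
    for (int value : text)
    {
        int ch = value;
        emit(out, ch);
    }
    return out;
}
```

For the 8-byte input from the example above ('/', '*', 0xC3, 0xA1, '\n',
'*', '/', '\n'), copy_via_signed_char returns 6 bytes and copy_via_int
returns all 8, which matches the wc output shown.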
Jim Meyering committed Nov 25, 2011
1 parent 34d4143, commit fbf046e03e41fb8c137c2ed49c120bf389cbbc75
Showing with 6 additions and 1 deletion.
  1. +1 −1 src/output.cpp
  2. +1 −0 tests/c.test
  3. +2 −0 tests/input/c/cmt_multi_utf8.c
  4. +2 −0 tests/output/c/02423-cmt_multi_utf8.c
2 src/output.cpp
@@ -1568,7 +1568,7 @@ static void output_comment_multi(chunk_t *pc)
static void output_comment_multi_simple(chunk_t *pc)
{
int cmt_idx;
- char ch;
+ int ch;
int line_count = 0;
int ccol;
int col_diff = 0;
1 tests/c.test
@@ -255,6 +255,7 @@
02421 cmt_multi-1.cfg c/cmt_multi.c
02422 cmt_multi-2.cfg c/cmt_multi.c
+02423 cmt_multi-2.cfg c/cmt_multi_utf8.c
02431 align_right_cmt_gap-1.cfg c/cmt_right_align.c
02432 align_right_cmt_gap-2.cfg c/cmt_right_align.c
2 tests/input/c/cmt_multi_utf8.c
@@ -0,0 +1,2 @@
+/* This is a multiline comment with a UTF8 character: á
+ */
2 tests/output/c/02423-cmt_multi_utf8.c
@@ -0,0 +1,2 @@
+/* This is a multiline comment with a UTF8 character: á
+ */
