Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inconsistent results from isprint('\t') #7416

Closed
stevengj opened this issue Jun 25, 2014 · 6 comments
Closed

inconsistent results from isprint('\t') #7416

stevengj opened this issue Jun 25, 2014 · 6 comments
Labels

Comments

@stevengj
Copy link
Member

@stevengj stevengj commented Jun 25, 2014

As mentioned in #7402 and JuliaLang/IJulia.jl#200, isprint('\t') returns true on Windows and false on MacOS. This inconsistency seems undesirable.

According to http://en.cppreference.com/w/cpp/string/byte/isprint, the correct result is false. However, other documentation for isprint is vague on the topic. Since isprint(' ') returns true, it seems odd to me that it would return false for a tab.

I'm not sure what the right answer is here, but I'm leaning towards false based on what non-Windows systems do.

@stevengj

This comment has been minimized.

Copy link
Member Author

@stevengj stevengj commented Jun 25, 2014

One potential fix might be to use Unicode character categories, as suggested in #5939.

@stevengj

This comment has been minimized.

Copy link
Member Author

@stevengj stevengj commented Jun 25, 2014

Note that our current isprint function just calls iswprint, which is broken on Windows anyway because it only handles characters that are in the BMP (i.e. which fit into one 16-bit UTF16 code unit).

stevengj added a commit to JuliaIO/JSON.jl that referenced this issue Jun 25, 2014
@stevengj

This comment has been minimized.

Copy link
Member Author

@stevengj stevengj commented Jun 25, 2014

See also this discussion, in which the MinGW developers comment that contrery [sic] to other systems,msvcrt DLL version of iswprint function returns 1 for tab character and hence specifically check for tabs in GDB's iswprint implementation.

@stevengj stevengj added the windows label Jun 25, 2014
@elextr

This comment has been minimized.

Copy link

@elextr elextr commented Jun 26, 2014

I don't think there is a Unicode "printable" property looking at http://www.unicode.org/reports/tr44/#Character_Properties

The traditional ASCII printable has meant "not a control or delete", so space was "printable" but tab was not.

@stevengj

This comment has been minimized.

Copy link
Member Author

@stevengj stevengj commented Jun 26, 2014

The tab character in Unicode is in category Cc (control characters), so we could define isprintas !iscntrl && ....

@vtjnash

This comment has been minimized.

Copy link
Member

@vtjnash vtjnash commented Dec 13, 2014

fixed by #8233

@vtjnash vtjnash closed this Dec 13, 2014
chapuni pushed a commit to llvm-project/llvm-project-20170507 that referenced this issue Jul 26, 2018
The standard library functions ::isprint/std::isprint have platform-
and locale-dependent behavior which makes LLVM's output less
predictable. In particular, regression tests my fail depending on the
implementation of these functions.

Implement llvm::isPrint in StringExtras.h with a standard behavior and
replace all uses of ::isprint/std::isprint by a call it llvm::isPrint.
The function is inlined and does not look up language settings so it
should perform better than the standard library's version.

Such a replacement has already been done for isdigit, isalpha, isxdigit
in r314883. gtest does the same in gtest-printers.cc using the following
justification:

    // Returns true if c is a printable ASCII character.  We test the
    // value of c directly instead of calling isprint(), which is buggy on
    // Windows Mobile.
    inline bool IsPrintableAscii(wchar_t c) {
      return 0x20 <= c && c <= 0x7E;
    }

Similar issues have also been encountered by Julia:
JuliaLang/julia#7416

I noticed the problem myself when on Windows isprint('\t') started to
evaluate to true (see https://stackoverflow.com/questions/51435249) and
thus caused several unit tests to fail. The result of isprint doesn't
seem to be well-defined even for ASCII characters. Therefore I suggest
to replace isprint by a platform-independent version.

Differential Revision: https://reviews.llvm.org/D49680
chapuni pushed a commit to llvm-project/llvm that referenced this issue Jul 26, 2018
The standard library functions ::isprint/std::isprint have platform-
and locale-dependent behavior which makes LLVM's output less
predictable. In particular, regression tests my fail depending on the
implementation of these functions.

Implement llvm::isPrint in StringExtras.h with a standard behavior and
replace all uses of ::isprint/std::isprint by a call it llvm::isPrint.
The function is inlined and does not look up language settings so it
should perform better than the standard library's version.

Such a replacement has already been done for isdigit, isalpha, isxdigit
in r314883. gtest does the same in gtest-printers.cc using the following
justification:

    // Returns true if c is a printable ASCII character.  We test the
    // value of c directly instead of calling isprint(), which is buggy on
    // Windows Mobile.
    inline bool IsPrintableAscii(wchar_t c) {
      return 0x20 <= c && c <= 0x7E;
    }

Similar issues have also been encountered by Julia:
JuliaLang/julia#7416

I noticed the problem myself when on Windows isprint('\t') started to
evaluate to true (see https://stackoverflow.com/questions/51435249) and
thus caused several unit tests to fail. The result of isprint doesn't
seem to be well-defined even for ASCII characters. Therefore I suggest
to replace isprint by a platform-independent version.

Differential Revision: https://reviews.llvm.org/D49680

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338034 91177308-0d34-0410-b5e6-96231b3b80d8
chapuni pushed a commit to llvm-project/llvm-project-submodule that referenced this issue Jul 26, 2018
The standard library functions ::isprint/std::isprint have platform-
and locale-dependent behavior which makes LLVM's output less
predictable. In particular, regression tests my fail depending on the
implementation of these functions.

Implement llvm::isPrint in StringExtras.h with a standard behavior and
replace all uses of ::isprint/std::isprint by a call it llvm::isPrint.
The function is inlined and does not look up language settings so it
should perform better than the standard library's version.

Such a replacement has already been done for isdigit, isalpha, isxdigit
in r314883. gtest does the same in gtest-printers.cc using the following
justification:

    // Returns true if c is a printable ASCII character.  We test the
    // value of c directly instead of calling isprint(), which is buggy on
    // Windows Mobile.
    inline bool IsPrintableAscii(wchar_t c) {
      return 0x20 <= c && c <= 0x7E;
    }

Similar issues have also been encountered by Julia:
JuliaLang/julia#7416

I noticed the problem myself when on Windows isprint('\t') started to
evaluate to true (see https://stackoverflow.com/questions/51435249) and
thus caused several unit tests to fail. The result of isprint doesn't
seem to be well-defined even for ASCII characters. Therefore I suggest
to replace isprint by a platform-independent version.

Differential Revision: https://reviews.llvm.org/D49680
earl pushed a commit to earl/llvm-mirror that referenced this issue Jul 26, 2018
The standard library functions ::isprint/std::isprint have platform-
and locale-dependent behavior which makes LLVM's output less
predictable. In particular, regression tests my fail depending on the
implementation of these functions.

Implement llvm::isPrint in StringExtras.h with a standard behavior and
replace all uses of ::isprint/std::isprint by a call it llvm::isPrint.
The function is inlined and does not look up language settings so it
should perform better than the standard library's version.

Such a replacement has already been done for isdigit, isalpha, isxdigit
in r314883. gtest does the same in gtest-printers.cc using the following
justification:

    // Returns true if c is a printable ASCII character.  We test the
    // value of c directly instead of calling isprint(), which is buggy on
    // Windows Mobile.
    inline bool IsPrintableAscii(wchar_t c) {
      return 0x20 <= c && c <= 0x7E;
    }

Similar issues have also been encountered by Julia:
JuliaLang/julia#7416

I noticed the problem myself when on Windows isprint('\t') started to
evaluate to true (see https://stackoverflow.com/questions/51435249) and
thus caused several unit tests to fail. The result of isprint doesn't
seem to be well-defined even for ASCII characters. Therefore I suggest
to replace isprint by a platform-independent version.

Differential Revision: https://reviews.llvm.org/D49680

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338034 91177308-0d34-0410-b5e6-96231b3b80d8
ilovepi added a commit to ilovepi/llvm-project-20170507 that referenced this issue Aug 14, 2018
The standard library functions ::isprint/std::isprint have platform-
and locale-dependent behavior which makes LLVM's output less
predictable. In particular, regression tests my fail depending on the
implementation of these functions.

Implement llvm::isPrint in StringExtras.h with a standard behavior and
replace all uses of ::isprint/std::isprint by a call it llvm::isPrint.
The function is inlined and does not look up language settings so it
should perform better than the standard library's version.

Such a replacement has already been done for isdigit, isalpha, isxdigit
in r314883. gtest does the same in gtest-printers.cc using the following
justification:

    // Returns true if c is a printable ASCII character.  We test the
    // value of c directly instead of calling isprint(), which is buggy on
    // Windows Mobile.
    inline bool IsPrintableAscii(wchar_t c) {
      return 0x20 <= c && c <= 0x7E;
    }

Similar issues have also been encountered by Julia:
JuliaLang/julia#7416

I noticed the problem myself when on Windows isprint('\t') started to
evaluate to true (see https://stackoverflow.com/questions/51435249) and
thus caused several unit tests to fail. The result of isprint doesn't
seem to be well-defined even for ASCII characters. Therefore I suggest
to replace isprint by a platform-independent version.

Differential Revision: https://reviews.llvm.org/D49680
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.