Format-Hex should not try to render unicode control characters that affect the display #7777

Open
SteveL-MSFT opened this issue Sep 13, 2018 · 10 comments

3 participants
@SteveL-MSFT
Member

commented Sep 13, 2018

Format-Hex currently filters out control characters in the ASCII range that would affect the console display. Needs to be updated to handle Unicode control characters. Also consider replacing use of period for non-printable characters with the Unicode symbol for non-printable to avoid confusion with actual periods.
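
To make the request above concrete, here is a minimal sketch in Python rather than in the cmdlet's own code. It assumes that "control characters that affect the display" means the Unicode general categories Cc (control), Cf (format, such as bidirectional overrides), Zl and Zp (line and paragraph separators). That category set, and the names `FILTERED_CATEGORIES`, `affects_display`, and `sanitize`, are hypothetical and not taken from the Format-Hex source.

```python
import unicodedata

# Assumption: these general categories are the ones worth filtering.
# Cc = control, Cf = format (e.g. bidi overrides), Zl/Zp = line/paragraph separators.
FILTERED_CATEGORIES = frozenset({"Cc", "Cf", "Zl", "Zp"})


def affects_display(ch: str) -> bool:
    """Return True if the character falls in one of the filtered categories."""
    return unicodedata.category(ch) in FILTERED_CATEGORIES


def sanitize(text: str, placeholder: str = "\ufffd") -> str:
    """Replace every filtered character with the placeholder symbol."""
    return "".join(placeholder if affects_display(ch) else ch for ch in text)
```

With a placeholder other than a period, a real `.` in the data stays distinguishable from a replaced character, which is the confusion the description mentions.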

@iSazonov

Collaborator

commented Sep 14, 2018

Is "`u{00}" the expected output format?

@SteveL-MSFT

Member Author

commented Sep 14, 2018

Any unprintable character (beyond the control characters) should be replaced with the box with question mark symbol.

@ThreeFive-O

Contributor

commented Sep 17, 2018

@iSazonov I think @SteveL-MSFT means this one:

U+FFFD � REPLACEMENT CHARACTER used to replace an unknown, unrecognized or unrepresentable character

from https://en.wikipedia.org/wiki/Specials_(Unicode_block)

@iSazonov

Collaborator

commented Sep 18, 2018

@ThreeFive-O Thanks for clarifying.

Will this symbol display correctly in the Windows 7 console with the default configuration?

@iSazonov

Collaborator

commented Sep 18, 2018

With raster fonts (the default on Windows 7), the replacement symbol is not displayed.
On other Windows versions, the symbol is displayed as an empty square by default.

It does not look good.

@SteveL-MSFT

Member Author

commented Sep 18, 2018

We might have to special-case Win7: detect whether the font can display the symbol and, if it can't, maybe just show a question mark.

@iSazonov

Collaborator

commented Sep 19, 2018

Determining the capabilities of a font at runtime looks impossible.
We could add a new parameter, ReplacementCharacter, with the standard default U+FFFD � REPLACEMENT CHARACTER.
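
As a rough illustration of that parameter idea, the Python sketch below formats one hex-dump row and takes the replacement symbol as an argument, so a user on a raster font could pass `?` or `.` instead. The function name `format_hex_line`, the row layout, and the rule that only bytes 0x20 to 0x7E count as printable are assumptions made for the example, not the actual Format-Hex behavior.

```python
def format_hex_line(chunk: bytes, offset: int = 0,
                    replacement_character: str = "\ufffd") -> str:
    """Format up to 16 bytes as an offset, a hex column, and a text column."""
    hex_part = " ".join(f"{b:02X}" for b in chunk)
    # Simplification: treat only printable ASCII as displayable;
    # every other byte is shown as the replacement character.
    text_part = "".join(
        chr(b) if 0x20 <= b < 0x7F else replacement_character
        for b in chunk
    )
    # 47 = 16 two-digit bytes plus 15 separating spaces.
    return f"{offset:08X}   {hex_part:<47}  {text_part}"


# Default symbol, and an override for consoles that cannot show U+FFFD.
print(format_hex_line(b"Hi\x00."))
print(format_hex_line(b"Hi\x00.", replacement_character="?"))
```

Keeping U+FFFD as the default while allowing an override would sidestep font detection entirely: the choice is left to the user.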

@SteveL-MSFT

Member Author

commented Sep 21, 2018

@iSazonov Does U+FFFD render as whitespace on Win7? That might be good enough, since the telemetry shows that only a small percentage of customers are on Win7, and probably only a small subset of those customers use Format-Hex.

@iSazonov

Collaborator

commented Sep 23, 2018

> does U+FFFD render as whitespace on Win7?

A raster font is the default in the Windows 7 console, and the symbol is displayed as whitespace. Users have to select a TrueType font to see the symbol. I personally always do this.
If Unix consoles use a TrueType font by default, I think we could use the symbol.

Also, I tried Char.IsControl() in the code you linked and got surprising results. It is again a problem on Windows 7. I don't know whether we can accept this for newer Windows versions and Unix systems.

@SteveL-MSFT

Member Author

commented Jan 1, 2019

I think this should be pretty straightforward: I expect the Unicode control characters to be documented online, and they just need to be added to the existing filter list.
