ANSI doesn't allow nonprintable characters #83

dcoshea · 2014-07-02T05:57:15Z

In 345eb58, in write_ch():

if ch not in string.printable:
    fout = open ('log', 'a')
    fout.write ('Nonprint: ' + str(ord(ch)) + '\n')
    fout.close()
    return

I would like non-printable characters to be accepted, since I am dealing with data streams that include CP437 line drawing characters.

Obviously it's easy to comment out the above lines, but I think it would make more sense for non-printable characters to be accepted, leaving it up to the caller to decide what to do with any that they find on the virtual "screen". Filtering them out, as is currently done, means that the characters that follow do not appear in their correct locations on the screen. The exception would be if this is meant as a way to filter out line noise, in which case you presumably don't want the cursor to move when a character is dropped?
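To make the misalignment concrete, here is a minimal sketch. The `render` helper is hypothetical: it is a toy stand-in for one screen row, not pexpect's `write_ch()`, and it assumes the input has already been decoded from CP437 to text.

```python
import string


def render(data, drop_nonprintable):
    """Toy screen row: each accepted character takes the next column."""
    row = []
    for ch in data:
        if drop_nonprintable and ch not in string.printable:
            # Mimics the check quoted above: the character is skipped,
            # so the cursor never advances past it.
            continue
        row.append(ch)
    return "".join(row)


# CP437 bytes: vertical bar (0xB3), "A", vertical bar (0xB3).
line = b"\xb3A\xb3".decode("cp437")

filtered = render(line, drop_nonprintable=True)
kept = render(line, drop_nonprintable=False)

print(filtered.index("A"))  # 0: "A" has shifted one column to the left
print(kept.index("A"))      # 1: "A" sits where the sender placed it
```

With the filter on, everything after a dropped character lands one column early, which is the effect described above.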


jquast · 2014-07-11T08:21:53Z

yes, this is wrong. will see.

also for cp437 line-drawing characters, you may be interested in:
https://github.com/jquast/x84/blob/master/x84/encodings/cp437_art.py

and maybe also, https://github.com/tehmaze/piece

dcoshea · 2014-07-14T03:25:23Z

> also for cp437 line-drawing characters, you may be interested in:
> https://github.com/jquast/x84/blob/master/x84/encodings/cp437_art.py

I take it that the difference between this encoding and the standard cp437 one is that this one maps the first 32 characters to glyphs (smiley face, etc.) whereas the standard one leaves them with the same ordinal values?

> and maybe also, https://github.com/tehmaze/piece

Thanks, but I think that wouldn't work so well for me as it wants me to give it a file, and it parses it all in one hit, whereas I need to feed it data read from a serial interface a byte at a time and process the parsed result.

jquast · 2014-07-16T22:22:10Z

For streaming terminal-emulator "screen region" access, I'd also recommend pyte; see the stream.feed() call in the example at https://github.com/selectel/pyte/blob/master/examples/helloworld.py

jquast · 2014-07-16T22:22:49Z

oh yes, and you are correct -- in "cp437_art" the control characters are mapped to smileys, etc.
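The standard-codec half of that comparison can be checked directly. This is a small sketch using only Python's built-in cp437 codec; it does not import the cp437_art module linked above and makes no claim about its exact mapping.

```python
# Bytes 0x00-0x1F decoded with Python's built-in cp437 codec.
control_bytes = bytes(range(32))
decoded = control_bytes.decode("cp437")

# The built-in codec leaves these as control characters with the same ordinals.
same_ordinals = [ord(c) for c in decoded] == list(range(32))
print(same_ordinals)  # True

# Bytes in the upper half are mapped to the line-drawing glyphs.
box = b"\xda\xc4\xbf".decode("cp437")
print([hex(ord(c)) for c in box])  # ['0x250c', '0x2500', '0x2510']
```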

dcoshea · 2014-07-23T11:25:26Z

While I'm working on issue #84 - adding support for Unicode - is it okay if I just get rid of this check for whether the character is printable? If anyone really wanted to exclude non-printable characters, I think (haven't confirmed) that, with my fix for issue #84, they should be able to specify something like codec="ascii", and with the default setting of codec_errors="replace", most non-printable characters should get replaced.
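A rough illustration of that idea, using plain `bytes.decode()` rather than the `codec` / `codec_errors` constructor arguments proposed for #84 (their exact behaviour is not confirmed here). The sample bytes are made up for the example.

```python
# One control byte, one letter, and two CP437 line-drawing bytes.
data = b"\x01A\xc4\xb3"

# Decode as ASCII, substituting U+FFFD for anything the codec rejects.
text = data.decode("ascii", errors="replace")

print([hex(ord(c)) for c in text])  # ['0x1', '0x41', '0xfffd', '0xfffd']
```

Note that only the bytes outside the ASCII range are replaced. Control characters such as 0x01 are valid ASCII and pass through unchanged, so this approach would not filter every non-printable character.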

jquast · 2014-07-24T02:33:02Z

That is correct. Technically, a UTF-8 byte sequence would fail the "printable" check; it's up to the decoder to raise UnicodeDecodeError, etc.
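Both halves of that statement can be seen in a short sketch. It assumes each byte of the sequence would be tested individually against `string.printable`, as the per-character check in the original report does.

```python
import string

# UTF-8 encoding of U+2500 (box drawings light horizontal).
encoded = "\u2500".encode("utf-8")
print(encoded)  # b'\xe2\x94\x80'

# Viewed one byte at a time, none of the bytes counts as "printable".
all_fail = all(chr(b) not in string.printable for b in encoded)
print(all_fail)  # True

# A strict decoder is what reports malformed input, e.g. a truncated sequence.
try:
    encoded[:2].decode("utf-8")
    raised = False
except UnicodeDecodeError:
    raised = True
print(raised)  # True
```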

This commit updates the screen and ANSI modules to support Unicode under Python 2.x. Under Python 3.x, it was already supported because strings are Unicode by default.

Now, on both Python versions:

- The constructors accept a codec name (defaults to 'latin-1') and a scheme for handling encoding/decoding errors (defaults to 'replace'). The codec may be set to None to inhibit encoding/decoding.
- Unicode is now used internally for storing the screen contents.
- Methods that accept input characters will, if passed input of type 'bytes' (or, under Python 2.x, 'str'), use the specified codec to decode the input, otherwise treating it as Unicode.
- Methods that return screen contents now return Unicode, with the exception of __str__() under Python 2.x, and __bytes__() in all versions of Python, which return the screen contents encoded using the specified codec.

These changes are designed to work only with Python 2.6, 2.7, and 3.3 and later, specifically versions that provide both b'' and u'' string literals.

The check in ANSI for characters being printable is also removed, as this prevents non-ASCII characters being accepted, which is not compatible with the goal of adding Unicode support. This addresses issue pexpect#83.
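The behaviour described in that commit message can be sketched roughly as follows. `ToyScreen` is a hypothetical stand-in written for this illustration, not pexpect's screen or ANSI class; it targets Python 3 only and omits the `codec=None` case.

```python
class ToyScreen:
    """Hypothetical illustration of decode-on-input, encode-on-output."""

    def __init__(self, codec="latin-1", codec_errors="replace"):
        self.codec = codec
        self.codec_errors = codec_errors
        self.cells = []  # contents are held internally as Unicode text

    def write(self, data):
        # bytes input is decoded with the configured codec;
        # str input is already Unicode and is stored as-is.
        if isinstance(data, bytes):
            data = data.decode(self.codec, self.codec_errors)
        self.cells.extend(data)

    def __str__(self):
        # Python 3: returns Unicode text.
        return "".join(self.cells)

    def __bytes__(self):
        # Returns the contents encoded with the configured codec.
        return "".join(self.cells).encode(self.codec, self.codec_errors)


screen = ToyScreen(codec="cp437")
screen.write(b"\xc4A")  # bytes: decoded from CP437
screen.write("B")       # str: treated as Unicode

print(ascii(str(screen)))  # '\u2500AB'
print(bytes(screen))       # b'\xc4AB'
```

Storing Unicode internally means the caller sees the same text regardless of whether the input arrived as bytes or as a string, and the original byte form is still recoverable through the codec.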

dcoshea · 2014-07-24T12:45:43Z

Filed pull request #96 which includes a fix for this issue.


jquast · 2015-09-19T18:20:25Z

Closing. pexpect's terminal emulation code remains in the next release but is no longer being improved; it is marked deprecated by #240. I suggest that any terminal emulation / screen scraping efforts move to more concerted projects such as https://github.com/selectel/pyte

jquast added the bug label Jul 11, 2014

dcoshea mentioned this issue Jul 24, 2014

Issue #84: Unicode support in screen and ANSI. #96

Closed

dcoshea mentioned this issue Aug 7, 2015

Tab characters are not expanded by screen/ANSI #95

Closed

jquast closed this as completed Sep 19, 2015
