Allow more Unicode on sys.stdout #37202

Closed
loewis mannequin opened this issue Sep 21, 2002 · 11 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@loewis
Mannequin

loewis mannequin commented Sep 21, 2002

BPO 612627
Nosy @malemburg, @loewis
Files
  • stdout.txt
  • stdout2.txt
  • stdout3.txt
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/loewis'
    closed_at = <Date 2003-05-10.07:11:14.000>
    created_at = <Date 2002-09-21.20:32:23.000>
    labels = ['interpreter-core']
    title = 'Allow more Unicode on sys.stdout'
    updated_at = <Date 2003-05-10.07:11:14.000>
    user = 'https://github.com/loewis'

    bugs.python.org fields:

    activity = <Date 2003-05-10.07:11:14.000>
    actor = 'loewis'
    assignee = 'loewis'
    closed = True
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2002-09-21.20:32:23.000>
    creator = 'loewis'
    dependencies = []
    files = ['4578', '4579', '4580']
    hgrepos = []
    issue_num = 612627
    keywords = ['patch']
    message_count = 11.0
    messages = ['41194', '41195', '41196', '41197', '41198', '41199', '41200', '41201', '41202', '41203', '41204']
    nosy_count = 3.0
    nosy_names = ['lemburg', 'nobody', 'loewis']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue612627'
    versions = []

    @loewis
    Mannequin Author

    loewis mannequin commented Sep 21, 2002

    This patch extends the set of Unicode strings that can
    be printed to sys.stdout, to support all strings that
    the terminal will likely support. It also adds an
    encoding attribute to sys.std{in,out}.

    To do that:

    • it adds an .encoding attribute to all file objects,
      which is normally None,
    • it initializes the encoding of sys.stdin and sys.stdout
      if either is a terminal,
    • it adds a wrapper object around sys.stdout in site.py
      that encodes all Unicode objects according to the
      detected encoding, if that encoding is known to Python.
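
The wrapper idea in the last bullet can be sketched roughly as follows. This is a minimal illustration written for modern Python 3, not the code from the attached patch: the names `EncodingWriter` and `is_known_encoding` are hypothetical, and the ASCII fallback is an assumption chosen to mirror Python 2's default encoding.

```python
import codecs
import io


def is_known_encoding(name):
    """Return True if Python has a codec registered under this name."""
    if not name:
        return False
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


class EncodingWriter:
    """Hypothetical wrapper: encodes text before passing it to a byte stream."""

    def __init__(self, raw, encoding=None):
        self.raw = raw
        # Keep the encoding only if Python knows it; otherwise leave it unset.
        self.encoding = encoding if is_known_encoding(encoding) else None

    def write(self, data):
        if isinstance(data, str):
            # Assumption: fall back to ASCII when no usable encoding is set.
            data = data.encode(self.encoding or "ascii")
        return self.raw.write(data)

    def __getattr__(self, name):
        # Delegate everything else (flush, close, ...) to the wrapped stream.
        return getattr(self.raw, name)


buf = io.BytesIO()
out = EncodingWriter(buf, encoding="utf-8")
out.write("Grüße\n")
print(buf.getvalue())
```

With an unknown encoding name the wrapper leaves `.encoding` as None, so non-ASCII text raises UnicodeEncodeError instead of being written in a guessed encoding.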

    To find the encoding of the terminal, it

    • uses GetConsoleCP and GetConsoleOutputCP on Windows,
    • uses nl_langinfo(CODESET) on Unix, if available.
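
For illustration, the same lookup can be approximated from Python itself. This is only a sketch under stated assumptions: the patch does this work in C, the function name `detect_terminal_encoding` and its `for_input` parameter are hypothetical, and the Windows branch reaches the console API through `ctypes`.

```python
import io
import locale
import sys


def detect_terminal_encoding(stream, for_input=False):
    """Return the encoding of the terminal behind `stream`, or None."""
    isatty = getattr(stream, "isatty", None)
    if isatty is None or not isatty():
        return None  # not a terminal: leave the encoding unset

    if sys.platform == "win32":
        import ctypes
        kernel32 = ctypes.windll.kernel32
        # GetConsoleCP describes console input, GetConsoleOutputCP console output.
        codepage = kernel32.GetConsoleCP() if for_input else kernel32.GetConsoleOutputCP()
        return "cp%d" % codepage if codepage else None

    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        try:
            # Make sure LC_CTYPE reflects the user's environment before querying it.
            locale.setlocale(locale.LC_CTYPE, "")
        except locale.Error:
            return None
        return locale.nl_langinfo(locale.CODESET) or None

    return None


print(detect_terminal_encoding(sys.stdout))
print(detect_terminal_encoding(io.StringIO()))  # not a terminal, so None
```

Note that the Unix branch calls `setlocale`, which changes process-wide state; a real implementation would need to decide whether that side effect is acceptable.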

    The primary rationale for this change is that people
    should be able to print Unicode in an interactive
    session. A parallel change needs to be added for IDLE,
    so that it adds the .encoding attribute to the emulated
    stdout (it already supports printing of Unicode on stdout).

    @loewis loewis mannequin closed this as completed Sep 21, 2002
    @loewis loewis mannequin self-assigned this Sep 21, 2002
    @loewis loewis mannequin added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Sep 21, 2002
    @nobody
    Mannequin

    nobody mannequin commented Sep 24, 2002

    Logged In: NO

    I like the .encoding concept.

    I don't really like the sys.stdout wrapper. Wouldn't it be
    better to add the functionality to the file object's .write() and
    .writelines() methods and then only use the wrapper in case
    sys.stdout is not a true file object?

    @loewis
    Mannequin Author

    loewis mannequin commented Sep 24, 2002

    Logged In: YES
    user_id=21627

    I have considered implementing it in the file object.
    However, it becomes quite involved and requires heavy C code:
    PyFile_WriteObject calls PyObject_Print. Since the Unicode type
    does not implement tp_print, this calls str/repr, which
    converts using the default encoding.

    It is not clear at which point the file encoding should be
    taken into account.

    @malemburg
    Member

    Logged In: YES
    user_id=38388

    I think it could work by adding a special case to
    PyFile_WriteObject() instead of calling PyObject_Print().
    You first encode the Unicode object and then let
    PyFile_WriteString() take care of the writing to the
    FILE* object.

    I see no other way, since you can't place the .encoding
    information into the FILE* object.
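
The control flow of this proposal can be modeled in a few lines of Python. This is a hedged sketch in modern Python 3, not the C implementation: `FileModel`, `write_string` and `write_object` are hypothetical stand-ins for the file object, PyFile_WriteString() and PyFile_WriteObject(), and a bytes buffer plays the role of the FILE*.

```python
import io

DEFAULT_ENCODING = "ascii"  # the default encoding in Python 2


class FileModel:
    """Hypothetical stand-in for a file object that carries an encoding."""

    def __init__(self, encoding=None):
        self.encoding = encoding      # None unless a terminal encoding was detected
        self.stream = io.BytesIO()    # plays the role of the underlying FILE*

    def write_string(self, data):
        self.stream.write(data)


def write_object(obj, f):
    """Model of the proposed special case for text objects."""
    if isinstance(obj, str):
        # Special case: encode with the file's encoding when one is set.
        data = obj.encode(f.encoding or DEFAULT_ENCODING)
    else:
        # Generic path: convert to text, then use the default encoding.
        data = str(obj).encode(DEFAULT_ENCODING)
    f.write_string(data)


terminal = FileModel(encoding="latin-1")
write_object("caf\u00e9", terminal)
print(terminal.stream.getvalue())
```

The encoding lives on the Python-level file object and is consulted before any bytes reach the underlying stream, which is why nothing needs to be stored in the FILE* itself.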

    @loewis
    Mannequin Author

    loewis mannequin commented Oct 26, 2002

    Logged In: YES
    user_id=21627

    I've attached a revised version which implements your
    proposal; this version works without modification of site.py.

    In its current form, the file encoding is only applied in
    print; for sys.stdout.write, it is ignored. For print, it is
    applied independent of whether this is a script or
    interactive mode.

    @loewis
    Mannequin Author

    loewis mannequin commented Mar 23, 2003

    Logged In: YES
    user_id=21627

    Is the patch now acceptable?

    @malemburg
    Member

    Logged In: YES
    user_id=38388

    Looks ok except for the direct hacking
    of f_encoding in the sys module. Please add
    either a macro or a new API to make changing
    the encoding from C possible without tapping
    directly into the implementation.

    @loewis
    Mannequin Author

    loewis mannequin commented Mar 29, 2003

    Logged In: YES
    user_id=21627

    In stdout3.txt, PyFile_SetEncoding has been added, wrapping
    the creation and assignment of the string object f_encoding.

    @loewis
    Mannequin Author

    loewis mannequin commented May 9, 2003

    Logged In: YES
    user_id=21627

    Any chance that this can go into 2.3b2?

    @malemburg
    Member

    Logged In: YES
    user_id=38388

    Sorry for not getting back to you on this earlier.

    stdout3.txt looks OK. Please check it in.

    Thanks !

    @loewis
    Mannequin Author

    loewis mannequin commented May 10, 2003

    Logged In: YES
    user_id=21627

    Committed as

    concrete.tex 1.25
    libstdtypes.tex 1.124
    fileobject.h 2.32
    fileobject.c 2.178
    sysmodule.c 2.119
    NEWS 1.763

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022