Tkinter hangs or crashes when displaying astral chars #86391
Comments
On my macOS Mojave, 3.10, echoing '\U0001####' (# = hex digit) or chr(#####) (decimal digits) in IDLE's shell either shows an error box or hangs. On bpo-13153, freezing on macOS was reported for 3.7.6. Until tkinter on Mac works better, we should try to get an error box for all astral chars. For an SO questioner with Ubuntu 18.04, now updated to 20.04 with Python 3.8.6, some chars display (128512-128547; 128549-128555; 128557-128576, example chr(128516)) and some 'crash' (example chr(128077)). I am trying to get 'crash' narrowed down and to find out which tk version Ubuntu uses.
Serhiy, does >>> chr(128516) echo thumbs up on your Linux? The SO crash example works for me on Windows. I should test more codepoints. |
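For reference, "astral" here means any code point above U+FFFF, i.e. outside the Basic Multilingual Plane. A minimal sketch for checking the two Stack Overflow examples; the helper name `is_astral` is only for illustration and is not part of IDLE or tkinter:

```python
def is_astral(ch: str) -> bool:
    """Return True if ch is a single code point outside the BMP (above U+FFFF)."""
    return len(ch) == 1 and ord(ch) > 0xFFFF


# The two examples from the report, given there as decimal code points.
for codepoint in (128516, 128077):
    ch = chr(codepoint)
    print(f"{codepoint} = U+{codepoint:05X} astral={is_astral(ch)}")
```

Both examples are astral (U+1F604 and U+1F44D), so the difference between "displays" and "crashes" is not simply BMP versus non-BMP.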
I get a crash for chr(128516) ("😄") in Tk.
$ wish
% label .l -text 😄
.l
% X Error of failed request: BadLength (poly request too large or internal Xlib length error)
Major opcode of failed request: 139 (RENDER)
Minor opcode of failed request: 20 (RenderAddGlyphs)
Serial number of failed request: 599
Current serial number in output stream: 599 |
Serhiy:
On Linux? What is your Tk version? On my Fedora 32, the character is displayed properly. It seems like Tk is still using X11 whereas my GNOME desktop is using Wayland.
$ ./python -m test.pythoninfo|grep ^tkinter
tkinter.TCL_VERSION: 8.6
tkinter.TK_VERSION: 8.6
tkinter.info_patchlevel: 8.6.10 |
Hum, I didn't explain well. My test: I ran ./python -m idlelib. In the IDLE shell, I wrote chr(0x1F604), which displays the emoji as expected:
>>> chr(0x1F604)
'😄' |
I generated a script for testing all characters:
with open('withtest.sh', 'w', errors='surrogatepass') as f:
    for i in range(0x100, 0x110000):
        print(f"echo 'label .l -text \"{chr(i)}\"; exit' | wish 2>/dev/null && echo OK '\\U{i:08x}' {chr(i)!r} || echo FAIL '\\U{i:08x}' {chr(i)!r}", file=f)
It takes a long time: it tested around 20% of all characters in 6-7 hours. It seems that all failed characters are colored emojis and all passed characters are non-colored. So it seems to be related either to the font that provides colored emojis, or to the mechanism that interprets such fonts, or Tk just cannot correctly handle the output when such fonts are used (maybe it reserves too small a buffer or cannot interpret the result code). |
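The same per-character probe can be driven from Python instead of a generated shell script. This is only a sketch under assumptions: the names `tcl_probe_script` and `probe` are hypothetical, `wish` must be on PATH with a usable display, and as in the shell version the code point is pasted into a quoted Tcl string unescaped, so it is only meaningful for the non-ASCII range tested above.

```python
import shutil
import subprocess


def tcl_probe_script(codepoint: int) -> str:
    """Build the one-line Tcl script that the shell loop pipes into wish."""
    return f'label .l -text "{chr(codepoint)}"; exit'


def probe(codepoint: int, timeout: float = 10.0):
    """Return True on success, False on failure or hang, None if wish is missing."""
    wish = shutil.which("wish")
    if wish is None:
        return None
    # surrogatepass mirrors errors='surrogatepass' in the script generator.
    script = tcl_probe_script(codepoint).encode("utf-8", "surrogatepass")
    try:
        result = subprocess.run(
            [wish],
            input=script,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False  # treat a hang as a failure
    return result.returncode == 0
```

Calling `probe(0x1F604)` starts one `wish` process per character, so it will be no faster than the shell loop; the timeout only bounds the cost of a hang.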
Serhiy's test also works as expected.
$ wish
% label .l -text 😄
Since Serhiy's test doesn't use Python, is it worth tracking this Tk crash in the Python bug tracker? |
Yes, on Linux. Ubuntu 20.04. Tk 8.6.10. X.Org X Server 1.20.8. I tried to report the bug upstream, but failed. I have not used the Tk bug tracker for several years, and that was on a different computer, so I no longer have the password to my account, and when I tried to create new accounts, I could not log in with them either. I tried to write to the mailing list, but it requires subscribing, and when I subscribed I did not receive a confirmation message. If anybody can, please report this bug to the Tk developers. |
Victor, do you see a color smiling face in my example or monochromatic or just a bar? |
See attached screenshot: fedora32.png. |
It looks different on my computer. I suppose it will crash for you too if you install a color emoji font. |
The error on Linux could be related to this issue: https://bugzilla.redhat.com/show_bug.cgi?id=1498269 |
In IDLE on Windows the following prints the first 3 astral planes in a couple of minutes.
for i in range(0x10000, 0x40000, 32):
    chars = ''.join(chr(i+j) for j in range(32))
    print(hex(i), chars)
Perhaps half of the assigned chars in the first plane are printed instead of being replaced with a narrow box. This includes emoticons as foreground color outlines on background color. Maybe all of the second plane of extended CJK chars are printed. The third plane is unassigned and prints as unassigned boxes (with an X). Fixing OS graphics or tk is out of scope for us. Preventing hangs or crashes when using tkinter is. On Mac, refusing to insert any astral char into a tk widget might be the best solution. Serhiy, could that be done in tkinter/_tkinter? On Linux, the situation appears to be more complex. The SO questioner |
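One possible reading of "refusing to insert any astral char" is to substitute an escape before the text ever reaches the widget. The sketch below is an assumption about what such a filter could look like at the Python level, not what tkinter, _tkinter, or IDLE actually does; `escape_astral` is a hypothetical name.

```python
import re

# Matches any single code point outside the Basic Multilingual Plane.
_ASTRAL = re.compile('[\U00010000-\U0010FFFF]')


def escape_astral(text: str) -> str:
    """Replace each astral character with its \\UXXXXXXXX escape."""
    return _ASTRAL.sub(lambda m: f"\\U{ord(m.group()):08x}", text)


print(escape_astral("smile: \U0001f604"))
```

A filter like this would be applied just before `text.insert(...)`. It avoids the crash at the cost of never showing the glyph, which is the trade-off debated in the following comments.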
I dislike attempting to work around Tk issues in Python. As you can see, the behavior really depends on the platform. As I wrote, on Fedora 32 it works (the character is rendered properly). I would prefer not to block such characters on Fedora 32 just because they crash on some other platforms. Otherwise you would have to detect the precise conditions that explain why it works on some platforms and crashes on others... |
For me, this is not limited to special characters. Trying to load anything in Tk using the 'JoyPixels' font crashes (sometimes it does load but all characters are very random - most are whitespace - and it crashes again after a call to I believe this is what is being experienced on https://askubuntu.com/questions/1236488/x-error-of-failed-request-badlength-poly-request-too-large-or-internal-xlib-le because they are not using any special characters yet are reporting the same problem. |
Victor, does my test run to completion (without exception) on your Fedora? If it does, I definitely would not disable astral char display on Fedora. This version catches exceptions and reports them separately and runs directly with tkinter, in about a second.
tk = True
if tk:
    from tkinter import Tk
    from tkinter.scrolledtext import ScrolledText
    root = Tk()
    text = ScrolledText(root, width=80, height=40)
    text.pack()
    def print(txt):
        text.insert('insert', txt+'\n')
errors = []
for i in range(0x10000, 0x40000, 32):
    chars = ''.join(chr(i+j) for j in range(32))
    try:
        print(f"{hex(i)} {chars}")
    except Exception as e:
        errors.append(f"{hex(i)} {e}")
print("ERRORS:")
for line in errors:
    print(line) |
It works on Ubuntu if I uninstall the color emoji font (package fonts-noto-color-emoji). |
The following program fails with:
Python program:
from tkinter import Tk
from tkinter.scrolledtext import ScrolledText
root = Tk()
text = ScrolledText(root, width=80, height=40)
text.pack()
for i in range(0x10000, 0x40000, 32):
    chars = ''.join(chr(i+j) for j in range(32))
    text.insert('insert', f"{hex(i)} {chars}\n")
input("Press enter to exit")
It seems like the first character which triggers this RenderAddGlyphs BadLength issue is: U+1f6c2. See attached emoji.png screenshot. As you can see, some emojis are rendered in color in Gnome Terminal. I guess that it uses the Gtk 3 pango library to render these characters. |
The X Error is displayed and then the process exits. Python cannot catch this fatal X Error. |
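Since the X error terminates the whole process, a try/except in the same interpreter cannot help. A caller that needs to survive it can run the Tk code in a child interpreter and inspect the exit status. The sketch below shows only that isolation pattern; `run_isolated` is a hypothetical helper, and the child here is a stand-in that writes a message and exits non-zero rather than a real Tk program.

```python
import subprocess
import sys


def run_isolated(source: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run Python source in a separate interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Stand-in for a child that dies the way the X error kills the Tk process.
child = "import sys; sys.stderr.write('X Error of failed request\\n'); sys.exit(1)"
result = run_isolated(child)
print(result.returncode, result.stderr.strip())
```

The parent keeps running and can report the failure; the cost is one extra process per probe.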
@kevin Walzer: Is the problem we're seeing a known issue with Tk? |
Some work has been done this year on expanding support for these types of glyphs in Tk, but I'm not sure of its current state--it's not my area of expertise. Can you open a ticket at https://core.tcl-lang.org/tk/ so one of the folks working on this can take a look? |
Kevin, Serhiy tried to report this upstream but failed. msg380143.
One person running my test program reported: Running line-by-line in terminal, the for-loop crashes with:
Another reported "Seems to produce garbage on my system:
But the program ran to completion without errors. A copy of the output from the window was attached. I have asked for the tcl/tk version. My response included:
I don't know what you saw, but Notepad++ displays control chars with the high bit set (C1 controls) as their reversed type (white on black) 3 char acronym as defined on
Thus the first astral U+10000 is encoded as b"\xF0\x90\x80\x80". In Notepad++, what is in the file appears as 4 characters, not 1, displayed 'ðDCSPADPAD', with the part after ð being the correct white on black triplets for code points U+90 and U+80. The first char '\xf0' == 'ð' is the same for all quadruples shown by Notepad++. The next 3 vary as appropriate. In some cases, all 4 are normal printable chars, such as 0x29aa0, a CJK char, showing as "𩪠"
If I cut the first 4 chars from Notepad++ to Thunderbird the result is "ð���". I see only ð but the presence of 3 0-width chars is revealed by moving through the string with arrow keys.
As far as IDLE and Linux are concerned, I am just going to consider what to change or add in "User output in Shell" in the IDLE doc. |
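The "garbage" described above is what the four UTF-8 bytes of U+10000 look like when each byte is shown as its own character. A small illustration, assuming the viewer treats the bytes as Latin-1 code points (which matches the DCS and PAD names reported for U+0090 and U+0080):

```python
# UTF-8 encoding of the first astral code point.
encoded = chr(0x10000).encode("utf-8")
print(encoded)  # b'\xf0\x90\x80\x80'

# Reading the same bytes one per character, as Latin-1 does.
misread = encoded.decode("latin-1")
print([f"U+{ord(c):04X}" for c in misread])  # ['U+00F0', 'U+0090', 'U+0080', 'U+0080']

# Decoding as UTF-8 recovers the single original character.
assert encoded.decode("utf-8") == "\U00010000"
```

U+00F0 is 'ð', and the other three are C1 control characters, which is why only the first one is visible in most programs.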
Further to the information I posted on Stack Overflow (referred to above) relating to reproducing emoticon characters from IDLE under Ubuntu, I have done more testing. Based on some of the code/comments above, I tried modifications which I hoped might identify errors before IDLE crashed. Another test used this code.
def FileSave(sav_file_name,outputstring):
    with open(sav_file_name, "a", encoding="utf8",newline='') as myfile:
        myfile.write(outputstring)
def FileSave1(sav_file_name,eoutputstring):
    with open(sav_file_name, "a", encoding="utf8",newline='') as myfile:
        myfile.write(eoutputstring)
tk = True
if tk:
    from tkinter import Tk
    from tkinter.scrolledtext import ScrolledText
    root = Tk()
    text = ScrolledText(root, width=80, height=40)
    text.pack()
    def print1(txt):
        text.insert('insert', txt+'\n')
errors = []
outputstring = "Characters:"+ "\n"+"\n"
eoutputstring = "Errors:"+ "\n"+"\n"
#for i in range(0x1f600, 0x1f660): #crashes at 0x1f624
for i in range(0x1f623, 0x1f624): # 1f624, 1f625 then try 1f652
    chars = chr(i)
    decimal = str(int(hex(i)[2:],16))
    try:
        outputstring = str(hex(i))+" "+decimal+" "+chars+ "\n"
        FileSave("Charsfile.txt", outputstring)
        print1(f"{hex(i)} {decimal} {chars}")
        print(f"{hex(i)} {decimal} {chars}")
    except Exception as e:
        print(str(hex(i)))
        eoutputstring = str(hex(i))+ "\n"
        FileSave1("Errorfile.txt", eoutputstring)
        errors.append(f"{hex(i)} {e}")
print("ERRORS:")
for line in errors:
    print(line)
With the range starting at 0x1f623 and the end point at 0x1f624, this prints OK on Ubuntu, but with a higher end point all the IDLE windows closed. However, on some occasions, if I began with the end point at 0x1f624 and ran it, then without closing the editor window increased the end point to 0x1f625, saved, and ran again, the Text window would close but the console window would remain open. I could then increase the upper range further and repeat, and more characters would print to the console. In none of the tests with the more complex code above did I manage to generate any error output. My set up is as follows. Hopefully, the above might give some pointers to handling these characters. |
I've filed a Tk issue about this: https://core.tcl-lang.org/tk/tktview/f9fa926666d8e06972b5f0583b07a3c98eaac0a0 What versions of Tk are used?
|
On Ubuntu, Tk version is showing as 8.6.10 |
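Because several reports here hinge on the exact patch level (8.6.8 versus 8.6.10), it may help to compare versions as tuples rather than as strings. A minimal sketch, assuming final-release patch levels of the form "8.6.10"; `parse_patchlevel` and `tcl_patchlevel` are hypothetical helper names, and the query goes through `tkinter.Tcl()`, which creates an interpreter without opening a display.

```python
def parse_patchlevel(patchlevel: str) -> tuple:
    """Turn a final-release patch level such as '8.6.10' into (8, 6, 10)."""
    return tuple(int(part) for part in patchlevel.split("."))


def tcl_patchlevel():
    """Return the Tcl patch level string, or None if tkinter is unavailable."""
    try:
        import tkinter
    except ImportError:
        return None
    # Tcl() does not load Tk, so this works without an X display.
    return tkinter.Tcl().call("info", "patchlevel")


# String comparison gets this wrong; tuple comparison does not.
print("8.6.10" > "8.6.8")
print(parse_patchlevel("8.6.10") > parse_patchlevel("8.6.8"))
```

Alpha or beta patch levels such as "8.7a5" are outside what this parser handles and would need extra care.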
The crash I had on macOS with tk 8.6.8 appears to be gone when using tk 8.6.10. What I got instead was a SyntaxError when pasting a smiley emoji into an IDLE shell window and trying to execute print("😀"). The SyntaxError message says: 'utf-8' codec can't encode characters in position 7-12: surrogates not allowed. That's likely due to how Tk represents this character in its text widget, and is something we could work around when converting Tcl/Tk strings to Python strings. Printing the emoji using 'print(chr(128516))' works fine. The scriptlet in msg380173 also works. |
Serhiy, does Ronald's report above re 8.6.10 on macOS suggest what might be needed to make print("😀") work on Mac? As I remember, your year-old _tkinter patch to make print(<astral>) work on Linux and Windows converts Python strings differently on the two systems. But you did not know for sure what to do for macOS because nothing would work. |
Note that the main installers for Python 3.8 and 3.9 will continue to use Tk 8.6.8 due to problems when building later Tk versions on macOS 10.9. The current plan is to add an installer variant that (amongst other changes) uses Tk 8.6.10 (and .11 when that's released). |
W.r.t. the SyntaxError I got (msg380552): It looks like it will be possible to work around that problem in _tkinter.c:unicodeFromTclStringAndSize by merging surrogate pairs. |
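To illustrate what "merging surrogate pairs" means at the string level, here is a pure-Python sketch. It is not the C code in _tkinter.c and makes no claim about how unicodeFromTclStringAndSize is or will be written; the function name `merge_surrogate_pairs` is hypothetical.

```python
import re

# A high surrogate immediately followed by a low surrogate.
_PAIR = re.compile('[\ud800-\udbff][\udc00-\udfff]')


def merge_surrogate_pairs(text: str) -> str:
    """Combine each high/low surrogate pair into the astral character it encodes."""
    def join(match):
        high, low = match.group()
        return chr(0x10000 + ((ord(high) - 0xD800) << 10) + (ord(low) - 0xDC00))
    return _PAIR.sub(join, text)


# A string holding the pair for U+1F600 as two separate surrogate code points.
from_tk = 'print("\ud83d\ude00")'
try:
    from_tk.encode("utf-8")
except UnicodeEncodeError as exc:
    print("before:", exc.reason)

merged = merge_surrogate_pairs(from_tk)
print("after:", merged.encode("utf-8"))
```

Unpaired surrogates are left untouched, so a string containing one would still fail to encode as UTF-8.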
Please open a new issue for "surrogates not allowed". |
I closed python/issues-test-cpython#43647 as a duplicate of this. It reported that BMP chars can fail also. For instance, with "Noto Sans Mono", but not 'Dejavu Mono', the following crash:
>>> '\u2705'
'✅'
>>> '\u270f'
'✏'
Unfortunately, at least on some *nix, the default tkFixedFont resolves to Noto Sans Mono. |
At least it is not a regression caused by support of astral characters (bpo-13153). |
No, seems strictly a matter of complicated color, which is perhaps becoming more common. Firefox colors the checkbox (white checkmark on green field in a largish black square) but not the (smaller) pencil. I did not recognize either the FF or tk Windows pencil as a pencil without any color (so I searched), so I won't be surprised if FF upgrades its pencil too. |
On macOS with 3.10.0a, 8.6.11 appears to fix this issue.
>>> chr(128516)
"😄"
For IDLE, I am adding a paragraph to the doc. I will then close this issue as 'fixed' (insofar as we can for what is a 3rd party failure). |