Unicode characters inconsistently cannot be displayed in Notepad++ #3747
Comments
I was able to reproduce this as well. I tested with the Default Style in Style Configurator set to I believe this issue would happen any time you enter a character that is NOT contained in the selected font and then add/remove on the same line a character which IS contained in the selected font. IMHO it seems like something is not quite right with the font-substitution routines. This was in a TXT file encoded with Debug Information: Notepad++ v7.5.9 (64-bit) |
This may be, but it also appears to occur in other circumstances as well. For example, "⎷" (U+23B7 "RADICAL SYMBOL BOTTOM") is present in Consolas, Courier New, DejaVu Sans Mono, and Lucida Console, but if you put that in a new text file, it won't show up with any of those fonts. |
Cannot confirm this example on my system. The U+23B7 character is not in my DejaVu Sans Mono. There is U+23AE, followed by U+23CE. The U+23B7 character is not in my Courier New either. There is U+2321, followed by U+2500. Same for my Consolas, U+2321 followed by U+2460, and the same for my Lucida Console, which ends at U+0433. So this example does not seem to poke a hole in the theory that only characters unavailable in the current font are affected. |
It might be, somehow, related to SCI_SETTECHNOLOGY configuration. |
@ValZapod |
@Ekopalypse, I did include an Technology 0: Technologies 1, 2 and 3: Both screenshots show the same file automatically loaded after start; the only thing I did was move the cursor to the right bracket of the two marked brackets. I used Courier New here. The new technologies seem to size the substituted chars better than technology 0. Edit: But it has nothing to do with "fixed font" anymore; the substitutions seem to have quite variable widths. |
No, I mean the screenshot and discussion you linked to. There it has been reported that the words in an autocompletion box are bigger using DirectWrite. |
I can make the ∈ visible with the Courier New font now too, using technology 0 and some hand-configured font linking, which looks actually a little ill: What I did: There is a registry entry
Under it, there are many multi-string values, named after fonts. There was no value named |
@Uhf7 Is not set for me. |
then certainly another trick exists. Running out of thoughts here. On my system, the 32 bit version works exactly like the 64 bit version. A major difference is the system itself: I use you use Windows 7. I try it on my Windows 7 system ... |
Ok, under Windows 7, the ∈ works fine, with and without bracket highlighting, but some other characters don't. Using technology 0. With the 64 bit version of Npp, the same characters are missing. Using technology 1: What else could you ask for? Looks perfect to me. Technology 1, Windows 7, 64 bit version of Npp: I dream, if you ask me. Compared to the current state. |
No, it was started with these settings. I switched to global overwrite to show that no other font is defined. If global override is NOT checked, the default setting takes precedence.
Npp has no setting for this yet. What you can do is to use one of the scripting language plugins, @Uhf7 - hmm :-D what should I say - Windows 10 broke it :-( |
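The scripting-plugin workaround mentioned above can be sketched roughly as follows. This assumes the PythonScript plugin, whose `editor` object wraps the Scintilla messages SCI_SETTECHNOLOGY and SCI_GETTECHNOLOGY as `setTechnology()` and `getTechnology()`. The helper name `switch_technology` is hypothetical, and it takes the editor object as a parameter so it can also be exercised outside Notepad++.

```python
# Scintilla technology values (SC_TECHNOLOGY_*).
SC_TECHNOLOGY_DEFAULT = 0      # GDI
SC_TECHNOLOGY_DIRECTWRITE = 1  # DirectWrite


def switch_technology(ed, wanted=SC_TECHNOLOGY_DIRECTWRITE):
    """Ask Scintilla for `wanted`; fall back to GDI if it was not accepted.

    `ed` is assumed to expose setTechnology()/getTechnology(), as the
    PythonScript `editor` object does. Returns the technology in effect.
    """
    ed.setTechnology(wanted)
    if ed.getTechnology() != wanted:
        # The request did not stick, so return to the GDI default.
        ed.setTechnology(SC_TECHNOLOGY_DEFAULT)
    return ed.getTechnology()
```

Inside a PythonScript startup script this would be called as `from Npp import editor` followed by `switch_technology(editor)`. Reading the value back is only a guess at a "safe fallback"; whether a failed switch is actually reported this way on every system is an assumption.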
I see no negative effect when removing the ☆, if I use technology 1: or the ⇒: I certainly start to get messed up with my screenshot names here, that's why I don't post any pictures from re-inserting the ☆ and the ⇒ successfully, but for me, with technology 1 everything works fine. Under Windows 7 and under Windows 10, I believe. |
My knowledge of fonts, rendering, directwriting... refers to what I have posted. |
Using a plugin or NppExec to make the characters display correctly is not exactly what I would call a solution to the problem. It's a possible work-around, but wouldn't it be nice if the characters were displayed correctly without additional actions? And, @ValZapod, how long would it take until we have a new Scintilla version? (Perhaps the Scintilla developers will say: Use technology 1 or higher! That would be interesting.) I would feel better if Npp itself switched the technology to a working one. Maybe it can be included in the configuration somehow, so that there is a safe fallback if the technology switch doesn't work on some systems. |
@Uhf7 - 100% correct :-D |
An issue close to this one is #2287. It is the same problem they describe there, existing since 2016, and it is solved there by setting the technology to DirectWrite with the help of a plug-in. Thank you for that solution, but this is something for insiders. For a new user, or a user who just edits files without caring about development, this solution is very hard to find. So I would fully support what @jefflomax said in #2287:
So I will try to push it to the master now, with a PR. If we do not do this now, the next people will come along in two years and waste their time testing it again and again and again. |
Maybe. But Unicode itself was already there under Vista. What bugs me more is that technology 1 under this Vista sometimes seems to wreck "normal" characters near the ∈ character. |
You wrote 2 days ago
The Unicode character U+25C6 (◆) displays in Npp with and without DirectWrite technology, even in Windows 7. So I cannot verify that this is exactly "our" bug. And it was 2012. And he used Windows XP. And I'm sure there are many effects which can lead to empty frames instead of correct characters. I simply don't believe that it's promising to go to them and ask them to fix exactly this issue now. |
The second screenshot of my Vista screenshots, headlined "Vista, Technolgy 1, Courier New". |
Valerii Zapodovnikov:
Okay, maybe open another issue?? Maybe also let's try @nyamatongwe?
For Scintilla bug #1393, text shaping for East Asian text can be influenced by the locale used so displaying in a Japanese context may differ from a Chinese context. There are other bugs about this like https://sourceforge.net/p/scintilla/bugs/2027/.
Problems with displaying particular symbol characters may be different. They seem to occur when the specified font does not include some characters so Windows tries to use glyphs from backup fonts. Scintilla does not have much control over this.
For GDI (technology 0) you could try experimenting with the font creation setup call in SetLogFont inside win32/PlatWin.cxx. It is possible that the lfQuality and lfCharSet parameters will influence the behaviour.
DirectWrite was originally implemented for Windows Vista but that early version had some problems and DirectWrite has improved over time. Applications could default to using DirectWrite from Windows 7 if there are too many problems with Vista or add an option that users can select. Some people prefer GDI’s less anti-aliased (blurry) text.
Neil
|
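The suggestion to default to DirectWrite from Windows 7 onward could look roughly like this. It is a sketch only: the function name `default_technology` is hypothetical, and the only facts relied on are that Windows Vista reports version 6.0 and Windows 7 reports 6.1.

```python
import sys

GDI, DIRECTWRITE = 0, 1  # Scintilla technology 0 and 1


def default_technology(major, minor):
    """Pick DirectWrite from Windows 7 (NT 6.1) on, GDI for anything older."""
    return DIRECTWRITE if (major, minor) >= (6, 1) else GDI


if sys.platform == "win32":
    # Only meaningful on Windows; other platforms skip this block.
    version = sys.getwindowsversion()
    print(default_technology(version.major, version.minor))
```

A user-selectable option would still be needed on top of this, since some people prefer GDI's rendering.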
Nope, The owner of Notepad3 is "Derick Payne" 😉 |
I don't think so, if you check his PR then you will see that he added it to the preference dialog. |
Too much noise for my taste, I'm out. |
@ValZapod I saw the UI already, but had no real opinion about it, because it doesn't belong to this project, so it does not help me here. My opinion regarding the technology settings in the screenshot: two options too many. The difference in text rendering is between the Windows GDI TextOut function on one side and the DirectWrite equivalent on the other side. The rest is about how to bring the rendering result of DirectWrite to the screen. |
Valerii Zapodovnikov:
Somebody just proposed a patch for this! #8756 (comment)
We already know that changing the text changes the presentation of this bug. The patch is not a fix.
Neil
|
This "fix" is at least a hint where the problem comes from: It comes directly from the Windows GDI text output functions for wide characters. I did some experiments based on this information. The Windows GDI functions, which are used by Scintilla and which do not work correctly, are:
The common error of these functions seems to be that they use squares instead of characters for some 'bad' Unicode characters, as long as there is no 'good' Unicode character in the text string. I have no list of 'good' or 'bad' Unicode characters; these are only terms I invented here. But I can name two 'good' Unicode characters: 0x0000 and 0x200B. If one of those two characters is in the text, all other Unicode characters are displayed correctly. The 0x0000 character has been used by @KnIfER for the "fix". Unfortunately, it has a width when we use it with the Windows GDI functions. So I went for the 0x200B character (zero width space) in my experiments. A possible fix is to silently append the 0x200B character to all text strings passed to the Windows functions mentioned above. Then they produce the correct character widths and the correct output. To make this experiment fly without additional text copy operations, I modified the
After modifying the What remains here, is the
This experimental fix runs on my system without assertions in debug mode and displays the correct characters using the Windows GDI functions. I don't know whether such a solution would be accepted by Scintilla, but perhaps there is someone who wants to try it this way too ... |
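The string-level idea behind this experiment can be illustrated as below. This is not the Scintilla patch (which is C++ and is not shown in this thread), only a sketch of "append U+200B before handing the text to the drawing call". The helper names `with_trailing_zwsp` and `utf16_units` are hypothetical.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE, one of the two 'good' characters named above


def with_trailing_zwsp(text):
    """Return the text as it would be passed on: ZWSP appended unless present."""
    # Assumption: one 'good' character anywhere in the string is enough,
    # so a string that already contains ZWSP is left unchanged.
    return text if ZWSP in text else text + ZWSP


def utf16_units(text):
    """Number of UTF-16 code units, the length a wide-character API expects."""
    return len(text.encode("utf-16-le")) // 2


padded = with_trailing_zwsp("◬⊗")
print(repr(padded), utf16_units(padded))
```

The appended character adds one UTF-16 code unit, so any length passed alongside the buffer has to grow by one as well; the caller's own text positions must keep ignoring the extra unit.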
Here's another way to reproduce this, from #3747 originally reported with #813 Open a new Notepad++ file, set the encoding to UTF-8 and paste these symbols (Double Arrow Unicode characters) on the first empty line This comment has a nice video showing the issue |
Here's the solution: |
Maybe be more specific about what is not fixed. |
Actually, DirectWrite enabled / disabled has same effect (shown in previous screenshot). |
Could it depend on system/application font? |
@ValZapod Would you suggest reopening this defect and closing the new ones, or something else? |
Pong! Seems to be a complicated issue. We should try to split up some different problems into different groups:
|
It is a hack, and it has to be applied in Scintilla. I call it a hack because I don't know exactly why it works. But it is not too bad a hack, because it respects all the requirements the Windows API has at this point, which means we do nothing that is forbidden by the Windows API. Actually, we expect the Windows API to work without this hack, but it doesn't.
Linux has nothing to do with it here. The problem is that all characters we see displayed by Notepad++ are displayed by Windows API function calls. If the DirectWrite option in Notepad++ is disabled, Notepad++ uses the ancient Windows GDI function TextOutW to display the character(s). If the DirectWrite option in Notepad++ is enabled, Notepad++ uses the brand-new and super-fast DirectWrite text output function to display the character(s). Since these are different graphics APIs, there are different results.
Cannot try this out at the moment, but as far as I remember from testing, some should work. Please try. |
First thought: I feel overcharged with this. Fighting the ministry of truth??? No way. Second thought: The problem is so specific, that no one working at the 1st level of telephone or email support for Microsoft will grasp it. So the only advice I expect from there is something like "Please reboot your computer to see if it's gone" or similar. No hope, unless you know someone inside the system ... |
Description of the Issue
In a Notepad++ document that is encoded as UTF-8 (no BOM), many Unicode characters are not displayed, but the hollow square appears in their place. If a displayable Unicode character is added to a line containing undisplayable Unicode characters, those undisplayable ones suddenly appear. Removing the "good" one makes the others revert to the hollow square. A simple example:
☆◬⊗⊠⋆⧆⨂
Paste that line into NP++ and you will see all the characters. Remove the leading star ☆ and the others become squares. Restore the star and the others re-appear.
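To check whether a given font actually contains these characters (for example in the Windows character map), it helps to know their code points. A small sketch that lists them for the sample line, using only the Python standard library; the function name `describe` is hypothetical.

```python
import unicodedata

SAMPLE = "☆◬⊗⊠⋆⧆⨂"  # the example line from this report


def describe(text):
    """Return (code point, Unicode name) pairs for each character."""
    return [
        ("U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


for code, name in describe(SAMPLE):
    print(code, name)
```

All seven characters are outside ASCII, so whether they render depends on the selected font or on whatever fallback font Windows substitutes.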
Steps to Reproduce the Issue
Expected Behavior
All of the characters always should appear.
Actual Behavior
They only appear if an always-acceptable Unicode character is on the same line. If an always-acceptable Unicode character is in the document but not on the same line, certain Unicode characters, such as, but not limited to, the ones shown above, will not be displayed properly.
Debug Information
Notepad++ v7.5.1 (32-bit)
Build time : Aug 29 2017 - 02:35:41
Path : C:\Program Files (x86)\Notepad++\notepad++.exe
Admin mode : OFF
Local Conf mode : OFF
OS : Windows 10 (64-bit)
Plugins : ComparePlugin.dll mimeTools.dll NppConverter.dll NppExport.dll NppFTP.dll NppTextFX.dll PluginManager.dll SpellChecker.dll
This occurs with characters from many of the Unicode blocks.