Problem with diacritics #478

Open
sorawee opened this issue Apr 4, 2021 · 7 comments

@sorawee
Contributor

sorawee commented Apr 4, 2021

DrRacket's code editor doesn't display Thai diacritics correctly (and probably has the same problem in other languages that use diacritics).

Screen Shot 2021-04-04 at 11 08 52 PM

Here's how it should be displayed:


(กำหนด ความกว้าง 500)


FWIW, Emacs is able to display it correctly.

Screen Shot 2021-04-04 at 11 13 36 PM

@mbutterick's quad used to have an issue with diacritics too (though it's a different problem), so let me @ you in case you have an idea what could go wrong.

@rfindler
Member

rfindler commented Apr 4, 2021

I am guessing this is an issue with either text% or perhaps the drawing libraries (accessed via, eg, canvas-dc%), but maybe on a non-mac platform? Or maybe a specific font? (It looks okay to me.)

Here's some code that might reproduce the issue outside of DrRacket (if it isn't a font-specific issue).

#lang racket/gui
(define s "กำหนด ความกว้าง")
(define t (new text%))
(define f (new frame% [label ""][width 300] [height 300]))
(define ec (new editor-canvas% [parent f] [editor t]))
(send t insert s)
(send f show #t)

@sorawee
Contributor Author

sorawee commented Apr 4, 2021

Sorry, should have mentioned that I'm on Mac. The program that you provided above does reproduce the issue, though weirdly, "กำ" is now displayed correctly! "กว้าง" is still incorrect however.

Screen Shot 2021-04-05 at 6 18 40 AM

This is not a font-specific issue IIUC. Even with the font TH Sarabun New (the standard font for Thai script), the issue persists in DrRacket.

Screen Shot 2021-04-05 at 6 21 04 AM

Here's how it displays in word processing software.

Screen Shot 2021-04-05 at 6 21 44 AM

@97jaz

97jaz commented Apr 5, 2021

I think the problem is more generally with Unicode combining characters:

#lang racket/base

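;; #\e followed by U+0301 (COMBINING ACUTE ACCENT): a decomposed "é"
(define chars '(#\e #\u0301))
(displayln chars)
(displayln (list->string chars))
(newline)

;; NFC normalization composes the pair into the single character U+00E9
(define precomposed-chars
  ((compose string->list string-normalize-nfc list->string)
   chars))
(displayln precomposed-chars)
(displayln (list->string precomposed-chars))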
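
As a quick check of what the snippet above demonstrates, here is a minimal sketch that compares the decomposed and precomposed forms directly instead of relying on how they are displayed. The names `decomposed` and `precomposed` are illustrative only.

```racket
#lang racket/base

;; "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points
(define decomposed (string #\e #\u0301))

;; NFC composes the pair into U+00E9: one code point
(define precomposed (string-normalize-nfc decomposed))

(string-length decomposed)                               ; 2
(string-length precomposed)                              ; 1
(string=? decomposed precomposed)                        ; #f
(string=? (string-normalize-nfd precomposed) decomposed) ; #t
```

Both strings should look the same when rendered correctly, so a visible difference between them points at the drawing layer, not at the string contents.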

@97jaz

97jaz commented Apr 5, 2021

Related? racket/draw#22
According to a comment in that issue, DrRacket always passes #f for the combine? argument to the draw-text method of dc<%>. The code also has this comment: https://github.com/racket/draw/blob/a4e156abe5119309783443495d671b9a7f3e434b/draw-lib/racket/draw/private/dc.rkt#L1493
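
To see whether combine? matters for this text outside of DrRacket, something like the following sketch might help. It draws the same string into offscreen bitmaps with combine? set to #t and to #f, and measures it both ways. The helper names `render` and `text-width` and the bitmap size are assumptions made for illustration; this is not how DrRacket itself draws text.

```racket
#lang racket/base
(require racket/class racket/draw)

(define s "กำหนด ความกว้าง")

;; Draw `s` into a fresh offscreen bitmap, passing `combine?` to draw-text.
(define (render combine?)
  (define bm (make-bitmap 300 60))
  (define dc (new bitmap-dc% [bitmap bm]))
  (send dc draw-text s 10 10 combine?)
  bm)

;; Measure the width of `s` with the dc's default font.
(define (text-width combine?)
  (define dc (new bitmap-dc% [bitmap (make-bitmap 1 1)]))
  (define-values (w h descent extra) (send dc get-text-extent s #f combine?))
  w)

(define combined (render #t))
(define uncombined (render #f))
```

Saving both bitmaps, for example with `(send combined save-file "combined.png" 'png)`, and comparing `(text-width #t)` with `(text-width #f)` would show whether the two modes lay out the diacritics differently on a given platform and font.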

@sorawee
Contributor Author

sorawee commented Jan 12, 2022

In the latest version of DrRacket, things are somewhat reversed. Running @rfindler's program, I get:

Screen Shot 2022-01-11 at 5 35 29 PM

where กำ, which consists of two characters, ก (U+0E01) and ำ (U+0E33), is displayed without the circle on top of ำ. Note though that กว้าง is now displayed correctly.

It's somewhat weird, because this display problem only occurs when I choose not to "normalize" when pasting the code in. If I normalize, I do get the desired display, but now กำ becomes 3 characters: ก (U+0E01), ํ (U+0E4D), and า (U+0E32), which is incorrect in Thai. ำ (U+0E33) is one character, and is not equivalent to ํ + า.

Screen Shot 2022-01-11 at 5 40 17 PM
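
The two-versus-three character difference can be reproduced from the REPL. The sketch below assumes that the paste-time "normalize" option behaves like compatibility normalization (NFKC), which matches the three-character result described above; I have not confirmed which function DrRacket actually calls. The helper `code-points` is illustrative only.

```racket
#lang racket/base

;; กำ as typed: U+0E01 (KO KAI) followed by U+0E33 (SARA AM)
(define kam "\u0E01\u0E33")

;; List each character of a string as a zero-padded "U+XXXX" code point.
(define (code-points str)
  (for/list ([c (in-string str)])
    (define hex (string-upcase (number->string (char->integer c) 16)))
    (define padding (make-string (max 0 (- 4 (string-length hex))) #\0))
    (string-append "U+" padding hex)))

(code-points kam)                          ; '("U+0E01" "U+0E33")
(code-points (string-normalize-nfc kam))   ; '("U+0E01" "U+0E33")
(code-points (string-normalize-nfkc kam))  ; '("U+0E01" "U+0E4D" "U+0E32")
```

Canonical normalization (NFC) leaves SARA AM alone, while the compatibility forms split it into NIKHAHIT followed by SARA AA, so a paste option based on NFC would keep กำ at two characters.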

@sorawee
Contributor Author

sorawee commented Jul 18, 2022

I wanted to try this again after the recent Unicode change, and noticed a couple more issues (which already existed before that change).

Steps to reproduce:

  • Paste (ความกว้าง 500) into DrRacket. Notice that the number 500 is not syntax-highlighted correctly. (Screen Shot 2022-07-17 at 10 11 58 PM)
  • Move the caret to the right parenthesis, and press the left arrow key multiple times. The caret gets stuck at the end of "ความกว้าง".

mflatt added a commit to racket/snip that referenced this issue Jul 18, 2022
@mflatt
Member

mflatt commented Jul 18, 2022

The problem with (ความกว้าง 500) should be fixed by the snip-lib commit.
