[WIP] Route Tibetan through USE #933
src/gen-use-table.py
Outdated
@@ -338,6 +338,9 @@ def map_to_use(data):
    if 0x1CF2 <= U <= 0x1CF3: UIPC = Right
    if 0x1CF8 <= U <= 0x1CF9: UIPC = Top

    if not (UIPC in [Not_Applicable, Visual_Order_Left] or USE in use_positions):
        print ("%s %s %s %s %s" % (hex(U), UIPC, USE, UISC, UGC), file=sys.stderr)
        continue
I still need to figure out why this check is failing; there must be a reason for it.
0xf18 Bottom O Other Mn
0xf19 Bottom O Other Mn
0xf3e Right O Other Mc
0xf3f Left O Other Mc
0xf86 Top O Other Mn
0xf87 Top O Other Mn
It’s failing because Unicode assigns each of those characters an Indic_Positional_Category but not an Indic_Syllabic_Category.
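The failure can be reproduced with a simplified sketch of the check from the diff. The `FAILING` table, the `USE_POSITIONS` set, and the `needs_report` helper are hypothetical stand-ins for this illustration, not the actual data structures in `gen-use-table.py`; the six entries are copied from the output quoted above.

```python
# Hypothetical stand-in data: (UIPC, USE, UISC, UGC) per code point,
# copied from the stderr output quoted in the comment above.
FAILING = {
    0x0F18: ("Bottom", "O", "Other", "Mn"),
    0x0F19: ("Bottom", "O", "Other", "Mn"),
    0x0F3E: ("Right", "O", "Other", "Mc"),
    0x0F3F: ("Left", "O", "Other", "Mc"),
    0x0F86: ("Top", "O", "Other", "Mn"),
    0x0F87: ("Top", "O", "Other", "Mn"),
}

# Assumption: an illustrative set of USE classes that accept a position.
# The real script keeps its own `use_positions` table.
USE_POSITIONS = {"F", "M", "CM", "V", "VM", "SM"}


def needs_report(uipc, use):
    """Mirror the condition in the diff: a character is reported when it
    has a real position but its USE class does not take one."""
    return not (uipc in ("Not_Applicable", "Visual_Order_Left")
                or use in USE_POSITIONS)


reported = [hex(u) for u, (uipc, use, _uisc, _ugc) in sorted(FAILING.items())
            if needs_report(uipc, use)]
print(reported)
```

With no Indic_Syllabic_Category, each character falls through to `Other` and then to USE class `O`, so all six are reported in this sketch.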
Any hint about what should be done here? :)
Set UISC for these characters after line 297.
This breaks clusters with multiple vowels like ⟨ཀིུ⟩: it needs the USE’s Tibetan overrides.
src/gen-use-table.py
Outdated
@@ -293,6 +293,9 @@ def map_to_use(data):

    # Resolve Indic_Syllabic_Category

    # TODO: There are Tibetan signs that don't have UISC
    if U in [0xF18, 0xF19, 0xF3E, 0xF3F, 0xF86, 0xf87]: UISC = Cantillation_Mark
Always use at least four digits for Unicode hex values. Also, the last one has a lowercase 'f' while the others are uppercase.
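As a small illustration of that convention, the list could be written with four-digit uppercase literals, and Python's `%04X` format prints code points the same way. The names `TIBETAN_SIGNS_WITHOUT_UISC` and `format_codepoint` are hypothetical and used only for this sketch.

```python
# The six code points from the diff, written with four uppercase hex digits.
TIBETAN_SIGNS_WITHOUT_UISC = [0x0F18, 0x0F19, 0x0F3E, 0x0F3F, 0x0F86, 0x0F87]


def format_codepoint(u):
    """Return U+XXXX notation, zero-padded to at least four hex digits."""
    return "U+%04X" % u


print([format_codepoint(u) for u in TIBETAN_SIGNS_WITHOUT_UISC])
```

The numeric values are unchanged by the padding; `0xF18 == 0x0F18` in Python, so this is purely a readability convention.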
Sure. Can you also help find the correct syllabic category for those? They are just signs.
At least for U+0F3F to be moved to the left, it has to be marked as a matra or a medial consonant. If it's neither, it won't be moved.
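A toy model of that rule, for illustration only: this is not HarfBuzz's reordering code, and the category strings, the `MOVABLE` set, and `reorder_cluster` are assumptions made up for this sketch.

```python
# Toy model only. Categories are illustrative strings; "matra" is modelled
# as Vowel_Dependent and "medial consonant" as Consonant_Medial.
MOVABLE = {"Vowel_Dependent", "Consonant_Medial"}


def reorder_cluster(cluster):
    """cluster: list of (char, category, position) tuples in logical order,
    base first. Left-positioned marks move in front of the base only if
    their category is in MOVABLE; everything else keeps its order."""
    moved = [g for g in cluster if g[2] == "Left" and g[1] in MOVABLE]
    rest = [g for g in cluster if not (g[2] == "Left" and g[1] in MOVABLE)]
    return moved + rest


base = ("\u0F40", "Consonant", "Not_Applicable")
as_other = ("\u0F3F", "Other", "Left")            # current categorization
as_matra = ("\u0F3F", "Vowel_Dependent", "Left")  # hypothetical override

print(reorder_cluster([base, as_other]) == [base, as_other])  # stays in place
print(reorder_cluster([base, as_matra]) == [as_matra, base])  # moved left
```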
@PeterCon Peter, can you double check the overrides for Tibetan against the list posted on the spec page? Better yet, can you refresh the override list for all scripts?
src/gen-use-table.py
Outdated
@@ -338,9 +341,6 @@ def map_to_use(data):
    if 0x1CF2 <= U <= 0x1CF3: UIPC = Right
    if 0x1CF8 <= U <= 0x1CF9: UIPC = Top

    if not (UIPC in [Not_Applicable, Visual_Order_Left] or USE in use_positions):
Why this change?
Oh that was just added, I probably should squash the changes.
Yeah, I looked at the squashed diff and it looks good.
src/hb-ot-shape-complex-private.hh
Outdated
@@ -233,7 +232,7 @@ hb_ot_shape_complex_categorize (const hb_ot_shape_planner_t *planner)
    /* Unicode-2.0 additions */
    case HB_SCRIPT_TIBETAN:
This case should be removed and the other, commented-out TIBETAN case uncommented.
Thanks. See comments.
Also, did you test the reported bug at all?
Yep, and IIRC this on its own wasn't fixing it. I was considering this as just a starting point...

There was no response on what to do with the signs, and I don't think I can get to this soon, so it's probably better to close this for now.

It's still a valid issue and we want to fix it. Please don't close valid issues and PRs just because of inactivity.

Well, I thought I couldn't get to them soon and that it was better to focus on a limited set, but if you want to, sure :)
Hopefully this will fix #861.