Collaborator

@ebraminio ebraminio commented Mar 28, 2018

Hopefully to fix #861

@@ -338,6 +338,9 @@ def map_to_use(data):
if 0x1CF2 <= U <= 0x1CF3: UIPC = Right
if 0x1CF8 <= U <= 0x1CF9: UIPC = Top

if not (UIPC in [Not_Applicable, Visual_Order_Left] or USE in use_positions):
    print ("%s %s %s %s %s" % (hex(U), UIPC, USE, UISC, UGC), file=sys.stderr)
    continue

Collaborator Author

@ebraminio ebraminio Mar 28, 2018

I should figure out why this is failing; there must be some reason for it.

Collaborator

0xf18 Bottom O Other Mn
0xf19 Bottom O Other Mn
0xf3e Right O Other Mc
0xf3f Left O Other Mc
0xf86 Top O Other Mn
0xf87 Top O Other Mn

It’s failing because Unicode assigns each of those characters an Indic_Positional_Category but not an Indic_Syllabic_Category.
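
As a rough sketch of that condition (a standalone illustration, not the actual gen-use-table.py logic; the `data` dictionary and the helper name `missing_uisc` are made up for this example), the flagged characters are the ones that have a positional category while their syllabic category is still Other:

```python
# Hypothetical sample mirroring the stderr output above: code point -> (UIPC, UISC).
data = {
    0x0F18: ("Bottom", "Other"),
    0x0F19: ("Bottom", "Other"),
    0x0F3E: ("Right", "Other"),
    0x0F3F: ("Left", "Other"),
    0x0F86: ("Top", "Other"),
    0x0F87: ("Top", "Other"),
    # Contrast row (assumed values): a character that has both properties.
    0x0F72: ("Top", "Vowel_Dependent"),
}

def missing_uisc(data):
    """Return code points that have a position but no syllabic category."""
    return sorted(
        cp for cp, (uipc, uisc) in data.items()
        if uipc != "Not_Applicable" and uisc == "Other"
    )

for cp in missing_uisc(data):
    print("U+%04X" % cp)
```

Only the six characters listed above are reported; the contrast row passes because it carries a syllabic category.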

Collaborator Author

@ebraminio ebraminio Mar 28, 2018

Any hint on what should be done here? :)

Collaborator

Set UISC for these characters after line 297.

@dscorbett
Collaborator

This breaks clusters with multiple vowels like ⟨ཀིུ⟩: it needs the USE’s Tibetan overrides.

@coveralls

coveralls commented Mar 28, 2018

Coverage increased (+0.07%) to 66.477% when pulling 84170e9 on ebraminio:tibetan into a48dd6e on harfbuzz:master.

@@ -293,6 +293,9 @@ def map_to_use(data):

# Resolve Indic_Syllabic_Category

# TODO: There are Tibetan signs that don't have UISC
if U in [0xF18, 0xF19, 0xF3E, 0xF3F, 0xF86, 0xf87]: UISC = Cantillation_Mark

Member

Always minimum four digits for Unicode hex values. Also, last one has lowercase 'f' while others are uppercase.
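
For illustration, the list written with that convention might look like the following. The variable and helper names are hypothetical and not taken from the script; the helper shows one way to print code points zero-padded to at least four uppercase hex digits:

```python
# Hypothetical name; the same six code points as in the diff,
# each written with four uppercase hex digits.
TIBETAN_SIGNS_WITHOUT_UISC = [0x0F18, 0x0F19, 0x0F3E, 0x0F3F, 0x0F86, 0x0F87]

def format_codepoint(cp):
    """Format a code point as U+XXXX, zero-padded to at least four digits."""
    return "U+%04X" % cp

print(" ".join(format_codepoint(cp) for cp in TIBETAN_SIGNS_WITHOUT_UISC))
```

Python treats `0xF18` and `0x0F18` as the same integer, so the convention only affects readability and consistency with Unicode notation.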

Collaborator Author

@ebraminio ebraminio Mar 28, 2018

Sure. Can you also help with finding the correct syllabic category for those? They are just signs.

Member

Let's ask @jfkthame @PeterCon
Typically those need to be discussed on an issue by themselves.

Member

At least for U+0F3F to be moved to the left, it has to be marked as a matra or a medial consonant. If it's neither, it won't be moved.
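
A toy model of that rule, assuming a much simplified cluster representation: the category labels and the function below are illustrative only and are not HarfBuzz identifiers or its actual reordering code.

```python
# Toy model: each cluster item is (code point, category).
# The category names are illustrative labels, not HarfBuzz identifiers.
REORDERABLE = {"matra", "medial_consonant"}

def move_left_positioned(cluster, left_positioned):
    """Move left-positioned marks to the front only if their category allows it."""
    moved = [item for item in cluster
             if item[0] in left_positioned and item[1] in REORDERABLE]
    rest = [item for item in cluster if item not in moved]
    return moved + rest

base = (0x0F40, "base")   # TIBETAN LETTER KA
sign = 0x0F3F             # positioned Left by Unicode

# Categorized as a plain sign: nothing moves.
print(move_left_positioned([base, (sign, "other")], {sign}))
# Categorized as a matra: it moves before the base.
print(move_left_positioned([base, (sign, "matra")], {sign}))
```

In this sketch the positional property alone is not enough; the category assigned to the character decides whether any reordering happens.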

Member

@PeterCon Peter, can you double check the overrides for Tibetan against the list posted on the spec page? Better yet, can you refresh the override list for all scripts?

@@ -338,9 +341,6 @@ def map_to_use(data):
if 0x1CF2 <= U <= 0x1CF3: UIPC = Right
if 0x1CF8 <= U <= 0x1CF9: UIPC = Top

if not (UIPC in [Not_Applicable, Visual_Order_Left] or USE in use_positions):

Member

Why this change?

Collaborator Author

Oh that was just added, I probably should squash the changes.

Member

Yeah, looked at the squashed diff and looks good.

@@ -233,7 +232,7 @@ hb_ot_shape_complex_categorize (const hb_ot_shape_planner_t *planner)
/* Unicode-2.0 additions */
case HB_SCRIPT_TIBETAN:

Member

This case should be removed and the other, commented-out TIBETAN case uncommented.

Member

@behdad behdad left a comment

Thanks. See comments.

@behdad
Member

behdad commented Mar 28, 2018

Also, did you test the reported bug at all?

@ebraminio
Collaborator Author

ebraminio commented Mar 28, 2018

> Also, did you test the reported bug at all?

Yep, and this on its own wasn't fixing it, IIRC. I was considering this just a starting point...

@ebraminio
Collaborator Author

There has been no response on what to do with the signs, and I guess I can't get to this soon, so it is probably better to close this for now.

@ebraminio ebraminio closed this Apr 3, 2018
@behdad
Member

behdad commented Apr 5, 2018

It's still a valid issue and we want to fix it. Please don't close valid issues and PRs just because of inactivity.

@behdad behdad reopened this Apr 5, 2018
@ebraminio
Collaborator Author

Well, I thought I couldn't get to them soon and that it would be better to focus on a limited few, but if you want to, sure :)

@behdad behdad closed this in 32a4381 Oct 2, 2018
Successfully merging this pull request may close these issues.

Tibetan reordering does not match Uniscribe