This repository has been archived by the owner on Jul 22, 2022. It is now read-only.

Unicode IVS handling #35

Closed
miurahr opened this issue Mar 22, 2020 · 7 comments

@miurahr
Owner

miurahr commented Mar 22, 2020

Describe the bug
When a string contains an IVS (Ideographic Variation Sequence: a base ideograph followed by a variation selector), unihandecode does not convert it correctly.


To Reproduce
The following test fails.

import unihandecode


def test_ivs():
    # U+845B followed by a variation selector (U+E0100 or U+E0101)
    s1 = "\U0000845B\U000E0100飾区"
    s2 = "\U0000845B\U000E0101城"
    u = unihandecode.Unihandecoder('ja')
    r1 = u.decode(s1)
    r2 = u.decode(s2)
    assert r1.startswith('katsu')
    assert r2.startswith('katsura')
    assert r1 == 'katsushika ku'
    assert r2 == 'katsuragi'

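\U0000845B\U000E0100飾区 is a variant of 葛飾区, and \U0000845B\U000E0101城 is a variant of 葛城, which appears in the place name 葛城市. In each case the base character 葛 (U+845B) is followed by a variation selector (U+E0100 or U+E0101).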
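To make the structure of these strings concrete, here is a minimal sketch using only the Python standard library. It is not part of unihandecode: `strip_ivs` is a hypothetical helper introduced for illustration, and it assumes that dropping code points in the Variation Selectors Supplement block (U+E0100 to U+E01EF) is acceptable, which loses the glyph distinction the selector encodes.

```python
# Variation Selectors Supplement block, used by Ideographic Variation Sequences.
IVS_FIRST = 0xE0100
IVS_LAST = 0xE01EF


def strip_ivs(text: str) -> str:
    """Return text with IVS variation selectors removed (hypothetical helper)."""
    return "".join(ch for ch in text if not IVS_FIRST <= ord(ch) <= IVS_LAST)


s1 = "\U0000845B\U000E0100飾区"
s2 = "\U0000845B\U000E0101城"

# Each string carries one extra code point: the variation selector.
print([f"U+{ord(ch):04X}" for ch in s1])
print(len(s1), len(strip_ivs(s1)))  # 4 3

# After stripping, only the base characters remain.
print(strip_ivs(s1) == "\U0000845B飾区")
print(strip_ivs(s2) == "\U0000845B城")
```

Whether stripping selectors before decoding is the right fix for this issue is an open question; the sketch only shows where the extra code point sits in the input.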

Expected behavior

The test passes.

Environment (please complete the following information):
master head: 40b419d839a028348427fc0eab959f64119a32c0

@miurahr
Owner Author

miurahr commented Mar 22, 2020

In the Unicode Standard chart, these are defined as:

image


@miurahr
Owner Author

miurahr commented Mar 30, 2020

The check on r1 fails with AssertionError: assert 'Katsushikaku Ku' == 'Katsushikaku'

@github-actions

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@github-actions

github-actions bot commented Jun 9, 2020

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

@miurahr
Owner Author

miurahr commented Jun 9, 2020

This case should be handled by pykakasi instead: https://github.com/miurahr/pykakasi

@github-actions

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days
