Some halfwidth chars were not properly transliterated #1

Open
fanweihua opened this issue Sep 20, 2016 · 6 comments

Comments

@fanweihua

I tested the Java API with the halfwidth string "ｹﾞｯﾄ"; the expected result is "ゲット". The actual output has two problems. First, the voiced sound mark ゛ was not combined with ケ into the single character ゲ. Second, the small ッ was not converted at all.
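
For reference, the expected composition can be checked independently of this library with the JDK's `java.text.Normalizer`. This is only a minimal sketch: the class and method names (`NfkcKanaDemo`, `toFullwidthKana`) are made up for illustration, and NFKC is a different technique from the library's own lookup table.

```java
import java.text.Normalizer;

final class NfkcKanaDemo {
    // Halfwidth ke, halfwidth voiced mark, halfwidth small tsu, halfwidth to:
    // U+FF79 U+FF9E U+FF6F U+FF84.
    static final String HALFWIDTH = "\uFF79\uFF9E\uFF6F\uFF84";

    // NFKC maps halfwidth katakana to fullwidth and composes the voiced mark
    // with the preceding kana when a precomposed character exists.
    static String toFullwidthKana(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        System.out.println(toFullwidthKana(HALFWIDTH));
    }
}
```

The result is the three-character string "ゲット". Note that NFKC also folds fullwidth ASCII back to plain ASCII, so it is not a drop-in replacement for toZenkakuCase, which converts ASCII in the opposite direction.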

@andywork

andywork commented Feb 5, 2018

One more thing to add, sorry.
「ャ」, 「ュ」 and 「ョ」 are not converted to halfwidth. This is with the AS3 version.
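
A missing conversion like this usually means the small kana have no entry in the fullwidth-to-halfwidth table. The sketch below shows the code points involved, written in Java rather than AS3. The class `SmallKanaToHalfwidth` and its map are hypothetical and are not the library's actual table.

```java
import java.util.HashMap;
import java.util.Map;

final class SmallKanaToHalfwidth {
    // Hypothetical table covering only the small kana discussed in this thread.
    private static final Map<Character, Character> SMALL = new HashMap<>();
    static {
        SMALL.put('\u30E3', '\uFF6C'); // small ya
        SMALL.put('\u30E5', '\uFF6D'); // small yu
        SMALL.put('\u30E7', '\uFF6E'); // small yo
        SMALL.put('\u30C3', '\uFF6F'); // small tsu
    }

    // Replaces the mapped small kana and leaves every other character unchanged.
    static String convert(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            char mapped = SMALL.getOrDefault(c, c);
            out.append(mapped);
        }
        return out.toString();
    }
}
```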

@chengstone

chengstone commented Sep 13, 2018

Change the function toZenkakuCase to the following code:

public static String toZenkakuCase(String str)
{
    int f = str.length();
    StringBuilder buffer = new StringBuilder(str);

    for(int i=0;i<f;i++)
    {
        char c = str.charAt(i);

        if(H2Z.containsKey(c)){
            buffer.setCharAt(i, H2Z.get(c));
        } else if(c == 0x0020){
            buffer.setCharAt(i, '\u3000');
        } else if(c <= 0x007E && 0x0021 <= c) {
            buffer.setCharAt(i, (char)(c + 0xFEE0));
        }
    }

    str=buffer.toString();
    f = str.length();
    buffer = new StringBuilder(str);
    for(int i=0;i<f;i++)
    {
        char c = buffer.charAt(i); // read from buffer: deleteCharAt shifts its indices away from str
        if ((0x304B <= c && c <= 0x3062 && (c % 2 == 1)) ||
                (0x30AB <= c && c <= 0x30C2 && (c % 2 == 1)) ||
                (0x3064 <= c && c <= 0x3069 && (c % 2 == 0)) ||
                (0x30C4 <= c && c <= 0x30C9 && (c % 2 == 0))) {
            char d = buffer.charAt(i + 1);
            buffer.setCharAt(i, (char) (c + ((d == '\u309B') ? 1 : 0)));
            if (c != buffer.charAt(i)) {
                buffer = buffer.deleteCharAt(i + 1);
                f--;
            }
            continue;
        }

        if ((0x306F <= c && c <= 0x307D && (c % 3 == 0)) ||
                (0x30CF <= c && c <= 0x30DD && (c % 3 == 0))) {
            char d = buffer.charAt(i + 1);
            buffer.setCharAt(i, (char) (c + ((d == '\u309B') ? 1 : ((d == '\u309C') ? 2 : 0))));
            if (c != buffer.charAt(i)) {
                buffer = buffer.deleteCharAt(i + 1);
                f--;
            }

            continue;
        }
    }

    return buffer.toString();
};

@andywork

Thanks for the correction!

@chengstone

chengstone commented Sep 18, 2018

Sorry, after revising the code last week I found a new bug.
Here is the revised version:

public static String toZenkakuCase(String str)
{
    int f = str.length();
    StringBuilder buffer = new StringBuilder(str);

    for(int i=0;i<f;i++)
    {
        char c = str.charAt(i);

        if(H2Z.containsKey(c)){
            buffer.setCharAt(i, H2Z.get(c));
        } else if(c == 0x0020){
            buffer.setCharAt(i, '\u3000');
        } else if(c <= 0x007E && 0x0021 <= c) {
            buffer.setCharAt(i, (char)(c + 0xFEE0));
        }
    }

    str=buffer.toString();
    f = str.length();
    buffer = new StringBuilder(str);
    for(int i=0;i<f;i++)
    {
        char c = buffer.charAt(i); // read from buffer: deleteCharAt shifts its indices away from str
        if ((0x304B <= c && c <= 0x3062 && (c % 2 == 1)) ||
                (0x30AB <= c && c <= 0x30C2 && (c % 2 == 1)) ||
                (0x3064 <= c && c <= 0x3069 && (c % 2 == 0)) ||
                (0x30C4 <= c && c <= 0x30C9 && (c % 2 == 0))) {
            if(i + 1 < buffer.length()){
                char d = buffer.charAt(i + 1);
                buffer.setCharAt(i, (char) (c + ((d == '\u309B') ? 1 : 0)));
                if (c != buffer.charAt(i)) {
                    buffer = buffer.deleteCharAt(i + 1);
                    f--;
                }
                continue;
            }
        }

        if ((0x306F <= c && c <= 0x307D && (c % 3 == 0)) ||
                (0x30CF <= c && c <= 0x30DD && (c % 3 == 0))) {
            if(i + 1 < buffer.length()){
                char d = buffer.charAt(i + 1);
                buffer.setCharAt(i, (char) (c + ((d == '\u309B') ? 1 : ((d == '\u309C') ? 2 : 0))));
                if (c != buffer.charAt(i)) {
                    buffer = buffer.deleteCharAt(i + 1);
                    f--;
                }

                continue;
            }
        }
    }

    return buffer.toString();
};

@andywork

Thank you. I will use the new version.

@chengstone

I added the check "if(i + 1 < buffer.length()){".
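
The guard matters when a combinable kana is the last character of the string: without it, the lookahead charAt(i + 1) throws StringIndexOutOfBoundsException. Below is a minimal, self-contained sketch of the combining pass with the same guard. It assumes, as the code above does, that the first pass has already produced fullwidth kana followed by the spacing marks U+309B and U+309C. The class `VoicedMarkCombiner` and its methods are hypothetical names. It also appends to a new builder instead of deleting in place, which is a different design from the posted code and avoids index drift between the input and the output.

```java
final class VoicedMarkCombiner {
    private static final char DAKUTEN = '\u309B';    // spacing voiced sound mark
    private static final char HANDAKUTEN = '\u309C'; // spacing semi-voiced sound mark

    // Kana whose voiced form is the next code point (ka..chi and tsu..to).
    static boolean takesDakutenOnly(char c) {
        return (0x304B <= c && c <= 0x3062 && c % 2 == 1)
            || (0x30AB <= c && c <= 0x30C2 && c % 2 == 1)
            || (0x3064 <= c && c <= 0x3069 && c % 2 == 0)
            || (0x30C4 <= c && c <= 0x30C9 && c % 2 == 0);
    }

    // The ha row: +1 is the voiced form, +2 the semi-voiced form.
    static boolean isHaRow(char c) {
        return (0x306F <= c && c <= 0x307D && c % 3 == 0)
            || (0x30CF <= c && c <= 0x30DD && c % 3 == 0);
    }

    static String combine(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Bounds check: only look ahead when a next character exists.
            char next = (i + 1 < s.length()) ? s.charAt(i + 1) : 0;
            int shift = 0;
            if (next == DAKUTEN && (takesDakutenOnly(c) || isHaRow(c))) {
                shift = 1;
            } else if (next == HANDAKUTEN && isHaRow(c)) {
                shift = 2;
            }
            out.append((char) (c + shift));
            if (shift != 0) {
                i++; // skip the mark that was just merged
            }
        }
        return out.toString();
    }
}
```

With this sketch, an input of ケ, U+309B, ッ, ト yields "ゲット", two consecutive voiced syllables are both combined, and a trailing カ with nothing after it is returned unchanged instead of throwing.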
