small issue with combine Tashkeel. #31

Open
MohHeader opened this issue Oct 22, 2018 · 7 comments

Comments

@MohHeader
Contributor

MohHeader commented Oct 22, 2018

Another edit (4 Feb 2019)

Edit (19 Dec 2018)
The function has been updated to fix some bugs.

Hi :)

I found a bug in the code I submitted: if a word contains a Shadda (شدة) and the previous letter has a tashkeel, the Shadda is assigned to the previous letter instead of the correct one.

I still have to fix it properly, but if anyone is looking for a fix before a merge request is created, just replace the RemoveTashkeel method:

internal static string RemoveTashkeel(string str, out List<TashkeelLocation> tashkeelLocation)
    {
        tashkeelLocation = new List<TashkeelLocation>();
        char[] letters = str.ToCharArray();

        int index = 0;
        bool lastWasLetter = true;
        int combined = 0;
        for (int i = 0; i < letters.Length; i++)
        {
            bool currentIsLetter = false;
            if (letters[i] == (char)0x064B)
            { // Tanween Fatha
                tashkeelLocation.Add(new TashkeelLocation((char)0x064B, i - combined));
                index++;
            }
            else if (letters[i] == (char)0x064C)
            { // Tanween Damma
                if (index > 0 && combineTashkeel && lastWasLetter == false)
                {
                    if (tashkeelLocation[index - 1].tashkeel == (char)0x0651) // Shadda
                    {
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC5E, i - 1 - combined); // Shadda With Tanween Damma
                        combined++;
                        continue;
                    }
                }
                tashkeelLocation.Add(new TashkeelLocation((char)0x064C, i - combined));
                index++;
            }
            else if (letters[i] == (char)0x064D)
            { // Tanween Kasra
                if (index > 0 && combineTashkeel && lastWasLetter == false)
                {
                    if (tashkeelLocation[index - 1].tashkeel == (char)0x0651) // Shadda
                    {
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC5F, i - 1 - combined); // Shadda With Tanween Kasra
                        combined++;
                        continue;
                    }
                }
                tashkeelLocation.Add(new TashkeelLocation((char)0x064D, i - combined));
                index++;
            }
            else if (letters[i] == (char)0x064E)
            { // Fatha
                if (index > 0 && combineTashkeel && lastWasLetter == false)
                {
                    if (tashkeelLocation[index - 1].tashkeel == (char)0x0651) // Shadda
                    {
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC60, i - 1 - combined); // Shadda With Fatha
                        combined++;
                        continue;
                    }
                }

                tashkeelLocation.Add(new TashkeelLocation((char)0x064E, i - combined));
                index++;
            }
            else if (letters[i] == (char)0x064F)
            { // DAMMA
                if (index > 0 && combineTashkeel && lastWasLetter == false)
                {
                    if (tashkeelLocation[index - 1].tashkeel == (char)0x0651)
                    { // SHADDA
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC61, i - 1 - combined); // Shadda With DAMMA
                        combined++;
                        continue;
                    }
                }
                tashkeelLocation.Add(new TashkeelLocation((char)0x064F, i - combined));
                index++;
            }
            else if (letters[i] == (char)0x0650)
            { // KASRA
                if (index > 0 && combineTashkeel && lastWasLetter == false)
                {
                    if (tashkeelLocation[index - 1].tashkeel == (char)0x0651)
                    { // SHADDA
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC62, i - 1 - combined); // Shadda With KASRA
                        combined++;
                        continue;
                    }
                }
                tashkeelLocation.Add(new TashkeelLocation((char)0x0650, i - combined));
                index++;
            }
            else if (letters[i] == (char)0x0651)
            { // SHADDA
                if (index > 0 && combineTashkeel && lastWasLetter == false)
                {
                    if (tashkeelLocation[index - 1].tashkeel == (char)0x064E) // FATHA
                    {
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC60, i - 1 - combined); // Shadda With Fatha
                        combined++;
                        continue;
                    }

                    if (tashkeelLocation[index - 1].tashkeel == (char)0x064F) // DAMMA
                    {
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC61, i - 1 - combined); // Shadda With DAMMA
                        combined++;
                        continue;
                    }

                    if (tashkeelLocation[index - 1].tashkeel == (char)0x0650) // KASRA
                    {
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC62, i - 1 - combined); // Shadda With KASRA
                        combined++;
                        continue;
                    }

                    if (tashkeelLocation[index - 1].tashkeel == (char)0x064D) // Tanween Kasra
                    {
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC5F, i - 1 - combined); // Shadda With Tanween Kasra
                        combined++;
                        continue;
                    }

                    if (tashkeelLocation[index - 1].tashkeel == (char)0x064C) // Tanween Damma
                    {
                        tashkeelLocation[index - 1] = new TashkeelLocation((char)0xFC5E, i - 1 - combined); // Shadda With Tanween Damma
                        combined++;
                        continue;
                    }
                }

                tashkeelLocation.Add(new TashkeelLocation((char)0x0651, i - combined));
                index++;
            }
            else if (letters[i] == (char)0x0652)
            { // SUKUN
                tashkeelLocation.Add(new TashkeelLocation((char)0x0652, i - combined));
                index++;
            }
            else if (letters[i] == (char)0x0653)
            { // MADDAH ABOVE
                tashkeelLocation.Add(new TashkeelLocation((char)0x0653, i - combined));
                index++;
            }
            else if (letters[i] == (char)0xFC62)
            {
                tashkeelLocation.Add(new TashkeelLocation((char)0xFC62, i - combined));
                index++;
            }
            else if (letters[i] == (char)0xFC61)
            {
                tashkeelLocation.Add(new TashkeelLocation((char)0xFC61, i - combined));
                index++;
            }
            else if (letters[i] == (char)0xFC60)
            {
                tashkeelLocation.Add(new TashkeelLocation((char)0xFC60, i - combined));
                index++;
            }
            else
                currentIsLetter = true;

            lastWasLetter = currentIsLetter;
        }

        string[] split = str.Split(new char[]{(char)0x064B,(char)0x064C,(char)0x064D,
            (char)0x064E,(char)0x064F,(char)0x0650,
            (char)0x0651,(char)0x0652,(char)0x0653,(char)0xFC60,(char)0xFC61,(char)0xFC62});
        // Join the pieces back together; the result is the input with every Tashkeel character removed.
        return string.Concat(split);
    }
@ZiadJ

ZiadJ commented Nov 23, 2018

Can that help fix issue #27 by any chance?

@MohHeader
Contributor Author

> Can that help fix issue #27 by any chance?

I had an issue like that, and I think this fix would solve it. Can you try it and let me know? :)

@ZiadJ

ZiadJ commented Dec 19, 2018

The Dec19 update seems to have fixed it. Thanks for the time you've put into it MohHeader.

@obahareth
Contributor

obahareth commented Dec 19, 2018

@MohHeader I'm not exactly sure about what all of RemoveTashkeel is doing, but would using something like Unicode Normalization help?

I have a simple example of that here.

@MohHeader
Contributor Author

@obahareth

RemoveTashkeel doesn't remove the Tashkeel permanently; it strips it out as a temporary state. Afterwards the Tashkeel is either put back or left out.

I think "Unicode Normalization" is what I am trying to achieve through Combine Tashkeel.

Let me give you an example: if you have Fatha + Shadda (ــَّــ), the plugin normally shows them as two separate shapes, and in some fonts they overlap.

The solution is easy: replace both with a single Unicode character that combines them.
The bug introduced here was due to wrong position calculations when putting the Tashkeel back.

I tried the example you listed; it only removes the Tashkeel.
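To make the "replace both with one character" idea concrete, here is a minimal standalone sketch. It is not the plugin's code: the class name `CombineSketch` and the method `Combine` are made up for illustration, and it works on a plain string instead of the `TashkeelLocation` list. The code points are the same ones used in the method posted above.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, for illustration only (not part of the plugin).
static class CombineSketch
{
    const char Shadda = '\u0651';

    // Vowel mark -> isolated "Shadda with ..." presentation form.
    static readonly Dictionary<char, char> ShaddaForms = new Dictionary<char, char>
    {
        { '\u064C', '\uFC5E' }, // Shadda with Tanween Damma
        { '\u064D', '\uFC5F' }, // Shadda with Tanween Kasra
        { '\u064E', '\uFC60' }, // Shadda with Fatha
        { '\u064F', '\uFC61' }, // Shadda with Damma
        { '\u0650', '\uFC62' }, // Shadda with Kasra
    };

    // Replaces each adjacent Shadda + vowel pair (in either order) with one code point.
    public static string Combine(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (i + 1 < text.Length)
            {
                char next = text[i + 1];
                char combined = '\0';
                bool isPair =
                    (c == Shadda && ShaddaForms.TryGetValue(next, out combined)) ||
                    (next == Shadda && ShaddaForms.TryGetValue(c, out combined));
                if (isPair)
                {
                    sb.Append(combined);
                    i++; // skip the second mark of the pair
                    continue;
                }
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

Every merged pair makes the string one character shorter, so any position recorded after a merge has to be shifted accordingly. That is the job of the `combined` counter in the method above.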

@obahareth
Contributor

@MohHeader That sounds a lot like Precomposed characters, but I don't think that's possible with Arabic characters.

So RemoveTashkeel isn't actually removing it, it's finding the locations of each Tashkeel character?
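On the precomposed-characters point, a quick way to check is to print what .NET's built-in normalization does with the pair. This is only a sketch (the class name `NormalizationCheck` is made up). In the Unicode Character Database, U+FC60 has only a compatibility decomposition (a space followed by Fatha and Shadda), so neither `FormC` nor `FormKC` should be expected to compose the pair into it. That would mean a manual mapping is still needed for Combine Tashkeel.

```csharp
using System;
using System.Linq;
using System.Text;

// Illustration only: prints code points before and after normalization.
static class NormalizationCheck
{
    static string CodePoints(string s)
    {
        return string.Join(" ", s.Select(c => "U+" + ((int)c).ToString("X4")));
    }

    static void Main()
    {
        string pair = "\u0651\u064E"; // Shadda followed by Fatha
        string ligature = "\uFC60";   // Shadda with Fatha, isolated form

        // Composed forms: does the pair turn into U+FC60?
        Console.WriteLine(CodePoints(pair.Normalize(NormalizationForm.FormC)));
        Console.WriteLine(CodePoints(pair.Normalize(NormalizationForm.FormKC)));

        // Compatibility decomposition of the presentation form.
        Console.WriteLine(CodePoints(ligature.Normalize(NormalizationForm.FormKD)));
    }
}
```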

@MohHeader
Contributor Author

I am not the author of the original RemoveTashkeel function; I just edited it to support combined Tashkeel.
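For anyone following the thread, the strip-then-restore idea can be sketched roughly as below. This is an assumption-laden illustration, not the plugin's implementation: the `Mark` struct and the `Strip` and `Restore` names are hypothetical, and the real `TashkeelLocation` type and restore step may differ (the plugin restores marks onto text that has been processed in between).

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch of "remove temporarily, then put back".
static class StripRestoreSketch
{
    // Hypothetical record type; the plugin's own TashkeelLocation may differ.
    public struct Mark
    {
        public char Value;
        public int Position; // index in the original string
    }

    // Same range as the basic marks handled above (U+064B to U+0653).
    static bool IsTashkeel(char c)
    {
        return c >= '\u064B' && c <= '\u0653';
    }

    // Returns the text without Tashkeel and records where each mark was.
    public static string Strip(string text, List<Mark> marks)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            if (IsTashkeel(text[i]))
                marks.Add(new Mark { Value = text[i], Position = i });
            else
                sb.Append(text[i]);
        }
        return sb.ToString();
    }

    // Re-inserts the marks in ascending position order, which rebuilds the original string.
    public static string Restore(string stripped, List<Mark> marks)
    {
        var sb = new StringBuilder(stripped);
        foreach (Mark m in marks)
            sb.Insert(m.Position, m.Value);
        return sb.ToString();
    }
}
```

With this layout, `Restore(Strip(text, marks), marks)` gives back the original text. Once two marks are merged into one combined character, every later position moves by one, which is where the position miscalculation described in this issue came from.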


3 participants