Counting words for Asian languages #17323
Comments
Please inform the customer of conversation # 99154 when this conversation has been closed. |
Please inform the customer of conversation # 148759 when this conversation has been closed. |
Any developments/ideas on this issue? Not being able to use Yoast with Chinese content is keeping the plugin away from 450 million internet users. |
Please inform the customer of conversation # 172656 when this conversation has been closed. |
Maybe a character count is better for Japanese/Chinese. One kanji character usually takes up the space of 2 regular characters. |
@a4jp-com thanks for your suggestion. So each kanji character can be seen as 1 word? Or is there more to it? |
Japanese (also Chinese, Korean, Vietnamese) is a logographic language from the Han family; that means that not all characters represent morphemes: some morphemes are composed of more than one character (see more here). @terw-dan The approach is to take the character count as the word count, mostly because each character counts as one "word" and there are no spaces between characters (at least in Chinese). As for keyword analysis, if the user inputs a combination of two or more characters, that must be seen as one word. |
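A minimal sketch of the approach described above, assuming that "character count" means every character that is neither whitespace nor punctuation, and that a multi-character keyword is matched as one unit. The function names are hypothetical and are not taken from the plugin.

```python
import unicodedata


def count_characters_as_words(text: str) -> int:
    """Treat each character as one "word", skipping whitespace and punctuation."""
    return sum(
        1
        for ch in text
        # Unicode general categories starting with "P" are punctuation,
        # which covers 。 and 、 as well as ASCII punctuation.
        if not ch.isspace() and not unicodedata.category(ch).startswith("P")
    )


def count_keyphrase(text: str, keyphrase: str) -> int:
    """Count non-overlapping occurrences of a keyphrase treated as one word."""
    return text.count(keyphrase) if keyphrase else 0


sample = "我喜欢学习中文。中文很有意思。"
print(count_characters_as_words(sample))  # 13
print(count_keyphrase(sample, "中文"))     # 2
```

Plain substring matching is the simplest reading of "must be seen as one word"; it does no segmentation, so it can also match a keyword that happens to sit inside a longer word.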
Individual kanji characters sometimes have a meaning on their own, but in some situations they are combined with a few hiragana characters to make different words with different readings. Kanji only: Kanji with hiragana: Each character in either hiragana, katakana or kanji takes up the space of 2 English characters. |
Thanks both for the explanation. This will come in helpful when we start implementing and testing this. |
Is there anything I can do to help? |
@a4jp-com Thank you for your eagerness to help! At the moment, our main problem is the absence of spaces in Asian languages. Could you confirm that in Japanese there are no spaces between individual words either? |
Sometimes a regular space is used but in other situations a Japanese space is used. Here is the encoding for the unicode character 'IDEOGRAPHIC SPACE' (U+3000) |
@a4jp-com Thank you for your reply. From googling some Japanese websites, I got the impression that spaces are generally used only between sentences, not between words. In the following sentences, for example, there is whitespace after 、 and 。:
However, these white spaces are part of the 、(U+3001) and 。(U+3002) characters (so the space is not a separate character). In what situations do people use the regular space or Japanese space? |
Regular spaces are used between the surname and given name. It's kind of a design choice. Regular spaces are also used when romaji or English phrases are used in advertising. |
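A small sketch of how the characters mentioned above look to code, assuming Python's standard `unicodedata` module. It shows that U+3000 is a real whitespace character, while 、 and 。 are single punctuation characters with no separate space attached. The helper names are made up for illustration.

```python
import unicodedata


def describe(ch: str):
    """Return the Unicode name of a character and whether it is whitespace."""
    return unicodedata.name(ch), ch.isspace()


def split_on_any_space(text: str):
    """Split on any Unicode whitespace, regular or ideographic."""
    # str.split() with no arguments treats U+3000 like an ordinary space.
    return text.split()


print(describe("\u3000"))  # ('IDEOGRAPHIC SPACE', True)
print(describe("、"))       # ('IDEOGRAPHIC COMMA', False)
print(describe("。"))       # ('IDEOGRAPHIC FULL STOP', False)

# A surname and given name separated by either kind of space:
print(split_on_any_space("山田\u3000太郎"))  # ['山田', '太郎']
print(split_on_any_space("山田 太郎"))       # ['山田', '太郎']
```

So a whitespace split does recover the pieces where a writer chose to insert a space, but a sentence such as 今日は、晴れです。 still comes back as a single token.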
Most Japanese people don't use spaces between words. Maybe Yoast could offer an option that turns off some deduction rules for Japanese. |
I've been living in Japan for 16 years and have worked as a system engineer in 3 Japanese companies. Spaces are used in sites here. Especially in pages mixed with English words. For example: |
Please inform the customer of conversation # 179998 when this conversation has been closed. |
Thank you very much @IreneStr. |
Please inform the customer of conversation # 184521 when this conversation has been closed. |
I have been following the rules set out in the plugin but I have lost 75% of my views on one Japanese site. I've gone from about 200 views a day down to about 50 views a day. The only other change I made was changing the site to HTTPS. I thought that was meant to increase the ranking. Any ideas what could be causing the problem? https://agreatdream.com/word-lists/ Is this somehow linked to the count being off? |
I write my blog in Chinese and really, really, really need this function. |
@a4jp-com The wordcount is only shown as an indication. It is not something we (can) save to your post that has influence on your rankings. So it has to be something else that caused a decrease. |
I was just thinking that, since the numbers are wrong, we might be adding titles that are too long, etc., when we make pages. |
I write my blog in Chinese and Japanese. Here are some examples of my post titles for you to test. Please help fix this; we really need this amazing function! It keeps telling us our posts are poor, which makes us sad. |
Please inform the customer of conversation # 192085 when this conversation has been closed. |
Has it been fixed yet, guys? |
I'd love to find out the character count through this plugin. |
Please inform the customer of conversation # 198257 when this conversation has been closed. |
Any ideas on checking the word count? |
Is anyone still working on this? |
Please inform the customer of conversation # 516867 when this conversation has been closed. |
Please inform the customer of conversation # 538393 when this conversation has been closed. |
Still a problem as of Sep 2019. |
Please inform the customer of conversation # 565766 when this conversation has been closed. |
Please inform the customer of conversation # 585879 when this conversation has been closed. |
Is anyone working on this anymore? This has been here for about 4 years. |
This is currently not being worked on, it is a feature request that might get implemented in the future. |
Okay. Thanks for the honest reply @Djennez. I'll post this update in the plugin download area of WordPress. No one has said this up till now, which isn't very good. This should have been made clear years ago. Is it okay to fork the current plugin and make a Japanese version? I'll just edit the character count code. What is the licence of the current free version? Is it a General Public License? |
Please inform the customer of conversation # 598441 when this conversation has been closed. |
Did you succeed with this yet? I still don't understand how this situation can be ignored by the developers for such a long time. I would never purchase a paid version of the plugin with this issue, as it makes the plugin 50% useless. |
Please inform the customer of conversation # 607816 when this conversation has been closed. |
Can you include a conditional statement based on the page language that either counts words or characters based on the language set? Example 1:
Example 2:
I'm not sure how you have separated other languages but I'm sure you already have code like this in your plugin for other languages. |
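The kind of conditional asked for above could look roughly like the sketch below. This is not a reconstruction of the examples from the comment and not the plugin's code: the set of character-counted languages, the locale format, and the function name are all assumptions for illustration.

```python
from typing import Tuple

# Assumption: languages whose length is measured in characters, not words.
CHARACTER_COUNT_LANGUAGES = {"ja", "zh"}


def text_length(text: str, locale: str) -> Tuple[int, str]:
    """Return (count, unit), choosing the unit from the page language."""
    # Accept locales such as "ja", "zh_CN" or "en-US"; keep the language part.
    language = locale.replace("-", "_").split("_")[0].lower()
    if language in CHARACTER_COUNT_LANGUAGES:
        # Count every non-whitespace character.
        return sum(1 for ch in text if not ch.isspace()), "characters"
    # Default: whitespace-separated words.
    return len(text.split()), "words"


print(text_length("Counting words is easy", "en_US"))  # (4, 'words')
print(text_length("東京 タワー", "ja"))                  # (5, 'characters')
print(text_length("你好世界", "zh_CN"))                  # (4, 'characters')
```

Returning the unit alongside the number lets the caller label the result correctly and compare it against a threshold expressed in the same unit.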
I also need this improvement for Chinese language. |
T-T. Whatever happened to the programmer that was assigned to fixing this problem? |
Please inform the customer of conversation # 649514 when this conversation has been closed. |
Please inform the customer of conversation # 729318 when this conversation has been closed. |
Is there a way to just get a character count instead of word count? You already have the code in the plugin. You just need a function for the option. |
Please inform the customer of conversation #744721 when this conversation has been closed. |
Please inform the customer of conversation #760565 when this conversation has been closed. |
Can we find out what is happening if you already have the code please? |
Can you simply count words the way the WordPress editor does? |
Please inform the customer of conversation # 894142 when this conversation has been closed. |
We are going to close this issue because support for "Asian languages" is too generic a request: there are a great many languages in Asia. Instead, we will split the issue and open new ones for specific languages. If you want to see whether Yoast SEO supports Chinese, Korean, or Vietnamese, feel free to open a specific issue (if there is none already) so that we can see how many users are requesting which language. That said, Yoast SEO v18.0 already includes support for Japanese (which is one of the Asian languages as well). |
Thanks for adding Japanese. I never noticed you did that. ♥ |
WordPress has built code to count words in Asian languages. We should take this as inspiration. WordPress actually ships the word counting code only in the Chinese language pack; the third link shows where to find it.
Patch for word-count.js
https://core.trac.wordpress.org/ticket/20738
https://core.trac.wordpress.org/ticket/30966
Where to find the language pack:
https://core.trac.wordpress.org/ticket/33454
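As a rough illustration of the idea of letting a language pack decide how text is counted, here is a sketch with three counting modes. The mode names and the function are assumptions made for this example and are not copied from the WordPress patches linked above.

```python
def count(text: str, mode: str = "words") -> int:
    """Count text in one of three modes that a language pack could select."""
    if mode == "words":
        # Whitespace-separated tokens; suits languages that space their words.
        return len(text.split())
    if mode == "characters_excluding_spaces":
        return sum(1 for ch in text if not ch.isspace())
    if mode == "characters_including_spaces":
        return len(text)
    raise ValueError(f"unknown count mode: {mode}")


sample = "你好 世界"
print(count(sample, "words"))                        # 2
print(count(sample, "characters_excluding_spaces"))  # 4
print(count(sample, "characters_including_spaces"))  # 5
```

Keeping the mode as a plain string means it can be supplied by a translation file or a setting rather than hard-coded per language.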