Page Analysis is not correct with Cyrillic alphabet - not counting in the first paragraph and keyword density #349

Closed
todorchrisov opened this issue Oct 28, 2013 · 8 comments

Comments

@todorchrisov

[Screenshot: yoast-cyrillic-bug]
Hi Yoast!

Today I finally installed WordPress SEO and it is great except for one thing: the Page Analysis does not work correctly with the Cyrillic alphabet.

More specifically:

  1. "The keyword doesn't appear in the first paragraph of the copy, make sure the topic is clear immediately." - not true. It is there 2 times. The very first word of the first paragraph is the keyword, followed by another mention in the first paragraph.
  2. "The keyword density is 0%, which is a bit low, the keyword was found 19 times." - not true. The keyword was found 19 times in a text with 355 words, which is around 5%.
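For reference, the arithmetic behind the second point can be sketched as below. This is only an illustration of the expected figure, assuming a single-word keyword and the simple definition "occurrences divided by total words"; the function name `keyword_density` is hypothetical and is not taken from the plugin.

```python
def keyword_density(keyword_count: int, word_count: int) -> float:
    """Return keyword occurrences as a percentage of the total word count.

    Assumes a single-word keyword; this is an illustrative definition,
    not the plugin's actual formula.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 100.0 * keyword_count / word_count


# Figures reported in this issue: 19 occurrences in 355 words.
density = keyword_density(19, 355)
print(f"{density:.2f}%")  # prints "5.35%"
```

A reported density of 0% alongside a correct count of 19 would be consistent with the word total, rather than the keyword count, being computed incorrectly, though the thread does not confirm the cause.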

Is there something I could do in the settings of the plugin to fix this?

My blog is in Bulgarian.

Best regards,

Todor

@jrfnl
Contributor

jrfnl commented Feb 27, 2014

@todorchrisov Could you provide me with some example texts and keyword combinations to test with?

@jrfnl
Contributor

jrfnl commented Mar 4, 2014

Based on the sample texts I received here and in some related issues, I have made some small improvements to the keyword density calculation for non-Latin, non-ideograph-based languages in commit 97d0973.
Hopefully that will yield more consistent results. The changes are included in the v1.5 branch. You will need to re-download the branch if you want to test the changes.

Related issues: #703, #681, #349, #264 and #145.
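To illustrate the kind of problem that affects word counting in non-Latin scripts, here is a minimal Python sketch. It is not the plugin's code and does not reproduce commit 97d0973; it only shows, under the assumption that words are extracted with a `\w+` pattern, how an ASCII-only pattern finds no Cyrillic words while a Unicode-aware one does. The helper names and the Bulgarian sample sentence are invented for the example.

```python
import re


def count_words(text: str, ascii_only: bool = False) -> int:
    """Count word tokens; with ascii_only=True, \\w matches only [a-zA-Z0-9_]."""
    flags = re.ASCII if ascii_only else 0
    return len(re.findall(r"\w+", text, flags=flags))


def count_keyword(text: str, keyword: str) -> int:
    """Count case-insensitive whole-word matches of a single-word keyword."""
    words = re.findall(r"\w+", text.casefold())
    return words.count(keyword.casefold())


# Invented sample text (Bulgarian), used only for illustration.
sample = "Котка спи. Малката котка обича мляко."

print(count_words(sample))                   # 6
print(count_words(sample, ascii_only=True))  # 0
print(count_keyword(sample, "котка"))        # 2
```

With the ASCII-only pattern the word total collapses to zero, so any density derived from it is meaningless even when the keyword itself is located by other means.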

@jrfnl
Contributor

jrfnl commented Mar 4, 2014

Would you be willing to test this yourself? You can download the v1.5 branch here on GitHub. Please make a backup of the database before testing, and don't test in a production/live environment.

barrykooij added a commit that referenced this issue Mar 16, 2014
Related #707
Related #720
Related #349
Related #729
Related #703
Related #264
Related #756
@DimasRadene

spesialisobat.com
Greetings, teacher. I am Dimas from Indonesia. Please help: why do I get the red warning "The keyword doesn't appear in the first paragraph of the copy, make sure the topic is clear immediately." in SEO?

@OwlBawl

OwlBawl commented May 7, 2014

I ran into the same problem. I can also add that "prevent stop words" in links doesn't work with Cyrillic.

@jdevalk
Contributor

jdevalk commented Aug 8, 2014

I'm not great at testing with Cyrillic, but I think we fixed a lot of this. Could someone do some testing here?

@Rarst
Contributor

Rarst commented Sep 10, 2015

I've tested the current version with some Cyrillic content. The count works correctly. The first-paragraph check now has the opposite issue: page analysis says the keyword is found in the first paragraph even when it isn't there.
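As a rough sketch of what a correct first-paragraph check should report in both directions, consider the following. It assumes plain text with paragraphs separated by blank lines and a single-word keyword; the function names and sample texts are hypothetical and do not reflect how the plugin implements the check.

```python
import re


def first_paragraph(text: str) -> str:
    """Return the first non-empty block of text, splitting on blank lines."""
    for block in re.split(r"\n\s*\n", text.strip()):
        if block.strip():
            return block.strip()
    return ""


def keyword_in_first_paragraph(text: str, keyword: str) -> bool:
    """True if the single-word keyword occurs as a whole word in paragraph one."""
    words = re.findall(r"\w+", first_paragraph(text).casefold())
    return keyword.casefold() in words


# Invented sample texts (Bulgarian), used only for illustration.
keyword_first = "Котка спи на дивана.\n\nКучето лае."
keyword_later = "Кучето лае.\n\nКотка спи на дивана."

print(keyword_in_first_paragraph(keyword_first, "котка"))  # True
print(keyword_in_first_paragraph(keyword_later, "котка"))  # False
```

The second case is the one described above: the keyword appears only in a later paragraph, so the check should come back negative.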

@Rarst
Contributor

Rarst commented Jun 10, 2016

Closing, as the text analysis engine has since been replaced.

@Rarst Rarst closed this as completed Jun 10, 2016