Incorrect Gunning Fog Calculation #70

Closed
bradfordlynch opened this issue Dec 8, 2018 · 1 comment
In the Gunning fog calculation, complex words are defined as words with three or more syllables. Wikipedia is not the best source of information, but see list item 3 here. However, as the snippet below from textstat.py shows, `difficult_words` treats a word as difficult when it is not in `easy_word_set` and has more than one syllable (lines 254-255). Since `gunning_fog` uses `difficult_words` for its complex-word count, the two-syllable threshold inflates the score.

    @repoze.lru.lru_cache(maxsize=128)
    def difficult_words(self, text):
        text_list = re.findall(r"[\w\='‘’]+", text.lower())
        diff_words_set = set()
        for value in text_list:
            if value not in easy_word_set:
                if self.syllable_count(value) > 1:
                    diff_words_set.add(value)
        return len(diff_words_set)

    @repoze.lru.lru_cache(maxsize=128)
    def dale_chall_readability_score(self, text):
        word_count = self.lexicon_count(text)
        count = word_count - self.difficult_words(text)

        try:
            per = float(count) / float(word_count) * 100
        except ZeroDivisionError:
            return 0.0

        difficult_words = 100 - per

        score = (
            (0.1579 * difficult_words)
            + (0.0496 * self.avg_sentence_length(text)))

        if difficult_words > 5:
            score += 3.6365
        return legacy_round(score, 2)

    @repoze.lru.lru_cache(maxsize=128)
    def gunning_fog(self, text):
        try:
            per_diff_words = (
                (self.difficult_words(text) / self.lexicon_count(text) * 100))

            grade = 0.4 * (self.avg_sentence_length(text) + per_diff_words)
            return legacy_round(grade, 2)
        except ZeroDivisionError:
            return 0.0
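
For comparison, here is a minimal sketch of the fog index with the three-syllable threshold described in this issue. It is only an illustration: `naive_syllable_count` and `gunning_fog_sketch` are hypothetical names that are not part of textstat, the vowel-group counter is a rough stand-in for the library's `syllable_count`, and the sketch counts every occurrence of a complex word rather than unique words.

```python
import re


def naive_syllable_count(word):
    """Hypothetical stand-in for a real syllable counter: counts vowel groups."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def gunning_fog_sketch(text, min_syllables=3):
    """Fog index = 0.4 * (words per sentence + percent of complex words)."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0

    # A word is "complex" when it has at least `min_syllables` syllables.
    complex_words = [
        w for w in words if naive_syllable_count(w) >= min_syllables
    ]
    avg_sentence_length = len(words) / len(sentences)
    per_complex = 100.0 * len(complex_words) / len(words)
    return round(0.4 * (avg_sentence_length + per_complex), 2)


sample = "The happy cat sat."
print(gunning_fog_sketch(sample))                   # 1.6
print(gunning_fog_sketch(sample, min_syllables=2))  # 11.6
```

With the two-syllable threshold, "happy" is counted as complex and the score for this one short sentence jumps from 1.6 to 11.6, which shows how sensitive the index is to the cutoff.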

bradfordlynch commented Dec 8, 2018

I created PR #71 for a fix to this.
