Readability statistics #47
On a related note, this is also relevant to my goal of supporting externally-defined checks that don't directly use one of the extension points (see #45 (comment) for details).
This holds block-level content (i.e., it excludes headings, lists, and table cells) that is meant to be processed for summary statistics like readability scores. Related to #47.
Question: does this "plugin" ignore content in Markdown that doesn't appear in a doc build? I'm thinking about links and descriptions such as alt text. In other words, would a page full of links like `some word` bias the results? IIRC, the Flesch-Kincaid calculations would read bits like a relative-path URL as a single (complicated) word. Example: when I run https://developer.cobalt.io/getting-started/sign-in/ through:
My wild guess: Vale's flesch-kincaid plugin also reads markdown links, such as `[some word](../path/to/something-complex)`, as single words, which would increase the score.
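To make the guess concrete, here's a rough sketch of how a raw markdown link can inflate a Flesch-Kincaid grade. This is not Vale's actual implementation: the syllable counter is a crude vowel-group heuristic, and `strip_markdown_links` is a hypothetical helper for illustration.

```python
import re

def strip_markdown_links(text: str) -> str:
    """Replace [label](target) with just the label (rough approximation)."""
    return re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)

def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (always at least 1)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    # Standard Flesch-Kincaid grade-level formula.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"\S+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

raw = "See [some word](../path/to/something-complex) for details."
clean = strip_markdown_links(raw)  # "See some word for details."

# The raw version scores several grade levels higher, because the whole
# link token counts as one long, many-"syllable" word.
print(flesch_kincaid_grade(raw))
print(flesch_kincaid_grade(clean))
```

Even with a toy syllable counter, the unstripped link pushes the grade level up sharply, which is consistent with the score differences described above.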
Thanks for posting this question, @mjang. (For context: we've been chatting on Slack and spitballing ideas of why the scores differ.) Another idea: I wonder if the web tools are also counting sidebars and menus. 🤔 Those could distort scores in one direction or another. Some examples:
Yes -- Vale tries to be as accurate as possible when calculating these metrics; it uses its `prose` library under the hood. There are a few problems with the comparison to WebFX:
Here's an example HTML document (a snippet from the GitLab flow documentation):

```html
<p>Organizations coming to Git from other version control systems frequently find it hard to develop a productive workflow.
This article describes GitLab flow, which integrates the Git workflow with an issue tracking system.
It offers a transparent and effective way to work with Git:</p>
<pre><code class="language-mermaid">graph LR
  subgraph Git workflow
    A[Working copy] --> |git add| B[Index]
    B --> |git commit| C[Local repository]
    C --> |git push| D[Remote repository]
  end
</code></pre>
```
Let's break this down:
If we pass just the "correct" text to WebFX, its calculations change to 3, 44, and 10.2. The remaining difference is likely from the calculation of "complex words" and syllables, but it's much closer.
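The "correct" text here means the paragraph prose with the code block excluded. A minimal sketch of that extraction step, using only the standard library (the `ParagraphText` class is my own illustration, not Vale's parser):

```python
from html.parser import HTMLParser

class ParagraphText(HTMLParser):
    """Collect text inside <p> elements, skipping <pre>/<code> content."""

    def __init__(self):
        super().__init__()
        self.in_p = False   # currently inside a <p> element
        self.skip = 0       # nesting depth of <pre>/<code> elements
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
        elif tag in ("pre", "code"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False
        elif tag in ("pre", "code"):
            self.skip = max(0, self.skip - 1)

    def handle_data(self, data):
        if self.in_p and self.skip == 0:
            self.chunks.append(data)

html = (
    "<p>This article describes GitLab flow.</p>"
    "<pre><code>graph LR\nA --> B</code></pre>"
)
parser = ParagraphText()
parser.feed(html)
text = " ".join(parser.chunks)
print(text)  # only the paragraph prose survives
```

Feeding only `text` to a readability calculator avoids counting mermaid syntax like `graph LR` as "words," which is the main distortion in the WebFX comparison above.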
I'm reopening this issue because I think it would be useful to add a "View: Readability" option to https://vale-studio.errata.ai/.
To extend the discussion from the Write the Docs Slack: I need to be able to do an "apples to apples" comparison of Flesch-Kincaid scores, and it's at best difficult to apply the Vale plugin to HTML content. (Sure, I could pull the source code from external HTML into a repo, but that requires understanding git, repos, and Vale.) So I need to know: do you have, or know of, a web tool that produces results consistent with your Flesch-Kincaid plugin?
I think you forgot to add this extension to the documentation.
I'm thinking about including a new `readability` extension point that will allow users to set standards for metrics like Flesch-Kincaid, Gunning-Fog, and Coleman-Liau. For example, this would warn about any paragraphs that exceed an 8th-grade reading level.
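A check like that might look something like the following. This is only a hypothetical sketch of a style file: the `extends: metric` value and the `formula`/`condition` keys are assumptions for discussion, not a finalized API.

```yaml
# Hypothetical Readability.yml -- key names are illustrative only.
extends: metric
message: "Try to keep the Flesch-Kincaid grade level (%s) below 8."
level: warning
# Flesch-Kincaid grade formula over per-paragraph counts:
formula: |
  (0.39 * (words / sentences)) + (11.8 * (syllables / words)) - 15.59
condition: "> 8"
```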
The `prose` library already supports these metrics, so it's just a matter of deciding on the check implementation details.