Fix spelling and grammar issues #2888

Merged
merged 2 commits into from
Dec 7, 2020

Contributor

@p-e-w p-e-w commented Dec 4, 2020

This PR fixes a bunch of spelling and grammar issues that I found with LanguageTool using a semi-automated process that I will describe in detail below.

I have focused on issues that are obvious mistakes, rather than on matters of poor style or departures from some convention. As such, this PR does not attempt to correct:

  • British/American spelling inconsistencies, of which there are many.
  • Violations of Oxford ise/ize word ending conventions.
  • Missing commas, particularly in conjunctive clauses.
  • Improper punctuation, such as hyphens being used in place of em-dashes.

The workflow described below finds hundreds of these issues, which may or may not be worth resolving as well. Note that I have thus far only looked at documents from the specification root directory, the appendices subdirectory, and the file modules/end_to_end_encryption.rst, as I unfortunately don't have more time to work on this for now.


Workflow

Spell- and grammar-checking technical documents is challenging because symbols and jargon invariably generate an overwhelming number of false positives. The following setup works quite well for the Spec:

  1. Install Pandoc and Java.
  2. Download and unpack the standalone "Desktop" version of LanguageTool from here.
  3. Add this custom Matrix.org Jargon Dictionary to LanguageTool's bundled Hunspell version. The easiest way to accomplish this is by replacing the contents of org/languagetool/resource/en/hunspell/ignore.txt with the contents of that Gist. This dramatically reduces the number of false positives.
  4. For each Spec document, run:

       pandoc -t plain PATH/TO/SPEC/DOCUMENT.rst | java -jar languagetool-commandline.jar -l en-GB -d 'WHITESPACE_RULE,DASH_RULE,EN_QUOTES,MULTIPLICATION_SIGN,WORD_CONTAINS_UNDERSCORE,ENGLISH_WORD_REPEAT_BEGINNING_RULE,COMMA_PARENTHESIS_WHITESPACE,SENTENCE_WHITESPACE,UPPERCASE_SENTENCE_START,PLUS_MINUS,ARROWS,OXFORD_SPELLING_ISE_VERBS,OXFORD_SPELLING_NOUNS,OXFORD_SPELLING_ADJECTIVES,LICENCE_LICENSE_NOUN_SINGULAR,AN_INVITE,UNLIKELY_OPENING_PUNCTUATION,EN_UNPAIRED_BRACKETS,ANYMORE'
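For anyone who wants to repeat steps 3 and 4 over a whole directory, the workflow could be scripted roughly as follows. This is only a sketch and not part of the PR: the function names, the default jar location, and the `*.rst` glob are assumptions made for illustration, while the rule IDs, the `en-GB` language code, and the `ignore.txt` path are copied from the steps above. It assumes `pandoc` and `java` are on the `PATH`.

```python
import shutil
import subprocess
from pathlib import Path

# Rule IDs copied verbatim from the -d argument in step 4.
DISABLED_RULES = [
    "WHITESPACE_RULE", "DASH_RULE", "EN_QUOTES", "MULTIPLICATION_SIGN",
    "WORD_CONTAINS_UNDERSCORE", "ENGLISH_WORD_REPEAT_BEGINNING_RULE",
    "COMMA_PARENTHESIS_WHITESPACE", "SENTENCE_WHITESPACE",
    "UPPERCASE_SENTENCE_START", "PLUS_MINUS", "ARROWS",
    "OXFORD_SPELLING_ISE_VERBS", "OXFORD_SPELLING_NOUNS",
    "OXFORD_SPELLING_ADJECTIVES", "LICENCE_LICENSE_NOUN_SINGULAR",
    "AN_INVITE", "UNLIKELY_OPENING_PUNCTUATION", "EN_UNPAIRED_BRACKETS",
    "ANYMORE",
]

# Path of the Hunspell ignore list inside the unpacked LanguageTool (step 3).
IGNORE_FILE = Path("org/languagetool/resource/en/hunspell/ignore.txt")


def install_jargon_dictionary(languagetool_dir, jargon_file):
    """Step 3: replace the contents of ignore.txt with the jargon list."""
    target = Path(languagetool_dir) / IGNORE_FILE
    shutil.copyfile(jargon_file, target)
    return target


def pandoc_command(document):
    """Left-hand side of the pipe in step 4: convert the document to plain text."""
    return ["pandoc", "-t", "plain", str(document)]


def languagetool_command(jar="languagetool-commandline.jar"):
    """Right-hand side of the pipe in step 4, with the same flags."""
    return ["java", "-jar", str(jar), "-l", "en-GB", "-d", ",".join(DISABLED_RULES)]


def check_document(document, jar="languagetool-commandline.jar"):
    """Run one document through Pandoc and LanguageTool, returning the report."""
    plain = subprocess.run(
        pandoc_command(document), check=True, capture_output=True, text=True
    ).stdout
    # The exit status is not checked here, since this sketch makes no
    # assumption about what LanguageTool returns when it finds issues.
    report = subprocess.run(
        languagetool_command(jar), input=plain, capture_output=True, text=True
    )
    return report.stdout


def check_directory(spec_dir, jar="languagetool-commandline.jar"):
    """Print a report for every .rst file directly inside spec_dir."""
    for document in sorted(Path(spec_dir).glob("*.rst")):
        print(f"== {document} ==")
        print(check_document(document, jar))

# Example call (the directory is a placeholder, as in step 4):
# check_directory("PATH/TO/SPEC")
```

Keeping the rule IDs in a list rather than one long string makes it easier to re-enable a single rule later, for example to look at the punctuation issues this PR deliberately skips.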

The raised issues are of fairly high quality, with roughly 30-40% of them being actual problems that warrant fixing.


Signed-off-by: Philipp Emanuel Weidmann <pew@worldwidemann.com>

Contributor Author

p-e-w commented Dec 4, 2020

Do trivial changes like this require a news fragment? I've read CONTRIBUTING.rst but the answer is still not clear to me.

@turt2live turt2live self-requested a review December 4, 2020 14:13
Member

@turt2live turt2live left a comment

Thanks for doing this! I'll take care of the changelogs before merging.

specification/proposals_intro.rst (outdated review thread, resolved)
@turt2live turt2live self-assigned this Dec 4, 2020
Co-authored-by: Kitsune Ral <Kitsune-Ral@users.sf.net>
@turt2live turt2live merged commit ba6a72f into matrix-org:master Dec 7, 2020