Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration of cargo-spellcheck #832

Closed
wants to merge 17 commits into from
Closed

Configuration of cargo-spellcheck #832

wants to merge 17 commits into from

Conversation

TomaszWaszczyk
Copy link
Contributor

My second try to configure the tool ;-)

@HCastano HCastano added the A-feat New feature or request label Mar 18, 2021
@HCastano HCastano changed the title feat: (config): Proposal to configuration of cargo-spellcheck Configuration of cargo-spellcheck Mar 18, 2021
@@ -0,0 +1,88 @@
90
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, where'd you get this list from?

Also, what's with the /<letter> at the end of certain words?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is kinda...it is called Hunspell and by that letters it is allowed to specify kind of dictionary like "en_US". Paste from the documentation:

If the .dic files contains incorrect rules for a word, then add the word with the correct rules to .dic_delta. For example, if en_US.dic contains the entry raccoon/M, but the rule should be /MS, then add raccoon/MS to en_US.dic_delta. Rules for entries in .dic_delta override the rules for entries in .dic.

Source: https://www.chromium.org/developers/how-tos/editing-the-spell-checking-dictionaries

What is the source of the list? Well, long story short, around 2 months ago I was thinking that the task will take me ~1h and started to do it, but it was not so simple and was needed to leave it, but I to follow and look at Parity's repositories and I found that PR: https://github.com/paritytech/polkadot/pull/1841/files and my dream is to do at least one merge/ed PR and when I saw the PR I immediately made my PR.

That is it, that's the story.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is kinda...it is called Hunspell and by that letters it is allowed to specify kind of dictionary like "en_US". Paste from the documentation:

If the .dic files contains incorrect rules for a word, then add the word with the correct rules to .dic_delta. For example, if en_US.dic contains the entry raccoon/M, but the rule should be /MS, then add raccoon/MS to en_US.dic_delta. Rules for entries in .dic_delta override the rules for entries in .dic.

Source: https://www.chromium.org/developers/how-tos/editing-the-spell-checking-dictionaries

I still don't fully get it, but good to know there's a source for the strangeness 😅

What is the source of the list? Well, long story short, around 2 months ago I was thinking that the task will take me ~1h and started to do it, but it was not so simple and was needed to leave it, but I to follow and look at Parity's repositories and I found that PR: https://github.com/paritytech/polkadot/pull/1841/files and my dream is to do at least one merge/ed PR and when I saw the PR I immediately made my PR.

That is it, that's the story.

Haha, sounds good! Keep in mind that you will need to add some "bridge" specific words in there though (e.g our test networks Rialto and Millau). I'm not sure what other false positives you'll get, but I imagine there'll be a few.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spellcheck.mp4

@HCastano I wanted to check output from the commands like in recording but I got empty output, I have missed something all it works like supposed to work?

@HCastano
Copy link
Contributor

HCastano commented Apr 7, 2021

@TomaszWaszczyk Hey, any updates here?

@TomaszWaszczyk
Copy link
Contributor Author

I have added specific to the bridge words, within 36h will check if occurs some false positives and if needed commit. Is it a good direction?

@tomusdrw
Copy link
Contributor

Looks good to me, can you add a job on CI to make sure it's being checked?

@TomaszWaszczyk
Copy link
Contributor Author

Sure, will do that within coming days.

@TomaszWaszczyk
Copy link
Contributor Author

I have changed a way how to install cargo-spellcheck on CI because via rustup component add cargo-spellcheck it was not working.

In case of any hints how can I improve the PR let me know, please ;-)

PS. Feel free to suggest me next issue that you know will learn me more about the project.

@TomaszWaszczyk TomaszWaszczyk marked this pull request as ready for review April 15, 2021 12:20
Copy link
Contributor

@tomusdrw tomusdrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I'm kind of suspicious that it didn't fail on anything - does it mean that our entire documentation is typo-free? I find it quite hard to believe, but 🤷 🎉.

PS. Feel free to suggest me next issue that you know will learn me more about the project.

I've recently created a separate milestone for small improvements like that, feel free to take a look.

Copy link
Contributor

@HCastano HCastano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty suspicious of the fact that we got no typos or anything of the sort.

It doesn't look like we've installed an en_US dictionary on the build machine. I figured hunspell would come with a default, but maybe I'm mistaken.

@drahnr Can you confirm if we need to install an external dictionary?

.github/workflows/lint.yml Outdated Show resolved Hide resolved
Co-authored-by: Hernando Castano <HCastano@users.noreply.github.com>
.github/workflows/lint.yml Outdated Show resolved Hide resolved
@drahnr
Copy link

drahnr commented Apr 16, 2021

I'm pretty suspicious of the fact that we got no typos or anything of the sort.

Depends on your configuration and the quality of your list, I'll take a peek into this at some point today :)

It doesn't look like we've installed an en_US dictionary on the build machine. I figured hunspell would come with a default, but maybe I'm mistaken.

This should yield an error for sure if there is no dictionary found.

@drahnr Can you confirm if we need to install an external dictionary?

Currently you need an external dictionary. This is going to change with the next release (v0.8.0 which also fixes a few other kinks) and the fedora local en_US.aff and en_US.dic will be used by default and can be overridden iff a local dictionary is found.

@drahnr
Copy link

drahnr commented Apr 16, 2021

It's very unlikely that cargo-spellcheck is happy on the first shot 😁

@drahnr
Copy link

drahnr commented Apr 16, 2021

This is a bug in cargo-spellcheck, it does not handle glob patterns in workspaces just yet drahnr/cargo-spellcheck#166 which I just fixed with v0.8.0-beta.2. It's on crates-io now :)

@drahnr
Copy link

drahnr commented Apr 16, 2021

Builtin dictionary and affix files are on their way: drahnr/cargo-spellcheck#168 should be ready at some point later today with v0.8.0-beta.3.

Edit: released

uses: actions-rs/cargo@master
with:
command: spellcheck
args: check -m 1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider adding -vv to have some additional output in CI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have looked at the docs and there is written:

To increase verbosity add -v (multiple) to increase verbosity.

That is why I added -v, correct me please if I am wrong with that one -v

Copy link

@drahnr drahnr Apr 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-q nothing
- error
-v warn
-vv info
-vvv debug
and any more than that will probably kill the output buffer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Oh, my goodness! There is around 400 places to improve:

image

But as I said A I will say Z, will deliver it, take few days but will deliver.

@drahnr
Copy link

drahnr commented Apr 19, 2021

Had a brief peek into the results and they look good to me. There are a bunch of -> and <-> in the comments and the new tokenizer seems to happily chop around before > and < characters and then mark < as issues. This can be resolved by adding a single character quirk to hunspell.

@HCastano
Copy link
Contributor

Closing this in favour of #924.

@HCastano HCastano closed this Apr 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-feat New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants