Can we skip files which end in ".pem/.crt" #2135

clickthisnick · 2021-11-03T18:17:31Z

What are people's thoughts on skipping files that end with ".pem" and ".crt" by default, so that certificates and things like that don't get falsely flagged by accident?

vikivivi · 2021-11-04T07:13:22Z

You might want to see "skip" in https://github.com/codespell-project/codespell/blob/master/README.rst

clickthisnick · 2021-11-04T12:39:28Z

Ya, that's what we are doing. I didn't know if the community thought it would be okay to skip these by default, without that being explicitly set.

peternewman · 2021-11-04T15:30:09Z

If I look at some random .pem and .crt files, some do have some plain English in them too, although mostly just the example ones. Is there some reason they shouldn't be scanned automatically?

Also, what is it tripping up on in them, two-letter character combinations? Can we resolve it by just moving those to the code dictionary?

clickthisnick · 2021-11-04T16:57:18Z

Ya, it's a bunch of two- and three-letter things like FLE -> FILE. We started enabling codespell automatically on a bunch of repos, and people have been "fixing" typos in their testing/dummy certs and then wondering why the certs are broken/invalid.

I don't think moving them to the code dictionary would work, as fle is likely a typo.

Looking at my specific example, the cert has a line FLE+blah and FLE is being flagged. It seems like + is a delimiter like space, so FLE is considered a word, but I wonder if it should be?
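A minimal sketch of why FLE ends up as its own word, assuming a simplified word pattern (the `WORD` regex and `split_words` helper below are illustrative, not necessarily codespell's exact default):

```python
import re

# Assumed, simplified word pattern: letters, digits, underscore, hyphen and
# apostrophe form a word. Anything else, including "+", "/" and "=", acts
# as a separator between words.
WORD = re.compile(r"[\w\-']+")

def split_words(line):
    """Return the word-like tokens found in a line."""
    return WORD.findall(line)

print(split_words("FLE+blah/xyz=="))  # ['FLE', 'blah', 'xyz']
print(split_words("foo+bar=baz"))     # ['foo', 'bar', 'baz']
```

With a pattern like this, a Base64 line and an expression such as foo+bar=baz are split the same way, so the tokenizer alone cannot tell them apart.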

peternewman · 2021-11-11T12:14:36Z

> Ya, it's a bunch of two- and three-letter things like FLE -> FILE. We started enabling codespell automatically on a bunch of repos, and people have been "fixing" typos in their testing/dummy certs and then wondering why the certs are broken/invalid.

Oh dear. I was going to suggest something clever for hex, then realised it's base 64 so that's a non-starter.

> I don't think moving them to the code dictionary would work, as fle is likely a typo.

Yeah agreed, again if it was just hex we could do clever stuff, but it's every typo.

> Looking at my specific example, the cert has a line FLE+blah and FLE is being flagged. It seems like + is a delimiter like space, so FLE is considered a word, but I wonder if it should be?

I think you want it to be, so you catch typos in your variables when you're doing foo+bar=baz.

I'm sort of ambivalent about this personally. Perhaps we should have a straight vote: 👍 or 👎 on @clickthisnick's first post in this topic as to whether we should change the default skip (when nothing is set) to include these types of files.

If we do so, we should probably make sure it logs the files it's skipping by default, so we're not silently hiding some typos.

matkoniecz · 2021-11-12T08:09:39Z

If skipping were done automatically, would there be any way to actually scan .pem/.crt files?

I see no way of overriding skip in the parameters (which could be useful BTW, though the workaround of multiple codespell runs is also viable).

And codespell */**/*.crt would not scan a .crt file two folders deep.

clickthisnick · 2021-11-12T16:15:54Z

After reading the Jupyter notebook filter issue, I think having to maintain and include a bunch of custom file extensions in the core product would be annoying and time consuming.

For my use case, we had a script add the codespell config to repos (via pre-commit). We can definitely just ignore the specific extensions we have found to be problematic in our specific environment, rather than make this tool much more complicated.

clickthisnick · 2021-11-12T16:16:42Z

I'm okay with closing this issue and saying it's up to the users to use the tool in the way they best see fit, rather than editing the tool to take a non-intuitive action for each specific case.

peternewman · 2021-11-13T16:00:56Z

> If skipping were done automatically, would there be any way to actually scan .pem/.crt files?

Possibly not with how it's written currently, but we could set things up so the default skip argument was to skip those two extensions (and maybe .git)? If you then supplied any skip argument, the default would be cancelled, but you could still skip those extensions manually there, alongside whatever else you wanted to skip.
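A rough sketch of that proposal, assuming hypothetical names (`DEFAULT_SKIP`, `effective_skip`, `split_files`); this is not how codespell is implemented, only an illustration of "defaults apply until the user supplies a skip argument", with the skipped files reported rather than hidden:

```python
from fnmatch import fnmatch

# Hypothetical defaults taken from the proposal; not codespell's actual behaviour.
DEFAULT_SKIP = ["*.pem", "*.crt", ".git"]

def effective_skip(user_skip=None):
    """Return the active skip patterns; any user-supplied value replaces the defaults."""
    if user_skip is None:
        return list(DEFAULT_SKIP)
    return [p.strip() for p in user_skip.split(",") if p.strip()]

def split_files(filenames, patterns):
    """Split filenames into (to_scan, skipped) and report what was skipped."""
    to_scan, skipped = [], []
    for name in filenames:
        if any(fnmatch(name, p) for p in patterns):
            skipped.append(name)
        else:
            to_scan.append(name)
    for name in skipped:
        print(f"skipping {name}")
    return to_scan, skipped

files = ["README.rst", "server.pem", "ca.crt"]
split_files(files, effective_skip())                     # defaults: certs skipped
split_files(files, effective_skip("*.svg"))              # explicit skip: certs scanned again
split_files(files, effective_skip("*.svg,*.pem,*.crt"))  # certs skipped manually
```

Replacing rather than extending the defaults keeps a way to scan certificates (pass any skip value), at the cost of having to re-list the extensions whenever a custom skip is used.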

> After reading the Jupyter notebook filter issue, I think having to maintain and include a bunch of custom file extensions in the core product would be annoying and time consuming.

Personally I wouldn't be so against it for something like this, which has a far broader usage, at least in the sense nearly everyone uses certs, but perhaps not many people scan them with Codespell. I guess we need to work out if they are extensions to codespell (i.e. special processing via a module/function when it matches a particular type of file), or using codespell in external tools.

> For my use case, we had a script add the codespell config to repos (via pre-commit). We can definitely just ignore the specific extensions we have found to be problematic in our specific environment, rather than make this tool much more complicated.

That's great. You could also possibly look at an ignore regex to match the header, base64, footer pattern, which would still find typos elsewhere in those files.
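As a sketch of what such a pattern might look like, assuming a line-by-line filter: the `PEM_LINE` regex and `checkable_lines` helper are illustrative only, and whether the same pattern can be passed unchanged to codespell's ignore regex option would need checking against the README.

```python
import re

# Hypothetical pattern: either a PEM header/footer line, or a line made up
# only of Base64 characters (at least 16 of them) with optional "=" padding.
PEM_LINE = re.compile(
    r"^(-----(BEGIN|END) [A-Z0-9 ]+-----|[A-Za-z0-9+/]{16,}={0,2})$"
)

def checkable_lines(text):
    """Return the lines that still look like prose worth spell checking."""
    return [
        line for line in text.splitlines()
        if line.strip() and not PEM_LINE.match(line.strip())
    ]

sample = """\
This is an exmaple certificate.
-----BEGIN CERTIFICATE-----
MIIBFLE+blahblahblahblahblahblah
AAAAAAAAAAAAAAAAAAAA==
-----END CERTIFICATE-----
"""

print(checkable_lines(sample))  # ['This is an exmaple certificate.']
```

The length threshold is a trade-off: a short final Base64 line would slip through and still be checked, while a long unbroken identifier elsewhere in the file would be ignored.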

peternewman added the enhancement label Nov 11, 2021

peternewman mentioned this issue Nov 12, 2021

Jupyter notebook filter to only spell check inside cell inputs #2138

Open
