New Pod::Spell ignores stopwords in pod #9
Comments
|
weird... will look this weekend |
|
I didn't have appropriate tuit levels this weekend. There's no chance you could write a failing test, is there? And since I noticed you commented on the "ordered stopwords" problem, do you think that has something to do with this (possibly it is not exacerbated by lexical scoping of stopwords)? |
|
xt/author/pod-spell.t from YAML-Tiny 1.55 will trigger this issue; the word "embeddable" is listed as a stopword in lib/YAML/Tiny.pm but it's still complained about. |
|
I'll poke at it a bit as well and see what I find. |
|
Got it: "easily-embeddable". In 1.07, learned words are now isolated to an object; the global stoplist is not modified. During processing, Pod::Spell and Pod::Wordlist grab chunks of \S+ to test against the wordlist (after a tiny bit of cleanup). The chunk here is "easily-embeddable", which is not itself a stopword, so "embeddable" doesn't get stripped. The spellchecker does split on the hyphen and reports the bad part. Test::Spelling then does a second pass against the global stoplist, which never learned "embeddable". Options:
There's also the option of modifying the global stopword list when learning, but I tend to think we should stick with a gradual process of making the global stopword list private. I'll see if I can bang out a test and fix for the Pod::Wordlist part, but someone should go write a patch for Test::Spelling to use the new API and stop messing with the global, because it is likely to become a lexical at some point anyway. |
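The failure mode described above can be modelled in a few lines. This is only an illustrative sketch in Python, not the Perl code of Pod::Spell, Pod::Wordlist, or Test::Spelling: the function names, the toy dictionary, and the sample sentence are all hypothetical, and the "cleanup" step is a guess at the general idea rather than the real rules.

```python
import re

def strip_stopwords(text, stopwords):
    """Drop whitespace-delimited chunks (\\S+) that exactly match a stopword."""
    kept = []
    for chunk in text.split():
        # Hypothetical stand-in for the "tiny bit of cleanup": trim outer punctuation.
        word = chunk.strip(".,;:!?\"'()")
        if word.lower() not in stopwords:
            kept.append(chunk)
    return " ".join(kept)

def spellcheck(text, dictionary):
    """Toy speller: splits on non-letters (hyphens included) and reports unknown parts."""
    return [w for w in re.findall(r"[A-Za-z]+", text) if w.lower() not in dictionary]

local_stopwords = {"embeddable"}   # learned from the POD, private to one object
global_stopwords = set()           # the global stoplist never learned "embeddable"
dictionary = {"an", "easily", "parser", "yaml"}

pod_text = "An easily-embeddable YAML parser"

# Pass 1: the chunk "easily-embeddable" is not a stopword, so nothing is stripped.
stripped = strip_stopwords(pod_text, local_stopwords)

# The speller splits on the hyphen and complains about the second half.
reported = spellcheck(stripped, dictionary)

# Pass 2: the report is filtered against the global stoplist, which is empty here.
still_failing = [w for w in reported if w.lower() not in global_stopwords]
```

In this model, the word only disappears from the final report if the second pass consults the same per-object wordlist that learned it, or if the first pass also tests the hyphen-separated parts of each chunk.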
|
https://rt.cpan.org/Ticket/Display.html?id=57492 < guessing this may or may not be relevant. But something to consider. It's very likely that Pod::Spell needs to make 2 passes, which is what I thought was the cause of that issue a while back. |
|
That will do it. Now we just need it shipped. |
|
I still have a bit of an issue here. On Fedora, the default speller is hunspell rather than aspell, and since Fedora 17 its behaviour is to complain about a few things in the YAML::Tiny Pod, such as: and Given this, hunspell reports two mis-spellings: Now it's arguable that this is an issue with hunspell and/or its dictionaries, but regardless of that I should be able to shut it up by specifying stopwords. If I add stopwords embedded in the Pod, the test still fails, but if I add the two failing character sequences to the list of stopwords at the end of xt/author/pod-spell.t, the test passes. So there's an inconsistency there. |
|
@pghmcfc assuming that's Test::Spelling (or a dzil plugin that uses it), you're possibly seeing the "two pass" approach that Test::Spelling uses. First it runs POD through Pod::Spell to strip stopwords, then puts it to the spell checker, and then checks the result against the stopwords again to catch things the dictionary might have split up. I think what's going on for I'm not sure about the David |
|
I've pushed up some commits that deal with these issues, mostly by stripping out "words" that are all punctuation or just a lone |
|
it appears some of the changes have somehow caused a regression |
|
nvm, it's not so much of a regression as an oversight, stoplist didn't used to be in the pod. (lol DOH!) |
|
With Pod::Spell 1.09 I'm now seeing test failures on Fedora prior to Fedora 17 (and all Red Hat Enterprise Linux versions) in Pod-Spell's own spell test: |
|
@pghmcfc, that's a weird bug, but can I step back and ask why you're running release and author tests? Per the Lancaster Consensus definitions of the semantics of those variables, there's really no reason to have it set (unless you're actively developing the module):
|
|
I'm the downstream maintainer of about 100 modules in Fedora, including a bunch of testing modules like Test::Spelling, so I tend to run author/release tests where they're provided as they make for good test cases for the test modules (and their dependencies) themselves. That's why I run them despite knowing that I'm not really supposed to do it. |
|
Fair enough. Looking at the Pod::Spell pod, I see where that string is, but I can't see why it's not getting flagged for me. I wonder if it's a matter of the version of Pod::Parser that's doing it. I'll see if I can replicate it somehow. |
|
I just checked Pod::Parser, no dice, and that version isn't even on CPAN anymore. In an effort to see if it goes away... I suspect this is a bug in a really old version of something. Also, we have no Unicode tests in the test suite; we should probably fix that. |
|
Found it. It's an encoding issue. The test should be failing, but somehow, on my setup, it's not. Why it's failing for @pghmcfc is probably some sort of locale issue or aspell version or something. I'll open a separate ticket. |
|
Possibly; I don't see failures on recent (last 12 months) versions of Fedora but I do everywhere else. Locale should be the same in all cases but aspell/hunspell (which we use in Fedora where possible) may have different versions/dictionaries. I've also come across another issue with 1.09, this time in our development version with 5.18.1 and mostly up to date versions of everything, running the spell check for Perl-OSType: Looks like the trailing single quote hasn't been stripped. |
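A rough sketch of the kind of cleanup that would avoid the trailing-quote symptom, written in Python purely for illustration. It is not what Pod::Spell or Test::Spelling actually do, and the helper names and sample tokens (including "frobnicate") are made up: the idea is simply to trim non-word characters from both ends of each reported token before looking it up, and to discard tokens that are nothing but punctuation.

```python
import re

def clean_token(token):
    """Trim leading/trailing non-word characters; inner apostrophes and hyphens survive."""
    return re.sub(r"^\W+|\W+$", "", token)

def filter_reported(reported, stopwords):
    """Return the reported words that are neither punctuation-only nor stopwords."""
    result = []
    for token in reported:
        word = clean_token(token)
        # Skip tokens that were all punctuation, and tokens covered by the stoplist.
        if word and word.lower() not in stopwords:
            result.append(word)
    return result

# Hypothetical speller output: a stopword with a trailing quote, two
# punctuation-only tokens, and a genuine word with an inner apostrophe.
stopwords = {"frobnicate"}
reported = ["frobnicate'", "--", "doesn't", "'"]
remaining = filter_reported(reported, stopwords)
```

With that trimming in place, a stopword followed by a stray quote matches its stoplist entry instead of being reported as a new misspelling.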
|
Trailing ' still a problem: |
|
Ugh. I've taken another shot at this with f340036. The lack of a decent test corpus hurts, so the more of these we discover, the easier it is to fix. |
|
That's cracked it. Current git master runs its own test suite plus those of YAML-Tiny and Perl-OSType successfully for me with current Test-Spelling on all Fedora releases from Fedora 3 (5.8.5) to the current development version (5.18.1), plus Red Hat Enterprise Linux 4, 5 and 6. Thanks! |
When I upgraded from Pod-Spell-1.06 to 1.07, stopwords embedded in the pod stopped being recognized.
Downgrading made things work again.
More details in xenoterracide/Dist-Zilla-Plugin-Test-PodSpelling#18