Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

url part of footnotes and links shall not be checked #97

Closed
drahnr opened this issue Aug 20, 2020 · 5 comments · Fixed by #105
Closed

url part of footnotes and links shall not be checked #97

drahnr opened this issue Aug 20, 2020 · 5 comments · Fixed by #105
Assignees
Labels
bug Something isn't working checker / hunspell hunspell checker related topics checker generic checker topics help wanted 🤝 Extra attention is needed
Milestone

Comments

@drahnr
Copy link
Owner

drahnr commented Aug 20, 2020

Describe the bug

Links are also tokenized and spellchecked, which leads to false positives

error: spellcheck(Hunspell)
   --> /tmp/build/83e750f6/git-spearow-juice/coaster/src/frameworks/opencl/program.rs:10
    |
 10 |  [binary]: ../../binary/index.html
    |                               ^^^^
    | - HTML
    |
    |   Possible spelling mistake found.

with a source snippet

#[derive(Debug, Copy, Clone)]
/// Defines a OpenCL Program.
///
/// A Program is OpenCL's version of Coaster's [binary][binary].
/// [binary]: ../../binary/index.html
pub struct Program {
    id: isize,
}

To Reproduce

Steps to reproduce the behaviour:

  1. Run 'cargo spellcheck -vvv' on a doc comment that contains a footnote / link
  2. See error

Expected behavior

Ignore the url part of the link index, the title of the link is already checked of the inline test but for sake of interactive convenience and not breaking the structure, it should be suggested in both cases.

Please complete the following information:

  • System: Fedora 32 x86_64
  • Obtained: git
  • Version: ab2819f595872ea12d52b823f7d09a78c91085eb

Additional context
https://ci.spearow.io/teams/main/pipelines/cargo-spellcheck/jobs/pr-validate/builds/282

@drahnr drahnr added the bug Something isn't working label Aug 20, 2020
@drahnr drahnr added this to the v0.5.0 milestone Aug 20, 2020
@drahnr drahnr self-assigned this Aug 20, 2020
@drahnr
Copy link
Owner Author

drahnr commented Aug 20, 2020

@laysauchoa another topic that would be up for grabs :) , it's a bit more involved with the common mark parser interaction, but limited in scope

Preparation steps I would imagine:

  • create a unit test to make this easily reproducible
  • investigate how the link is represented in the common mark parser (possibly using the debugger and stepping through the unit test)

@drahnr drahnr changed the title footnotes and links shall not be tokenized url part of footnotes and links shall not be checked Aug 20, 2020
@drahnr drahnr added checker generic checker topics checker / hunspell hunspell checker related topics help wanted 🤝 Extra attention is needed labels Aug 20, 2020
@laysauchoa
Copy link
Collaborator

cc @drahnr

@drahnr
Copy link
Owner Author

drahnr commented Aug 20, 2020

There is the markdown.rs / struct PlainOverlay which is kind of a filtering view put on top of a CheckableChunk which erase all markdown syntax, we could simply mask the url part in the parser output there (not sure where exactly the root of the issue is there, investigation needed on how pulldown_cmark works and which Tags and Events are yielded).

All the common mark / markdown parsing is done there, so the solution is guaranteed to be in that file :)

#44 is loosely related, it checks if the referenced file or website exists, where this is for the string representing the url rather than the destination.

The knowledge required to fix #44 and #97 is pretty much the same :) so it would make sense to tackle one after another.

@laysauchoa
Copy link
Collaborator

cc @drahnr
Sorry if I am asking something repetitive just to be clear about it. When you say to tackle one after another, do the previous issue comes first or this one? Which one you recommend doing it first?

Many thanks!

@drahnr
Copy link
Owner Author

drahnr commented Aug 21, 2020

This one first, it has practical impact, #44 is a nice to have feature.
What I meant is: Finish this one before starting the other, so the commits are clearly separated and their purpose is clear.

laysauchoa added a commit to laysauchoa/cargo-spellcheck that referenced this issue Aug 23, 2020
Handle links for being tokenized and spellchecked,
which can lead to false positives.

Issue description:
drahnr#97

Signed-off-by: Laysa Uchoa <laysa.uchoa@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working checker / hunspell hunspell checker related topics checker generic checker topics help wanted 🤝 Extra attention is needed
Projects
None yet
2 participants