-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Add a doc_paragraphs_missing_punctuation
lint
#15758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
7fc38a7
abdedec
7678f3b
e34a90b
36df5d6
d35076e
f246313
2ec2671
53fd302
76fb0ce
52aa4a5
2f9342e
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,126 @@ | ||
use rustc_errors::Applicability; | ||
use rustc_lint::LateContext; | ||
use rustc_resolve::rustdoc::main_body_opts; | ||
|
||
use rustc_resolve::rustdoc::pulldown_cmark::{Event, Options, Parser, Tag, TagEnd}; | ||
|
||
use super::{DOC_PARAGRAPHS_MISSING_PUNCTUATION, Fragments}; | ||
|
||
const MSG: &str = "doc paragraphs should end with a terminal punctuation mark"; | ||
const PUNCTUATION_SUGGESTION: char = '.'; | ||
|
||
pub fn check(cx: &LateContext<'_>, doc: &str, fragments: Fragments<'_>) { | ||
for missing_punctuation in is_missing_punctuation(doc) { | ||
match missing_punctuation { | ||
MissingPunctuation::Fixable(offset) => { | ||
// This ignores `#[doc]` attributes, which we do not handle. | ||
if let Some(span) = fragments.span(cx, offset..offset) { | ||
clippy_utils::diagnostics::span_lint_and_sugg( | ||
cx, | ||
DOC_PARAGRAPHS_MISSING_PUNCTUATION, | ||
span, | ||
MSG, | ||
"end the doc comment with some punctuation", | ||
PUNCTUATION_SUGGESTION.to_string(), | ||
Applicability::MaybeIncorrect, | ||
); | ||
} | ||
}, | ||
MissingPunctuation::Unfixable(offset) => { | ||
// This ignores `#[doc]` attributes, which we do not handle. | ||
if let Some(span) = fragments.span(cx, offset..offset) { | ||
clippy_utils::diagnostics::span_lint_and_help( | ||
cx, | ||
DOC_PARAGRAPHS_MISSING_PUNCTUATION, | ||
span, | ||
MSG, | ||
None, | ||
"end the doc comment with some punctuation", | ||
); | ||
} | ||
}, | ||
} | ||
} | ||
} | ||
|
||
#[must_use] | ||
/// If punctuation is missing, returns the offset where new punctuation should be inserted. | ||
fn is_missing_punctuation(doc_string: &str) -> Vec<MissingPunctuation> { | ||
// The colon is not exactly a terminal punctuation mark, but this is required for paragraphs that | ||
// introduce a table or a list for example. | ||
const TERMINAL_PUNCTUATION_MARKS: &[char] = &['.', '?', '!', '…', ':']; | ||
|
||
let mut no_report_depth = 0; | ||
let mut missing_punctuation = Vec::new(); | ||
let mut current_paragraph = None; | ||
|
||
for (event, offset) in | ||
Parser::new_ext(doc_string, main_body_opts() - Options::ENABLE_SMART_PUNCTUATION).into_offset_iter() | ||
Comment on lines
+57
to
+58
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Double parsing every doc comment is not ideal, could this be tweaked into a state machine struct that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Before adding support for checking each paragraph, I had added a short-circuit in cases where punctuation was obviously correct. With this new check this is not possible anymore however. What are the performance implications when the lint is not enabled by users? Does it still run? It is my impression that this used to be the case but is not anymore. If not, do you really want this in this PR, or can this be left for a separate refactor, to avoid modifying multiple lints at once? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It runs as long as any of the lints in the Yeah sure it can be deferred, I just spotted that there's another lint that also separately parses the comments in this module so we could do them both at once |
||
{ | ||
match event { | ||
Event::Start( | ||
Tag::CodeBlock(..) | ||
| Tag::FootnoteDefinition(_) | ||
| Tag::Heading { .. } | ||
| Tag::HtmlBlock | ||
| Tag::List(..) | ||
| Tag::Table(_), | ||
) => { | ||
no_report_depth += 1; | ||
}, | ||
Event::End(TagEnd::FootnoteDefinition) => { | ||
no_report_depth -= 1; | ||
}, | ||
Event::End( | ||
TagEnd::CodeBlock | TagEnd::Heading(_) | TagEnd::HtmlBlock | TagEnd::List(_) | TagEnd::Table, | ||
) => { | ||
no_report_depth -= 1; | ||
current_paragraph = None; | ||
}, | ||
Event::InlineHtml(_) | Event::Start(Tag::Image { .. }) | Event::End(TagEnd::Image) => { | ||
current_paragraph = None; | ||
}, | ||
Event::End(TagEnd::Paragraph) => { | ||
if let Some(mp) = current_paragraph { | ||
missing_punctuation.push(mp); | ||
} | ||
}, | ||
Event::Code(..) | Event::Start(Tag::Link { .. }) | Event::End(TagEnd::Link) | ||
if no_report_depth == 0 && !offset.is_empty() => | ||
{ | ||
if doc_string[..offset.end] | ||
.trim_end() | ||
.ends_with(TERMINAL_PUNCTUATION_MARKS) | ||
{ | ||
current_paragraph = None; | ||
} else { | ||
current_paragraph = Some(MissingPunctuation::Fixable(offset.end)); | ||
} | ||
}, | ||
Event::Text(..) if no_report_depth == 0 && !offset.is_empty() => { | ||
let trimmed = doc_string[..offset.end].trim_end(); | ||
if trimmed.ends_with(TERMINAL_PUNCTUATION_MARKS) { | ||
current_paragraph = None; | ||
} else if let Some(t) = trimmed.strip_suffix(|c| c == ')' || c == '"') { | ||
if t.ends_with(TERMINAL_PUNCTUATION_MARKS) { | ||
// Avoid false positives. | ||
current_paragraph = None; | ||
} else { | ||
current_paragraph = Some(MissingPunctuation::Unfixable(offset.end)); | ||
} | ||
} else { | ||
current_paragraph = Some(MissingPunctuation::Fixable(offset.end)); | ||
} | ||
}, | ||
_ => {}, | ||
} | ||
} | ||
|
||
missing_punctuation | ||
} | ||
|
||
#[derive(Debug, Copy, Clone, PartialEq, Eq)] | ||
enum MissingPunctuation { | ||
Fixable(usize), | ||
Unfixable(usize), | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,172 @@ | ||
#![feature(custom_inner_attributes)] | ||
#![rustfmt::skip] | ||
#![warn(clippy::doc_paragraphs_missing_punctuation)] | ||
|
||
/// Returns the Answer to the Ultimate Question of Life, the Universe, and Everything. | ||
//~^ doc_paragraphs_missing_punctuation | ||
fn answer() -> i32 { | ||
42 | ||
} | ||
|
||
/// The `Option` type. | ||
//~^ doc_paragraphs_missing_punctuation | ||
// Triggers even in the presence of another attribute. | ||
#[derive(Debug)] | ||
enum MyOption<T> { | ||
/// No value. | ||
//~^ doc_paragraphs_missing_punctuation | ||
None, | ||
/// Some value of type `T`. | ||
Some(T), | ||
} | ||
|
||
// Triggers correctly even when interleaved with other attributes. | ||
/// A multiline | ||
#[derive(Debug)] | ||
/// doc comment: | ||
/// only the last line triggers the lint. | ||
//~^ doc_paragraphs_missing_punctuation | ||
enum Exceptions { | ||
/// Question marks are fine? | ||
QuestionMark, | ||
/// Exclamation marks are fine! | ||
ExclamationMark, | ||
/// Ellipses are ok too… | ||
Ellipsis, | ||
/// HTML content is however not checked: | ||
/// <em>Raw HTML is allowed as well</em> | ||
RawHtml, | ||
/// The raw HTML exception actually does the right thing to autolinks: | ||
/// <https://spec.commonmark.org/0.31.2/#autolinks>. | ||
//~^ doc_paragraphs_missing_punctuation | ||
MarkdownAutolink, | ||
/// This table introduction ends with a colon: | ||
/// | ||
/// | Exception | Note | | ||
/// | -------------- | ----- | | ||
/// | Markdown table | A-ok | | ||
MarkdownTable, | ||
/// Here is a snippet. | ||
//~^ doc_paragraphs_missing_punctuation | ||
/// | ||
/// ``` | ||
/// // Code blocks are no issues. | ||
/// ``` | ||
CodeBlock, | ||
} | ||
|
||
// Check the lint can be expected on a whole enum at once. | ||
#[expect(clippy::doc_paragraphs_missing_punctuation)] | ||
enum Char { | ||
/// U+0000 | ||
Null, | ||
/// U+0001 | ||
StartOfHeading, | ||
} | ||
|
||
// Check the lint can be expected on a single variant without affecting others. | ||
enum Char2 { | ||
#[expect(clippy::doc_paragraphs_missing_punctuation)] | ||
/// U+0000 | ||
Null, | ||
/// U+0001. | ||
//~^ doc_paragraphs_missing_punctuation | ||
StartOfHeading, | ||
} | ||
|
||
mod module { | ||
//! Works on | ||
//! inner attributes too. | ||
//~^ doc_paragraphs_missing_punctuation | ||
} | ||
|
||
enum Trailers { | ||
/// Sometimes the last sentence ends with parentheses (and that's ok). | ||
ParensPassing, | ||
/// (Sometimes the last sentence is in parentheses.) | ||
SentenceInParensPassing, | ||
/// **Sometimes the last sentence is in bold, and that's ok.** | ||
DoubleStarPassing, | ||
/// **But sometimes it is missing a period.** | ||
//~^ doc_paragraphs_missing_punctuation | ||
DoubleStarFailing, | ||
/// _Sometimes the last sentence is in italics, and that's ok._ | ||
UnderscorePassing, | ||
/// _But sometimes it is missing a period._ | ||
//~^ doc_paragraphs_missing_punctuation | ||
UnderscoreFailing, | ||
/// This comment ends with "a quote." | ||
AmericanStyleQuotePassing, | ||
/// This comment ends with "a quote". | ||
BritishStyleQuotePassing, | ||
} | ||
|
||
/// Doc comments can end with an [inline link](#anchor). | ||
//~^ doc_paragraphs_missing_punctuation | ||
struct InlineLink; | ||
|
||
/// Some doc comments contain [link reference definitions][spec]. | ||
//~^ doc_paragraphs_missing_punctuation | ||
/// | ||
/// [spec]: https://spec.commonmark.org/0.31.2/#link-reference-definitions | ||
struct LinkRefDefinition; | ||
|
||
// List items do not always need to end with a period. | ||
enum UnorderedLists { | ||
/// This list has an introductory sentence: | ||
/// | ||
/// - A list item | ||
Dash, | ||
/// + A list item | ||
Plus, | ||
/// * A list item | ||
Star, | ||
} | ||
|
||
enum OrderedLists { | ||
/// 1. A list item | ||
Dot, | ||
/// 42) A list item | ||
Paren, | ||
} | ||
|
||
/// Doc comments with trailing blank lines are supported. | ||
//~^ doc_paragraphs_missing_punctuation | ||
/// | ||
struct TrailingBlankLine; | ||
|
||
/// This doc comment has multiple paragraph. | ||
/// This first paragraph is missing punctuation. | ||
//~^ doc_paragraphs_missing_punctuation | ||
/// | ||
/// The second one as well | ||
/// And it has multiple sentences. | ||
//~^ doc_paragraphs_missing_punctuation | ||
/// | ||
/// Same for this third and last one. | ||
//~^ doc_paragraphs_missing_punctuation | ||
struct MultiParagraphDocComment; | ||
|
||
/// ``` | ||
struct IncompleteBlockCode; | ||
|
||
/// This ends with a code `span`. | ||
//~^ doc_paragraphs_missing_punctuation | ||
struct CodeSpan; | ||
|
||
#[expect(clippy::empty_docs)] | ||
/// | ||
struct EmptyDocComment; | ||
|
||
/** | ||
* Block doc comments work. | ||
* | ||
*/ | ||
//~^^^ doc_paragraphs_missing_punctuation | ||
struct BlockDocComment; | ||
|
||
/// Sometimes a doc attribute is used for concatenation | ||
/// ``` | ||
#[doc = ""] | ||
/// ``` | ||
struct DocAttribute; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding catching individual paragraphs, would it suffice to track the unconditional depth of start/end tags and check when it goes from 1 -> 0 due to a
TagEnd::Paragraph
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, the "rules" for what makes sense in the middle of a comment are different from the rules that make sense at the end. For example, here's a few crates where a paragraph ends in a colon:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see that being a problem, it would no longer lint on a final paragraph ending in
:
but that doesn't seem like it would be common enough to need a lintThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've finally gotten around to implement this: as you suggested, each paragraph end is now checked, and colons are allowed in that position. I've of course updated the tests to reflect that. I also had to refactor a bit to be able to return multiple lint error for each doc comment.
Do you think the lint should be renamed, maybe by removing the word "terminal", because colons are not technically terminal punctuation (and that would make the name shorter)?
(As rustbot shows, I had to rebase to take into account the new way of importing
pulldown_cmark
.)Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that sounds good, maybe also specify that it's about paragraphs now with something like
doc_paragraphs_missing_punctuation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, I've renamed the lint: every occurrence in source code, the changelog entry in the PR description, and the PR title. We unfortunately can't rename the branch.
I think this is now ready for review; tell me if you want me to squash the commits (because their titles barely make sense now) while of course properly preserving authorship.