Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Greatly simplify doctest parsing and information extraction #138104

Merged
merged 12 commits into from
Mar 28, 2025

Conversation

GuillaumeGomez
Copy link
Member

The original process was pretty terrible, as it tried to extract information such as attributes by performing matches over tokens like #!, which doesn't work very well considering you can have # ! [, which is valid.

Also, it now does it in one pass: if the parser is happy, then we try to extract information, otherwise we return early.

r? @fmease

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Mar 6, 2025
@GuillaumeGomez GuillaumeGomez force-pushed the simplify-doctest-parsing branch from 70b66bc to 4d9a209 Compare March 6, 2025 17:51
@fmease fmease added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 6, 2025
@GuillaumeGomez GuillaumeGomez force-pushed the simplify-doctest-parsing branch from 543cc23 to 7e403e8 Compare March 7, 2025 15:17
@rustbot rustbot added the T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output. label Mar 7, 2025
@GuillaumeGomez GuillaumeGomez force-pushed the simplify-doctest-parsing branch 2 times, most recently from fe7ed38 to 5b53cea Compare March 7, 2025 20:57
@rustbot rustbot added the A-run-make Area: port run-make Makefiles to rmake.rs label Mar 7, 2025
@rustbot
Copy link
Collaborator

rustbot commented Mar 7, 2025

This PR modifies run-make tests.

cc @jieyouxu

@GuillaumeGomez GuillaumeGomez force-pushed the simplify-doctest-parsing branch from 5b53cea to 0aa9752 Compare March 7, 2025 22:48
@GuillaumeGomez
Copy link
Member Author

CI finally passed! \o/

@GuillaumeGomez GuillaumeGomez added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. A-doctests Area: Documentation tests, run by rustdoc and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output. labels Mar 8, 2025
Copy link
Member

@fmease fmease left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Very cool. I have some comments but after that we should be good to go. Review comments starting with (preexisting) don't need to be addressed in this PR or by you for that matter

debug!("crate_attrs:\n{crate_attrs}{maybe_crate_attrs}");
debug!("crates:\n{crates}");
debug!("after:\n{everything_else}");

// If the AST returned an error, we don't want this doctest to be merged with the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated sentence: On parse errors, we already return early above yielding invalid which has has !can_be_merged and failed_ast.

@@ -401,3 +402,26 @@ fn check_split_args() {
compare("a\n\t \rb", &["a", "b"]);
compare("a\n\t1 \rb", &["a", "1", "b"]);
}

#[test]
fn comment_in_attrs() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, what has this test to do with comments? There are no comments, only an inner doc comment – which also isn't "in [an] attr"


#[test]
fn comment_in_attrs() {
// if input already has a fn main, it should insert a space before it
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused what does this comment have to do with the input string? There is no fn main in it and there's also no space getting inserted before anything (I can only see an extra empty line in the generated main fn)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll rewrite the comment to explain better the problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum also the comment is wrong. XD

Comment on lines 336 to 337
// We need to shift by 1 because we added `{` at the beginning of the source.we provided
// to the parser.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// We need to shift by 1 because we added `{` at the beginning of the source.we provided
// to the parser.
// We need to shift by 1 because we added `{` at the beginning of the
// source we provided to the parser.

prev_span_hi: &mut Option<usize>,
) {
let extra_len = DOCTEST_CODE_WRAPPER.len();
// We need to shift by 1 because we added `{` at the beginning of the source.we provided
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated comment? We added more than { and we don't shift by 1

AstError,
Ok,
}

fn cancel_error_count(psess: &ParseSess) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(preexisting) This would ideally also be called reset_error_count canceling counting and resetting counting are different things >.<

{
PartitionState::Crates
// We add potential backlines/comments if there are some in items generated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

backlines? line breaks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading my comment destroyed my brain. What the hell happened when I wrote it? O.o

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok got it, I definitely need to improve this...

Comment on lines +444 to 442
while let Some(token) = iter.next() {
if let TokenTree::Token(token, _) = token
&& let TokenKind::Ident(name, _) = token.kind
&& name == kw::Fn
&& let Some(TokenTree::Token(fn_token, _)) = iter.peek()
&& let TokenKind::Ident(fn_name, _) = fn_token.kind
&& fn_name == sym::main
&& let Some(TokenTree::Delimited(_, _, Delimiter::Parenthesis, _)) = {
iter.next();
iter.peek()
}
{
info.has_main_fn = true;
break;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the effort to try and get rod of the string-based fallback logic. It's a bit overkill maybe but I guess it's fine.

Tho note that this check will run a lot more often than the other check. E.g., it will check all macro calls given

//! ````
//! first!(...);
//! second!(...);
//! fn main() {}
//! ```

because we go in source order. The old string-based fallback check only kicked in when it couldn't find fn main anywhere else. I guess in practice it doesn't really matter and doing two passes would be even more overkill

s: &mut String,
source: &str,
span: rustc_span::Span,
prev_span_hi: &mut Option<usize>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
prev_span_hi: &mut Option<usize>,
prev_span_hi: &mut usize,

this could just be an usize right? starting off at 0 instead of None in parse_source

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough! I do .unwrap_or(0) with it so it's definitely useless to keep it in an Option...


fn push_to_s(
s: &mut String,
source: &str,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, if you just passed wrapped_source we wouldn't need to fix the any of the byte offsets, right? Could we do that instead please

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would force to clone wrapped_source since it's passed by value to new_parser_from_source_str. So I'd rather not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, missed that! Fair fair

@fmease fmease added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 27, 2025
@GuillaumeGomez GuillaumeGomez force-pushed the simplify-doctest-parsing branch from 0aa9752 to 796b8d6 Compare March 27, 2025 14:24
@rustbot rustbot added the T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output. label Mar 27, 2025
@rust-log-analyzer

This comment has been minimized.

@GuillaumeGomez
Copy link
Member Author

Ah interesting, rebasing allowed me to uncover a new bug. New regression test incoming. :D

…en first non-crate items and the crate items
@GuillaumeGomez GuillaumeGomez force-pushed the simplify-doctest-parsing branch from 796b8d6 to 5d27440 Compare March 27, 2025 16:43
@GuillaumeGomez
Copy link
Member Author

Applied suggestions. Since it was approved, let's go! :D

@bors r=fmease rollup

@bors
Copy link
Contributor

bors commented Mar 27, 2025

📌 Commit 5d27440 has been approved by fmease

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 27, 2025
jhpratt added a commit to jhpratt/rust that referenced this pull request Mar 27, 2025
…rsing, r=fmease

Greatly simplify doctest parsing and information extraction

The original process was pretty terrible, as it tried to extract information such as attributes by performing matches over tokens like `#!`, which doesn't work very well considering you can have `#   ! [`, which is valid.

Also, it now does it in one pass: if the parser is happy, then we try to extract information, otherwise we return early.

r? `@fmease`
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 28, 2025
Rollup of 5 pull requests

Successful merges:

 - rust-lang#138104 (Greatly simplify doctest parsing and information extraction)
 - rust-lang#139010 (Improve `xcrun` error handling)
 - rust-lang#139021 (std: get rid of pre-Vista fallback code)
 - rust-lang#139026 (Use `abs_diff` where applicable)
 - rust-lang#139030 (saethlin goes on vacation)

r? `@ghost`
`@rustbot` modify labels: rollup
@jhpratt
Copy link
Member

jhpratt commented Mar 28, 2025

@bors r-

#139033 (comment)

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Mar 28, 2025
@GuillaumeGomez
Copy link
Member Author

@bors r=fmease rollup

@bors
Copy link
Contributor

bors commented Mar 28, 2025

📌 Commit 87d524b has been approved by fmease

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 28, 2025
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 28, 2025
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#137889 (update outdated doc with new example)
 - rust-lang#138104 (Greatly simplify doctest parsing and information extraction)
 - rust-lang#138678 (rustc_resolve: fix instability in lib.rmeta contents)
 - rust-lang#138986 (feat(config): Add ChangeId enum for suppressing warnings)
 - rust-lang#139038 (Update target maintainers for thumb targets to reflect new REWG Arm team name)
 - rust-lang#139045 (bootstrap: update `test_find` test)
 - rust-lang#139047 (Remove ScopeDepth)

Failed merges:

 - rust-lang#139044 (bootstrap: Avoid cloning `change-id` list)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Mar 28, 2025
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#137889 (update outdated doc with new example)
 - rust-lang#138104 (Greatly simplify doctest parsing and information extraction)
 - rust-lang#138678 (rustc_resolve: fix instability in lib.rmeta contents)
 - rust-lang#138986 (feat(config): Add ChangeId enum for suppressing warnings)
 - rust-lang#139038 (Update target maintainers for thumb targets to reflect new REWG Arm team name)
 - rust-lang#139045 (bootstrap: update `test_find` test)
 - rust-lang#139047 (Remove ScopeDepth)

Failed merges:

 - rust-lang#139044 (bootstrap: Avoid cloning `change-id` list)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit ad87732 into rust-lang:master Mar 28, 2025
6 checks passed
@rustbot rustbot added this to the 1.87.0 milestone Mar 28, 2025
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 28, 2025
Rollup merge of rust-lang#138104 - GuillaumeGomez:simplify-doctest-parsing, r=fmease

Greatly simplify doctest parsing and information extraction

The original process was pretty terrible, as it tried to extract information such as attributes by performing matches over tokens like `#!`, which doesn't work very well considering you can have `#   ! [`, which is valid.

Also, it now does it in one pass: if the parser is happy, then we try to extract information, otherwise we return early.

r? `@fmease`
@GuillaumeGomez GuillaumeGomez deleted the simplify-doctest-parsing branch March 30, 2025 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-doctests Area: Documentation tests, run by rustdoc A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. T-rustdoc-frontend Relevant to the rustdoc-frontend team, which will review and decide on the web UI/UX output.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants