Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of numbered markdown lists. #5423

Merged

Conversation

anforowicz
Copy link

Fixes issue #5416

@ytmimi ytmimi self-assigned this Jun 30, 2022
Copy link
Contributor

@ytmimi ytmimi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your first contribution 🎉!

When you have a moment could you leave a comment referencing issue #5416 above these test cases so we can tie them back to the original issue. Something like:

//! This is a numbered list (see issue #5416):
//! 1. Long long long long long long long long long long long long long long long long long line
//! 2. Another very long long long long long long long long long long long long long long long line
//! 3. Nested list
//!    A. foo
//!    B. bar
//! 4. Last item

Overall I think these test cases are good. I would like to add a few more. When you have a chance could you add the following test cases:

  • Use ) instead of . since you also added that functionality.
  • Can we also add a really deeply nested list (3 or 4 levels deep)
  • Can we also have the deeply nested lists have enough content that they would wrap when wrap_comments=true is set.
  • It think it would be interesting to have a test case where we use both numbered and non numbered lists.

If you can think of any other test cases feel free to add them, but I think those additions would give us better test coverage!

@anforowicz
Copy link
Author

@ytmimi - could you PTAL again?

FWIW I don't use github daily, so please point out if there are some technical issues with how I addressed the feedback and/or how I attempted to rebase the PR on top of the tip of the tree. In particular, I am not sure if there are supposed to be 3 separate commits here: master...anforowicz:rustfmt:handling-numbered-lists (maybe I should instead amend the original commit used for the PR? or squash the 3 commits into one using some other mechanism...).

When you have a moment could you leave a comment referencing issue #5416 above these test cases [...]

Done

Overall I think these test cases are good. I would like to add a few more. When you have a chance could you add the following test cases:

Use ) instead of . since you also added that functionality.

Done (see the test that starts with: "Using the ')' instead of '.' character after the number")

  • Can we also have the deeply nested lists have enough content that they would wrap when wrap_comments=true is set.

Done (tweaked the original test - the one that now is called: "This is a numbered markdown list").

  • Can we also add a really deeply nested list (3 or 4 levels deep)
  • It think it would be interesting to have a test case where we use both numbered and non numbered lists.

Done (the last test - "Deep list that mixes various bullet and number formats").

Sigils at the start of a line

If you can think of any other test cases feel free to add them, but I think those additions would give us better test coverage!

I was wondering about sigils appearing at the start of a line in the middle of a paragraph. Some notes:

Maybe this should not be a list:

The Captain died in
1868.  He was buried in...

But this should start a list (making a special case for 1 vs non-1; bullet items are a bit unclear):

Our top priorities are
1. fix ordered lists

But then in a follow-up discussion it was pointed out that it is not clear if 1 vs non-1 should be special-cased - wouldn't we need to require a blank like before each and every item: https://talk.commonmark.org/t/blank-lines-before-lists-revisited/1990/16

So, I think I don't want to make any changes in this direction in this PR, but maybe there are some notes / questions to consider:

  • Would it be possible to eventually redesign comment processing so that it delegates markdown processing/wrapping to a separate tool or library? https://atom.io/packages/flowmark is one example (obviously this won't work because this is not a Rust library, nor a standalone executable AFAICT...).
  • Should the notes above be captured somewhere more permanent? A separate issue maybe?

@ytmimi
Copy link
Contributor

ytmimi commented Jul 6, 2022

First off, thanks for the thoughtful questions and all the linked discussion!

FWIW I don't use github daily, so please point out if there are some technical issues with how I addressed the feedback and/or how I attempted to rebase the PR on top of the tip of the tree.

Understood! I do appreciate you taking the time to make those updates when you logged back on and saw the notification. Everyone's go their own way of responding to feedback and in this particular case I think just responding with "Done" is totally fine. I think had the ask been more involved than updating tests some additional discussion / clarification might have been needed, but we're good here!

In particular, I am not sure if there are supposed to be 3 separate commits here: master...anforowicz:rustfmt:handling-numbered-lists (maybe I should instead amend the original commit used for the PR? or squash the 3 commits into one using some other mechanism...).

Just a little bit of background before directly answering your question. We follow the rust compilers no-merge policy. I've updated my git config to always rebase instead of merge using git config --global pull.rebase. You might consider updating your git config with the same command if you plan to do more work on rust 😁 If there are merge conflicts we need those to be resolved with a rebase instead of a merge.

You can look into interactive rebasing to squash your commits, but we can also just squash and merge the commits if you don't want to squash them yourself. In this case I think these changes make more sense as a single commit, but again don't feel obligated to squash them yourself.

I was wondering about sigils appearing at the start of a line in the middle of a paragraph

It's an interesting thought. Given that the main goal of rustfmt is to format rust code I don't see the need to dive too deeply into those details, especially if it complicates the implementation.

(AFAIU rustdoc uses the CommonMark Markdown specification):

rustdoc using pulldown-cmark to render markdown as HTML. See the Cargo.toml.

  • Would it be possible to eventually redesign comment processing so that it delegates markdown processing/wrapping to a separate tool or library?

Definitely something we'd be open to considering! This is also something I've thought about since we've got a few open issues related to markdown formatting. I briefly looked at pulldown-cmark-to-cmark, but haven't played around with trying to integrate it. I think the challenge with introducing a library to handle the markdown formatting is that we need to ensure that we don't introduce breaking formatting changes.

  • Should the notes above be captured somewhere more permanent? A separate issue maybe?
    Merge state

Feel free to open an issue if you'd like. As I mentioned, the main goal for rustfmt is to format rust code so markdown formatting is pretty low on the priority list and likely won't get much attention from the team any time soon as there's plenty on the backlog. That said, we'll certainly review PRs if there's anyone motivated enough to work on something like this.

ytmimi
ytmimi previously approved these changes Jul 6, 2022
Copy link
Contributor

@ytmimi ytmimi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work on this!

@anforowicz
Copy link
Author

@ytmimi - what are the next steps here? Are there any actions that I need to take at this point (maybe somehow unmerging/rebasing the commits in my fork)?

src/comment.rs Outdated Show resolved Hide resolved
@ytmimi
Copy link
Contributor

ytmimi commented Jul 9, 2022

@ytmimi - what are the next steps here? Are there any actions that I need to take at this point

Again, thanks for your work on this. I'm still thinking through a few things. The examples you listed in #5423 (comment) has got me thinking and I'd like to play around with some inputs. Specifically referring to

The Captain died in
1868.  He was buried in...

The issue you outlined there would also apply to the end of sentences since after this change we allow aaaaaa. to start a new itemized block. I'm thinking about what happens when we start a newline with the end of a sentence and then the next sentence needs to wrap?

Testing the following (curtesy of wikipedia) with your branch I get:

intput

//! Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air
//! moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH), Fe(OH)3), and is typically associated with the corrosion of refined iron.

output

//! Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction
//! of iron and oxygen in the catalytic presence of water or air
//! moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and
//!           iron(III) oxide-hydroxide (FeO(OH), Fe(OH)3), and is typically
//!           associated with the corrosion of refined iron

@ytmimi ytmimi dismissed their stale review July 9, 2022 20:06

After thinking on it a bit more I found an edge case that would ned to be addressed before we could move forward

@anforowicz
Copy link
Author

Some random notes:

  • The problem of mid-paragraph-sigils is not new, but indeed the PR means that the problem may happen more frequently ("word." seems more likely than "- word" at a beginning of a line).
  • I wonder if limiting the number of characters in a sigil might mitigate some of the concerns.
    • Limiting to 1 character would seem safest and still fairly functional (especially since using "1." for each item works fine for Common Mark - this would offer a workaround if "10." got misformatted by rustfmt).
    • We could also limit to at most 2 characters. It seems that having more than 3 characters in a sigil should cover very rarely (some discussion of roman numerals in Common Mark can be found here - it seems that they are not supported but I haven't honestly finished reading this discussion in depth).
  • Ultimately, what rustfmt does with itemized list (including what it may do after this PR with numbered lists) is somewhat ad-hoc and heuristics-based - in the long-term this is a problem that Common Mark needs to resolve (in https://talk.commonmark.org/t/blank-lines-before-lists-revisited/1990 or around).

@calebcartwright
Copy link
Member

Lots of good dialog here, thank you both!

IIRC, rustdoc is using cmark or some form of it anyway, and I'm not inherently opposed to us trying a radically different approach as instead of the current strategy of continuing to bolt things on to support additional cases as they arise.

However, any such radically different approach would need to be accompanied by some exhaustive tests (far more than what we have today) to avoid, or at least minimize, the amount of churn we would inflict on our users. To be clear, I think this testing would need to entail running rustfmt with new approach over various projects that are currently utilizing the feature. The associated config options that hit this in the first place are still unstable, but it's still incumbent upon us to be cautious given the potential impact.

Additionally, we'd also need to ensure continued interop with other relevant options and behavior, potential future behavior, and be mindful about any extra dependency bloat considering it's an opt-in feature.

@anforowicz
Copy link
Author

After thinking on it a bit more I found an edge case that would ned to be addressed before we could move forward

@ytmimi, could you please share more details about the edge case that needs to be addressed to unblock this PR?

(I might not reply for a while - I'll be away from a computer for a bit of summer vacation :-)

@ytmimi
Copy link
Contributor

ytmimi commented Jul 28, 2022

@anforowicz apologies for the delay. I have some reservations based on the case you outlined in Sigils At the start of a line, which I also demonstrated in #5423 (comment). I'm assuming there's a non zero chance that introducing these changes will cause existing doc comments to be poorly formatted, and I'd like to avoid that.

We could also limit to at most 2 characters

I like this as a workaroud to limiting the issue. Erring on the side of caution I'd also like to limit it to just numeric values, instead of also trying to support upper and lowercase ascii letters.

I feel like that's a good compromise that will address the issue you ran into in #5416

@ytmimi ytmimi added the p-low label Jul 28, 2022
@anforowicz
Copy link
Author

Thanks @ytmimi! Can you PTAL again?

(Again - my apologies for the hap-hazardous git and github - hopefully the separate commits in my fork can be merged as a single commit?)

@ytmimi
Copy link
Contributor

ytmimi commented Aug 28, 2022

@anforowicz no worries. I should have some time in the coming week to review this again.

@anforowicz
Copy link
Author

anforowicz commented Sep 26, 2022

@ytmimi - gentle ping? Maybe this got missed because this PR is incorrectly marked as pr-waiting-on-author.

@anforowicz
Copy link
Author

r? @cynecx

@anforowicz
Copy link
Author

@rustbot label -pr-waiting-on-author +pr-follow-up-review-pending

@rustbot
Copy link
Collaborator

rustbot commented Oct 5, 2022

Error: The feature relabel is not enabled in this repository.
To enable it add its section in the triagebot.toml in the root of the repository.

Please file an issue on GitHub at triagebot if there's a problem with this bot, or reach out on #t-infra on Zulip.

src/comment.rs Show resolved Hide resolved
src/comment.rs Outdated Show resolved Hide resolved
@anforowicz
Copy link
Author

@ytmimi - this PR should be ready for another review pass please.

In addition to resolving the PR feedback above I also did some other minor changes:

  • When resolving the PR feedback I realized that CommonMark calls the "*", "1." snippets "markers" rather than "sigils" so I also went ahead and did this rename. This is mostly contained within the code already touched by this PR, so hopefully okay.
  • And after a self-review I've also made some small tweaks of doc comments.

@ytmimi
Copy link
Contributor

ytmimi commented Oct 14, 2022

@anforowicz thanks for turning around the feedback so quickly. I haven't looked at the changes yet, but from your description above they seem reasonable. I also wanted to let you know that I approved #5560. We haven't merged it yet, but I think it makes sense to merge that PR before this one. There will likely be some merge conflicts once that PR is merged so I wanted to give you a heads up. It might make sense to subscribe to that PR so you know when it's merged.

@calebcartwright
Copy link
Member

As anticipated the merging of #5560 has introduced some merge conflicts, but I gather we should be ready to go with this once those conflicts are resolved.

@calebcartwright
Copy link
Member

Thanks for the quick resolution @anforowicz!

@ytmimi - do you think this needs another quick pass, and then do you think this should be included in the next release or are shall we hold it till the following?

@ytmimi
Copy link
Contributor

ytmimi commented Feb 7, 2023

@calebcartwright I took another look and I think this is good to go 👍🏼 and could be included in the next release, but I don't mind if we want to hold off until the following.

@calebcartwright
Copy link
Member

@ytmimi - This seems like a worthwhile add to me, so more a question of when as opposed to if.

Suppose the options are to merge it now/tomorrow and pull it into the next release (pre let-else support), or wait and then do it in the release that follows let-else. I'll defer to you on whether you think we should go ahead with this now based on your confidence level in the implementation and adherence to formatting stability guarantee.

@calebcartwright calebcartwright added the release-notes Needs an associated changelog entry label Jun 20, 2023
@calebcartwright
Copy link
Member

haven't looked at this one personally but per team meeting in discussion on zulip earlier Yacin's comfortable with this, so going to pull it in as the last addition in the next release

@calebcartwright calebcartwright merged commit f4201ef into rust-lang:master Jun 20, 2023
27 checks passed
@calebcartwright calebcartwright removed the release-notes Needs an associated changelog entry label Jun 20, 2023
@anforowicz anforowicz deleted the handling-numbered-lists branch June 21, 2023 00:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants