Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add reflow command #2128

Merged
merged 5 commits into from May 2, 2022
Merged

add reflow command #2128

merged 5 commits into from May 2, 2022

Conversation

vlmutolo
Copy link
Contributor

@vlmutolo vlmutolo commented Apr 16, 2022

Users need to be able to hard-wrap text for many applications, including
comments in code, git commit messages, plaintext documentation, etc. It
often falls to the user to manually insert line breaks where appropriate
in order to hard-wrap text.

This commit introduces the "reflow" command (both in the TUI and core
library) to automatically hard-wrap selected text to a given column length.

Addresses #625.

helix-term/src/commands/typed.rs Outdated Show resolved Hide resolved
helix-term/src/commands/typed.rs Outdated Show resolved Hide resolved

/// Given a slice of text, return the text re-wrapped to fit
/// it within the given width.
pub fn reflow_hard_wrap(text: RopeSlice, max_line_len: usize) -> String {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you evaulated simply using https://github.com/mgeisler/textwrap which will also solve for cases like multiwidth characters?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet! That could clean up most of the logic. I'll definitely check it out.

Copy link
Contributor Author

@vlmutolo vlmutolo Apr 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looked at it a bit, and it seems like the right approach. Thanks for the suggestion. I wish I'd asked around re: existing crates a bit earlier, but I wanted to delete most of this code anyway.

I'm especially impressed with the line-breaking algorithms they have. Not sure if using the "optimal" line breaking algorithm is worth the extra binary size. Personally, I'd want it for the extra ~50KB. Thoughts?

Users need to be able to hard-wrap text for many applications, including
comments in code, git commit messages, plaintext documentation, etc. It
often falls to the user to manually insert line breaks where appropriate
in order to hard-wrap text.

This commit introduces the "reflow" command (both in the TUI and core
library) to automatically hard-wrap selected text to a given number of
characters (defined by Unicode "extended grapheme clusters"). It handles
lines with a repeated prefix, such as comments ("//") and indentation.
@vlmutolo vlmutolo marked this pull request as ready for review April 16, 2022 16:20
@vlmutolo
Copy link
Contributor Author

vlmutolo commented Apr 16, 2022

I switched to using textwrap, which obviously made this PR a lot cleaner. The default feature set looked like what we want. It has the smawk feature enabled by default, which brings in one (transitive) dependency to do the linear algebra calculations to make the reflow optimal instead of greedy. Do we want to make this optional behind a (default?) feature, or just leave it always-on?


Also I left the textwrap crate to be imported in core inside a now-tiny module called wrap. The motivation here was to allow any Helix frontend to have access to the wrapping capabilities. Let me know if you think there's a better place for the function, or if we want to just stick it in core/lib.


Future work should probably include ensuring that reflowing from a single line with a known prefix (like a comment) preserves that prefix. Otherwise, the textwrap crate seems to do an excellent job.

@the-mikedavis the-mikedavis linked an issue Apr 17, 2022 that may be closed by this pull request
@vlmutolo
Copy link
Contributor Author

I'm not sure what the Build / Docs (pull_request) failure is. Is this something I should be trying to fix?

@the-mikedavis
Copy link
Member

If you run cargo xtask docgen and commit the changes it'll clear up the ❌. That check makes sure all the available commands / languages / lsps are documented.

@vlmutolo
Copy link
Contributor Author

That's a nice check to ensure up-to-date docs in the book. I'll have to remember that strategy for my own future projects.

Copy link
Member

@dead10ck dead10ck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems pretty straightforward to me with the external crate.

let rope = doc.text();

// TODO: If only a single character is selected, we should expand the
// selection to whatever "object" (can we use treesitter for this? text query?)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be tricky because of inconsistencies between grammars. For example, in Rust, each line in a comment is a separate node, so it would just end up reflowing a single line, instead of all the consecutive lines of comments together.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that's interesting. We can scrap this TODO, then.

Plus, if we have a way of "expanding" the selection to a comment block or something, then we basically achieve the same effect because this operates on selections.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's mio for selecting the comment block (selecting all adjacent comments nodes if the queries are specified correctly).

Copy link
Contributor

@pickfire pickfire Apr 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gqq like vim to default wrapping on a single line might be a reasonable default.

I think we probably also want to ignore comment in the reflow, at least it should be as convenient as vim, also note that rust supports quite a few form of comments and that should be handled as well here, like ////, //!, // ...

My experience with kakoune using external fmt command were especially bad with reflowing rust code given that the comment could appear in multiple forms, and the comment being wrapped along in unexpected ways.

Copy link
Contributor Author

@vlmutolo vlmutolo Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The library we're using (textwrap) will detect a common prefix across the selected lines. This will work for most of our comments. For example, I just tested this:

// A B
// C D E F

and it reflowed to:

// A B C D E F

We fall down a bit in trying to reflow from one line to many, since the library we're using can't detect the comment prefix when it's not repeated. We also still have an issue where the library will eat blank lines and reflow across them. I'd file both of those points under "future work" on this feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that's future work. It could be defined in a language.toml as something like comment_prefix_tokens = ["///", "//!", "//"] where the first token in the list is prioritized (so rust /// doesn't get detected as //)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey folks! Textwrap author here :-) It's super cool to see textwrap::refill used. I'll be very interested in making textwrap::unfill (the actual engine behind textwrap:refill) more flexible so that it covers the necessary use-cases.

I wrote textwrap::unfill quickly so that it seems to do the right thing for a few languages, such as Markdown and programming languages which use # and / as line comments. But the code has not been tested much in the wild, so feedback is welcome :-)

Copy link
Contributor Author

@vlmutolo vlmutolo Jun 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mgeisler textwrap is an awesome library! I started to implement this feature myself before I was aware textwrap existed, so I have a bit of an idea of the work that went into a more comprehensive solution. And the optimal line breaking is just really cool.

Re: unfill, what do you think about adding an API to provide a list of "known prefixes"? I should probably make an issue in textwrap to discuss this.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think we should do exactly that: change the API of refill to let you specify prefixes and so on. This probably means expanding unfill as well since that is the function which turns a wrapped paragraph into a single line of text.

// doesn't pass in a selection, default to 79 characters per line?

const DEFAULT_MAX_LEN: Cow<'static, str> = Cow::Borrowed("79");
let max_line_len: usize = args.get(0).unwrap_or(&DEFAULT_MAX_LEN).parse()?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not necessarily in this PR, but it might also be cool to add a config option to set the default max line width if you prefer a wider width, but don't want to give it explicitly every time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I can add that real quick to this PR, and clean up some of the comments and TODOs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this should be set on a global config especially when different languages very likely have different width, notably

  • rust 100 / 80 (some special case that we don't really want to handle but rustfmt should already do it)
  • python 80
  • git commit 52 IIRC
  • javascript shouldn't have a default especially when there are so many "standards", some say 100 some say 120

I think this should be within languages.toml

Copy link
Contributor Author

@vlmutolo vlmutolo Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes to me, and I included it in the version I just pushed.

Any opinions on the name of the config value? I made it a top-level key called max_line_length and it currently gets parsed to a usize.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line_wrapping_width?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can add some sensible defaults for the languages.toml like rust 100.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I'll have to look into that. Any tips on where those defaults might go? I was a bit confused by the code that processes the languages.toml file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go to

[[language]]
name = "rust"
max_line_len = 100

That is just an example, I think the variable name may not the that.

Comment on lines +1508 to +1514
TypableCommand {
name: "reflow",
aliases: &[],
doc: "Hard-wrap the current selection of lines to a given width.",
fun: reflow,
completer: None,
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should just bind it to gq.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gq could be awkward given the goto menu g. We might be able to use = although I think that is currently saving a spot for LSP-driven format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know any other good bindings for it beside gq which is similar to vim, we might have to tweak goto menu to goto/misc menu

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this decision shouldn't block the PR. I imagine it's pretty easy to add an alias later.

Copy link
Member

@dead10ck dead10ck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good to me!

Copy link
Member

@archseer archseer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure yet if the optional features of text-wrap are worth keeping on by default but I think this is good to merge for now 👍🏻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Command to hard-wrap selected text.
8 participants