Added support for custom width providers #128
By the way, this crate definitely deserves a refactor: it's getting quite big, and it's a bit hard to work on when it's one single big file full of documentation.
Hey @Moxinilian, thanks for the PR. It's not been forgotten, I've just been busy lately... I'm traveling this weekend, so it won't be before next week that I get proper time to look at this.
Just some short remarks: I love that you could do this via a generic function parameter --- zero-cost abstractions are awesome, as you say :-)
About measuring the characters: would it perhaps be better to measure words at a time? With combining characters, I guess you don't really know the true displayed width of a word until you typeset the entire thing. Take emoticons as an extreme example since I'm typing this on my phone: I recently learned that a symbol like 👩👩👧👦 is created at rendering time --- it's really multiple Unicode characters with zero-width-joiners between them, something like 👧👧👨👨 next to each other. See https://emojipedia.org/emoji-zwj-sequences/. So if you measure the width of this
Finally, you're definitely right that the crate should be refactored by now... It was my first larger piece of Rust code, so I'm happy to learn how to structure it better!
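To make the zero-width-joiner concern concrete, here is a minimal sketch of what happens when widths are summed one character at a time. The `char_width` rule and the `summed_width` helper are assumptions made up for this illustration (a crude emoji range, not the unicode-width tables), so the numbers only show the shape of the problem.

```rust
/// Hypothetical per-character width rule, for illustration only.
/// It is NOT the table used by the unicode-width crate.
fn char_width(c: char) -> usize {
    match c {
        '\u{200D}' => 0,                // zero-width joiner occupies no column
        '\u{1F300}'..='\u{1FAFF}' => 2, // rough emoji range (assumption)
        _ => 1,                         // everything else: one column
    }
}

/// Width of a string computed by summing the widths of its characters.
fn summed_width(s: &str) -> usize {
    s.chars().map(char_width).sum()
}
```

For the family sequence U+1F469 U+200D U+1F469 U+200D U+1F467 U+200D U+1F466, `chars().count()` is 7 and `summed_width` returns 8 under this rule, while a font that supports the sequence draws a single glyph. A per-character sum cannot see that; only something that measures the whole cluster or word can.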
You're right to bring this up because I thought about it and forgot to mention it. But as I'm writing this I'm realizing that we could simply stop iterating over characters and split on words instead. Even better, this approach would feel very natural in a functional programming style, which is in my opinion better in general and, unless you think otherwise, would be appropriate here.
But then, what is a word? Where does it begin and where does it end? Is ending punctuation such as colons or question marks part of a word or separate? And how would we handle hyphenation? It raises a lot of questions that I would struggle to answer. Maybe we could try to disallow hyphenation on custom lengths and use spaces as the only word delimiters. But that would add more conditions and increase the code complexity, while also probably making it quite inelegant.
Feel free to also check out my concerns about support of floating points in issue #126.
I'm going to revert the changes and try to implement that instead, as there would be no point in merging this now that we're going down the word route.
My comments about combining characters might have been distracting and unhelpful since it's the current code that suffers from those problems -- it already works
Looking at the unicode-width crate, I see that it has code like:
    impl UnicodeWidthStr for str {
        fn width(&self) -> usize {
            self.chars().map(|c| cw::width(c, false).unwrap_or(0)).fold(0, Add::add)
        }
    }
This means that the
However, I like the idea of making this more generally applicable, so I would welcome a refactor in the direction of scanning for split points (breakable whitespace) and then measuring the width of these words on a per-word basis. The end result should be identical to what we do today, but it should be more adaptable.
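The per-word direction described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `wrap_by_words` and its `measure` parameter are hypothetical names, not textwrap's API; whitespace is the only split point, hyphenation is ignored, and a separating space is assumed to be one column wide.

```rust
/// Greedy wrapping that measures whole words with a caller-supplied function.
/// Hypothetical sketch: not the textwrap API.
fn wrap_by_words<F>(text: &str, width: usize, measure: F) -> Vec<String>
where
    F: Fn(&str) -> usize,
{
    let mut lines = Vec::new();
    let mut line = String::new();
    let mut line_width = 0;

    // Split points are runs of whitespace; hyphenation is deliberately ignored.
    for word in text.split_whitespace() {
        let word_width = measure(word);
        // Start a new line if the word (plus one column for the space) would overflow.
        if !line.is_empty() && line_width + 1 + word_width > width {
            lines.push(std::mem::take(&mut line));
            line_width = 0;
        }
        if !line.is_empty() {
            line.push(' ');
            line_width += 1; // assumption: a space is one column wide
        }
        line.push_str(word);
        line_width += word_width;
    }
    if !line.is_empty() {
        lines.push(line);
    }
    lines
}
```

Calling `wrap_by_words("the quick brown fox", 10, |w| w.chars().count())` yields the two lines "the quick" and "brown fox". Swapping in a different `measure` closure changes where the breaks fall without touching the loop; a word wider than `width` simply gets a line of its own.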
Alright then. I now remember why I accepted it anyway: the reason I'm working on it is Piston, and piston-graphics seems to also be fine with computing width character by character.
I'm going to read this document to verify easy solutions have not already been found.
I'm thinking I'll make a 0.10.0 release now (after #129 got reported) and then we can merge this in and play a bit with the API and do refactorings and whatnot for a later 0.11.0 release. Hope that seems okay!
I’m also interested in this feature to be able to use
Hi @robinkrahl, I think this PR has stalled, but I would still like to see this kind of feature implemented. @Moxinilian, if you're still interested in this, then please let us know! I recently rewrote some of the inner parts of textwrap (#213), so I think this PR should be closed and then a new one could be opened to fit with the new API. Ideally, this change should be invisible at the surface-level of the API, so that one can continue to write
Perhaps the Let me close this PR for now and then we can perhaps discuss more in #126.
This is part of the ongoing effort to support #126
This PR offers a way to set custom (integer) widths for characters. It does not add support for floating point numbers yet.
This allows the user to pass to most functions a closure taking a `char` and returning its width as a `usize`.
I tried to limit API changes for simple tasks by leaving the `fill` and `wrap` functions intact in signature, and added `wrap_dynamic` and `fill_dynamic` that take the closure. I was not really inspired with those names; if you want me to change them to something else I would be glad to do it.
I also chose to re-export `unicode_width::UnicodeWidthChar`, as it will probably come in handy for people writing custom closures, sparing them a direct dependency on `unicode_width`.
Example:
This example illustrates a hypothetical case where an h would actually take 5 columns instead of 1.
I included it as a test, with its equivalent for `Wrapper::fill_dynamic`.
Benchmark:
Zero-cost abstraction really is awesome.
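For readers without the diff at hand, the idea of a width-providing closure passed as a generic parameter can be sketched as below. This is not the code from this PR: `width_with` is a hypothetical helper written for illustration, and only the closure shape (a `char` in, a `usize` out) comes from the description above.

```rust
/// Sum character widths using a caller-supplied rule.
/// Hypothetical helper, not one of the functions added by this PR.
fn width_with<F>(text: &str, char_width: F) -> usize
where
    F: Fn(char) -> usize,
{
    // `F` is a generic parameter, so the compiler generates one copy of this
    // function per closure type and can inline the closure call.
    text.chars().map(char_width).sum()
}
```

With a closure in which an h takes 5 columns, `width_with("hello", |c| if c == 'h' { 5 } else { 1 })` returns 9, while `width_with("hello", |_| 1)` returns 5. Because the closure type is known at compile time, no boxing or dynamic dispatch is involved, which is the usual reason a generic parameter like this adds no runtime cost.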