text/tabwriter: character width #8273
This is a (relatively minor) change to the tabwriter so that it can handle single- and double-width characters based on the fixed (_font-independent_) Unicode Annex #11 width information, assuming that the layout is for fixed-width characters (and multiples of the fixed width). It is an explicit non-goal to make the tabwriter work for variable-width fonts at this time (it is possible, but it only makes sense in the context of an IDE that lays out code depending on font size). Owner changed to @griesemer. Status changed to Thinking. |
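For illustration, here is a minimal sketch of the behaviour this issue is about. It assumes that text/tabwriter measures each cell by its number of runes, so a wide character counts as one column; the helper name `render` is made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"text/tabwriter"
)

// render formats tab-separated lines with text/tabwriter and returns the result.
// (Hypothetical helper, only for this illustration.)
func render(lines ...string) string {
	var buf bytes.Buffer
	// minwidth 0, tabwidth 8, padding 1, pad with spaces, no flags.
	w := tabwriter.NewWriter(&buf, 0, 8, 1, ' ', 0)
	for _, l := range lines {
		fmt.Fprintln(w, l)
	}
	w.Flush()
	return buf.String()
}

func main() {
	// The first column is padded to 3 runes + 1 space of padding.
	// In a terminal that draws CJK characters two cells wide, "日本語"
	// occupies 6 cells, so "b" and "c" do not line up visually.
	fmt.Print(render("a\tb", "日本語\tc"))
}
```

Both output lines contain the same number of runes before the second column, which is exactly why they look misaligned once the renderer draws the CJK cell wider than the Latin one.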
I see support for full-width characters as something integral to this package. It would be a bit sad if we left many millions of users in countries that use CJK characters without a usable text/tabwriter package. |
For an example implementation of a function to figure out how many columns a character occupies, see https://github.com/mattn/go-runewidth. |
For what it's worth, this is also a problem with combining characters (and not all meaningful combinations have canonical forms):
|
@imuli Combining characters are handled as if their width is 0. This is fine if the code will never introduce a line-break before a combining character (which it doesn't). There's no need for canonical forms as this scheme works just fine. |
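A minimal sketch of the "combining characters have width 0" rule, using only the standard `unicode` package. The function name `cellWidth` is an assumption for illustration, and the choice of the Mn and Me categories as "combining" is a simplification; it is not what any particular package does, and it deliberately ignores wide characters.

```go
package main

import (
	"fmt"
	"unicode"
)

// cellWidth counts one column per rune, except that nonspacing (Mn) and
// enclosing (Me) marks are treated as zero width.
// (Hypothetical helper; the category choice is an assumption.)
func cellWidth(s string) int {
	n := 0
	for _, r := range s {
		if unicode.In(r, unicode.Mn, unicode.Me) {
			continue // combining mark: drawn on top of the previous rune
		}
		n++
	}
	return n
}

func main() {
	fmt.Println(cellWidth("e\u0301")) // decomposed: 'e' + combining acute accent
	fmt.Println(cellWidth("\u00e9"))  // precomposed é
}
```

Both spellings come out with the same width, so no normalization to a canonical form is needed for the width calculation.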
@fuzxxl Yes, the package you linked to handles combining characters just fine. I meant that they are another side of this bug however, one that perhaps doesn't fall under "variable width font". |
Any updates on this? |
@XenoPhex Sorry, but this package is frozen: https://go-review.googlesource.com/#/c/31910/2/src/text/tabwriter/tabwriter.go . |
PS: Even if the package were not frozen, we are not going to add specific character sets or tables to this package for special treatment. The only sensible approach would be to provide a function that given a Unicode char returns a width, leaving the actual width determination to a client. However, the only way we could add such a function is by extending the API; specifically it would probably require a new Init function. This package is one of the earliest Go packages with some features (like HTML filtering) that are not needed/used anymore (at least by gofmt). We are not going to make further changes at this point. If you need a special version, you can always vendor and adjust the code. A future gofmt might use a rewritten and trimmed version of this package. None of this is high-priority. I will close this issue. |
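To make the "client supplies the width" idea concrete, here is a rough sketch of a tiny column aligner that takes a width callback. Everything in it (`WidthFunc`, `alignColumns`, `twoForHan`) is hypothetical and is not part of text/tabwriter or of any proposed API; it only shows the shape such an extension point could take.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// WidthFunc reports how many columns a rune occupies. (Hypothetical type.)
type WidthFunc func(r rune) int

// stringWidth sums the client-supplied widths of all runes in s.
func stringWidth(s string, width WidthFunc) int {
	n := 0
	for _, r := range s {
		n += width(r)
	}
	return n
}

// alignColumns pads every cell except the last one in each row so that
// columns line up under the given width function.
func alignColumns(rows [][]string, width WidthFunc, padding int) []string {
	var cols []int // widest cell seen in each column
	for _, row := range rows {
		for i, cell := range row {
			if i == len(cols) {
				cols = append(cols, 0)
			}
			if w := stringWidth(cell, width); w > cols[i] {
				cols[i] = w
			}
		}
	}
	out := make([]string, 0, len(rows))
	for _, row := range rows {
		var b strings.Builder
		for i, cell := range row {
			b.WriteString(cell)
			if i < len(row)-1 {
				b.WriteString(strings.Repeat(" ", cols[i]-stringWidth(cell, width)+padding))
			}
		}
		out = append(out, b.String())
	}
	return out
}

// twoForHan is one possible client policy: Han characters take two columns.
func twoForHan(r rune) int {
	if unicode.Is(unicode.Han, r) {
		return 2
	}
	return 1
}

func main() {
	rows := [][]string{{"a", "b"}, {"日本語", "c"}}
	for _, line := range alignColumns(rows, twoForHan, 1) {
		fmt.Println(line)
	}
}
```

The point of the design is that the aligner carries no character tables at all; a client that prefers a different policy simply passes a different function.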
FYI: another way to do it: https://github.com/olekukonko/tablewriter |
@griesemer thanks for pointing me to this issue. This comes up once in a while. The width information you need is already in golang.org/x/text/width. An implementation is not straightforward, though, as width cannot be determined unambiguously:
Now arguably, the current implementation will never align properly for anybody if any non-halfwidth rune is used. One could at least:
I've implemented an algorithm to determine a string's width, based on some experience-based recommendations for interpreting ambiguous characters, for exactly this purpose. It was decided not to add this back then, but perhaps in light of Go 2 the willingness to change things has increased. The main drawbacks of this approach:
The last one may be nasty if people are collaborating on the same project using different versions of gofmt. Gofmt would probably need some kind of logic to prevent flipping back and forth between two different interpretations, and a flag to force an update. Another, complementary approach is to allow line breaks after table values so that values are indented and spaced independently of the keys. Personally, I think the best would be to rely on editors to do the outlining correctly, but have a best-effort implementation with some amount of stability guarantees that will render the alignment correctly in the majority of cases (albeit a small majority, I guess 66%). Note that even though no implementation will get the indentation right for everybody, it will at least do the right thing in many situations, whereas now it is guaranteed to never do the right thing. |
@mpvl Thanks for the info. I don't think we want to be dependent on x/text. I was hoping that there might be a small number of unicode code point ranges that we could trivially detect (and that are unlikely to change in the future) to identify full-width/wide chars and just give them the space of 2 characters. Of course this all depends on the actual font used during rendering and so this assumes that wide characters are taking the space of 2 regular characters in that font. |
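As a rough sketch of the "small number of code point ranges" idea, using only the standard `unicode` package: the selection of scripts and ranges below is an assumption chosen for illustration, not a complete or authoritative list of wide characters, and `runeWidth` and `stringWidth` are made-up names.

```go
package main

import (
	"fmt"
	"unicode"
)

// fullwidthForms covers the fullwidth ASCII variants and fullwidth signs
// (U+FF01..U+FF60 and U+FFE0..U+FFE6).
var fullwidthForms = &unicode.RangeTable{
	R16: []unicode.Range16{
		{Lo: 0xFF01, Hi: 0xFF60, Stride: 1},
		{Lo: 0xFFE0, Hi: 0xFFE6, Stride: 1},
	},
}

// runeWidth returns 2 for runes in a few CJK scripts and the fullwidth
// forms, and 1 for everything else. It assumes a font in which wide
// characters take exactly two cells, and it ignores zero-width runes.
func runeWidth(r rune) int {
	// Halfwidth Katakana and Hangul variants belong to their scripts
	// but are narrow, so rule them out first.
	if r >= 0xFF61 && r <= 0xFFDC {
		return 1
	}
	if unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana, unicode.Hangul, fullwidthForms) {
		return 2
	}
	return 1
}

// stringWidth sums runeWidth over s.
func stringWidth(s string) int {
	n := 0
	for _, r := range s {
		n += runeWidth(r)
	}
	return n
}

func main() {
	fmt.Println(stringWidth("日本語 go"))
}
```

Even this small table needs a special case for the halfwidth forms and leaves out emoji and combining marks, which hints at why a hand-picked list is hard to keep correct.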
@griesemer: I don't think that is a scalable approach, and it leaves out handling zero-width characters, which is actually easier. We could do something similar to what is done for core though: generate the tables in x/text and then copy them into gofmt. x/text has been set up to automate this. |
At my visit to Gopher China I did some polling, and almost exclusively people were using the preinstalled fonts for their editor (VSCode etc.) or a variant that would result in a CJK to Latin ratio of 5:3. Only sporadically did somebody report using the traditional 2:1 ratio. IOW, it seems that adopting a 2:1 ratio will not fix the problem for the majority of people. Conversely, adopting a 5:3 ratio would seem to do the trick, but it would also result in some peculiar artifacts in the gofmt rendering. I'm not sure that it is worth it. This doesn't preclude providing better handling for modifiers, of course. Emojis:Latin is typically also 5:3. Admittedly, my sample size was small (about 20), so I could do a larger-scale poll, but I wouldn't hold my breath. It seems that having editor plugins handle this is really the most ideal approach. |
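One way to see why a 5:3 ratio is awkward, sketched as arithmetic in thirds of a Latin cell. This is only one possible reading of the "peculiar artifacts" mentioned above, not something stated in the comment. It assumes a Latin character is 3 units wide and a Han character 5 units wide, and that padding can only be added in whole spaces; the names `widthUnits` and `canAlign` are invented for the example.

```go
package main

import (
	"fmt"
	"unicode"
)

const (
	latinUnits = 3 // one Latin cell, measured in thirds of a cell
	cjkUnits   = 5 // one CJK character under the assumed 5:3 ratio
)

// widthUnits returns the rendered width of s in thirds of a Latin cell.
// Only Han characters are treated as CJK here, to keep the sketch short.
func widthUnits(s string) int {
	n := 0
	for _, r := range s {
		if unicode.Is(unicode.Han, r) {
			n += cjkUnits
		} else {
			n += latinUnits
		}
	}
	return n
}

// canAlign reports whether two cells can be padded to exactly the same
// width using whole spaces only.
func canAlign(a, b string) bool {
	return (widthUnits(a)-widthUnits(b))%latinUnits == 0
}

func main() {
	fmt.Println(canAlign("日本語", "ab")) // 15 vs 6 units: differ by 3 spaces
	fmt.Println(canAlign("日", "abc"))  // 5 vs 9 units: off by a fraction of a space
}
```

Under these assumptions, exact alignment is only possible when the number of CJK characters in the two cells differs by a multiple of three; in every other case the columns end up a fraction of a cell apart no matter how many spaces are added.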
Thanks, @mpvl, that is useful additional input. It sounds like there's no simple solution to address this. |