Improve console.lua item width heuristic with term_disp_width#17773
Conversation
|
This is better implemented as a native function rather than a command to avoid command calling overhead. The whole reason of doing this in the first place is for performance, and this calculation will be potentially done on lots of items. |
|
This changes the time to calculate the longest item in the property list from 50 microseconds to 0.2 seconds. |
Oh, I assumed commands were fast enough for this since
Native function? Like in |
|
Yeah but |
|
The easiest fix is to use |
Yeah it's similar for me if I build with optimizations. local str = 'どっちもこっちも\n'
local s = mp.get_time()
for i = 1, 10000 do
mp.command_native({'terminal-display-width', str})
end
print('10k calls: ' .. mp.get_time() - s)
str = str:rep(10000)
s = mp.get_time()
mp.command_native({'terminal-display-width', str})
print('1 call with 10k items: ' .. mp.get_time() - s)10k calls: 0.177s Doing the measurement within the command with 10k concatenated items: Calling a modified command that calls So the overhead is from multiple commands, not from copying strings. EDIT: I enabled optimization but still had ASan enabled here lol, it will be faster without it. |
I agree. commends are mostly for users to use, here we expose C function for our internal use. No need to go through command.
Arguably, it could also be native function. It's not however called in hot loops, unlike the term_disp_width. Also Generally there is reluctance to add native entrypoints, but I feel like lua integration in mpv is almost core level at this point, so we shouldn't be scared about adding internal entry points if it helps make code clearer or better. |
I disagree. I think that watever is exposed to builtin scripts should also be exposed to 3rd party scripts. So that we don't end up with scripts which only we can implement and 3rd party authors can't. No comment on what's the best solution is on this specific issue, but strong opinion that we should not use hidden APIs in internal scripts. EDIT: and non-script clients too, although I don't know how much this specific function would be useful to non-script clients, but this should be the rule that we try to follow. |
|
I just realized this but we can simply skip the calculation and use the whole window width if there are many items. The property list and history always use the whole window anyway. |
No one says otherwise. |
| char *text = cmd->args[0].v.s; | ||
|
|
||
| const unsigned char *cut_pos; | ||
| int width = term_disp_width(bstr0(text), INT_MAX, &cut_pos); |
There was a problem hiding this comment.
If we are going to do this then it should also accept an optional max_width argument so it can exit early when the limit is hit instead of hardcoding INT_MAX.
There was a problem hiding this comment.
I guess it doesn't hurt but I'm not sure whether there are valid use-cases for this. At least not without also exposing cut_pos.
There was a problem hiding this comment.
For the menu specifically, you can pass the osd width and if any of the items hit it we can stop processing further items, and particular item that hit the limit won't be needlessly processed fully.
There was a problem hiding this comment.
How many characters fit depends on the OSD width, font family and size though. The idea is to calculate the terminal width of all items and then use compute_bounds only on the longest item to get the accurate variable font width. I don't know if you can approximate the max needed width and end early based on that.
Ending early for individual items is not important because we already truncate huge lines when the menu is opened, originally because json-subprocess-result made libass completely freeze mpv.
There was a problem hiding this comment.
I think it's fine to omit this functionality until a valid use-case for it is found, an optional argument can always be added later (though it'd have to be a bit funny and return an object with cut_pos too I guess).
43adbcb to
3bb9e6d
Compare
|
Re-inplemented this as a function in mp.utils. It now takes <10ms when listing properties which is definitely better. Did forget to add an interface-changes file, will do that in next push (perhaps when it's decided whether we want to expose cutting too). Also I think the thing about it incorrectly returning 8 was just me lying, or maybe measured something different (will check later whether I can reproduce a wrong width, perhaps I measured with ytsubconcerter ZWSP junk which would still make the result wrong). |
|
It takes 6ms for 11k properties now. Nice. |
|
mpv/player/lua/context_menu.lua Lines 159 to 164 in 1be21a3 |
|
Note that Lines 679 to 683 in 1be21a3 In this case, the 2nd item is actually much wider than the 1st item, but its calculation result is 0 due to invalid character. local input = require 'mp.input'
mp.add_forced_key_binding('a', function()
input.select({
prompt = 'Foobar:',
items = {
('A'):rep(20),
('一'):sub(1, 1) .. ('A'):rep(200),
},
})
end)
|
The correct behavior here is probably to make it consider maximal subparts of ill-formed sequences to have a width of 1, under the assumption the terminal will replace them with a U+FFFD replacement character (this is the behavior kitty docs recommend, not sure how universal it is). It shouldn't bail out on invalid UTF-8, setting So this is something that should be fixed in
Will apply it to that later, thanks. |
Previously this function would just count bytes which was very easily broken by incantations like `OK` which take up 52 bytes while being just 2 normal width characters visually. `terminal_display_width` correctly counts `OK` as 2 visual characters. Note: GitHub strips all these in their web UI but there are many Unicode combining characters after each letter of the `OK`s above that produce a "cursed" text effect. Fixes mpv-player#17772
3bb9e6d to
f16257b
Compare
|
Added a file to The ill-formed UTF-8 issue still exists but it should probably be addressed in a separate PR. |
If mpv is not built with uchardet then invalid UTF-8 is pretty common. Many music files have metadata encoded in GB18030 and SHIFT_JIS. |
That's true, we can resolve this in next PR. There is no documentation for |
The string slicing function |
|
master...afishhh:mpv:partially-ill-formed-wcwidth implements handling of ill-formed UTF-8 in I can submit that first if illegal UTF-8 is such a concern. |
Previously the function just bailed on invalid input, this instead makes it count how many replacement characters would be shown by a terminal complying with the Unicode specification's recommendation here: https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-3/#G66453 Fixes mpv-player#17773 (comment)
Previously the function just bailed on invalid input, this instead makes it count how many replacement characters would be shown by a terminal complying with the Unicode specification's recommendation here: https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-3/#G66453 Fixes #17773 (comment)


player/{lua,javascript}: add
mp.utils.terminal_display_widthfunctionconsole.lua: use
terminal_display_widthwhen picking longest menu itemPreviously this function would just count bytes which was very easily
broken by incantations like
OKwhich take up 52 bytes while being just2 normal width characters visually.
terminal_display_widthcorrectly countsOKas 2 visual characters.Note: GitHub strips all these in their web UI but there are many Unicode
combining characters after each letter of the
OKs above that producea "cursed" text effect.
Fixes #17772
Considerations:
term_disp_widthskipsESC [sequences and handles tabs in a particular way which probably doesn't match ASS so those may break this somewhatAlternatives: