Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Bun.stringWidth #8327

Merged
merged 3 commits into from
Jan 21, 2024
Merged

Introduce Bun.stringWidth #8327

merged 3 commits into from
Jan 21, 2024

Conversation

Jarred-Sumner
Copy link
Collaborator

@Jarred-Sumner Jarred-Sumner commented Jan 21, 2024

What does this PR do?

This fixes an issue where console.table would not align text properly when the input contains ANSII escape sequences.

This also introduces Bun.stringWidth, which has nearly the same API as the extremely popular string-width npm package (> 100 million weekly downloads) except this implementation is roughly 490x faster compared to running string-width in node.

Fixes #8155

bun string-width.mjs
cpu: Apple M1 Max
runtime: bun 1.0.24 (arm64-darwin)

benchmark                                    time (avg)             (minmax)       p75       p99      p995
------------------------------------------------------------------------------- -----------------------------
npm/string-width (ansi + emoji + ascii)   10.68 µs/iter   (8.17 µs832.46 µs)   9.92 µs  16.83 µs  23.13 µs
npm/string-width (ansi + emoji)            8.67 µs/iter     (7.88 µs9.59 µs)   8.92 µs   9.59 µs   9.59 µs
npm/string-width (ansi + ascii)             9.9 µs/iter   (7.75 µs734.63 µs)   8.75 µs   14.5 µs  28.04 µs
Bun.stringWidth (ansi + emoji + ascii)    82.29 ns/iter   (78.89 ns88.89 ns)  83.53 ns  86.75 ns  87.52 ns
Bun.stringWidth (ansi + emoji)            82.48 ns/iter   (76.63 ns94.76 ns)  84.05 ns  87.02 ns   88.3 ns
Bun.stringWidth (ansi + ascii)            24.71 ns/iter   (20.75 ns35.67 ns)  25.37 ns  26.84 ns  27.58 nsnode string-width.mjs
cpu: Apple M1 Max
runtime: node v21.4.0 (arm64-darwin)

benchmark                                    time (avg)             (minmax)       p75       p99      p995
------------------------------------------------------------------------------- -----------------------------
npm/string-width (ansi + emoji + ascii)   13.85 µs/iter  (11.54 µs770.92 µs)  12.71 µs  18.29 µs  22.58 µs
npm/string-width (ansi + emoji)             9.5 µs/iter     (8.94 µs9.98 µs)   9.66 µs   9.98 µs   9.98 µs
npm/string-width (ansi + ascii)           12.31 µs/iter     (9.96 µs2.28 ms)  11.25 µs   15.5 µs  19.54 µs

How did you verify your code works?

There are some tests

@Jarred-Sumner
Copy link
Collaborator Author

Jarred-Sumner commented Jan 21, 2024

image

Copy link
Contributor

Copy link
Contributor

@tiptenbrink
Copy link

This is not a trivial problem, I would be careful taking on the maintenance burden. Looking at the implementation, I think the most important things are correct (also very nice fast implementation), although I'm not entirely sure how a character knows to be width zero after following a zero-width join. I'm sure you've looked at a lot of resources already, but https://www.jeffquast.com/post/ucs-detect-test-results/ is a good article on the difficulty of the problem. It includes some specific examples that I recommend adding to the tests. Of course there is also the problem of width changing over time for Unicode versions, so what is actually displayed might differ by device.

@Jarred-Sumner
Copy link
Collaborator Author

This is not a trivial problem, I would be careful taking on the maintenance burden. Looking at the implementation, I think the most important things are correct (also very nice fast implementation), although I'm not entirely sure how a character knows to be width zero after following a zero-width join. I'm sure you've looked at a lot of resources already, but https://www.jeffquast.com/post/ucs-detect-test-results/ is a good article on the difficulty of the problem. It includes some specific examples that I recommend adding to the tests. Of course there is also the problem of width changing over time for Unicode versions, so what is actually displayed might differ by device.

We're mostly aiming to match the behavior of the string-width library and/or node's readline. @paperdave has a branch that makes it more directly use ICU's tables instead of our copied version. The ZWJ issue will need to be fixed and I had a branch that does it but it really should use icu's functions instead of our copied version.

@Jarred-Sumner
Copy link
Collaborator Author

Merging because I'd like to get this in the release and it feels pretty safe to do so.

@Jarred-Sumner Jarred-Sumner merged commit b82656d into main Jan 21, 2024
27 of 32 checks passed
@Jarred-Sumner Jarred-Sumner deleted the jarred/bun-string-width branch January 21, 2024 12:47
Copy link
Contributor

❌🪟 @Jarred-Sumner, there are 10 test regressions on Windows x86_64

  • test\cli\inspect\inspect.test.ts
  • test\js\bun\shell\bunshell-instance.test.ts
  • test\js\bun\shell\bunshell.test.ts
  • test\js\bun\shell\commands\rm.test.ts
  • test\js\bun\shell\lazy.test.ts
  • test\js\bun\shell\leak.test.ts
  • test\js\bun\shell\lex.test.ts
  • test\js\bun\util\stringWidth.test.ts
  • test\js\node\events\events-cjs.test.js
  • test\regression\issue\08095.test.ts

Full Test Output

ryoppippi pushed a commit to ryoppippi/bun that referenced this pull request Feb 1, 2024
* Introduce `Bun.stringWidth`

* [autofix.ci] apply automated fixes

* Update utils.md

---------

Co-authored-by: Jarred Sumner <709451+Jarred-Sumner@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

console.table() doesn't handle colored text properly
2 participants