emoji split in word wrap #52

Closed
mscalora opened this issue Feb 13, 2020 · 2 comments

Comments

@mscalora
Contributor

Problem Summary

An emoji is being split in half by the word-wrap mechanism.

Expected Result (screen shot if possible)

The emoji should wrap to the next line as a whole instead of being split.

Actual Result (screen shot if possible)

An illegal-character error symbol is displayed where the emoji was split (screenshot: wrap-bug).

Environment Information

  • OS: Mac

  • Terminal Emulator: iTerm

Steps to Reproduce

  1. Run this code from the example folder:
const Table = require("../")

process.stdout.columns = process.argv[2] && parseInt(process.argv[2], 10) || 25

console.log("Starting... (use CTRL-C to give up)")
console.log("\"fix\" by uncommenting line format.js:70")

let output = new Table([
  {minWidth: 6, shrink: 0},
  {minWidth: 12, shrink: 1},
  {minWidth: 24, shrink: 1000}
], [[
  "aaa bbb ccc",
  "aaa bbb ccc 😀😃😄😁 eee fff ggg hhh",
  "aaa bbb ccc ddd eee fff ggg hhh iii jjj kkk lll mmm nnn ooo ppp qqq rrr sss ttt"
]], {
  paddingLeft: 0,
  paddingRight: 0,
  marginTop: 0
}).render()

console.log(output)

Discussion

I did not dig into this one very much, so I'm not sure whether the cause is in the smartwrap module.
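
For illustration only, here is a minimal sketch of one way this symptom can arise. It assumes the wrap logic cuts strings by UTF-16 index, which has not been confirmed for smartwrap. Each of the emoji in the repro occupies two code units, so a cut that lands between them leaves a lone surrogate, which terminals typically render as an error symbol.

```javascript
// Assumption: the wrap cuts by string index (UTF-16 code units).
const text = "ccc 😀😃"

// "ccc " is 4 code units and 😀 occupies units 4 and 5,
// so cutting at index 5 keeps only the first half of the emoji.
const cut = text.slice(0, 5)

// A high surrogate (0xD800-0xDBFF) at the end means the pair was split.
const lastUnit = cut.charCodeAt(cut.length - 1)
const endsWithLoneSurrogate = lastUnit >= 0xD800 && lastUnit <= 0xDBFF

console.log(endsWithLoneSurrogate)
```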

@tecfu
Owner

tecfu commented Feb 13, 2020

Options:

  1. Embed a static array of emojis within the smartwrap module, and then refuse to wrap them.
  2. Create option to set an array of character combinations that cannot be split, then pass that array to smartwrap.

Preferred option:
2.

@mscalora
Contributor Author

Doesn't the spread operator idiom, [..."ccc 😀😃😄😁 eee"], correctly show where you can't split?

I implemented a string truncate using it that appears to always split chars in the right place.

// eslint-disable-next-line no-control-regex
const ansiMatcher = new RegExp("([^\u001B\u009B]*)([\u001B\u009B][[\\]()#;?]*(?:(?:(?:[a-zA-Z\\d]*(?:;[-a-zA-Z\\d\\/#&.:=?%@~_]*)*)?\u0007)|(?:(?:\\d{1,4}(?:;\\d{0,4})*)?[\\dA-PR-TZcf-ntqry=><~]))|$)", "g")

/**
 * split a unicode string into code points
 * @param {string} s - string with unicode
 * @return {Array<string>} - array of code points
 */
const splitUnicode = s => [...s]

/**
 * truncate a unicode string that may contain ansi and/or multi-column unicode characters
 * @param {string} s - string with unicode and/or multi-column unicode characters
 * @param {number} length - width in terminal char column
 * @return {string}
 */
Format.truncate = (s, length) => {
  let total = 0
  return s.replace(ansiMatcher, function (match, content, ansi) {
    if (total >= (length || 0)) {
      // skip all visible content
      return ansi
    }
    let contentLength = Wcwidth(content)
    if (contentLength <= length - total) {
      // pass all visible content
      total += contentLength
      return content + ansi
    }
    // partial chunk
    let chars = splitUnicode(content)
    while (Wcwidth(chars.join("")) > length - total) {
      chars.pop()
    }
    content = chars.join("")
    // set total so the rest of content is skipped
    total = length
    return content + ansi
  })
}
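
As a small standalone check of the spread idiom used above, the sketch below compares cutting by code point with cutting by string index. The helper `takeCodePoints` is hypothetical and not part of tty-table or smartwrap. One caveat: spread iterates code points, not grapheme clusters, so a ZWJ emoji sequence still comes apart into several elements even though surrogate pairs stay intact.

```javascript
// Hypothetical helper (illustration only): keep at most `count`
// code points from the start of `s`, never splitting a surrogate pair.
const takeCodePoints = (s, count) => [...s].slice(0, count).join("")

const sample = "ccc 😀😃😄😁 eee"
const codeUnits = sample.length        // counts UTF-16 code units
const codePoints = [...sample].length  // counts code points

const safeCut = takeCodePoints(sample, 5) // "ccc " plus one whole emoji
const unsafeCut = sample.slice(0, 5)      // "ccc " plus half an emoji

// Caveat: a ZWJ sequence (man + ZWJ + woman + ZWJ + girl) is one glyph
// on screen but is still split into its component code points by spread.
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}"
const familyPoints = [...family].length

console.log({ codeUnits, codePoints, safeCut, familyPoints })
```

If sequences like these need to stay together, code-point splitting alone would not be enough, and something grapheme-aware would be needed on top of it.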

-Mike

tecfu added a commit that referenced this issue Feb 14, 2020
tecfu added a commit that referenced this issue Feb 14, 2020
tecfu added a commit that referenced this issue Feb 14, 2020
@tecfu tecfu closed this as completed in 0e7ac1b Feb 14, 2020