Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add failing test case for new-lines #3

Closed
wants to merge 1 commit into from

Conversation

Gregoor
Copy link

@Gregoor Gregoor commented Jun 21, 2021

Initial checklist

  • I read the support docs
  • I read the contributing guide
  • I agree to follow the code of conduct
  • I searched issues and couldn’t find anything (or linked relevant results below)
  • If applicable, I’ve added docs and tests

Description of changes

Hi there,

thanks for putting all the work into this. I found a case where this lib diverges from innerText and added a test case for it. If you agree that it should behave differently, I'd be happy to also work on a fix.

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Jun 21, 2021
@wooorm

This comment has been minimized.

@wooorm
Copy link
Member

wooorm commented Jun 23, 2021

ah, it makes more sense to add it to the dom, which does use the space:

const $div = document.createElement('div')
const $span = document.createElement('span')
$div.appendChild(document.createTextNode('alpha\n'))
$div.appendChild($span)
$span.appendChild(document.createTextNode('bravo'))

document.body.innerHTML = ''
document.body.appendChild($div)

document.body.innerText
alpha bravo

@Gregoor
Copy link
Author

Gregoor commented Jun 23, 2021

I could have also added a fiddle, for completeness sake, here is one: https://jsfiddle.net/L5y31gxc/ (though I see you also reproduced it already)

@wooorm wooorm added the 🙆 yes/confirmed This is confirmed and ready to be worked on label Jun 23, 2021
@github-actions github-actions bot added 👍 phase/yes Post is accepted and can be worked on and removed 🤞 phase/open Post is being triaged manually labels Jun 23, 2021
@github-actions

This comment has been minimized.

@wooorm
Copy link
Member

wooorm commented Jun 23, 2021

Hmm, I looked into it for a while. Here are some more tests:

+  t.equal(
+    toText(h('p', ['A\n', h('span', 'b')])),
+    'A b',
+    'should support line endings around element breaks (1)'
+  )
+
+  t.equal(
+    toText(h('p', ['A\nb', h('span', 'c')])),
+    'A bc',
+    'should support line endings around element breaks (2)'
+  )
+
+  t.equal(
+    toText(h('p', ['A', h('span', '\nb')])),
+    'A b',
+    'should support line endings around element breaks (3)'
+  )
+
+  t.equal(
+    toText(h('p', ['A\n', h('span', '\nb')])),
+    'A b',
+    'should support line endings around element breaks (4)'
+  )
+
+  t.equal(
+    toText(h('p', [h('span', 'A\n'), h('span', 'b')])),
+    'A b',
+    'should support line endings around element breaks (5)'
+  )
+
+  t.equal(
+    toText(h('p', [h('span', 'A'), h('span', '\nb')])),
+    'A b',
+    'should support line endings around element breaks (6)'
+  )
+
+  t.equal(
+    toText(h('p', [h('span', 'A\n'), h('span', '\nb')])),
+    'A b',
+    'should support line endings around element breaks (7)'
+  )
+
+  t.equal(
+    toText(h('p', [h('span', 'A\n'), 'b'])),
+    'A b',
+    'should support line endings around element breaks (8)'
+  )
+
+  t.equal(
+    toText(h('p', [h('span', 'A'), '\nb'])),
+    'A b',
+    'should support line endings around element breaks (9)'
+  )
+
+  t.equal(
+    toText(h('p', [h('span', 'A\n'), '\nb'])),
+    'A b',
+    'should support line endings around element breaks (10)'
+  )
+
+  t.equal(
+    toText(h('div', [h('p', [h('span', 'A\n'), '\nb'])])),
+    'A b',
+    'should support line endings around element breaks (11)'
+  )
+

For a solution, there are two parts. The first loop in collectText is incorrect, and the correct break{before/after} should be given:

-  while (start < value.length) {
+  while (start <= value.length) {
     searchLineFeeds.lastIndex = start
     match = searchLineFeeds.exec(value)
     // @ts-expect-error: `index` is set.
     end = match ? match.index : value.length
-
-    lines.push(
-      // Any sequence of collapsible spaces and tabs immediately preceding or
-      // following a segment break is removed.
-      trimAndCollapseSpacesAndTabs(
-        // [...] ignoring bidi formatting characters (characters with the
-        // Bidi_Control property [UAX9]: ALM, LTR, RTL, LRE-RLO, LRI-PDI) as if
-        // they were not there.
-        value
-          .slice(start, end)
-          .replace(/[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/g, ''),
-        options.breakBefore,
-        options.breakAfter
-      )
-    )
-
+    lines.push(value.slice(start, end))
     start = end + 1
   }
 
+  lines = lines.map((line, i) =>
+    // Any sequence of collapsible spaces and tabs immediately preceding or
+    // following a segment break is removed.
+    trimAndCollapseSpacesAndTabs(
+      // […] ignoring bidi formatting characters (characters with the
+      // Bidi_Control property [UAX9]: ALM, LTR, RTL, LRE-RLO, LRI-PDI) as if
+      // they were not there.
+      line.replace(/[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/g, ''),
+      i === 0 ? options.breakBefore : true,
+      i === lines.length - 1 ? options.breakAfter : true
+    )
+  )
+

Second, is add a space at the start/end, if a line ending is found, and there is no break before/after.
When one is added, that should also affect the next node: it doesn’t need one anymore.
This last part is a bit similar to rehypejs/rehype-minify@64c6206#diff-1b9ff2153d4d4ef101fc2ade7c5df0bf95f910b7e1294785fc0d2d86fdebfa88.

@wooorm wooorm added help wanted 🙏 This could use your insight or help 🐛 type/bug This is a problem 👶 semver/patch This is a backwards-compatible fix labels Jun 23, 2021
@wooorm wooorm closed this in 387eff4 Jan 5, 2023
@github-actions

This comment has been minimized.

@wooorm wooorm added the 💪 phase/solved Post is done label Jan 5, 2023
@wooorm
Copy link
Member

wooorm commented Jan 5, 2023

Found it! :)

@github-actions github-actions bot removed 👍 phase/yes Post is accepted and can be worked on 🙆 yes/confirmed This is confirmed and ready to be worked on help wanted 🙏 This could use your insight or help labels Jan 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💪 phase/solved Post is done 👶 semver/patch This is a backwards-compatible fix 🐛 type/bug This is a problem
Development

Successfully merging this pull request may close these issues.

None yet

2 participants