Code blocks: Support partial text highlighting #1478

Merged
merged 21 commits into from
Feb 27, 2021

Conversation

ryoarmanda
Contributor

@ryoarmanda ryoarmanda commented Feb 12, 2021

What is the purpose of this pull request?

  • Feature addition or enhancement

Resolves #1381

Overview of changes:
Partial text highlighting is available with the line-slice syntax format lineNumber[start:end]. You can see this in action in the deployed UG here.

The processing is done in NodeProcessor in order to efficiently leverage the parsed HTML to traverse the code block node. For each line, the highlighting range is carried over by hl-start and hl-end attributes, specified from the markdown-it patch, which will be extracted in NodeProcessor.

In short, the traversal strategy is as follows:

  • If the node is a "text" node, determine whether some/all/none of the text should be highlighted. As this node only exists as a child of another node, inform the node's parent on the decision via return value.
  • If the node is a "tag" node, recursively call the traversal to each child, and collate the highlight data from the return value as above.
    • If all children want the node to highlight, we can just add the highlighted class in the node for conciseness
    • If not, handle the highlight for each child accordingly, which can include transforming "text" nodes into "tag" nodes so highlighting can be applied to them.
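
The traversal strategy above can be sketched on a simplified tree of plain objects. This is only an illustration under stated assumptions, not the actual NodeProcessor code: the node shape (`type`, `data`, `children`, `highlighted`), the helper names (`traverse`, `highlightLine`, `render`, `span`, `text`) and the returned `{ numChars, highlight }` object are all hypothetical, and the bounds are assumed to be 0-based and end-exclusive.

```javascript
// Hypothetical node shapes used only in this sketch:
//   { type: 'text', data: string }
//   { type: 'tag', name: string, children: Node[], highlighted?: boolean }
const span = children => ({ type: 'tag', name: 'span', children, highlighted: true });
const text = data => ({ type: 'text', data });

// hlStart / hlEnd are character offsets relative to the start of `node`.
function traverse(node, hlStart, hlEnd) {
  if (node.type === 'text') {
    const len = node.data.length;
    const start = Math.max(hlStart, 0);
    const end = Math.min(hlEnd, len);
    if (start >= end) return { numChars: len, highlight: false }; // none of the text
    if (start === 0 && end === len) return { numChars: len, highlight: true }; // all of it
    return { numChars: len, highlight: [start, end] }; // only some of it
  }

  // Tag node: visit each child, shifting the bounds by the characters already seen.
  let seen = 0;
  const results = node.children.map((child) => {
    const res = traverse(child, hlStart - seen, hlEnd - seen);
    seen += res.numChars;
    return res.highlight;
  });

  // Every child wants highlighting: ask the parent to highlight this node as a whole.
  if (results.length > 0 && results.every(r => r === true)) {
    return { numChars: seen, highlight: true };
  }

  // Otherwise handle each child on its own.
  node.children = node.children.flatMap((child, idx) => {
    const r = results[idx];
    if (r === false) return [child];
    if (r === true) {
      if (child.type === 'tag') {
        child.highlighted = true;
        return [child];
      }
      return [span([child])]; // a text node becomes a highlighted tag node
    }
    // r is [start, end]: split the text into before / highlighted / after.
    const [start, end] = r;
    return [
      text(child.data.slice(0, start)),
      span([text(child.data.slice(start, end))]),
      text(child.data.slice(end)),
    ].filter(piece => piece.type === 'tag' || piece.data.length > 0);
  });
  return { numChars: seen, highlight: false };
}

// The root has no parent, so the caller applies a "highlight everything" result itself.
function highlightLine(root, hlStart, hlEnd) {
  if (traverse(root, hlStart, hlEnd).highlight === true) root.highlighted = true;
  return root;
}

function render(node) {
  if (node.type === 'text') return node.data;
  const cls = node.highlighted ? ' class="highlighted"' : '';
  return `<${node.name}${cls}>${node.children.map(render).join('')}</${node.name}>`;
}

const line = {
  type: 'tag',
  name: 'span',
  children: [{ type: 'tag', name: 'span', children: [text('int')] }, text(' count = 0;')],
};
console.log(render(highlightLine(line, 4, 9)));
// prints <span><span>int</span> <span class="highlighted">count</span> = 0;</span>
```

Returning the decision to the parent keeps the output small: a fully covered subtree gets one class on its root instead of a wrapper around every descendant.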

With the addition of partial text highlighting, better support for range highlighting is achievable. Users can specify whether to
start/end the range with partial text highlight. Range highlight processing is modified to support this.

Anything you'd like to highlight / discuss:

  • Any comments for improvement on the general approach are appreciated. In particular, the DOM manipulation specifics (such as setting the references) might need another set of eyes; I don't know if I missed anything.
  • About range highlighting: In the past we agreed that the default (single-line) highlight style should be full-text, but we haven't really discussed the ranged version. Should the default highlight for the lines in the middle of the range be full-text or full-line? Currently it's full-text (and at the moment, full-line can only be invoked when empty line-slices are present), and I'm open to suggestions.
  • Edit: About the index convention: should it be 0-based or 1-based? As the slicing is akin to Python's slice, it's natural to start with 0, as users may be accustomed to doing it that way. However, I also feel that it's more natural to start counting the characters in the line from 1.

Testing instructions:

Proposed commit message: (wrap lines at 72 characters)
Code blocks: Support partial text highlighting

The code blocks line highlighting functionality only supports full-text
and full-line highlights. Authors who wish to only highlight certain
words or key variables are not able to do so with the supported syntax.

Let's add a new highlight syntax to provide authors with the ability to
partially highlight text in a line.


Checklist: ☑️

  • Updated the documentation for feature additions and enhancements
  • Added tests for bug fixes or features
  • Linked all related issues
  • No blatantly unrelated changes
  • Pinged someone for a review!

@ang-zeyu ang-zeyu mentioned this pull request Feb 13, 2021
Contributor

@raysonkoh raysonkoh left a comment

Nice work at getting this to work @ryoarmanda !

For the proposed syntax lineNumber[start:end], would it be better to have start and end refer to the index of a word in a line? I think this would reduce the work on the part of the user as they would not need to manually count the character indices.

What are your thoughts about this?

}

const [boundStart, boundEnd] = this.bounds;
const start = lineStart <= boundStart && boundStart <= lineEnd ? boundStart : lineStart;
Contributor

Should we put a parenthesis around the two conditions for readability?

Contributor Author

Sure, no problem
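
For reference, a minimal sketch of the parenthesised form being discussed. Only the `start` line comes from the diff above; the `clampBounds` wrapper and the symmetric `end` computation are assumptions added to make the sketch self-contained.

```javascript
// Hypothetical helper: keep a bound if it falls inside the line, otherwise
// fall back to the line's own start / end.
function clampBounds([boundStart, boundEnd], lineStart, lineEnd) {
  const start = (lineStart <= boundStart && boundStart <= lineEnd) ? boundStart : lineStart;
  // Assumed by symmetry with `start`; this line is not shown in the diff.
  const end = (lineStart <= boundEnd && boundEnd <= lineEnd) ? boundEnd : lineEnd;
  return [start, end];
}

console.log(clampBounds([2, 5], 0, 10)); // prints [ 2, 5 ]
console.log(clampBounds([-3, 20], 0, 10)); // prints [ 0, 10 ]
```

The parentheses do not change the result, since `&&` and `<=` already bind tighter than `?:`, but they make the grouping visible at a glance.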

return data;
});

if (shouldHighlight.every(v => v === true)) {
Contributor

Would it be better to have .every(v => v) if shouldHighlight is an array of booleans?

Contributor Author

Not all items in the shouldHighlight array are booleans. It is just an array to collate the return values as described at the beginning of the function, each of which may be true, false, or an array of two numbers. I explicitly used === true and === false so as not to include the array of two numbers (which is truthy in itself).

In hindsight maybe I should call this just highlightData
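
A small illustration of the difference, using made-up values for the collated array (the variable names here are hypothetical):

```javascript
// Each entry is true, false, or a [start, end] pair reported by a text child.
const highlightData = [true, [2, 5], true];

// Truthiness check: the array [2, 5] is truthy, so this reports "all highlighted".
const allTruthy = highlightData.every(v => v);

// Strict check: only the literal `true` counts, so the partial child is noticed.
const allTrue = highlightData.every(v => v === true);

console.log(allTruthy, allTrue); // prints true false
```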

// Essentially, we have to change the text node to become a tag node

node.children.forEach((child, idx) => {
if (shouldHighlight[idx] === false) {
Contributor

Would it be better to have !shouldHighlight[idx] if shouldHighlight is an array of booleans?

const text = child.data;
let newElement;

if (shouldHighlight[idx] === true) {
Contributor

This might be similar to the issue above.

@ryoarmanda
Contributor Author

ryoarmanda commented Feb 14, 2021

For the proposed syntax lineNumber[start:end], would it be better to have start and end refer to the index of a word in a line? I think this would reduce the work on the part of the user as they would not need to manually count the character indices.

Hmm I was working under the spec on the original discussion here. My two cents is, if the unit of highlighting is words rather than characters, we would have to do some extra work in determining which of these characters can be grouped up as words. Moreover, with how code is generally written, sometimes users may want to highlight beyond words, such as words continued with special characters (like List<Item>), etc, and then the distinction of words becomes blurry (e.g. do we add special characters into words?)

@damithc
Contributor

damithc commented Feb 14, 2021

For the proposed syntax lineNumber[start:end], would it be better to have start and end refer to the index of a word in a line? I think this would reduce the work on the part of the user as they would not need to manually count the character indices.

Having to count characters is not ideal, but I guess we can anticipate users will do that in two passes i.e., uses a rough count in the first try and tweak the numbers based on the result.
We can consider supporting an additional format where the user specifies the exact text to highlight e.g., 12['void main'] but that can be a separate PR, if at all?

@ryoarmanda
Contributor Author

We can consider supporting an additional format where the user specifies the exact text to highlight e.g., 12['void main'] but that can be a separate PR, if at all?

This can be a good addition as the easier version of the line-slice syntax, provided the section to be highlighted is not that long. The line-slice syntax can then be a more advanced-user variant where users can express highlights in a more concise way. In fact, we can build up this easier syntax on top of the line-slice one.

But (user-side) writing and (dev-side) parsing the rule can be quite tricky: there might be quotes present in the text to be highlighted, in which case the user may have to manually escape them in order for the whole highlight-lines value to stay properly parsed, and MarkBind needs to match and unescape them as well.

May need to put some proper thought into this, I might address it in a separate PR for now.

@damithc
Contributor

damithc commented Feb 14, 2021

But (user-side) writing and (dev-side) parsing the rule can be quite tricky: there might be quotes present in the text to be highlighted, in which case the user may have to manually escape them in order for the whole highlight-lines value to stay properly parsed, and MarkBind needs to match and unescape them as well.

True. But given it is an additional convenience feature only, we can specify under what conditions the simpler syntax should be used. In all other cases, the user should use the slice syntax instead. Hence, we can avoid dealing with those complex corner cases.

@ryoarmanda
Contributor Author

ryoarmanda commented Feb 14, 2021

True. But given it is an additional convenience feature only, we can specify under what conditions the simpler syntax should be used. In all other cases, the user should use the slice syntax instead. Hence, we can avoid dealing with those complex corner cases.

Just now I found out a way to support the convenience syntax on top of the line-slice processing (behind the scenes, the former rule will be converted to the latter, before handing over to NodeProcessor). I might push it here after docs and tests updates.

But yeah, looks like we have a hard restriction, for some reason we can't have the same type of quotes that specifies the whole highlight-lines attribute in whatever inside the []s, even with backslash-escaping. I'm not sure where the main issue lies here, maybe with markdown-it-attrs and/or the patch of it.

So, something with highlight-lines="4['It\"s designed']" can't be done, as is highlight-lines="4['It"s designed']".
But, highlight-lines="4['It\'s designed']" can (and interestingly unexpected: highlight-lines="4['It's designed']" is parsed properly even if there are no backslash-escapes)

@ang-zeyu
Contributor

So, something with highlight-lines="4['It\"s designed']" can't be done, as is highlight-lines="4['It"s designed']".
But, highlight-lines="4['It\'s designed']" can (and interestingly unexpected: highlight-lines="4['It's designed']" is parsed properly even if there are no backslash-escapes)

How about sticking to positioning? we could introduce something like 4[1::3] (:: for words instead of :)

Just now I found out a way to support the convenience syntax on top of the line-slice processing (behind the scenes, the former rule will be converted to the latter, before handing over to NodeProcessor). I might push it here after docs and tests updates.

but you can try removing the patch now as well as its no longer needed with #1403

it was introduced as we previously had multiple nunjucks passes to make {% raw %} work, and one markdown pass in between, causing issues with {% raw %} and markdown-it-attrs #1220

@ryoarmanda
Contributor Author

How about sticking to positioning? we could introduce something like 4[1::3] (:: for words instead of :)

Perhaps we can explore this syntax as well. I might define a word as just a sequence of non-whitespace characters in order to include special characters (e.g. List<Item> is considered one word). Otherwise, discerning words can be tricky.

However, specific portions of a word might not be highlightable, which I guess is to be expected, as word-level highlighting is inherently less fine than character-level.

For example, if a user wants to highlight only the Item part of private List<Item> items, they can only do so with the character-level highlight.
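
A rough sketch of how a word slice could be converted to a character slice under that definition of a word (a run of non-whitespace characters). The function name `wordSliceToCharSlice` and the 0-based, end-exclusive word indices are assumptions for illustration; the index convention is still an open question in this PR.

```javascript
// Returns the [start, end) character range covering words startWord..endWord,
// or null when the slice selects no word.
function wordSliceToCharSlice(line, startWord, endWord) {
  // Each match of \S+ is one "word"; record its character range.
  const words = [...line.matchAll(/\S+/g)].map(m => [m.index, m.index + m[0].length]);
  const selected = words.slice(startWord, endWord);
  if (selected.length === 0) return null;
  return [selected[0][0], selected[selected.length - 1][1]];
}

const code = 'private List<Item> items';
const [start, end] = wordSliceToCharSlice(code, 1, 2);
console.log(code.slice(start, end)); // prints List<Item>
```

As noted above, `Item` on its own cannot be selected this way; the smallest unit is the whole `List<Item>` token.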

With this syntax, do you think we should keep the previously proposed one?

@ang-zeyu
Contributor

With this syntax, do you think we should keep the previously proposed one?

I'm open to either. Specifying the words is more flexible, but also more repetitive and verbose.
Specifying positions seems more consistent with character highlight and less verbose, but may be more restrictive as you mentioned; users would then have to fall back to character highlight.

On a personal stance I would go with positions however, for consistency with the whole line-slice syntax. Also, since repetition is against the trend here =P.

Any thoughts? @damithc @raysonkoh

So, something with highlight-lines="4['It\"s designed']" can't be done, as is highlight-lines="4['It"s designed']".
But, highlight-lines="4['It\'s designed']" can (and interestingly unexpected: highlight-lines="4['It's designed']" is parsed properly even if there are no backslash-escapes)

Does &quot; work?

backslash escapes \" are only a commonmark spec, I suppose markdown-it-attrs is free to disregard this =X

@damithc
Contributor

damithc commented Feb 16, 2021

On a personal stance I would go with positions however, for consistency with the whole line-slice syntax. Also, since repetition is against the trend here =P.

Any thoughts? @damithc @raysonkoh

@ryoarmanda what are the choices being considered here? I haven't been following the previous discussion.

@ryoarmanda
Contributor Author

ryoarmanda commented Feb 17, 2021

What are the choices being considered here? I haven't been following the previous discussion.

To sum up the discussion, we have two proposed syntaxes for word-highlighting.

One is the line-part syntax lineNumber[part] (e.g. 4['void main'])

  • Pros:
    • Straightforward usage on the user-side (if they want to highlight void main, just write void main within the bracket)
    • Retains the fine granularity of character-highlighting, so highlighting a part of a word is achievable (e.g. user can highlight Item from List<Item> just fine with this syntax)
  • Limitations:
    • Can become quite repetitive and verbose as users have to repeat the content to be highlighted
    • Somehow we can't use the same type of quote as the one used for specifying the highlight-lines value, even with escaping (still under investigation). So, this limits the content to text that doesn't contain that quote.

Another is the word variant of line-slice syntax lineNumber[start::end] (note the double :s, e.g. 4[1::3])

  • Pros:
    • Similar to the character variant of line-slice syntax, so user does not need to remember an entirely different syntax.
    • More concise
  • Limitations:
    • Highlight is word-level, so granularity is ultimately less fine than character highlight (using the previous example, highlighting Item from List<Item> cannot be achieved with this syntax, and users have to fall back to character highlight)

With their own pros and limitations, which of the two do you prefer to be supported? Or should we just support both at once?

@damithc
Contributor

damithc commented Feb 17, 2021

To sum up the discussion, we have two proposed syntaxes for word-highlighting.

Why not support both but let lineNumber[start::end] specify character positions instead of word positions?
Or lineNumber[start:end] for character positions and lineNumber[start::end] for word positions?

IIRC, we used [ : ] syntax earlier with the plan to extend it for more fine grain control. So, shouldn't we continue with the single colon syntax?

Sorry if these were discussed before.

@raysonkoh
Contributor

I think we can support lineNumber[start:end] for character positions and lineNumber[start::end] for word positions for a start.

Somehow we can't use the same type of quote as the one used for specifying the highlight-lines value, even with escaping (still under investigation). So, this limits the content to text that doesn't contain that quote.

For lineNumber[part], this would be more of a convenient syntax (i.e. anything that can be achieved by lineNumber[part] can be achieved by lineNumber[start:end], but user types less) but as you said, more investigation is needed for some edge case. I think we can raise a separate PR for that.

@raysonkoh
Contributor

Some additional concerns about the expected behavior of lineNumber[part]:

  1. If there are duplicate parts in the line, do we highlight all parts or only the first occurrence?
  2. If part is a substring of another word in the same line, would it get highlighted too?

@damithc
Contributor

damithc commented Feb 17, 2021

Some additional concerns about the expected behavior of lineNumber[part]:

  1. If there are duplicate parts in the line, do we highlight all parts or only the first occurrence?
  2. If part is a substring of another word in the same line, would it get highlighted too?

My vote:

  1. all
  2. yes
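
A minimal sketch of what the voted behaviour (all occurrences, including those inside longer words) could look like when a part is turned into character slices. `findPartSlices` is a hypothetical name, matches are assumed to be non-overlapping, and this is not the code currently in the PR.

```javascript
// Returns every [start, end) range where `part` occurs in `line`.
function findPartSlices(line, part) {
  const slices = [];
  if (part.length === 0) return slices; // an empty part matches nothing
  let from = line.indexOf(part);
  while (from !== -1) {
    slices.push([from, from + part.length]);
    // Continue after the current match so that matches do not overlap.
    from = line.indexOf(part, from + part.length);
  }
  return slices;
}

console.log(findPartSlices('int item = items[0];', 'item'));
// prints [ [ 4, 8 ], [ 11, 15 ] ]
```

The second range falls inside `items`, which is the substring case from question 2.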

@ang-zeyu
Contributor

ang-zeyu commented Feb 17, 2021

IIRC, we used [ : ] syntax earlier with the plan to extend it for more fine grain control. So, shouldn't we continue with the single colon syntax?

yup. character positions would be supported in either choice

Or lineNumber[start:end] for character positions and lineNumber[start::end] for word positions?

I think we should stick with this (versus single colon for words). It's consistent with the existing 1[:] (meaning 'first' to 'last' character) whole line syntax.

Why not support both but let lineNumber[start::end] specify character positions instead of word positions?

I'm open to supporting both as well (although it is a niche feature) (are we going with both?)

@ryoarmanda

If possible, this PR could be split into 4 commits (or prs) as well:

  • character positions
  • word positions (if you're up for it) (or can handle in another PR too - preferably here for 'chronological' order)
  • part syntax
  • docs

I'll be doing a couple things shortly as well to make the reviewing easier:

  • some reorganization of the markdown-it directory (the files here untouched)
  • then enabling eslint appropriately as mentioned previously (hopefully not too many conflicts, do eslint the modifications before merging in to reduce the conflicts)

@ang-zeyu
Contributor

ang-zeyu commented Feb 17, 2021

then enabling eslint appropriately as mentioned previously (hopefully not too many conflicts, do eslint the modifications before merging in to reduce the conflicts)

All done. Sorry, this should really have been done beforehand. 😓

diff for highlight/ directory: https://github.com/MarkBind/markbind/pull/1481/files

@ryoarmanda
Contributor Author

ryoarmanda commented Feb 18, 2021

If possible, this PR could be split into 4 commits (or prs) as well:

  • character positions
  • word positions (if you're up for it) (or can handle in another PR too - preferably here for 'chronological' order)
  • part syntax
  • docs

Currently in this PR there are already character positions and part syntax (although the part syntax does not yet support highlighting multiple occurrences as proposed by @damithc , will try to figure out more). I will develop the word positions syntax here too.

I am not sure if this is too much introduction of new features in one PR, should I break away the part syntax to another PR, or would it be fine to keep it here as well?

@ang-zeyu
Contributor

I am not sure if this is too much introduction of new features in one PR, should I break away the part syntax to another PR, or would it be fine to keep it here as well?

Both sound fine. It all ties under one logical change (partial text highlighting), but is a little large as you mentioned.

Let's split up the commits if sticking to the PR though. You can also use fixups (or just rebase in your git UI) to address reviews later in new, separate commits, then squash them with the relevant commit once done.

@@ -176,6 +176,142 @@ class NodeProcessor {
cheerio(node).remove();
}

/*
Contributor

Let's move these to a separate file for a start

@@ -9,6 +9,14 @@ const {
markdownFileExts,
} = require('../constants');

const htmlUnescapedMapping = {
Contributor

how about something like https://www.npmjs.com/package/he?

might become a little tedious to maintain the list

Contributor Author

This mapping is taken from markdown-it library, just reversed. I did this as in our codebase we use markdown-it/utils function escapeHtml for a lot of purposes, but that package didn't expose unescaping methods for HTML, so I took it upon myself to add a complementary function. Looks like it's going to be only that 5 entries though, as this function only aims to reverse the escapeHtml function and that function only escapes those 5 entries.

Though if you feel we can add a new package to make everything concise I'm also okay with it.

Contributor

ahh, thanks for clarifying!

In that case let's stick with this then, we might not want to unescape things markdown-it did not escape as well.

On the same lines, should we put this somewhere in the lib/markdown-it folder? Also a comment to document why / where the mappings came from may help 🙂

Contributor Author

👍 I agree. This is more related to operations involving markdown-it as well so it makes sense to put it in the lib/markdown-it folder. Though I can't actually expose the new function through the markdownIt object in lib/markdown-it/index.js, and I can't disturb the exports there unless I replace all imports in the project, so I will just make it available for manual import through lib/markdown-it/utils.

It's the best middle ground I have, but I'm afraid it will somewhat be confusing as now we have two different modules for utilities on markdown-it, one from the library itself usually referred to as md.utils, the other is the new utils module. I might put an explanation on the top of our utils module that this is a separate extension of the one from the library.
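
A sketch of what such a complementary unescape function might look like. The body of `htmlUnescapedMapping` is not shown in this thread, so the five entries below are a guess rather than the PR's actual list, and `unescapeHtml` is a hypothetical name.

```javascript
// Assumed mapping: entity -> original character (entries are a guess).
const htmlUnescapedMapping = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#39;': "'",
};

// A single pass over the string, so an escaped ampersand is not unescaped twice
// (e.g. "&amp;lt;" becomes "&lt;", not "<").
function unescapeHtml(str) {
  return str.replace(/&(?:amp|lt|gt|quot|#39);/g, entity => htmlUnescapedMapping[entity]);
}

console.log(unescapeHtml('List&lt;Item&gt; &amp;amp; &quot;x&quot;'));
// prints List<Item> &amp; "x"
```

Doing the replacement in one regex pass, rather than chaining one `replace` per entity, avoids having to think about the order in which entities are reversed.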

Contributor

@ang-zeyu ang-zeyu left a comment

Nice work! 🙂

Just 1 / 2 edge cases and some minor suggestions for the code:

const highlightLinesInput = getAttributeAndDelete(token, 'highlight-lines');
let highlightRules = [];
if (highlightLinesInput) {
const highlightLines = highlightLinesInput.split(',');
Contributor

could this cause issues if the word variant contains ,?

Contributor Author

Yeah, this will also eat up the commas inside the brackets. I'll add a regex for a more robust split so that it can ignore the inner commas.
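
One possible shape for such a split, as a sketch only: the regex below is not taken from the PR, and `splitHighlightRules` is a hypothetical name. It splits on a comma unless the next bracket character after it is a closing `]`, i.e. unless the comma sits inside a pair of brackets.

```javascript
function splitHighlightRules(input) {
  // (?![^\[\]]*\]) : do not split if a ']' follows before any other bracket.
  return input.split(/,(?![^\[\]]*\])/).map(rule => rule.trim());
}

console.log(splitHighlightRules("2[:],4['a,b'],5-6"));
// prints [ '2[:]', "4['a,b']", '5-6' ]
```

This lookahead approach assumes brackets inside a rule are balanced and not nested in unusual ways; a part that itself contains an unmatched `[` or `]` would need a small hand-written scanner instead.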

highlightRules.forEach((rule) => {
// Note: authors provide line numbers based on the 'start-from' attribute if it exists,
// so we need to shift line numbers back down to start at 0
rule.offsetLines(-startFromZeroBased);
Contributor

how about shifting this into the parsing / construction stage? so index.js doesn't need to be concerned with accidentally (not) calling this method; we keep these concerns isolated there

Contributor Author

Oh yeah, I hadn't thought of this approach. I can take it one step further so that the result of parseRuleComponent is always a properly defined rule component wrt the code block (i.e. already accounting for line offset, figuring out the actual bounds wrt the line, converting to word slices wrt the line, etc), or null if it's not valid.

So, no need to have intermediate properties such as isWordSlice, linePart, anymore.
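
To make the idea concrete, here is a rough sketch of a parse step that returns either a fully resolved component or null. The signature, the returned object shape and the handling of empty indices are assumptions for illustration; only the character variant `lineNumber[start:end]` is covered, and the real parseRuleComponent may differ.

```javascript
// lineOffset: amount to shift author-provided line numbers (e.g. from 'start-from').
// lines: the code block's lines, used to resolve open-ended bounds.
function parseRuleComponent(compString, lineOffset, lines) {
  const match = /^(\d+)(?:\[(\d*):(\d*)\])?$/.exec(compString.trim());
  if (!match) return null; // not a recognised component

  const lineNumber = Number(match[1]) + lineOffset; // 1-based after shifting
  if (lineNumber < 1 || lineNumber > lines.length) return null;

  // No slice given: leave bounds unset so the caller applies the default highlight.
  if (match[2] === undefined) return { lineNumber, bounds: null };

  // Empty indices default to the start / end of the line (an assumption of this sketch).
  const line = lines[lineNumber - 1];
  const start = match[2] === '' ? 0 : Number(match[2]);
  const end = match[3] === '' ? line.length : Number(match[3]);
  return { lineNumber, bounds: [start, end] };
}

const lines = ['a', 'int x = 1;', 'b'];
console.log(parseRuleComponent('12[4:5]', -10, lines));
// prints { lineNumber: 2, bounds: [ 4, 5 ] }
```

Because the offset and bounds are resolved here, callers never see a half-processed rule and there is no separate offsetting step to forget.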

}

// Convert word variant of line-slice to char
if (rule.hasWordSlice()) {
Contributor

similarly here and hasLinePart

the checks may not be needed as well, since you already have the guards in the convertPartsToSlices / convertWordSliceToCharSlice methods

}

const line = lines[comp.lineNumber - 1]; // line numbers are 1-based
const { 1: content } = HighlightRule._splitCodeAndIndentation(line);
Contributor

could array destructuring be used instead?

Suggested change
const { 1: content } = HighlightRule._splitCodeAndIndentation(line);
const [, content] = HighlightRule._splitCodeAndIndentation(line);
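
Both forms pick out the second element; the array pattern just says so more directly. A small sketch, where `splitCodeAndIndentation` is a stand-in written for this example and assumed to return `[indents, content]` like the real helper:

```javascript
// Stand-in helper: splits leading whitespace from the rest of the line.
function splitCodeAndIndentation(codeStr) {
  const content = codeStr.trimStart();
  const indents = codeStr.slice(0, codeStr.length - content.length);
  return [indents, content];
}

// Object pattern: reads the property named "1" of the returned array.
const { 1: viaObjectPattern } = splitCodeAndIndentation('    return x;');

// Array pattern: the leading comma skips the first element.
const [, viaArrayPattern] = splitCodeAndIndentation('    return x;');

console.log(viaObjectPattern === viaArrayPattern); // prints true
```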

Contributor

similar to earlier, could we move this (and the similar logic in convertWordSliceToCharSlice) down into HighlightRuleComponent construction?

So all the heavy lifting for the component stays in HighlightRuleComponent, HighlightRule just facilitates accessing / using these as a whole 🙂

return new HighlightRule(components);
}

offsetLines(offset) {
this.ruleComponents.forEach(comp => comp.offsetLineNumber(offset));
}

convertPartsToSlices(lines) {
if (!this.hasLinePart()) {
Contributor

this guard may be unneeded as well since you have !comp.linePart below

return;
}

codeNode.children.forEach((line) => {
Contributor

Suggested change
codeNode.children.forEach((line) => {
codeNode.children.forEach((codeEl) => {

* @returns {[number, boolean | [number, number]]} An array of two items.
*/
function traverseLinePart(node, hlStart, hlEnd) {
// Return value is an array of two items:
Contributor

let's use /*...*/ for multiline comments per https://github.com/airbnb/javascript#comments
doesn't seem to be configured correctly in our eslint rules 🤔

// 2. Highlighting data to be used by the node's parent. It can be:
// - true (ask to apply highlighting from parent)
// - false (do not process this node further)
// - array of two numbers (only for text nodes, inform parent to
Contributor

how about an object with named properties?

{
  numCharTraversed: ...,
   ....
}

});

// Set the references accordingly
node.children.forEach((child, idx) => {
Contributor

if I read this right, you could use (in the earlier foreach) cheerio(child).wrap('<span class="highlighted"></span>'), so you don't have to fiddle with the lower level linking

Contributor Author

I think this certainly is more concise if the text is undisturbed (like the case when highlightData[idx] === true i.e. highlight spans the whole text), but so far I wasn't able to work a similar solution for the else case (highlight only spans partially). The problem is that the case needs to break up the text, creating multiple elements/nodes in the process before being wrapped up by a span. I tried out something like cheerio(child).wrap('<span></span>'); cheerio(child).html(...) and it doesn't work :/

Contributor Author

Update: I found out a way to make it work with cheerio, will implement this approach instead of manually assigning the references.


To highlight only the text portion of the line, you can just use the line numbers as is.
You can specify the rules in many different ways, depending on how you want it to be. There are three main variants:
full text, substring, bounded (character-wise or word-wise), or full line highlighting.
Contributor

seems a little lengthy; Could we compress all this into a table? 🤔

Contributor Author

Yeah I agree :/ Felt like a lot to explain from the various types, syntax formats, and its limitations. But not sure how far I can condense it to a table, looks like some parts may need to just have a brief sentence or two without making the table feel like a wall of text. I'll try to do it soon.

Contributor

Looks really neat now 👍 thanks!

@@ -56,8 +56,8 @@ Content in a fenced code block
20
```

**`highlight-lines` attr with line-slice syntax of empty indices should highlight leading/trailing spaces
```xml {highlight-lines="2[:],4[:]-5[:]"}
**`highlight-lines` attr with empty (any variant) line-slice syntax should highlight leading/trailing spaces**
Contributor

let's add in the edge cases (if they are) earlier once fixed as well

@ryoarmanda
Contributor Author

Hi @ang-zeyu, I have reworked the processing flow for the highlight rules, now the major work is done during parsing the component rule (including offsetting line numbers, computing actual bounds, etc) so it is ensured that after parsing, the rule can be applied immediately to the code block without further processing. Can you help review whether the flow is streamlined enough? Thanks!

Contributor

@ang-zeyu ang-zeyu left a comment

Really excellent work, especially on the test cases and docs.

Just a few more nits:

* @param node The node of the line part to be traversed
* @param hlStart The highlight start position, relative to the start of the line part
* @param hlEnd The highlight end position, relative to the start of the line part
* @returns {object} An array of two items.
Contributor

missed one here

let curr = 0;
const highlightData = node.children.map((child) => {
  const data = traverseLinePart(child, hlStart - curr, hlEnd - curr);
  curr += data.numCharsTraversed;
Contributor

Would `resData.numCharsTraversed += data.numCharsTraversed;` be more concise, since `curr` isn't used anywhere else?

Contributor Author

True, but replacing `curr` with `resData.numCharsTraversed` might end up being more verbose (especially at the `data` constant declaration there). But I'll try to remove `curr` and rework `hlStart - curr` and `hlEnd - curr` to be computed before the recursive call.
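
To make the discussion concrete, here is a self-contained sketch of the traversal strategy from the PR description, with the running character count kept as the only piece of state. It is an illustration under assumptions, not the merged implementation: the node shape (`type`, `data`, `attribs`, `children`), the `coverage` labels, and the helpers `text`, `span`, `addClass`, `splitText`, and `highlightLine` are all hypothetical. Only `traverseLinePart`, `numCharsTraversed`, `hlStart`, `hlEnd`, and the `highlighted` class come from the thread.

```javascript
// Hypothetical node shape, loosely modelled on a parsed-HTML tree:
//   text node: { type: 'text', data }   tag node: { type: 'tag', name, attribs, children }
const HL = 'highlighted';
const text = data => ({ type: 'text', data });
const span = (children, cls) => ({
  type: 'tag', name: 'span', attribs: cls ? { class: cls } : {}, children,
});

function addClass(node, cls) {
  const existing = node.attribs.class;
  node.attribs.class = existing ? `${existing} ${cls}` : cls;
}

// Splits a partially covered text node into plain / highlighted / plain pieces.
function splitText(node, [start, end]) {
  const { data } = node;
  return [
    text(data.slice(0, start)),
    span([text(data.slice(start, end))], HL),
    text(data.slice(end)),
  ].filter(n => n.type === 'tag' || n.data.length > 0);
}

// hlStart/hlEnd are relative to the start of `node`; the end bound is exclusive.
function traverseLinePart(node, hlStart, hlEnd) {
  if (node.type === 'text') {
    const len = node.data.length;
    const start = Math.max(0, hlStart);
    const end = Math.min(len, hlEnd);
    if (start >= end) return { numCharsTraversed: len, coverage: 'none' };
    if (end - start === len) return { numCharsTraversed: len, coverage: 'all' };
    return { numCharsTraversed: len, coverage: 'some', bounds: [start, end] };
  }

  let numCharsTraversed = 0;
  const results = node.children.map((child) => {
    // Shift the bounds so they are relative to the start of this child.
    const data = traverseLinePart(child, hlStart - numCharsTraversed, hlEnd - numCharsTraversed);
    numCharsTraversed += data.numCharsTraversed;
    return data;
  });

  // Fully covered or untouched: let the parent decide, leave this subtree as-is.
  if (results.length > 0 && results.every(r => r.coverage === 'all')) {
    return { numCharsTraversed, coverage: 'all' };
  }
  if (results.every(r => r.coverage === 'none')) {
    return { numCharsTraversed, coverage: 'none' };
  }

  // Mixed coverage: apply the highlight child by child.
  node.children = node.children.flatMap((child, i) => {
    const { coverage, bounds } = results[i];
    if (child.type === 'tag') {
      if (coverage === 'all') addClass(child, HL);
      return [child]; // 'handled' and 'none' tags need no further work
    }
    if (coverage === 'all') return [span([child], HL)];
    if (coverage === 'some') return splitText(child, bounds);
    return [child];
  });
  return { numCharsTraversed, coverage: 'handled' };
}

// Entry point for one line; assumes the line is wrapped in a tag node.
function highlightLine(lineNode, hlStart, hlEnd) {
  const { coverage } = traverseLinePart(lineNode, hlStart, hlEnd);
  if (coverage === 'all') addClass(lineNode, HL);
  return lineNode;
}
```

In this version the running total doubles as the offset, so no separate `curr` is needed. Whether that reads better than the original is a judgment call.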

* @return
*/
function highlightCodeBlock(node) {
const codeNode = node.children.find(c => c.name === 'code');
Contributor

oops 🙈, misread this

const [indents, content] = HighlightRule._splitCodeAndIndentation(codeStr);
return `<span>${indents}<span class="highlighted">${content}</span>\n</span>`;
static _highlightPartOfText(codeStr, bounds) {
// Note: As part-of-text highlighting requires walking over the node of the generated
Contributor

multiline here

</foo>
```

**`highlight-lines` attr with partial word-variant line-slice syntax should defaults highlight to start/end of line**
Contributor

Suggested change
**`highlight-lines` attr with partial word-variant line-slice syntax should defaults highlight to start/end of line**
**`highlight-lines` attr with partial word-variant line-slice syntax should default highlight to start/end of line**

**Ranged full text highlight**<br>Highlights only the text portion of the lines within the range | `lineStart-lineEnd` | `2-4`
**Ranged full line highlight**<br>Highlights the entirety of the lines within the range | `lineStart[:]-lineEnd` or `lineStart-lineEnd[:]` | `1[:]-5`,`10-12[:]`
**Ranged character-bounded highlight**<br>Highlights the text portion of the lines within the range, but starts/ends at an arbitrary character | `lineStart[start:]-lineEnd` or `lineStart-lineEnd[:end]` | `3[2:]-7`, `4-9[:17]`
**Ranged word-bounded highlight**<br>Highlights the text portion of the lines within the range, but starts/ends at an arbitrary word | `lineStart[start::]-lineEnd` or `lineStart-lineEnd[::end]` | `16[1::]-20`,`22-24[::3]`

<include src="codeAndOutputCode.md" boilerplate >
Contributor

let's shift this up before the value of highlight-lines... so the user has a brief idea what the usage is like first :-)

Contributor

might have missed one ^ @ryoarmanda
disregard if it looks stranger

Contributor Author

I did this already @ang-zeyu, I just reworded the value of highlight-lines... to the ones in line 85-86 of the file

Edit: strange it doesn't show up in the diffs, you can look for it on the preview site, it's already changed there. Maybe because I force-pushed as I rebased a fixup?

Contributor

this

was suggesting the reverse - moving the example up (the <include> tag), so the user gets a brief overview of the entire syntax usage first before the details

Contributor Author

Ahh okay, will rebase the fix shortly

Contributor Author

Done, the change is reverted in the force push :)

For ranges, you only need to use line-slices on either ends.
Type | Format | Example
-----|--------|--------
**Ranged full text highlight**<br>Highlights only the text portion of the lines within the range | `lineStart-lineEnd` | `2-4`
Contributor

Suggested change
**Ranged full text highlight**<br>Highlights only the text portion of the lines within the range | `lineStart-lineEnd` | `2-4`
**Ranged full text highlight**<br>Highlights from the first non-whitespace character to the last non-whitespace character | `lineStart-lineEnd` | `2-4`

interpretation of text portion is subjective (e.g. in-between whitespaces should / should not be highlighted) =P
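
As an illustration of the suggested wording, the "first non-whitespace character to the last non-whitespace character" reading can be pinned down in a few lines. `textBounds` is a hypothetical helper written for this comment, not a function from the PR. It returns half-open `[start, end)` bounds, and whitespace between words stays inside the highlighted span.

```javascript
// Hypothetical helper: bounds of a line from its first to its last
// non-whitespace character, as [start, end) with an exclusive end.
function textBounds(line) {
  const start = line.search(/\S/); // index of first non-whitespace char, or -1
  if (start === -1) return [0, 0]; // blank line: nothing to highlight
  const end = line.trimEnd().length; // one past the last non-whitespace char
  return [start, end];
}

const line = '    <foo  bar="1">   ';
const [start, end] = textBounds(line);
console.log(JSON.stringify(line.slice(start, end)));
```

The slice keeps the double space inside the tag but drops the indentation and trailing spaces, which matches the reworded description.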

Contributor Author

Ah yes, will do

-----|--------|--------
**Ranged full text highlight**<br>Highlights only the text portion of the lines within the range | `lineStart-lineEnd` | `2-4`
**Ranged full line highlight**<br>Highlights the entirety of the lines within the range | `lineStart[:]-lineEnd` or `lineStart-lineEnd[:]` | `1[:]-5`,`10-12[:]`
**Ranged character-bounded highlight**<br>Highlights the text portion of the lines within the range, but starts/ends at an arbitrary character | `lineStart[start:]-lineEnd` or `lineStart-lineEnd[:end]` | `3[2:]-7`, `4-9[:17]`
Contributor

Like ranged full text highlight, but...
the word one too

@ryoarmanda
Contributor Author

The docs have been updated to address the comments. Please have a look again and let me know if there is anything I missed. Thanks for all the reviews so far @ang-zeyu 🙇

Contributor

@ang-zeyu left a comment

Lgtm 👍

@ang-zeyu added this to the v3.0 milestone Feb 27, 2021
@ang-zeyu merged commit 619f457 into MarkBind:master Feb 27, 2021
@damithc
Contributor

damithc commented Feb 28, 2021

@ryoarmanda try to avoid the 'all possible usages in one example' technique. It might work for test cases, but it is not a good option in user documentation. From the user's POV, the example is too complicated and unrealistic. Can you do a minor PR to split the example into separate small examples?

@damithc
Contributor

damithc commented Feb 28, 2021

BTW, kudos for getting this difficult feature to work @ryoarmanda 💯

Development

Successfully merging this pull request may close these issues.

Fenced code blocks: support highlighting words
4 participants