Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Notebook: Find in both text model and output #140098

Closed
rebornix opened this issue Jan 4, 2022 · 4 comments · Fixed by #140180
Closed

Notebook: Find in both text model and output #140098

rebornix opened this issue Jan 4, 2022 · 4 comments · Fixed by #140180
Assignees
Labels
feature-request Request for new features or functionality notebook-find on-testplan
Milestone

Comments

@rebornix
Copy link
Member

rebornix commented Jan 4, 2022

We will explore how to support searching text in both text model and outputs. The major challenge is that the notebook is a virtualized list view and we need to search the DOM content.

@rebornix rebornix added this to the January 2022 milestone Jan 4, 2022
@rebornix rebornix added notebook-find feature-request Request for new features or functionality labels Jan 4, 2022
@rebornix
Copy link
Member Author

rebornix commented Jan 6, 2022

Most browsers support basic text search but it comes with quite some weaknesses: no UI customization, no regular expression support, no support for Replace, and most importantly, not working well if the view is virtualized (as it can only search for what’s in the DOM tree). This is one of the reasons why we build our own find widget for the text editor and notebook. It searches on the text model of the resources, instead of rendered view / DOM tree. The search operations are synchronous and most of the time performant.

With that being said, searching on text model has it limitation. It doesn’t work for outputs in notebook documents as outputs can be stored in any format (e.g., binary), and since they are rendered by custom renderers, what’s stored in ipynb file is not necessarily the same as what users would see in the editor after they are rendered.

One simple example is markdown outputs. Say we have a markdown output a**b** . It would be rendered as a<strong>b</strong> in the webview, so if we search for ab, we won’t be able to get any match in its source but browsers’ find dialog can find ab properly.

image

The notebook editor in the core is a virtualized list as we don’t want to render all cell inputs (rendering monaco editors are costly). To allow users to have a seamless search experience, we would need to search on both text models and rendered outputs:

  • Text search on text models of notebook cells. This doesn’t require the cell to be rendered or attached to monaco editor. This is already supported.
  • Text search on outputs DOM nodes.

There is no simple browser API for searching string on DOM so we would explore what we can do to implement something close to the browser’s find experience, which would include:

  • Find all matches of a string in the DOM
  • Highlight all matches and the active match

Search on the DOM

The browser find widget is not accessible to JavaScript applications. The only API available to us is window.find. It can search string on the DOM tree sequentially but it’s stateless and only searches one at a time. To get all the matches, we would need to run window.find multiple times until it searches through the whole DOM tree.

let found = true;
const ranges = [];

while (found) {
  found = window.find(query,
    /* caseSensitive*/ false,
    /* backwards*/ false,
    /* wrapAround*/ false,
    /* wholeWord */ false,
    /* searchInFrames*/ false,
    false
  );

  if (found) {
    ranges.push(document.getSelection().getRangeAt(0).cloneRange())
  }
}

Since window.find searches for the nearest match each time, it will only highlight the active match so you won’t see the highlight colors of all matches like what the native browser find widget does.

image

The other limitation of window.find is in some browsers, like Safari, it will skip Shadow DOMs completely. The github issue notebook’s outputs are rendered in Shadow DOM so we won’t get any match with window.find.

Even though window.find has limitations, IMHO it’s still better than walking the DOM tree and search the textContentof DOM nodes, which can be buggy and sluggish.

Highlight matches

Once we get the ranges of the matches in the DOM tree, the next step is highlighting them based on users’ theme colors. We have a few options but none of them is mature or performant enough:

  • document.execCommand("hiliteColor", false, "#333") can change the active selection’s background with specified color but it has two disadvantages
    • firstly, it doesn’t work in Shadow DOM
    • secondly, it modifies the DOM. For example, highlighting b in text node abc would change the text node to a<span style="background-color: #333">b</span>c. 200 hiliteColor can block the UI for 3 seconds.
      • note that in Firefox, we can highlight multiple ranges at one time, which is a huge improvement but we don’t have multi-range selection in other browers
  • Highlight API Proposal. It’s already shipped in Safari behind experiment flag and based on testing it’s performant enough. However it’s not ready yet in Chromium and we are not sure if it would work for Shadow DOM.

Alternatives

Based on all the investigations, window.find + Highlight API is the most promising combination, window.find can find all the matches and Highlight API can change match ranges background without touching the DOM. But since there is no guarantee when Highlight API will come to all browsers, we would need to find alternatives to deliver a smooth find experience to users early on. Below are a few alternatives we discussed in sync meetings:

  • To support searching in Shadow DOM, we can walk the DOM tree and do our own string search. Jupyter Lab did something similar already Searching cell output jupyterlab/jupyterlab#6768 and they don’t rely on window.find. The performance and reliability is unknown and it won’t work with “Closed” Shadow DOM.
  • We can probably render overlays on top of the find match. Challenges would be line wrapping and Bidi.
  • We can highlight the view port first and then highlight others when it’s idle, but it might slow down the scrolling.
  • In VS Code desktop, we use Electron’s findInFrame API to search text in iframes. It make calls to chromium and the behavior is pretty close to the native find widget, except that the find widget is not rendered. We can consider to use this API other than doing our own window.find in desktop, but currently this API doesn’t allow us to change the find match highlight color. It only gives us the coordinates of the active match but we need the coordinates of all matches in order to know which cell each match belongs to.
  • It would be pretty good if we can have an API in Electron to highlight a range without modifying the DOM (what the native Find Widget does). But this can be a waste since Highlight API is coming.
  • Firefox supports hiliteColor on multi-range selection, can we patch hiliteColor in Electron
    • Highlighting 200 ranges in Chrome sequentially can take 2 seconds while batching them in Firefox can reduce the time to ~200ms.

@rebornix
Copy link
Member Author

rebornix commented Jan 10, 2022

Had offline discussions with @deepak1556 @rzhao271 @mjbvz on Chromium/Electron APIs for find and highlight text and their JavaScript alternatives and here are what we found:

  • We can explore how to extend Electron's FindInFrame API which searches and highlights text without modifying the DOM nodes at all. What the enhanced FindInFrame would offer Explore improvements to the findInFrame api #140420
    • Stateful text search
    • Return coordinates of all matches
    • Navigate to a specific find match (by coordinate)
    • 💪 Customize the highlight color
  • Highlight API proposal. The new Highlight API can highlight a DOM range without tweaking the DOM. Safari has already shipped this feature behind an experiment flag, based on testing it's pretty performant. Edge team is working on this for Chromium.
  • Before we get any native support from Electron or Chromium itself, there is a JavaScript implementation of DOM highlight which wraps the DOM ranges with <mark> element. It's performant and works in Shadow DOM. The only catch is it modifies the DOM which has a lot of side effects (e.g., break event listeners)

With solutions proposed above, we would currently take the JS approach and continue UX experiments, and polish the implementation with more native API coming in.

@rebornix rebornix reopened this Jan 10, 2022
@rebornix
Copy link
Member Author

Good news is the Highlight API is already available in Chromium behind a flag and we can enable it for our upcoming Electron build. We would use the JS version for now and use the Highlight API once it's available. From our Edge friends, the Highlight API is likely to release around April.

Since we support searching in both the cell source code, markup preview and cell outputs, we would introduce a new filter in the notebook find widget, next to Case Sensitivity, Match Case and Regex.

image

@cedric05
Copy link
Contributor

tried it, looking good, were able to search through outputs and inputs.

@github-actions github-actions bot locked and limited conversation to collaborators Feb 27, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
feature-request Request for new features or functionality notebook-find on-testplan
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants
@rebornix @hediet @cedric05 and others