[INT-1162] Infinite scroll in file explorer #9715

emmajsadams · 2024-02-07T20:28:56Z

This PR implements infinite scroll pagination in our custom JuypterLab FileBrowser. It does so by setting up a scroll event listener that increments and decrements an index when user scrolls to the upper 20% and lower 20% respectively of the FileBrowser Contents DOM node. When this index changes the FileBrowser is re-rendered. Each index represents an offset of 200 Contents from the start of the overall cached Contents. Since only a 700 Contents are visible at any time regardless of index position or total Contents count this solution can scale with a large number of files (tested with 500k) since the low Contents limit of the Juypterlab FileBrowser is always respected.

As part of testing this I also made a few changes to improve the performance of the each re-render of the FileBrowser.

Cached the initial shallowResponse after navigating to a directory and reuse that in each call of _get. Making an GET request on each re-render is unnecessarily slow and is quite noticeable when infinite scroll causes a re-render.
Only re-render for infinite scroll not on each paginated request for Contents. These requests can happen safely in the background now without needing to update the pagination UI.

One feature currently not implemented is the ability to jump to anywhere in the Contents list. Since this solution works off an index this should be possible in a follow up PR. Dynamically changing the scrollbar height of the Contents DOM node to match the total count of cached Contents then determining where the user clicks to set the index should accomplish this. However this is tricky and worth splitting into a separate story.

Demo

new.infinite.scroll.mov

Why no library was used for infinite scroll

Most mature infinite scroll libraries (TanStack scroll, react-window, react-virtualized) require some sort of framework like React or Svelte and modify the DOM nodes as the user scrolls. This does not work with the Juypterlab FileBrowser code since even if we mounted a ReactWidget that output lumino styled li those li would not have the same event handlers setup by the FileBrowser. On top of that the FileBrowser breaks completely if Contents DOM nodes are modified outside the expected render loop.

…NT-1162

jupyter-extension/src/plugins/mount/mountDrive.ts

bbonenfant

This looks good to me, but the portions were a bit outside my expertise. If you'd like more substantial feed back we might be able to convince someone from console to take a look?

bbonenfant · 2024-02-07T21:03:56Z

jupyter-extension/src/plugins/mount/mountDrive.ts

-// How many items to request per page
-const PAGINATION_NUMBER = 400;
+// How many Content li are visible at a given time for the user to scroll.
+const VISIBLE_CONTENT_LIS = 500;


feel free to not do this, but maybe VISIBLE_CONTENT_LI_COUNT would be a clearer name?

I like that name better will change!

bbonenfant · 2024-02-07T21:06:42Z

jupyter-extension/src/plugins/mount/mountDrive.ts

+// How many items to request per page.
+const PAGINATION_NUMBER = 1000;
+// How tall in pixels a Content li is.
+const LI_HEIGHT_PX = 25;


Is this safe to set as a constant? This seems like something we don't really have control over -- could we extract this info from the browser at runtime?

Good point! I can grab the clientHeight from the first contentsNode.childNodes[0] instead of hard coding this.

bbonenfant · 2024-02-07T21:38:48Z

jupyter-extension/src/plugins/mount/mountDrive.ts

  private _cache: {
    key: string | null;
    now: number | null;
    contents: Contents.IModel[];
+    filteredContents: Contents.IModel[];


Just for clarification, why this addition to the cache?

_cache.contents refers to all the files for the current directory whereas _cache.filteredContents refers to the contents currently filtered by the value the user has typed into the FilterBox above the FileBrowser.

Since _get is called each time the user changes the _index by scrolling, we want to avoid filtering the cached contents each call since the cached contents could be huge. Thus the filtering only occurs when the user changes directory or changes the filtering then is cached.

Ahh ok, the piece I was missing was that _get is called for every scroll. thanks!

emmajsadams · 2024-02-08T19:55:38Z

jupyter-extension/cypress/integration/mount.ts

@@ -18,7 +18,9 @@ describe('mount', () => {
    cy.findAllByText('/ pfs').should('have.length', 2);
    cy.findAllByText('Load').first().click();
    cy.findAllByText('Unload').should('have.length', 1);
-    cy.findAllByText('default_images').first().click();
+    cy.get('#jupyterlab-pachyderm-browser-pfs')


Not exactly sure why this change is needed now, but the problem with searching findByText without an id is that default_images also appears as text in the repo list. When I debugged this locally the text is the element that was clicked which obviously does nothing. Changing this to findByText in the FileBrowser ID scope fixes this issue.

This applies to all other places where I added the get('#jupyterlab-pachyderm-browser-pfs') call.

emmajsadams · 2024-02-08T19:57:30Z

jupyter-extension/cypress/integration/repo-datum-mode-switch.ts

    cy.findAllByText('liberty.png').should('have.length', 1);
-    cy.findAllByText('default_images_branch').first().click();
+    cy.get('#jupyterlab-pachyderm-browser-pfs').findByText('/ pfs').click();


This was really strange I honestly do not know how this test passed before this change. What this test is doing is clicking on a mounted repo folder, checking the file is there, then repeating that for a second mounted repo folder. The problem with the test is that you have to navigate back to the root directory to see the second mounted repo folder, but the test does not do that. Thus I added one step that clicks on the / pfs breadcrumb to make it pass.

emmajsadams · 2024-02-08T19:58:55Z

Ok I responded to your comments and included two new changes @bbonenfant.

Disabled FileBrowser restore so that the user scrolling immediately isn't interrupted.
Cached the initial shallowResponse after navigating to a directory and reuse that in each call of _get. Making an GET request on each re-render is unnecessarily slow and is quite noticeable when infinite scroll causes a re-render.

I commented on the changes that fix the tests to explain why they were needed. I did figure out the issue with running Cypress tests locally, but it requires us migrating to the latest cypress. I will split that out into a separate PR since it results in a bunch of changes.

As far as having console approval maybe lets wait until the next PR which allows the user to jump to anywhere in the Contents list via the scroll bar https://pachyderm.atlassian.net/browse/INT-1203?

bbonenfant · 2024-02-08T23:27:13Z

Those cypress changes are weird. Hopefully moving to a more recent version will make these tests easier to work with. LGTM -- merge when you're ready

This PR implements infinite scroll pagination in our custom JuypterLab FileBrowser. It does so by setting up a scroll event listener that increments and decrements an index when user scrolls to the upper 20% and lower 20% respectively of the FileBrowser Contents DOM node. When this index changes the FileBrowser is re-rendered. Each index represents an offset of 200 Contents from the start of the overall cached Contents. Since only a 700 Contents are visible at any time regardless of index position or total Contents count this solution can scale with a large number of files (tested with 500k) since the low Contents limit of the Juypterlab FileBrowser is always respected. As part of testing this I also made a few changes to improve the performance of the each re-render of the FileBrowser. * Cached the initial shallowResponse after navigating to a directory and reuse that in each call of _get. Making an GET request on each re-render is unnecessarily slow and is quite noticeable when infinite scroll causes a re-render. * Only re-render for infinite scroll not on each paginated request for Contents. These requests can happen safely in the background now without needing to update the pagination UI. One feature currently not implemented is the ability to jump to anywhere in the Contents list. Since this solution works off an index this should be possible in a follow up PR. Dynamically changing the scrollbar height of the Contents DOM node to match the total count of cached Contents then determining where the user clicks to set the index should accomplish this. However this is tricky and worth splitting into a separate story. ### Demo https://github.com/pachyderm/pachyderm/assets/401518/0ee94933-49be-461f-b5ec-cb144d9a2f06 ### Why no library was used for infinite scroll Most mature infinite scroll libraries (TanStack scroll, react-window, react-virtualized) require some sort of framework like React or Svelte and modify the DOM nodes as the user scrolls. This does not work with the Juypterlab FileBrowser code since even if we mounted a ReactWidget that output lumino styled `li` those li would not have the same event handlers setup by the FileBrowser. On top of that the FileBrowser breaks completely if Contents DOM nodes are modified outside the expected render loop.

- Lumino tabs (#9692) - Uninstantiated python client bugfix (#9705) - Prevent bad config from being stored (#9711) - Remove dev_server (#9713) - UNKNOWN cluster status state (#9719) - Prevent error popups bugfix (#9721) - Infinite scroll in file explorer (#9715) - Fix local cypress tests (#9728) - Infinite scroll w/ scrollbar (#9746) - Clear mounted datums before new datums (#9716) - Make input spec non-resizeable (#9720)

emmajsadams added 5 commits February 7, 2024 19:23

Delete paging widget

fdddcc5

Remove paging and disable refresh

9079054

Update Drive to use infinite scroll

668868f

Remove unnecssary console.log

fad6cec

Update comments and naming

3d7bcc4

emmajsadams requested a review from bbonenfant February 7, 2024 20:28

emmajsadams assigned bbonenfant Feb 7, 2024

emmajsadams added 3 commits February 7, 2024 20:32

Merge branch 'master' of github.com:pachyderm/pachyderm into eadams/I…

0b68ea0

…NT-1162

Run eslint

37570e0

Remove any typing

5d52cd8

emmajsadams commented Feb 7, 2024

View reviewed changes

jupyter-extension/src/plugins/mount/mountDrive.ts Show resolved Hide resolved

bbonenfant approved these changes Feb 7, 2024

View reviewed changes

emmajsadams added 11 commits February 8, 2024 00:44

Prevent restoring which messes with infinite scroll

fb06aac

Refactor mountDrive to avoid a shallowResponse on each call of _get

150722d

Merge branch 'master' into eadams/INT-1162

c368a14

Add check for localPath

71f5fe5

Revert refresh interval back to normal

adfea78

Test dblclick instead

67fe371

Add id select and dblclick

719ffff

Run eslint on mount.ts

8946757

Update getters to use id

37ccb4b

Add switch back pfs breadcrumb click

4490832

Run eslint

e232778

emmajsadams commented Feb 8, 2024

View reviewed changes

emmajsadams merged commit 41ab138 into master Feb 9, 2024
12 checks passed

emmajsadams deleted the eadams/INT-1162 branch February 13, 2024 03:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[INT-1162] Infinite scroll in file explorer #9715

[INT-1162] Infinite scroll in file explorer #9715

emmajsadams commented Feb 7, 2024 •

edited

bbonenfant left a comment

bbonenfant Feb 7, 2024

emmajsadams Feb 8, 2024

bbonenfant Feb 7, 2024

emmajsadams Feb 8, 2024

bbonenfant Feb 7, 2024

emmajsadams Feb 8, 2024

bbonenfant Feb 8, 2024

emmajsadams Feb 8, 2024 •

edited

emmajsadams Feb 8, 2024

emmajsadams commented Feb 8, 2024

bbonenfant commented Feb 8, 2024

[INT-1162] Infinite scroll in file explorer #9715

[INT-1162] Infinite scroll in file explorer #9715

Conversation

emmajsadams commented Feb 7, 2024 • edited

Demo

Why no library was used for infinite scroll

bbonenfant left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

emmajsadams Feb 8, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

emmajsadams commented Feb 8, 2024

bbonenfant commented Feb 8, 2024

emmajsadams commented Feb 7, 2024 •

edited

emmajsadams Feb 8, 2024 •

edited