New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[INT-1162] Infinite scroll in file explorer #9715
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, but the portions were a bit outside my expertise. If you'd like more substantial feed back we might be able to convince someone from console to take a look?
// How many items to request per page | ||
const PAGINATION_NUMBER = 400; | ||
// How many Content li are visible at a given time for the user to scroll. | ||
const VISIBLE_CONTENT_LIS = 500; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feel free to not do this, but maybe VISIBLE_CONTENT_LI_COUNT
would be a clearer name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that name better will change!
// How many items to request per page. | ||
const PAGINATION_NUMBER = 1000; | ||
// How tall in pixels a Content li is. | ||
const LI_HEIGHT_PX = 25; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this safe to set as a constant? This seems like something we don't really have control over -- could we extract this info from the browser at runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I can grab the clientHeight from the first contentsNode.childNodes[0] instead of hard coding this.
private _cache: { | ||
key: string | null; | ||
now: number | null; | ||
contents: Contents.IModel[]; | ||
filteredContents: Contents.IModel[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for clarification, why this addition to the cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_cache.contents
refers to all the files for the current directory whereas _cache.filteredContents
refers to the contents currently filtered by the value the user has typed into the FilterBox above the FileBrowser.
Since _get
is called each time the user changes the _index
by scrolling, we want to avoid filtering the cached contents each call since the cached contents could be huge. Thus the filtering only occurs when the user changes directory or changes the filtering then is cached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh ok, the piece I was missing was that _get is called for every scroll. thanks!
@@ -18,7 +18,9 @@ describe('mount', () => { | |||
cy.findAllByText('/ pfs').should('have.length', 2); | |||
cy.findAllByText('Load').first().click(); | |||
cy.findAllByText('Unload').should('have.length', 1); | |||
cy.findAllByText('default_images').first().click(); | |||
cy.get('#jupyterlab-pachyderm-browser-pfs') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly sure why this change is needed now, but the problem with searching findByText
without an id is that default_images
also appears as text in the repo list. When I debugged this locally the text is the element that was clicked which obviously does nothing. Changing this to findByText in the FileBrowser ID scope fixes this issue.
This applies to all other places where I added the get('#jupyterlab-pachyderm-browser-pfs')
call.
cy.findAllByText('liberty.png').should('have.length', 1); | ||
cy.findAllByText('default_images_branch').first().click(); | ||
cy.get('#jupyterlab-pachyderm-browser-pfs').findByText('/ pfs').click(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was really strange I honestly do not know how this test passed before this change. What this test is doing is clicking on a mounted repo folder, checking the file is there, then repeating that for a second mounted repo folder. The problem with the test is that you have to navigate back to the root directory to see the second mounted repo folder, but the test does not do that. Thus I added one step that clicks on the / pfs
breadcrumb to make it pass.
Ok I responded to your comments and included two new changes @bbonenfant.
I commented on the changes that fix the tests to explain why they were needed. I did figure out the issue with running Cypress tests locally, but it requires us migrating to the latest cypress. I will split that out into a separate PR since it results in a bunch of changes. As far as having console approval maybe lets wait until the next PR which allows the user to jump to anywhere in the Contents list via the scroll bar https://pachyderm.atlassian.net/browse/INT-1203? |
Those cypress changes are weird. Hopefully moving to a more recent version will make these tests easier to work with. LGTM -- merge when you're ready |
This PR implements infinite scroll pagination in our custom JuypterLab FileBrowser. It does so by setting up a scroll event listener that increments and decrements an index when user scrolls to the upper 20% and lower 20% respectively of the FileBrowser Contents DOM node. When this index changes the FileBrowser is re-rendered. Each index represents an offset of 200 Contents from the start of the overall cached Contents. Since only a 700 Contents are visible at any time regardless of index position or total Contents count this solution can scale with a large number of files (tested with 500k) since the low Contents limit of the Juypterlab FileBrowser is always respected. As part of testing this I also made a few changes to improve the performance of the each re-render of the FileBrowser. * Cached the initial shallowResponse after navigating to a directory and reuse that in each call of _get. Making an GET request on each re-render is unnecessarily slow and is quite noticeable when infinite scroll causes a re-render. * Only re-render for infinite scroll not on each paginated request for Contents. These requests can happen safely in the background now without needing to update the pagination UI. One feature currently not implemented is the ability to jump to anywhere in the Contents list. Since this solution works off an index this should be possible in a follow up PR. Dynamically changing the scrollbar height of the Contents DOM node to match the total count of cached Contents then determining where the user clicks to set the index should accomplish this. However this is tricky and worth splitting into a separate story. ### Demo https://github.com/pachyderm/pachyderm/assets/401518/0ee94933-49be-461f-b5ec-cb144d9a2f06 ### Why no library was used for infinite scroll Most mature infinite scroll libraries (TanStack scroll, react-window, react-virtualized) require some sort of framework like React or Svelte and modify the DOM nodes as the user scrolls. This does not work with the Juypterlab FileBrowser code since even if we mounted a ReactWidget that output lumino styled `li` those li would not have the same event handlers setup by the FileBrowser. On top of that the FileBrowser breaks completely if Contents DOM nodes are modified outside the expected render loop.
- Lumino tabs (#9692) - Uninstantiated python client bugfix (#9705) - Prevent bad config from being stored (#9711) - Remove dev_server (#9713) - UNKNOWN cluster status state (#9719) - Prevent error popups bugfix (#9721) - Infinite scroll in file explorer (#9715) - Fix local cypress tests (#9728) - Infinite scroll w/ scrollbar (#9746) - Clear mounted datums before new datums (#9716) - Make input spec non-resizeable (#9720)
This PR implements infinite scroll pagination in our custom JuypterLab FileBrowser. It does so by setting up a scroll event listener that increments and decrements an index when user scrolls to the upper 20% and lower 20% respectively of the FileBrowser Contents DOM node. When this index changes the FileBrowser is re-rendered. Each index represents an offset of 200 Contents from the start of the overall cached Contents. Since only a 700 Contents are visible at any time regardless of index position or total Contents count this solution can scale with a large number of files (tested with 500k) since the low Contents limit of the Juypterlab FileBrowser is always respected.
As part of testing this I also made a few changes to improve the performance of the each re-render of the FileBrowser.
One feature currently not implemented is the ability to jump to anywhere in the Contents list. Since this solution works off an index this should be possible in a follow up PR. Dynamically changing the scrollbar height of the Contents DOM node to match the total count of cached Contents then determining where the user clicks to set the index should accomplish this. However this is tricky and worth splitting into a separate story.
Demo
new.infinite.scroll.mov
Why no library was used for infinite scroll
Most mature infinite scroll libraries (TanStack scroll, react-window, react-virtualized) require some sort of framework like React or Svelte and modify the DOM nodes as the user scrolls. This does not work with the Juypterlab FileBrowser code since even if we mounted a ReactWidget that output lumino styled
li
those li would not have the same event handlers setup by the FileBrowser. On top of that the FileBrowser breaks completely if Contents DOM nodes are modified outside the expected render loop.