Partial loading of file list #116

Closed
ivantha opened this issue May 9, 2018 · 14 comments

Comments

@ivantha
Contributor

ivantha commented May 9, 2018

The file list of a directory should not be loaded all at once when the directory has a large number of sub-folders and files. Doing so can increase the loading time and lead to a bad user experience.

However, there isn't any method in owncloud/js-owncloud-client yet that could be used to achieve this.

@ivantha ivantha self-assigned this May 9, 2018
@DeepDiver1975
Member

Things to be done:

  • implement pagination on the server by adding a dav report which allows searching and pagination
  • add support to js-owncloud-client
  • implement in phoenix

@PVince81
Contributor

note: the old web UI simply loads the whole file list but caches it in a JS array and does paginated rendering. Scrolling down renders the second page by reading the next page from the JS array.

Maybe Phoenix could do that as well for now.
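
The approach described above (load everything once, render page by page) could be sketched roughly as follows. This is an illustration only, not the old web UI's actual code: `FileEntry`, `createPagedList`, `nextPage`, `hasMore` and the default page size of 100 are all hypothetical names and values.

```typescript
// Hypothetical shape of a file entry; a real client returns richer objects.
interface FileEntry {
  name: string;
}

interface PagedList {
  files: FileEntry[]; // the full listing, loaded once and cached in memory
  rendered: number;   // how many entries have already been handed to the view
  pageSize: number;   // assumed page size
}

function createPagedList(files: FileEntry[], pageSize = 100): PagedList {
  return { files, rendered: 0, pageSize };
}

// Returns the next slice to render. An empty array means nothing is left.
// A scroll handler would call this when the user nears the end of the list.
function nextPage(list: PagedList): FileEntry[] {
  const page = list.files.slice(list.rendered, list.rendered + list.pageSize);
  list.rendered += page.length;
  return page;
}

function hasMore(list: PagedList): boolean {
  return list.rendered < list.files.length;
}
```

The network cost is unchanged with this scheme; only the rendering work is spread out over time.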

@ivantha ivantha added the GSoC label Jun 13, 2018
@DeepDiver1975 DeepDiver1975 added this to the backlog milestone Jan 11, 2019
@butonic
Member

butonic commented Mar 13, 2019

hm even 1k files in the same folder triggers the long running js warning of firefox / chromium. We really need pagination for the new architecture.

@DeepDiver1975
Member

hm even 1k files in the same folder triggers the long running js warning of firefox / chromium. We really need pagination for the new architecture.

An alternative approach would be to do some kind of metadata syncing and use the service worker to keep the file list up to date in parallel.

but this kills IE support 🤷‍♂️

@PVince81
Contributor

in theory if we'd support pagination (REPORT method) you could request the next page after scrolling down.

to make sure that the page you're requesting is still from the same result set, send the etag from the folder you are browsing. if the contents changed, might need to refresh the page...
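
A minimal sketch of the etag check, under stated assumptions: the comment above has the client send the folder etag with the request, while this sketch assumes instead that every page response carries the folder's etag and does the comparison on the client. `PageResponse`, `ListingState` and `mergePage` are hypothetical names, not part of any existing ownCloud API.

```typescript
// Hypothetical response shape: each page reports the etag of the folder
// at the moment the server produced the page.
interface PageResponse {
  folderEtag: string;
  names: string[];
}

interface ListingState {
  folderEtag: string | null; // null until the first page has arrived
  names: string[];
}

interface MergeResult {
  kind: "appended" | "stale";
  state: ListingState;
}

// Appends a page if it belongs to the same result set; otherwise reports
// the cached listing as stale so the caller can reload from the first page.
function mergePage(state: ListingState, page: PageResponse): MergeResult {
  if (state.folderEtag !== null && state.folderEtag !== page.folderEtag) {
    return { kind: "stale", state: { folderEtag: null, names: [] } };
  }
  return {
    kind: "appended",
    state: {
      folderEtag: page.folderEtag,
      names: state.names.concat(page.names),
    },
  };
}
```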

@DeepDiver1975
Member

in theory if we'd support pagination (REPORT method) you could request the next page after scrolling down.

to make sure that the page you're requesting is still from the same result set, send the etag from the folder you are browsing. if the contents changed, might need to refresh the page...

well - we are using the vuex state store. So in theory the data being displayed in the browser is served from that store and is disconnected from the network calls, which are done within the store.

this allows us to browse and operate on the file list without direct interaction of the network and the other way round: network operations have no direct impact on the view.

so we can load the full list into the store (one shot or paginated or delta-updates) and independently display the file list using virtual scrolling or whatsoever ...
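
To illustrate the display side of that split, here is a rough sketch of the window calculation behind virtual scrolling: the store holds every entry, but only the rows near the viewport get rendered. It assumes a fixed row height; `visibleRange` and the `overscan` parameter are hypothetical and not taken from Phoenix or any particular library.

```typescript
interface VisibleRange {
  start: number; // index of the first row to render (inclusive)
  end: number;   // index one past the last row to render (exclusive)
}

// Computes which rows of a list need DOM nodes for the current scroll
// position. `overscan` adds a few extra rows above and below the viewport
// so that small scroll movements do not expose blank space.
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan = 5,
): VisibleRange {
  const first = Math.floor(scrollTop / rowHeight);
  const count = Math.ceil(viewportHeight / rowHeight);
  const start = Math.max(0, first - overscan);
  const end = Math.min(totalRows, first + count + overscan);
  return { start, end };
}
```

With 10000 rows of 30px in a 600px viewport, at most 30 rows are rendered at any time under these assumptions, regardless of how many entries sit in the store.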

@labkode

labkode commented Mar 14, 2019

Hi, if you support grpc-web from the browser this problem goes away:

Consider the example of a client wanting to list the contents of a folder containing millions of entries. In a plain HTTP connection, the response needs to be paginated, bringing more complexity to the client as it needs to perform multiple requests to obtain a full response. By using gRPC server streaming, all the files can be streamed to the client without performing additional network round trips, reducing latency and complexity. On the other side, sometimes a client wants to send various resources to the server. On protocols that do not support client-side streaming like HTTP, the often found solution is to bundle requests into one, increasing the complexity both client and server side. Using gRPC client-side streaming there is no need for bundling as the underlying connection is persistent and messages can be streamed to the server as soon as they are ready on the client side.

I think now that Phoenix is being developed is a good opportunity to give it a try ...
My 2 cents

@DeepDiver1975
Member

Sounds really interesting @labkode - nevertheless, with the scope that Phoenix shall replace the user interface for ownCloud 10, we need a solution on the WebDAV protocol layer. We will see ...

@butonic
Member

butonic commented Mar 14, 2019

@labkode the problem is not transferring the list to the browser. try it with 1k files in a folder. reva does a pretty good job of serving that PROPFIND (it took ~103ms here). The problem is the browser having to parse the results and rendering the items. Neither problem goes away with streaming.

Offset based paging becomes slow for large datasets (slides). We should use keyset based bidirectional paging:

  • Files in a folder are unique, so we can order and filter the result set by the filename
  • When scrolling down we can say give me 100 files after 'foobar.txt'
  • When scrolling up we can say give me 100 files before 'welcome.txt'
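
As a rough in-memory illustration of the keyset idea, the sketch below pages over a sorted array of names using binary search instead of offsets. A real server would run the equivalent query against its database or storage; `pageAfter`, `pageBefore` and the helper functions are hypothetical names. The array is assumed to be sorted with the same plain string comparison the code uses.

```typescript
// Index of the first name that is >= key.
function lowerBound(sorted: string[], key: string): number {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Index of the first name that is > key.
function upperBound(sorted: string[], key: string): number {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sorted[mid] <= key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// "Give me `limit` files after `key`" (scrolling down).
function pageAfter(sorted: string[], key: string, limit: number): string[] {
  const start = upperBound(sorted, key);
  return sorted.slice(start, start + limit);
}

// "Give me `limit` files before `key`" (scrolling up).
function pageBefore(sorted: string[], key: string, limit: number): string[] {
  const end = lowerBound(sorted, key);
  return sorted.slice(Math.max(0, end - limit), end);
}
```

Because the lookup is by key rather than by position, the key does not need to exist in the list any more: if 'foobar.txt' was deleted in the meantime, `pageAfter` still returns the names that sort after it.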

Pros:

  • we can limit the files we have to keep in ram to eg. three to five pages (a page being as big as needed to cover the screen height)
  • new files will show up on a page just by scrolling
    • we can even invalidate pages by checking the etag of the folder

Cons

  • the value needed for filtering changes with the attribute used for ordering
    • makes implementation more complicated
  • we are leaking the sorting and pagination mechanism used in the db / storage up to the ui ... a cursor would be nicer and the server side could provide an encrypted json token the client has to send on the next request to get the next page.
  • scrolling through a huge list fast might cause the client to iterate over all pages, when it probably should have done an offset query or a percent query instead, eg. when the user drags the scrollbar down to 80%, the client asks for the file listing at around 80% of the file list in whatever order is active
    • we can probably detect fast scrolling and add this kind of query as a performance optimization
  • I can imagine virtual folders containing more than one file with the same name but they then need to use a different pagination mechanism, or they are ordered on a different column. in any case an encrypted cursor would also work here

So actually, we should use encrypted cursor based bidirectional paging:

  • When scrolling down we can say give me 100 files after cursor <encrypted json with eg {"sort":"filename","value":"foobar.txt"}>
  • When scrolling up we can say give me 100 files before cursor <encrypted json with eg {"sort":"filename","value":"welcome.txt"}>
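
A minimal sketch of how such a cursor token could be produced and validated. Note loudly: this sketch does NOT encrypt anything. It only URL-encodes the JSON as a stand-in, whereas the proposal above calls for the server to encrypt the token so that clients cannot read or forge it. `Cursor`, `encodeCursor`, `decodeCursor` and `cursorsForPage` are hypothetical names.

```typescript
// The JSON payload from the examples above.
interface Cursor {
  sort: string;
  value: string;
}

// Stand-in for encryption: a real server would encrypt and authenticate
// the payload here. URL-encoding only makes the token safe to transport.
function encodeCursor(cursor: Cursor): string {
  return encodeURIComponent(JSON.stringify(cursor));
}

// Decodes a token and rejects anything that is not a well-formed cursor.
function decodeCursor(token: string): Cursor {
  const parsed: unknown = JSON.parse(decodeURIComponent(token));
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("malformed cursor");
  }
  const candidate = parsed as { sort?: unknown; value?: unknown };
  if (typeof candidate.sort !== "string" || typeof candidate.value !== "string") {
    throw new Error("malformed cursor");
  }
  return { sort: candidate.sort, value: candidate.value };
}

// Builds the tokens a server could return alongside a page of names:
// `prev` for scrolling up, `next` for scrolling down.
function cursorsForPage(names: string[]): { prev: string | null; next: string | null } {
  if (names.length === 0) {
    return { prev: null, next: null };
  }
  return {
    prev: encodeCursor({ sort: "filename", value: names[0] }),
    next: encodeCursor({ sort: "filename", value: names[names.length - 1] }),
  };
}
```

The client treats the token as opaque and simply sends it back, so the server stays free to change the sort column or the paging mechanism without touching the UI.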

links:

@DeepDiver1975
Member

The current approach with respect to filtering and sorting relies on having the full file list available in browser memory. We need to keep a close eye on where we lose time and where we can optimize processing.

Especially with the offline capability in mind we need to work out of the browser mem anyhow.

For me the classic PROPFIND to get the full list is enough. But let's see where we travel the next weeks ...

@labkode

labkode commented Mar 15, 2019

@butonic

The problem is the browser having to parse the results and rendering the items. Neither problem goes away with streaming.

The problem is the time it takes to show something to the user. That's the only thing the user cares about. Users have no opinion on how things are implemented behind the scenes; they want to click and see something in less than a second.

Using streaming you can be sure that you will show something to the user in ms-time; having to parse a huge blob you can't. Streaming and pagination are two ways of dealing with the same thing, the first is a new and more efficient approach and the latter is an old-timer ;)

@butonic
Member

butonic commented Mar 15, 2019

@labkode can you skip to a certain offset in the stream? We need to differentiate between transport layer and payload again. Of course streaming is nice, but having to stream 100000 entries to see the end of the list still leads to bad user experience. It boils down to a flexible paging mechanism. Each page then should be streamed...

https://hackernoon.com/guys-were-doing-pagination-wrong-f6c18a91b232 has a nice description of paging.

@butonic
Member

butonic commented Apr 5, 2019

reva can push out propfinds pretty fast, parsing it is ok as well. critical is handling large amounts of components in vue. For that there is https://github.com/Akryum/vue-virtual-scroller

@pascalwengerter
Contributor

Closing this since server-side pagination won't happen anytime soon, and we've sped up file table rendering and added pagination.


6 participants