Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve paging experience #1022

Closed
jianghaolu opened this issue Aug 10, 2016 · 10 comments
Closed

Improve paging experience #1022

jianghaolu opened this issue Aug 10, 2016 · 10 comments

Comments

@jianghaolu
Copy link
Contributor

jianghaolu commented Aug 10, 2016

Status quo

Synchronous list() calls return an instance of PagedList which contains only the 1st page of the items. As users

  • iterate through the list or
  • call indexOf()

the list grows only if necessary.

As users

  • call .size() or
  • call .loadAll() or
  • call .toArray() or
  • call .lastIndexOf()

the list gets fully populated.
Asynchronous listAsync() accepts a ListOperationCallback which allows users to handle every page and the entire list. The call automatically makes subsequent calls to list all the pages until ListOperationCallback.progress() defined by the user returns a PagingBehavior.STOP. The method returns a ServiceCall object that ties to the current REST call to fetch a single page.

Problems to fix

  1. Users should have the option not to load the entire list into memory
  2. Users should be able to continue after doing a STOP
  3. Empty paging results should be skipped

Proposed solution

Synchronous

The following call will iterate through the entire list, with only a maximum of 2 pages stored in memory. One page is for the current page. The other is the cached next page, if available. This provides an accurate representation of hasNext() even when nextPageLink points to an empty last page.

PagedList<Foo> fooList = Azure.foos().list();
for (Foo foo : fooList) {
    handleFoo(foo);
}

The following calls will throw exception:

  • .size()
  • .loadAll()
  • .toArray()
  • .lastIndexOf()
    In order to use them, PagedList exposes a entireList() method that returns the entire list. Developers can use this when necessary with caution.

Asynchronous

ListOperationCallback's 3 methods user can override will be changed to

public abstract void failure(Throwable t); // This remains unchanged
public void success(int pages, int size, String nextPageLink); // this is now optional
public PagingBehavior progress(Page<Product> page); // page contains the list and next link

the callback object doesn't store the entire list of items - developers are responsible of doing that themselves.

A sample code for pausing and resuming:

final List<Foo> fooList = new ArrayList<>();
ServiceCall<Page<Foo>> call = Azure.foos().listAsync(new ListOperationCallback<Foo>() {
    @Override
    public void failure(Throwable t) {
        handleError(t);
    }

    @Override
    public PagingBehavior progress(Page<Foo> page) {
        fooList.addAll(page.items());
        return PagingBehavior.STOP;
    }

    @Override
    public void success(int pages, int size, String nextPageLink) {
        doOtherStuff(fooList);

        // Now I want more
        Azure.foos().listNextAsync(nextPageLink, new ListOperationCallback<Foo>() {
            @Override
            public void failure(Throwable t) {
                handleError(t);
            }

            @Override
            public PagingBehavior progress(Page<Foo> page) {
                fooList.addAll(page.items());
                return PagingBehavior.STOP;
            }
        });
    }
});

As you see, we end up having a potential callback hell. We may resolve this by returning an Observable, but that's in the longer term.

@martinsawicki
Copy link

for synch lists, I think .size() throwing is a bad/unexpected user experience. Especially since many lists in reality will be 1 page long, I'm not sure there is a compelling reason to have the user code go through this extra .entireList() layer of indirection. I think many will be tripping up on this, including our own tests, samples, etc. It's just not expected in the user experience. The synch list is for the simple "just make it work" scenarios, so I think it's important to preserve its simplicity. I'm especially concerned about .size() here.

@jianghaolu
Copy link
Contributor Author

@martinsawicki this is a valid point.
We have two other approaches for .size()

  1. return just the size of the current page (preferable to me)
  2. load each page (but not stored in memory) and count the total number of items. This is problematic if developers write the following code:
int size = pagedList.size();
for (int i = 0; i != pagedList.size(); ++i) {
    doStuff(pagedList.get(i));
}

@martinsawicki
Copy link

I think #1 will be confusing. I don't think anyone expects azure.virtualMachines().list().size() to return anything other than the count of the vms. Any other number is just unexpected here, since for the non-advanced users, the fact there is paging happening under the hood is just an esoteric implementation detail.
I can't say that #2 sounds great either.
But is it necessary to change today's synch behavior at all -- couldn't we address the asynch (power user) issues independently?

@jianghaolu
Copy link
Contributor Author

Yes we could.

@anuchandy
Copy link
Member

anuchandy commented Aug 11, 2016

I assume as a part of implementing this - Users should have the option not to load the entire list into memory, we will have a ctr argument in PagedList that indicates whether user want us to [1] store multiple pages in the list internally maintained in PagedList or [2] just store the current page (i.e. he just want to iterate through the list).

If user choose [1] then we should be able to provide normal expected behavior for following methods:

.size()
.loadAll()
.toArray()
.lastIndexOf()

But if he choose [2] then I think it's ok to throw exceptions from the above methods as it was user's choice not to cache the entire pages.

Just my thought on this.

@jianghaolu
Copy link
Contributor Author

jianghaolu commented Aug 11, 2016

The question is which should be the default?
My original proposal is to return a memory-efficient list and expose an entireList() method. The other way could be to return the list we have today and expose an iterable().

@anuchandy
Copy link
Member

i.e. exposing two separate methods in SDK, one return iterable (memory efficient) and the other one PagedList (that caches all pages) ?

@jianghaolu
Copy link
Contributor Author

You can't have return type overloads in Java.

@anuchandy
Copy link
Member

yes, got that, was thinking like PagedList<VirtualMachine> virtualMachines().list() and Iterator<VirtualMachine> virtualMachines().Iterator(). But your proposal seems better Iterator<VirtualMachine> PagedList<VirtualMachine>::fun()

@jianghaolu
Copy link
Contributor Author

Released with beta3.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 13, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants