Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pagination support #722

Closed
1 task done
Tracked by #129
rarkins opened this issue Feb 3, 2019 · 20 comments · Fixed by #833
Closed
1 task done
Tracked by #129

Pagination support #722

rarkins opened this issue Feb 3, 2019 · 20 comments · Fixed by #833
Labels
enhancement This change will extend Got features 🎁 Rewarded on Issuehunt This issue has been rewarded on Issuehunt ✭ help wanted ✭

Comments

@rarkins
Copy link
Contributor

rarkins commented Feb 3, 2019

Issuehunt badges

What would you like to discuss?

With the recent enhancements such as create/extend and hooks, is there any clean way to implement pagination with the current approach?

Note: there exists a discussion in gh-got already but I want to discuss it generically here.

Checklist

  • I have read the documentation.

IssueHunt Summary

szmarczak szmarczak has been rewarded.

Backers (Total: $80.00)

Submitted pull Requests


Tips


IssueHunt has been backed by the following sponsors. Become a sponsor

@szmarczak
Copy link
Collaborator

What about:

const options = {
	hooks: {
		pagination: (response, options) => {
			if (!response) {
				// First request
				options.searchParams = {page: 0};
			}

			options.searchParams.page++;
		}
	}
};

for await (const response of got.paginate(url, options)) {
	// ...
}

@sindresorhus
Copy link
Owner

Shouldn't the hook be called beforePaginate for consistency? And I think the the signature should be (options, isFirstPage) => {}, unless you have other use-cases for the previous response? Should it have some built-in automatic behavior for the Link header?

@szmarczak
Copy link
Collaborator

Shouldn't the hook be called beforePaginate for consistency?

I have no idea. Maybe it should be an option (paginate)... And the hook should be called beforePaginate. I think that way would be better.

unless you have other use-cases for the previous response

The response could give a link to the next page.

Should it have some built-in automatic behavior for the Link header?

Sure, that would be nice.

@sindresorhus
Copy link
Owner

The response could give a link to the next page.

It should at least be called previousResponse, not response.

I have no idea. Maybe it should be an option (paginate)... And the hook should be called beforePaginate. I think that way would be better.

Yeah, maybe drop the hook aspect from it altogether? I wonder if we could make it got(…, {paginate: 'link'}) for automatic Link-header support and got(…, {paginate: (options, previousResponse) => {}}) for custom handling.

@szmarczak
Copy link
Collaborator

That sounds good. Let's do it that way :)

@sindresorhus
Copy link
Owner

Some other use-cases:

  1. I want the first 4 pages of a request. Would be nice if we could make got() work normally in that case and just gather up the first 4 pages for us instead of requiring the verbose async iterator coe.
  2. I want to get new pages until I say I'm done. For example, I want to get pages, filter results, and end it when I have enough filtered items. Would be nice if this too was possible without the verbose async iterator syntax.

@sindresorhus
Copy link
Owner

I would also include a currentPage argument in the paginate method with the page number.

@szmarczak
Copy link
Collaborator

  1. I want the first 4 pages of a request. Would be nice if we could make got() work normally in that case and just gather up the first 4 pages for us instead of requiring the verbose async iterator coe.
const arrayOfResponses = await got.pagination(url, {
	...options,
	pagination: {
		paginate: (previousResponse, options) => {
			// By default this would be the Link-header thing...
		},
		from: 1, // Default = 1
		to: 4 // If specified, it will return an array.
	}
});

I'm not sure if it should be implemented. You can achieve the same in a few lines of code:

const arrayOfResponses = await Promise.all([
	got(url, {searchParams: {page: 1}}),
	got(url, {searchParams: {page: 2}}),
	got(url, {searchParams: {page: 3}}),
	got(url, {searchParams: {page: 4}})
]);
  1. I want to get new pages until I say I'm done. For example, I want to get pages, filter results, and end it when I have enough filtered items. Would be nice if this too was possible without the verbose async iterator syntax.

Not possible. The Got logic is async.

@rarkins
Copy link
Contributor Author

rarkins commented Feb 19, 2019

For my primary use case, I simply want all paginated results with a single got command, e.g. something like await got(url, { paginate: true }).

For my secondary use case, I would like to add a limit to how many results are returned, e.g. maximum 1000. If you prefer that the caller calculate this instead using number of pages returned, it's also OK. But I think users usually think in terms of number of results, not number of pages. The number of returned entries per page is usually an endpoint restriction and users ultimately care about total entries, not total pages. The only downside might be if the number of entries per page is not divisible by the total and hence you need to crop the results.

@sindresorhus
Copy link
Owner

The number of returned entries per page is usually an endpoint restriction and users ultimately care about total entries, not total pages.

👍 This is my experience too. I take back wanting to specify pages. What I actually want is to be able to specify item count.

@sindresorhus
Copy link
Owner

I'm not sure if it should be implemented. You can achieve the same in a few lines of code:

Sure, but you can achieve pagination in Got today too, it's just verbose. The point of this issue is figuring out the most common use-case, and making those super easy, while still giving user's the power to do custom logic.

@sindresorhus
Copy link
Owner

  1. I want to get new pages until I say I'm done. For example, I want to get pages, filter results, and end it when I have enough filtered items. Would be nice if this too was possible without the verbose async iterator syntax.

Not possible. The Got logic is async.

I don't understand this statement. Got being async is exactly what makes this possible. Got would handle getting the required pages by using got.paginate internally and then gather up the results for me, so I only had to call:

got(, {
	pagination: {
		itemCount: 100,
		filter: item => item.isUnicorn
	}
});

In the above example, Got would fetch items and paginate until it had gathered 100 items with item.isUnicorn === true.

@szmarczak
Copy link
Collaborator

Can you elaborate on:

without the verbose async iterator syntax

What do you mean specifically?

@sindresorhus
Copy link
Owner

What do you mean specifically?

Instead of:

const results = [];
for await (const response of got.paginate()) {
	results.concat(response.items);
}

I could just do something like this:

const results = await got(, {
	pagination: true
});

And it would gather all the items for me.

@szmarczak
Copy link
Collaborator

Oh. I thought that by without the verbose async iterator syntax you meant the sync iterator :P

That just returns an array, right?

@sindresorhus
Copy link
Owner

That just returns an array, right?

Yes

@szmarczak szmarczak added this to the v10 milestone Feb 19, 2019
@szmarczak szmarczak changed the title Implementing pagination support via extends/create Pagination support Mar 16, 2019
@IssueHuntBot
Copy link

@IssueHunt has funded $80.00 to this issue.


@sindresorhus sindresorhus removed this from the v10 milestone Apr 18, 2019
@issuehunt-oss issuehunt-oss bot added the 💵 Funded on Issuehunt The issue has been funded on Issuehunt label May 10, 2019
@hutson
Copy link

hutson commented May 20, 2019

Unfortunately my use cases typically revolve around querying slow services, so I benefit from processing each paged result as they're received instead of waiting for got to aggregate all pages together into a single result.

I strongly agree with the switch to thinking about results in terms of items and not pages. All I care about is receiving all results that match my URL query. Whether the server uses pagination or not, doesn't really matter to me.

@sindresorhus
Copy link
Owner

@hutson We could have a transform callback or something so you could do processing. I think this is a common need.

@szmarczak szmarczak pinned this issue Dec 1, 2019
@szmarczak szmarczak mentioned this issue Feb 6, 2020
18 tasks
@sindresorhus sindresorhus unpinned this issue Feb 6, 2020
@issuehunt-oss
Copy link

issuehunt-oss bot commented Feb 7, 2020

@sindresorhus has rewarded $72.00 to @szmarczak. See it on IssueHunt

  • 💰 Total deposit: $80.00
  • 🎉 Repository reward(0%): $0.00
  • 🔧 Service fee(10%): $8.00

@issuehunt-oss issuehunt-oss bot added 🎁 Rewarded on Issuehunt This issue has been rewarded on Issuehunt and removed 💵 Funded on Issuehunt The issue has been funded on Issuehunt labels Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement This change will extend Got features 🎁 Rewarded on Issuehunt This issue has been rewarded on Issuehunt ✭ help wanted ✭
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants