[DuckDuckGo Search API] `serpapi_pagination.next` doesn't take into account the current offset of results #619
ilyazub added a commit to serpapi/google-search-results-python that referenced this issue on Feb 2, 2023:
DuckDuckGo tests are failing because DuckDuckGo pagination doesn't take into account an offset of current results: serpapi/public-roadmap#619 Co-authored-by: Dimitry <dimitry@serpapi.com>
jvmvik pushed a commit to serpapi/google-search-results-python that referenced this issue on May 1, 2023:
…client (#30)

* Use pagination parameters from SerpApi instead of calculating on the client

  `start` and `num` parameters are not suitable for token-based pagination. Such pagination is used on Google Maps, YouTube, Google Scholar Authors, and other search engines.

  This commit consumes URL query parameters for the next page. It stops paginating when parameters not change.

  Details: #22

  Some tests are failing because `start` and `num` parameters are not supported anymore. These tests will be fixed in the following commits.

* Add pagination tests for Bing, Baidu, and DuckDuckGo search API clients

* Fix typo in SerpApi name in documentation

* Add more pagination tests

  All of the tests follow the same pattern. Limit number of pages, iterate, and check for duplicates in the results. This is to make sure that pagination actually changes pages.

* Test pagination for Naver and HomeDepot

* Stop pagination when SerpApi backend doesn't update parameters

* Fix flake8 linting errors

  Example errors: https://github.com/serpapi/google-search-results-python/runs/6659757610?check_suite_focus=true#step:5:37

* Lint code via `make lint`

  Currently linting script exists only in GitHub Action: `.github/workflows/python-package.yml`. This commit wraps that script in Makefile and invokes in an Action.

* fix(tests): fix failing integration tests

  DuckDuckGo tests are failing because DuckDuckGo pagination doesn't take into account an offset of current results: serpapi/public-roadmap#619

  Co-authored-by: Dimitry <dimitry@serpapi.com>

* perf: run pytest in parallel

  Sample output:

    platform linux -- Python 3.10.9, pytest-7.2.1, pluggy-1.0.0
    rootdir: /home/ilyazub/Workspace/google-search-results-python
    plugins: parallel-0.1.1
    collected 48
    pytest-parallel: 8 workers (processes), 6 tests per worker (threads)

  `py` dependency is used because pytest-parallel depends on it but doesn't require 😕 kevlened/pytest-parallel#118

  Co-authored-by: Dimitry <dimitry@serpai.com>

* style: don't lint vendor packages with Flake8

  Co-authored-by: Dimitry <dimitry@serpapi.com>

* docs: fix minor typos in documentation

  Co-authored-by: Dimitry <dimitry@serpapi.com>

* ci: cache pip dependencies

  Support Python 3.7+ based on the readme: https://github.com/serpapi/google-search-results-python/blob/35e51c94e7243c29650ed7b630db4e4e6d0c61aa/README.md#L18

  Co-authored-by: "dimitryzub <dmitriy@serpapi.com>"

---------

Co-authored-by: Dimitry <dimitry@serpapi.com>
Co-authored-by: Dimitry <dimitry@serpai.com>
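The approach described in the commit message (take the next page's query parameters from the response, stop when they no longer change, and check results for duplicates) might look roughly like the sketch below. It is not the code from the referenced commit: `next_params`, `paginate`, and `fake_fetch` are hypothetical names, and `fake_fetch` is a stub standing in for a real API call, assuming only that a response carries a `serpapi_pagination.next` URL and an `organic_results` list.

```python
from urllib.parse import parse_qsl, urlsplit


def next_params(response):
    """Return the query parameters of serpapi_pagination.next, or None if absent."""
    next_url = response.get("serpapi_pagination", {}).get("next")
    if not next_url:
        return None
    return dict(parse_qsl(urlsplit(next_url).query))


def paginate(fetch, params, max_pages=5):
    """Yield pages until there is no next link, the page limit is reached,
    or the backend returns parameters identical to the current ones."""
    for _ in range(max_pages):
        page = fetch(params)
        yield page
        upcoming = next_params(page)
        if upcoming is None:
            return
        merged = {**params, **upcoming}
        if merged == params:  # backend did not advance: stop instead of looping
            return
        params = merged


def fake_fetch(params):
    """Hypothetical stub: 10 results per page, and the offset stops advancing at 20."""
    start = int(params.get("start", 0))
    stuck_at = min(start + 10, 20)
    return {
        "organic_results": [
            {"link": f"https://example.com/{start + i}"} for i in range(10)
        ],
        "serpapi_pagination": {
            "next": f"https://serpapi.com/search.json?q=coffee&start={stuck_at}"
        },
    }


pages = list(paginate(fake_fetch, {"q": "coffee"}))
links = [r["link"] for page in pages for r in page["organic_results"]]
```

Comparing `len(links)` with `len(set(links))` then gives the duplicate check the tests rely on: if pagination did not actually change pages, the same links would appear more than once.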
I recently tested this. The first page shows the wrong total number of organic_results. The second page shows 50 results but has start=50 in the next_pagination, where I assume it should be 50+27.
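The arithmetic in this comment can be written as a small check: the `start` value in the next-page URL should equal the current offset plus the number of results on the current page. The sketch below is only an illustration. `next_start` and `offset_is_consistent` are hypothetical helpers, the URL is a made-up example, and reading 27 as the second page's starting offset is an assumption based on the "50+27" in the comment.

```python
from urllib.parse import parse_qs, urlsplit


def next_start(next_url):
    """Extract the integer `start` query parameter from a next-page URL."""
    query = parse_qs(urlsplit(next_url).query)
    return int(query["start"][0])


def offset_is_consistent(current_start, results_on_page, next_url):
    """True if the next page begins right after the results already returned."""
    return next_start(next_url) == current_start + results_on_page


# Assumed numbers: the second page starts at offset 27 and returns 50 results,
# yet the next link carries start=50.
example_next = "https://serpapi.com/search.json?engine=duckduckgo&q=coffee&start=50"
expected_start = 27 + 50
consistent = offset_is_consistent(27, 50, example_next)
```

Under these assumptions `expected_start` is 77 while the link says 50, so `consistent` is false, which is the mismatch the comment describes.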
`serpapi_pagination.next` doesn't take into account the current offset of results. (Search Inspect page.)

Expected result
`serpapi_pagination.next` contains query parameter `start=105`.

Actual result
`serpapi_pagination.next` contains query parameter `start=29`.