duplicate responses when paging query results with api.inaturalist.org/v1/projects/autocomplete #218
This is a forum thread discussing this issue: https://forum.inaturalist.org/t/unable-to-page-through-projects-from-the-api/17328/11
The autocomplete endpoint is... for autocomplete interfaces, not for scraping. It shouldn't return duplicates, so I'll leave this open, but you might want to try https://api.inaturalist.org/v1/docs/#!/Projects/get_projects instead.
If I'm looking for all projects containing Bioblitz, emulating this search: I would use Thanks for the help!
The source of the problem is discussed in a similar ticket: #227 (comment). I'm going to close this, as there is little we can do on our end to prevent duplicates across pages when a unique ordering param is not used, as long as we're on this version of Elasticsearch. We could add a statement to the API documentation describing the problem and stating that unless you're sorting by a reasonably unique order parameter like ID or date, we cannot guarantee complete, duplicate-free results across multiple pages of requests.
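Until such a statement lands in the docs, callers can at least suppress the duplicates client-side by keeping only the first copy of each record's `id` across pages. A minimal sketch of that workaround (the `fetch_page` callable, `collect_unique` name, and page limit are illustrative assumptions, not part of the API):

```python
import time
from typing import Callable, Dict, List


def collect_unique(fetch_page: Callable[[int], List[Dict]],
                   max_pages: int = 50,
                   delay: float = 1.0) -> List[Dict]:
    """Page through results, keeping only the first record seen per `id`.

    `fetch_page(page)` should return the `results` list for that page
    (empty when past the last page). Duplicates across pages are dropped;
    missing records cannot be recovered this way.
    """
    seen = set()
    unique = []
    for page in range(1, max_pages + 1):
        results = fetch_page(page)
        if not results:
            break
        for record in results:
            if record["id"] not in seen:
                seen.add(record["id"])
                unique.append(record)
        time.sleep(delay)  # stay under the ~60 requests/minute guideline
    return unique
```

Note this only removes duplicates; as the ticket explains, records that never appear on any page are still lost unless a unique sort order is used.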
We are trying to build an output of all projects with "Bioblitz" in the description. Using the
https://api.inaturalist.org/v1/projects/autocomplete?q=
GET request and collating the JSON outputs in Python. I'm looping over the request with page size 100 and waiting at least a second between requests (fewer than 60 requests a minute, as per the guidelines).
I'm getting a lot of duplicate responses and also not fetching all results: I'm expecting around 1,600 results but only getting around 500 in total.
This is the code I'm running:
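(The original snippet was not captured in this page; the loop below is a hedged reconstruction of the approach described above: page size 100, a one-second pause between requests, and results collated into one list. `build_url` and `fetch_all` are illustrative names; `q`, `per_page`, and `page` are the v1 API's documented query parameters.)

```python
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://api.inaturalist.org/v1/projects/autocomplete"


def build_url(query: str, page: int, per_page: int = 100) -> str:
    """Build the paged autocomplete request URL."""
    return BASE + "?" + urlencode({"q": query, "per_page": per_page, "page": page})


def fetch_all(query: str, max_pages: int = 20) -> list:
    """Collate paged JSON responses into a single list, pausing between requests."""
    collected = []
    for page in range(1, max_pages + 1):
        with urlopen(build_url(query, page)) as resp:
            results = json.load(resp).get("results", [])
        if not results:
            break
        collected.extend(results)
        time.sleep(1)  # keep under ~60 requests per minute, per the API guidelines
    return collected


if __name__ == "__main__":
    print(len(fetch_all("Bioblitz")))
```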
Am I addressing the API in the wrong way? Is there a way I can request a larger query response and page through that result? Or is there a bug in the API that produces duplicate responses?