GET /observations/observers is giving some funky responses and seems to have an affinity for the number 500.
TL;DR
Observations paging only gives unique results until ~500th record, apparently no matter the per_page or page values
After around the 500th record, more than per_page results are returned
Duplicate records are not exact duplicates (they have different stats)
order_by=observation_count works (it's the default), but order_by=species_count maxes out at 500 records and returns {"error":"Error","status":500} thereafter
Breakdown with examples
Pages after ~500th record contain duplicates
Here's some Python code (tweaked from #227) to test this behavior:
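A minimal sketch of this kind of check (not the exact script from #227), using only the standard library. It assumes the response JSON has a `results` list whose items carry a `user_id` field; `fetch_page` and `check_paging` are hypothetical helper names, and the `fetch` parameter exists only so the paging logic can be exercised without hitting the live API.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.inaturalist.org/v1/observations/observers"


def fetch_page(page, per_page, place_id=72645):
    """Fetch one page of observers (assumes a top-level "results" list)."""
    query = urlencode({"place_id": place_id, "per_page": per_page, "page": page})
    with urlopen(f"{API_URL}?{query}", timeout=30) as response:
        return json.load(response)["results"]


def check_paging(per_page, max_pages, fetch=fetch_page):
    """Walk pages in order and report page sizes and previously unseen users."""
    seen = set()
    report = []
    for page in range(1, max_pages + 1):
        results = fetch(page, per_page)
        if not results:
            break  # an empty page ends the walk
        # Assumption: each result identifies its observer via "user_id".
        user_ids = {record["user_id"] for record in results}
        report.append(
            {
                "page": page,
                "returned": len(results),
                "oversized": len(results) > per_page,
                "new": len(user_ids - seen),
            }
        )
        seen |= user_ids
    return report
```

Calling `check_paging(per_page=30, max_pages=20)` would query the live API; any page with `oversized` set or with `new` below `returned` shows the behavior described in this issue.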
Results from a few different page sizes:
Pages after 500th record are oversized
After around the 500th record, subsequent pages contain more than per_page records. Here are 2 examples for per_page=30 and per_page=20, respectively:
Pages past the 500th record contain 500 records no matter per_page. Only 49 non-duplicate records were returned in the subsequent page, and no unique records were returned in the one after. Record 503 is the first duplicate.
Duplicated records are not exact duplicates
The duplicated observers that are returned after ~500 are not actually duplicates, exactly. Their user info is the same but their stats are different. Here are 2 examples:
Stats from first record of user 3569438 on page: https://api.inaturalist.org/v1/observations/observers?place_id=72645&per_page=30&page=10
...And their stats on the first post-500th-record page:
https://api.inaturalist.org/v1/observations/observers?place_id=72645&per_page=30&page=18
Same for user 15723:
Initial stats: https://api.inaturalist.org/v1/observations/observers?place_id=72645&per_page=30&page=1
Post-500th-record stats: https://api.inaturalist.org/v1/observations/observers?place_id=72645&per_page=30&page=18
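For comparing two copies of the same observer pulled from different pages, a small helper along these lines could be used. It is only a sketch: `diff_stats` is a hypothetical name, and the field names `user_id`, `observation_count`, and `species_count` are assumed to be present on each record.

```python
def diff_stats(first, second, fields=("observation_count", "species_count")):
    """Return {field: (first_value, second_value)} for each field that differs."""
    # Assumption: both records identify their observer via "user_id".
    if first["user_id"] != second["user_id"]:
        raise ValueError("records belong to different users")
    return {
        field: (first.get(field), second.get(field))
        for field in fields
        if first.get(field) != second.get(field)
    }
```

An empty result means the two copies agree on the listed fields; anything else shows which stats changed between the two pages.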
Curiously, in both of these examples, species_count in the second record equals observation_count in the first. Maybe that's the key to finding this bug.
Requests with order_by=species_count max out at 500 records
With per_page=30 we get 16 pages (= 480 records) of expected results:
https://api.inaturalist.org/v1/observations/observers?place_id=72645&per_page=30&page=1&order_by=species_count
...
https://api.inaturalist.org/v1/observations/observers?place_id=72645&per_page=30&page=16&order_by=species_count
...then page 17 only returns 20 records (making 500 total), instead of the expected 30:
https://api.inaturalist.org/v1/observations/observers?place_id=72645&per_page=30&page=17&order_by=species_count
Page 18 (record 501+) returns {"error":"Error","status":500}:
https://api.inaturalist.org/v1/observations/observers?place_id=72645&per_page=30&page=18&order_by=species_count
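The page sizes above are what one would expect if results were simply cut off at 500 records. That cutoff is inferred from the observed responses, not from any documented server behavior, and `expected_page_size` is a hypothetical helper that only does the arithmetic:

```python
def expected_page_size(page, per_page, cap=500):
    """Records expected on a 1-indexed page if results stop at `cap` (assumed)."""
    already_returned = (page - 1) * per_page
    return max(0, min(per_page, cap - already_returned))


# With per_page=30: pages 1-16 are full, page 17 holds the remaining 20,
# and page 18 would start past the assumed cap.
sizes = [expected_page_size(page, 30) for page in (16, 17, 18)]
```

Here `sizes` evaluates to `[30, 20, 0]`, matching the 16 full pages and the 20-record page 17; the real API answers page 18 with the 500 error instead of an empty page.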
I just spent like 2 hours writing this. I hope it's helpful and not toooooo looooooong!