fix(spot): stop quadratic hidden_points growth and backward timestamp rollback #96
Merged
… rollback in rolling anchor

The findmespot crawler uses does_point_exist() (by SPOT message ID) to detect already-processed points, but the rolling anchor absorbs stationary pings without storing their SPOT ID. On every subsequent cron tick, those IDs are not found, the anchor fires again, and hidden_points increments once per absorbed ping, so the total grows quadratically with the number of stationary pings.

Fix 1 (crawler): when insert_point() returns 0 (rolling anchor absorbed), treat the point as accounted for and break out of the loop. This reduces re-processing from O(N) to O(1) per cron tick, where N is the number of absorbed pings.

Fix 2 (insert_row): when the rolling anchor updates an existing row, skip overwriting the timestamp if the incoming time is older than the stored one. The SPOT API returns messages newest-first, so without this guard the anchor row keeps rolling its time backward on each suppressed ping.
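The growth pattern and the effect of Fix 1 can be illustrated with a small simulation. This is a Python sketch, not the plugin's PHP code: the feed layout, the `simulate` function, and the assumption that the crawler loop stops at the first already-stored ID are all hypothetical stand-ins for the behaviour described in the commit message.

```python
def simulate(n_absorbed, n_ticks, break_on_absorbed):
    """Return the anchor row's hidden_points after n_ticks cron runs.

    Assumptions (not taken from the plugin source):
    - ID 0 is the anchor row and is the only SPOT ID stored in the DB.
    - IDs 1..n_absorbed are stationary pings that the anchor absorbs.
    - The feed is newest-first and the loop stops at a stored ID.
    """
    stored_ids = {0}
    feed = list(range(n_absorbed, -1, -1))  # newest-first, ends with ID 0
    hidden_points = 0

    for _ in range(n_ticks):
        for spot_id in feed:
            if spot_id in stored_ids:  # stand-in for does_point_exist()
                break
            # Stand-in for insert_point() returning 0: the ping is absorbed,
            # hidden_points goes up, and the ID is still not stored.
            hidden_points += 1
            if break_on_absorbed:  # Fix 1: stop at the first absorbed ping
                break
    return hidden_points


before_fix = simulate(n_absorbed=5, n_ticks=4, break_on_absorbed=False)
after_fix = simulate(n_absorbed=5, n_ticks=4, break_on_absorbed=True)
print(before_fix, after_fix)  # 20 4
```

With 5 absorbed pings and 4 cron ticks, the unfixed loop counts every absorbed ping on every tick (5 × 4 = 20), while the fixed loop adds at most one per tick.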
get_last_point_for_feed() was returning the last row of any type. After a CUSTOM (or STOP/HELP/SOS) event, the very next TRACK ping within the distance+time thresholds would UPDATE the event row, overwriting its lat/lon/time and incrementing hidden_points, instead of inserting a new TRACK row. This caused the event marker on the map to show a growing hidden-point count from all subsequent TRACK pings, and prevented any new TRACK rows from being committed while the device remained stationary near the event location.

Fix: pass type='TRACK' to the get_last_point_for_feed query when checking rolling-anchor suppression, so non-TRACK event rows are never used as the deduplication anchor.

Adds a regression test that verifies a TRACK ping after a CUSTOM event is always inserted as a new row with the CUSTOM row's hidden_points left at 0.
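The anchor-selection change can be sketched as follows. This is an illustrative Python model rather than the plugin's PHP: `last_point`, `insert_point`, and `demo` are hypothetical helpers, rows are plain dictionaries, and "within the distance+time thresholds" is simplified to "same position".

```python
def last_point(rows, point_type=None):
    """Return the most recent row, optionally restricted to one type."""
    for row in reversed(rows):
        if point_type is None or row["type"] == point_type:
            return row
    return None


def insert_point(rows, point, track_only_anchor):
    """Insert a point, or absorb it into the anchor row and return 0."""
    anchor = None
    if point["type"] == "TRACK":
        # The fix: only a TRACK row may serve as the rolling anchor.
        wanted = "TRACK" if track_only_anchor else None
        anchor = last_point(rows, wanted)
    # Simplification: an identical position stands in for the thresholds.
    if anchor is not None and anchor["pos"] == point["pos"]:
        anchor["hidden_points"] += 1
        return 0
    rows.append({**point, "hidden_points": 0})
    return 1


def demo(track_only_anchor):
    """A CUSTOM event followed by a TRACK ping at the same position."""
    rows = []
    insert_point(rows, {"type": "CUSTOM", "pos": (46.0, 7.0), "time": 100},
                 track_only_anchor)
    result = insert_point(rows, {"type": "TRACK", "pos": (46.0, 7.0), "time": 160},
                          track_only_anchor)
    return result, rows


print(demo(track_only_anchor=False)[0])  # 0: TRACK ping absorbed by the CUSTOM row
print(demo(track_only_anchor=True)[0])   # 1: TRACK ping inserted as a new row
```

In the unfiltered case the CUSTOM row ends up with hidden_points of 1; with the TRACK-only lookup it stays at 0 and a second row is created, which is the property the regression test is described as checking.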
8c59afc to 4fa47a8
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Problem

The SPOT (findmespot) crawler uses does_point_exist($spot_id) to detect already-processed messages. The rolling-anchor dedup in insert_row() absorbs stationary TRACK pings by updating the previous row without inserting a new row, so the SPOT message ID is never stored in the DB.

Consequence 1: quadratic hidden_points growth

On every subsequent cron tick, those unrecognised IDs are not found by does_point_exist, so the anchor fires again and hidden_points increments once per unrecognised ID per run. With N absorbed pings and K cron runs, total hidden_points = N × K, which is quadratic when N grows in step with K. This explains the 762 hidden-points accumulation reported on Reynald's stationary SPOT feed.

Consequence 2: anchor timestamp rolls backward

The SPOT API returns messages newest-first. The rolling anchor overwrites the anchor row's time with each incoming (older) timestamp, leaving the anchor row with the oldest stationary ping's time instead of the newest. On the next cron tick this causes an artificially large time_diff, which exceeds the anchor threshold and forces an unnecessary new row insertion.

Fix

- class-spotmap-api-crawler.php: when insert_point() returns 0 (rolling anchor absorbed), treat the point as fully accounted for and break the pagination loop. This reduces per-cron reprocessing from O(N) to O(1).
- class-spotmap-database.php: in the rolling anchor update, skip overwriting time if the incoming timestamp is older than the stored one. This keeps the anchor row at the newest observed time regardless of API delivery order.

Test plan

- npm run test:php: 208 tests, 532 assertions, all green before and after
- hidden_points should increment by at most 1, not by N
- time column reflects the latest ping time, not an older one

🤖 Generated with Claude Code
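The timestamp guard in the database fix can be sketched like this. It is a Python illustration under assumptions, not the code in class-spotmap-database.php: `absorb_into_anchor` and `replay` are hypothetical names, and timestamps are plain integers.

```python
def absorb_into_anchor(anchor, incoming_time, guard):
    """Absorb one stationary ping into the anchor row."""
    anchor["hidden_points"] += 1
    # The guard: never replace the stored time with an older one.
    if not guard or incoming_time >= anchor["time"]:
        anchor["time"] = incoming_time


def replay(times_newest_first, guard):
    """Feed pings to a fresh anchor in API order (newest first)."""
    anchor = {"time": 1000, "hidden_points": 0}
    for t in times_newest_first:
        absorb_into_anchor(anchor, t, guard)
    return anchor


pings = [1300, 1200, 1100]  # newest-first, as the SPOT API delivers them
print(replay(pings, guard=False)["time"])  # 1100: rolled back to the oldest ping
print(replay(pings, guard=True)["time"])   # 1300: stays at the newest ping
```

In both runs hidden_points ends at 3; only the stored time differs, which is what keeps the next tick's time_diff from being inflated.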