Releases: johnnyh1975/ha_roomba_plus

v3.2.0

@github-actions github-actions released this 02 Jul 05:48
a9ee8e8

Roomba+ v3.2.0 — Twelve New Features, Self-Calibrated Throughout

Config entry version: 24 (unchanged — no migration required)
Minimum Home Assistant: 2025.5.0
Breaking changes: none for end users

Overview

The largest feature release since v3.0.0: twelve new capabilities spanning maintenance diagnostics, room-level intelligence, anomaly explanation, and layout-change detection — plus a systematic multi-round code review that caught and fixed several real bugs before they reached anyone's install. Every self-calibrating feature in this release follows the same principle already established for L9-BATTERY and L8's health score: judge a robot against its own learned normal, not a fixed number that's wrong for half the fleet.

New features

Twelve new capabilities, split below by how quickly they become useful — several of the most interesting ones are genuinely worth the wait, but it's worth knowing upfront which is which.

Useful right away

  • Anomaly explanation (roomba_plus.explain_mission service + REST) — turns a flagged anomaly into a plain-language reason (obstacle/blockage, excessive recharge, dirt spike, incomplete coverage) with a matching recommended action. Works on any past mission immediately — and the result is also folded straight into every roomba_plus_mission_completed event payload, so a notification automation gets the reason for free without knowing this service exists.
  • Reset diagnostics (sensor.*_reset_diagnostics) — surfaces the robot's own reset-cause breakdown, previously entirely unread. Shows real data from the moment it's created, if your robot has any reset history at all.
  • Mission path replay (REST) — room-granular post-hoc reconstruction of a mission's path. Works on any archived mission immediately.
  • Stuck context event (roomba_plus_stuck) — fires with an actionable payload the next time the MQTT watchdog trips. No history needed.
  • Child lock & eco charge switches, dock firmware version — plain controls/readouts, active immediately on models that report the preference.
  • Team clean indicator (disabled by default) — shows on the very next mission if it happens to be an Imprint Link team clean.

Builds up over the following weeks

These are self-calibrating — each one learns this specific robot's normal behaviour rather than using a fixed threshold, which is what makes them useful, but it does mean a real wait before they show anything besides "learning" on a fresh install or update. New readiness attributes added in this release now make that wait visible on the entity itself rather than a silent, unexplained blank state:

  • Health score trend — 44 days of history needed; days_until_ready is exposed as an attribute from day one so you can watch it count down.
  • Layout change detection — 23 missions of per-cell coverage history needed; missions_until_first_ready (and cells_tracked) are now shown even before any candidate is found, so "still learning" no longer looks identical to "nothing to report."
  • Room accessibility scores: appear per room as soon as that room has any signal at all (the score is null until then), typically after a handful of missions with stuck events or timing data.
  • Cleaning cadence health — needs at least 3 recorded intervals between cleans of a given room (so, at minimum, 4 cleans of that specific room) before it can judge "overdue" against that room's own rhythm.
  • Stuck hotspot clusters — needs at least 2 adjacent grid cells to each independently reach the stuck-event threshold before a cluster forms; a real (if unwelcome) obstacle typically produces this within a handful of missions.
  • Room type suggestion — depends on iRobot's cloud having already computed a room-type suggestion for that region; timing is outside this integration's control.
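
The cadence-health readiness rule above (at least 3 intervals, so 4 cleans of the same room) can be illustrated with a small sketch. The function name `cadence_status`, the use of the median interval as the room's "own rhythm", and the `overdue_factor` of 2.0 are assumptions for illustration only, not the integration's actual implementation.

```python
from datetime import datetime, timedelta
from statistics import median

MIN_INTERVALS = 3  # from the release notes: 3 intervals, i.e. 4 cleans


def cadence_status(clean_times, now, overdue_factor=2.0):
    """Return "learning", "ok", or "overdue" for one room.

    overdue_factor is a hypothetical tuning value, not taken from Roomba+.
    """
    times = sorted(clean_times)
    # Gaps between consecutive cleans of this room.
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(intervals) < MIN_INTERVALS:
        return "learning"
    typical = median(intervals)
    return "overdue" if now - times[-1] > typical * overdue_factor else "ok"
```

With cleans every 3 days, a room would only be reported as overdue in this sketch once more than 6 days have passed since its last clean.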

Notable fixes found during development

Several of these features went through real design corrections during development, and a subsequent multi-round code review caught issues that wouldn't have surfaced from normal testing alone:

  • Health score trend reference-window contamination. The original design excluded only the most recent 14 days from the historical reference used to judge a decline — meaning a decline lasting longer than 14 days started pulling its own older days back into the reference set, silently masking the exact decline it was meant to detect (reproduced directly: a 25-day decline was misclassified "stable"). Fixed with a wider, independently-sized exclusion buffer.
  • Layout-change dismiss suppression never actually worked. Confirmed against Home Assistant's own Issue Registry source: re-creating a previously-dismissed issue does not reset its dismissed state, so after the intended 30-day suppression window, the issue would have stayed permanently hidden instead of resurfacing. Fixed by deleting the stale issue before recreating it.
  • Orphaned Repair Issues for stuck hotspots and cleaning cadence. Both could leave stale issues behind — stuck hotspots when a cluster's membership shifted, cleaning cadence when a room was renamed in the iRobot app. Both now clean up automatically.
  • Import endpoint accepted type-malformed records. A history-import record only needed an id to be accepted — a record with, say, a numeric timestamp instead of an ISO string was persisted as-is and then permanently crashed the cleaning-cadence computation on every cloud-refresh cycle, surviving restarts. Found during a full-codebase review; fixed with type validation at the import gate plus defensive guards at both consumer sites, so even a store poisoned before this release degrades gracefully instead of crashing.
  • Explicit-null firmware fields could crash the map, vacuum, and select platforms. iRobot firmware occasionally reports fields as explicit null rather than omitting them. A v2.8.0 fix addressed this class of bug in the MQTT callback path — but three other files with the same vulnerable pattern were missed at the time and stayed latent for years. The same full-codebase review found and fixed all 9 remaining sites (image.py, vacuum.py, select.py).
  • Two independent instances of a pre-existing "hardcoded test date drifts out of a rolling query window" issue were found and fixed along the way (unrelated to this release's new features, but caught while working in the same test files).
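
The import-gate fix described above amounts to checking types, not just the presence of an id. A minimal sketch of that idea follows. The function name `is_valid_import_record` and the assumption that the id is a string and the timestamp lives in a `started_at` field are illustrative, not the integration's real schema.

```python
from datetime import datetime


def is_valid_import_record(record):
    """Accept a record only if id and timestamp have the expected types.

    The field names "id" and "started_at" are illustrative assumptions.
    """
    if not isinstance(record, dict):
        return False
    record_id = record.get("id")
    if not isinstance(record_id, str) or not record_id:
        return False
    started_at = record.get("started_at")
    if not isinstance(started_at, str):
        # A numeric timestamp is rejected here instead of being persisted.
        return False
    try:
        datetime.fromisoformat(started_at)
    except ValueError:
        return False
    return True
```

Rejecting at the gate keeps bad data out of the store, while the defensive guards at the consumers cover data that was already persisted.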

Test suite

3,253 tests, 0 failures (baseline before this release: 2,993).

Upgrade notes

No config entry migration — all persisted data is additive; existing GridStore, MissionStore, and RobotProfileStore data loads unchanged. See "Builds up over the following weeks" above for which new features need mission history before they show a real value — that's expected self-calibration, not a bug, and it's now visible directly on the affected entities: sensor.*_health_score_trend exposes a days_until_ready attribute, and binary_sensor.*_layout_change_detected exposes missions_until_first_ready even before any candidate is found, so the learning period no longer looks identical to "already checked, nothing to report."

v3.1.1

@github-actions github-actions released this 30 Jun 20:44
8b8553e

Roomba+ v3.1.1 — Last Mission Summary Bugfix

Config entry version: 24 (unchanged — no migration required)
Minimum Home Assistant: 2025.5.0
Breaking changes: none for end users

Overview

A small follow-up release fixing a bug in sensor.*_last_mission_summary (introduced in v3.1.0) and adding the room-by-room coverage data that was requested alongside it.

Bug fix

cleaned_rooms on sensor.*_last_mission_summary always returned None on real installations. The attribute was reading a last_cleaned_rooms key directly from the stored mission record — but that key never actually exists there; it's only ever computed on the fly (the same name is used as a computed attribute on the vacuum entity, which created the confusion). cleaned_rooms now uses the same MissionStore.latest_cleaned_rooms() resolution path the vacuum entity already used correctly, with the same region-map/UMF-alignment fallback logic for SMART vs. EPHEMERAL robots.

New attribute

sensor.*_last_mission_summary now also exposes room_coverage — a {room_name: coverage_fraction} dictionary for the most recent mission, using the same data the vacuum entity's room_coverage attribute already provides. No new computation, no new data source — same information, now available in one place alongside the rest of the mission summary instead of needing a separate entity lookup.

Both cleaned_rooms and room_coverage are None when no cloud credentials are configured and the robot's Smart Map isn't yet aligned — same gate the vacuum entity attribute uses.

Field-data acknowledgement

This release exists because of a thoughtful follow-up suggestion from a field tester comparing Roomba+ against the official iRobot app's mission summary — specifically asking whether room coverage could be folded into the last-mission entity. Looking into it surfaced the cleaned_rooms bug as a side effect.

Test suite

3029 tests, 0 failures (baseline: 3027 before this release).

Upgrade notes

No config entry migration. If you were already reading cleaned_rooms from last_mission_summary and handling its (incorrect) None value in an automation or template, that workaround is no longer necessary — the attribute now returns real data.

v3.1.0

@github-actions github-actions released this 30 Jun 20:41
c1f4245

Roomba+ v3.1.0 — Self-Calibrating Intelligence

Config entry version: 24 (unchanged — no migration required)
Minimum Home Assistant: 2025.5.0
Breaking changes: none for end users

Overview

v3.1.0 focuses on making Roomba+ smarter about its own robot rather than relying on fixed, one-size-fits-all thresholds. Three new self-calibrating systems learn each robot's individual normal behaviour over time and only raise an alert when something genuinely deviates from it — replacing an earlier approach that would have produced false positives (or, in one case, a literally nonsensical 354-year battery life projection) on perfectly healthy robots.

This release also adds three new dictionary-style sensors for mission and room history, a plain-language layer on top of the existing health-score sensors, and a documentation pass on privacy and robot lifecycle.

What's new

Self-calibrating navigation health (SMART-tier)

sensor.*_relocalisation_rate tracks how often the robot has to relocalise during a mission — a direct signal of Smart Map quality. Rather than using a fixed threshold (which field testing showed doesn't generalise across different homes, floor plans, and WiFi conditions), the sensor learns each robot's own normal relocalisation rate over its first 15 cleaning missions, then compares the recent 10-mission window against that personal baseline. A Repair Issue fires only when the rate climbs to 3× the established baseline, and clears itself automatically once the rate returns to normal — no manual dismissal needed.

Disabled by default (diagnostic-tier); enable it if you want visibility into navigation health, especially after a Smart Map retrain.
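
As a rough illustration of the baseline comparison described above, the sketch below uses the three numbers from these notes (15 missions, 10-mission window, 3× multiplier). The function name, the use of a plain mean, and the handling of a zero baseline are assumptions, not the integration's code.

```python
from statistics import mean

BASELINE_MISSIONS = 15   # learning period from the release notes
RECENT_WINDOW = 10       # recent window from the release notes
ALERT_MULTIPLIER = 3.0   # alert when the recent rate reaches 3x the baseline


def relocalisation_alert(rates):
    """rates: per-mission relocalisation rates, oldest first.

    Returns None while still learning, otherwise True/False for the alert.
    """
    if len(rates) < BASELINE_MISSIONS:
        return None
    baseline = mean(rates[:BASELINE_MISSIONS])
    if baseline <= 0:
        # Assumption: without a positive baseline there is nothing to scale.
        return False
    recent = mean(rates[-RECENT_WINDOW:])
    return recent >= ALERT_MULTIPLIER * baseline
```

Because the function is re-evaluated from the current history each time, the alert clears on its own once the recent window drops back below the multiple.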

Self-calibrating battery health

The existing estimated_battery_eol sensor (days remaining until 65% capacity) has been hardened against measurement noise. Field testing on a long-running i7+ revealed that estCap readings naturally oscillate by ±10–15 mAh between missions on an otherwise healthy battery — and the previous linear extrapolation, taken at face value, projected a 354-year remaining lifespan from that noise alone. Technically correct math, useless as information.

The sensor now learns each robot's own measurement noise floor (using Welford's algorithm for a numerically stable running standard deviation) and only trusts a degradation trend once it clearly exceeds that floor. A conservative fallback threshold means genuinely failing batteries are still detected immediately, without waiting through the full learning period.
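
For readers unfamiliar with Welford's algorithm, here is a minimal sketch of a running standard deviation and a noise-floor check built on top of it. The class and function names, the multiplier `k`, and `min_samples` are hypothetical. Only the use of Welford's algorithm itself comes from these notes.

```python
import math


class RunningStats:
    """Welford's online algorithm for mean and sample standard deviation."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def trend_is_trustworthy(capacity_drop_mah, noise, k=3.0, min_samples=10):
    """Trust a drop only when it clearly exceeds the learned noise floor.

    k and min_samples are hypothetical values, not taken from Roomba+.
    """
    return noise.n >= min_samples and capacity_drop_mah > k * noise.std
```

The appeal of this approach is that only three numbers need to be persisted per robot, and the result stays numerically stable however many readings are added.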

Map drift detection — sliding window instead of lifetime total

map_drift_detected previously accumulated drift as a lifetime sum, which meant any EPHEMERAL-tier robot with enough mission history would eventually cross the trigger threshold regardless of whether it was actually drifting right now — a robot that had logged 5820mm of lifetime drift over months of normal use, none of it currently problematic, would still show a permanent false-positive warning.

Drift is now tracked over a 10-mission sliding window. The issue fires only when the recent average exceeds 250mm, and clears automatically (with hysteresis at 150mm to prevent flapping) once drift returns to normal. The lifetime total is retained as a diagnostic value but no longer drives the alert.
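
The sliding window with hysteresis can be sketched as follows, using the window size and the two thresholds quoted above. The class name `DriftMonitor` is hypothetical, and evaluating the average before the window is full is an assumption of this sketch.

```python
from collections import deque

WINDOW = 10        # missions in the sliding window
FIRE_MM = 250.0    # alert fires when the recent average exceeds this
CLEAR_MM = 150.0   # alert clears once the average falls below this


class DriftMonitor:
    def __init__(self):
        self._recent = deque(maxlen=WINDOW)  # old missions drop off the end
        self.active = False

    def record(self, drift_mm):
        """Add one mission's drift and return the current alert state."""
        self._recent.append(drift_mm)
        avg = sum(self._recent) / len(self._recent)
        if not self.active and avg > FIRE_MM:
            self.active = True
        elif self.active and avg < CLEAR_MM:
            self.active = False
        return self.active
```

Between 150mm and 250mm the state simply stays where it was, which is what prevents the alert from flapping around a single threshold.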

New sensors

  • sensor.*_last_mission_summary — the most recent mission as a single entity, with 14 attributes (duration, area, battery delta, recharges, dirt events, initiator, timestamps) for easy automation triggers and troubleshooting, without digging through mission history.
  • sensor.*_room_cleaning_history — a dictionary sensor mapping each room to its last-cleaned timestamp, built from existing mission records (SMART-tier with cloud access). No new storage schema, no entity-count growth on map retrain.
  • sensor.*_room_areas — a dictionary sensor mapping each room to its floor area in m², calculated from UMF polygon data. The only automatically-measured room area available without a tape measure.

All three use the dictionary-attribute pattern rather than one entity per room — stable entity count regardless of how many rooms your home has, and no orphaned entities after a Smart Map retrain.

Plain-language health status

integration_health and robot_health_score now carry status_text and recommendation attributes in addition to their numeric score — a human-readable summary ("Cloud connection has been stuck for 6h" / "Check your iRobot credentials in the options") instead of requiring you to interpret a raw breakdown dict. Available in all 7 supported languages.

Documentation

  • Privacy: the cloud credentials setup step now explicitly states that all commands run locally over MQTT and never touch the cloud — cloud access is read-only, used only for room names, mission history, and favorites.
  • Lifecycle: a new "Replacing or selling your robot" section in the README and Troubleshooting guide covers exporting your mission history before removing the integration, and restoring it on a replacement robot.

Field-data acknowledgements

Several fixes and the entire self-calibration approach in this release came directly out of real-world field reports:

  • The relocalisation-rate sensor's design — and the discovery that nMssn increments automatically without an actual cleaning mission starting — came from detailed MQTT captures and patient back-and-forth troubleshooting on an i7+.
  • The battery noise-floor fix was driven by a multi-month estCap reading history on the same robot, which made the false-positive risk concrete rather than theoretical.
  • A Smart Map corruption report (recurring "New space found" messages, inaccurate room tracking, eventually requiring a full map reset) provided useful context on what genuine navigation degradation looks like, distinct from normal noise.

Thank you to everyone who took the time to capture diagnostics, re-run templates, and report back — this release would have shipped with much weaker thresholds without that data.

Internal changes

  • RobotProfileStore gained three new self-calibrating subsystems (relocalisation baseline, estCap noise floor, drift sliding window), all following the same running-mean/Welford pattern already established for the existing coverage baseline.
  • estcap_to_mah() extracted to const.py as shared logic between sensor.py and callbacks.py (third call site — same reasoning as the existing active_charge_cycles() extraction).
  • FAN_SPEED_AUTOMATIC/ECO/PERFORMANCE constants changed from Capital-Case to lowercase slugs to satisfy Home Assistant's translation_key validation rules (hassfest). Both select.select_option and vacuum.set_fan_speed accept the old Capital-Case values too via case-insensitive matching — no automation changes required.
  • Three mop-related sensors (mop_clean_mode, mop_tank_status, mop_ars_behavior) had the same Capital-Case state-key issue as the fan speed select above, caught by a second hassfest validation pass. All three now use lowercase underscore slugs (e.g. "Dirty Pause + Dry + Wash" → "dirty_pause_dry_wash"). One additional bug found and fixed during this pass: mop_tank_status was missing a translation entry for its "unknown" state entirely, in all 7 languages (a pre-existing gap, now closed). New test coverage guards against the underlying class of bug (options array, strings.json, and all 7 translation files silently drifting out of sync with each other).
  • manifest.json cleaned up: removed two invalid fields that were never part of the schema, added the http (direct) and recorder (optional) dependencies that were previously undeclared, removed an unrecognised services key (the services.yaml file is auto-discovered by Home Assistant via directory convention and needs no manifest entry).
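
The backwards-compatible fan speed handling mentioned above boils down to normalising the incoming value before validating it. A minimal sketch, assuming the slugs are "automatic", "eco" and "performance" (the exact strings and the function name are not confirmed by these notes):

```python
FAN_SPEEDS = ("automatic", "eco", "performance")  # assumed lowercase slugs


def normalise_fan_speed(value):
    """Map "Eco", "ECO" or "eco" to the lowercase slug, else raise."""
    slug = value.strip().lower()
    if slug not in FAN_SPEEDS:
        raise ValueError(f"unsupported fan speed: {value!r}")
    return slug
```

An existing automation that still sends "Performance" would therefore keep working unchanged.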

Test suite

3027 tests, 0 failures (baseline: 2942 before this release).

Upgrade notes

No config entry migration — this release does not change the persisted schema. All new sensors and Repair Issues activate automatically; self-calibrating sensors will show Unknown/None until they've accumulated enough mission history (10–20 missions depending on the signal) to establish a personal baseline. That's expected — it's exactly the behaviour the self-calibration is designed to provide instead of guessing with insufficient data.

v3.0.0

@johnnyh1975 johnnyh1975 released this 29 Jun 05:11
ee91329

Roomba+ v3.0.0 — Technical Cleanup Release

Config entry version: 22 → 24 (two automatic migration steps, no user action required)
Minimum Home Assistant: 2025.5.0
Breaking changes: none for end users


Overview

v3.0.0 is a pure technical cleanup release with no user-facing behavior changes.
It removes deprecated sensor infrastructure introduced in v2.0, adds two new diagnostic
sensors, and significantly restructures the integration's internal architecture for
long-term maintainability.


What's new

New sensors

consecutive_mission_anomalies (disabled by default)
Exposes the number of consecutive most-recent missions classified as anomalous
(stuck, error, or significantly below the robot's baseline performance). Disabled
by default — consumed by the companion Card's C5-ANOMALY banner (threshold ≥ 3) and
available for custom automations. Activate in the entity registry when needed.

Observed obstacle zone overlay

The map image now overlays robot-detected obstacle zones (from UMF
observed_zones) as semi-transparent orange circles when the UMF aligner is active.
These represent positions where the robot has repeatedly detected obstacles over time,
supplementing the existing red keep-out zone overlay.


Bug fixes and resilience hardening

This release also folds in a round of real-field-data bug-hunting and hardening:

  • Vacuum state fix: a freshly-connected or sparsely-reporting robot could be
    shown as Paused instead of Idle/Docked. When cleanMissionStatus was
    missing a cycle, the value None was treated as an active cycle. Fixed so a
    missing cycle correctly reads as idle.
  • Store corruption resilience: five storage loaders (robot_profile, grid,
    mission_archive, outline, and related) could raise on a corrupted or
    hand-edited .storage file instead of degrading gracefully. They now reset
    cleanly and never propagate the error, so a bad store can't block setup.
  • Sensor-platform null safety: roomba_reported_state is hardened against an
    explicit {"state": null} frame that previously could take down the entire
    sensor platform. Verified by running every sensor's filter against the real 980
    diagnostic plus degraded (all-null, empty, missing-subdict) variants.
  • clean_room edge case: an empty room_passes list raised an IndexError;
    it now returns a clear validation error (no_rooms_resolved).
  • Dead-code cleanup: removed unreachable calls and unused imports surfaced
    during the audit (no behavior change).
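
The null-safety fixes above share one root cause: in Python, `dict.get(key, {})` only falls back to the default when the key is missing, not when it is present with an explicit None. The sketch below shows one way to guard against that. The helper names are hypothetical, and treating the string "none" as an idle cycle is an assumption about the robot's reported values.

```python
def get_dict(container, key):
    """Return container[key] if it is a dict, else an empty dict.

    Covers a missing key, an explicit None, and a non-dict container.
    """
    value = container.get(key) if isinstance(container, dict) else None
    return value if isinstance(value, dict) else {}


def has_active_cycle(state):
    """True only when cleanMissionStatus reports a real cycle."""
    cycle = get_dict(state, "cleanMissionStatus").get("cycle")
    # A missing or null cycle reads as idle rather than as an active cycle.
    return cycle not in (None, "none")
```

The same helper works for any nested sub-dict, so a single guard can replace repeated ad hoc checks in each platform file.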

Test suite grew from 2867 to 2946 over this hardening pass, all passing.


Removed (deprecated sensors from v2.0)

The following 13 sensors introduced in v2.0 as raw cloud data exposers are
permanently deactivated. They were superseded by their consolidated replacements
in v2.7–v2.9 and have been disabled by default since v2.7.0:

Removed sensor             Consolidated replacement
recent_completion_rate     cleaning_analytics_30d (recharge_pct attribute)
recent_recharges           event_counts_30d
recent_evacuations         event_counts_30d
recent_dirt_events         event_counts_30d
recent_error_code          event_counts_30d
recent_error_time          event_counts_30d
recent_wifi_floor          wifi_health
recent_wifi_stability      wifi_health
recent_cleaning_speed      cleaning_performance
recent_dirt_density        cleaning_analytics_30d (dirt_density attribute)
recent_recharge_fraction   cleaning_analytics_30d (recharge_pct attribute)
cleaning_speed_trend       cleaning_performance (trend attribute)
recent_coverage_pct        cleaning_performance (coverage_pct attribute)

If any of these are referenced in your dashboards or automations, replace them with
the corresponding consolidated sensor or attribute listed above.

The F6a (performance degradation) and F6b (battery recharge) Repair Issues continue
to work — the data they depend on is now computed by the consolidated sensors.


Internal architecture changes

These changes have no effect on runtime behavior but significantly improve code
structure and long-term maintainability:

SETUP-SPLIT: async_setup_entry reduced from 657 lines to a 35-line orchestrator
by extracting six named phase functions (_phase_connect, _phase_spatial,
_phase_data, _phase_cloud, _build_runtime_data, _phase_finalize) and a
_SetupContext accumulator dataclass. Each phase is independently readable and
documented.

Cloud refresh callback extracted: The _on_cloud_refresh_complete callback
closure (analytics merge, Repair Issue checks, UMF realignment trigger) is now a
proper named factory function make_cloud_refresh_callback() in callbacks.py,
consistent with the other make_*_callback functions.

Select descriptor pattern: The four simple select entities
(CleaningPassesSelect, DisposablePadWetnessSelect, ReusablePadWetnessSelect,
CarpetBoostSelect) are now driven by frozen RoombaPlusSelectDescription
dataclass descriptors via a single generic SimpleRoombaSelect class, eliminating
~180 lines of duplicated boilerplate.
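
The descriptor pattern can be illustrated without Home Assistant at all. In the sketch below, `SelectDescription` and `SimpleSelect` are hypothetical stand-ins for RoombaPlusSelectDescription and SimpleRoombaSelect. Their fields, option values, and state keys are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectDescription:
    """Hypothetical stand-in for RoombaPlusSelectDescription."""
    key: str
    state_field: str
    options: tuple


class SimpleSelect:
    """One generic class whose behaviour is driven by its description."""

    def __init__(self, description, robot_state):
        self.description = description
        self._state = robot_state

    @property
    def current_option(self):
        value = self._state.get(self.description.state_field)
        return value if value in self.description.options else None

    def select_option(self, option):
        if option not in self.description.options:
            raise ValueError(f"invalid option: {option!r}")
        self._state[self.description.state_field] = option


# Adding another select becomes one more line of data, not a new class.
DESCRIPTIONS = (
    SelectDescription("cleaning_passes", "passes", ("auto", "one", "two")),
    SelectDescription("carpet_boost", "boost", ("auto", "eco", "performance")),
)
```

Freezing the dataclass means a description cannot be mutated after creation, so entities can safely share it.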

Dead code removed: MissionArchive.missions_by_room() and
MissionArchive.dirt_series() — identified as structurally unreliable for 900-series
robots via field data analysis — have been removed along with their tests.


Upgrade notes

  • Two automatic migrations run on first load (config entry 22 → 23 → 24).
    No user action is required; both are non-destructive.
    • 22 → 23: stabilises FavoriteButton entity_ids. Buttons previously got
      locale-dependent ids derived from the iRobot routine name
      (e.g. button.roomba_980_og_montag_morgen); they are renamed to stable,
      discoverable ids so the companion Card can find them after upgrade.
    • 23 → 24: disables sensors that are permanently unavailable for most
      robots and have no UI path to become available, clearing them out of the
      entity list instead of leaving them as "unavailable".
  • Deprecated sensors that were already disabled (default since v2.7) are simply gone.
    HA will remove their entity registry entries automatically on first load.
  • If you had manually re-enabled any deprecated sensor, re-enable the corresponding
    consolidated sensor or attribute instead.
    (See the removed-sensor list above for the matching replacement.)
  • The hacs.json minimum HA version is now 2025.5.0.

Contributor notes

  • Test suite: 2946 tests, 0 failures (baseline: 2872 before v3.0.0 cleanup)
  • __init__.py net line delta: −660 lines (1942 → 1282 in the setup/teardown shell)
  • callbacks.py net delta: +310 lines (cloud helpers + refresh callback)
  • select.py net delta: −132 lines (descriptor pattern)
  • sensor.py net delta: −215 lines (deprecated descriptors + helpers removed)

v2.10.3 Hot Fix Release

@johnnyh1975 johnnyh1975 released this 28 Jun 08:58
295a2fd

Roomba+ v2.10.3

Hotfix release. Three bugs found after v2.10.2: a structural discrepancy
between format=summary and format=records, a cloud-result classification
gap that caused error/cancelled missions to count as completed, and the
device tracker entity being invisible by default on all installations.
No config schema changes; no card changes required.

Fixed

format=records missing cloud-unmatched local missions (RECORDS-UNION)

format=records used a strict either-or source selection: when the cloud
coordinator was healthy and had any records, the entire local MissionStore
was ignored. Any mission that completed locally via MQTT but never appeared
in the cloud history feed (for whatever reason on iRobot's side) was
silently invisible in the records array, even though format=summary's
total (always local) correctly counted it.

Confirmed in the field: format=summary showed total: 3 for a day while
format=records returned only 2 records for the same day — the third was
a real, 38-minute completed mission that had no cloud counterpart.

Local records that never received a cloud merge signal (dirt/chrgM/
wlBars all absent — the same heuristic cloud_coordinator.py's CR3
fallback uses) are now unioned into the cloud-sourced array and sorted by
started_at. The source field correctly marks these as "local".
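
A minimal sketch of the union logic described above. The three field names and the "local" source marker come from these notes. The function names, the "cloud" marker, and the assumption that started_at is an ISO string in a single consistent format (so that string sorting matches time order) are illustrative.

```python
CLOUD_SIGNAL_FIELDS = ("dirt", "chrgM", "wlBars")


def has_cloud_merge_signal(record):
    """True if any cloud-only field was merged into this local record."""
    return any(record.get(field) is not None for field in CLOUD_SIGNAL_FIELDS)


def union_records(cloud_records, local_records):
    """Cloud records plus local-only records, sorted by started_at."""
    merged = [dict(r, source="cloud") for r in cloud_records]
    merged += [
        dict(r, source="local")
        for r in local_records
        if not has_cloud_merge_signal(r)  # no cloud counterpart was merged
    ]
    return sorted(merged, key=lambda r: r["started_at"])
```

Local records that did receive a cloud merge are skipped, which is what keeps the same mission from appearing twice.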

format=summary counting error/cancelled missions as completed (B1-EXT)

MissionStore._merge_cloud_fields() only corrected one specific
cloud/local result mismatch: pauseId=224 + local "stuck" → "error"
(map localisation failure). All other cases where the MQTT mission-end
packet carried a different result than what the cloud later revealed were
never corrected, leaving the local result field permanently wrong.

Confirmed in the field: the same day had completed: 3 in format=summary
while format=records (cloud-sourced) showed result: "error_battery" and
result: "cancelled_by_user" for two of those missions — completed: 0
would have been correct.

Two new correction rules added to _merge_cloud_fields() (B1-EXT), applied
whenever the cloud record arrives via backfill_from_cloud() or
merge_latest_from_cloud():

  • done == "bat" with local "completed" or "stuck_and_resumed" → "error";
    error_code backfilled from pauseId if absent.
  • done_raw == "usrEnd" with local "completed" or "stuck_and_resumed" →
    "cancelled".

"stuck_and_abandoned" is deliberately not touched — the robot stopped on
its own, independently of user or battery. Generic "error"/"cancelled"
values are used (not "error_battery"/"cancelled_by_user") so the
corrected values stay within the documented local result enum.
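
The two correction rules can be expressed as a small pure function. This is a sketch only: the function name and the assumption that the cloud record is a plain dict are hypothetical, and the error_code backfill from pauseId is left out for brevity.

```python
CORRECTABLE = ("completed", "stuck_and_resumed")


def corrected_result(local_result, cloud):
    """Return the corrected local result for one merged cloud record.

    `cloud` is assumed to be a dict carrying "done" and "done_raw".
    """
    if local_result not in CORRECTABLE:
        return local_result  # e.g. "stuck_and_abandoned" is left alone
    if cloud.get("done") == "bat":
        return "error"
    if cloud.get("done_raw") == "usrEnd":
        return "cancelled"
    return local_result
```

Returning only the generic values keeps the output inside the documented local result enum, as described above.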

Device tracker entity invisible by default (all robot tiers)

TrackerEntity.entity_registry_enabled_default returns False when both
mac_address and device_info are None. Both are always None for
RoombaDeviceTracker: the robot is identified by BLID (not MAC), and
device_info is None by HA core design for device tracker entities.
The entity was registered correctly but disabled by default on every
installation, so it never appeared in the UI without manual intervention.

Confirmed root cause of Thonno's field report: "I don't seem to have that
entity on my i7+" — the entity existed in the registry but was invisible
on all tiers (SMART and EPHEMERAL alike).

Fixed by setting _attr_entity_registry_enabled_default = True explicitly
on RoombaDeviceTracker. Users who already have the entity in their
registry in a disabled state will need to enable it once manually; new
installations will see it enabled by default immediately.

Testing

  • 4 new regression tests for RECORDS-UNION (TestLocalRecordHasCloudMergeSignal,
    TestRecordsUnionWithLocal).
  • 18 new regression tests for B1-EXT (TestMergeCloudFieldsB1Ext), including
    a direct field-bug repro against the real archive timestamps from 26.06.2026.
  • 2 new regression tests for the device tracker fix
    (TestEntityRegistryEnabledDefault).
  • Full suite: 2,836 passing / 0 failing.

Upgrade notes

No action required for RECORDS-UNION and B1-EXT — corrections apply
automatically on the next cloud refresh after upgrading.

For the device tracker: if you previously saw no device_tracker.* entity
for your Roomba, it now appears enabled by default after upgrading. If it
was already in your entity registry in a disabled state, enable it once
manually in Settings → Devices & Services → Roomba+ → the tracker entity.

v2.10.2 - Hotfix release

@johnnyh1975 johnnyh1975 released this 27 Jun 06:48
ed89378

Roomba+ v2.10.2

Hotfix release. Five bugs found and fixed during a post-v2.10.1 audit of
the mission_archive/mission_store analytics pipeline and the new
RoomSegStore watershed engine introduced in v2.10.0. No config schema
changes; no card changes required.

Fixed

mission_archive — duplicate records from a pre-v2.8.6 bug, never cleaned up

The discontinuity-guard bug fixed in v2.8.6 (Round 1/2) stopped new
duplicate records from being appended, but never retroactively cleaned
up duplicates it had already written to disk before the fix shipped.
Archives that hit that bug while it was still active can have had a
block of missions tripled in the persisted derived/timeline arrays
ever since, silently inflating any analytics that aggregate over the
full archive.

Added a one-time DEDUP-V1 pass in MissionArchive.async_load(),
gated by a persisted flag so it runs exactly once per archive and is a
no-op on every load after that.
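
A one-time, flag-gated cleanup pass of this kind might look like the sketch below. The archive layout and the key names "dedup_v1_done", "timeline" and "id" are assumptions for illustration, and the real pass also covers the derived array.

```python
def dedup_once(archive):
    """Remove duplicate timeline entries once, then set a persisted flag.

    Returns True if the pass ran, False if it was skipped.
    """
    if archive.get("dedup_v1_done"):
        return False  # already cleaned on an earlier load
    seen = set()
    unique = []
    for entry in archive.get("timeline", []):
        if entry["id"] not in seen:
            seen.add(entry["id"])
            unique.append(entry)  # keep the first occurrence, in order
    archive["timeline"] = unique
    archive["dedup_v1_done"] = True
    return True
```

Storing the flag alongside the data is what makes the pass idempotent across restarts.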

MissionStore.area_sqft — never backfilled from the cloud on accurate-timestamp robots

area_sqft is the canonical field read by RobotProfileStore's mission
statistics and compute_rolling_stats(). Its cloud backfill was
implemented inside the timestamp-correction branch of
backfill_from_cloud() — so it only ever ran on the rare mission that
also needed a started_at/ended_at fix. On any robot whose local
timestamps are already accurate (the common case), area_sqft stayed
None forever regardless of how much cloud history was available, even
though the raw sqft field was being merged correctly the whole time.
The same gap existed in merge_latest_from_cloud(), the hook that runs
after every mission completion.

Both call sites now backfill area_sqft from sqft independently of
the timestamp-correction path, via a shared _backfill_area_sqft()
helper.
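
The shape of this bug and its fix can be shown in a few lines. The sketch below is not the integration's code: `merge_cloud` and its `timestamps_need_fix` parameter are hypothetical, and records are assumed to be plain dicts.

```python
def backfill_area_sqft(record, cloud):
    """Copy the cloud sqft into area_sqft when it is still unset."""
    if record.get("area_sqft") is None and cloud.get("sqft") is not None:
        record["area_sqft"] = cloud["sqft"]
        return True
    return False


def merge_cloud(record, cloud, timestamps_need_fix):
    """Hypothetical merge step showing where the backfill now runs."""
    if timestamps_need_fix:
        record["started_at"] = cloud["started_at"]
        record["ended_at"] = cloud["ended_at"]
    # Runs for every record, not only inside the timestamp branch above.
    return backfill_area_sqft(record, cloud)
```

Before the fix, the equivalent of the last line sat inside the `if` block, so it was skipped whenever the timestamps were already accurate.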

RoomSegStore — every door's saddle_mm collapsed to the same value

The new (v2.10.0) watershed door-detection step computed each door's
saddle width as the minimum raw distance-transform value across the
entire shared boundary between two rooms — and that boundary almost
always contains at least one cell sitting right at the edge of the
visited mask, unrelated to the actual doorway. Confirmed in the field:
every door in a real archive measured exactly 1 grid cell (150.0mm),
regardless of true corridor width.

Fixed by computing the saddle on the Gaussian-smoothed distance field
(dist_smooth) instead of the raw one — the same field
merge_regions() already uses for its own boundary-saddle calculation.
Verified against the same real-archive fixture: five doors now measure
five different, geometry-correct values.

RoomSegStore — doors silently deleted on recompute

_match_doors() rebuilt self.doors from scratch on every recompute,
dropping any existing door whose room pair wasn't re-detected that
round — along with its entire observations history and stable id.
This is inconsistent with the room-preservation policy already
documented and tested for rooms (test_unmatched_existing_room_is_kept_not_deleted).
It also isn't just theoretical: GridStore decays and prunes
low-traffic cells every mission, and a narrow, infrequently-crossed
doorway is exactly the kind of cell that can legitimately drop out of
the visited set for one recompute without the door having stopped
existing.

Unmatched existing doors are now kept, mirroring the room-preservation
policy, as long as both rooms they connect still exist.
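
The preservation rule can be sketched as below. The `Door` dataclass, the `match_doors()` signature, and matching by room pair are assumptions made for illustration; only the policy itself (keep an unmatched door while both of its rooms exist) comes from the notes above.

```python
from dataclasses import dataclass, field


@dataclass
class Door:
    id: str
    rooms: frozenset[str]  # ids of the two rooms the door connects
    observations: list[float] = field(default_factory=list)


def match_doors(
    existing: list[Door], detected: list[Door], room_ids: set[str]
) -> list[Door]:
    """Merge freshly detected doors into the existing set.

    A re-detected room pair keeps its existing id and history. An
    unmatched existing door is kept if both of its rooms still exist.
    """
    by_pair = {door.rooms: door for door in existing}
    seen: set[frozenset[str]] = set()
    result: list[Door] = []
    for new in detected:
        old = by_pair.get(new.rooms)
        if old is not None:
            old.observations.extend(new.observations)
            result.append(old)
        else:
            result.append(new)
        seen.add(new.rooms)
    for old in existing:
        # Not re-detected this round: keep it unless a room disappeared.
        if old.rooms not in seen and old.rooms <= room_ids:
            result.append(old)
    return result
```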

RobotProfileStore.update_mission_stats() — never wired into the callback chain

This method's own docstring has said "L3/L8 — called after each mission
from the callback chain" since at least v2.6.0. It had no caller
anywhere in the codebase, in production or in tests. Practical effect:
mission_duration_mean, mission_duration_std, and mission_area_mean
never populated for any installation, no matter how much mission
history existed — including, but not limited to, the area_sqft fix
above.

Wired into _async_update_robot_profile_store() (the existing L5/L6/J
callback chain) as the missing L3/L8 step. update_mission_stats() now
returns whether it wrote anything, matching the bool-return convention
already used by update_lifetime_sqft_tracking().

Testing

  • 19 new regression tests added across test_mission_archive.py,
    test_mission_store.py, test_room_segmentation.py,
    test_room_seg_store.py, and test_init_wiring.py.
  • Every fix in this release was verified to fail against the pre-fix
    code before being confirmed fixed — including a direct repro against
    a real, anonymised field archive for the saddle_mm bug
    (tests/fixtures/sample_grid_980_og.json).
  • Full suite: 2,814 passing / 0 failing.

Upgrade notes

No action required. The mission_archive dedup pass and the
area_sqft backfill both run automatically on the next HA restart /
cloud refresh after upgrading. Existing RoomSegStore doors will pick up
corrected saddle_mm values on the next recompute that re-detects them
(unaffected doors keep their last-known value until then, per the
preservation fix above — they are not deleted or reset).

v2.10.1 Faster mission-end confirmation for the last room in a multi-room mission

@johnnyh1975 johnnyh1975 released this 26 Jun 16:02
2da5603

Roomba+ v2.10.1

Improvement

Faster mission-end confirmation for the last room in a multi-room mission (SMART/Smart Map robots)

Multi-room missions wait for a deliberate confirmation delay (up to ~90 seconds) before the integration commits to "mission really ended" rather than "robot is just pausing between rooms" — this protects against a real prior regression where a single misread of the robot's own cycle field mid-mission reset progress tracking to zero.

That confirmation delay relied on tracking which room the robot is currently in, advanced automatically as the robot moves between rooms. Closer analysis (prompted by a field report) found that this tracking has a structural blind spot: it can never advance into the last planned room, because the signal it depends on to confirm a room change requires the robot to still be actively running its cleaning cycle — which is no longer true by the time the robot is finishing its last room and starting to wrap up the whole mission. In practice, this meant every multi-room mission waited out close to the full 90 seconds on its last room, not just the rare edge case the delay exists for.

The robot does report fully reliable per-room completion data — but only via the cloud, with its own delay. v2.10.1 opportunistically checks whatever cloud mission data is already cached (from the integration's regular background refreshes, not a new request) for explicit confirmation that every planned room has been completed. When that data already confirms it, the mission is recorded immediately instead of waiting out the delay. When it isn't there yet — most of the time, since cloud data lags — behavior is unchanged from before. This can only ever make confirmation faster, never less reliable: the original protections stay fully in place as the fallback.
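
As a rough illustration of the override, the sketch below returns a zero delay only when cached cloud data explicitly confirms every planned room. The cache layout (`rooms`, `id`, `status`) and both function names are hypothetical; the integration's real cloud payload may look quite different.

```python
from typing import Optional


def all_planned_rooms_confirmed(
    planned: list[str], cached_cloud_mission: Optional[dict]
) -> bool:
    """True only if every planned room is marked completed in cached data."""
    if not planned or not cached_cloud_mission:
        return False
    done = {
        room["id"]
        for room in cached_cloud_mission.get("rooms", [])
        if room.get("status") == "completed"
    }
    return set(planned) <= done


def confirmation_delay_s(
    planned: list[str],
    cached_cloud_mission: Optional[dict],
    default_delay_s: float = 90.0,
) -> float:
    """Skip the wait when the cloud already confirms; otherwise unchanged."""
    if all_planned_rooms_confirmed(planned, cached_cloud_mission):
        return 0.0
    return default_delay_s
```

Missing or partial cloud data always falls through to the default delay, which is what keeps the original protection in place.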

Internal

  • 6 new tests (the override firing, a partial-completion case correctly not firing it, no matching cloud data falling through unchanged, the error-recovery completion status, an in-progress status not being mistaken for completion, and a sanity check that the index-only path is untouched when it already has the answer).
  • 2,831 tests, 0 failing.

v2.10.0 - ROOM-SEG: room/door detection completely rebuilt for 900-series (EPHEMERAL) robots

@johnnyh1975 johnnyh1975 released this 26 Jun 15:29
21c15fe

Roomba+ v2.10.0

Highlights

ROOM-SEG: room/door detection completely rebuilt for 900-series (EPHEMERAL) robots

The gap-distance heuristic that previously detected rooms and doors from travel data (ZoneStore) has been replaced end to end with a watershed-based segmentation engine (RoomSegStore) that works directly on the same accumulated coverage-density data used for the cleaning heatmap. It is implemented from scratch with zero new runtime dependencies (no scipy/scikit-image).

The old approach had a structural flaw: with dense MQTT pose sampling, the maximum distance between consecutive samples never reliably exceeded the door-gap threshold, so most real doorways were never detected. The new approach instead finds rooms as density basins in the coverage map and doors as the saddle points between them — verified against real field data to produce a stable room result across a range of parameters, not a lucky one-off.

What's new for EPHEMERAL (900-series) robots:

  • Room and door identity stay stable across recomputations — a name you assign survives even as more missions refine the boundaries.
  • A door's tracked position is now the median of its recent crossing observations (not the latest single reading), making it robust to a door being at a different swing angle from one mission to the next.
  • A Repair Issue prompts naming for newly detected rooms, same as before.
  • If you're updating from an earlier version, any room names you already confirmed are migrated automatically the first time this version starts — no action needed.

What was deliberately removed:

  • The door-width calibration helper (Options Flow → map scale calibration). It only ever adjusted a cosmetic rendering scale that's still directly editable as a plain number in the general map settings — the calibration step itself added no functionality beyond that, and field experience showed it saw effectively no use.

Battery / dock contact monitoring (new Repair Issue)

A new check (battery_contact_suspect) catches two patterns that usually indicate a loose or corroded battery/dock contact rather than a failing battery:

  • An implausible jump in reported battery level: more than ~25 percentage points in under 10 minutes. No real battery changes state that fast; this is the BMS communication link dropping and recovering.
  • The highest battery level reached declining over three consecutive charge cycles — a slower-motion version of the same underlying issue.

This complements the existing dock_contact_health check, which depends on firmware-reported counters (nChatters/nKnockoffs/nAborts) that not every firmware version exposes — the new check works from batPct alone, which is universal. If dock_contact_health was never firing for you despite real symptoms, this is likely why; it now also logs a debug note when its expected fields are absent so the gap is visible rather than silent.
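
The two patterns can be sketched from batPct samples alone. The thresholds come from the description above; the function names, the sample format, and the reading of "declining" as three strictly decreasing cycle peaks are assumptions.

```python
JUMP_PCT = 25.0          # percentage points
JUMP_WINDOW_S = 10 * 60  # seconds


def has_implausible_jump(samples: list[tuple[float, float]]) -> bool:
    """samples: (unix_timestamp, batPct) pairs in chronological order."""
    for i, (t0, p0) in enumerate(samples):
        for t1, p1 in samples[i + 1:]:
            if t1 - t0 >= JUMP_WINDOW_S:
                break  # later samples are even further away in time
            if abs(p1 - p0) > JUMP_PCT:
                return True
    return False


def peak_declining(cycle_peaks: list[float], cycles: int = 3) -> bool:
    """True if the last `cycles` charge-cycle peaks are strictly decreasing."""
    recent = cycle_peaks[-cycles:]
    return len(recent) == cycles and all(
        earlier > later for earlier, later in zip(recent, recent[1:])
    )
```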

Map rendering

  • Door markers and room outlines on the live map are now sourced from the same RoomSegStore/GeometryStore pipeline above. The previous gap heuristic feeding these is gone.
  • Hidden rooms are now correctly excluded from the map overlay (previously they were excluded from selectors and Repair Issues but still drawn on the map — an inconsistency, now fixed).

Fixes

  • dock_contact_health's detection gate now logs a debug message when none of its expected fields are present for a given firmware, instead of silently never firing.
  • CLOUD-CATCHUP false-positive (SMART/Smart Map robots). The delayed cloud-refresh retry introduced in v2.9.1 to fix last_cleaned_rooms staying stale could itself be fooled into giving up too early: its "has the cloud caught up yet" check looked at whatever MissionStore's newest local record happened to be, without confirming it was actually the record for the mission that just ended. If an unrelated record (e.g. an earlier failed attempt) coincidentally received its own backfill enrichment at almost the same moment the check ran, the check would conclude "done" against the wrong record and skip its second, later retry — leaving last_cleaned_rooms permanently null for the real mission instead. Confirmed via a field retest (Thonno, three-attempt room-selected mission with two failed short attempts followed by a genuine 67-minute completion) and fixed by comparing against the record present before the mission ended, by identity, rather than by its content.
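
A hedged sketch of the identity comparison behind the CLOUD-CATCHUP fix, assuming records are kept as in-memory dicts and that a snapshot reference to the newest record is taken before the mission ends. The function name and record layout are hypothetical.

```python
from typing import Optional


def cloud_caught_up(
    newest_now: Optional[dict], newest_before_end: Optional[dict]
) -> bool:
    """Decide whether the just-ended mission's record has been enriched.

    Compares by identity (`is`), not content: if the newest record is
    still the same object that was newest before the mission ended, any
    enrichment on it belongs to an older mission and must not count.
    """
    if newest_now is None or newest_now is newest_before_end:
        return False
    return newest_now.get("last_cleaned_rooms") is not None
```

A content-based check would have accepted the older record as soon as it received its own backfill; the identity check does not.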

Internal

  • Six-stage migration from ZoneStore to RoomSegStore, fully decoupled from the also-removed ZoneStore-only data path. A minimal read-only migration shim (legacy_zone_migration.py) is kept solely so existing installations can carry their confirmed room names forward once; it has no other purpose.
  • Two small, genuinely independent pieces of logic that happened to live inside the old ZoneStore were relocated rather than deleted: dock-drift detection (now a standalone function in image.py) and the door-gap-width constants used by SMART-tier door detection (now in const.py).
  • 2,825 tests (up from 2,776), 0 failing.

Upgrade notes

No action required. Existing confirmed room names for 900-series robots are migrated automatically on first start after upgrading.

v2.9.1 - Bugfix release. No breaking changes.

@johnnyh1975 johnnyh1975 released this 25 Jun 16:53
5c965a7

Roomba+ v2.9.1

Bugfix release. No breaking changes.

Fixed

  • Completed missions could be misclassified as stuck_and_resumed/stuck_and_abandoned.
    bbrun, runtimeStats, bbchg3, pose, and lastCommand are separate
    top-level MQTT keys from cleanMissionStatus. A delta message that only
    updates cleanMissionStatus (e.g. the mission-start phase transition) is
    not guaranteed to carry any of them — reading them directly off that one
    message silently substituted an empty/zero default for the robot's actual
    last-known value. In practice this meant: a brand-new mission's nStuck
    baseline could get read as 0 instead of the robot's true lifetime
    count, so the very next message carrying bbrun (any nonzero count)
    looked like a fresh stuck event — even on a completely clean run.
    (#20)

  • zones=[] recorded for clearly room-selected missions. Same root
    cause, via lastCommand — a multi-room clean_room mission could be
    recorded with an empty zone list if the mission-start message didn't
    happen to carry lastCommand.

  • last_cleaned_rooms (and similar cloud-derived attributes) could get
    stuck on unknown for up to 24 hours. The post-mission cloud refresh
    fired synchronously on mission end — exactly the moment iRobot's own
    cloud has had the least possible time to ingest the mission. When that
    single attempt missed, nothing retried until the next scheduled 24h
    poll. Replaced with two fixed checkpoints (delayed first attempt, one
    fallback ~10 minutes later) instead of one immediate, too-early attempt.

  • translations/fr.json had two Repair Issue keys misspelled
    (error_reçurrence/cancellation_reçurrence instead of the ASCII
    error_recurrence/cancellation_recurrence every other language file
    and repairs.py itself use). French-locale users saw these two Repair
    Issues untranslated.
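
The root cause shared by the first two fixes above can be illustrated with a small last-known-state accumulator. This is a sketch only: the `RobotState` class is hypothetical, and the shallow per-key merge is an assumption about how delta messages are combined.

```python
from typing import Any, Optional


class RobotState:
    """Accumulates partial MQTT delta messages into a last-known snapshot."""

    def __init__(self) -> None:
        self._state: dict[str, Any] = {}

    def apply_delta(self, delta: dict[str, Any]) -> None:
        # A top-level key absent from this delta keeps its previous value.
        self._state.update(delta)

    def n_stuck(self) -> Optional[int]:
        """Lifetime stuck count from the last bbrun seen, or None if never seen."""
        bbrun = self._state.get("bbrun")
        return None if bbrun is None else bbrun.get("nStuck")
```

Reading `nStuck` from the accumulated state returns the robot's true lifetime count even when the most recent message only carried `cleanMissionStatus`, whereas reading it off that single message with a default would yield 0.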

Added

  • Per-room floor area. select.* zone-selector entities for Smart Map
    robots now expose a region_areas_m2 attribute (room name → m²),
    computed once from the same UMF geometry used for map rendering.

Internal

  • Consolidated several feature/version-named test files into their domain
    test files (no behavior change).
  • Added a regression guard that checks every translation_key used in
    the integration's source against strings.json and all 7 language
    files, including an ASCII check on translation file keys — the class of
    bug the fr.json fix above falls into.
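
The ASCII part of such a guard might look like the sketch below, which walks a parsed translation file and reports any key containing non-ASCII characters. The function name and the dotted-path output format are illustrative choices, not the integration's actual test.

```python
def non_ascii_keys(node: object, path: str = "") -> list[str]:
    """Return dotted paths of all dict keys that contain non-ASCII characters."""
    found: list[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if not key.isascii():
                found.append(here)
            # Values (the translated text) may legitimately be non-ASCII,
            # so only nested dict keys are inspected further.
            found.extend(non_ascii_keys(value, here))
    return found
```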

Full test suite: 2776 passing.

Bug Fix & monitoring / events + map optimization

@johnnyh1975 johnnyh1975 released this 24 Jun 20:26
ae37e49

Roomba+ v2.9.0 — Release Notes

Based on: v2.8.7 (published). Test status: 2750 passing, 0
failures, 49 test files (45 in v2.8.7 → 49).


Overview

This release closes out the v2.9.0 monitoring/events milestone with five
features plus an unrelated, more urgent fix for a real production
bug found in a field report from Thonno on v2.8.7.


Fixed: EVENT_ROOM_COMPLETED crashed the local MQTT thread

Thonno reported what looked like a regression in the v2.8.7 stuck-mission
fix: a multi-room mission (Bagno principale → Corridoio → Soggiorno,
Two Passes) completed correctly in the iRobot app, but Home Assistant
stayed stuck on Returning to base / docking – end of mission, with
elapsed_run_min still climbing 45+ minutes after the robot had docked.

It wasn't the stuck-mission fix. The v2.8.7 periodic 30-second
safety-net recheck was running exactly as designed — it just never had
anything new to look at. His log showed the real root cause: at the very
first room transition (Bagno principale → Corridoio), _on_mission_message
fired hass.bus.async_fire(EVENT_ROOM_COMPLETED, ...) directly from
roombapy's paho-mqtt background thread, not from the event loop. On his
HA core (a notably stricter/newer build than ours), Home Assistant's
thread-safety guard raised a hard RuntimeError for this — which
propagated all the way up through paho-mqtt's on_message callback and
killed the entire MQTT message-processing thread for the rest of the
mission. Every subsequent symptom — frozen state, the climbing timer, the
stuck-mission recheck endlessly re-evaluating the same cached message —
followed directly from no further MQTT traffic ever being processed
again, not from any logic bug in the end-of-mission classifier itself.

EVENT_ROOM_COMPLETED was introduced in v2.8.6's EVENT-BUS feature. A
sibling event added in the same release, EVENT_MAP_RETRAIN_STARTED/COMPLETED,
already bridges correctly via
asyncio.run_coroutine_threadsafe — this one event type didn't get the
same treatment when it was added.

Fix: hass.loop.call_soon_threadsafe(hass.bus.async_fire, ...)
instead of a direct call — the same bridging pattern already used
elsewhere in this file for async_check_map_retrain_workflow.
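
The bridging pattern can be demonstrated with plain asyncio and a worker thread, without Home Assistant. In this sketch `fire_event()` stands in for hass.bus.async_fire and `mqtt_callback()` for the paho-mqtt on_message path; both names are placeholders.

```python
import asyncio
import threading


def fire_event(received: list, name: str) -> None:
    """Stand-in for an event-loop-only call; records the thread it ran on."""
    received.append((name, threading.get_ident()))


async def main() -> list:
    loop = asyncio.get_running_loop()
    received: list = []
    done = asyncio.Event()

    def mqtt_callback() -> None:
        # Runs on a background thread. Calling fire_event() directly here
        # would execute it off the loop; schedule it on the loop instead.
        loop.call_soon_threadsafe(fire_event, received, "room_completed")
        loop.call_soon_threadsafe(done.set)

    worker = threading.Thread(target=mqtt_callback, name="mqtt-thread")
    worker.start()
    await done.wait()
    worker.join()
    return received
```

Running `asyncio.run(main())` shows the event recorded on the thread that owns the loop rather than on the worker thread.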

Audit: checked every other direct hass.bus.async_fire call site in
the integration. blocking_manager.py and presence_manager.py both
fire exclusively from service calls and state-change listeners — already
running on the event loop, no change needed.

Likely impact beyond this report: this affects any multi-room mission
on any HA core build strict enough to enforce (rather than just warn
about) cross-thread event-bus calls — not unique to Thonno's setup.


Fixed: mqtt_watchdog cloud-connectivity hint was hardcoded in German

Community member boutXIII pasted what looked like a confusing, broken
Repair Issue message — a French sentence with a German clause stitched
into the middle of it ("Connectivité cloud du robot : verbunden — spricht
für..."). Not a robot/network issue at all — a real localization bug.

Root cause: the cloud-connectivity hint inserted into the
{cloud_hint} placeholder was built as a hardcoded German string in
binary_sensor.py, regardless of the user's configured HA locale. The
surrounding sentence was correctly localized via the normal translation
system; the substituted value itself never went through it. Present
since the same v2.9.0 session that added the enriched mqtt_watchdog
message — the existing tests for this code only ever asserted on German
substrings, so the locale gap was invisible in CI from the start.

Fix: _async_watchdog_tick is a @callback (synchronous, can't await
a translation lookup), so server-side string substitution isn't a clean
fit here. Replaced the single mqtt_watchdog translation_key + hardcoded
hint with three fully-localized translation_keys —
mqtt_watchdog_cloud_connected, _disconnected, _unknown — one full,
natural sentence per language per status, resolved automatically by Home
Assistant's existing per-locale translation_key mechanism, the same way
{minutes}/{last_phase} already work. No more placeholder substitution
for this field at all.


Fixed: mqtt_watchdog false positive right after undocking

Both boutXIII and Jean-Christoph reported the watchdog firing within
minutes of starting a mission, on robots that were perfectly fine. Both
reports showed the issue firing with minutes≈5 — right at the existing
silence threshold, not far past it.

Root cause: RoombaMqttStale.is_on only checked phase=="run" and
5 minutes of silence — with no awareness of how long the mission had
actually been running. A genuine, benign Wi-Fi gap right after undocking
(reassociation while the robot physically moves away from the router;
motor startup interference) is common, especially on older robots like
the 980 OG with an aftermarket NiMH battery. The last message received
before the gap already showed phase=="run", so the watchdog fired the
instant the 5-minute threshold was crossed, regardless of how fresh the
mission was.

Fix: new MQTT_WATCHDOG_START_GRACE_SECONDS (420s/7min, chosen with
margin above the ~5min observed in both reports — exact gap duration
wasn't precisely measured in either case). The watchdog now suppresses
entirely for this long after mssnStrtTm, regardless of silence
duration. A genuine outage starting early in the mission is still caught
once both this grace window and the normal silence threshold have
elapsed — this only suppresses the very start of a mission, not any
arbitrary mid-mission silence.
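
A compact sketch of the gate, assuming timestamps in seconds. The grace constant and its value come from the notes above; `SILENCE_THRESHOLD_SECONDS` and the function signature are assumed names for the existing 5-minute check.

```python
MQTT_WATCHDOG_START_GRACE_SECONDS = 420  # 7 minutes, per the release notes
SILENCE_THRESHOLD_SECONDS = 300          # existing 5-minute threshold (name assumed)


def watchdog_is_on(
    phase: str, now: float, last_message_at: float, mission_started_at: float
) -> bool:
    """True when the MQTT link looks stale during an active mission."""
    if phase != "run":
        return False
    # Suppress entirely during the start-of-mission grace window.
    if now - mission_started_at < MQTT_WATCHDOG_START_GRACE_SECONDS:
        return False
    return now - last_message_at >= SILENCE_THRESHOLD_SECONDS
```

With a mission started at t=0 and the last message at t=5, the sketch stays quiet at t=310 (past the silence threshold but inside the grace window) and fires at t=430.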


MAP-FONT — embedded font for map rendering

map_renderer.py rendered all zone/wall/door/obstacle labels with PIL's
tiny, non-anti-aliased bitmap default font. Now uses a bundled DejaVu Sans
TTF (Bitstream Vera License — freely redistributable) at two preloaded
sizes. No new config option — there's no scenario where the old default
would be preferable.

ROOM-PALETTE — distinct fill colour per room

_render_rooms_png() filled every Smart Map room with the same uniform
colour. Now rotates through an 8-colour palette by region index, so
adjacent rooms are visually distinguishable even without the
xiaomi-vacuum-map-card's own label overlay. Doesn't touch the v2.7.3
decision to omit room-name labels from this PNG (colour ≠ text — no
duplicate-label risk reintroduced).
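
Rotating through a palette by region index comes down to a modulo lookup. The colour values and names below are placeholders, not the palette shipped with the integration.

```python
# Hypothetical 8-colour RGB palette; the real colour values are not
# given in these notes.
ROOM_PALETTE: list[tuple[int, int, int]] = [
    (141, 211, 199), (255, 255, 179), (190, 186, 218), (251, 128, 114),
    (128, 177, 211), (253, 180, 98), (179, 222, 105), (252, 205, 229),
]


def room_fill_colour(region_index: int) -> tuple[int, int, int]:
    """Pick a fill colour for a room; wraps around after 8 rooms."""
    return ROOM_PALETTE[region_index % len(ROOM_PALETTE)]
```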

CLEAN-ROOM-PER-ROOM-PASSES — individual two-pass setting per room

clean_room gains an optional room_passes field — a list of
{name, two_pass} entries — for setting two-pass cleaning independently
per room within a single multi-room sequence (e.g. Kitchen twice,
Hallway once, same job). Backward compatible: the existing room_name +
global two_pass fields are unchanged for the simple case.

Bug found and fixed in the same area: two_pass was documented in
services.yaml and read by the handler, but missing entirely from the
registered voluptuous schema — any call going through real schema
validation (a YAML automation, or the Developer Tools UI) was rejected
with extra keys not allowed @ data['two_pass']. This had apparently
never worked over that path. Not caught earlier because the existing
test_clean_room_action.py only exercises a hand-copied reference
implementation of the room-resolution logic, never the real registered
service end-to-end.
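
One way the two call shapes described for clean_room could be normalised into a single per-room list is sketched below. The `resolve_passes()` helper is hypothetical, as is the assumption that room_name may be either a single string or a list of strings.

```python
from typing import Any


def resolve_passes(data: dict[str, Any]) -> list[tuple[str, bool]]:
    """Return (room name, two_pass) pairs for a clean_room call."""
    if "room_passes" in data:
        # Per-room form: each entry carries its own two_pass flag.
        return [
            (entry["name"], bool(entry.get("two_pass", False)))
            for entry in data["room_passes"]
        ]
    # Simple form: one global two_pass flag applied to every room.
    names = data["room_name"]
    if isinstance(names, str):
        names = [names]
    two_pass = bool(data.get("two_pass", False))
    return [(name, two_pass) for name in names]
```

For the example in the text, Kitchen twice and Hallway once in the same job, the per-room form yields `[("Kitchen", True), ("Hallway", False)]`.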

REST980-MIGRATE — import room names from roomba_rest980

A new options-flow step, shown only when a Smart Map robot is configured
and an existing roomba_rest980 installation is detected
(hass.config_entries.async_entries("roomba_rest980")). Reads room names
straight from that integration's own select.* entities via the state
machine (their room_data attribute) and pre-fills smart_zone_labels
for anything not already named here — read-only, never writes to or
calls services on the foreign integration, and never overwrites a name
already assigned through our own naming workflow. Closing note in the
flow suggests the old roomba_rest980 setup (plus its external rest980
relay container) can be removed once done.

Side discovery: this is the first test to ever import config_flow.py
directly rather than against a hand-copied reference implementation. On
our Python 3.12 test environment (capped at HA core 2025.1.4 — newer
requires Python ≥3.13), helpers.service_info.dhcp/.zeroconf don't
exist yet; added as Shim 4 in conftest.py, aliasing to the equivalent
classes under homeassistant.components.dhcp/.zeroconf. Test
infrastructure only — no production code affected.

ZONE-LAYER-CACHE — cached room-polygon rendering

RoombaRoomsImage._render_rooms_png() re-rendered the full PIL room
layer on every async_image() call (every frontend poll/refresh) even
though room polygons only change on map retrain. Now cached, keyed by
(pmap_version_id, aligned), including the transform parameters the
cache entry was computed with (so calibration_points/_to_px_last
attribute consistency holds on a cache hit, not just the PNG itself).
Known, documented limitation: assumes the pose-space transform is stable
for a given pmap_version_id once aligned=True is reached — not
expected to drift in practice, but not structurally prevented either.
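
A single-entry cache keyed by (pmap_version_id, aligned) is enough to show the idea. The `RoomLayerCache` class and the render callback's return shape (PNG bytes plus the transform parameters used) are assumptions for this sketch.

```python
from typing import Callable, Optional

Rendered = tuple[bytes, dict]  # (PNG bytes, transform parameters used)


class RoomLayerCache:
    """Re-renders only when the map version or alignment state changes."""

    def __init__(self, render: Callable[[str, bool], Rendered]) -> None:
        self._render = render
        self._key: Optional[tuple[str, bool]] = None
        self._value: Optional[Rendered] = None

    def get(self, pmap_version_id: str, aligned: bool) -> Rendered:
        key = (pmap_version_id, aligned)
        if key != self._key or self._value is None:
            # Cache miss: render once and keep the transform alongside
            # the PNG so both stay consistent on later hits.
            self._value = self._render(pmap_version_id, aligned)
            self._key = key
        return self._value
```

Storing the transform parameters next to the image is what lets a cache hit return attributes that match the PNG it serves.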


Tests

2750 passing, 49 test files (45 in v2.8.7 → 49: test_rest980_migrate.py
new). +38 net: MAP-FONT +3, ROOM-PALETTE +2, CLEAN-ROOM-PER-ROOM-PASSES +8,
REST980-MIGRATE +16, ZONE-LAYER-CACHE +5, THREAD-SAFETY-FIX +1,
MQTT-WATCHDOG-START-GRACE +3. Four pre-existing tests in
test_mission_timer_store.py needed hass.loop changed from None to
MagicMock() — they exercise the AUTO-ADVANCE-ROOM event-firing path for
the first time now that it actually touches `has...
