Skip to content

Show ADC recovery instead of stuck zero#47

Merged
tadelv merged 2 commits into
decentespresso:developfrom
skialpine:codex/zero-stuck-adc-recovery
May 8, 2026
Merged

Show ADC recovery instead of stuck zero#47
tadelv merged 2 commits into
decentespresso:developfrom
skialpine:codex/zero-stuck-adc-recovery

Conversation

@skialpine
Copy link
Copy Markdown
Contributor

@skialpine skialpine commented May 5, 2026

Summary

This PR hardens firmware-side ADC sample-loss and display-zero recovery paths. It does not address the separate U21 cold-solder failure mode where DRDY continues clocking and raw readings are pinned at 0xFFFFFF.

Changes:

  • releases the quick-zero display hold if a nonblocking tare does not complete in time
  • uses a dedicated quick-zero start timestamp instead of reusing the button tare timestamp
  • detects when raw weight has moved but the filtered/displayed value remains pinned at zero, then resets the output filters and publishes the raw weight
  • marks ADC sample-loss recovery on the OLED as ADC RECOVER, then ADC ERROR / CHECK SCALE after repeated recovery attempts
  • suppresses remote weight publishes while ADC recovery is active, preserving existing BLE/USB/WebSocket payload formats instead of sending fake zero or a sentinel value
  • resets ADC recovery state when the ADC stack is explicitly initialized or reset

Why

The existing firmware could make genuine ADC sample-loss and unresolved tare/filter states look like a legitimate 0.0g reading. This PR makes those firmware-detectable states visible locally and avoids publishing misleading zero-weight updates remotely.

Per review feedback, the field-reported stuck-zero case tied to a U21 cold solder joint is a different hardware failure mode. In that case the ADC can continue producing ready samples, so scale.update() may keep succeeding and this PR's sample-loss recovery path will not trigger.

Validation

  • Built successfully with platformio run

@skialpine skialpine force-pushed the codex/zero-stuck-adc-recovery branch 2 times, most recently from 2a86733 to 5a9b5e5 Compare May 5, 2026 16:12
Copy link
Copy Markdown
Collaborator

@tadelv tadelv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work here. A few changes requested before merge.

Context note: per our internal HDS investigation, the field-reported stuck-zero is root-caused to a cold solder joint on U21 (raw pinned at 0xFFFFFF, DRDY clocking normally). In that case scale.update() keeps returning true and getSignalTimeoutFlag() never trips, and raw_weight is large (not zero), so neither of the two recovery paths added here will engage. The PR is still useful for genuine ADC sample-loss, but the PR body overstates coverage — please reword to match what the code actually catches.

Requested changes

1. t_tareByButton reuse for the quick-zero hold timeout (src/hds.ino:1110)

if (b_weight_quick_zero && millis() - t_tareByButton > QUICK_ZERO_HOLD_TIMEOUT) { ... }

t_tareByButton is set only on the physical button path. If b_weight_quick_zero is ever set true by another path (BLE/USB tare, boot, programmatic tare) without also stamping t_tareByButton, the timestamp is stale (or 0) and the timeout fires immediately on the next loop — releasing the zero hold before it ever takes effect.

Please introduce a dedicated t_quickZeroStart and stamp it everywhere b_weight_quick_zero is set true, then compare against that instead.

2. Sentinel value -999.9 leaking onto BLE / ESP-NOW / USB / WebSocket

ADC_ERROR_WEIGHT = -999.9 is now published unconditionally over every wire protocol (ble.h:556, ble.h:672, espnow.h:15, espnow.h:63, usbcomm.h:588, hds.ino:1414). Receivers (Decent app, partner integrations, websocket clients) will render a literal -999.9 g unless they're updated in lockstep, and encodeWeight() may clamp/wrap the value into something even more confusing.

Please use a non-sentinel approach: either a separate "ADC error" status flag/byte in the packet, or suppress weight publishes entirely while b_adc_recovery_active is true and let the existing connection-loss handling on the receiver side take over. Whichever you choose, coordinate with the app team before merge.

3. Globals defined in parameter.h (include/parameter.h:160-161)

bool b_adc_recovery_active = false;
uint8_t i_adc_recovery_count = 0;

Non-static definitions in a header. Works only because we currently have a single translation unit; it's an ODR landmine if the header is ever included from a .cpp. Please mark both static to match the surrounding pattern (e.g. b_stable_output_enabled).

4. i_adc_recovery_count only resets on a successful sample

Counter clears in the scale.update() branch only. Once it reaches ADC_ERROR_RECOVERY_COUNT, the OLED latches to ADC ERROR / CHECK SCALE and stays there across sleep/wake, USB re-init, and BLE reconnects until a new sample lands. Please reset the counter (and b_adc_recovery_active) on the same code paths that reinitialize the ADC stack — at minimum the charging-wake recovery path and any explicit scale.begin()/reset entry points.

5. Whitespace-only reformatting

Large regions of pureScale, loop, updateOled, and drawTime are reindented with no behavioral change. Please revert the whitespace-only churn so the diff (and future git blame) reflects only the real changes.

6. (char*) cast in drawAdcRecovery (src/hds.ino)

u8g2.drawStr(AC((char*)line1), AM() - 12, line1);

The cast discards const to satisfy the AC() macro signature. Please change AC() to take const char* instead — the cast hides the real fix and the macro almost certainly doesn't mutate its argument.

Optional follow-up

While we're here: the cheapest improvement for the documented field bug is surfacing the raw ADC value in the Doctor report (test #5 in the HDS NWD note). Costs zero new firmware code and lets Doctor distinguish 0xFFFFFF (cold joint / open input) from 0x000000 (shorted) from a true mid-scale hang. Worth a separate PR.

@skialpine skialpine force-pushed the codex/zero-stuck-adc-recovery branch from 5a9b5e5 to fea4e16 Compare May 5, 2026 17:10
@tadelv tadelv merged commit e81ecd9 into decentespresso:develop May 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants