From 80355f32abce160800b7ee1540c7fff6ca7d341b Mon Sep 17 00:00:00 2001 From: agessaman Date: Sun, 10 May 2026 19:19:17 -0700 Subject: [PATCH 1/7] Add fault alert functionality for WiFi and MQTT disconnections Implemented a new fault alert system that broadcasts notifications over LoRa when WiFi or MQTT connections are down for a specified duration. The alerts are configurable via CLI commands, allowing operators to set private PSKs or hashtags for alert channels. Default settings for alert thresholds and intervals are established, and the system ensures that alerts do not spam the public channel. Updated relevant files to integrate this feature into the MyMesh implementations and CLI handling. --- MQTT_IMPLEMENTATION.md | 93 ++++++++ examples/simple_repeater/MyMesh.cpp | 21 ++ examples/simple_repeater/MyMesh.h | 11 + examples/simple_room_server/MyMesh.cpp | 9 + src/helpers/AlertReporter.cpp | 292 +++++++++++++++++++++++++ src/helpers/AlertReporter.h | 89 ++++++++ src/helpers/CommonCLI.cpp | 209 +++++++++++++++++- src/helpers/CommonCLI.h | 24 ++ src/helpers/bridges/MQTTBridge.cpp | 23 ++ src/helpers/bridges/MQTTBridge.h | 23 ++ 10 files changed, 792 insertions(+), 2 deletions(-) create mode 100644 src/helpers/AlertReporter.cpp create mode 100644 src/helpers/AlertReporter.h diff --git a/MQTT_IMPLEMENTATION.md b/MQTT_IMPLEMENTATION.md index c562179df8..6d9efef610 100644 --- a/MQTT_IMPLEMENTATION.md +++ b/MQTT_IMPLEMENTATION.md @@ -678,6 +678,99 @@ set timezone EST # Abbreviation set timezone UTC-5 # UTC offset ``` +## Fault Alerts (Group Channel) + +The repeater can broadcast a one-line fault notification on a configured group channel when WiFi or any active MQTT slot has been disconnected longer than a configurable threshold. + +The alert is sent over **LoRa** as a `PAYLOAD_TYPE_GRP_TXT` flood packet on the configured channel (with sender = device name) — *not* over MQTT. 
This is intentional: the MQTT path is what's broken, so the only working delivery is the mesh itself. Anyone in radio range subscribed to the same channel/hashtag in their companion app will see the alert inline with normal channel chat. + +> **The default Public channel is intentionally NOT supported.** Fault alerts are operator-infrastructure noise — broadcasting them on the well-known Public PSK would spam every node in the area. The implementation explicitly rejects the Public PSK (`izOH6cXN6mrJ5e26oRXNcg==`) at both the CLI validation step and the alert-send path. You must explicitly point alerts at a **private PSK** (`set alert.psk`) or a **hashtag channel** (`set alert.hashtag`) before alerts can fire. + +### What triggers an alert + +- **WiFi**: continuously down for at least `alert.wifi` minutes (default 30) +- **MQTT slot N**: enabled, has connected at least once since boot, and has been disconnected for at least `alert.mqtt` minutes (default 240, i.e. 4 h) + +A "recovered" message is sent once when the underlying connection comes back. After firing, a fault is rate-limited by `alert.interval` (default 60 minutes) before it can re-fire — this prevents flapping links from spamming the channel. + +### Defaults + +| Setting | Default | Notes | +|---------|---------|-------| +| `alert` | `off` | Master enable for automatic fault alerts | +| `alert.psk` | *(unset)* | Private base64 PSK (24 or 44 chars). The active channel key. | +| `alert.hashtag` | *(unset)* | Informational only; set via `set alert.hashtag` to pre-derive `alert.psk` from `sha256("#name")[0..15]`. Cleared when `alert.psk` is set directly. | +| `alert.wifi` | `30` (min) | 0 disables WiFi alerts | +| `alert.mqtt` | `240` (min) | 0 disables MQTT alerts | +| `alert.interval` | `60` (min) | Minutes between repeat alerts of the same fault. **Hard floor of 60 min** so a flapping link can't spam the mesh; the CLI rejects lower values and AlertReporter clamps stale prefs at runtime. 
| + +> `alert.psk` is unset on a fresh flash. **Alerts cannot fire and `alert test` will refuse to send until you configure either `alert.psk` directly or `alert.hashtag` (which derives one).** The sender shown on outgoing alert messages is always the node name (`set name ...`); there is no separate `alert.name`. + +### CLI + +Get: +- `get alert` — master on/off +- `get alert.psk` — the active base64 PSK (or `(unset)`) +- `get alert.hashtag` — the originating hashtag (or `(unset)`, e.g. after `set alert.psk` overrides the hashtag-derived key) +- `get alert.wifi` / `get alert.mqtt` / `get alert.interval` + +Set: +- `set alert on` / `set alert off` +- `set alert.psk ` — 24-char (16-byte) or 44-char (32-byte) base64; rejects the well-known Public PSK. Clears `alert.hashtag` since the new key is operator-supplied. +- `set alert.psk` (no argument) — clears both `alert.psk` and `alert.hashtag` +- `set alert.hashtag ` — derives the 16-byte key from `sha256("#name")` *once*, stores it as `alert.psk`, and remembers the hashtag for `get alert.hashtag`. `#` prefix is added if omitted (so `alerts` and `#alerts` are equivalent). +- `set alert.hashtag` (no argument) — clears both `alert.psk` and `alert.hashtag` +- `set alert.wifi ` (0–1440; 0 = disabled) +- `set alert.mqtt ` (0–10080; 0 = disabled) +- `set alert.interval ` (60–10080; 60-minute floor to protect mesh airtime) + +Action: +- `alert test` — send a one-off `[test] alert channel ok` immediately on the configured channel; ignores `alert on/off` so operators can verify the channel before enabling fault firing. Returns an error if no channel is configured. +- `alert test ` — send a custom test message: `[test] `. 
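The `set alert.psk` rules above (24 or 44 base64 characters, decoding to 16 or 32 bytes, with the Public PSK refused) can be sketched as a standalone check. This is a minimal illustration, not the firmware's code: `decodeBase64`, `PskCheck`, and `checkAlertPsk` are hypothetical names, and the decoder here is a simplified stand-in for the one in `AlertReporter.cpp`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not the firmware's decoder): decodes standard base64,
// ignoring trailing '=' padding. Returns an empty vector on an invalid character.
static std::vector<uint8_t> decodeBase64(const std::string& in) {
  static const char* kAlphabet =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string body = in;
  while (!body.empty() && body.back() == '=') body.pop_back();

  std::vector<uint8_t> out;
  uint32_t buffer = 0;
  int bits = 0;
  for (char c : body) {
    const char* p = (c == '\0') ? nullptr : std::strchr(kAlphabet, c);
    if (p == nullptr) return {};
    buffer = (buffer << 6) | static_cast<uint32_t>(p - kAlphabet);
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out.push_back(static_cast<uint8_t>((buffer >> bits) & 0xFF));
    }
    assert(bits >= 0 && bits < 8);  // at most one partial byte is pending
  }
  return out;
}

// Hypothetical result type for this sketch.
enum class PskCheck { Ok, BadLength, BadEncoding, PublicKeyRefused };

// Applies the documented rules in order: character count, decoded size,
// then the explicit refusal of the well-known Public PSK.
static PskCheck checkAlertPsk(const std::string& psk) {
  if (psk.size() != 24 && psk.size() != 44) return PskCheck::BadLength;
  const std::vector<uint8_t> key = decodeBase64(psk);
  if (key.size() != 16 && key.size() != 32) return PskCheck::BadEncoding;
  if (key == decodeBase64("izOH6cXN6mrJ5e26oRXNcg==")) {
    return PskCheck::PublicKeyRefused;
  }
  return PskCheck::Ok;
}
```

Comparing decoded bytes rather than strings means the Public key is caught however it was spelled, which is the same reason the patch repeats the check on the send path.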
+ +### Example: dedicated hashtag channel (recommended for operator groups) + +```bash +set alert.hashtag ops-alerts # stored as "#ops-alerts"; key = sha256("#ops-alerts")[0..15] +set alert.wifi 10 # tighter for ops monitoring +set alert.mqtt 60 +set alert on +alert test +``` + +Anyone running a companion app and subscribed to the `#ops-alerts` hashtag channel will see the alerts inline. + +### Example: dedicated alerts channel with a private PSK + +Generate a 16-byte random PSK and base64-encode it (24 chars), or use the companion app's "Add channel" feature to create one and copy the secret. Then: + +```bash +set alert.psk +set alert.wifi 10 +set alert.mqtt 60 +set alert on +alert test +``` + +Subscribers running a MeshCore companion app should add a channel with the same PSK; alerts will appear in that channel's chat view. (Pick any local name for it — the sender of incoming alert messages is the repeater's node name.) + +### Sample messages + +``` +MyObserver: WiFi down 47m (reason 201) +MyObserver: WiFi recovered after 1h3m +MyObserver: MQTT slot 1 (analyzer-us) down 4h12m +MyObserver: MQTT slot 1 (analyzer-us) recovered after 4h45m +``` + +### Notes + +- A reboot during an outage resets the timer; the alert won't double-fire because `millis()` starts at 0 at boot. The fault must persist `alert.wifi` / `alert.mqtt` minutes from boot. +- Fault state is stored in RAM only — no persistence across reboots. +- The MQTT-slot watcher uses a separate per-slot `current_outage_started_ms` field that is reset on each reconnect, distinct from the `first_disconnect_time` shown in `mqttN.diag` (which remains a "first disconnect since boot" counter for diagnostics). +- WiFi-down alerts can only be delivered if the LoRa radio is up. There is no fallback path. +- The default Public PSK is **rejected** at both `set alert.psk` and at the alert-send path, so even if you somehow set it via a saved config file, the firmware will silently refuse to broadcast on it. 
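The trigger, recovery, and rate-limit behavior described in this section can be modeled as a small edge-triggered state machine. The sketch below is a simplified model under stated assumptions, not the firmware's `AlertReporter`: `FaultWatcher`, `Event`, and `minutes` are hypothetical names, an outage start of 0 is taken to mean "link is up", and the re-arm window is applied only after a first firing.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Convenience for this sketch: minutes to milliseconds.
static constexpr uint64_t minutes(uint64_t n) { return n * 60000ULL; }

// Formats a duration in the "47m" / "1h3m" style of the sample messages.
static std::string formatAge(uint64_t age_ms) {
  const uint64_t secs = age_ms / 1000;
  const unsigned long long h = secs / 3600;
  const unsigned long long m = (secs % 3600) / 60;
  char buf[32];
  if (h > 0) {
    std::snprintf(buf, sizeof(buf), "%lluh%llum", h, m);
  } else {
    std::snprintf(buf, sizeof(buf), "%llum", m);
  }
  return buf;
}

enum class Event { None, Down, Recovered };

// Hypothetical model of one watched link (WiFi or a single MQTT slot).
class FaultWatcher {
public:
  FaultWatcher(uint64_t threshold_ms, uint64_t min_interval_ms)
      : threshold_ms_(threshold_ms), min_interval_ms_(min_interval_ms) {
    // A threshold of 0 means "disabled" in the CLI; callers would skip the
    // watcher entirely in that case.
    assert(threshold_ms_ > 0);
  }

  // outage_start_ms == 0 means the link is currently up (assumption).
  Event tick(uint64_t now_ms, uint64_t outage_start_ms) {
    const bool down = outage_start_ms != 0;
    if (!firing_) {
      const bool long_enough =
          down && (now_ms - outage_start_ms) >= threshold_ms_;
      const bool rearmed =
          !has_fired_ || (now_ms - fired_at_ms_) >= min_interval_ms_;
      if (long_enough && rearmed) {
        firing_ = true;
        has_fired_ = true;
        fired_at_ms_ = now_ms;
        return Event::Down;  // one "down" message per outage
      }
      return Event::None;
    }
    if (!down) {
      firing_ = false;
      return Event::Recovered;  // one "recovered" message on the way back
    }
    return Event::None;
  }

private:
  uint64_t threshold_ms_;
  uint64_t min_interval_ms_;
  bool firing_ = false;
  bool has_fired_ = false;
  uint64_t fired_at_ms_ = 0;
};
```

With a 30-minute threshold and a 60-minute interval, an outage that begins shortly after a recovery stays silent until 60 minutes have passed since the previous "down" message, even if it has already exceeded the threshold.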
+ ## SNMP Monitoring Observer nodes include an optional SNMP v2c agent that exposes radio stats, MQTT connectivity, memory usage, and network information to standard monitoring tools. See [MQTT_SNMP.md](MQTT_SNMP.md) for setup and OID reference. diff --git a/examples/simple_repeater/MyMesh.cpp b/examples/simple_repeater/MyMesh.cpp index ecad3cd67f..3e0a2f0eae 100644 --- a/examples/simple_repeater/MyMesh.cpp +++ b/examples/simple_repeater/MyMesh.cpp @@ -947,6 +947,19 @@ MyMesh::MyMesh(mesh::MainBoard &board, mesh::Radio &radio, mesh::MillisecondCloc #endif _prefs.radio_watchdog_minutes = 5; // 5 minutes default + // Alert channel defaults — disabled by default, and the channel is left + // unconfigured so a freshly-flashed observer never broadcasts on the + // well-known Public hashtag. Operators must explicitly pick a private + // key (`set alert.psk`) or a hashtag (`set alert.hashtag`) before alerts + // can fire. The sender prefix on outgoing alert messages is always the + // node name (`set name ...`), so there's no separate `alert.name`. + _prefs.alert_enabled = 0; + _prefs.alert_psk_b64[0] = '\0'; + _prefs.alert_hashtag[0] = '\0'; + _prefs.alert_wifi_minutes = 30; // 30 minutes + _prefs.alert_mqtt_minutes = 240; // 4 hours + _prefs.alert_min_interval_min = 60; // re-arm window: 1 hour + // bridge defaults _prefs.bridge_enabled = 1; // enabled _prefs.bridge_delay = 500; // milliseconds @@ -1074,6 +1087,12 @@ void MyMesh::begin(FILESYSTEM *fs) { } #endif + // Wire fault-alert reporter. begin() is safe regardless of bridge state. 
+ _alerter.begin(&_prefs, this); +#if defined(WITH_MQTT_BRIDGE) + _alerter.setBridge(bridge); +#endif + radio_driver.setParams(_prefs.freq, _prefs.bw, _prefs.sf, _prefs.cr); radio_driver.setTxPower(_prefs.tx_power_dbm); @@ -1425,6 +1444,8 @@ void MyMesh::loop() { uptime_millis += now - last_millis; last_millis = now; + _alerter.onLoop(now); + #ifdef WITH_SNMP // Push radio stats to SNMP agent every 2 seconds if (_snmp_agent.isRunning()) { diff --git a/examples/simple_repeater/MyMesh.h b/examples/simple_repeater/MyMesh.h index 4d3e6d4e36..a4cb6fa73d 100644 --- a/examples/simple_repeater/MyMesh.h +++ b/examples/simple_repeater/MyMesh.h @@ -34,6 +34,7 @@ #endif #include +#include #include #include #include @@ -130,6 +131,7 @@ class MyMesh : public mesh::Mesh, public CommonCLICallbacks { #ifdef WITH_SNMP MeshSNMPAgent _snmp_agent; #endif + AlertReporter _alerter; void putNeighbour(const mesh::Identity& id, uint32_t timestamp, float snr); uint8_t handleLoginReq(const mesh::Identity& sender, const uint8_t* secret, uint32_t sender_timestamp, const uint8_t* data, bool is_flood); @@ -211,6 +213,9 @@ class MyMesh : public mesh::Mesh, public CommonCLICallbacks { // CommonCLICallbacks void applyTempRadioParams(float freq, float bw, uint8_t sf, uint8_t cr, int timeout_mins) override; + + void onAlertConfigChanged() override { _alerter.onConfigChanged(); } + bool sendAlertText(const char* text) override { return _alerter.sendText(text); } bool formatFileSystem() override; void sendSelfAdvertisement(int delay_millis, bool flood) override; void updateAdvertTimer() override; @@ -265,10 +270,16 @@ class MyMesh : public mesh::Mesh, public CommonCLICallbacks { bridge->setStatsSources(this, _radio, _cli.getBoard(), _ms); #endif bridge->begin(); +#ifdef WITH_MQTT_BRIDGE + _alerter.setBridge(bridge); +#endif } else { bridge->end(); +#ifdef WITH_MQTT_BRIDGE + _alerter.setBridge(nullptr); +#endif } } diff --git a/examples/simple_room_server/MyMesh.cpp 
b/examples/simple_room_server/MyMesh.cpp index fe2af05353..9df3efaaca 100644 --- a/examples/simple_room_server/MyMesh.cpp +++ b/examples/simple_room_server/MyMesh.cpp @@ -675,6 +675,15 @@ MyMesh::MyMesh(mesh::MainBoard &board, mesh::Radio &radio, mesh::MillisecondCloc _prefs.gps_interval = 0; _prefs.advert_loc_policy = ADVERT_LOC_PREFS; + // Alert channel defaults (same as repeater; off by default and unconfigured). + // Operator must pick `set alert.psk` or `set alert.hashtag` before alerts fire. + _prefs.alert_enabled = 0; + _prefs.alert_psk_b64[0] = '\0'; + _prefs.alert_hashtag[0] = '\0'; + _prefs.alert_wifi_minutes = 30; + _prefs.alert_mqtt_minutes = 240; + _prefs.alert_min_interval_min = 60; + // bridge defaults (same as repeater) _prefs.bridge_enabled = 1; // enabled _prefs.bridge_delay = 500; // milliseconds diff --git a/src/helpers/AlertReporter.cpp b/src/helpers/AlertReporter.cpp new file mode 100644 index 0000000000..c192a5671e --- /dev/null +++ b/src/helpers/AlertReporter.cpp @@ -0,0 +1,292 @@ +#include "AlertReporter.h" + +#include +#include +#include +#include + +// Minimal base64 decoder — kept local to avoid dragging the densaugeo/base64 +// PlatformIO dependency into every repeater env that doesn't otherwise need +// it (only chat builds with MAX_GROUP_CHANNELS pulled it in via BaseChatMesh). +// Returns the number of decoded bytes, or 0 on error. Output buffer must be +// at least (in_len * 3 / 4) bytes. 
+static int alert_decode_base64(const char* in, size_t in_len, uint8_t* out) { + static const int8_t TBL[128] = { + -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, + -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, + -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,62,-1,-1,-1,63, + 52,53,54,55,56,57,58,59,60,61,-1,-1,-1, 0,-1,-1, + -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14, + 15,16,17,18,19,20,21,22,23,24,25,-1,-1,-1,-1,-1, + -1,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40, + 41,42,43,44,45,46,47,48,49,50,51,-1,-1,-1,-1,-1 + }; + size_t pad = 0; + while (in_len > 0 && in[in_len - 1] == '=') { in_len--; pad++; } + if ((in_len + pad) % 4 != 0) return 0; + if (pad > 2) return 0; + + size_t out_pos = 0; + uint32_t buffer = 0; + int bits = 0; + for (size_t i = 0; i < in_len; i++) { + unsigned char c = (unsigned char)in[i]; + if (c >= 128) return 0; + int v = TBL[c]; + if (v < 0) return 0; + buffer = (buffer << 6) | (uint32_t)v; + bits += 6; + if (bits >= 8) { + bits -= 8; + out[out_pos++] = (uint8_t)((buffer >> bits) & 0xFF); + } + } + return (int)out_pos; +} + +// Header layout for PAYLOAD_TYPE_GRP_TXT before encryption: +// [0..3] timestamp (uint32_t LE) — also helps make packet_hash unique +// [4] TXT_TYPE_PLAIN +// [5..] ": " (null-terminated by sender for legacy parsers) +#ifndef MAX_ALERT_TEXT_LEN +// Conservative ceiling: matches BaseChatMesh::MAX_TEXT_LEN (10 * 16 = 160) and +// stays under MAX_PACKET_PAYLOAD - 4(timestamp) - 1(type) - CIPHER_MAC_SIZE - 1. +#define MAX_ALERT_TEXT_LEN 160 +#endif + +#ifndef ALERT_TXT_TYPE_PLAIN +#define ALERT_TXT_TYPE_PLAIN 0 +#endif + +#ifdef MQTT_DEBUG +#include +#define ALERT_DEBUG_PRINTLN(...) Serial.printf("Alert: " __VA_ARGS__); Serial.println() +#else +#define ALERT_DEBUG_PRINTLN(...) 
do {} while (0) +#endif + +AlertReporter::AlertReporter() + : _prefs(nullptr), _mesh(nullptr), +#ifdef WITH_MQTT_BRIDGE + _bridge(nullptr), +#endif + _next_check_ms(0) { +#ifdef WITH_MQTT_BRIDGE + memset(&_wifi, 0, sizeof(_wifi)); + memset(&_mqtt, 0, sizeof(_mqtt)); +#endif +} + +void AlertReporter::begin(NodePrefs* prefs, mesh::Mesh* mesh) { + _prefs = prefs; + _mesh = mesh; + onConfigChanged(); +} + +#ifdef WITH_MQTT_BRIDGE +void AlertReporter::setBridge(MQTTBridge* bridge) { + _bridge = bridge; +} +#endif + +// Decoded bytes of the well-known PUBLIC group PSK ("izOH6cXN6mrJ5e26oRXNcg=="). +// We refuse to use this key for fault alerts so that infrastructure alarms +// never spam every node subscribed to the default Public channel. +static const uint8_t ALERT_PUBLIC_PSK_BYTES[16] = { + 0x8B, 0x33, 0x87, 0xE9, 0xC5, 0xCD, 0xEA, 0x6A, + 0xC9, 0xE5, 0xED, 0xBA, 0xA1, 0x15, 0xCD, 0x72 +}; + +bool AlertReporter::resolveChannel(mesh::GroupChannel& out) const { + if (!_prefs) return false; + + // alert_psk_b64 is the single source of truth — `set alert.hashtag` + // pre-derives the base64 PSK from sha256("#name")[0..15] at CLI time. + const char* psk = _prefs->alert_psk_b64; + size_t psk_len = strlen(psk); + if (psk_len == 0 || psk_len >= sizeof(_prefs->alert_psk_b64)) return false; + + memset(out.secret, 0, sizeof(out.secret)); + int len = alert_decode_base64(psk, psk_len, out.secret); + if (len != 32 && len != 16) return false; + + // Hard refuse the well-known PUBLIC PSK regardless of how it was supplied, + // belt-and-suspenders against an operator pasting it into alert.psk or a + // hashtag whose hash somehow collides (astronomically improbable). + if (len == 16 && memcmp(out.secret, ALERT_PUBLIC_PSK_BYTES, 16) == 0) { + ALERT_DEBUG_PRINTLN("refused PUBLIC PSK for alert channel"); + return false; + } + + // PATH_HASH_SIZE bytes — same scheme used by addChannel(). 
+ mesh::Utils::sha256(out.hash, sizeof(out.hash), out.secret, len); + return true; +} + +void AlertReporter::onConfigChanged() { + // Reset transient state so a config change re-arms the edge detector. +#ifdef WITH_MQTT_BRIDGE + _wifi.state = OK; + _wifi.fired_at_ms = 0; + for (size_t i = 0; i < sizeof(_mqtt) / sizeof(_mqtt[0]); i++) { + _mqtt[i].state = OK; + _mqtt[i].fired_at_ms = 0; + } +#endif +} + +bool AlertReporter::sendChannel(const char* text) { + if (!_mesh || !_prefs) return false; + + mesh::GroupChannel channel; + if (!resolveChannel(channel)) return false; + + // Build ": " plaintext payload. Sender = node name (current). + uint8_t buf[5 + MAX_ALERT_TEXT_LEN + 32]; + uint32_t timestamp = _mesh->getRTCClock()->getCurrentTime(); + memcpy(buf, &timestamp, 4); + buf[4] = ALERT_TXT_TYPE_PLAIN; + + const char* sender = _prefs->node_name[0] ? _prefs->node_name : "node"; + int n = snprintf((char*)&buf[5], MAX_ALERT_TEXT_LEN, "%s: %s", sender, text); + if (n < 0) return false; + if (n >= MAX_ALERT_TEXT_LEN) n = MAX_ALERT_TEXT_LEN - 1; + + mesh::Packet* pkt = _mesh->createGroupDatagram(PAYLOAD_TYPE_GRP_TXT, channel, + buf, 5 + (size_t)n); + if (!pkt) { + ALERT_DEBUG_PRINTLN("createGroupDatagram failed (pool empty?)"); + return false; + } + _mesh->sendFlood(pkt); + ALERT_DEBUG_PRINTLN("sent: %s", text); + return true; +} + +bool AlertReporter::sendText(const char* text) { + // sendText() is the manual entry point (`alert test` CLI). Deliberately + // does NOT check alert_enabled so operators can verify the PSK / hashtag + // setup without enabling automatic fault firing. 
+ if (!_prefs || !text || !*text) return false; + return sendChannel(text); +} + +void AlertReporter::formatAge(unsigned long age_ms, char* out, size_t out_size) const { + unsigned long secs = age_ms / 1000UL; + unsigned long h = secs / 3600UL; + unsigned long m = (secs % 3600UL) / 60UL; + if (h > 0) { + snprintf(out, out_size, "%luh%lum", h, m); + } else { + snprintf(out, out_size, "%lum", m); + } +} + +void AlertReporter::onLoop(unsigned long now_ms) { + if (!_prefs || !_prefs->alert_enabled) return; + if (!_mesh) return; + + // Throttle: ~5 s cadence. The thresholds are minutes-scale so this is fine. + if ((long)(now_ms - _next_check_ms) < 0) return; + _next_check_ms = now_ms + 5000UL; + +#ifdef WITH_MQTT_BRIDGE + // Clamp to a 60-minute floor regardless of what's in NodePrefs. The CLI + // already enforces this on set, but a stale prefs file or future field + // tweak shouldn't be able to drag the floor below 1 hour and let a + // flapping link spam the mesh. + uint16_t cfg_min = _prefs->alert_min_interval_min; + if (cfg_min < 60) cfg_min = 60; + unsigned long min_interval_ms = (unsigned long)cfg_min * 60000UL; + + // -------- WiFi fault -------- + if (_prefs->alert_wifi_minutes > 0) { + unsigned long wifi_disc_ms = MQTTBridge::getLastWifiDisconnectTime(); + unsigned long wifi_conn_ms = MQTTBridge::getWifiConnectedAtMillis(); + bool wifi_down = (wifi_disc_ms != 0 && wifi_conn_ms == 0); + unsigned long down_ms = wifi_down ? 
(now_ms - wifi_disc_ms) : 0; + unsigned long thresh_ms = (unsigned long)_prefs->alert_wifi_minutes * 60000UL; + + if (_wifi.state == OK) { + if (wifi_down && down_ms >= thresh_ms && + (now_ms - _wifi.fired_at_ms) >= min_interval_ms) { + char age[16]; + formatAge(down_ms, age, sizeof(age)); + uint8_t reason = MQTTBridge::getLastWifiDisconnectReason(); + char text[80]; + if (reason != 0) { + snprintf(text, sizeof(text), "WiFi down %s (reason %u)", age, (unsigned)reason); + } else { + snprintf(text, sizeof(text), "WiFi down %s", age); + } + if (sendChannel(text)) { + _wifi.state = FIRING; + _wifi.fired_at_ms = now_ms; + _wifi.last_outage_started_ms = wifi_disc_ms; + } + } + } else { // FIRING + if (!wifi_down) { + unsigned long total = (wifi_conn_ms != 0 && _wifi.last_outage_started_ms != 0) + ? (wifi_conn_ms - _wifi.last_outage_started_ms) : 0; + char age[16]; + formatAge(total, age, sizeof(age)); + char text[80]; + snprintf(text, sizeof(text), "WiFi recovered after %s", age); + sendChannel(text); + _wifi.state = OK; + } + } + } else if (_wifi.state == FIRING) { + _wifi.state = OK; // threshold disabled mid-fault: silently re-arm + } + + // -------- MQTT slot faults -------- + if (_prefs->alert_mqtt_minutes > 0 && _bridge != nullptr) { + int n = MQTTBridge::getRuntimeSlotCount(); + if (n > (int)(sizeof(_mqtt) / sizeof(_mqtt[0]))) n = (int)(sizeof(_mqtt) / sizeof(_mqtt[0])); + unsigned long thresh_ms = (unsigned long)_prefs->alert_mqtt_minutes * 60000UL; + + for (int i = 0; i < n; i++) { + Fault& f = _mqtt[i]; + if (!_bridge->isSlotEnabledAndAttempted(i)) { + if (f.state == FIRING) f.state = OK; // slot disabled mid-fault + continue; + } + unsigned long outage_start = _bridge->getSlotCurrentOutageStartMs(i); + bool down = (outage_start != 0); + unsigned long down_ms = down ? 
(now_ms - outage_start) : 0; + + if (f.state == OK) { + if (down && down_ms >= thresh_ms && + (now_ms - f.fired_at_ms) >= min_interval_ms) { + char age[16]; + formatAge(down_ms, age, sizeof(age)); + char text[100]; + snprintf(text, sizeof(text), "MQTT slot %d (%s) down %s", + i + 1, _bridge->getSlotPresetName(i), age); + if (sendChannel(text)) { + f.state = FIRING; + f.fired_at_ms = now_ms; + f.last_outage_started_ms = outage_start; + } + } + } else { // FIRING + if (!down) { + unsigned long total = (f.last_outage_started_ms != 0) + ? (now_ms - f.last_outage_started_ms) : 0; + char age[16]; + formatAge(total, age, sizeof(age)); + char text[100]; + snprintf(text, sizeof(text), "MQTT slot %d (%s) recovered after %s", + i + 1, _bridge->getSlotPresetName(i), age); + sendChannel(text); + f.state = OK; + } + } + } + } +#else + (void)now_ms; +#endif +} diff --git a/src/helpers/AlertReporter.h b/src/helpers/AlertReporter.h new file mode 100644 index 0000000000..e1981a86a8 --- /dev/null +++ b/src/helpers/AlertReporter.h @@ -0,0 +1,89 @@ +#pragma once + +#include +#include +#include "CommonCLI.h" + +#ifdef WITH_MQTT_BRIDGE +#include "bridges/MQTTBridge.h" +#endif + +/** + * \brief Send-only group-channel "fault alert" reporter for repeater/observer + * builds. + * + * Polls WiFi and per-MQTT-slot outage timers from MQTTBridge. When any timer + * exceeds its configured threshold, floods a single PAYLOAD_TYPE_GRP_TXT + * message on the configured alert channel ("MyObserver: WiFi down 47m"), + * then arms a "recovered" message for the next state transition. + * + * The alert channel must be explicitly configured to either a private base64 + * PSK (`set alert.psk`) or a hashtag name (`set alert.hashtag`); the + * well-known PUBLIC group key is rejected on purpose, since fault alerts + * would otherwise spam every node subscribed to the default Public channel. 
+ * + * Edge-triggered + rate-limited via NodePrefs::alert_min_interval_min so a + * flapping link cannot spam the channel. + * + * Designed to compile and run on any repeater build: + * - The channel-send path uses only mesh::Mesh primitives that already + * exist in the Dispatcher hierarchy (createGroupDatagram + sendFlood). + * - WiFi/MQTT polling is #ifdef WITH_MQTT_BRIDGE-gated; without it, the + * reporter still supports manual `alert test` sends. + */ +class AlertReporter { +public: + AlertReporter(); + + /** + * Wire up the reporter. Must be called from MyMesh::begin() after prefs + * are loaded. The node name is read from \a prefs at send time, so a + * subsequent rename (set name) is reflected automatically. + */ + void begin(NodePrefs* prefs, mesh::Mesh* mesh); + +#ifdef WITH_MQTT_BRIDGE + /** Bridge can be (re)created lazily; pass nullptr to detach. */ + void setBridge(MQTTBridge* bridge); +#endif + + /** + * Reset transient fault state so a config change re-arms the edge detector + * (the channel is resolved per send). Call from the CLI hot-reload hook after `set alert.psk` / `set alert.hashtag` / `set alert on|off`. + */ + void onConfigChanged(); + + /** + * Cooperative tick. Fast: returns immediately if disabled, throttled + * internally to ~5 s checks. Safe to call every loop(). + */ + void onLoop(unsigned long now_ms); + + /** + * Send an arbitrary text immediately (used by `alert test` CLI). Returns + * false when the text is empty, the PSK is invalid, or no packet can be + * allocated. Ignores alert_enabled; bypasses the rate limiter and edge logic. 
+ */ + bool sendText(const char* text); + +private: + bool resolveChannel(mesh::GroupChannel& out) const; + bool sendChannel(const char* text); + void formatAge(unsigned long age_ms, char* out, size_t out_size) const; + + enum FaultState { OK, FIRING }; + struct Fault { + FaultState state; + unsigned long fired_at_ms; // millis() when we last sent a "down" alert + unsigned long last_outage_started_ms; // remembered so the recovered msg can quote duration + }; + + NodePrefs* _prefs; + mesh::Mesh* _mesh; +#ifdef WITH_MQTT_BRIDGE + MQTTBridge* _bridge; + Fault _wifi; + Fault _mqtt[RUNTIME_MQTT_SLOTS]; +#endif + unsigned long _next_check_ms; +}; diff --git a/src/helpers/CommonCLI.cpp b/src/helpers/CommonCLI.cpp index 39ef6eb8fb..fd3c6c9ad6 100644 --- a/src/helpers/CommonCLI.cpp +++ b/src/helpers/CommonCLI.cpp @@ -3,6 +3,29 @@ #include "TxtDataHelpers.h" #include "AdvertDataHelpers.h" #include +#include + +// Tiny base64 encoder used by `set alert.hashtag` to render a derived 16-byte +// key into NodePrefs::alert_psk_b64. Kept inline so we don't drag the +// densaugeo/base64 PlatformIO dep into every CLI-using build. +static size_t alert_encode_base64(const uint8_t* in, size_t in_len, char* out, size_t out_size) { + static const char TBL[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"; + size_t needed = ((in_len + 2) / 3) * 4 + 1; + if (out_size < needed) return 0; + size_t o = 0; + for (size_t i = 0; i < in_len; i += 3) { + uint32_t v = (uint32_t)in[i] << 16; + int rem = (int)(in_len - i); + if (rem > 1) v |= (uint32_t)in[i + 1] << 8; + if (rem > 2) v |= (uint32_t)in[i + 2]; + out[o++] = TBL[(v >> 18) & 0x3F]; + out[o++] = TBL[(v >> 12) & 0x3F]; + out[o++] = (rem > 1) ? TBL[(v >> 6) & 0x3F] : '='; + out[o++] = (rem > 2) ? 
TBL[v & 0x3F] : '='; + } + out[o] = '\0'; + return o; +} #ifndef BRIDGE_MAX_BAUD #define BRIDGE_MAX_BAUD 115200 @@ -278,7 +301,28 @@ void CommonCLI::loadPrefsInt(FILESYSTEM* fs, const char* filename) { if (file.available() >= (int)sizeof(_prefs->radio_watchdog_minutes)) { file.read((uint8_t *)&_prefs->radio_watchdog_minutes, sizeof(_prefs->radio_watchdog_minutes)); // 316 } - // next: 317 + // Alert channel fields (appended; older files won't have them — defaults from MyMesh ctor remain) + if (file.available() >= (int)sizeof(_prefs->alert_enabled)) { + file.read((uint8_t *)&_prefs->alert_enabled, sizeof(_prefs->alert_enabled)); + } + if (file.available() >= (int)sizeof(_prefs->alert_psk_b64)) { + file.read((uint8_t *)&_prefs->alert_psk_b64, sizeof(_prefs->alert_psk_b64)); + } + if (file.available() >= (int)sizeof(_prefs->alert_wifi_minutes)) { + file.read((uint8_t *)&_prefs->alert_wifi_minutes, sizeof(_prefs->alert_wifi_minutes)); + } + if (file.available() >= (int)sizeof(_prefs->alert_mqtt_minutes)) { + file.read((uint8_t *)&_prefs->alert_mqtt_minutes, sizeof(_prefs->alert_mqtt_minutes)); + } + if (file.available() >= (int)sizeof(_prefs->alert_min_interval_min)) { + file.read((uint8_t *)&_prefs->alert_min_interval_min, sizeof(_prefs->alert_min_interval_min)); + } + if (file.available() >= (int)sizeof(_prefs->alert_hashtag)) { + file.read((uint8_t *)&_prefs->alert_hashtag, sizeof(_prefs->alert_hashtag)); + } + // ensure null termination after raw read + _prefs->alert_psk_b64[sizeof(_prefs->alert_psk_b64) - 1] = '\0'; + _prefs->alert_hashtag[sizeof(_prefs->alert_hashtag) - 1] = '\0'; // sanitise bad pref values _prefs->rx_delay_base = constrain(_prefs->rx_delay_base, 0, 20.0f); @@ -401,7 +445,13 @@ void CommonCLI::savePrefs(FILESYSTEM* fs) { file.write((uint8_t *)&_prefs->snmp_enabled, sizeof(_prefs->snmp_enabled)); // 291 file.write((uint8_t *)&_prefs->snmp_community, sizeof(_prefs->snmp_community)); // 292 file.write((uint8_t *)&_prefs->radio_watchdog_minutes, 
sizeof(_prefs->radio_watchdog_minutes)); // 316 - // next: 317 + // Alert channel fields (appended) + file.write((uint8_t *)&_prefs->alert_enabled, sizeof(_prefs->alert_enabled)); + file.write((uint8_t *)&_prefs->alert_psk_b64, sizeof(_prefs->alert_psk_b64)); + file.write((uint8_t *)&_prefs->alert_wifi_minutes, sizeof(_prefs->alert_wifi_minutes)); + file.write((uint8_t *)&_prefs->alert_mqtt_minutes, sizeof(_prefs->alert_mqtt_minutes)); + file.write((uint8_t *)&_prefs->alert_min_interval_min, sizeof(_prefs->alert_min_interval_min)); + file.write((uint8_t *)&_prefs->alert_hashtag, sizeof(_prefs->alert_hashtag)); file.close(); } @@ -808,6 +858,21 @@ void CommonCLI::handleCommand(uint32_t sender_timestamp, char* command, char* re } else if (memcmp(command, "clear stats", 11) == 0) { _callbacks->clearStats(); strcpy(reply, "(OK - stats reset)"); + } else if (memcmp(command, "alert test", 10) == 0 && (command[10] == 0 || command[10] == ' ')) { + // Send a one-off test alert on the configured alert channel. + const char* extra = command[10] == ' ' ? &command[11] : ""; + char text[120]; + if (*extra) { + snprintf(text, sizeof(text), "[test] %s", extra); + } else { + strcpy(text, "[test] alert channel ok"); + } + if (!_prefs->alert_psk_b64[0]) { + strcpy(reply, "Error: alert channel not configured (set alert.psk or set alert.hashtag)"); + } else { + bool ok = _callbacks->sendAlertText(text); + strcpy(reply, ok ? 
"OK - alert sent" : "Error: alert send failed (bad PSK or PUBLIC key refused?)"); + } } else if (memcmp(command, "get ", 4) == 0) { handleGetCmd(sender_timestamp, command, reply); } else if (memcmp(command, "set ", 4) == 0) { @@ -1528,6 +1593,132 @@ void CommonCLI::handleSetCmd(uint32_t sender_timestamp, char* command, char* rep savePrefs(); strcpy(reply, "OK"); #endif + } else if (memcmp(config, "alert ", 6) == 0) { + // set alert on|off + const char* val = &config[6]; + if (memcmp(val, "on", 2) == 0 && (val[2] == 0 || val[2] == ' ')) { + _prefs->alert_enabled = 1; + savePrefs(); + _callbacks->onAlertConfigChanged(); + strcpy(reply, "OK - alerts on"); + } else if (memcmp(val, "off", 3) == 0 && (val[3] == 0 || val[3] == ' ')) { + _prefs->alert_enabled = 0; + savePrefs(); + _callbacks->onAlertConfigChanged(); + strcpy(reply, "OK - alerts off"); + } else { + strcpy(reply, "Error: usage set alert on|off"); + } + } else if (memcmp(config, "alert.psk", 9) == 0 && (config[9] == 0 || config[9] == ' ')) { + // `set alert.psk` with no argument clears the field (alerts then disabled + // until a new psk/hashtag is configured). + const char* val = (config[9] == ' ') ? &config[10] : ""; + while (*val == ' ') val++; + size_t len = strlen(val); + if (len == 0) { + _prefs->alert_psk_b64[0] = '\0'; + _prefs->alert_hashtag[0] = '\0'; + savePrefs(); + _callbacks->onAlertConfigChanged(); + strcpy(reply, "OK - alert.psk cleared (alerts disabled until configured)"); + } else if (len >= sizeof(_prefs->alert_psk_b64)) { + strcpy(reply, "Error: PSK too long (max 47 chars)"); + } else if (val[0] == '#') { + strcpy(reply, "Error: use 'set alert.hashtag' for hashtag channels"); + } else if (len != 24 && len != 44) { + // Quick character-count check before storing; AlertReporter will redo the + // full base64 decode and reject anything that doesn't yield 16 or 32 bytes. 
+ strcpy(reply, "Error: PSK must be 24 chars (16-byte) or 44 chars (32-byte) base64"); + } else if (strcmp(val, "izOH6cXN6mrJ5e26oRXNcg==") == 0 || + strcmp(val, "izOH6cXN6mrJ5e26oRXNcg") == 0) { + // Refuse the well-known PUBLIC group PSK — fault alerts must not spam + // every node on the default Public channel. + strcpy(reply, "Error: refusing PUBLIC PSK; pick a private key or hashtag"); + } else { + StrHelper::strncpy(_prefs->alert_psk_b64, val, sizeof(_prefs->alert_psk_b64)); + // The new PSK is operator-supplied, so any previously-derived hashtag + // name is no longer accurate provenance — drop it. + _prefs->alert_hashtag[0] = '\0'; + savePrefs(); + _callbacks->onAlertConfigChanged(); + strcpy(reply, "OK - alert.psk updated"); + } + } else if (memcmp(config, "alert.hashtag", 13) == 0 && (config[13] == 0 || config[13] == ' ')) { + const char* val = (config[13] == ' ') ? &config[14] : ""; + while (*val == ' ') val++; + size_t in_len = strlen(val); + if (in_len == 0) { + _prefs->alert_psk_b64[0] = '\0'; + _prefs->alert_hashtag[0] = '\0'; + savePrefs(); + _callbacks->onAlertConfigChanged(); + strcpy(reply, "OK - alert.hashtag cleared (alerts disabled until configured)"); + } else { + // Canonical stored form is "#name" because the leading '#' is part of + // the sha256 input (matching the companion-app hashtag-channel + // derivation in docs/companion_protocol.md). Accept the user typing + // either "alerts" or "#alerts". + char hashtag[sizeof(_prefs->alert_hashtag)]; + size_t need = (val[0] == '#') ? in_len : in_len + 1; + if (need >= sizeof(hashtag)) { + strcpy(reply, "Error: hashtag too long"); + } else { + if (val[0] == '#') { + StrHelper::strncpy(hashtag, val, sizeof(hashtag)); + } else { + hashtag[0] = '#'; + StrHelper::strncpy(&hashtag[1], val, sizeof(hashtag) - 1); + } + + // Derive the channel key once: first 16 bytes of sha256("#name"), + // then base64-encode and store in alert_psk_b64. 
We don't re-derive
+        // on every send — operators can later override with `set alert.psk`
+        // without leaving stale hashtag text behind.
+        uint8_t digest[32];
+        mesh::Utils::sha256(digest, sizeof(digest),
+                            (const uint8_t*)hashtag, (int)strlen(hashtag));
+        char b64[48];
+        size_t b64_len = alert_encode_base64(digest, 16, b64, sizeof(b64));
+        if (b64_len == 0 || b64_len >= sizeof(_prefs->alert_psk_b64)) {
+          strcpy(reply, "Error: failed to derive PSK from hashtag");
+        } else {
+          StrHelper::strncpy(_prefs->alert_hashtag, hashtag, sizeof(_prefs->alert_hashtag));
+          StrHelper::strncpy(_prefs->alert_psk_b64, b64, sizeof(_prefs->alert_psk_b64));
+          savePrefs();
+          _callbacks->onAlertConfigChanged();
+          sprintf(reply, "OK - alert.hashtag: %s", _prefs->alert_hashtag);
+        }
+      }
+    }
+  } else if (memcmp(config, "alert.wifi ", 11) == 0) {
+    int mins = (int)_atoi(&config[11]);
+    if (mins < 0 || mins > 1440) {
+      strcpy(reply, "Error: alert.wifi must be 0-1440 minutes (0=off)");
+    } else {
+      _prefs->alert_wifi_minutes = (uint16_t)mins;
+      savePrefs();
+      sprintf(reply, "OK - alert.wifi %d min%s", mins, mins == 0 ? " (disabled)" : "");
+    }
+  } else if (memcmp(config, "alert.mqtt ", 11) == 0) {
+    int mins = (int)_atoi(&config[11]);
+    if (mins < 0 || mins > 10080) {
+      strcpy(reply, "Error: alert.mqtt must be 0-10080 minutes (0=off)");
+    } else {
+      _prefs->alert_mqtt_minutes = (uint16_t)mins;
+      savePrefs();
+      sprintf(reply, "OK - alert.mqtt %d min%s", mins, mins == 0 ? " (disabled)" : "");
+    }
+  } else if (memcmp(config, "alert.interval ", 15) == 0) {
+    int mins = (int)_atoi(&config[15]);
+    // Floor at 60 min: faster re-fires would let a flapping link spam the
+    // mesh with a fresh GRP_TXT flood every minute — terrible for airtime.
+    if (mins < 60 || mins > 10080) {
+      strcpy(reply, "Error: alert.interval must be 60-10080 minutes");
+    } else {
+      _prefs->alert_min_interval_min = (uint16_t)mins;
+      savePrefs();
+      sprintf(reply, "OK - alert.interval %d min", mins);
+    }
   } else if (memcmp(config, "adc.multiplier ", 15) == 0) {
     _prefs->adc_multiplier = atof(&config[15]);
     if (_board->setAdcMultiplier(_prefs->adc_multiplier)) {
@@ -1835,6 +2026,20 @@ void CommonCLI::handleGetCmd(uint32_t sender_timestamp, char* command, char* rep
 #else
     strcpy(reply, "ERROR: unsupported");
 #endif
+  } else if (memcmp(config, "alert.hashtag", 13) == 0) {
+    sprintf(reply, "> %s", _prefs->alert_hashtag[0] ? _prefs->alert_hashtag : "(unset)");
+  } else if (memcmp(config, "alert.psk", 9) == 0) {
+    sprintf(reply, "> %s", _prefs->alert_psk_b64[0] ? _prefs->alert_psk_b64 : "(unset)");
+  } else if (memcmp(config, "alert.wifi", 10) == 0) {
+    sprintf(reply, "> %u min%s", (unsigned)_prefs->alert_wifi_minutes,
+            _prefs->alert_wifi_minutes == 0 ? " (disabled)" : "");
+  } else if (memcmp(config, "alert.mqtt", 10) == 0) {
+    sprintf(reply, "> %u min%s", (unsigned)_prefs->alert_mqtt_minutes,
+            _prefs->alert_mqtt_minutes == 0 ? " (disabled)" : "");
+  } else if (memcmp(config, "alert.interval", 14) == 0) {
+    sprintf(reply, "> %u min", (unsigned)_prefs->alert_min_interval_min);
+  } else if (memcmp(config, "alert", 5) == 0 && (config[5] == 0 || config[5] == '\n' || config[5] == '\r')) {
+    sprintf(reply, "> %s", _prefs->alert_enabled ?
"on" : "off"); } else if (memcmp(config, "adc.multiplier", 14) == 0) { float adc_mult = _board->getAdcMultiplier(); if (adc_mult == 0.0f) { diff --git a/src/helpers/CommonCLI.h b/src/helpers/CommonCLI.h index aaa7a77e0b..6e38e8c0b7 100644 --- a/src/helpers/CommonCLI.h +++ b/src/helpers/CommonCLI.h @@ -104,6 +104,20 @@ struct NodePrefs { // persisted to file uint8_t snmp_enabled; // boolean: 0=off, 1=on char snmp_community[24]; // community string (default "public") uint8_t radio_watchdog_minutes; // 0=disabled, 1-120 minutes + + // Fault alert channel (LoRa group-channel "observer status" message on prolonged WiFi/MQTT outage). + // Sent over the radio (NOT over MQTT) so the alert still works while the MQTT path is broken. + // All fields are appended at the end of NodePrefs for binary-compatible upgrades. + uint8_t alert_enabled; // 0 = off (default), 1 = on + char alert_psk_b64[48]; // base64 PSK; empty = alerts disabled. PUBLIC_GROUP_PSK is rejected. + uint16_t alert_wifi_minutes; // WiFi-down threshold in minutes (0 = disabled), default 30 + uint16_t alert_mqtt_minutes; // MQTT-down threshold in minutes (0 = disabled), default 240 (4 h) + uint16_t alert_min_interval_min; // min minutes between alerts for the same fault, default 60, floor 60 + // When the operator configures via `set alert.hashtag `, we derive + // alert_psk_b64 from sha256("#name")[0..15] once and remember the hashtag + // text here purely for `get alert.hashtag` readback. A subsequent + // `set alert.psk` clears this field so it doesn't lie about provenance. + char alert_hashtag[24]; }; #ifdef WITH_MQTT_BRIDGE @@ -274,6 +288,16 @@ class CommonCLICallbacks { virtual void setRxBoostedGain(bool enable) { // no op by default }; + + // Fault-alert channel hooks (see NodePrefs::alert_*). The default no-op + // implementations keep CLI commands harmless on builds that don't wire up + // an AlertReporter. 
+  virtual void onAlertConfigChanged() {
+    // no op by default
+  }
+  virtual bool sendAlertText(const char* /*text*/) {
+    return false; // no op by default
+  }
 };

 class CommonCLI {
diff --git a/src/helpers/bridges/MQTTBridge.cpp b/src/helpers/bridges/MQTTBridge.cpp
index e43ee162df..8a251b484b 100644
--- a/src/helpers/bridges/MQTTBridge.cpp
+++ b/src/helpers/bridges/MQTTBridge.cpp
@@ -198,6 +198,25 @@ void MQTTBridge::formatMqttStatusReply(char* buf, size_t bufsize, const NodePref
 uint8_t MQTTBridge::getLastWifiDisconnectReason() { return s_wifi_disconnect_reason; }
 unsigned long MQTTBridge::getLastWifiDisconnectTime() { return s_wifi_disconnect_time; }

+unsigned long MQTTBridge::getSlotCurrentOutageStartMs(int slot_index) const {
+  if (slot_index < 0 || slot_index >= RUNTIME_MQTT_SLOTS) return 0;
+  return _slots[slot_index].current_outage_started_ms;
+}
+
+bool MQTTBridge::isSlotEnabledAndAttempted(int slot_index) const {
+  if (slot_index < 0 || slot_index >= RUNTIME_MQTT_SLOTS) return false;
+  const MQTTSlot& s = _slots[slot_index];
+  return s.enabled && s.initial_connect_done;
+}
+
+const char* MQTTBridge::getSlotPresetName(int slot_index) const {
+  if (slot_index < 0 || slot_index >= RUNTIME_MQTT_SLOTS) return "?";
+  const MQTTSlot& s = _slots[slot_index];
+  if (s.preset && s.preset->name) return s.preset->name;
+  if (!s.enabled) return MQTT_PRESET_NONE;
+  return MQTT_PRESET_CUSTOM;
+}
+
 const char* MQTTBridge::wifiReasonStr(uint8_t reason) {
   switch (reason) {
     case 2: return "auth expired";
@@ -997,6 +1016,7 @@ void MQTTBridge::initSlotClients() {
       _slots[index].last_tls_stack_err = 0;
       _slots[index].last_sock_errno = 0;
       _slots[index].last_error_time = 0;
+      _slots[index].current_outage_started_ms = 0; // clear current-outage timer for AlertReporter
       updateCachedConnectionStatus();
       publishStatusToSlot(index);
     });
@@ -1006,6 +1026,9 @@ void MQTTBridge::initSlotClients() {
       if (_slots[index].first_disconnect_time == 0) {
         _slots[index].first_disconnect_time = millis();
       }
+      if (_slots[index].current_outage_started_ms == 0) {
+        _slots[index].current_outage_started_ms = millis();
+      }
       _slots[index].connected = false;
       updateCachedConnectionStatus();
     });
diff --git a/src/helpers/bridges/MQTTBridge.h b/src/helpers/bridges/MQTTBridge.h
index e633e1dea5..c9a511d2ed 100644
--- a/src/helpers/bridges/MQTTBridge.h
+++ b/src/helpers/bridges/MQTTBridge.h
@@ -96,6 +96,12 @@ class MQTTBridge : public BridgeBase {
     unsigned long last_error_time; // millis() of last error
     uint32_t disconnect_count; // Number of disconnect callbacks since boot
     unsigned long first_disconnect_time; // millis() of first disconnect after boot
+
+    // Current-outage timer (used by AlertReporter to fire faults after a sustained
+    // outage). Reset to 0 on each successful connect, set to millis() on first
+    // disconnect-after-connect. first_disconnect_time is intentionally separate
+    // so the existing 'mqttN.diag' "first_disc" semantics don't change.
+    unsigned long current_outage_started_ms;
   };

   MQTTSlot _slots[RUNTIME_MQTT_SLOTS];
@@ -381,6 +387,23 @@ class MQTTBridge : public BridgeBase {
   bool isReady() const;

   static unsigned long getWifiConnectedAtMillis();
+
+  /**
+   * Per-slot outage accessors used by AlertReporter to detect prolonged
+   * MQTT broker outages. Indices are 0..RUNTIME_MQTT_SLOTS-1.
+   *
+   * - getSlotCurrentOutageStartMs(): millis() of the current outage start
+   *   (0 when the slot is connected). Reset on each reconnect.
+   * - isSlotEnabledAndAttempted(): true when the slot is enabled (preset
+   *   != "none") and has reached at least one connect attempt — i.e. it is
+   *   meaningful to alarm on its connection state.
+   * - getSlotPresetName(): preset name for friendly status text. Returns
+   *   "custom"/"none"/preset->name; never null.
+   */
+  unsigned long getSlotCurrentOutageStartMs(int slot_index) const;
+  bool isSlotEnabledAndAttempted(int slot_index) const;
+  const char* getSlotPresetName(int slot_index) const;
+  static int getRuntimeSlotCount() { return RUNTIME_MQTT_SLOTS; }
   /** Resolved origin for MQTT JSON: node_name when mqtt_origin is empty, else mqtt_origin (with quote stripping). */
   static void getEffectiveMqttOrigin(const NodePrefs* prefs, char* buf, size_t buf_size);
   static void formatMqttStatusReply(char* buf, size_t bufsize, const NodePrefs* prefs);

From 16dc49fa1022fe9fe2036ca2b736eeeab5f0df1d Mon Sep 17 00:00:00 2001
From: agessaman
Date: Mon, 11 May 2026 13:45:01 -0700
Subject: [PATCH 2/7] Enhance fault alert system with region-based scoping and banned channels

Updated the fault alert functionality to include an optional region name for scoping alert floods, allowing operators to override the default scope. Introduced a list of banned channels (e.g., Public PSK, `#test`, `#bot`) to prevent spamming community channels with alerts. The implementation ensures that alerts are only sent to private PSKs or non-banned hashtags. Relevant changes were made across multiple files, including updates to the CLI for setting and retrieving the new `alert.region` preference.
---
 MQTT_IMPLEMENTATION.md                 | 29 ++++++--
 examples/simple_repeater/MyMesh.cpp    | 24 ++++++-
 examples/simple_repeater/MyMesh.h      |  1 +
 examples/simple_room_server/MyMesh.cpp | 19 ++++++
 examples/simple_room_server/MyMesh.h   |  1 +
 src/helpers/AlertReporter.cpp          | 95 ++++++++++++++++++++++----
 src/helpers/AlertReporter.h            | 30 ++++++--
 src/helpers/CommonCLI.cpp              | 66 ++++++++++++++----
 src/helpers/CommonCLI.h                | 13 ++++
 9 files changed, 239 insertions(+), 39 deletions(-)

diff --git a/MQTT_IMPLEMENTATION.md b/MQTT_IMPLEMENTATION.md
index 6d9efef610..159d6fab79 100644
--- a/MQTT_IMPLEMENTATION.md
+++ b/MQTT_IMPLEMENTATION.md
@@ -684,7 +684,23 @@ The repeater can broadcast a one-line fault notification on a configured group c

 The alert is sent over **LoRa** as a `PAYLOAD_TYPE_GRP_TXT` flood packet on the configured channel (with sender = device name) — *not* over MQTT. This is intentional: the MQTT path is what's broken, so the only working delivery is the mesh itself. Anyone in radio range subscribed to the same channel/hashtag in their companion app will see the alert inline with normal channel chat.

-> **The default Public channel is intentionally NOT supported.** Fault alerts are operator-infrastructure noise — broadcasting them on the well-known Public PSK would spam every node in the area. The implementation explicitly rejects the Public PSK (`izOH6cXN6mrJ5e26oRXNcg==`) at both the CLI validation step and the alert-send path. You must explicitly point alerts at a **private PSK** (`set alert.psk`) or a **hashtag channel** (`set alert.hashtag`) before alerts can fire.
+> **A small list of community channels is intentionally NOT supported.** Fault alerts are operator-infrastructure noise — broadcasting them on shared community channels would spam every node in the area (and on `#test` / `#bot` would amplify via well-known auto-responders).
The currently banned destinations are:
+>
+> - The well-known **Public** group PSK (`izOH6cXN6mrJ5e26oRXNcg==`)
+> - **`#test`** (`sha256("#test")[0..15]`)
+> - **`#bot`** (`sha256("#bot")[0..15]`)
+>
+> The list lives in `BANNED_ALERT_CHANNELS[]` in [src/helpers/AlertReporter.cpp](src/helpers/AlertReporter.cpp); adding a new entry is one line (label + 32 hex chars). The matcher runs at both the CLI validation step (`set alert.psk`, `set alert.hashtag`) and the alert-send path, so a saved-config bypass is still refused at runtime. You must point alerts at a **private PSK** (`set alert.psk`) or a non-banned **hashtag channel** (`set alert.hashtag`) before alerts can fire.
+
+### Scope and routing
+
+Alert floods ride the **repeater's default scope** by default (the same TransportKey used for adverts and channel broadcasts — set via `region default ...`). Operators can override on a per-alert-feature basis with `set alert.region `:
+
+- If `alert.region` is set and the name resolves via `RegionMap`, that region's TransportKey is used.
+- If `alert.region` is unset, or the name doesn't resolve, the repeater's `default_scope` is used.
+- If both are null, the alert is sent unscoped (matches the pre-scoped firmware's behavior).
+
+`alert.region` is stored as-is — it does **not** create the region. Use `region put ` first if it doesn't exist.

 ### What triggers an alert

@@ -700,6 +716,7 @@ A "recovered" message is sent once when the underlying connection comes back. Af
 | `alert` | `off` | Master enable for automatic fault alerts |
 | `alert.psk` | *(unset)* | Private base64 PSK (24 or 44 chars). The active channel key. |
 | `alert.hashtag` | *(unset)* | Informational only; set via `set alert.hashtag` to pre-derive `alert.psk` from `sha256("#name")[0..15]`. Cleared when `alert.psk` is set directly. |
+| `alert.region` | *(unset)* | Optional region name; overrides the repeater's `default_scope` for alert sends only. Empty = use `default_scope`.
Looked up lazily via `RegionMap`; unknown names silently fall back to `default_scope`. |
 | `alert.wifi` | `30` (min) | 0 disables WiFi alerts |
 | `alert.mqtt` | `240` (min) | 0 disables MQTT alerts |
 | `alert.interval` | `60` (min) | Minutes between repeat alerts of the same fault. **Hard floor of 60 min** so a flapping link can't spam the mesh; the CLI rejects lower values and AlertReporter clamps stale prefs at runtime. |
@@ -712,14 +729,17 @@ Get:
 - `get alert` — master on/off
 - `get alert.psk` — the active base64 PSK (or `(unset)`)
 - `get alert.hashtag` — the originating hashtag (or `(unset)`, e.g. after `set alert.psk` overrides the hashtag-derived key)
+- `get alert.region` — alert-only scope override (or `(unset, using default scope)`)
 - `get alert.wifi` / `get alert.mqtt` / `get alert.interval`

 Set:
 - `set alert on` / `set alert off`
-- `set alert.psk ` — 24-char (16-byte) or 44-char (32-byte) base64; rejects the well-known Public PSK. Clears `alert.hashtag` since the new key is operator-supplied.
+- `set alert.psk ` — 24-char (16-byte) or 44-char (32-byte) base64; rejects banned channels (Public, `#test`, `#bot`). Clears `alert.hashtag` since the new key is operator-supplied.
 - `set alert.psk` (no argument) — clears both `alert.psk` and `alert.hashtag`
-- `set alert.hashtag ` — derives the 16-byte key from `sha256("#name")` *once*, stores it as `alert.psk`, and remembers the hashtag for `get alert.hashtag`. `#` prefix is added if omitted (so `alerts` and `#alerts` are equivalent).
+- `set alert.hashtag ` — derives the 16-byte key from `sha256("#name")` *once*, stores it as `alert.psk`, and remembers the hashtag for `get alert.hashtag`. `#` prefix is added if omitted (so `alerts` and `#alerts` are equivalent). Refuses banned hashtag names.
 - `set alert.hashtag` (no argument) — clears both `alert.psk` and `alert.hashtag`
+- `set alert.region ` — alert-only scope override (no region-map mutation; unknown names silently fall back to `default_scope`)
+- `set alert.region` (no argument) — clear override, use `default_scope`
 - `set alert.wifi ` (0–1440; 0 = disabled)
 - `set alert.mqtt ` (0–10080; 0 = disabled)
 - `set alert.interval ` (60–10080; 60-minute floor to protect mesh airtime)
@@ -769,7 +789,8 @@ MyObserver: MQTT slot 1 (analyzer-us) recovered after 4h45m
 - Fault state is stored in RAM only — no persistence across reboots.
 - The MQTT-slot watcher uses a separate per-slot `current_outage_started_ms` field that is reset on each reconnect, distinct from the `first_disconnect_time` shown in `mqttN.diag` (which remains a "first disconnect since boot" counter for diagnostics).
 - WiFi-down alerts can only be delivered if the LoRa radio is up. There is no fallback path.
-- The default Public PSK is **rejected** at both `set alert.psk` and at the alert-send path, so even if you somehow set it via a saved config file, the firmware will silently refuse to broadcast on it.
+- Banned channels (Public, `#test`, `#bot`) are **rejected** at both `set alert.psk` / `set alert.hashtag` and at the alert-send path, so even if you somehow set one via a saved config file, the firmware will silently refuse to broadcast on it. To add another banned channel, append a row to `BANNED_ALERT_CHANNELS[]` in [src/helpers/AlertReporter.cpp](src/helpers/AlertReporter.cpp); the format is `{ "label", "32-lowercase-hex-chars" }` (compute as `printf '#name' | openssl dgst -sha256 | cut -c1-32`).
+- Alerts are sent via `sendFlood` with the resolved TransportKey codes attached, so they appear on the configured scope just like other broadcast traffic. Operators monitoring a specific region need to be subscribed to that region's scope to hear alerts.
 ## SNMP Monitoring

diff --git a/examples/simple_repeater/MyMesh.cpp b/examples/simple_repeater/MyMesh.cpp
index 3e0a2f0eae..37d1da5ced 100644
--- a/examples/simple_repeater/MyMesh.cpp
+++ b/examples/simple_repeater/MyMesh.cpp
@@ -956,6 +956,7 @@ MyMesh::MyMesh(mesh::MainBoard &board, mesh::Radio &radio, mesh::MillisecondCloc
   _prefs.alert_enabled = 0;
   _prefs.alert_psk_b64[0] = '\0';
   _prefs.alert_hashtag[0] = '\0';
+  _prefs.alert_region[0] = '\0'; // empty = use default_scope
   _prefs.alert_wifi_minutes = 30; // 30 minutes
   _prefs.alert_mqtt_minutes = 240; // 4 hours
   _prefs.alert_min_interval_min = 60; // re-arm window: 1 hour
@@ -1088,7 +1089,10 @@ void MyMesh::begin(FILESYSTEM *fs) {
 #endif

   // Wire fault-alert reporter. begin() is safe regardless of bridge state.
-  _alerter.begin(&_prefs, this);
+  // Passing `this` as the callbacks lets the reporter resolve a TransportKey
+  // scope (alert.region override, falling back to default_scope) so alert
+  // floods ride the same scope as adverts/channel messages.
+  _alerter.begin(&_prefs, this, this);
 #if defined(WITH_MQTT_BRIDGE)
   _alerter.setBridge(bridge);
 #endif
@@ -1119,6 +1123,24 @@ void MyMesh::sendFloodScoped(const TransportKey& scope, mesh::Packet* pkt, uint3
   }
 }

+bool MyMesh::resolveAlertScope(TransportKey& dest) {
+  // Prefer an explicit alert.region override; look it up lazily via
+  // RegionMap so the operator can name a region that doesn't exist yet
+  // without polluting region_map state — we just silently fall through
+  // to default_scope on miss.
+  if (_prefs.alert_region[0]) {
+    auto r = region_map.findByNamePrefix(_prefs.alert_region);
+    if (r && region_map.getTransportKeysFor(*r, &dest, 1) > 0 && !dest.isNull()) {
+      return true;
+    }
+  }
+  if (!default_scope.isNull()) {
+    dest = default_scope;
+    return true;
+  }
+  return false;
+}
+
 void MyMesh::applyTempRadioParams(float freq, float bw, uint8_t sf, uint8_t cr, int timeout_mins) {
   set_radio_at = futureMillis(2000); // give CLI reply some time to be sent back, before applying temp radio params
   pending_freq = freq;
diff --git a/examples/simple_repeater/MyMesh.h b/examples/simple_repeater/MyMesh.h
index a4cb6fa73d..57c7a877b9 100644
--- a/examples/simple_repeater/MyMesh.h
+++ b/examples/simple_repeater/MyMesh.h
@@ -216,6 +216,7 @@ class MyMesh : public mesh::Mesh, public CommonCLICallbacks {

   void onAlertConfigChanged() override { _alerter.onConfigChanged(); }
   bool sendAlertText(const char* text) override { return _alerter.sendText(text); }
+  bool resolveAlertScope(TransportKey& dest) override;
   bool formatFileSystem() override;
   void sendSelfAdvertisement(int delay_millis, bool flood) override;
   void updateAdvertTimer() override;
diff --git a/examples/simple_room_server/MyMesh.cpp b/examples/simple_room_server/MyMesh.cpp
index 9df3efaaca..34c4258b5b 100644
--- a/examples/simple_room_server/MyMesh.cpp
+++ b/examples/simple_room_server/MyMesh.cpp
@@ -680,6 +680,7 @@ MyMesh::MyMesh(mesh::MainBoard &board, mesh::Radio &radio, mesh::MillisecondCloc
   _prefs.alert_enabled = 0;
   _prefs.alert_psk_b64[0] = '\0';
   _prefs.alert_hashtag[0] = '\0';
+  _prefs.alert_region[0] = '\0';
   _prefs.alert_wifi_minutes = 30;
   _prefs.alert_mqtt_minutes = 240;
   _prefs.alert_min_interval_min = 60;
@@ -804,6 +805,24 @@ void MyMesh::sendFloodScoped(const TransportKey& scope, mesh::Packet* pkt, uint3
   }
 }

+bool MyMesh::resolveAlertScope(TransportKey& dest) {
+  // Same resolution policy as simple_repeater: alert.region > default_scope.
+  // The room server doesn't currently embed an AlertReporter, but keeping
+  // the override in lockstep means the callback path works the same on both
+  // builds and we won't get caught out if/when it does.
+  if (_prefs.alert_region[0]) {
+    auto r = region_map.findByNamePrefix(_prefs.alert_region);
+    if (r && region_map.getTransportKeysFor(*r, &dest, 1) > 0 && !dest.isNull()) {
+      return true;
+    }
+  }
+  if (!default_scope.isNull()) {
+    dest = default_scope;
+    return true;
+  }
+  return false;
+}
+
 void MyMesh::sendFloodReply(mesh::Packet* packet, unsigned long delay_millis, uint8_t path_hash_size) {
   if (recv_pkt_region && !recv_pkt_region->isWildcard()) { // if _request_ packet scope is known, send reply with same scope
     TransportKey scope;
diff --git a/examples/simple_room_server/MyMesh.h b/examples/simple_room_server/MyMesh.h
index 74e57e808a..11c2dba2b4 100644
--- a/examples/simple_room_server/MyMesh.h
+++ b/examples/simple_room_server/MyMesh.h
@@ -198,6 +198,7 @@ class MyMesh : public mesh::Mesh, public CommonCLICallbacks {

   // CommonCLICallbacks
   void applyTempRadioParams(float freq, float bw, uint8_t sf, uint8_t cr, int timeout_mins) override;
+  bool resolveAlertScope(TransportKey& dest) override;
   bool formatFileSystem() override;
   void sendSelfAdvertisement(int delay_millis, bool flood) override;
   void updateAdvertTimer() override;
diff --git a/src/helpers/AlertReporter.cpp b/src/helpers/AlertReporter.cpp
index c192a5671e..19082f9b16 100644
--- a/src/helpers/AlertReporter.cpp
+++ b/src/helpers/AlertReporter.cpp
@@ -66,7 +66,7 @@ static int alert_decode_base64(const char* in, size_t in_len, uint8_t* out) {
 #endif

 AlertReporter::AlertReporter()
-  : _prefs(nullptr), _mesh(nullptr),
+  : _prefs(nullptr), _mesh(nullptr), _callbacks(nullptr),
 #ifdef WITH_MQTT_BRIDGE
     _bridge(nullptr),
 #endif
@@ -77,9 +77,10 @@ AlertReporter::AlertReporter()
 #endif
 }

-void AlertReporter::begin(NodePrefs* prefs, mesh::Mesh* mesh) {
+void AlertReporter::begin(NodePrefs* prefs, mesh::Mesh*
mesh, CommonCLICallbacks* callbacks) {
   _prefs = prefs;
   _mesh = mesh;
+  _callbacks = callbacks;
   onConfigChanged();
 }

@@ -89,14 +90,60 @@ void AlertReporter::setBridge(MQTTBridge* bridge) {
 }
 #endif

-// Decoded bytes of the well-known PUBLIC group PSK ("izOH6cXN6mrJ5e26oRXNcg==").
-// We refuse to use this key for fault alerts so that infrastructure alarms
-// never spam every node subscribed to the default Public channel.
-static const uint8_t ALERT_PUBLIC_PSK_BYTES[16] = {
-  0x8B, 0x33, 0x87, 0xE9, 0xC5, 0xCD, 0xEA, 0x6A,
-  0xC9, 0xE5, 0xED, 0xBA, 0xA1, 0x15, 0xCD, 0x72
+// Channels banned as fault-alert destinations. Fault alerts are noisy
+// operator-infrastructure messages; routing them to community channels would
+// flood every nearby companion app (and amplify via well-known auto-responder
+// bots), so the firmware refuses these keys at both CLI set-time and at
+// runtime in resolveChannel.
+//
+// Provenance for each row can be re-derived with:
+//   printf '#name' | openssl dgst -sha256 | cut -c1-32
+// or for raw b64 PSKs:
+//   echo 'izOH6cXN6mrJ5e26oRXNcg==' | base64 -d | xxd -p
+//
+// To ban an additional channel: append one new row; no other code changes
+// required. The matcher converts the candidate's 16-byte secret to a
+// lowercase hex string and does a linear strcmp — N is tiny.
+struct BannedAlertChannel {
+  const char* label;
+  const char* secret_hex; // 32 lowercase hex chars (no 0x, no separators)
 };

+static const BannedAlertChannel BANNED_ALERT_CHANNELS[] = {
+  // Public group PSK ("izOH6cXN6mrJ5e26oRXNcg==")
+  { "PUBLIC", "8b3387e9c5cdea6ac9e5edbaa115cd72" },
+  // sha256("#test")[0..15] — auto-responders in many regions
+  { "#test", "9cd8fcf22a47333b591d96a2b848b73f" },
+  // sha256("#bot")[0..15] — generic bot channel, frequent auto-responders
+  { "#bot", "eb50a1bcb3e4e5d7bf69a57c9dada211" },
+};
+
+const char* alertReporterBannedChannelMatch(const uint8_t* secret16) {
+  char hex[33];
+  static const char* H = "0123456789abcdef";
+  for (int i = 0; i < 16; i++) {
+    hex[i*2] = H[(secret16[i] >> 4) & 0xF];
+    hex[i*2+1] = H[secret16[i] & 0xF];
+  }
+  hex[32] = '\0';
+  for (size_t i = 0; i < sizeof(BANNED_ALERT_CHANNELS) / sizeof(BANNED_ALERT_CHANNELS[0]); i++) {
+    if (strcmp(hex, BANNED_ALERT_CHANNELS[i].secret_hex) == 0) {
+      return BANNED_ALERT_CHANNELS[i].label;
+    }
+  }
+  return nullptr;
+}
+
+const char* alertReporterBannedChannelMatchB64(const char* psk_b64) {
+  if (!psk_b64) return nullptr;
+  size_t len_in = strlen(psk_b64);
+  if (len_in == 0) return nullptr;
+  uint8_t secret[32];
+  int n = alert_decode_base64(psk_b64, len_in, secret);
+  if (n != 16) return nullptr; // banned table only contains 16-byte secrets
+  return alertReporterBannedChannelMatch(secret);
+}
+
 bool AlertReporter::resolveChannel(mesh::GroupChannel& out) const {
   if (!_prefs) return false;

@@ -110,12 +157,15 @@ bool AlertReporter::resolveChannel(mesh::GroupChannel& out) const {
   int len = alert_decode_base64(psk, psk_len, out.secret);
   if (len != 32 && len != 16) return false;

-  // Hard refuse the well-known PUBLIC PSK regardless of how it was supplied,
-  // belt-and-suspenders against an operator pasting it into alert.psk or a
-  // hashtag whose hash somehow collides (astronomically improbable).
-  if (len == 16 && memcmp(out.secret, ALERT_PUBLIC_PSK_BYTES, 16) == 0) {
-    ALERT_DEBUG_PRINTLN("refused PUBLIC PSK for alert channel");
-    return false;
+  // Belt-and-suspenders against an operator pasting a banned PSK directly
+  // into alert.psk, or a hashtag whose hash somehow collides with one of the
+  // banned 16-byte secrets (astronomically improbable, but free to check).
+  if (len == 16) {
+    const char* banned = alertReporterBannedChannelMatch(out.secret);
+    if (banned) {
+      ALERT_DEBUG_PRINTLN("refused banned channel '%s' for alert", banned);
+      return false;
+    }
   }

   // PATH_HASH_SIZE bytes — same scheme used by addChannel().
@@ -158,7 +208,22 @@ bool AlertReporter::sendChannel(const char* text) {
     ALERT_DEBUG_PRINTLN("createGroupDatagram failed (pool empty?)");
     return false;
   }
-  _mesh->sendFlood(pkt);
+
+  // Ride the repeater's default scope (or `alert.region` override) when the
+  // host MyMesh provides one — same path MyMesh uses for adverts and
+  // broadcast channel messages. Falls back to plain (unscoped) flood when
+  // no callbacks are wired or no scope is configured, matching the
+  // pre-scoped behavior on builds without RegionMap.
+  TransportKey scope;
+  bool have_scope = _callbacks && _callbacks->resolveAlertScope(scope) && !scope.isNull();
+  if (have_scope) {
+    uint16_t codes[2];
+    codes[0] = scope.calcTransportCode(pkt);
+    codes[1] = 0;
+    _mesh->sendFlood(pkt, codes);
+  } else {
+    _mesh->sendFlood(pkt);
+  }
   ALERT_DEBUG_PRINTLN("sent: %s", text);
   return true;
 }
diff --git a/src/helpers/AlertReporter.h b/src/helpers/AlertReporter.h
index e1981a86a8..eec488020c 100644
--- a/src/helpers/AlertReporter.h
+++ b/src/helpers/AlertReporter.h
@@ -8,6 +8,23 @@
 #include "bridges/MQTTBridge.h"
 #endif

+/**
+ * Returns the label of a banned alert channel if \a secret16 matches one of
+ * the channels in the BANNED_ALERT_CHANNELS table (e.g. "PUBLIC", "#test",
+ * "#bot"), or nullptr otherwise.
Centralized here so both AlertReporter and
+ * the CommonCLI `set alert.psk` / `set alert.hashtag` handlers can share one
+ * source of truth — adding a new banned channel is a one-line table edit.
+ */
+const char* alertReporterBannedChannelMatch(const uint8_t* secret16);
+
+/**
+ * Convenience: base64-decodes \a psk_b64 and forwards to
+ * alertReporterBannedChannelMatch. Returns nullptr if not banned (or if
+ * the input doesn't decode to a 16-byte key — non-16-byte keys cannot
+ * match any banned entry anyway).
+ */
+const char* alertReporterBannedChannelMatchB64(const char* psk_b64);
+
 /**
  * \brief Send-only group-channel "fault alert" reporter for repeater/observer
  * builds.
@@ -19,8 +36,9 @@
  *
  * The alert channel must be explicitly configured to either a private base64
  * PSK (`set alert.psk`) or a hashtag name (`set alert.hashtag`); the
- * well-known PUBLIC group key is rejected on purpose, since fault alerts
- * would otherwise spam every node subscribed to the default Public channel.
+ * well-known PUBLIC group key (and a small list of other auto-responder
+ * channels — see BANNED_ALERT_CHANNELS in AlertReporter.cpp) are rejected on
+ * purpose so fault alerts never spam community channels.
  *
  * Edge-triggered + rate-limited via NodePrefs::alert_min_interval_min so a
  * flapping link cannot spam the channel.
@@ -37,10 +55,11 @@ class AlertReporter {

   /**
    * Wire up the reporter. Must be called from MyMesh::begin() after prefs
-   * are loaded. \a node_name is captured by reference so subsequent rename
-   * (set name) is reflected automatically.
+   * are loaded. \a callbacks is optional — when non-null the reporter uses
+   * it to resolve a TransportKey scope for outgoing alert floods (so the
+   * packet rides the repeater's default scope or an `alert.region` override).
    */
-  void begin(NodePrefs* prefs, mesh::Mesh* mesh);
+  void begin(NodePrefs* prefs, mesh::Mesh* mesh, CommonCLICallbacks* callbacks = nullptr);

 #ifdef WITH_MQTT_BRIDGE
   /** Bridge can be (re)created lazily; pass nullptr to detach. */
@@ -80,6 +99,7 @@ class AlertReporter {

   NodePrefs* _prefs;
   mesh::Mesh* _mesh;
+  CommonCLICallbacks* _callbacks;
 #ifdef WITH_MQTT_BRIDGE
   MQTTBridge* _bridge;
   Fault _wifi;
diff --git a/src/helpers/CommonCLI.cpp b/src/helpers/CommonCLI.cpp
index fd3c6c9ad6..32417a6fdd 100644
--- a/src/helpers/CommonCLI.cpp
+++ b/src/helpers/CommonCLI.cpp
@@ -2,6 +2,7 @@
 #include "CommonCLI.h"
 #include "TxtDataHelpers.h"
 #include "AdvertDataHelpers.h"
+#include "AlertReporter.h" // for alertReporterBannedChannelMatch()
 #include
 #include

@@ -320,9 +321,13 @@ void CommonCLI::loadPrefsInt(FILESYSTEM* fs, const char* filename) {
     if (file.available() >= (int)sizeof(_prefs->alert_hashtag)) {
       file.read((uint8_t *)&_prefs->alert_hashtag, sizeof(_prefs->alert_hashtag));
     }
+    if (file.available() >= (int)sizeof(_prefs->alert_region)) {
+      file.read((uint8_t *)&_prefs->alert_region, sizeof(_prefs->alert_region));
+    }
     // ensure null termination after raw read
     _prefs->alert_psk_b64[sizeof(_prefs->alert_psk_b64) - 1] = '\0';
     _prefs->alert_hashtag[sizeof(_prefs->alert_hashtag) - 1] = '\0';
+    _prefs->alert_region[sizeof(_prefs->alert_region) - 1] = '\0';

     // sanitise bad pref values
     _prefs->rx_delay_base = constrain(_prefs->rx_delay_base, 0, 20.0f);
@@ -452,6 +457,7 @@ void CommonCLI::savePrefs(FILESYSTEM* fs) {
     file.write((uint8_t *)&_prefs->alert_mqtt_minutes, sizeof(_prefs->alert_mqtt_minutes));
     file.write((uint8_t *)&_prefs->alert_min_interval_min, sizeof(_prefs->alert_min_interval_min));
     file.write((uint8_t *)&_prefs->alert_hashtag, sizeof(_prefs->alert_hashtag));
+    file.write((uint8_t *)&_prefs->alert_region, sizeof(_prefs->alert_region));

     file.close();
   }
@@ -1629,11 +1635,11 @@ void CommonCLI::handleSetCmd(uint32_t sender_timestamp, char* command, char* rep
       // Quick
character-count check before storing; AlertReporter will redo the // full base64 decode and reject anything that doesn't yield 16 or 32 bytes. strcpy(reply, "Error: PSK must be 24 chars (16-byte) or 44 chars (32-byte) base64"); - } else if (strcmp(val, "izOH6cXN6mrJ5e26oRXNcg==") == 0 || - strcmp(val, "izOH6cXN6mrJ5e26oRXNcg") == 0) { - // Refuse the well-known PUBLIC group PSK — fault alerts must not spam - // every node on the default Public channel. - strcpy(reply, "Error: refusing PUBLIC PSK; pick a private key or hashtag"); + } else if (const char* banned = alertReporterBannedChannelMatchB64(val)) { + // Refuse any key on the banned channel list (Public PSK, well-known + // auto-responder hashtags like #test/#bot, etc.). Fault alerts on those + // channels would spam every node in the area. + sprintf(reply, "Error: refusing banned channel '%s'; pick a private key or hashtag", banned); } else { StrHelper::strncpy(_prefs->alert_psk_b64, val, sizeof(_prefs->alert_psk_b64)); // The new PSK is operator-supplied, so any previously-derived hashtag @@ -1677,19 +1683,49 @@ void CommonCLI::handleSetCmd(uint32_t sender_timestamp, char* command, char* rep uint8_t digest[32]; mesh::Utils::sha256(digest, sizeof(digest), (const uint8_t*)hashtag, (int)strlen(hashtag)); - char b64[48]; - size_t b64_len = alert_encode_base64(digest, 16, b64, sizeof(b64)); - if (b64_len == 0 || b64_len >= sizeof(_prefs->alert_psk_b64)) { - strcpy(reply, "Error: failed to derive PSK from hashtag"); + if (const char* banned = alertReporterBannedChannelMatch(digest)) { + // Hashtag derives to a banned key (e.g. `set alert.hashtag test` + // hits the #test entry). Refuse before clobbering existing config. 
+ sprintf(reply, "Error: refusing banned channel '%s'", banned); } else { - StrHelper::strncpy(_prefs->alert_hashtag, hashtag, sizeof(_prefs->alert_hashtag)); - StrHelper::strncpy(_prefs->alert_psk_b64, b64, sizeof(_prefs->alert_psk_b64)); - savePrefs(); - _callbacks->onAlertConfigChanged(); - sprintf(reply, "OK - alert.hashtag: %s", _prefs->alert_hashtag); + char b64[48]; + size_t b64_len = alert_encode_base64(digest, 16, b64, sizeof(b64)); + if (b64_len == 0 || b64_len >= sizeof(_prefs->alert_psk_b64)) { + strcpy(reply, "Error: failed to derive PSK from hashtag"); + } else { + StrHelper::strncpy(_prefs->alert_hashtag, hashtag, sizeof(_prefs->alert_hashtag)); + StrHelper::strncpy(_prefs->alert_psk_b64, b64, sizeof(_prefs->alert_psk_b64)); + savePrefs(); + _callbacks->onAlertConfigChanged(); + sprintf(reply, "OK - alert.hashtag: %s", _prefs->alert_hashtag); + } } } } + } else if (memcmp(config, "alert.region", 12) == 0 && (config[12] == 0 || config[12] == ' ')) { + // `set alert.region ` overrides the repeater's default_scope for + // alert sends only. `set alert.region` (no arg) clears it. The name is + // looked up lazily via RegionMap at send time; we deliberately don't + // mutate the region map here, so naming an unknown region is allowed + // but will silently fall back to default_scope until the operator runs + // `region put` for it. + const char* val = (config[12] == ' ') ? 
&config[13] : ""; + while (*val == ' ') val++; + size_t len = strlen(val); + if (len == 0) { + _prefs->alert_region[0] = '\0'; + savePrefs(); + _callbacks->onAlertConfigChanged(); + strcpy(reply, "OK - alert.region cleared (using default scope)"); + } else if (len >= sizeof(_prefs->alert_region)) { + strcpy(reply, "Error: alert.region too long"); + } else { + StrHelper::strncpy(_prefs->alert_region, val, sizeof(_prefs->alert_region)); + StrHelper::stripSurroundingQuotes(_prefs->alert_region, sizeof(_prefs->alert_region)); + savePrefs(); + _callbacks->onAlertConfigChanged(); + sprintf(reply, "OK - alert.region: %s", _prefs->alert_region); + } } else if (memcmp(config, "alert.wifi ", 11) == 0) { int mins = (int)_atoi(&config[11]); if (mins < 0 || mins > 1440) { @@ -2030,6 +2066,8 @@ void CommonCLI::handleGetCmd(uint32_t sender_timestamp, char* command, char* rep sprintf(reply, "> %s", _prefs->alert_hashtag[0] ? _prefs->alert_hashtag : "(unset)"); } else if (memcmp(config, "alert.psk", 9) == 0) { sprintf(reply, "> %s", _prefs->alert_psk_b64[0] ? _prefs->alert_psk_b64 : "(unset)"); + } else if (memcmp(config, "alert.region", 12) == 0) { + sprintf(reply, "> %s", _prefs->alert_region[0] ? _prefs->alert_region : "(unset, using default scope)"); } else if (memcmp(config, "alert.wifi", 10) == 0) { sprintf(reply, "> %u min%s", (unsigned)_prefs->alert_wifi_minutes, _prefs->alert_wifi_minutes == 0 ? " (disabled)" : ""); diff --git a/src/helpers/CommonCLI.h b/src/helpers/CommonCLI.h index 6e38e8c0b7..e20ba93846 100644 --- a/src/helpers/CommonCLI.h +++ b/src/helpers/CommonCLI.h @@ -118,6 +118,11 @@ struct NodePrefs { // persisted to file // text here purely for `get alert.hashtag` readback. A subsequent // `set alert.psk` clears this field so it doesn't lie about provenance. char alert_hashtag[24]; + // Optional region name (e.g. "us", "eu"); empty = use the repeater's + // default_scope. 
Looked up lazily via RegionMap::findByNamePrefix at send + // time, so the operator can name a region that doesn't exist yet without + // polluting region_map state. Falls back to default_scope on miss. + char alert_region[31]; }; #ifdef WITH_MQTT_BRIDGE @@ -298,6 +303,14 @@ class CommonCLICallbacks { virtual bool sendAlertText(const char* /*text*/) { return false; // no op by default } + // Resolve the TransportKey scope to use for outgoing fault-alert floods. + // Implementations should consult NodePrefs::alert_region first (look up via + // RegionMap), then fall back to the repeater's default_scope, then return + // false if neither yields a usable key. AlertReporter falls back to an + // unscoped flood when this returns false. + virtual bool resolveAlertScope(TransportKey& /*dest*/) { + return false; // no op by default + } }; class CommonCLI { From 6a3ed5d430f35e4f2a7a040d14d1ff35dbc3a76b Mon Sep 17 00:00:00 2001 From: agessaman Date: Mon, 11 May 2026 16:22:29 -0700 Subject: [PATCH 3/7] Update AlertReporter to include path hash size in flood messages Modified the sendFlood method in AlertReporter to incorporate path hash size, ensuring compatibility with the repeater's configured path.hash.mode. This change enhances the handling of alert floods by accommodating different regional mesh configurations. --- src/helpers/AlertReporter.cpp | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/src/helpers/AlertReporter.cpp b/src/helpers/AlertReporter.cpp index 19082f9b16..356c7aacd1 100644 --- a/src/helpers/AlertReporter.cpp +++ b/src/helpers/AlertReporter.cpp @@ -214,15 +214,20 @@ bool AlertReporter::sendChannel(const char* text) { // broadcast channel messages. Falls back to plain (unscoped) flood when // no callbacks are wired or no scope is configured, matching the // pre-scoped behavior on builds without RegionMap. 
+ // + // path_hash_size must honor the repeater's configured path.hash.mode (1, 2, + // or 3-byte hashes); the Mesh.h default of 1 would silently downgrade + // observers running on 2/3-byte regional meshes. + const uint8_t path_hash_size = (uint8_t)(_prefs->path_hash_mode + 1); TransportKey scope; bool have_scope = _callbacks && _callbacks->resolveAlertScope(scope) && !scope.isNull(); if (have_scope) { uint16_t codes[2]; codes[0] = scope.calcTransportCode(pkt); codes[1] = 0; - _mesh->sendFlood(pkt, codes); + _mesh->sendFlood(pkt, codes, 0, path_hash_size); } else { - _mesh->sendFlood(pkt); + _mesh->sendFlood(pkt, 0, path_hash_size); } ALERT_DEBUG_PRINTLN("sent: %s", text); return true; From a865d0a3030957d54a304f8292ee475c72962f91 Mon Sep 17 00:00:00 2001 From: agessaman Date: Tue, 12 May 2026 21:05:56 -0700 Subject: [PATCH 4/7] Update alert PSK handling to accept both base64 and hex formats Enhanced the CLI command for setting the alert PSK to support both base64 (24/44 chars) and hex (32/64 chars) formats. This change allows for greater flexibility in key input, accommodating the mobile app's "share" output. Updated error messages for clarity and ensured that previously derived hashtags are cleared when a new PSK is set. Relevant adjustments made in CommonCLI and MQTT implementation documentation. --- MQTT_IMPLEMENTATION.md | 6 ++-- src/helpers/CommonCLI.cpp | 63 +++++++++++++++++++++++++++++---------- 2 files changed, 50 insertions(+), 19 deletions(-) diff --git a/MQTT_IMPLEMENTATION.md b/MQTT_IMPLEMENTATION.md index 159d6fab79..7c1d7880e5 100644 --- a/MQTT_IMPLEMENTATION.md +++ b/MQTT_IMPLEMENTATION.md @@ -714,7 +714,7 @@ A "recovered" message is sent once when the underlying connection comes back. Af | Setting | Default | Notes | |---------|---------|-------| | `alert` | `off` | Master enable for automatic fault alerts | -| `alert.psk` | *(unset)* | Private base64 PSK (24 or 44 chars). The active channel key. 
| +| `alert.psk` | *(unset)* | Private channel secret. Accepts either base64 (24 or 44 chars, like `BaseChatMesh::addChannel`) or hex (32 or 64 chars, like the mobile app's "Share Channel" output). Stored internally as base64. | | `alert.hashtag` | *(unset)* | Informational only; set via `set alert.hashtag` to pre-derive `alert.psk` from `sha256("#name")[0..15]`. Cleared when `alert.psk` is set directly. | | `alert.region` | *(unset)* | Optional region name; overrides the repeater's `default_scope` for alert sends only. Empty = use `default_scope`. Looked up lazily via `RegionMap`; unknown names silently fall back to `default_scope`. | | `alert.wifi` | `30` (min) | 0 disables WiFi alerts | @@ -734,7 +734,7 @@ Get: Set: - `set alert on` / `set alert off` -- `set alert.psk ` — 24-char (16-byte) or 44-char (32-byte) base64; rejects banned channels (Public, `#test`, `#bot`). Clears `alert.hashtag` since the new key is operator-supplied. +- `set alert.psk ` — 24-/44-char base64 **or** 32-/64-char hex (16- or 32-byte secret); rejects banned channels (Public, `#test`, `#bot`). Clears `alert.hashtag` since the new key is operator-supplied. The mobile app's "share" output is hex; either format is accepted. - `set alert.psk` (no argument) — clears both `alert.psk` and `alert.hashtag` - `set alert.hashtag ` — derives the 16-byte key from `sha256("#name")` *once*, stores it as `alert.psk`, and remembers the hashtag for `get alert.hashtag`. `#` prefix is added if omitted (so `alerts` and `#alerts` are equivalent). Refuses banned hashtag names. - `set alert.hashtag` (no argument) — clears both `alert.psk` and `alert.hashtag` @@ -765,7 +765,7 @@ Anyone running a companion app and subscribed to the `#ops-alerts` hashtag chann Generate a 16-byte random PSK and base64-encode it (24 chars), or use the companion app's "Add channel" feature to create one and copy the secret. 
Then: ```bash -set alert.psk +set alert.psk # base64 (24/44 chars) or hex (32/64 chars) — mobile "share" works as-is set alert.wifi 10 set alert.mqtt 60 set alert on diff --git a/src/helpers/CommonCLI.cpp b/src/helpers/CommonCLI.cpp index 32417a6fdd..b327a3bcb3 100644 --- a/src/helpers/CommonCLI.cpp +++ b/src/helpers/CommonCLI.cpp @@ -1631,23 +1631,54 @@ void CommonCLI::handleSetCmd(uint32_t sender_timestamp, char* command, char* rep strcpy(reply, "Error: PSK too long (max 47 chars)"); } else if (val[0] == '#') { strcpy(reply, "Error: use 'set alert.hashtag' for hashtag channels"); - } else if (len != 24 && len != 44) { - // Quick character-count check before storing; AlertReporter will redo the - // full base64 decode and reject anything that doesn't yield 16 or 32 bytes. - strcpy(reply, "Error: PSK must be 24 chars (16-byte) or 44 chars (32-byte) base64"); - } else if (const char* banned = alertReporterBannedChannelMatchB64(val)) { - // Refuse any key on the banned channel list (Public PSK, well-known - // auto-responder hashtags like #test/#bot, etc.). Fault alerts on those - // channels would spam every node in the area. - sprintf(reply, "Error: refusing banned channel '%s'; pick a private key or hashtag", banned); } else { - StrHelper::strncpy(_prefs->alert_psk_b64, val, sizeof(_prefs->alert_psk_b64)); - // The new PSK is operator-supplied, so any previously-derived hashtag - // name is no longer accurate provenance — drop it. - _prefs->alert_hashtag[0] = '\0'; - savePrefs(); - _callbacks->onAlertConfigChanged(); - strcpy(reply, "OK - alert.psk updated"); + // Accept either: + // - base64: 24 chars (16-byte secret) or 44 chars (32-byte secret) — + // the format `BaseChatMesh::addChannel` already expects. + // - hex: 32 chars (16-byte secret) or 64 chars (32-byte secret) — + // what the MeshCore mobile app's "share" button emits for both + // private channels and hashtag channels. 
+ // Hex input is converted to base64 once at CLI time so the on-disk + // representation (and AlertReporter's decode path) stays unchanged. + char canonical_b64[48]; + const char* store = nullptr; + + if (len == 32 || len == 64) { + bool all_hex = true; + for (size_t i = 0; i < len; i++) { + if (!mesh::Utils::isHexChar(val[i])) { all_hex = false; break; } + } + if (all_hex) { + uint8_t raw[32]; + int raw_len = (int)(len / 2); + if (mesh::Utils::fromHex(raw, raw_len, val)) { + size_t b64_len = alert_encode_base64(raw, (size_t)raw_len, canonical_b64, sizeof(canonical_b64)); + if (b64_len > 0 && b64_len < sizeof(_prefs->alert_psk_b64)) { + store = canonical_b64; + } + } + } + } + if (!store && (len == 24 || len == 44)) { + store = val; + } + + if (!store) { + strcpy(reply, "Error: PSK must be 32/64 hex chars or 24/44 chars base64 (16- or 32-byte secret)"); + } else if (const char* banned = alertReporterBannedChannelMatchB64(store)) { + // Refuse any key on the banned channel list (Public PSK, well-known + // auto-responder hashtags like #test/#bot, etc.). Fault alerts on + // those channels would spam every node in the area. + sprintf(reply, "Error: refusing banned channel '%s'; pick a private key or hashtag", banned); + } else { + StrHelper::strncpy(_prefs->alert_psk_b64, store, sizeof(_prefs->alert_psk_b64)); + // The new PSK is operator-supplied, so any previously-derived hashtag + // name is no longer accurate provenance — drop it. + _prefs->alert_hashtag[0] = '\0'; + savePrefs(); + _callbacks->onAlertConfigChanged(); + strcpy(reply, "OK - alert.psk updated"); + } } } else if (memcmp(config, "alert.hashtag", 13) == 0 && (config[13] == 0 || config[13] == ' ')) { const char* val = (config[13] == ' ') ? 
&config[14] : ""; From a8668840594b351e355183489ab3c2216a48d429 Mon Sep 17 00:00:00 2001 From: agessaman Date: Tue, 12 May 2026 21:18:58 -0700 Subject: [PATCH 5/7] Refactor alert PSK handling to exclusively use hex format Updated the alert PSK implementation to remove base64 support, now requiring a 32-character hex format for private channel secrets. Adjusted related CLI commands, error messages, and internal handling to ensure consistency with the new format. This change enhances clarity and aligns with the mobile app's "Share Channel" output. Relevant updates made across multiple files, including documentation and preference handling. --- MQTT_IMPLEMENTATION.md | 10 +- examples/simple_repeater/MyMesh.cpp | 2 +- examples/simple_room_server/MyMesh.cpp | 2 +- src/helpers/AlertReporter.cpp | 93 ++++------------- src/helpers/AlertReporter.h | 14 +-- src/helpers/CommonCLI.cpp | 138 +++++++++---------------- src/helpers/CommonCLI.h | 4 +- 7 files changed, 86 insertions(+), 177 deletions(-) diff --git a/MQTT_IMPLEMENTATION.md b/MQTT_IMPLEMENTATION.md index 7c1d7880e5..d575ea93ec 100644 --- a/MQTT_IMPLEMENTATION.md +++ b/MQTT_IMPLEMENTATION.md @@ -714,7 +714,7 @@ A "recovered" message is sent once when the underlying connection comes back. Af | Setting | Default | Notes | |---------|---------|-------| | `alert` | `off` | Master enable for automatic fault alerts | -| `alert.psk` | *(unset)* | Private channel secret. Accepts either base64 (24 or 44 chars, like `BaseChatMesh::addChannel`) or hex (32 or 64 chars, like the mobile app's "Share Channel" output). Stored internally as base64. | +| `alert.psk` | *(unset)* | Private channel secret as **32 hex chars** (16-byte channel key) — the same format the mobile app's "Share Channel" emits, and what every other secret-shaped CLI command (e.g. `prv.key`) uses. | | `alert.hashtag` | *(unset)* | Informational only; set via `set alert.hashtag` to pre-derive `alert.psk` from `sha256("#name")[0..15]`. 
Cleared when `alert.psk` is set directly. | | `alert.region` | *(unset)* | Optional region name; overrides the repeater's `default_scope` for alert sends only. Empty = use `default_scope`. Looked up lazily via `RegionMap`; unknown names silently fall back to `default_scope`. | | `alert.wifi` | `30` (min) | 0 disables WiFi alerts | @@ -727,14 +727,14 @@ A "recovered" message is sent once when the underlying connection comes back. Af Get: - `get alert` — master on/off -- `get alert.psk` — the active base64 PSK (or `(unset)`) +- `get alert.psk` — the active 32-hex-char PSK (or `(unset)`) - `get alert.hashtag` — the originating hashtag (or `(unset)`, e.g. after `set alert.psk` overrides the hashtag-derived key) - `get alert.region` — alert-only scope override (or `(unset, using default scope)`) - `get alert.wifi` / `get alert.mqtt` / `get alert.interval` Set: - `set alert on` / `set alert off` -- `set alert.psk ` — 24-/44-char base64 **or** 32-/64-char hex (16- or 32-byte secret); rejects banned channels (Public, `#test`, `#bot`). Clears `alert.hashtag` since the new key is operator-supplied. The mobile app's "share" output is hex; either format is accepted. +- `set alert.psk ` — 32 hex chars (16-byte channel secret); rejects banned channels (Public, `#test`, `#bot`). Paste the mobile app's "Share Channel" output as-is. Clears `alert.hashtag` since the new key is operator-supplied. - `set alert.psk` (no argument) — clears both `alert.psk` and `alert.hashtag` - `set alert.hashtag ` — derives the 16-byte key from `sha256("#name")` *once*, stores it as `alert.psk`, and remembers the hashtag for `get alert.hashtag`. `#` prefix is added if omitted (so `alerts` and `#alerts` are equivalent). Refuses banned hashtag names. 
- `set alert.hashtag` (no argument) — clears both `alert.psk` and `alert.hashtag` @@ -762,10 +762,10 @@ Anyone running a companion app and subscribed to the `#ops-alerts` hashtag chann ### Example: dedicated alerts channel with a private PSK -Generate a 16-byte random PSK and base64-encode it (24 chars), or use the companion app's "Add channel" feature to create one and copy the secret. Then: +Generate a 16-byte random PSK as 32 hex chars (`openssl rand -hex 16`), or use the companion app's "Add channel" feature and copy the "Share Channel" output. Then: ```bash -set alert.psk # base64 (24/44 chars) or hex (32/64 chars) — mobile "share" works as-is +set alert.psk <32_hex_chars> # 16-byte channel secret; mobile "Share Channel" pastes in directly set alert.wifi 10 set alert.mqtt 60 set alert on diff --git a/examples/simple_repeater/MyMesh.cpp b/examples/simple_repeater/MyMesh.cpp index 37d1da5ced..0153ed9edc 100644 --- a/examples/simple_repeater/MyMesh.cpp +++ b/examples/simple_repeater/MyMesh.cpp @@ -954,7 +954,7 @@ MyMesh::MyMesh(mesh::MainBoard &board, mesh::Radio &radio, mesh::MillisecondCloc // can fire. The sender prefix on outgoing alert messages is always the // node name (`set name ...`), so there's no separate `alert.name`. _prefs.alert_enabled = 0; - _prefs.alert_psk_b64[0] = '\0'; + _prefs.alert_psk_hex[0] = '\0'; _prefs.alert_hashtag[0] = '\0'; _prefs.alert_region[0] = '\0'; // empty = use default_scope _prefs.alert_wifi_minutes = 30; // 30 minutes diff --git a/examples/simple_room_server/MyMesh.cpp b/examples/simple_room_server/MyMesh.cpp index 34c4258b5b..de7e3e1111 100644 --- a/examples/simple_room_server/MyMesh.cpp +++ b/examples/simple_room_server/MyMesh.cpp @@ -678,7 +678,7 @@ MyMesh::MyMesh(mesh::MainBoard &board, mesh::Radio &radio, mesh::MillisecondCloc // Alert channel defaults (same as repeater; off by default and unconfigured). // Operator must pick `set alert.psk` or `set alert.hashtag` before alerts fire. 
_prefs.alert_enabled = 0; - _prefs.alert_psk_b64[0] = '\0'; + _prefs.alert_psk_hex[0] = '\0'; _prefs.alert_hashtag[0] = '\0'; _prefs.alert_region[0] = '\0'; _prefs.alert_wifi_minutes = 30; diff --git a/src/helpers/AlertReporter.cpp b/src/helpers/AlertReporter.cpp index 356c7aacd1..df38d6fc5e 100644 --- a/src/helpers/AlertReporter.cpp +++ b/src/helpers/AlertReporter.cpp @@ -5,45 +5,6 @@ #include #include -// Minimal base64 decoder — kept local to avoid dragging the densaugeo/base64 -// PlatformIO dependency into every repeater env that doesn't otherwise need -// it (only chat builds with MAX_GROUP_CHANNELS pulled it in via BaseChatMesh). -// Returns the number of decoded bytes, or 0 on error. Output buffer must be -// at least (in_len * 3 / 4) bytes. -static int alert_decode_base64(const char* in, size_t in_len, uint8_t* out) { - static const int8_t TBL[128] = { - -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, - -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1, - -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,62,-1,-1,-1,63, - 52,53,54,55,56,57,58,59,60,61,-1,-1,-1, 0,-1,-1, - -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,10,11,12,13,14, - 15,16,17,18,19,20,21,22,23,24,25,-1,-1,-1,-1,-1, - -1,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40, - 41,42,43,44,45,46,47,48,49,50,51,-1,-1,-1,-1,-1 - }; - size_t pad = 0; - while (in_len > 0 && in[in_len - 1] == '=') { in_len--; pad++; } - if ((in_len + pad) % 4 != 0) return 0; - if (pad > 2) return 0; - - size_t out_pos = 0; - uint32_t buffer = 0; - int bits = 0; - for (size_t i = 0; i < in_len; i++) { - unsigned char c = (unsigned char)in[i]; - if (c >= 128) return 0; - int v = TBL[c]; - if (v < 0) return 0; - buffer = (buffer << 6) | (uint32_t)v; - bits += 6; - if (bits >= 8) { - bits -= 8; - out[out_pos++] = (uint8_t)((buffer >> bits) & 0xFF); - } - } - return (int)out_pos; -} - // Header layout for PAYLOAD_TYPE_GRP_TXT before encryption: // [0..3] timestamp (uint32_t LE) — also helps make packet_hash unique // [4] TXT_TYPE_PLAIN @@ -98,12 +59,12 @@ 
void AlertReporter::setBridge(MQTTBridge* bridge) { // // Provenance for each row can be re-derived with: // printf '#name' | openssl dgst -sha256 | cut -c1-32 -// or for raw b64 PSKs: -// echo 'izOH6cXN6mrJ5e26oRXNcg==' | base64 -d | xxd -p +// or for the Public PSK: +// echo 'izOH6cXN6mrJ5e26oRXNcg==' | base64 -d | xxd -p -c 16 // // To ban an additional channel: append one new row; no other code changes -// required. The matcher converts the candidate's 16-byte secret to a -// lowercase hex string and does a linear strcmp — N is tiny. +// required. Both the table entries and `alert_psk_hex` are 32 lowercase hex +// chars (16-byte secret), so the matcher is a direct strcmp. struct BannedAlertChannel { const char* label; const char* secret_hex; // 32 lowercase hex chars (no 0x, no separators) @@ -120,12 +81,7 @@ static const BannedAlertChannel BANNED_ALERT_CHANNELS[] = { const char* alertReporterBannedChannelMatch(const uint8_t* secret16) { char hex[33]; - static const char* H = "0123456789abcdef"; - for (int i = 0; i < 16; i++) { - hex[i*2] = H[(secret16[i] >> 4) & 0xF]; - hex[i*2+1] = H[secret16[i] & 0xF]; - } - hex[32] = '\0'; + mesh::Utils::toHex(hex, secret16, 16); for (size_t i = 0; i < sizeof(BANNED_ALERT_CHANNELS) / sizeof(BANNED_ALERT_CHANNELS[0]); i++) { if (strcmp(hex, BANNED_ALERT_CHANNELS[i].secret_hex) == 0) { return BANNED_ALERT_CHANNELS[i].label; @@ -134,42 +90,37 @@ const char* alertReporterBannedChannelMatch(const uint8_t* secret16) { return nullptr; } -const char* alertReporterBannedChannelMatchB64(const char* psk_b64) { - if (!psk_b64) return nullptr; - size_t len_in = strlen(psk_b64); - if (len_in == 0) return nullptr; - uint8_t secret[32]; - int n = alert_decode_base64(psk_b64, len_in, secret); - if (n != 16) return nullptr; // banned table only contains 16-byte secrets +const char* alertReporterBannedChannelMatchHex(const char* psk_hex) { + if (!psk_hex || strlen(psk_hex) != 32) return nullptr; + uint8_t secret[16]; + if 
(!mesh::Utils::fromHex(secret, 16, psk_hex)) return nullptr; return alertReporterBannedChannelMatch(secret); } bool AlertReporter::resolveChannel(mesh::GroupChannel& out) const { if (!_prefs) return false; - // alert_psk_b64 is the single source of truth — `set alert.hashtag` - // pre-derives the base64 PSK from sha256("#name")[0..15] at CLI time. - const char* psk = _prefs->alert_psk_b64; - size_t psk_len = strlen(psk); - if (psk_len == 0 || psk_len >= sizeof(_prefs->alert_psk_b64)) return false; + // alert_psk_hex is the single source of truth — `set alert.hashtag` + // pre-derives the hex-encoded PSK from sha256("#name")[0..15] at CLI time. + // Only 16-byte secrets (32 hex chars) are supported; 32-byte channel keys + // are not used anywhere in MeshCore practice and not represented in the + // banned table either. + const char* psk = _prefs->alert_psk_hex; + if (strlen(psk) != 32) return false; memset(out.secret, 0, sizeof(out.secret)); - int len = alert_decode_base64(psk, psk_len, out.secret); - if (len != 32 && len != 16) return false; + if (!mesh::Utils::fromHex(out.secret, 16, psk)) return false; // Belt-and-suspenders against an operator pasting a banned PSK directly // into alert.psk, or a hashtag whose hash somehow collides with one of the // banned 16-byte secrets (astronomically improbable, but free to check). - if (len == 16) { - const char* banned = alertReporterBannedChannelMatch(out.secret); - if (banned) { - ALERT_DEBUG_PRINTLN("refused banned channel '%s' for alert", banned); - return false; - } + const char* banned = alertReporterBannedChannelMatch(out.secret); + if (banned) { + ALERT_DEBUG_PRINTLN("refused banned channel '%s' for alert", banned); + return false; } - // PATH_HASH_SIZE bytes — same scheme used by addChannel(). 
- mesh::Utils::sha256(out.hash, sizeof(out.hash), out.secret, len); + mesh::Utils::sha256(out.hash, sizeof(out.hash), out.secret, 16); return true; } diff --git a/src/helpers/AlertReporter.h b/src/helpers/AlertReporter.h index eec488020c..dbbd154f65 100644 --- a/src/helpers/AlertReporter.h +++ b/src/helpers/AlertReporter.h @@ -18,12 +18,12 @@ const char* alertReporterBannedChannelMatch(const uint8_t* secret16); /** - * Convenience: base64-decodes \a psk_b64 and forwards to - * alertReporterBannedChannelMatch. Returns nullptr if not banned (or if - * the input doesn't decode to a 16-byte key — non-16-byte keys cannot - * match any banned entry anyway). + * Convenience: hex-decodes \a psk_hex (32 lowercase/uppercase hex chars) and + * forwards to alertReporterBannedChannelMatch. Returns nullptr if not banned + * (or if the input isn't a valid 32-char hex string — only 16-byte secrets + * are present in the banned table). */ -const char* alertReporterBannedChannelMatchB64(const char* psk_b64); +const char* alertReporterBannedChannelMatchHex(const char* psk_hex); /** * \brief Send-only group-channel "fault alert" reporter for repeater/observer @@ -34,7 +34,7 @@ const char* alertReporterBannedChannelMatchB64(const char* psk_b64); * message on the configured alert channel ("WiFi down 47m — MyObserver"), * then arms a "recovered" message for the next state transition. * - * The alert channel must be explicitly configured to either a private base64 + * The alert channel must be explicitly configured to either a private hex * PSK (`set alert.psk`) or a hashtag name (`set alert.hashtag`); the * well-known PUBLIC group key (and a small list of other auto-responder * channels — see BANNED_ALERT_CHANNELS in AlertReporter.cpp) are rejected on @@ -67,7 +67,7 @@ class AlertReporter { #endif /** - * Re-derive the cached GroupChannel from \a alert_psk_b64. Call from the + * Re-derive the cached GroupChannel from \a alert_psk_hex. 
Call from the * CLI hot-reload hook after `set alert.psk` / `set alert.hashtag` / `set alert on|off`. */ void onConfigChanged(); diff --git a/src/helpers/CommonCLI.cpp b/src/helpers/CommonCLI.cpp index b327a3bcb3..b6ac08904b 100644 --- a/src/helpers/CommonCLI.cpp +++ b/src/helpers/CommonCLI.cpp @@ -6,28 +6,6 @@ #include #include -// Tiny base64 encoder used by `set alert.hashtag` to render a derived 16-byte -// key into NodePrefs::alert_psk_b64. Kept inline so we don't drag the -// densaugeo/base64 PlatformIO dep into every CLI-using build. -static size_t alert_encode_base64(const uint8_t* in, size_t in_len, char* out, size_t out_size) { - static const char TBL[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"; - size_t needed = ((in_len + 2) / 3) * 4 + 1; - if (out_size < needed) return 0; - size_t o = 0; - for (size_t i = 0; i < in_len; i += 3) { - uint32_t v = (uint32_t)in[i] << 16; - int rem = (int)(in_len - i); - if (rem > 1) v |= (uint32_t)in[i + 1] << 8; - if (rem > 2) v |= (uint32_t)in[i + 2]; - out[o++] = TBL[(v >> 18) & 0x3F]; - out[o++] = TBL[(v >> 12) & 0x3F]; - out[o++] = (rem > 1) ? TBL[(v >> 6) & 0x3F] : '='; - out[o++] = (rem > 2) ? 
TBL[v & 0x3F] : '='; - } - out[o] = '\0'; - return o; -} - #ifndef BRIDGE_MAX_BAUD #define BRIDGE_MAX_BAUD 115200 #endif @@ -306,8 +284,8 @@ void CommonCLI::loadPrefsInt(FILESYSTEM* fs, const char* filename) { if (file.available() >= (int)sizeof(_prefs->alert_enabled)) { file.read((uint8_t *)&_prefs->alert_enabled, sizeof(_prefs->alert_enabled)); } - if (file.available() >= (int)sizeof(_prefs->alert_psk_b64)) { - file.read((uint8_t *)&_prefs->alert_psk_b64, sizeof(_prefs->alert_psk_b64)); + if (file.available() >= (int)sizeof(_prefs->alert_psk_hex)) { + file.read((uint8_t *)&_prefs->alert_psk_hex, sizeof(_prefs->alert_psk_hex)); } if (file.available() >= (int)sizeof(_prefs->alert_wifi_minutes)) { file.read((uint8_t *)&_prefs->alert_wifi_minutes, sizeof(_prefs->alert_wifi_minutes)); @@ -325,7 +303,7 @@ void CommonCLI::loadPrefsInt(FILESYSTEM* fs, const char* filename) { file.read((uint8_t *)&_prefs->alert_region, sizeof(_prefs->alert_region)); } // ensure null termination after raw read - _prefs->alert_psk_b64[sizeof(_prefs->alert_psk_b64) - 1] = '\0'; + _prefs->alert_psk_hex[sizeof(_prefs->alert_psk_hex) - 1] = '\0'; _prefs->alert_hashtag[sizeof(_prefs->alert_hashtag) - 1] = '\0'; _prefs->alert_region[sizeof(_prefs->alert_region) - 1] = '\0'; @@ -452,7 +430,7 @@ void CommonCLI::savePrefs(FILESYSTEM* fs) { file.write((uint8_t *)&_prefs->radio_watchdog_minutes, sizeof(_prefs->radio_watchdog_minutes)); // 316 // Alert channel fields (appended) file.write((uint8_t *)&_prefs->alert_enabled, sizeof(_prefs->alert_enabled)); - file.write((uint8_t *)&_prefs->alert_psk_b64, sizeof(_prefs->alert_psk_b64)); + file.write((uint8_t *)&_prefs->alert_psk_hex, sizeof(_prefs->alert_psk_hex)); file.write((uint8_t *)&_prefs->alert_wifi_minutes, sizeof(_prefs->alert_wifi_minutes)); file.write((uint8_t *)&_prefs->alert_mqtt_minutes, sizeof(_prefs->alert_mqtt_minutes)); file.write((uint8_t *)&_prefs->alert_min_interval_min, sizeof(_prefs->alert_min_interval_min)); @@ -873,7 +851,7 @@ void 
CommonCLI::handleCommand(uint32_t sender_timestamp, char* command, char* re } else { strcpy(text, "[test] alert channel ok"); } - if (!_prefs->alert_psk_b64[0]) { + if (!_prefs->alert_psk_hex[0]) { strcpy(reply, "Error: alert channel not configured (set alert.psk or set alert.hashtag)"); } else { bool ok = _callbacks->sendAlertText(text); @@ -1622,62 +1600,46 @@ void CommonCLI::handleSetCmd(uint32_t sender_timestamp, char* command, char* rep while (*val == ' ') val++; size_t len = strlen(val); if (len == 0) { - _prefs->alert_psk_b64[0] = '\0'; + _prefs->alert_psk_hex[0] = '\0'; _prefs->alert_hashtag[0] = '\0'; savePrefs(); _callbacks->onAlertConfigChanged(); strcpy(reply, "OK - alert.psk cleared (alerts disabled until configured)"); - } else if (len >= sizeof(_prefs->alert_psk_b64)) { - strcpy(reply, "Error: PSK too long (max 47 chars)"); } else if (val[0] == '#') { strcpy(reply, "Error: use 'set alert.hashtag' for hashtag channels"); + } else if (len != 32) { + // 16-byte channel secret = 32 hex chars. This is what the mobile app's + // "Share Channel" emits, what `set alert.hashtag` derives, and what the + // BANNED_ALERT_CHANNELS table holds. 32-byte channels aren't used + // anywhere in MeshCore practice. + strcpy(reply, "Error: PSK must be 32 hex chars (16-byte channel secret)"); } else { - // Accept either: - // - base64: 24 chars (16-byte secret) or 44 chars (32-byte secret) — - // the format `BaseChatMesh::addChannel` already expects. - // - hex: 32 chars (16-byte secret) or 64 chars (32-byte secret) — - // what the MeshCore mobile app's "share" button emits for both - // private channels and hashtag channels. - // Hex input is converted to base64 once at CLI time so the on-disk - // representation (and AlertReporter's decode path) stays unchanged. 
- char canonical_b64[48]; - const char* store = nullptr; - - if (len == 32 || len == 64) { - bool all_hex = true; - for (size_t i = 0; i < len; i++) { - if (!mesh::Utils::isHexChar(val[i])) { all_hex = false; break; } - } - if (all_hex) { - uint8_t raw[32]; - int raw_len = (int)(len / 2); - if (mesh::Utils::fromHex(raw, raw_len, val)) { - size_t b64_len = alert_encode_base64(raw, (size_t)raw_len, canonical_b64, sizeof(canonical_b64)); - if (b64_len > 0 && b64_len < sizeof(_prefs->alert_psk_b64)) { - store = canonical_b64; - } - } - } - } - if (!store && (len == 24 || len == 44)) { - store = val; + // Validate all-hex, then normalize via fromHex/toHex so the stored + // form is always lowercase regardless of input case. + uint8_t raw[16]; + bool all_hex = true; + for (size_t i = 0; i < len; i++) { + if (!mesh::Utils::isHexChar(val[i])) { all_hex = false; break; } } - - if (!store) { - strcpy(reply, "Error: PSK must be 32/64 hex chars or 24/44 chars base64 (16- or 32-byte secret)"); - } else if (const char* banned = alertReporterBannedChannelMatchB64(store)) { - // Refuse any key on the banned channel list (Public PSK, well-known - // auto-responder hashtags like #test/#bot, etc.). Fault alerts on - // those channels would spam every node in the area. - sprintf(reply, "Error: refusing banned channel '%s'; pick a private key or hashtag", banned); + if (!all_hex || !mesh::Utils::fromHex(raw, 16, val)) { + strcpy(reply, "Error: PSK must be 32 hex chars (16-byte channel secret)"); } else { - StrHelper::strncpy(_prefs->alert_psk_b64, store, sizeof(_prefs->alert_psk_b64)); - // The new PSK is operator-supplied, so any previously-derived hashtag - // name is no longer accurate provenance — drop it. 
- _prefs->alert_hashtag[0] = '\0'; - savePrefs(); - _callbacks->onAlertConfigChanged(); - strcpy(reply, "OK - alert.psk updated"); + char normalized[33]; + mesh::Utils::toHex(normalized, raw, 16); + if (const char* banned = alertReporterBannedChannelMatchHex(normalized)) { + // Refuse any key on the banned channel list (Public PSK, well-known + // auto-responder hashtags like #test/#bot, etc.). Fault alerts on + // those channels would spam every node in the area. + sprintf(reply, "Error: refusing banned channel '%s'; pick a private key or hashtag", banned); + } else { + StrHelper::strncpy(_prefs->alert_psk_hex, normalized, sizeof(_prefs->alert_psk_hex)); + // The new PSK is operator-supplied, so any previously-derived + // hashtag name is no longer accurate provenance — drop it. + _prefs->alert_hashtag[0] = '\0'; + savePrefs(); + _callbacks->onAlertConfigChanged(); + strcpy(reply, "OK - alert.psk updated"); + } } } } else if (memcmp(config, "alert.hashtag", 13) == 0 && (config[13] == 0 || config[13] == ' ')) { @@ -1685,7 +1647,7 @@ void CommonCLI::handleSetCmd(uint32_t sender_timestamp, char* command, char* rep while (*val == ' ') val++; size_t in_len = strlen(val); if (in_len == 0) { - _prefs->alert_psk_b64[0] = '\0'; + _prefs->alert_psk_hex[0] = '\0'; _prefs->alert_hashtag[0] = '\0'; savePrefs(); _callbacks->onAlertConfigChanged(); @@ -1708,9 +1670,9 @@ void CommonCLI::handleSetCmd(uint32_t sender_timestamp, char* command, char* rep } // Derive the channel key once: first 16 bytes of sha256("#name"), - // then base64-encode and store in alert_psk_b64. We don't re-derive - // on every send — operators can later override with `set alert.psk` - // without leaving stale hashtag text behind. + // store hex-encoded in alert_psk_hex. We don't re-derive on every + // send — operators can later override with `set alert.psk` without + // leaving stale hashtag text behind. 
uint8_t digest[32]; mesh::Utils::sha256(digest, sizeof(digest), (const uint8_t*)hashtag, (int)strlen(hashtag)); @@ -1719,17 +1681,13 @@ void CommonCLI::handleSetCmd(uint32_t sender_timestamp, char* command, char* rep // hits the #test entry). Refuse before clobbering existing config. sprintf(reply, "Error: refusing banned channel '%s'", banned); } else { - char b64[48]; - size_t b64_len = alert_encode_base64(digest, 16, b64, sizeof(b64)); - if (b64_len == 0 || b64_len >= sizeof(_prefs->alert_psk_b64)) { - strcpy(reply, "Error: failed to derive PSK from hashtag"); - } else { - StrHelper::strncpy(_prefs->alert_hashtag, hashtag, sizeof(_prefs->alert_hashtag)); - StrHelper::strncpy(_prefs->alert_psk_b64, b64, sizeof(_prefs->alert_psk_b64)); - savePrefs(); - _callbacks->onAlertConfigChanged(); - sprintf(reply, "OK - alert.hashtag: %s", _prefs->alert_hashtag); - } + char hex[33]; + mesh::Utils::toHex(hex, digest, 16); + StrHelper::strncpy(_prefs->alert_hashtag, hashtag, sizeof(_prefs->alert_hashtag)); + StrHelper::strncpy(_prefs->alert_psk_hex, hex, sizeof(_prefs->alert_psk_hex)); + savePrefs(); + _callbacks->onAlertConfigChanged(); + sprintf(reply, "OK - alert.hashtag: %s", _prefs->alert_hashtag); } } } @@ -2096,7 +2054,7 @@ void CommonCLI::handleGetCmd(uint32_t sender_timestamp, char* command, char* rep } else if (memcmp(config, "alert.hashtag", 13) == 0) { sprintf(reply, "> %s", _prefs->alert_hashtag[0] ? _prefs->alert_hashtag : "(unset)"); } else if (memcmp(config, "alert.psk", 9) == 0) { - sprintf(reply, "> %s", _prefs->alert_psk_b64[0] ? _prefs->alert_psk_b64 : "(unset)"); + sprintf(reply, "> %s", _prefs->alert_psk_hex[0] ? _prefs->alert_psk_hex : "(unset)"); } else if (memcmp(config, "alert.region", 12) == 0) { sprintf(reply, "> %s", _prefs->alert_region[0] ? 
_prefs->alert_region : "(unset, using default scope)"); } else if (memcmp(config, "alert.wifi", 10) == 0) { diff --git a/src/helpers/CommonCLI.h b/src/helpers/CommonCLI.h index e20ba93846..0d50aa7904 100644 --- a/src/helpers/CommonCLI.h +++ b/src/helpers/CommonCLI.h @@ -109,12 +109,12 @@ struct NodePrefs { // persisted to file // Sent over the radio (NOT over MQTT) so the alert still works while the MQTT path is broken. // All fields are appended at the end of NodePrefs for binary-compatible upgrades. uint8_t alert_enabled; // 0 = off (default), 1 = on - char alert_psk_b64[48]; // base64 PSK; empty = alerts disabled. PUBLIC_GROUP_PSK is rejected. + char alert_psk_hex[33]; // 32 lowercase hex chars (16-byte channel secret) + null; empty = alerts disabled. Banned keys (Public/#test/#bot) are rejected. uint16_t alert_wifi_minutes; // WiFi-down threshold in minutes (0 = disabled), default 30 uint16_t alert_mqtt_minutes; // MQTT-down threshold in minutes (0 = disabled), default 240 (4 h) uint16_t alert_min_interval_min; // min minutes between alerts for the same fault, default 60, floor 60 // When the operator configures via `set alert.hashtag `, we derive - // alert_psk_b64 from sha256("#name")[0..15] once and remember the hashtag + // alert_psk_hex from sha256("#name")[0..15] once and remember the hashtag // text here purely for `get alert.hashtag` readback. A subsequent // `set alert.psk` clears this field so it doesn't lie about provenance. char alert_hashtag[24]; From f717d6735bfdc42b74e414bd696df9cce8ff8afb Mon Sep 17 00:00:00 2001 From: agessaman Date: Sun, 17 May 2026 21:19:53 -0700 Subject: [PATCH 6/7] Update alert PSK handling to restrict access based on sender timestamp Modified the alert PSK command handling to only respond when the sender timestamp is zero, ensuring that the PSK is processed exclusively from the serial command line. This change enhances security and clarity in the command processing logic. 
--- src/helpers/CommonCLI.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/helpers/CommonCLI.cpp b/src/helpers/CommonCLI.cpp index b6ac08904b..2544e16b90 100644 --- a/src/helpers/CommonCLI.cpp +++ b/src/helpers/CommonCLI.cpp @@ -2053,7 +2053,7 @@ void CommonCLI::handleGetCmd(uint32_t sender_timestamp, char* command, char* rep #endif } else if (memcmp(config, "alert.hashtag", 13) == 0) { sprintf(reply, "> %s", _prefs->alert_hashtag[0] ? _prefs->alert_hashtag : "(unset)"); - } else if (memcmp(config, "alert.psk", 9) == 0) { + } else if (sender_timestamp == 0 && memcmp(config, "alert.psk", 9) == 0) { // from serial command line only sprintf(reply, "> %s", _prefs->alert_psk_hex[0] ? _prefs->alert_psk_hex : "(unset)"); } else if (memcmp(config, "alert.region", 12) == 0) { sprintf(reply, "> %s", _prefs->alert_region[0] ? _prefs->alert_region : "(unset, using default scope)"); From c41805d0705aaac515fe6d69d4c03cb7b8d2d4bd Mon Sep 17 00:00:00 2001 From: agessaman Date: Sun, 17 May 2026 21:30:31 -0700 Subject: [PATCH 7/7] Remove detailed fault alert documentation from MQTT_IMPLEMENTATION.md. This change streamlines the file by eliminating extensive sections on fault alerts, including configuration, triggers, and examples, while retaining essential information for clarity and usability. --- ALERTS.md | 115 +++++++++++++++++++++++++++++++++++++ MQTT_IMPLEMENTATION.md | 127 ++--------------------------------------- 2 files changed, 119 insertions(+), 123 deletions(-) create mode 100644 ALERTS.md diff --git a/ALERTS.md b/ALERTS.md new file mode 100644 index 0000000000..92b0974a1c --- /dev/null +++ b/ALERTS.md @@ -0,0 +1,115 @@ +# Fault Alerts (Group Channel) + +This document describes MeshCore repeater fault alerts, including configuration, CLI commands, and operational behavior. 
+ +The repeater can broadcast a one-line fault notification on a configured group channel when WiFi or any active MQTT slot has been disconnected longer than a configurable threshold. + +The alert is sent over **LoRa** as a `PAYLOAD_TYPE_GRP_TXT` flood packet on the configured channel (with sender = device name) - *not* over MQTT. This is intentional: the MQTT path is what's broken, so the only working delivery is the mesh itself. Anyone in radio range subscribed to the same channel/hashtag in their companion app will see the alert inline with normal channel chat. + +> **A small list of community channels is intentionally NOT supported.** Fault alerts are operator-infrastructure noise - broadcasting them on shared community channels would spam every node in the area (and on `#test` / `#bot` would amplify via well-known auto-responders). The currently banned destinations are: +> +> - The well-known **Public** group PSK (`izOH6cXN6mrJ5e26oRXNcg==`) +> - **`#test`** (`sha256("#test")[0..15]`) +> - **`#bot`** (`sha256("#bot")[0..15]`) +> +> The list lives in `BANNED_ALERT_CHANNELS[]` in [src/helpers/AlertReporter.cpp](src/helpers/AlertReporter.cpp); adding a new entry is one line (label + 32 hex chars). The matcher runs at both the CLI validation step (`set alert.psk`, `set alert.hashtag`) and the alert-send path, so a saved-config bypass is still refused at runtime. You must point alerts at a **private PSK** (`set alert.psk`) or a non-banned **hashtag channel** (`set alert.hashtag`) before alerts can fire. + +## Scope and routing + +Alert floods ride the **repeater's default scope** by default (the same TransportKey used for adverts and channel broadcasts - set via `region default ...`). Operators can override on a per-alert-feature basis with `set alert.region `: + +- If `alert.region` is set and the name resolves via `RegionMap`, that region's TransportKey is used. +- If `alert.region` is unset, or the name doesn't resolve, the repeater's `default_scope` is used. 
+- If both are null, the alert is sent unscoped (matches the pre-scoped firmware's behavior). + +`alert.region` is stored as-is - it does **not** create the region. Use `region put ` first if it doesn't exist. + +## What triggers an alert + +- **WiFi**: continuously down for at least `alert.wifi` minutes (default 30) +- **MQTT slot N**: enabled, has connected at least once since boot, and has been disconnected for at least `alert.mqtt` minutes (default 240, i.e. 4 h) + +A "recovered" message is sent once when the underlying connection comes back. After firing, a fault is rate-limited by `alert.interval` (default 60 minutes) before it can re-fire - this prevents flapping links from spamming the channel. + +## Defaults + +| Setting | Default | Notes | +|---------|---------|-------| +| `alert` | `off` | Master enable for automatic fault alerts | +| `alert.psk` | *(unset)* | Private channel secret as **32 hex chars** (16-byte channel key) - the same format the mobile app's "Share Channel" emits, and what every other secret-shaped CLI command (e.g. `prv.key`) uses. | +| `alert.hashtag` | *(unset)* | Informational only; set via `set alert.hashtag` to pre-derive `alert.psk` from `sha256("#name")[0..15]`. Cleared when `alert.psk` is set directly. | +| `alert.region` | *(unset)* | Optional region name; overrides the repeater's `default_scope` for alert sends only. Empty = use `default_scope`. Looked up lazily via `RegionMap`; unknown names silently fall back to `default_scope`. | +| `alert.wifi` | `30` (min) | 0 disables WiFi alerts | +| `alert.mqtt` | `240` (min) | 0 disables MQTT alerts | +| `alert.interval` | `60` (min) | Minutes between repeat alerts of the same fault. **Hard floor of 60 min** so a flapping link can't spam the mesh; the CLI rejects lower values and AlertReporter clamps stale prefs at runtime. | + +> `alert.psk` is unset on a fresh flash. 
**Alerts cannot fire and `alert test` will refuse to send until you configure either `alert.psk` directly or `alert.hashtag` (which derives one).** The sender shown on outgoing alert messages is always the node name (`set name ...`); there is no separate `alert.name`. + +## CLI + +Get: +- `get alert` - master on/off +- `get alert.psk` - the active 32-hex-char PSK (or `(unset)`) (**serial console only**) +- `get alert.hashtag` - the originating hashtag (or `(unset)`, e.g. after `set alert.psk` overrides the hashtag-derived key) +- `get alert.region` - alert-only scope override (or `(unset, using default scope)`) +- `get alert.wifi` / `get alert.mqtt` / `get alert.interval` + +Set: +- `set alert on` / `set alert off` +- `set alert.psk ` - 32 hex chars (16-byte channel secret); rejects banned channels (Public, `#test`, `#bot`). Paste the mobile app's "Share Channel" output as-is. Clears `alert.hashtag` since the new key is operator-supplied. +- `set alert.psk` (no argument) - clears both `alert.psk` and `alert.hashtag` +- `set alert.hashtag ` - derives the 16-byte key from `sha256("#name")` *once*, stores it as `alert.psk`, and remembers the hashtag for `get alert.hashtag`. `#` prefix is added if omitted (so `alerts` and `#alerts` are equivalent). Refuses banned hashtag names. +- `set alert.hashtag` (no argument) - clears both `alert.psk` and `alert.hashtag` +- `set alert.region ` - alert-only scope override (no region-map mutation; unknown names silently fall back to `default_scope`) +- `set alert.region` (no argument) - clear override, use `default_scope` +- `set alert.wifi ` (0-1440; 0 = disabled) +- `set alert.mqtt ` (0-10080; 0 = disabled) +- `set alert.interval ` (60-10080; 60-minute floor to protect mesh airtime) + +Action: +- `alert test` - send a one-off `[test] alert channel ok` immediately on the configured channel; ignores `alert on/off` so operators can verify the channel before enabling fault firing. Returns an error if no channel is configured. 
+- `alert test ` - send a custom test message: `[test] `. + +## Example: dedicated hashtag channel (recommended for operator groups) + +```bash +set alert.hashtag ops-alerts # stored as "#ops-alerts"; key = sha256("#ops-alerts")[0..15] +set alert.wifi 10 # tighter for ops monitoring +set alert.mqtt 60 +set alert on +alert test +``` + +Anyone running a companion app and subscribed to the `#ops-alerts` hashtag channel will see the alerts inline. + +## Example: dedicated alerts channel with a private PSK + +Generate a 16-byte random PSK as 32 hex chars (`openssl rand -hex 16`), or use the companion app's "Add channel" feature and copy the "Share Channel" output. Then: + +```bash +set alert.psk <32_hex_chars> # 16-byte channel secret; mobile "Share Channel" pastes in directly +set alert.wifi 10 +set alert.mqtt 60 +set alert on +alert test +``` + +Subscribers running a MeshCore companion app should add a channel with the same PSK; alerts will appear in that channel's chat view. (Pick any local name for it - the sender of incoming alert messages is the repeater's node name.) + +## Sample messages + +``` +MyObserver: WiFi down 47m (reason 201) +MyObserver: WiFi recovered after 1h3m +MyObserver: MQTT slot 1 (analyzer-us) down 4h12m +MyObserver: MQTT slot 1 (analyzer-us) recovered after 4h45m +``` + +## Notes + +- A reboot during an outage resets the timer; the alert won't double-fire because `millis()` starts at 0 at boot. The fault must persist `alert.wifi` / `alert.mqtt` minutes from boot. +- Fault state is stored in RAM only - no persistence across reboots. +- The MQTT-slot watcher uses a separate per-slot `current_outage_started_ms` field that is reset on each reconnect, distinct from the `first_disconnect_time` shown in `mqttN.diag` (which remains a "first disconnect since boot" counter for diagnostics). +- WiFi-down alerts can only be delivered if the LoRa radio is up. There is no fallback path. 
+- Banned channels (Public, `#test`, `#bot`) are **rejected** at both `set alert.psk` / `set alert.hashtag` and at the alert-send path, so even if you somehow set one via a saved config file, the firmware will silently refuse to broadcast on it. To add another banned channel, append a row to `BANNED_ALERT_CHANNELS[]` in [src/helpers/AlertReporter.cpp](src/helpers/AlertReporter.cpp); the format is `{ "label", "32-lowercase-hex-chars" }` (compute as `printf '#name' | openssl dgst -sha256 -r | cut -c1-32`; `-r` prints the hex digest first, whereas the default output starts with a `(stdin)= ` style prefix that `cut` would otherwise capture). +- Alerts are sent via `sendFlood` with the resolved TransportKey codes attached, so they appear on the configured scope just like other broadcast traffic. Operators monitoring a specific region need to be subscribed to that region's scope to hear alerts. diff --git a/MQTT_IMPLEMENTATION.md b/MQTT_IMPLEMENTATION.md index d575ea93ec..d08363d9cb 100644 --- a/MQTT_IMPLEMENTATION.md +++ b/MQTT_IMPLEMENTATION.md @@ -678,131 +678,12 @@ set timezone EST # Abbreviation set timezone UTC-5 # UTC offset ``` -## Fault Alerts (Group Channel) - -The repeater can broadcast a one-line fault notification on a configured group channel when WiFi or any active MQTT slot has been disconnected longer than a configurable threshold. - -The alert is sent over **LoRa** as a `PAYLOAD_TYPE_GRP_TXT` flood packet on the configured channel (with sender = device name) — *not* over MQTT. This is intentional: the MQTT path is what's broken, so the only working delivery is the mesh itself. Anyone in radio range subscribed to the same channel/hashtag in their companion app will see the alert inline with normal channel chat. - -> **A small list of community channels is intentionally NOT supported.** Fault alerts are operator-infrastructure noise — broadcasting them on shared community channels would spam every node in the area (and on `#test` / `#bot` would amplify via well-known auto-responders).
The currently banned destinations are: -> -> - The well-known **Public** group PSK (`izOH6cXN6mrJ5e26oRXNcg==`) -> - **`#test`** (`sha256("#test")[0..15]`) -> - **`#bot`** (`sha256("#bot")[0..15]`) -> -> The list lives in `BANNED_ALERT_CHANNELS[]` in [src/helpers/AlertReporter.cpp](src/helpers/AlertReporter.cpp); adding a new entry is one line (label + 32 hex chars). The matcher runs at both the CLI validation step (`set alert.psk`, `set alert.hashtag`) and the alert-send path, so a saved-config bypass is still refused at runtime. You must point alerts at a **private PSK** (`set alert.psk`) or a non-banned **hashtag channel** (`set alert.hashtag`) before alerts can fire. - -### Scope and routing - -Alert floods ride the **repeater's default scope** by default (the same TransportKey used for adverts and channel broadcasts — set via `region default ...`). Operators can override on a per-alert-feature basis with `set alert.region `: - -- If `alert.region` is set and the name resolves via `RegionMap`, that region's TransportKey is used. -- If `alert.region` is unset, or the name doesn't resolve, the repeater's `default_scope` is used. -- If both are null, the alert is sent unscoped (matches the pre-scoped firmware's behavior). - -`alert.region` is stored as-is — it does **not** create the region. Use `region put ` first if it doesn't exist. - -### What triggers an alert - -- **WiFi**: continuously down for at least `alert.wifi` minutes (default 30) -- **MQTT slot N**: enabled, has connected at least once since boot, and has been disconnected for at least `alert.mqtt` minutes (default 240, i.e. 4 h) - -A "recovered" message is sent once when the underlying connection comes back. After firing, a fault is rate-limited by `alert.interval` (default 60 minutes) before it can re-fire — this prevents flapping links from spamming the channel. 
- -### Defaults - -| Setting | Default | Notes | -|---------|---------|-------| -| `alert` | `off` | Master enable for automatic fault alerts | -| `alert.psk` | *(unset)* | Private channel secret as **32 hex chars** (16-byte channel key) — the same format the mobile app's "Share Channel" emits, and what every other secret-shaped CLI command (e.g. `prv.key`) uses. | -| `alert.hashtag` | *(unset)* | Informational only; set via `set alert.hashtag` to pre-derive `alert.psk` from `sha256("#name")[0..15]`. Cleared when `alert.psk` is set directly. | -| `alert.region` | *(unset)* | Optional region name; overrides the repeater's `default_scope` for alert sends only. Empty = use `default_scope`. Looked up lazily via `RegionMap`; unknown names silently fall back to `default_scope`. | -| `alert.wifi` | `30` (min) | 0 disables WiFi alerts | -| `alert.mqtt` | `240` (min) | 0 disables MQTT alerts | -| `alert.interval` | `60` (min) | Minutes between repeat alerts of the same fault. **Hard floor of 60 min** so a flapping link can't spam the mesh; the CLI rejects lower values and AlertReporter clamps stale prefs at runtime. | - -> `alert.psk` is unset on a fresh flash. **Alerts cannot fire and `alert test` will refuse to send until you configure either `alert.psk` directly or `alert.hashtag` (which derives one).** The sender shown on outgoing alert messages is always the node name (`set name ...`); there is no separate `alert.name`. - -### CLI - -Get: -- `get alert` — master on/off -- `get alert.psk` — the active 32-hex-char PSK (or `(unset)`) -- `get alert.hashtag` — the originating hashtag (or `(unset)`, e.g. after `set alert.psk` overrides the hashtag-derived key) -- `get alert.region` — alert-only scope override (or `(unset, using default scope)`) -- `get alert.wifi` / `get alert.mqtt` / `get alert.interval` - -Set: -- `set alert on` / `set alert off` -- `set alert.psk ` — 32 hex chars (16-byte channel secret); rejects banned channels (Public, `#test`, `#bot`). 
Paste the mobile app's "Share Channel" output as-is. Clears `alert.hashtag` since the new key is operator-supplied. -- `set alert.psk` (no argument) — clears both `alert.psk` and `alert.hashtag` -- `set alert.hashtag ` — derives the 16-byte key from `sha256("#name")` *once*, stores it as `alert.psk`, and remembers the hashtag for `get alert.hashtag`. `#` prefix is added if omitted (so `alerts` and `#alerts` are equivalent). Refuses banned hashtag names. -- `set alert.hashtag` (no argument) — clears both `alert.psk` and `alert.hashtag` -- `set alert.region ` — alert-only scope override (no region-map mutation; unknown names silently fall back to `default_scope`) -- `set alert.region` (no argument) — clear override, use `default_scope` -- `set alert.wifi ` (0–1440; 0 = disabled) -- `set alert.mqtt ` (0–10080; 0 = disabled) -- `set alert.interval ` (60–10080; 60-minute floor to protect mesh airtime) - -Action: -- `alert test` — send a one-off `[test] alert channel ok` immediately on the configured channel; ignores `alert on/off` so operators can verify the channel before enabling fault firing. Returns an error if no channel is configured. -- `alert test ` — send a custom test message: `[test] `. - -### Example: dedicated hashtag channel (recommended for operator groups) - -```bash -set alert.hashtag ops-alerts # stored as "#ops-alerts"; key = sha256("#ops-alerts")[0..15] -set alert.wifi 10 # tighter for ops monitoring -set alert.mqtt 60 -set alert on -alert test -``` - -Anyone running a companion app and subscribed to the `#ops-alerts` hashtag channel will see the alerts inline. - -### Example: dedicated alerts channel with a private PSK - -Generate a 16-byte random PSK as 32 hex chars (`openssl rand -hex 16`), or use the companion app's "Add channel" feature and copy the "Share Channel" output. 
Then: - -```bash -set alert.psk <32_hex_chars> # 16-byte channel secret; mobile "Share Channel" pastes in directly -set alert.wifi 10 -set alert.mqtt 60 -set alert on -alert test -``` - -Subscribers running a MeshCore companion app should add a channel with the same PSK; alerts will appear in that channel's chat view. (Pick any local name for it — the sender of incoming alert messages is the repeater's node name.) - -### Sample messages - -``` -MyObserver: WiFi down 47m (reason 201) -MyObserver: WiFi recovered after 1h3m -MyObserver: MQTT slot 1 (analyzer-us) down 4h12m -MyObserver: MQTT slot 1 (analyzer-us) recovered after 4h45m -``` - -### Notes - -- A reboot during an outage resets the timer; the alert won't double-fire because `millis()` starts at 0 at boot. The fault must persist `alert.wifi` / `alert.mqtt` minutes from boot. -- Fault state is stored in RAM only — no persistence across reboots. -- The MQTT-slot watcher uses a separate per-slot `current_outage_started_ms` field that is reset on each reconnect, distinct from the `first_disconnect_time` shown in `mqttN.diag` (which remains a "first disconnect since boot" counter for diagnostics). -- WiFi-down alerts can only be delivered if the LoRa radio is up. There is no fallback path. -- Banned channels (Public, `#test`, `#bot`) are **rejected** at both `set alert.psk` / `set alert.hashtag` and at the alert-send path, so even if you somehow set one via a saved config file, the firmware will silently refuse to broadcast on it. To add another banned channel, append a row to `BANNED_ALERT_CHANNELS[]` in [src/helpers/AlertReporter.cpp](src/helpers/AlertReporter.cpp); the format is `{ "label", "32-lowercase-hex-chars" }` (compute as `printf '#name' | openssl dgst -sha256 | cut -c1-32`). -- Alerts are sent via `sendFlood` with the resolved TransportKey codes attached, so they appear on the configured scope just like other broadcast traffic. 
Operators monitoring a specific region need to be subscribed to that region's scope to hear alerts. - ## SNMP Monitoring Observer nodes include an optional SNMP v2c agent that exposes radio stats, MQTT connectivity, memory usage, and network information to standard monitoring tools. See [MQTT_SNMP.md](MQTT_SNMP.md) for setup and OID reference. -## Dependencies -- **PsychicMqttClient**: MQTT client library (supports WSS and direct MQTT) -- **ArduinoJson**: JSON message formatting -- **NTPClient**: Network time protocol client -- **Timezone**: Timezone conversion library (JChristensen/Timezone) -- **WiFi**: ESP32 WiFi functionality -- **Ed25519**: Cryptographic library for JWT token signing -- **JWTHelper**: Custom JWT token generation for device authentication -- **SNMP_Agent**: Optional SNMPv2c agent (0neblock/SNMP_Agent, observer builds only) +## Fault Alerts + +Fault alerts broadcast LoRa group-channel notifications when WiFi or configured MQTT links stay down past configured thresholds, with optional recovery notices and rate limiting to avoid spam. +For configuration, CLI commands, examples, and operational notes, see [ALERTS.md](ALERTS.md).