support saving tlogs on server #33

Merged
tridge merged 8 commits into ArduPilot:main from tridge:pr-save-tlog
May 10, 2026

tridge commented May 10, 2026

Users can download their tlogs when tlog recording is enabled.

tridge and others added 8 commits May 10, 2026 17:56
Append-only schema bump on KeyEntry: a uint32_t flags bit
(KEY_FLAG_TLOG) plus a float tlog_retention_days field for per-entry
tlog cleanup (0 = keep forever; fractional values supported so tests
can exercise sub-day retention without real wall-clock waits) plus a
uint32_t reserved[16] for future fields. PACK_FORMAT becomes 168
bytes; legacy 104-byte records zero-extend on read.

set_flag(... 'tlog') on a fresh entry seeds tlog_retention_days = 7.0
so first-enable matches the documented default. Also adds a
'setretention PORT2 DAYS' CLI subcommand and surfaces tlog state in
KeyEntry.__str__ / 'keydb.py list'.

Tests cover round-trip, legacy zero-extend, future-tail preservation,
flag mechanics, and CLI behaviour.
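
The append-only scheme above can be sketched with Python's struct module. This is a minimal illustration, not the keydb_lib code: LEGACY_FORMAT, CURRENT_FORMAT, the field order, and the KEY_FLAG_TLOG bit position are all hypothetical stand-ins, and the toy record is much smaller than the real 104/168-byte layouts.

```python
import struct

# Hypothetical toy layouts; the real KeyEntry has more fields.
LEGACY_FORMAT = "<II"        # e.g. port1, port2 (old record)
CURRENT_FORMAT = "<IIIf4I"   # + flags, tlog_retention_days, reserved[4]
LEGACY_SIZE = struct.calcsize(LEGACY_FORMAT)
CURRENT_SIZE = struct.calcsize(CURRENT_FORMAT)

KEY_FLAG_TLOG = 1 << 0       # assumed bit position


def unpack_entry(raw: bytes):
    """Return (fields, tail).

    Records shorter than the current layout are zero-extended, so new
    fields read as 0. Bytes beyond the current layout (written by a
    newer version) are returned untouched in tail.
    """
    body, tail = raw[:CURRENT_SIZE], raw[CURRENT_SIZE:]
    body = body.ljust(CURRENT_SIZE, b"\0")
    return list(struct.unpack(CURRENT_FORMAT, body)), tail


def pack_entry(fields, tail: bytes = b"") -> bytes:
    """Serialise the known fields and re-append any unknown tail bytes."""
    return struct.pack(CURRENT_FORMAT, *fields) + tail
```

Carrying the tail through every read-modify-write is what lets an older tool edit a record without truncating fields it does not know about.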

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When KEY_FLAG_TLOG is set on an entry, the per-port-pair child opens
logs/<port2>/<YYYY-MM-DD>/sessionN.tlog (lowest unused N) lazily on
the first received frame, and writes one MAVProxy-format record
(8-byte big-endian usec timestamp + raw MAVLink frame) per packet
forwarded in either direction. The TlogWriter file is unbuffered so
the tail is readable while the session is live and survives child
termination without a clean fclose.
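
For readers unfamiliar with the record layout, here is a rough Python sketch of the naming and framing described above. The real writer is the C++ TlogWriter; next_session_path and write_record are hypothetical names, and starting the session numbering at 1 is an assumption.

```python
import os
import struct
import time
from typing import Optional


def next_session_path(logs_root: str, port2: int, date: str) -> str:
    """Return logs/<port2>/<date>/sessionN.tlog for the lowest unused N.

    Numbering from 1 is an assumption made for this sketch.
    """
    day_dir = os.path.join(logs_root, str(port2), date)
    os.makedirs(day_dir, exist_ok=True)
    n = 1
    while os.path.exists(os.path.join(day_dir, f"session{n}.tlog")):
        n += 1
    return os.path.join(day_dir, f"session{n}.tlog")


def write_record(fh, frame: bytes, usec: Optional[int] = None) -> None:
    """Append one record: 8-byte big-endian usec timestamp + raw frame."""
    if usec is None:
        usec = time.time_ns() // 1000
    fh.write(struct.pack(">Q", usec) + frame)
```

Opening the file with open(path, "ab", buffering=0) gives the unbuffered behaviour described above: every record reaches the kernel as soon as it is written, so nothing is lost if the process dies without closing the file.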

Capture taps the parsed mavlink_message_t at the four
receive_message() success sites in supportproxy.cpp main_loop and
re-encodes via mavlink_msg_to_send_buffer; this means parser-rejected
frames (e.g. unknown msgids — see mavlink.cpp:202) are not in the
tlog. End-to-end forwarding of unknown msgids is intentionally
deferred to a separate PR (likely involving pymavlink changes).

Adds a long-lived tlog cleanup child forked once at parent startup
(reaped + respawned via check_children). Each pass walks keys.tdb
and removes .tlog files older than tlog_retention_days*86400 seconds
by mtime, then prunes empty date dirs. Interval defaults to 3600s
and honours SUPPORTPROXY_CLEANUP_INTERVAL (float seconds) for tests.
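
One cleanup pass over a single port2 directory might look like the following Python sketch. The actual cleanup child is part of the C++ proxy; cleanup_port and its parameters are hypothetical, and the injectable now argument exists only to make the logic testable.

```python
import os
import time
from typing import Optional


def cleanup_port(port_dir: str, retention_days: float,
                 now: Optional[float] = None) -> int:
    """Delete expired .tlog files under port_dir/<date>/ and prune
    date directories that end up empty. Returns the number removed."""
    if retention_days <= 0 or not os.path.isdir(port_dir):
        return 0                      # 0 = keep forever
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400
    removed = 0
    for date in sorted(os.listdir(port_dir)):
        day_dir = os.path.join(port_dir, date)
        if not os.path.isdir(day_dir):
            continue
        for name in os.listdir(day_dir):
            path = os.path.join(day_dir, name)
            # Age is judged by mtime, matching the description above.
            if name.endswith(".tlog") and os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed += 1
        if not os.listdir(day_dir):
            os.rmdir(day_dir)
    return removed
```

Because retention_days is a float, a test can pass a small fraction of a day and backdate mtimes instead of waiting on the wall clock.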

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
OwnerEditForm + AdminEditForm gain a 'Record telemetry logs' checkbox
and a float 'Tlog retention (days)' field. The owner field is capped
at 30 days by the validator with an extra server-side guard in the
route; the admin field is unbounded. First-enable from a fresh-zero
retention seeds 7.0 days to mirror keydb_lib.set_flag's auto-default.
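
The retention rules in this paragraph can be summarised as a small pure function. This is only a reading of the commit message, not the webadmin code: resolve_retention and its arguments are hypothetical, and treating "first-enable" as "tlog was off, is being turned on, and the submitted value is 0" is an assumption.

```python
OWNER_MAX_RETENTION_DAYS = 30.0   # owner cap from the commit message
DEFAULT_RETENTION_DAYS = 7.0      # documented default


def resolve_retention(was_enabled: bool, enable: bool,
                      days: float, is_admin: bool) -> float:
    """Return the retention to store, or raise ValueError if rejected."""
    if days < 0:
        raise ValueError("retention must be >= 0")
    if not is_admin and days > OWNER_MAX_RETENTION_DAYS:
        # Server-side guard: applies even if the form validator is bypassed.
        raise ValueError("owners are limited to 30 days")
    if enable and not was_enabled and days == 0:
        # First enable with no retention set: seed the default.
        return DEFAULT_RETENTION_DAYS
    return days
```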

New webadmin/tlogs.py blueprint serves browse + download of the
logs/<port2>/<YYYY-MM-DD>/sessionN.tlog tree:

  GET  /admin/tlogs/<port2>/[<date>]    (admin: any port2)
  GET  /admin/tlogs/<port2>/<date>/<file>
  GET  /me/tlogs/[<date>]               (owner: own port2 only)
  GET  /me/tlogs/<date>/<file>

Downloads use flask.send_from_directory so directory traversal is
rejected at the framework level; the listing pages also pre-validate
date (^YYYY-MM-DD$) and session name (^session\d+\.tlog$) so a
malformed URL bounces with 404 before touching the filesystem.
Logs root is configurable via WEBADMIN_LOGS_DIR / LOGS_DIR config.
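
The pre-validation step can be sketched with the standard re module. The helper names valid_date and valid_session are hypothetical; the patterns follow the shapes quoted above. fullmatch is used rather than ^...$ because in Python $ also matches just before a trailing newline.

```python
import re

# Shapes taken from the commit message; re.ASCII keeps \d to 0-9 only.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}", re.ASCII)
SESSION_RE = re.compile(r"session\d+\.tlog", re.ASCII)


def valid_date(value: str) -> bool:
    """True if value has the YYYY-MM-DD shape (not a calendar check)."""
    return DATE_RE.fullmatch(value) is not None


def valid_session(value: str) -> bool:
    """True if value looks like sessionN.tlog."""
    return SESSION_RE.fullmatch(value) is not None
```

A route would return 404 when either check fails and only then hand the path to flask.send_from_directory, so the regex check is defence in depth rather than the sole traversal guard.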

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-row 'kill' button on the owner /me/ and admin /admin/connections
pages drops one connection at a time:

  POST /me/kill/<conn_index>            (owner kills own session)
  POST /admin/<port2>/kill/<conn_index> (admin kills any session)

conn_index 0 is the user side: dropping it ends the whole session
(no proxy without conn1). conn_index >= 1 is an engineer slot — only
that slot is closed, user + other engineers stay up.

Mechanism: webadmin sets CONN_FLAG_DROP_REQUESTED on the matching
ConnEntry in connections.tdb (read-modify-write that preserves any
forward-compat trailing bytes), then SIGUSR1's the child PID. The
PID is validated against /proc/<pid>/comm == 'supportproxy' before
signalling so we never hit a recycled PID. SIGUSR1 sets a sig_atomic_t
flag in the child; main_loop checks it on the next iteration (the
signal interrupts select(), which fails with EINTR), runs tdb_traverse
to find entries belonging to its port2 with the bit set, closes the
matching slot, and calls tdb_delete on the record so the heartbeat
doesn't republish it.
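
The PID check before signalling can be illustrated as follows. This is a sketch under assumptions, not the webadmin implementation: pid_is_proxy and signal_drop are hypothetical names, and the proc_root and kill parameters are there only so the logic can be exercised without a real process.

```python
import os
import signal


def pid_is_proxy(pid: int, expected: str = "supportproxy",
                 proc_root: str = "/proc") -> bool:
    """True if /proc/<pid>/comm names the expected executable."""
    try:
        with open(os.path.join(proc_root, str(pid), "comm")) as fh:
            return fh.read().strip() == expected
    except OSError:
        return False          # dead PID or unreadable entry


def signal_drop(pid: int, proc_root: str = "/proc", kill=os.kill) -> bool:
    """Send SIGUSR1 only if the PID still belongs to a proxy child."""
    if pid <= 0 or not pid_is_proxy(pid, proc_root=proc_root):
        return False
    try:
        kill(pid, signal.SIGUSR1)
    except OSError:
        return False          # process exited between check and signal
    return True
```

The check narrows the race with PID reuse but cannot remove it entirely, since the process may still exit between reading comm and the kill call.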

Tests cover both the routing/auth boundary (request_drop sets the
flag on the right record only; admin route is admin-only; PID with
mismatched comm or dead PID is skipped; forward-compat tail bytes
survive the read-modify-write) and the C++ end-to-end drop path
(real proxy + 1 user + 2 signed engineers; drop engineer #2 keeps
others; drop user ends the whole session).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The per-port-pair child now writes connections.tdb every 5 s (was
10 s), and the four live views (admin_connections, owner, admin_tlogs,
owner_tlogs) auto-refresh on the same cadence via a meta http-equiv
refresh. Using the same interval on both sides means each refresh sees
a fresh snapshot without stale-read flicker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The deploy rebuild ran serially. -j with no number removes make's
limit on simultaneous jobs, so the build can use every available core;
this is a build-from-clean flow so there's no risk of stomping a
partial cache.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The tlog file list now shows each session's last-modified timestamp
formatted in the *viewer's* timezone, not the proxy host's. Server
emits a <time datetime="..."> with ISO 8601 UTC plus a UTC fallback
text inside the tag, and a small client-side rewriter
(static/localtime.js) replaces the text on DOMContentLoaded with
new Date(...).format() in the same YYYY-MM-DD HH:MM:SS shape — same
width as the fallback so the column doesn't reflow on the 5 s
auto-refresh. CSS tabular-nums on table.entries time keeps digit
widths stable across timestamp values too. The <time> title attr
exposes the original UTC stamp on hover.
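
The server-side half of this can be sketched in Python. The function name time_tag and the exact title text are assumptions; only the ISO 8601 UTC datetime attribute and the fixed-width UTC fallback text come from the description above.

```python
from datetime import datetime, timezone


def time_tag(mtime: float) -> str:
    """Render a file mtime as a <time> element with a UTC fallback.

    A client-side script can later replace the inner text with the
    viewer's local time, using the datetime attribute as its source.
    """
    dt = datetime.fromtimestamp(mtime, tz=timezone.utc)
    iso = dt.strftime("%Y-%m-%dT%H:%M:%SZ")
    text = dt.strftime("%Y-%m-%d %H:%M:%S")   # same width as local form
    return f'<time datetime="{iso}" title="{text} UTC">{text}</time>'
```

With JavaScript disabled the page still shows a correct timestamp, just in UTC rather than the viewer's zone.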

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Vendored static/ardupilot_logo.png (245x88 RGBA, 7 KiB) and a
  blue (#3a7cb3) header bar that extends to the page edges so the
  transparent logo PNG visually merges with the bar. Logo and title
  share one .brand <a> back to /; nav links are inverted to white;
  the log-out button is white-on-blue with a hover wash.

* Per-passphrase show/hide eye via static/password-toggle.js. JS
  wraps every <input type=password> on the page (idempotent via a
  data-attribute marker) with an SVG eye button that toggles
  input.type between password and text. tabindex=-1 keeps the eye
  out of the form's tab order; aria-label flips with state.

* autocomplete=new-password on every PasswordField except the
  /login one (which keeps current-password). Without this Chrome
  pre-fills the 'New passphrase' fields on /admin/<port2>/ from
  the credential it has stored for /login. spellcheck/autocorrect
  /autocapitalize=off complete the picture.

* SEND_FILE_MAX_AGE_DEFAULT=86400 so static assets carry
  Cache-Control: public, max-age=86400. Without this Flask emits
  no cache header, the meta-refresh flow re-fetches the logo
  every 5 s, and the header flashes mid-paint. Explicit
  width/height + fetchpriority=high on the <img> tag let the
  browser carve out the slot at parse time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@tridge tridge merged commit e0dadc0 into ArduPilot:main May 10, 2026
2 checks passed