feat(p1): structured audit log persistence (PRD §BR-9)#26
Conversation
PRD §BR-9: 'All events and turn outputs written to a structured,
human-readable audit log.'
bridge/audit_log.py — append-only JSONL log:
- log(kind, **fields) — generic entry
- log_state_change(StateChange) — HookListener integration
- log_turn_complete(turn, rewards, per_city) — BridgeLoop integration
- log_session_added(session_id, name, city_id)
- read_lines(limit=None) — read back for inspection / tests
- thread-safe writes (lock); auto-mkdirs; survives unserializable
values via repr() fallback so a bad log call never crashes the daemon
bridge/bridge_loop.py — BridgeLoop accepts an optional audit_log
argument. When present:
* the listener's on_change callback chains through audit.log_state_change
(preserving any pre-existing on_change), so every hook event lands
in the audit log without changing the listener API.
* process_turn calls audit.log_turn_complete(turn, rewards, per_city)
after the save and before turn_complete callbacks fire.
bridge/daemon.py — production entrypoint now wires AuditLog by
default (./logs/audit.jsonl, configurable via --audit-dir).
bridge/tests/test_audit_log.py — 11 tests:
- Directory paths get DEFAULT_AUDIT_FILENAME appended
- Explicit filenames are respected
- Writes JSONL with timestamp
- Append-only (not overwrite)
- log_state_change records all fields
- log_turn_complete records turn + rewards + per_city
- read_lines with limit
- read_lines skips malformed lines
- Unserializable values fall back to repr() (no crash)
- BridgeLoop integration: state_change + turn_complete entries land
- BridgeLoop without audit_log still works (backward compat)
Total: 191 passing (was 180; adds 11).
Closes audit P1: BR-9. Refs: PR #22, kanban t_26404be3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request implements a structured JSONL audit log system to record events and turn outputs, featuring a new AuditLog class and integration with the BridgeLoop and daemon components. Review feedback suggests optimizing JSON serialization for performance, enhancing memory efficiency in the read_lines method via collections.deque, and addressing side effects in the hook listener callback wrapping to prevent duplicate log entries.
| for k, v in fields.items(): | ||
| try: | ||
| json.dumps(v) | ||
| entry[k] = v | ||
| except (TypeError, ValueError): | ||
| entry[k] = repr(v) | ||
|
|
||
| line = json.dumps(entry) + "\n" |
There was a problem hiding this comment.
The current sanitization logic is inefficient because it calls json.dumps(v) for every field to check serializability, and then calls json.dumps(entry) again for the entire record. This results in redundant processing, especially for larger objects.
Using the default parameter in json.dumps is a more idiomatic and performant way to handle unserializable values. It also allows for a more 'structured' log by preserving the container (e.g., a list or dict) and only stringifying the specific unserializable leaf nodes, rather than replacing the entire top-level field with a repr string.
entry.update(fields)
# Use default=repr to handle unserializable values efficiently and maintain structure.
line = json.dumps(entry, default=repr) + "\n"| out: list[Dict[str, Any]] = [] | ||
| with open(self.path, "r", encoding="utf-8") as f: | ||
| for line in f: | ||
| line = line.strip() | ||
| if not line: | ||
| continue | ||
| try: | ||
| out.append(json.loads(line)) | ||
| except json.JSONDecodeError: | ||
| continue | ||
| if limit is not None: | ||
| return out[-limit:] | ||
| return out |
There was a problem hiding this comment.
The read_lines method reads the entire audit log into memory before applying the limit. As the audit log is append-only and intended to grow over time, this will eventually lead to significant memory consumption and performance degradation.
Consider using collections.deque with a maxlen to efficiently keep only the last N lines in memory while iterating through the file. This also correctly handles the case where limit=0 (which currently returns the entire list due to Python's slice behavior [-0:]).
| out: list[Dict[str, Any]] = [] | |
| with open(self.path, "r", encoding="utf-8") as f: | |
| for line in f: | |
| line = line.strip() | |
| if not line: | |
| continue | |
| try: | |
| out.append(json.loads(line)) | |
| except json.JSONDecodeError: | |
| continue | |
| if limit is not None: | |
| return out[-limit:] | |
| return out | |
| from collections import deque | |
| def _iter_entries(): | |
| with open(self.path, "r", encoding="utf-8") as f: | |
| for line in f: | |
| line = line.strip() | |
| if not line: continue | |
| try: | |
| yield json.loads(line) | |
| except json.JSONDecodeError: | |
| continue | |
| if limit is not None: | |
| return list(deque(_iter_entries(), maxlen=limit)) | |
| return list(_iter_entries()) |
| if self.hook_listener is not None and self.audit_log is not None: | ||
| existing_callback = self.hook_listener.on_change | ||
| def _audit_then_existing(change): | ||
| self.audit_log.log_state_change(change) | ||
| if existing_callback is not None: | ||
| existing_callback(change) | ||
| self.hook_listener.on_change = _audit_then_existing |
There was a problem hiding this comment.
Wiring the audit log by wrapping hook_listener.on_change in the BridgeLoop constructor introduces a side effect on the hook_listener object. If the same HookListener instance is passed to multiple BridgeLoop instances (e.g., in tests or during a component reload), it will result in multiple layers of wrapping and duplicate audit log entries.
It would be safer to check if the listener is already wrapped or to modify HookListener to support multiple subscribers.
Closes audit P1: BR-9. Append-only JSONL audit log wired into HookListener (state changes) and BridgeLoop (turn completions). Daemon spins one up by default at ./logs/audit.jsonl. 11 new tests; 191 total. Refs PR #22, kanban t_26404be3.