Implement baseline persistence for drift detection #10
Agent-Logs-Url: https://github.com/canstralian/PassiveOSINTControlPanel/sessions/48a6b056-0085-4df6-823d-26783f04b4b0 Co-authored-by: canstralian <8595080+canstralian@users.noreply.github.com>
Pull request overview
Adds persistent baseline state so drift detection can compare telemetry across runs (instead of always comparing to an empty baseline), and wires the new drift assessment + signal reporting into the Gradio execution pipeline.
Changes:
- Added `osint_core/baseline.py` to load/save/update a persisted baseline (`data/baseline.json`) used for cross-run drift detection.
- Replaced drift pseudocode with concrete dataclasses + drift checks in `osint_core/drift.py` and integrated assessment output into the UI in `app.py`.
- Removed legacy drift/correction helpers from `app.py` and now display drift signals, dominant drift type, confidence, and recommended correction.
Reviewed changes
Copilot reviewed 3 out of 9 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `osint_core/baseline.py` | New baseline persistence/update layer used by drift detection across runs. |
| `osint_core/drift.py` | Implements drift dataclasses and assessment pipeline (policy/adversarial/operational/structural/behavioral/statistical). |
| `app.py` | Loads baseline at startup, builds telemetry snapshots, assesses drift, updates baseline, and renders drift details in UI output. |
| `tests/__pycache__/test_drift.cpython-312-pytest-9.0.3.pyc` | Compiled test artifact added to repo (should not be committed). |
| `osint_core/__pycache__/validators.cpython-312.pyc` | Compiled artifact added to repo (should not be committed). |
| `osint_core/__pycache__/drift.cpython-312.pyc` | Compiled artifact added to repo (should not be committed). |
| `osint_core/__pycache__/baseline.cpython-312.pyc` | Compiled artifact added to repo (should not be committed). |
| `osint_core/__pycache__/__init__.cpython-312.pyc` | Compiled artifact added to repo (should not be committed). |
| `__pycache__/app.cpython-312.pyc` | Compiled artifact added to repo (should not be committed). |

    # Assess drift using new drift detection system
    global BASELINE
    drift_assessment = assess_drift(telemetry, BASELINE, policy_result)

    # Update baseline after assessment (separation of concerns: detect, then update)
    BASELINE = update_baseline(BASELINE, telemetry, drift_assessment)

BASELINE is a global mutable dict that is updated and persisted on every request. In a Gradio app, requests can run concurrently (multiple threads/workers), which can lead to races and baseline.json corruption/lost updates. Consider guarding baseline load/update/save with a lock (e.g., threading.Lock) and/or moving persistence into a single-threaded background task; also consider making writes atomic (write temp file then replace).
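One way to act on this suggestion is sketched below. It is a minimal illustration, not the PR's implementation: `save_baseline_atomic`, `update_and_persist`, and the `state` dict holding the baseline are hypothetical names, and the sketch assumes the baseline is a JSON-serializable dict. A single `threading.Lock` serializes the read-update-write cycle, and the file is written to a temporary file in the same directory before `os.replace` swaps it into place.

```python
import json
import os
import tempfile
import threading
from pathlib import Path

# Hypothetical module-level lock guarding every baseline read-update-write cycle.
_BASELINE_LOCK = threading.Lock()


def save_baseline_atomic(baseline: dict, path: Path) -> None:
    """Write the baseline to a temp file, then atomically replace the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the target directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(baseline, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the partial temp file if anything failed before the replace.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def update_and_persist(state: dict, update_fn, path: Path) -> dict:
    """Apply update_fn to state["baseline"] and persist the result under the lock."""
    with _BASELINE_LOCK:
        new_baseline = update_fn(state["baseline"])
        save_baseline_atomic(new_baseline, path)
        state["baseline"] = new_baseline
        return new_baseline
```

A lock of this kind only protects threads inside one process. If the app runs with several worker processes, a file lock or a single writer process would be needed instead.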
@copilot apply changes based on this feedback @claude[agent] review

    # Update runtime p95 (use simple moving max for now, proper p95 requires history)
    if hasattr(telemetry, "duration_ms"):
        current_p95 = new_baseline.get("runtime_p95_ms", 500)
        new_runtime = telemetry.duration_ms
        # Conservative: only increase p95 if we see a higher value
        new_baseline["runtime_p95_ms"] = max(current_p95, new_runtime)

The comment/docstring says the baseline update is EMA-based and mentions runtime P95, but this code currently updates runtime_p95_ms via max() (monotonic increase). This can cause the baseline to only drift upward and never recover after a transient slowdown. Either implement the stated EMA/quantile tracking (with history/windowing) or adjust the naming/docs to reflect the actual behavior.
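As a rough sketch of the windowed alternative mentioned here, the baseline could keep a bounded history of recent runtimes and recompute the 95th percentile from it, so a transient spike ages out of the window. The `runtime_history_ms` key, the `window` parameter, and both function names are assumptions made for illustration; only `runtime_p95_ms` appears in the reviewed code.

```python
def nearest_rank_percentile(values: list, pct: int) -> float:
    """Return the pct-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, which avoids floating-point rounding issues.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def update_runtime_p95(baseline: dict, duration_ms: float, window: int = 100) -> dict:
    """Return a copy of baseline with the runtime history and p95 refreshed."""
    new_baseline = dict(baseline)
    # "runtime_history_ms" is a hypothetical key holding the most recent samples.
    history = list(baseline.get("runtime_history_ms", []))
    history.append(duration_ms)
    history = history[-window:]  # keep only the newest `window` samples
    new_baseline["runtime_history_ms"] = history
    new_baseline["runtime_p95_ms"] = nearest_rank_percentile(history, 95)
    return new_baseline
```

Because the history is a plain list of numbers, it can be stored in the same JSON file as the rest of the baseline. The trade-off is a slightly larger file and a sort on every update, which is cheap at window sizes in the hundreds.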
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: WhacktheJacker <8595080+canstralian@users.noreply.github.com>
@copilot apply changes based on the comments in this thread
@copilot resolve the merge conflicts in this pull request
@copilot apply changes based on the comments in this thread
The drift detection system had no cross-run memory. All drift comparisons were against empty baselines, making statistical and behavioral drift undetectable.
Changes
Core Implementation
- `osint_core/baseline.py` (new): Baseline persistence layer
  - `load_baseline()`: Loads from `data/baseline.json` or returns defaults
  - `update_baseline()`: EMA-based updates (α=0.1) for distributions, output hashes, runtime P95
  - `known_output_hashes`, `input_type_distribution`, `module_usage_distribution`, structural expectations
- `osint_core/drift.py`: Converted pseudocode to Python
  - `DriftVector`, `DriftSignal`, `TelemetrySnapshot`, `DriftAssessment`
  - `assess_drift()`: Pure function orchestrating six drift checks

Integration
- `app.py`: Wire baseline into execution pipeline
  - `BASELINE = load_baseline()`
  - `TelemetrySnapshot` with output hash, error counts, runtime context
  - `assess_drift(telemetry, BASELINE, policy_result)`
  - `BASELINE = update_baseline(BASELINE, telemetry, assessment)`
  - Removed legacy `detect_drift()` and `choose_correction()` functions

Example
Before: Drift vector always zeros (no baseline comparison)
After:
Statistical drift (input type distribution shift), operational drift (runtime degradation), and structural drift (manifest changes) now detectable across runs.
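To illustrate why a persisted distribution makes statistical drift measurable, here is a minimal sketch of an EMA update with α=0.1 paired with a total variation distance check. Both function names are hypothetical, and the distance metric is an assumption: the PR text does not say which measure `assess_drift()` actually uses.

```python
def ema_update_distribution(baseline_dist: dict, observed_key: str, alpha: float = 0.1) -> dict:
    """Blend one observation into a categorical distribution using an EMA."""
    if not baseline_dist:
        # With no history, seed the distribution with the first observation.
        return {observed_key: 1.0}
    updated = {}
    for key in set(baseline_dist) | {observed_key}:
        observed = 1.0 if key == observed_key else 0.0
        updated[key] = (1 - alpha) * baseline_dist.get(key, 0.0) + alpha * observed
    return updated


def total_variation_distance(p: dict, q: dict) -> float:
    """Half the L1 distance between two categorical distributions (range 0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Example: a baseline dominated by "domain" inputs sees a run of "ip" inputs.
baseline = {"domain": 0.8, "ip": 0.2}
recent = {"domain": 0.1, "ip": 0.9}
shift = total_variation_distance(baseline, recent)  # larger values mean more drift
baseline = ema_update_distribution(baseline, "ip")  # baseline moves slowly toward "ip"
```

With an empty baseline every comparison is against nothing, which matches the "always zeros" behavior described under Before. A small α keeps the baseline stable enough that a sudden shift stands out before the baseline absorbs it.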