Conversation
Coverage Report
Per-file coverage (top 40 by missed lines)
|
There was a problem hiding this comment.
Pull request overview
Adds deterministic policy evaluation context and correlation data across tool responses, caching, and audit logs to make outputs reproducible and auditable.
Changes:
- Introduces deterministic
fingerprints(config,policy) inToolResponse/LockfileResponseand updates tests/docs accordingly. - Adds policy snapshot + fingerprinting utilities (
policy_snapshotmodule) and logs richer audit context (snapshot version, enabled checks, evaluation time). - Threads a deterministic
evaluation_timethrough check execution (with optionalSAFE_PKGS_EVALUATION_TIMEoverride).
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/mcp_stdio.rs | Asserts MCP JSON responses include correctly-shaped fingerprints. |
| tests/check_package_mock_http.rs | Adds optional debug printing of JSON responses and asserts fingerprints exist. |
| src/types.rs | Adds DecisionFingerprints and embeds it in response types. |
| src/tests/service.rs | Updates service tests to validate fingerprints and cache key behavior. |
| src/tests/policy_snapshot.rs | Adds tests for stable config fingerprinting and policy fingerprint changes. |
| src/tests/checks.rs | Adds test ensuring enabled check IDs are normalized/sorted deterministically. |
| src/tests/audit_log.rs | Extends audit log tests to cover new deterministic context fields. |
| src/service.rs | Builds per-registry policy snapshots, adds deterministic evaluation clock, updates caching + audit logging, returns fingerprints. |
| src/policy_snapshot.rs | New canonicalization + hashing implementation for deterministic config/policy fingerprints. |
| src/mcp/server.rs | Updates tool descriptions to document fingerprints in outputs. |
| src/main.rs | Registers the new policy_snapshot module. |
| src/custom_rules.rs | Makes custom rule evaluation time deterministic via the execution context. |
| src/checks.rs | Adds run_all_checks_at_time and deterministic enabled-check ID collection. |
| src/audit_log.rs | Adds deterministic policy context fields to audit records. |
| docs/cache-deep-dive.md | Updates cache key format and documents fingerprint calculation/canonicalization. |
| crates/core/src/lib.rs | Extends CheckExecutionContext with evaluation_time. |
| crates/checks/version-age/src/lib.rs | Uses evaluation_time to compute age deterministically. |
| crates/checks/staleness/src/lib.rs | Uses evaluation_time to compute staleness deterministically. |
| crates/checks/popularity/src/lib.rs | Uses evaluation_time to compute package age deterministically. |
| README.md | Documents fingerprints and deterministic evaluation clock env var. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| left.reason.as_deref(), | ||
| ) | ||
| .cmp(&( | ||
| right.id.as_str(), | ||
| right.enabled, | ||
| right.severity, | ||
| right.reason.as_deref(), |
There was a problem hiding this comment.
The custom_rules sort key omits registries and conditions. If two rules share the same (id, enabled, severity, reason), the comparator returns Equal; Rust’s sort_by is unstable for equal keys, which can reorder those entries nondeterministically and change the fingerprint across runs. To guarantee determinism, include additional fields in the ordering key (e.g., registries + a canonical string of conditions) or sort by a fully-canonical serialized representation.
| left.reason.as_deref(), | |
| ) | |
| .cmp(&( | |
| right.id.as_str(), | |
| right.enabled, | |
| right.severity, | |
| right.reason.as_deref(), | |
| left.reason.as_deref(), | |
| &left.registries, | |
| &left.conditions, | |
| ) | |
| .cmp(&( | |
| right.id.as_str(), | |
| right.enabled, | |
| right.severity, | |
| right.reason.as_deref(), | |
| &right.registries, | |
| &right.conditions, |
| let age_days = resolved_version | ||
| .published | ||
| .map(|published| (context.evaluation_time - published).num_days()); | ||
|
|
||
| Ok(run( | ||
| context.package_name, | ||
| resolved_version, | ||
| context.policy.min_version_age_days, | ||
| age_days, |
There was a problem hiding this comment.
age_days can become negative when evaluation_time is earlier than published (possible when SAFE_PKGS_EVALUATION_TIME is set). That would treat a future publish date as “very young” and could trigger incorrect findings. Consider clamping age_days to 0 (or explicitly handling negative durations) before passing it into run.
| let age_days = resolved_version | ||
| .published | ||
| .map(|published| (context.evaluation_time - published).num_days()); | ||
|
|
||
| Ok(run( | ||
| context.package_name, | ||
| resolved_version, | ||
| context.weekly_downloads, | ||
| context.policy.min_weekly_downloads, | ||
| DEFAULT_YOUNG_PACKAGE_AGE_DAYS, | ||
| age_days, | ||
| ) |
There was a problem hiding this comment.
age_days can be negative when evaluation_time is earlier than published (e.g., via SAFE_PKGS_EVALUATION_TIME). That would cause popularity findings to treat a future publish date as “young” and potentially flag incorrectly. Consider clamping to 0 (or otherwise handling negative durations) before using age_days.
| context | ||
| .evaluation_time | ||
| .signed_duration_since(published) | ||
| .num_days() |
There was a problem hiding this comment.
requested_age_days is derived from evaluation_time.signed_duration_since(published).num_days(), which can be negative if evaluation_time is before published (possible with the deterministic clock override). Consider clamping negative values to 0 (or handling them explicitly) so “future” publish dates don’t lead to surprising staleness behavior.
| context | |
| .evaluation_time | |
| .signed_duration_since(published) | |
| .num_days() | |
| let days = context | |
| .evaluation_time | |
| .signed_duration_since(published) | |
| .num_days(); | |
| if days < 0 { 0 } else { days } |
| spec.version.as_deref(), | ||
| registry_key, | ||
| context, | ||
| evaluation_time, |
There was a problem hiding this comment.
evaluation_time is passed by value into evaluate_package_at_time inside the for spec in package_specs loop. chrono::DateTime<Utc> is not Copy, so this will move evaluation_time on the first iteration and fail to compile (or force unintended cloning elsewhere). Consider passing evaluation_time.clone() here, or change evaluate_package_at_time to take &DateTime<Utc> (and clone only where needed).
| evaluation_time, | |
| evaluation_time.clone(), |
| return Ok(None); | ||
| }; | ||
| let raw = raw.to_string_lossy(); | ||
| let parsed = chrono::DateTime::parse_from_rfc3339(raw.as_ref()) |
There was a problem hiding this comment.
SAFE_PKGS_EVALUATION_TIME parsing treats an empty value as an error (e.g., SAFE_PKGS_EVALUATION_TIME=), which can unexpectedly break execution in environments that set empty vars. Consider trimming and returning Ok(None) when the value is empty/whitespace before attempting RFC3339 parsing.
| let parsed = chrono::DateTime::parse_from_rfc3339(raw.as_ref()) | |
| let trimmed = raw.trim(); | |
| if trimmed.is_empty() { | |
| return Ok(None); | |
| } | |
| let parsed = chrono::DateTime::parse_from_rfc3339(trimmed) |
No description provided.