What does this PR do?
Add an env var / parameter to control whether data_diff includes raw sample values (up to 5 rows) in its tool output.
Why
Flagged during the v0.6.0 release review (Chaos Gremlin persona). data_diff currently sends sample diff rows to the LLM provider, which is a PII/PHI/PCI exposure path for regulated environments. v0.6.0 mitigates with a SKILL.md compliance callout ("prefer algorithm: 'profile'"), but a hard env-var / tool-parameter guard is the durable fix.
Proposed
- Add ALTIMATE_DATA_DIFF_INCLUDE_VALUES env var (default: 1, matches current behavior)
- Add include_sample_values tool parameter (default: inherit env)
- When disabled, replace d.values with (values redacted) in sample row output; keep row count and direction (source only / target only / updated)
- Org-wide override: env takes precedence over per-call parameter when set to 0
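
The precedence rules above could be sketched roughly as follows. This is an illustrative Python sketch, not the actual data_diff implementation: the function names `resolve_include_values` and `render_sample_row` are hypothetical, and it assumes that any env value other than `0` (including unset) counts as enabled, which the proposal does not spell out.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "ALTIMATE_DATA_DIFF_INCLUDE_VALUES"


def resolve_include_values(
    param: Optional[bool],
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Decide whether sample values may appear in tool output.

    Hypothetical helper. Assumption: only the literal "0" disables values;
    an unset variable falls back to the proposed default of "1".
    """
    raw = env.get(ENV_VAR, "1").strip()
    if raw == "0":
        # Org-wide override: env set to 0 wins over any per-call parameter.
        return False
    if param is not None:
        # Env allows values, so the per-call parameter decides.
        return param
    # Parameter omitted: inherit the (enabled) env setting.
    return True


def render_sample_row(
    direction: str,
    values: Mapping[str, object],
    include_values: bool,
) -> str:
    """Format one sample row, keeping the direction but optionally redacting values."""
    if include_values:
        shown = ", ".join(f"{key}={value!r}" for key, value in values.items())
    else:
        shown = "(values redacted)"
    return f"[{direction}] {shown}"
```

With this shape, a caller passing `include_sample_values=True` still gets redacted rows when the env var is `0`, while a caller can opt out per call even when the env var leaves values enabled.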
Deferred from
v0.6.0 release review. Filed at tag time per no-follow-up-PRs release policy.
Type of change