purview-v1.11.8
PAX Purview Audit Log Processor v1.11.8
Version 1.11.8 is a major capability release built around a new -Deidentify switch — an opt-in, one-way anonymization mode that replaces every personally- and company-identifying value across PAX's outputs with deterministic, format-preserving tokens, so anonymized data can be shared for reporting without exposing identities. Because the rollup (Python) and raw (PowerShell) engines share one salt and algorithm, the same person always resolves to the same token across every raw and rolled-up file, and anonymization preserves all relationships — the manager / org hierarchy, the user-to-activity joins, and distinct-resource counts — so anonymized data still drives the same dashboards and produces the same aggregate numbers as the identified data. -Deidentify is OFF by default; when it is not supplied, every output is byte-identical to v1.11.7.
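As an illustration of how a deterministic, format-preserving token can be derived with a shared salt, here is a minimal Python sketch. It is not PAX's actual code: the salt value, the normalization (trim plus case-fold), the token layouts, and the function names `token_upn` and `token_guid` are all assumptions made for this example.

```python
import hashlib
import hmac
import uuid

# Hypothetical salt. The release notes do not describe how PAX creates or stores its salt.
SALT = b"example-salt-not-from-pax"


def _digest(value: str) -> bytes:
    """HMAC-SHA256 over a normalized value (trimmed and case-folded; an assumption)."""
    normalized = value.strip().casefold()
    return hmac.new(SALT, normalized.encode("utf-8"), hashlib.sha256).digest()


def token_upn(upn: str) -> str:
    """Return a UPN-shaped token. The layout is illustrative, not PAX's format."""
    _, _, domain = upn.strip().casefold().partition("@")
    user_part = _digest(upn).hex()[:12]       # depends on the whole address
    domain_part = _digest(domain).hex()[:10]  # depends on the domain only
    return f"u{user_part}@d{domain_part}.example"


def token_guid(guid: str) -> str:
    """Return a GUID-shaped token built from the first 16 digest bytes."""
    return str(uuid.UUID(bytes=_digest(guid)[:16]))


if __name__ == "__main__":
    print(token_upn("Alice@Contoso.com"))
    print(token_guid("3f2504e0-4f89-11d3-9a0c-0305e82c3301"))
```

Because the output depends only on the salt and the normalized input, any engine that applies the same function with the same salt produces the same token for the same identity, which is the property that keeps joins and distinct counts intact.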
Version 1.11.8 also adds a built-in org / manager hierarchy to the AI-in-One (AIO) and AI Business Value (AIBV) rollups, derived automatically from the Entra data PAX already collects and emitted as ready-to-model columns on the Users output, accompanied by a new optional -FillerLabel switch.
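To make the hierarchy columns concrete, the following Python sketch derives a level, a top-down management chain, and direct and total report counts from a plain user-to-manager mapping. It is an assumption-laden illustration, not PAX's implementation: the function name `build_hierarchy`, the output keys, and the convention that the top of the organization is level 1 are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def build_hierarchy(manager_of: Dict[str, Optional[str]]) -> Dict[str, dict]:
    """manager_of maps each user key to its manager's key (None at the top)."""
    # Invert the mapping: manager -> list of direct reports.
    reports: Dict[str, List[str]] = defaultdict(list)
    for user, mgr in manager_of.items():
        if mgr is not None and mgr in manager_of:
            reports[mgr].append(user)

    def chain(user: str) -> List[str]:
        """Walk upward to the top, then reverse so the chain reads top-down."""
        path, seen, cur = [], set(), user
        while cur is not None and cur not in seen:  # 'seen' guards against cycles
            seen.add(cur)
            path.append(cur)
            mgr = manager_of.get(cur)
            cur = mgr if mgr in manager_of else None
        return path[::-1]

    def total_reports(user: str) -> int:
        """Count everyone beneath the user with an iterative depth-first walk."""
        count, seen, stack = 0, {user}, list(reports[user])
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            count += 1
            stack.extend(reports[cur])
        return count

    result = {}
    for user in manager_of:
        path = chain(user)
        result[user] = {
            "Level": len(path),                 # hypothetical: top of org is level 1
            "Chain": path,                      # top-down, ending at the user
            "Manager": path[-2] if len(path) > 1 else None,
            "DirectReports": len(reports[user]),
            "TotalReports": total_reports(user),
        }
    return result


if __name__ == "__main__":
    org = {"ceo": None, "vp": "ceo", "a": "vp", "b": "vp"}
    for key, row in build_hierarchy(org).items():
        print(key, row)
```

The sketch only relies on keys being equal or not equal, so it would behave the same whether the keys are real identities or deterministic tokens.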
What's new in v1.11.8
- Anonymous-style reporting with the new -Deidentify switch. Identifying values (UPNs and email addresses, display names, Entra and mailbox GUIDs, SIDs, resource URLs, file and document names, proxy addresses, IP addresses, and session / token identifiers) are replaced with deterministic salted hashes (HMAC-SHA256) rendered as format-preserving tokens (a UPN stays UPN-shaped, a GUID stays GUID-shaped). The mapping is irreversible: no decode map is ever written. Organizational and analytical attributes (job title, department, division, cost centre, country, company, license status, agent and application names) are intentionally kept so anonymized data remains analytically useful.
- Relationships and manager hierarchy preserved under deidentification. Because the hash is deterministic and normalization-stable, the same identity always maps to the same token everywhere it appears, including manager-link columns, so the org hierarchy, the fact-to-Users UserKey join, cross-run append history, and distinct-resource counts all stay correct. An anonymized dataset yields the same dashboard structure and the same aggregate totals as the identified one.
- Full-fidelity raw audit data under deidentification. The nested AuditData / CopilotEventData JSON is anonymized in place rather than redacted: only the personal leaves (decided by each field's JSON key path, not by guessing from the value) are replaced with the same tokens used everywhere else. Timestamps are preserved exactly, IP addresses are tokenized to valid-shaped addresses, and any blob that cannot be parsed is fully redacted as a safety net.
- Safe-by-default integration with resume, append, and existing switches. The setting is shown in the run's parameter snapshot (terminal, log, and checkpoint), persisted in the checkpoint and restored on -Resume so an interrupted anonymized run stays anonymized, and a guard hard-stops any append (-AppendFile / -AppendUserInfo) that would mix anonymized and identified data into one file or re-hash already-anonymized rows.
- Built-in org / manager hierarchy for the AIO and AIBV rollups. The Users output now includes each person's level, top-to-them management chain, manager, and team-size measures (direct reports and the total number of people beneath them), which is exactly what a Power BI model needs to roll metrics up by team, department, or management chain and to build leader and drill-down views. It is produced on every AIO/AIBV rollup with no extra switch and adds only new columns (all existing rollup, fact, raw audit, and EntraUsers columns are unchanged; the M365 Usage dashboard is unaffected), and it is fully compatible with -Deidentify and with incremental -AppendUserInfo runs. A new optional -FillerLabel switch controls how the hierarchy's unused deeper levels are displayed (left blank by default).
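The key-path approach to anonymizing nested audit JSON can be sketched roughly as follows. This is an illustration under stated assumptions, not PAX's code: the path list `PERSONAL_PATHS`, the `tok_` token layout, the `[REDACTED]` placeholder, and the function names are hypothetical, and the sketch does not attempt the format-preserving rendering described above.

```python
import hashlib
import hmac
import json

# Hypothetical salt and path list; PAX's real values are not given in these notes.
SALT = b"example-salt-not-from-pax"
PERSONAL_PATHS = {
    ("UserId",),
    ("ClientIP",),
    ("CopilotEventData", "AccessedResources", "*", "SiteUrl"),  # "*" = any list item
}


def tokenize(value: str) -> str:
    """Deterministic HMAC-SHA256 token (simplified, not format-preserving)."""
    normalized = value.strip().casefold().encode("utf-8")
    return "tok_" + hmac.new(SALT, normalized, hashlib.sha256).hexdigest()[:16]


def _walk(node, path):
    """Rebuild the structure, replacing only string leaves whose key path is listed."""
    if isinstance(node, dict):
        return {key: _walk(child, path + (key,)) for key, child in node.items()}
    if isinstance(node, list):
        return [_walk(child, path + ("*",)) for child in node]
    if isinstance(node, str) and path in PERSONAL_PATHS:
        return tokenize(node)
    return node  # timestamps, types, counts and so on pass through untouched


def anonymize_blob(raw: str) -> str:
    """Anonymize a JSON blob in place; redact it entirely if it cannot be parsed."""
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return "[REDACTED]"
    return json.dumps(_walk(parsed, ()))


if __name__ == "__main__":
    sample = json.dumps({
        "UserId": "alice@contoso.com",
        "CreationTime": "2024-01-01T00:00:00",
        "CopilotEventData": {
            "AccessedResources": [
                {"SiteUrl": "https://contoso.sharepoint.com/x", "Type": "File"}
            ]
        },
    })
    print(anonymize_blob(sample))
    print(anonymize_blob("{not valid json"))
```

Deciding by key path rather than by inspecting values means a field is treated the same way regardless of what it happens to contain, and the parse-failure branch mirrors the "fully redacted as a safety net" behaviour described in the list.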
The attached script is the v1.11.8 release build. See the documentation in the repository for full configuration details.