Douwe de Vries edited this page Jul 1, 2026 · 1 revision

Transform strategy

Active contributors: Douwe de Vries

Purpose

A transform strategy is the method chosen for changing or preserving values in selected columns. It combines user intent, the detected data type, and in-run state, so that repeated source values map to consistent outputs while processing remains local-first.

Directory layout

Path | Role
crates/csv-anonymizer-core/src/types.rs | Defines AnonymizationStrategy, ColumnControl, TransformContext, TransformReport, and process result types.
crates/csv-anonymizer-core/src/strategies/mod.rs | Dispatches strategies and detected data types to concrete transformation functions.
crates/csv-anonymizer-core/src/strategies/state.rs | Tracks pseudonym maps, opaque token maps, Smart replacements, reuse counts, collisions, and fallbacks.
crates/csv-anonymizer-core/src/strategies/structured.rs | Handles email, UUID, timestamp, phone, generic string, and opaque token transformations.
crates/csv-anonymizer-core/src/strategies/numeric.rs | Handles numeric ID and numeric value transformations while preserving width and decimal shape.
crates/csv-anonymizer-core/src/strategies/names.rs | Handles first name, last name, and full name replacement from local name pools.
crates/csv-anonymizer-core/src/strategies/redaction.rs | Selects typed redaction placeholders from data type or privacy evidence.
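
To make "preserving width and decimal shape" concrete, here is a minimal sketch of one way a numeric transformer could keep the layout of a value while changing its digits. The function name replace_digits_preserving_shape, the seed parameter, and the digit generator are all assumptions for illustration; they are not taken from numeric.rs.

```rust
/// Hypothetical helper (not from the crate): replaces every ASCII digit with
/// a digit derived from `seed`, leaving signs, decimal points, and any other
/// separators exactly where they were.
fn replace_digits_preserving_shape(value: &str, seed: u64) -> String {
    let mut state = seed;
    value
        .chars()
        .map(|c| {
            if c.is_ascii_digit() {
                // Linear congruential step; illustrative only, not cryptographic.
                state = state
                    .wrapping_mul(6364136223846793005)
                    .wrapping_add(1442695040888963407);
                let digit = ((state >> 33) % 10) as u32;
                // `digit` is always in 0..10, so this cannot fail.
                char::from_digit(digit, 10).unwrap()
            } else {
                c
            }
        })
        .collect()
}
```

With the same seed, the same input always produces the same output, and a value such as "-0012.50" keeps its sign, its total width, and the position of the decimal point. A real implementation would likely derive the seed from per-run state rather than take it as a parameter.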

Key abstractions

  • AnonymizationStrategy supports auto, pseudonymize, tokenize, localAi, mask, redact, and passThrough.
  • ColumnControl carries frontend overrides for type and strategy.
  • TransformContext identifies column name, column index, row index, and empty-value format during transformation.
  • TransformState preserves per-run mapping consistency and accumulates TransformReport counters.
  • TransformReport feeds Privacy report counters for reuse, tokens, Smart replacement values, rejections, and fallbacks.
  • STRUCTURED_SCALAR_REDACTION_WARNING explains cases where redaction placeholders can change JSON or YAML scalar types.
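
The per-run consistency that TransformState provides can be sketched as a lookup-or-insert map with counters. Everything below is a simplified stand-in: StateSketch, ReportSketch, pseudonym_for, and the (column index, source value) key are hypothetical names and choices, and the real TransformState and TransformReport track more than this.

```rust
use std::collections::HashMap;

/// Hypothetical counters; the real TransformReport tracks more than these two.
#[derive(Debug, Default)]
struct ReportSketch {
    reused_values: usize,
    generated_values: usize,
}

/// Hypothetical per-run state keyed by (column index, source value).
#[derive(Debug, Default)]
struct StateSketch {
    pseudonyms: HashMap<(usize, String), String>,
    report: ReportSketch,
}

impl StateSketch {
    /// Returns the pseudonym already assigned to `source` in this run, or
    /// stores the one produced by `generate` so later rows reuse it.
    fn pseudonym_for<F>(&mut self, column: usize, source: &str, generate: F) -> String
    where
        F: FnOnce(usize) -> String,
    {
        let key = (column, source.to_string());
        if let Some(existing) = self.pseudonyms.get(&key) {
            self.report.reused_values += 1;
            return existing.clone();
        }
        // The generator receives the number of mappings stored so far.
        let next_index = self.pseudonyms.len();
        let replacement = generate(next_index);
        self.pseudonyms.insert(key, replacement.clone());
        self.report.generated_values += 1;
        replacement
    }
}
```

Calling pseudonym_for twice with the same column and source value returns the same replacement and increments the reuse counter once, which is the behaviour the reuse counters in the Privacy report would reflect.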

How it works

graph TD
    Row[Input row] --> Selected{Column selected?}
    Selected -->|no| Original[Original value]
    Selected -->|yes| Strategy{Strategy}
    Strategy -->|passThrough| Original
    Strategy -->|mask| Mask[Mask characters]
    Strategy -->|redact| Placeholder[Typed placeholder]
    Strategy -->|tokenize| Token[Opaque token]
    Strategy -->|localAi| Smart{Map hit?}
    Smart -->|yes| Replacement[Smart replacement]
    Smart -->|no| Fallback[Rule-based fallback]
    Strategy -->|auto or pseudonymize| ByType[Detected-type transformer]
    Fallback --> ByType

transform_row_with_state walks each cell in a row and transforms only the cells in selected columns. Empty values pass through unchanged. Explicit strategies are handled before any type-based dispatch. localAi looks up TransformState::smart_replacement; when no replacement exists, it records a fallback and continues through rule-based pseudonymization. auto and pseudonymize dispatch by detected data type; Boolean, currency, percentage, country code, and enum values currently pass through unchanged.
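
The flow described above can be sketched for a single cell as follows. This is an illustration under stated assumptions, not the code in strategies/mod.rs: Strategy, State, transform_cell, and by_type are hypothetical names, the mask, redact, and token formats are invented, and the by_type stand-in ignores the detected data type (so it does not model the pass-through types).

```rust
use std::collections::HashMap;

/// Hypothetical mirror of the strategy options listed on this page.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Strategy {
    Auto,
    Pseudonymize,
    Tokenize,
    LocalAi,
    Mask,
    Redact,
    PassThrough,
}

/// Hypothetical per-run state: Smart replacements, tokens, fallback count.
#[derive(Debug, Default)]
struct State {
    smart_replacements: HashMap<String, String>,
    tokens: HashMap<String, String>,
    fallbacks: usize,
}

/// Stand-in for the detected-type transformers (assumption: ignores type).
fn by_type(value: &str) -> String {
    format!("pseudo-{}", value.len())
}

fn transform_cell(value: &str, selected: bool, strategy: Strategy, state: &mut State) -> String {
    // Unselected columns and empty values are returned untouched.
    if !selected || value.is_empty() {
        return value.to_string();
    }
    match strategy {
        Strategy::PassThrough => value.to_string(),
        Strategy::Mask => value.chars().map(|_| '*').collect::<String>(),
        Strategy::Redact => "[REDACTED]".to_string(),
        Strategy::Tokenize => {
            // Reuse the token for a repeated value, otherwise mint the next one.
            let next = state.tokens.len() + 1;
            state
                .tokens
                .entry(value.to_string())
                .or_insert_with(|| format!("tok_{:04}", next))
                .clone()
        }
        Strategy::LocalAi => match state.smart_replacements.get(value) {
            Some(replacement) => replacement.clone(),
            None => {
                // Map miss: count the fallback, then use the rule-based path.
                state.fallbacks += 1;
                by_type(value)
            }
        },
        Strategy::Auto | Strategy::Pseudonymize => by_type(value),
    }
}
```

The match is exhaustive on purpose: adding a variant to the enum makes the compiler flag this dispatch until the new strategy is handled.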

Integration points

Entry points for modification

  • Add a strategy enum value in crates/csv-anonymizer-core/src/types.rs and mirror it in frontend/src/types.ts.
  • Change strategy dispatch in crates/csv-anonymizer-core/src/strategies/mod.rs.
  • Change per-run reuse, report counters, token generation, or Smart replacement fallback counts in crates/csv-anonymizer-core/src/strategies/state.rs.
  • Change structured value behavior in crates/csv-anonymizer-core/src/strategies/structured.rs.
  • Change numeric shape preservation in crates/csv-anonymizer-core/src/strategies/numeric.rs.
  • Change local name pools or name-token handling in crates/csv-anonymizer-core/src/strategies/names.rs.
  • Change redaction placeholder selection in crates/csv-anonymizer-core/src/strategies/redaction.rs.
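
When adding a strategy value, it can help to keep the mapping between the Rust variants and the names shared with the frontend in one exhaustive place. The sketch below assumes the strategy names listed under Key abstractions are the strings exchanged with frontend/src/types.ts; the wire_name, from_wire_name, and ALL items are hypothetical, and the actual crate may rely on a serialization library instead.

```rust
/// Sketch of the strategy enum; variant names follow this page, the helper
/// methods are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnonymizationStrategy {
    Auto,
    Pseudonymize,
    Tokenize,
    LocalAi,
    Mask,
    Redact,
    PassThrough,
}

impl AnonymizationStrategy {
    /// Every variant, so lookups and tests can iterate over them.
    const ALL: [AnonymizationStrategy; 7] = [
        AnonymizationStrategy::Auto,
        AnonymizationStrategy::Pseudonymize,
        AnonymizationStrategy::Tokenize,
        AnonymizationStrategy::LocalAi,
        AnonymizationStrategy::Mask,
        AnonymizationStrategy::Redact,
        AnonymizationStrategy::PassThrough,
    ];

    /// Assumed frontend-facing name for each variant.
    fn wire_name(self) -> &'static str {
        match self {
            AnonymizationStrategy::Auto => "auto",
            AnonymizationStrategy::Pseudonymize => "pseudonymize",
            AnonymizationStrategy::Tokenize => "tokenize",
            AnonymizationStrategy::LocalAi => "localAi",
            AnonymizationStrategy::Mask => "mask",
            AnonymizationStrategy::Redact => "redact",
            AnonymizationStrategy::PassThrough => "passThrough",
        }
    }

    /// Parses a frontend-facing name; unknown names yield None.
    fn from_wire_name(name: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|s| s.wire_name() == name)
    }
}
```

A new variant fails to compile in wire_name until it is given a name, but ALL must be extended by hand, so a round-trip test over ALL is a useful guard.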

Key source files

  • crates/csv-anonymizer-core/src/types.rs
  • crates/csv-anonymizer-core/src/strategies/mod.rs
  • crates/csv-anonymizer-core/src/strategies/state.rs
  • crates/csv-anonymizer-core/src/strategies/structured.rs
  • crates/csv-anonymizer-core/src/strategies/numeric.rs
  • crates/csv-anonymizer-core/src/strategies/names.rs
  • crates/csv-anonymizer-core/src/strategies/redaction.rs