feat(testing): bulk fixture-driven corpus tests for every masking rule #54

@millerjp

Summary

Add a fixture-file corpus test harness to the repository so that every masking rule can be regression-tested against an arbitrarily large set of (input, expected_output) pairs without touching Go source code. New edge cases reported by users or found during audits can be added by appending a single line to a text file, with no code change required.


Background and motivation

The current test pyramid is solid:

  • Unit tests (rules_*_test.go) — cover the happy path and a curated set of fail-closed cases per rule.
  • BDD scenarios (tests/bdd/features/*.feature) — machine-readable specification examples from the requirements document; run with godog under the bdd build tag.
  • Matrix tests (rules_matrices_test.go) — cross-cutting idempotency and mask-character-override contracts for identity, financial, and health categories.
  • Fuzz targets (rules_fuzz_test.go) — invariant-checking (no panic, valid UTF-8, fail-closed) for parsing-heavy rules including email_address, phone_number, url, iban, postal_code, jwt_token, ipv6_address.

What is missing is a high-volume, human-editable fixture corpus. The BDD scenarios are excellent specifications, but they are deliberately minimal — each scenario demonstrates a distinct behavioural rule, not exhaustive format coverage. The unit tests are comprehensive for the inputs the original author thought of, but real-world data quickly surfaces formats nobody anticipated:

  • Phone numbers: 00352 vs +352, 352 bare, leading zeros, extension suffixes (+1-212-555-0100 x42), ITU-T vs NANP notation, dot separators (+1.212.555.0100), parentheses variants ((0044) 7911 123456), non-ASCII digits (Arabic-Indic ٠٧٩١١).
  • IBANs: space-separated vs compact, lowercase country codes, total lengths from 15 to 34 characters across different countries.
  • Payment card PANs: 13-digit Visa, 15-digit Amex, 19-digit Maestro, formatted with spaces/hyphens/none.
  • Postal codes: every country's regex variants, leading zeros, extended ZIP+4, lowercase, mixed case.
  • IP addresses: leading-zero octets, IPv6 abbreviations, zone IDs, mapped addresses.
  • Dates of birth: all three spec formats (ISO, slash, month-name), partial dates, edge years.

When a new real-world input is reported as incorrectly masked, there is currently no lightweight way to pin the fix. The regression case has to be added to a unit test, which requires a Go code change, a PR, and a review cycle. A plain-text fixture file lowers this friction to near zero.


Proposed design

Directory layout

tests/
  corpus/
    phone_number.txt
    mobile_phone_number.txt
    email_address.txt
    payment_card_pan.txt
    payment_card_pan_first6.txt
    payment_card_pan_last4.txt
    payment_card_cvv.txt
    payment_card_pin.txt
    iban.txt
    swift_bic.txt
    bank_account_number.txt
    uk_sort_code.txt
    us_aba_routing_number.txt
    monetary_amount.txt
    us_ssn.txt
    ca_sin.txt
    uk_nino.txt
    in_aadhaar.txt
    in_pan.txt
    au_medicare_number.txt
    sg_nric_fin.txt
    br_cpf.txt
    br_cnpj.txt
    mx_curp.txt
    mx_rfc.txt
    cn_resident_id.txt
    za_national_id.txt
    es_dni_nif_nie.txt
    ipv4_address.txt
    ipv6_address.txt
    mac_address.txt
    hostname.txt
    url.txt
    url_credentials.txt
    jwt_token.txt
    api_key.txt
    password.txt
    uuid.txt
    imei.txt
    imsi.txt
    msisdn.txt
    postal_code.txt
    geo_latitude.txt
    geo_longitude.txt
    geo_coordinates.txt
    date_of_birth.txt
    person_name.txt
    given_name.txt
    family_name.txt
    street_address.txt
    username.txt
    passport_number.txt
    driver_license_number.txt
    generic_national_id.txt
    tax_identifier.txt
    medical_record_number.txt
    health_plan_beneficiary_id.txt
    medical_device_identifier.txt
    diagnosis_code.txt
    prescription_text.txt

One file per rule. The rule name is the filename stem — the loader derives the rule name directly from the stem, so no mapping table is needed.

Fixture file format

Plain UTF-8 text, one pair per line:

# comment lines (start with #) and blank lines are ignored
input<TAB>expected_output

Example — phone_number.txt:

# E.164 — international dialling prefix
+44 7911 123456	+44 **** **3456
+1-800-555-0199	+1-***-***-0199
+33 1 42 86 83 26	+33 * ** ** **26
+352 26 12 34	+352 ** **34

# NANP local format
(555) 123-4567	(***) ***-4567
555-123-4567	***-***-4567

# UK domestic (no + prefix)
07911 123456	***** **3456

# 00-prefix international — with spaces (no + prefix, fail-closed)
00352 26 12 34	**************
0044 7911 123456	****************
0033 1 42 86 83 26	******************

# 00-prefix international — compact no spaces (also fail-closed — no + prefix)
00352261234	***********
00441234567890	**************

# Dot separator
+1.212.555.0100	+1.***.***.0100

# Fail-closed cases
1-800-FLOWERS	*************
+	*
+44	***
	
# Arabic-Indic digits
٠٧٩١١	*****

Rules:

  • Lines beginning with # (after optional whitespace) are ignored.
  • Blank lines are ignored.
  • Fields are separated by a single tab (\t). Tab was chosen over comma because masked output can contain commas, and over pipe because the library targets log-scrubbing where pipes appear in DSNs and connection strings.
  • The input field may be empty (an empty string before the tab). This encodes the "" → "" contract: empty input yields empty output.
  • The expected-output field may also be empty.
  • A line with no tab at all is a format error and must fail the test loudly (not silently skip).
  • Trailing newline is optional; \r\n line endings are normalised.
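
The rules above can be sketched as a small line parser. This is a minimal illustration, not the runner's actual code: `parseFixtureLine` and `errNoTab` are hypothetical names introduced here. The sketch treats a line as blank only when it has no tab, so a lone tab still counts as the empty-input fixture.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNoTab is a hypothetical error for a non-blank, non-comment line
// that has no tab separator.
var errNoTab = errors.New("malformed fixture line: no tab separator")

// parseFixtureLine is a hypothetical helper, not part of the library.
// ok is false for lines to skip; err is non-nil for lines that must
// fail the test loudly.
func parseFixtureLine(raw string) (string, string, bool, error) {
	raw = strings.TrimRight(raw, "\r") // normalise \r\n endings
	trimmed := strings.TrimSpace(raw)
	if strings.HasPrefix(trimmed, "#") {
		return "", "", false, nil // comment line
	}
	// Split on the first tab only; later tabs stay in the expected output.
	input, want, found := strings.Cut(raw, "\t")
	if !found {
		if trimmed == "" {
			return "", "", false, nil // blank line
		}
		return "", "", false, errNoTab
	}
	return input, want, true, nil
}

func main() {
	for _, line := range []string{"# comment", "", "+44\t***", "\t", "no-tab"} {
		in, want, ok, err := parseFixtureLine(line)
		fmt.Printf("%q: in=%q want=%q ok=%v err=%v\n", line, in, want, ok, err)
	}
}
```

One consequence of the comment rule is that an input beginning with `#` cannot be expressed in this format; if such inputs matter, the format would need an escape convention.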

Go test runner

New file: tests/corpus/corpus_test.go

//go:build corpus

package corpus_test

import (
    "bufio"
    "os"
    "path/filepath"
    "strings"
    "testing"
    "unicode/utf8"

    "github.com/axonops/mask"
)

// TestCorpus iterates every tests/corpus/*.txt file, derives the rule
// name from the filename stem, and asserts that mask.Apply(rule, input)
// == expected for each non-comment, non-blank line.
func TestCorpus(t *testing.T) {
    t.Parallel()

    files, err := filepath.Glob("*.txt")
    if err != nil || len(files) == 0 {
        t.Fatal("no corpus files found — run from tests/corpus/")
    }

    for _, path := range files {
        path := path
        rule := strings.TrimSuffix(filepath.Base(path), ".txt")

        t.Run(rule, func(t *testing.T) {
            t.Parallel()
            runCorpusFile(t, rule, path)
        })
    }
}

func runCorpusFile(t *testing.T, rule, path string) {
    t.Helper()

    f, err := os.Open(path)
    if err != nil {
        t.Fatalf("open %s: %v", path, err)
    }
    defer f.Close()

    lineNo := 0
    passed, failed := 0, 0

    sc := bufio.NewScanner(f)
    for sc.Scan() {
        lineNo++
        raw := sc.Text()

        // Normalise Windows line endings.
        raw = strings.TrimRight(raw, "\r")

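        // Skip comment lines and truly blank lines. A line that contains
        // a tab is a fixture even when both fields are empty (the
        // empty-input contract), so it must not be treated as blank.
        trimmed := strings.TrimSpace(raw)
        if strings.HasPrefix(trimmed, "#") || (trimmed == "" && !strings.Contains(raw, "\t")) {
            continue
        }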

        parts := strings.SplitN(raw, "\t", 2)
        if len(parts) != 2 {
            t.Errorf("%s:%d: malformed line (no tab separator): %q", path, lineNo, raw)
            failed++
            continue
        }

        input, want := parts[0], parts[1]
        got := mask.Apply(rule, input)

        if got != want {
            t.Errorf("%s:%d:\n  rule:  %s\n  input: %q\n  want:  %q\n  got:   %q",
                path, lineNo, rule, input, want, got)
            failed++
        } else {
            passed++
        }

        // Invariant: output is always valid UTF-8.
        if !utf8.ValidString(got) {
            t.Errorf("%s:%d: rule %s produced invalid UTF-8 for input %q: % x",
                path, lineNo, rule, input, []byte(got))
        }
    }
    if err := sc.Err(); err != nil {
        t.Fatalf("%s: scanner error: %v", path, err)
    }

    t.Logf("corpus %s: %d passed, %d failed (total %d)", rule, passed, failed, passed+failed)

    if failed > 0 {
        t.Logf("To update: edit %s and re-run 'make test-corpus'", path)
    }

    if passed+failed == 0 {
        t.Errorf("%s: file contained no test cases — add at least one fixture or delete the file", path)
    }
}

// TestCorpusCompleteness verifies that every rule registered in
// rule_names.go has a corresponding corpus fixture file. This prevents
// adding a new masking rule without a corpus file.
func TestCorpusCompleteness(t *testing.T) {
    files, err := filepath.Glob("*.txt")
    if err != nil {
        t.Fatal(err)
    }
    have := make(map[string]bool, len(files))
    for _, f := range files {
        have[strings.TrimSuffix(filepath.Base(f), ".txt")] = true
    }

    for _, name := range mask.RuleNames() {
        if !have[name] {
            t.Errorf("rule %q has no corpus file — create tests/corpus/%s.txt", name, name)
        }
    }
}

// TestCorpusUnknownRule checks that mask.Apply never echoes the input
// back for an unregistered rule name. It does not by itself detect a
// corpus file whose name matches no registered rule.
func TestCorpusUnknownRule(t *testing.T) {
    unknown := "zz_unknown_rule_for_test"
    result := mask.Apply(unknown, "anything")
    // mask.Apply is expected to full-redact for unknown rules. Assert
    // only that the input is not returned unchanged, so this test does
    // not depend on an unexported constant.
    if result == "anything" {
        t.Fatal("expected full-redact for unknown rule, got original value back")
    }
}

// BenchmarkCorpus_PhoneNumber provides a rough throughput figure for the
// phone_number corpus to catch any unexpected O(n²) behaviour introduced
// when the corpus grows.
func BenchmarkCorpus_PhoneNumber(b *testing.B) {
    f, err := os.Open("phone_number.txt")
    if err != nil {
        b.Skip("phone_number.txt not present")
    }
    defer f.Close()

    type pair struct{ in, want string }
    var cases []pair
    sc := bufio.NewScanner(f)
    for sc.Scan() {
        raw := strings.TrimRight(sc.Text(), "\r")
        trimmed := strings.TrimSpace(raw)
        if trimmed == "" || strings.HasPrefix(trimmed, "#") {
            continue
        }
        parts := strings.SplitN(raw, "\t", 2)
        if len(parts) == 2 {
            cases = append(cases, pair{parts[0], parts[1]})
        }
    }
    if len(cases) == 0 {
        b.Skip("no cases")
    }
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        c := cases[i%len(cases)]
        got := mask.Apply("phone_number", c.in)
        if got != c.want {
            b.Fatalf("mismatch: input=%q want=%q got=%q", c.in, c.want, got)
        }
    }
}

Note on the corpus build tag: This test file is gated behind //go:build corpus, meaning go test ./... will NOT run corpus tests. This is intentional — corpus tests are a separate quality gate. Contributors must use make test-corpus (or make check, which includes it). The CONTRIBUTING.md note (see below) must make this explicit.

Note on TestCorpusCompleteness: This test requires a mask.RuleNames() function that returns all registered built-in rule names. If this function does not already exist, it must be added as part of this issue. It should return a []string of all built-in rule name constants from rule_names.go.
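
As a rough sketch of what such a function could look like, assuming the library keeps its built-in rules in a map keyed by name (the `builtinRules` registry and the three constants below are stand-ins invented for this example; the real layout of rule_names.go may differ):

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical stand-ins for the RuleXxx constants in rule_names.go.
const (
	RulePhoneNumber  = "phone_number"
	RuleEmailAddress = "email_address"
	RuleIBAN         = "iban"
)

// builtinRules is an assumed registry layout, not the library's own.
var builtinRules = map[string]struct{}{
	RulePhoneNumber:  {},
	RuleEmailAddress: {},
	RuleIBAN:         {},
}

// RuleNames returns a fresh, sorted slice so that callers cannot mutate
// the registry and test output is deterministic.
func RuleNames() []string {
	names := make([]string, 0, len(builtinRules))
	for name := range builtinRules {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(RuleNames())
}
```

Returning a copy in sorted order keeps the completeness test's failure messages stable between runs, since Go map iteration order is randomised.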

Note on TestCorpusUnknownRule: The original version referenced mask.FullRedactMarker. To avoid depending on an unexported or potentially non-existent constant, the test instead verifies the result is not the original input value. If FullRedactMarker is an exported constant, prefer using it directly.
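
Because TestCorpusUnknownRule only checks that the input is not echoed back, a misnamed corpus file (for example a typo in the stem) could still go unnoticed. One possible complement is the reverse of the completeness check, sketched below. `orphanCorpusFiles` is a hypothetical helper, and the rule-name list would come from `mask.RuleNames()` once that function exists.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
	"strings"
)

// orphanCorpusFiles is a hypothetical helper: it returns every corpus
// file whose filename stem is not a registered rule name.
func orphanCorpusFiles(files, ruleNames []string) []string {
	known := make(map[string]bool, len(ruleNames))
	for _, name := range ruleNames {
		known[name] = true
	}
	var orphans []string
	for _, f := range files {
		stem := strings.TrimSuffix(filepath.Base(f), ".txt")
		if !known[stem] {
			orphans = append(orphans, f)
		}
	}
	sort.Strings(orphans)
	return orphans
}

func main() {
	files := []string{"iban.txt", "phone_numbr.txt"} // second stem has a typo
	rules := []string{"iban", "phone_number"}
	fmt.Println(orphanCorpusFiles(files, rules))
}
```

In a test, a non-empty result would be reported with `t.Errorf`, so a stray file fails the run with its own name in the message.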

Makefile targets

CORPUS_PKG := ./tests/corpus/...

.PHONY: test-corpus
test-corpus: ## Run corpus fixture tests
	@if [ -d tests/corpus ]; then \
		cd tests/corpus && $(GO) test -race -count=1 -tags corpus .; \
	else \
		echo "tests/corpus not present yet — skipping corpus run"; \
	fi

.PHONY: check
check: fmt-check vet lint tidy-check test test-bdd test-corpus coverage security

Add to the check target so corpus runs as part of the full quality gate.

CI workflow addition

In .github/workflows/ci.yml, add a job alongside the existing bdd job:

corpus:
  name: Corpus fixture tests
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-go@v5
      with:
        go-version-file: go.mod
    - name: Run corpus tests
      run: make test-corpus

Ensure the CI step runs from the repo root (the Makefile cd tests/corpus && handles the working directory).


Initial seed fixtures — priority matrix

The table below ranks categories by their real-world format surface area and therefore the return-on-investment for seeding fixtures first.

| Priority | Category | Rule(s) | Why high value |
|---|---|---|---|
| 🔴 P1 | Telecom | phone_number, mobile_phone_number | International dialling prefixes, 00NNN vs +NNN, dot separators, extensions, domestic formats for every major country code |
| 🔴 P1 | Financial | payment_card_pan, iban, swift_bic | Card scheme length variants (13/15/16/19), space vs hyphen vs no separator; IBAN per-country length (GB=22, DE=22, FR=27, NL=18, etc.) |
| 🔴 P1 | Country identity | us_ssn, uk_nino, br_cpf, in_aadhaar | Dashed vs compact, uppercase vs lowercase, leading zeros |
| 🟠 P2 | Technology | ipv4_address, ipv6_address, url, jwt_token | IPv6 abbreviation forms; URL userinfo with special chars; JWTs with non-standard padding |
| 🟠 P2 | Telecom | postal_code | UK outcodes (1-2 letters + 1-2 digits), US ZIP+4, Canadian FSA+LDU |
| 🟡 P3 | Identity | date_of_birth, email_address, person_name | ISO vs slash vs month-name dates; email punycode domains; CJK names |
| 🟡 P3 | Health | medical_record_number, health_plan_beneficiary_id | Prefix walk on non-ASCII letters (documented edge case) |

Phone number seed examples (illustrative — not exhaustive)

The following groups illustrate the kind of cases the phone_number.txt file should cover. This is not the complete file — it is a structured catalogue to guide the initial author.

# ── E.164 (leading +, up to 3-digit country code) ─────────────────────────
+1 212 555 0100         +1 *** *** 0100
+1-212-555-0100         +1-***-***-0100
+1.212.555.0100         +1.***.***.0100
+44 7911 123456         +44 **** **3456
+44-7911-123456         +44-****-**3456
+352 26 12 34           +352 ** **34
+33 1 42 86 83 26       +33 * ** ** **26
+81 3 1234 5678         +81 * **** **78
+86 138 0013 8000       +86 *** **** **00
+49 89 636 48018        +49 ** *** ***18
+55 11 91234 5678       +55 ** ***** **78
+61 2 9374 4000         +61 * **** **00
+7 495 123-45-67        +7 *** ***-**-67
+34 91 123 45 67        +34 ** *** **67
+39 02 1234 5678        +39 ** **** **78
+27 21 123 4567         +27 ** *** **67
+971 4 123 4567         +971 * *** **67
+82 2 1234 5678         +82 * **** **78
+65 6234 5678           +65 **** **78
+91 98765 43210         +91 ***** **210

# ── 00-prefix with spaces (ITU-T alternative to +) ───────────────────────
# These do NOT match the +NN prefix rule — no + is present.
# Expected output per fail-closed contract: same-length mask.
00352 26 12 34          **************
0044 7911 123456        ****************
001 212 555 0100        ****************

# ── 00-prefix compact (no spaces, no + prefix) ───────────────────────────
# Also fail-closed — pins the behaviour for the compact variant explicitly.
# If the rule is later taught to treat 00 as equivalent to +, these fixtures
# will surface the change as a deliberate, visible regression.
00352261234             ***********
00441234567890          **************
0013105551234           *************

# ── NANP domestic ─────────────────────────────────────────────────────────
(555) 123-4567          (***) ***-4567
555-123-4567            ***-***-4567
5551234567              ******4567
555 123 4567            *** *** 4567

# ── UK domestic ───────────────────────────────────────────────────────────
07911 123456            ***** **3456
020 7946 0958           *** **** **58
01632 960961            ***** ***61

# ── Fail-closed cases ─────────────────────────────────────────────────────
1-800-FLOWERS           *************
+                       *
+44                     ***
	
٠٧٩١١ ١٢٣٤٥٦          ************

Note on 00NNN vs +NNN: the current phone_number rule requires a literal + to identify the country code prefix. 00352 has no +, so the entire string is treated as a body — if it fails the digit-only check (because it contains spaces), it routes to SameLengthMask. The corpus file pins both the spaced and compact 00-prefix variants so any future change (e.g. teaching the rule to treat 00 as equivalent to +) is a deliberate, visible regression.
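
To show how the fail-closed expected values are derived, here is a minimal sketch that assumes SameLengthMask emits one mask character per rune. The 12-character Arabic-Indic fixture above, paired with 12 asterisks, is consistent with that assumption, but the library's behaviour should be confirmed before it is pinned. `sameLengthMask` is a hypothetical stand-in, not the library function.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// sameLengthMask is a hypothetical stand-in for the library's
// SameLengthMask. Assumption: one '*' per rune, not per byte.
func sameLengthMask(s string) string {
	return strings.Repeat("*", utf8.RuneCountInString(s))
}

func main() {
	for _, in := range []string{"00352 26 12 34", "00352261234", "٠٧٩١١ ١٢٣٤٥٦"} {
		fmt.Printf("%s\t%s\n", in, sameLengthMask(in))
	}
}
```

The program prints lines in the fixture format (input, tab, expected output), which is a convenient way to draft fail-closed fixtures before checking them against the real rule.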


IBAN seed examples

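The IBAN rule preserves the first 4 characters (country code + check digits) and the last 4 non-separator characters. Fixtures should cover all country codes in ISO 13616 because total length varies by country: the examples below range from 15 characters (Norway) to 31 (Malta), and the standard allows up to 34.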

Note: The v0.9.0 requirements document states that IBAN should "preserve grouping spaces if present." The implementer must verify whether the current implementation handles space-grouped IBANs. If it does, the space-grouped fixture below should show preserved separators with masking applied. If it does not, the fixture should show fail-closed behaviour. Either way, the corpus pins the actual current behaviour — any future change to add or remove space handling will surface as a regression.

# Compact form
GB82WEST12345698765432  GB82**************5432
DE89370400440532013000  DE89**************3000
FR7630006000011234567890189  FR76*******************0189
NL91ABNA0417164300      NL91**********4300
NO9386011117947         NO93*******7947
MT84MALT011000012345MTLCAST001S  MT84***********************001S

# Space-grouped — verify current behaviour and pin it
# If the rule handles spaces: preserve separators with masking
# If the rule does not: fail closed to same-length mask
# The implementer MUST run the rule against these inputs and record
# the actual output as the expected value.
GB82 WEST 1234 5698 7654 32  <VERIFY_AND_PIN_ACTUAL_OUTPUT>

# Lowercase country code (fail closed)
gb82WEST12345698765432  **********************

# Too short (fail closed)
GB82WEST               ********
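
The compact-form expectations above follow mechanically from the "first 4, last 4" description, which the sketch below illustrates. It is an illustration under stated assumptions, not the library's implementation: `maskCompactIBAN` and `looksLikeCompactIBAN` are hypothetical names, the 15 to 34 length window is assumed, and space-grouped input is deliberately left out because its behaviour still has to be verified.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// looksLikeCompactIBAN is a hypothetical shape check: two uppercase
// letters, two digits, then uppercase letters or digits, with an assumed
// total length of 15 to 34.
func looksLikeCompactIBAN(s string) bool {
	if len(s) < 15 || len(s) > 34 {
		return false
	}
	for i := 0; i < len(s); i++ {
		c := s[i]
		isUpper := c >= 'A' && c <= 'Z'
		isDigit := c >= '0' && c <= '9'
		switch {
		case i < 2 && !isUpper:
			return false
		case i >= 2 && i < 4 && !isDigit:
			return false
		case i >= 4 && !isUpper && !isDigit:
			return false
		}
	}
	return true
}

// maskCompactIBAN keeps the first 4 and last 4 characters and masks the
// rest; anything that fails the shape check becomes a same-length mask.
func maskCompactIBAN(s string) string {
	if !looksLikeCompactIBAN(s) {
		return strings.Repeat("*", utf8.RuneCountInString(s))
	}
	return s[:4] + strings.Repeat("*", len(s)-8) + s[len(s)-4:]
}

func main() {
	for _, in := range []string{"NO9386011117947", "NL91ABNA0417164300", "GB82WEST"} {
		fmt.Printf("%s\t%s\n", in, maskCompactIBAN(in))
	}
}
```

Generating draft expectations this way helps catch miscounted asterisks in hand-written fixtures, but the value that gets committed should still be the rule's actual output.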

Minimum fixture count — tiered by rule complexity

Not all rules have the same format surface area. The minimum fixture count is tiered:

| Tier | Rule type | Minimum fixtures | Examples |
|---|---|---|---|
| Simple | Full-redact or same-length wrappers | 10 | payment_card_cvv, payment_card_pin, password, private_key_pem, diagnosis_code, prescription_text, monetary_amount |
| Standard | Format-aware with limited variants | 20 | us_ssn, uk_nino, bank_account_number, mac_address, uuid, username, given_name, family_name |
| High-variance | Multiple separator styles, international formats, complex parsing | 50+ | phone_number, iban, payment_card_pan, postal_code, email_address, url, ipv6_address, date_of_birth, person_name |
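
If the tiered minimum is to be enforced rather than checked by hand, one option is a small counting helper like the sketch below. The names `countFixtures`, `minFixtures`, `ruleTier`, and `shortfall` are hypothetical, and the rule-to-tier map lists only three rules from the table for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countFixtures counts lines that contain a tab and are not comments.
func countFixtures(content string) int {
	n := 0
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(strings.TrimSpace(line), "#") {
			continue
		}
		if strings.Contains(line, "\t") {
			n++
		}
	}
	return n
}

// minFixtures holds the per-tier minimums from the table above.
var minFixtures = map[string]int{"simple": 10, "standard": 20, "high-variance": 50}

// ruleTier is a hypothetical, partial assignment of rules to tiers.
var ruleTier = map[string]string{
	"password": "simple",
	"us_ssn":   "standard",
	"iban":     "high-variance",
}

// shortfall reports how many more fixtures a rule needs to meet its
// tier minimum; rules without a tier have no minimum in this sketch.
func shortfall(rule, content string) int {
	need := minFixtures[ruleTier[rule]] - countFixtures(content)
	if need < 0 {
		return 0
	}
	return need
}

func main() {
	content := "# seed\nhunter2\t*******\n\t\n"
	fmt.Println(countFixtures(content), shortfall("password", content))
}
```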

CONTRIBUTING.md addition

Add the following section to CONTRIBUTING.md under the testing guidance:

### Corpus Fixture Tests

The `tests/corpus/` directory contains bulk fixture files for regression testing. Each file
is named after a masking rule (e.g., `phone_number.txt`) and contains tab-separated
`input<TAB>expected_output` pairs, one per line.

**To run corpus tests:**

```bash
make test-corpus
```

**Important:** `go test ./...` does NOT run corpus tests because they are gated behind the
`corpus` build tag. Always use `make test-corpus` or `make check` (which includes it).

To add a fixture for a bug report:

  1. Identify the masking rule (e.g., phone_number).
  2. Determine the expected masked output for the problematic input.
  3. Append a line to tests/corpus/<rule_name>.txt:
    +33 (0)1 42 86 83 26	+33 (*)*  ** ** **26
    
  4. Run make test-corpus to confirm the test fails (red).
  5. Fix the rule implementation.
  6. Run make test-corpus to confirm the test passes (green).
  7. Open a PR. The fixture line IS the regression test.

***

## Contribution workflow

Once the harness is in place, the workflow for a reported bug becomes:

1. Receive report: `phone_number` applied to `+33 (0)1 42 86 83 26` returns wrong output.
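2. Confirm the bug locally: `printf '+33 (0)1 42 86 83 26\t<expected>\n' >> tests/corpus/phone_number.txt && make test-corpus`. Use `printf` rather than `echo`, because `echo` does not expand `\t` consistently across shells and would write a line with no tab.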
3. Fix the rule in `rules_telecom.go`.
4. Re-run `make test-corpus` — green.
5. Open a PR. The fixture line is the regression test. No Go test code written.

***

## Out of scope for this issue

- A generator script that auto-produces fixture lines from public number registries (useful, separate issue).
- Integration with the fuzz seed corpus (`testdata/fuzz/`) — the fuzz targets already have their own seeds and the corpus files serve a different purpose (exact input/output pinning rather than invariant checking).
- A `--update` / golden-file rewrite flag — the corpus format is deliberately simple and human-editable; an update flag would undermine the regression-detection purpose.

***

## Implementation notes

- **`mask.RuleNames()` function:** The `TestCorpusCompleteness` test requires a function that returns all registered built-in rule names as a `[]string`. If this function does not already exist in the public API, it must be added as part of this issue. It should return the names from the `RuleXxx` constants in `rule_names.go`.
- **Working directory:** The Makefile `test-corpus` target uses `cd tests/corpus &&` to set the working directory. The CI step runs from the repo root and relies on the Makefile to handle this. Do not set a custom working directory in the CI workflow step.
- **IBAN space-grouped behaviour:** The implementer must verify whether the current `iban` rule handles space-grouped input or fails closed, and pin whichever behaviour exists. If this reveals a gap between the requirements doc and the implementation, file a separate issue to track the discrepancy.

***

## Acceptance criteria

- [ ] `tests/corpus/` directory exists with one `.txt` file per rule (matching every `RuleXxx` constant in `rule_names.go`).
- [ ] `tests/corpus/corpus_test.go` compiles under `-tags corpus` and passes `go vet`.
- [ ] `TestCorpusCompleteness` exists and verifies every rule in `mask.RuleNames()` has a corresponding `.txt` file.
- [ ] `mask.RuleNames()` function exists in the public API if it does not already (returns `[]string` of all built-in rule names).
- [ ] `make test-corpus` passes on a clean checkout.
- [ ] `make check` includes `test-corpus`.
- [ ] CI `corpus` job is green on `main`.
- [ ] Fixture count meets the tiered minimum: 10 for simple rules, 20 for standard rules, 50+ for high-variance rules.
- [ ] `phone_number.txt` covers `+NNN`, NANP, UK domestic, dot separators, and explicitly pins both spaced and compact `00NNN` fail-closed behaviour.
- [ ] `iban.txt` covers at least GB, DE, FR, NL, NO, and MT (range of body lengths), and pins the space-grouped behaviour (whichever it currently is).
- [ ] `payment_card_pan.txt` covers 13-digit (legacy Visa), 15-digit (Amex), 16-digit (Visa/MC), 19-digit (Maestro), and all separator variants.
- [ ] `CONTRIBUTING.md` is updated with the corpus fixture section explaining how to add fixtures, with an explicit note that `make test-corpus` (not `go test ./...`) is the command to run.
