@ericpsimon
Contributor

Summary

This PR implements Social Security Number (SSN) pattern detection for the FormatConstraint in Term, providing Deequ-compatible containsSocialSecurityNumber() functionality.

Changes

🎯 New SSN Detection Feature

  • FormatType::SocialSecurityNumber: New format type variant for SSN validation
  • Pattern Matching: Supports both hyphenated (XXX-XX-XXXX) and non-hyphenated (XXXXXXXXX) formats
  • Invalid SSN Exclusion: Automatically excludes known invalid SSNs:
    • Area numbers (first three digits) of 000, 666, or 900-999
    • Group numbers (middle two digits) of 00
    • Serial numbers (last four digits) of 0000
  • Threshold-based Validation: Configurable threshold for percentage of valid SSNs required
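
The format and exclusion rules above can be sketched as a small standalone function. This is only an illustration using the standard library, not the code from this PR: the name `is_plausible_ssn` is hypothetical, and it assumes the whole value must be an SSN (no surrounding text) and that partially hyphenated values are rejected.

```rust
/// Hypothetical helper (not part of Term): returns true when `value` is a
/// plausible SSN in either XXX-XX-XXXX or XXXXXXXXX form.
pub fn is_plausible_ssn(value: &str) -> bool {
    let bytes = value.as_bytes();

    // Accept exactly two layouts: 11 bytes with hyphens at positions 3 and 6,
    // or 9 bytes with no separators.
    let digits: Vec<u8> = match bytes.len() {
        11 if bytes[3] == b'-' && bytes[6] == b'-' => bytes
            .iter()
            .enumerate()
            .filter(|(i, _)| *i != 3 && *i != 6)
            .map(|(_, b)| *b)
            .collect(),
        9 => bytes.to_vec(),
        _ => return false,
    };

    // Everything that remains must be an ASCII digit.
    if digits.len() != 9 || !digits.iter().all(|b| b.is_ascii_digit()) {
        return false;
    }

    let area = &digits[0..3]; // first three digits
    let group = &digits[3..5]; // middle two digits
    let serial = &digits[5..9]; // last four digits

    // Exclusions listed in the PR description.
    area != b"000"
        && area != b"666"
        && area[0] != b'9'
        && group != b"00"
        && serial != b"0000"
}
```

Under these assumptions, `is_plausible_ssn("123-45-6789")` and `is_plausible_ssn("123456789")` are accepted, while `"666-12-3456"` and `"123-45-0000"` are rejected.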

🔧 API Additions

  • FormatConstraint::social_security_number() - Direct builder method for SSN validation
  • CheckBuilder::contains_ssn() - Convenience method on Check builder for easy SSN validation
  • Full integration with existing FormatConstraint infrastructure

✅ Comprehensive Testing

  • Valid SSN format testing (hyphenated and non-hyphenated)
  • Invalid pattern rejection testing
  • Mixed data validation with proper null handling
  • Threshold-based validation tests
  • Edge case coverage including historically used example SSNs
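
To make the threshold and null-handling tests concrete, here is a rough sketch of how a pass/fail decision over a column might be computed. It is not Term's implementation: `meets_threshold` and the `treat_null_as_valid` flag are hypothetical names, the matcher is passed in as a closure, and treating an empty column as passing is an assumption.

```rust
/// Hypothetical helper (not part of Term): returns true when the fraction of
/// values accepted by `is_match` is at least `threshold` (0.0 to 1.0).
pub fn meets_threshold<F>(
    values: &[Option<&str>],
    threshold: f64,
    treat_null_as_valid: bool,
    is_match: F,
) -> bool
where
    F: Fn(&str) -> bool,
{
    // Assumption: an empty column trivially satisfies the constraint.
    if values.is_empty() {
        return true;
    }

    // Count values that match; nulls count only when the flag says so.
    let matching = values
        .iter()
        .filter(|v| match **v {
            Some(s) => is_match(s),
            None => treat_null_as_valid,
        })
        .count();

    // Nulls stay in the denominator, so they lower the ratio unless they
    // are treated as valid.
    matching as f64 / values.len() as f64 >= threshold
}
```

With four values of which two match and one is null, the ratio is 0.5 when nulls are invalid and 0.75 when they are treated as valid, so a threshold of 0.75 passes only in the second case.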

Test Plan

  • Test valid SSN formats (XXX-XX-XXXX and XXXXXXXXX)
  • Verify invalid SSN patterns are rejected (000-xx-xxxx, xxx-00-xxxx, 666-xx-xxxx, 9xx-xx-xxxx)
  • Test threshold-based validation behavior
  • Verify null value handling with FormatOptions
  • Test edge cases and mixed data scenarios
  • All tests pass with cargo test test_ssn

Related

  • Linear Ticket: TER-338
  • Implements Deequ-compatible SSN detection for data validation

🤖 Generated with Claude Code

…Constraint

- Add FormatType::SocialSecurityNumber variant for SSN validation
- Support both hyphenated (XXX-XX-XXXX) and non-hyphenated (XXXXXXXXX) formats
- Automatically exclude known invalid SSNs (000-xx-xxxx, 666-xx-xxxx, 9xx-xx-xxxx, etc.)
- Add FormatConstraint::social_security_number() method for direct SSN validation
- Add CheckBuilder::contains_ssn() convenience method for builder pattern
- Add comprehensive test coverage with 5 test cases
- Update documentation in CHANGELOG, constraints reference, and migration guide
- Implement Deequ-compatible containsSocialSecurityNumber() functionality
@ericpsimon ericpsimon merged commit 31cca5d into main Sep 30, 2025
5 checks passed