feat: add Turkish National ID (TR_NATIONAL_ID) recognizer #1995
- Add `TrNationalIdRecognizer` with NVI checksum validation
- Add `country_specific/turkey/` directory structure
- Add unit tests with valid/invalid TCKN cases
- Update `default_recognizers.yaml`, `__init__.py`, `supported_entities.md`, `CHANGELOG.md`

Part of microsoft#1973
@microsoft-github-policy-service agree
Thanks for the CLA approval. This PR is now ready for review. Happy to make any changes if needed, especially around naming or checksum validation logic.
Pull request overview
Adds a new country-specific predefined recognizer to Presidio Analyzer for detecting Turkish National Identification Numbers (TCKN / TR_NATIONAL_ID), including checksum validation and registration in the default recognizer configuration and public docs.
Changes:
- Introduces `TrNationalIdRecognizer` with regex detection + checksum validation and context terms.
- Registers the recognizer in `predefined_recognizers/__init__.py` and `conf/default_recognizers.yaml` (disabled by default).
- Adds unit tests and updates `supported_entities.md` and `CHANGELOG.md`.
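The "regex detection + checksum validation" flow described above can be sketched with the standard library alone. This is only an illustration, not the PR's code: the pattern, the names `TCKN_PATTERN`, `checksum_ok`, and `find_tckn` are assumptions, and context-term handling is left out.

```python
import re

# Hypothetical pattern: exactly 11 digits, first digit non-zero,
# not embedded in a longer run of word characters.
TCKN_PATTERN = re.compile(r"\b[1-9][0-9]{10}\b")


def checksum_ok(candidate: str) -> bool:
    """Apply the commonly documented TCKN check-digit rule."""
    digits = [int(ch) for ch in candidate]
    tenth = (sum(digits[0:9:2]) * 7 - sum(digits[1:8:2])) % 10
    eleventh = sum(digits[:10]) % 10
    return digits[9] == tenth and digits[10] == eleventh


def find_tckn(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, value) for regex matches that pass the checksum."""
    return [
        (m.start(), m.end(), m.group())
        for m in TCKN_PATTERN.finditer(text)
        if checksum_ok(m.group())
    ]


hits = find_tckn("TCKN: 10000000146, ref 12345678901")
```

Only the first number survives here: the second matches the pattern but fails the checksum, which is the kind of false positive the validation step is meant to filter.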
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/turkey/tr_national_id_recognizer.py` | Implements the TR national ID recognizer (pattern + checksum validation + context). |
| `presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/turkey/__init__.py` | Exposes the Turkey-specific recognizer module. |
| `presidio-analyzer/presidio_analyzer/predefined_recognizers/__init__.py` | Exports `TrNationalIdRecognizer` from the predefined recognizers package. |
| `presidio-analyzer/presidio_analyzer/conf/default_recognizers.yaml` | Adds recognizer to default registry (disabled by default). |
| `presidio-analyzer/tests/test_tr_national_id_recognizer.py` | Adds unit tests for detection and checksum validation. |
| `docs/supported_entities.md` | Documents TR_NATIONAL_ID under a new “Turkey” section. |
| `CHANGELOG.md` | Notes the addition under the Unreleased Analyzer additions. |
@copilot Thank you for the review!

Regarding the NVI algorithm source: I've added a reference to the official NVI portal (https://tckimlik.nvi.gov.tr/) in the docstring at lines 15-16 to provide a concrete source for the checksum validation logic.

Regarding the false-positive tests: I've updated the PR description to remove the phone number reference, since those test cases aren't included in the test suite. The current tests cover the essential false-positive scenarios (short/long inputs, non-digits, invalid checksums).

Happy to make any additional adjustments if needed!
Adds Turkish National ID (TCKN / TC Kimlik Numarası) recognizer to Presidio Analyzer.
The TCKN is an 11-digit identification number issued to Turkish citizens and foreign residents. It is the primary piece of personally identifiable information (PII) covered by Turkey's Personal Data Protection Law (KVKK).
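For readers unfamiliar with the number's structure, here is a minimal sketch of the check-digit rule as it is commonly documented: the first digit is non-zero, the 10th digit is `(7 * sum of digits 1,3,5,7,9 - sum of digits 2,4,6,8) mod 10`, and the 11th digit is the sum of the first ten digits mod 10. The function name `is_valid_tckn` is hypothetical and the PR's own implementation may differ.

```python
def is_valid_tckn(value: str) -> bool:
    """Check an 11-digit string against the commonly documented TCKN rule."""
    # Reject anything that is not exactly 11 ASCII digits or starts with 0.
    if len(value) != 11 or not (value.isascii() and value.isdigit()):
        return False
    if value[0] == "0":
        return False

    digits = [int(ch) for ch in value]
    odd_sum = sum(digits[0:9:2])   # 1st, 3rd, 5th, 7th, 9th digits
    even_sum = sum(digits[1:8:2])  # 2nd, 4th, 6th, 8th digits

    # 10th digit: (7 * odd_sum - even_sum) mod 10.
    if (odd_sum * 7 - even_sum) % 10 != digits[9]:
        return False
    # 11th digit: sum of the first ten digits mod 10.
    return sum(digits[:10]) % 10 == digits[10]
```

Worked through for `10000000146`: the odd-position sum is 2 and the even-position sum is 0, so the 10th digit must be `(7 * 2 - 0) mod 10 = 4`; the first ten digits sum to 6, so the 11th digit must be 6. Both match, so the value passes.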
Features:
Issue reference
Part of #1973
Testing
- `test_tr_national_id_recognizer.py` with >=90% coverage on changed lines

Checklist
- `default_recognizers.yaml` with `enabled: false`
- `__init__.py` and `__all__`