Fix incorrect PESEL checksum validation in PlPeselRecognizer #1520
Closed
BlaiseCz wants to merge 7 commits into microsoft:main
Conversation
Collaborator
Thanks!
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Collaborator
Note that we have another issue that affects the CI (to be fixed in #1522); however, the PESEL recognizer tests are failing too. Would you mind taking a look?
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Collaborator
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Contributor
Pull Request Overview
The PR fixes an incorrect PESEL checksum validation in the PlPeselRecognizer that was causing valid PESEL numbers to be falsely rejected.
- Added input validation to check for an 11-digit number.
- Updated the checksum calculation to correctly compute the control digit.
- Modified the final check to compare the computed control digit with the last digit of the PESEL.
Collaborator
@BlaiseCz this is a great addition. Would you be interested in fixing the tests and merging this into the package?
Collaborator
Closing due to missing tests. Feel free to re-open if you're interested in completing the tests or would like support from the maintenance team.
Bug Description
The PESEL checksum validation in PlPeselRecognizer.validate_result() is incorrect. The current implementation does not compute the control digit correctly, which produces false negatives: valid PESEL numbers are rejected.
This affects Presidio's ability to recognize and validate PESEL numbers, which in turn impacts anonymization and sensitive data detection.
To Reproduce
Run the following test:
Note: if unsure whether a PESEL is valid, check it with this calculator: https://kalkulatory.gofin.pl/kalkulatory/sprawdzanie-pesel-weryfikacja-pesel
Observed Behavior
False for a valid PESEL due to incorrect checksum computation.
Expected Behavior
True.
Root Cause: Incorrect Checksum Calculation
The issue lies in the final checksum validation step. The existing code:
This incorrectly compares checksum directly to the last digit of the PESEL instead of computing the correct control digit.
Proposed Fix
The correct formula to compute the PESEL checksum is:
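A minimal sketch of the standard PESEL control-digit rule, offered as an illustration and not as the exact code from this PR: multiply the first ten digits by the weights 1, 3, 7, 9, 1, 3, 7, 9, 1, 3, sum the products, and take (10 - sum mod 10) mod 10. The function and constant names below are hypothetical and are not part of Presidio's API. The sketch checks only the length, the digits, and the checksum; it does not validate the encoded birth date.

```python
# Hypothetical helper names for illustration; not Presidio's actual code.
PESEL_WEIGHTS = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)


def pesel_control_digit(first_ten: str) -> int:
    """Return the control digit for the first ten digits of a PESEL."""
    weighted_sum = sum(int(d) * w for d, w in zip(first_ten, PESEL_WEIGHTS))
    # The control digit complements the weighted sum to a multiple of 10.
    # The outer modulo maps the case "weighted_sum % 10 == 0" to 0, not 10.
    return (10 - weighted_sum % 10) % 10


def has_valid_pesel_checksum(pesel: str) -> bool:
    """Check that the input is 11 ASCII digits and the last digit matches."""
    if len(pesel) != 11 or not (pesel.isascii() and pesel.isdigit()):
        return False
    return pesel_control_digit(pesel[:10]) == int(pesel[10])


if __name__ == "__main__":
    sample = "44051401359"  # commonly cited example of a valid PESEL
    weighted_sum = sum(int(d) * w for d, w in zip(sample[:10], PESEL_WEIGHTS))
    # Comparing the raw "sum mod 10" to the last digit rejects this number,
    # which is the kind of false negative described in the bug report.
    print(weighted_sum % 10 == int(sample[10]))
    print(has_valid_pesel_checksum(sample))
```

For the sample above, the weighted sum is 101, so the sum modulo 10 is 1 while the control digit is 9. A comparison against the raw remainder fails, whereas the complemented value matches the last digit.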
Why This Fix Works
Additional Context