Support to validate Patient IDs against Estonian Patient ID schema in conditions for various rules #3168

Closed
gunterze opened this issue Apr 28, 2021 · 1 comment

Comments

@gunterze
Member

gunterze commented Apr 28, 2021

Support to validate Patient IDs against Estonian Patient ID schema in conditions for various rules. Configurable by condition property:

PatientID=§PATIENT_ID_EST - true, if Patient ID is compliant with Estonian Patient ID schema
PatientID!=§PATIENT_ID_EST - true, if Patient ID is not compliant with Estonian Patient ID schema

Estonian Patient ID schema:

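            // 11 digits: century/sex digit 1..6, two-digit year, month 01..12, day 01..31,
            // three-digit serial number and one check digit.
            final Pattern PATTERN = Pattern.compile("[1-6]\\d\\d(0[1-9]|1[0-2])(0[1-9]|[12]\\d|3[01])\\d\\d\\d\\d");

            @Override
            public boolean test(String s) {
                return PATTERN.matcher(s).matches() && verifyChecksum(s.toCharArray());
            }

            // Compares the 11th digit with the check digit computed from the first ten digits.
            private boolean verifyChecksum(char[] chars) {
                // First round: weights 1..9, 1.
                int checksum = checksum(chars, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1);
                if (checksum == 10) {
                    // Second round with shifted weights; a second remainder of 10 maps to 0.
                    checksum = checksum(chars, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3);
                    if (checksum == 10)
                        checksum = 0;
                }
                return chars[10] - '0' == checksum;
            }

            // Weighted sum of the first ten digits, modulo 11.
            private int checksum(char[] chars, int... weights) {
                int sum = 0;
                for (int i = 0; i < 10; i++) {
                    sum += (chars[i] - '0') * weights[i];
                }
                return sum % 11;
            }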
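
For trying the check outside the archive, the fragment above can be wrapped into a small standalone class. This is only a sketch: the class name `EstonianPatientIdDemo`, the `null` guard and the sample IDs are assumptions made for illustration and are not taken from the referenced commit.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical standalone wrapper around the schema check quoted in this issue.
final class EstonianPatientIdDemo implements Predicate<String> {
    private static final Pattern PATTERN = Pattern.compile(
            "[1-6]\\d\\d(0[1-9]|1[0-2])(0[1-9]|[12]\\d|3[01])\\d\\d\\d\\d");

    @Override
    public boolean test(String s) {
        // The null guard is an addition of this sketch.
        return s != null && PATTERN.matcher(s).matches() && verifyChecksum(s.toCharArray());
    }

    private static boolean verifyChecksum(char[] chars) {
        int checksum = checksum(chars, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1);
        if (checksum == 10) {
            checksum = checksum(chars, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3);
            if (checksum == 10)
                checksum = 0;
        }
        return chars[10] - '0' == checksum;
    }

    private static int checksum(char[] chars, int... weights) {
        int sum = 0;
        for (int i = 0; i < 10; i++) {
            sum += (chars[i] - '0') * weights[i];
        }
        return sum % 11;
    }

    public static void main(String[] args) {
        Predicate<String> isEstonianId = new EstonianPatientIdDemo();
        System.out.println(isEstonianId.test("37605030299")); // true: first-round checksum is 9
        System.out.println(isEstonianId.test("37605030795")); // true: first round gives 10, second round gives 5
        System.out.println(isEstonianId.test("37605030298")); // false: wrong check digit
        System.out.println(isEstonianId.test("37613030299")); // false: month 13 fails the pattern
    }
}
```

The second sample exercises the fallback weights, which only apply when the first weighted sum leaves a remainder of 10.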
@gunterze gunterze added the enhancement New feature or request label Apr 28, 2021
@gunterze gunterze added this to the 5.23.3 milestone Apr 28, 2021
@gunterze gunterze self-assigned this Apr 28, 2021
gunterze added a commit that referenced this issue Apr 28, 2021
@petrkalina
Collaborator

petrkalina commented May 4, 2021

Note: the check can still succeed on an arbitrary number that is not an Estonian Patient ID if:

  • it consists of 11 numeric characters
  • it matches the "[1-6]\\d\\d(0[1-9]|1[0-2])(0[1-9]|[12]\\d|3[01])\\d\\d\\d\\d" pattern
  • its last digit is the valid EE checksum of the preceding ten digits (for a random number that satisfies the two conditions above, the probability of this is 10%)

Overall, over the space of all PIDs, the false-positive probability is:

P(false positive rnd) =
P(syntax matches accidentally) * P(checksum matches accidentally) =
P(is 11 numerical digits) * P(first digit is 1..6) * P(birth month is a valid month, i.e. 01..12) * P(birth day is a valid day, i.e. 01..31) * P(checksum matches) =
P(is 11 numerical digits) * 6/10 * 12/100 * 31/100 * 0.1 =
P(11 digits) * 0.002232
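
The factor 0.002232 can be cross-checked by counting instead of multiplying probabilities: of the 10^11 strings of 11 digits, count those that satisfy the pattern and carry the one check digit that passes. A minimal sketch, assuming uniformly random digits; the class and method names are made up for illustration.

```java
// Hypothetical helper that recomputes the factor from the comment by counting.
final class FalsePositiveProbability {

    // Number of 11-digit strings accepted by pattern plus checksum.
    static long acceptedCount() {
        long firstDigit = 6;   // 1..6
        long year = 100;       // two unrestricted digits
        long month = 12;       // 01..12
        long day = 31;         // 01..31
        long serial = 1000;    // three unrestricted digits
        long checkDigit = 1;   // exactly one of the ten possible last digits passes
        return firstDigit * year * month * day * serial * checkDigit;
    }

    // P(false positive | PID is 11 digits), assuming all digit strings are equally likely.
    static double probabilityGivenElevenDigits() {
        long allElevenDigitStrings = 100_000_000_000L; // 10^11
        return (double) acceptedCount() / allElevenDigitStrings;
    }

    public static void main(String[] args) {
        System.out.println(acceptedCount());                 // 223200000
        System.out.println(probabilityGivenElevenDigits());  // about 0.002232
    }
}
```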

On an existing PID DB, P(false positive) can be estimated as

[count(invalid PIDs with valid format)  /  count(all PIDs)] * P(checksum accidentally matches) =
[count(invalid PIDs with valid format)  /  count(all PIDs)] * 0.1
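
A rough sketch of that estimate in Java follows. It assumes the caller can supply the PIDs known not to be Estonian Patient IDs (for example, IDs from another issuer) together with the total number of PIDs in the DB; how to obtain those inputs is not covered in this thread, and the class, method and parameter names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical estimator for the formula above.
final class FalsePositiveEstimate {
    private static final Pattern FORMAT = Pattern.compile(
            "[1-6]\\d\\d(0[1-9]|1[0-2])(0[1-9]|[12]\\d|3[01])\\d\\d\\d\\d");

    // Probability that the check digit of a format-matching random number is accepted.
    private static final double P_CHECKSUM_MATCH = 0.1;

    // nonEstonianPids: PIDs assumed to be known as NOT Estonian Patient IDs.
    // totalPids: number of all PIDs in the DB.
    static double estimate(List<String> nonEstonianPids, long totalPids) {
        if (totalPids <= 0)
            throw new IllegalArgumentException("totalPids must be positive");
        long formatMatches = nonEstonianPids.stream()
                .filter(pid -> FORMAT.matcher(pid).matches())
                .count();
        return (double) formatMatches / totalPids * P_CHECKSUM_MATCH;
    }

    public static void main(String[] args) {
        // Two of the four sample PIDs match the format; the DB is assumed to hold 10 PIDs.
        List<String> samples = List.of("37605030298", "ABC123", "12345", "47101010033");
        System.out.println(estimate(samples, 10)); // roughly 0.02
    }
}
```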
