Possibility for autoCorrection #22

Closed
lpatiny opened this issue Dec 8, 2022 · 3 comments · Fixed by #29

@lpatiny
Member

lpatiny commented Dec 8, 2022

Some symbols in the MRZ are difficult to tell apart, such as '0' (zero) and 'O' (the letter O).

This PR proposes to fix some of these mistakes, specifically here:

https://github.com/cheminfo/mrz/pull/21/files#diff-f76f9fb30ec34fcebf47111d3562b71aa630be9dff89cd28fbc4d812ac2b61cbR39-R41

For us, however, it is important that this 'autoCorrection' is optional (set through the second argument of the methods, options={}) and that the end user can see in result.details what has been changed.

This 'autoCorrection' should also apply wherever it can: in TD1, TD2 and TD3, and both from letters to numbers and from numbers to letters when a field can only contain one of the two.
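As a minimal sketch of the opt-in behaviour described above (not the implementation from the PR): the function name `autoCorrectText`, the type labels 'NUMERIC', 'CHARACTERS' and 'BOTH', and the `autocorrect` option name are assumptions made for illustration. Only the O/0 pair comes from this discussion; other look-alike pairs would have to be added deliberately.

```javascript
// Look-alike pairs, one table per direction. Only O/0 is taken from the issue.
const LETTER_TO_DIGIT = new Map([['O', '0']]);
const DIGIT_TO_LETTER = new Map([['0', 'O']]);

// Hypothetical helper: corrects `text` only when the caller opts in and the
// field is known to contain only digits ('NUMERIC') or only letters ('CHARACTERS').
function autoCorrectText(text, type, options = {}) {
  const { autocorrect = false } = options; // off by default
  if (!autocorrect || type === 'BOTH') return text;
  const table = type === 'NUMERIC' ? LETTER_TO_DIGIT : DIGIT_TO_LETTER;
  return Array.from(text, (char) => table.get(char) ?? char).join('');
}

console.log(autoCorrectText('74O812', 'NUMERIC')); // unchanged: option not set
console.log(autoCorrectText('74O812', 'NUMERIC', { autocorrect: true }));
console.log(autoCorrectText('R0SSI', 'CHARACTERS', { autocorrect: true }));
```

Keeping one table per direction makes it explicit that a mixed field ('BOTH') is never touched, since neither direction is safe there.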

@lpatiny
Member Author

lpatiny commented Jan 5, 2023

We need to be very rigorous: even with the autocorrect option (default false), we need to know exactly what has been corrected in the detailed information (array).

This means that in result.details we can add the autocorrect information (an empty array by default if there was no correction):

{
  label: 'Composite check digit',
  field: 'compositeCheckDigit',
  value: '6',
  valid: true,
  ranges: [ [Object], [Object], [Object], [Object] ],
  line: 1,
  start: 35,
  end: 36,
  autocorrect: [{line: 1, column: 35, original: '0', corrected: 'O'}]
}

For autocorrection we also need to know, for each field, whether it contains only numbers, only characters (letters), or both. Only fields that contain only numbers or only characters can be autocorrected.

For all the fields (and likely also at the level of fieldTemplates) we will probably need a property that specifies the allowed values (NUMERIC, CHARACTERS, BOTH)

We may have to check the original specification to be sure.
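The two ideas above (a per-field property for the allowed values, and a record of every correction) could be combined roughly as follows. This is a sketch under assumptions: `autoCorrectField`, the `type` property and the template used in the example are hypothetical, the positions are illustrative rather than taken from the specification, and only the O/0 pair is handled.

```javascript
// Substitution tables per field type; 'BOTH' has no table and is never corrected.
const LOOKALIKES = {
  NUMERIC: new Map([['O', '0']]),
  CHARACTERS: new Map([['0', 'O']]),
};

// Hypothetical helper: reads one field from the MRZ lines and returns the
// corrected value plus one entry per changed character.
function autoCorrectField(lines, template) {
  const { line, start, end, type } = template;
  const source = lines[line].slice(start, end);
  const table = LOOKALIKES[type]; // undefined for 'BOTH'
  const autocorrect = [];
  let value = '';
  for (let i = 0; i < source.length; i++) {
    const original = source[i];
    const corrected = table?.get(original) ?? original;
    if (corrected !== original) {
      autocorrect.push({ line, column: start + i, original, corrected });
    }
    value += corrected;
  }
  return { value, autocorrect };
}

// Illustrative template: a six-character numeric field at the start of line 1.
const example = autoCorrectField(['I<UTOD23145890', '74O8122F'], {
  line: 1, start: 0, end: 6, type: 'NUMERIC',
});
console.log(example.value, example.autocorrect);
```

The `autocorrect` array stays empty when nothing was changed, which matches the proposed default.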

@1000i100

1000i100 commented Jan 14, 2023

> Only fields that contain only numbers or only characters can be autocorrected.

I disagree:
if the check digit is read as a letter, you can correct it.
If you have the check digit, you can swap each occurrence of [O0] in an alphanumeric field and see whether one combination matches the computed check digit. This way, you can try to autocorrect alphanumeric fields.

@lpatiny
Member Author

lpatiny commented Jan 16, 2023

> Only fields that contain only numbers or only characters can be autocorrected.
>
> I disagree: if the check digit is read as a letter, you can correct it. If you have the check digit, you can swap each occurrence of [O0] in an alphanumeric field and see whether one combination matches the computed check digit. This way, you can try to autocorrect alphanumeric fields.

This would not be scientifically correct: if you have five '0' or 'O' characters in the serial number (which is not unlikely), you have to try the 2^5 possibilities, and it is statistically likely that one of them will match the check digit.

So the feature you suggest should rather be a separate option.
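To make the objection concrete, here is a small sketch that enumerates every 0/O combination of a field and keeps those whose check digit matches. The check digit follows the usual MRZ rule (weights 7, 3, 1 repeating, letters A to Z valued 10 to 35, '<' valued 0, sum modulo 10). The function names and the sample string are made up for illustration. Because a check digit has only ten possible values, several of the 32 candidates can be expected to match by chance.

```javascript
// MRZ check digit: weighted sum (7, 3, 1 repeating) modulo 10.
function checkDigit(text) {
  const weights = [7, 3, 1];
  let sum = 0;
  for (let i = 0; i < text.length; i++) {
    const char = text[i];
    let value;
    if (char === '<') value = 0;
    else if (char >= '0' && char <= '9') value = Number(char);
    else value = char.charCodeAt(0) - 55; // 'A' (code 65) maps to 10
    sum += value * weights[i % 3];
  }
  return sum % 10;
}

// Hypothetical helper: tries all 2^n assignments of '0'/'O' at the ambiguous
// positions and returns every candidate that matches the expected check digit.
function matchingCandidates(text, expected) {
  const positions = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === '0' || text[i] === 'O') positions.push(i);
  }
  const matches = [];
  for (let mask = 0; mask < 2 ** positions.length; mask++) {
    const chars = text.split('');
    positions.forEach((pos, bit) => {
      chars[pos] = (mask >> bit) & 1 ? 'O' : '0';
    });
    const candidate = chars.join('');
    if (checkDigit(candidate) === expected) matches.push(candidate);
  }
  return matches;
}

// Made-up field with five ambiguous positions.
const field = '0A0B0C0D0';
const matches = matchingCandidates(field, checkDigit(field));
console.log(matches.length, matches);
```

More than one candidate matches for this sample, so the check digit alone cannot tell which reading is the right one.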
