Validate IBANs #3908

Open
tillprochaska opened this issue Jan 13, 2023 · 3 comments

Comments

@tillprochaska
Contributor

tillprochaska commented Jan 13, 2023

ingest-file extracts IBANs using a rather simple regex, which can lead to a lot of false positives. To improve precision, ingest-file could validate each match further:

  • Validating the length depending on country
  • Validating checksums
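A minimal sketch of what these two checks could look like, assuming the candidate string has already been matched by the extraction regex. The names `IBAN_LENGTHS`, `normalize`, `has_valid_length`, `has_valid_checksum` and `is_plausible_iban` are hypothetical and not part of ingest-file, and the length table is deliberately partial. The checksum is the standard mod-97 check: move the first four characters to the end, map letters to numbers (A=10 ... Z=35), and require a remainder of 1.

```python
import re

# Partial, illustrative table of IBAN lengths per country code.
# A real implementation would need the full IBAN registry.
IBAN_LENGTHS = {
    "AT": 20, "BE": 16, "CH": 21, "DE": 22, "ES": 24,
    "FR": 27, "GB": 22, "IT": 27, "NL": 18,
}


def normalize(candidate: str) -> str:
    """Strip whitespace and uppercase the candidate."""
    return re.sub(r"\s+", "", candidate).upper()


def has_valid_length(iban: str) -> bool:
    """Check the length against the expected length for the country prefix."""
    expected = IBAN_LENGTHS.get(iban[:2])
    return expected is not None and len(iban) == expected


def has_valid_checksum(iban: str) -> bool:
    """Mod-97 check on a normalized IBAN."""
    if not re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]+", iban):
        return False
    rearranged = iban[4:] + iban[:4]
    # int(ch, 36) maps "0".."9" to 0..9 and "A".."Z" to 10..35.
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def is_plausible_iban(candidate: str) -> bool:
    iban = normalize(candidate)
    return has_valid_length(iban) and has_valid_checksum(iban)
```

Countries missing from the table are rejected in this sketch; whether unknown prefixes should be rejected or passed through is a design decision that depends on how complete the table is.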

We should keep in mind that extraction often runs on OCR output, which may contain misrecognized characters. If an IBAN’s checksum isn’t correct, that may be because OCR misread a character, not because the match isn’t an IBAN.
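One way to account for OCR errors, sketched here purely as an illustration, would be to retry the checksum after swapping a single commonly confused character. The confusion table `OCR_CONFUSIONS` and the helper names are assumptions made for this example, not something ingest-file provides, and the input is assumed to be uppercase with whitespace removed.

```python
# Hypothetical table of character pairs that OCR tends to confuse.
OCR_CONFUSIONS = {
    "O": "0", "0": "O",
    "I": "1", "1": "I",
    "S": "5", "5": "S",
    "B": "8", "8": "B",
}


def passes_mod97(iban: str) -> bool:
    """Mod-97 check: move the first four characters to the end,
    map letters to numbers (A=10 ... Z=35), expect remainder 1."""
    if len(iban) < 5 or not (iban.isascii() and iban.isalnum()):
        return False
    rearranged = iban[4:] + iban[:4]
    return int("".join(str(int(ch, 36)) for ch in rearranged)) % 97 == 1


def repair_candidates(iban: str) -> list:
    """Return the input itself (if it passes) plus every variant with
    exactly one confusable character swapped that passes the check."""
    variants = [iban]
    for i, ch in enumerate(iban):
        alt = OCR_CONFUSIONS.get(ch)
        if alt is not None:
            variants.append(iban[:i] + alt + iban[i + 1:])
    return [v for v in variants if passes_mod97(v)]
```

A swapped variant can pass mod 97 by coincidence, so a result with more than one entry is ambiguous, and accepting repaired matches at all trades some precision for recall.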

@Okssana

Okssana commented Jan 13, 2023

I would add:

  • validating the first two characters, which stand for the country code
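As a rough sketch, such a prefix check could be a simple set lookup applied to the regex matches. The set `IBAN_COUNTRY_CODES` below is a small, partial sample chosen for illustration, and `has_known_country_code` is a hypothetical helper name.

```python
# Partial, illustrative set of country codes that issue IBANs.
IBAN_COUNTRY_CODES = {"AT", "BE", "CH", "DE", "ES", "FR", "GB", "IT", "NL"}


def has_known_country_code(candidate: str) -> bool:
    """Check whether the first two characters are a known IBAN country code."""
    return candidate[:2].upper() in IBAN_COUNTRY_CODES


# Example: filter a list of regex matches down to plausible prefixes.
matches = [
    "DE89370400440532013000",
    "ZZ89370400440532013000",  # not a country code
    "US12345678901234567890",  # the US does not use IBANs
]
kept = [m for m in matches if has_known_country_code(m)]
```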

@stchris
Contributor

stchris commented Jan 19, 2023

Perhaps as a first step we could check out how far we would get by using a library like https://pypi.org/project/schwifty/
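A first experiment with schwifty might look roughly like the sketch below. It assumes that constructing `schwifty.IBAN` raises an exception derived from `ValueError` for invalid input, which is worth verifying against the installed version. The wrapper name `is_valid_iban` is hypothetical, and the guarded import only keeps the snippet loadable when the package is missing.

```python
try:
    from schwifty import IBAN  # third-party: pip install schwifty
except ImportError:
    IBAN = None  # dependency not available


def is_valid_iban(candidate: str) -> bool:
    """Return True if schwifty accepts the candidate as an IBAN."""
    if IBAN is None:
        raise RuntimeError("schwifty is not installed")
    try:
        IBAN(candidate)
    except ValueError:
        # Assumption: schwifty's validation errors derive from ValueError.
        return False
    return True
```

Running this over the existing regex matches on a sample corpus would show how many matches the library rejects.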

@stchris
Contributor

stchris commented Jan 19, 2023

@Okssana what would greatly help here is a list of IBANs to test with, in either text or document form (PDFs, images). Would you be able to add some to this ticket if they come your way?

@stchris stchris transferred this issue from alephdata/ingest-file Oct 21, 2024

3 participants