update address regex in the standard pii policy #150
Just need to get that test to pass and this looks good - thanks for the fix!
Looks like the failing test is the one that runs the regex against the incorrect address examples and asserts that there is no match. Not sure why the first example would be considered incorrect. My recommendation would be a test that matches all or a portion of a correct address, as I'd personally want to catch potential mistakes caused by fat-fingering or by the way a source system inputs the data. For example, if street address, city, state, and postal code are in different columns, the data probably wouldn't get loaded into a database correctly. Failing Test:
I would agree - the fields shown as invalid addresses are still "address-like" and, as such, should be flagged, which apparently they now are! We have two options here:
Personally, I feel as though option 2 keeps the spirit of the test alive.
I like the second idea. I updated the test criteria to be more obviously incorrect, as you recommended.
Updated so the regex tests pass. The old descriptions were too address-adjacent, and users could plausibly have values like those in an actual dataset. The new ones are more obviously fake.
Updated the test that was failing: it expected no match on an "incorrect address", but the regex was landing on a valid portion of it.
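To illustrate the test-design point discussed above, here is a minimal sketch in Python. The pattern `ADDRESS_RE`, the helper `contains_address`, and all sample strings are hypothetical stand-ins, not the regex or fixtures from this PR. It shows why an "address-adjacent" negative example can still match when the regex is applied with a search: a valid street portion is enough to trigger a hit.

```python
import re

# Hypothetical pattern for illustration only; NOT the regex from this PR.
# It looks for a house number, up to four name tokens, and a street suffix.
ADDRESS_RE = re.compile(
    r"\b\d{1,6}\s+(?:[A-Za-z0-9.]+\s+){0,4}"
    r"(?:Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd|Drive|Dr|Lane|Ln)\b",
    re.IGNORECASE,
)


def contains_address(text: str) -> bool:
    """Return True if any portion of the text looks like a street address."""
    return ADDRESS_RE.search(text) is not None


# Made-up samples: full or partial correct addresses that should be flagged.
valid = ["123 Main St", "4500 N. Oak Avenue, Springfield, IL 62704"]

# Made-up "incorrect" address whose street portion is still valid,
# so a search-based check flags it.
address_adjacent = "123 Main St, Nowhere, ZZ 0000"

# Made-up negatives that are obviously not addresses.
clearly_fake = ["not an address", "12345", "Main"]

print([contains_address(v) for v in valid])
print(contains_address(address_adjacent))
print([contains_address(f) for f in clearly_fake])
```

Under these assumptions, negative test cases need to be clearly non-address strings; anything containing a plausible street portion will be flagged, which is arguably the desired behavior for a PII scan.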
Description
This update changes the templated Standard PII policy so that it more accurately finds US-based addresses.
Problem
Testing
Using the test dataset below, which is contained in my Mantium application, I ran both the template Standard PII policy, which includes an address scan, and my own custom policy. My regex correctly identified the addresses in the test dataset, whereas the template identified none.
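The comparison described above can be approximated outside the application with a small script. This is only a sketch: the two patterns, the `count_matches` helper, and the sample rows are made up for illustration and are not the policies or the dataset used in this PR.

```python
import re
from typing import Dict, Iterable


def count_matches(patterns: Dict[str, re.Pattern], rows: Iterable[str]) -> Dict[str, int]:
    """Count how many rows each named pattern flags at least once."""
    rows = list(rows)
    return {
        name: sum(1 for row in rows if pattern.search(row))
        for name, pattern in patterns.items()
    }


# Hypothetical stand-ins for an overly strict pattern and a broader one.
patterns = {
    "strict": re.compile(r"^\d+ [A-Z][a-z]+ Street$"),
    "broad": re.compile(
        r"\b\d{1,6}\s+(?:[A-Za-z0-9.]+\s+){0,4}(?:Street|St|Avenue|Ave|Road|Rd)\b",
        re.IGNORECASE,
    ),
}

# Made-up sample rows, not the PR's test dataset.
rows = [
    "742 Evergreen Ave, Springfield, OR 97477",
    "10 Downing St",
    "no address here",
]

counts = count_matches(patterns, rows)
print(counts)
```

With these made-up inputs the strict pattern flags no rows and the broad pattern flags the two address rows, which mirrors the kind of gap reported between the template policy and the custom policy.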
Standard Policy
Custom Policy
This contains the new regex
Test Dataset