-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added inference function for Phone Number logical type #1357
Conversation
# Conflicts: # docs/source/release_notes.rst
# Conflicts: # docs/source/release_notes.rst
Codecov Report
@@ Coverage Diff @@
## main #1357 +/- ##
=======================================
Coverage 99.95% 99.95%
=======================================
Files 94 94
Lines 9680 9711 +31
=======================================
+ Hits 9676 9707 +31
Misses 4 4
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work with this! I like the diversity of phone numbers that you are able to capture with this regex, and the tests look great to me! Left a few nits, but nothing blocking.
@@ -10,7 +10,7 @@ | |||
from woodwork.type_sys.utils import _get_ltype_class, _get_specified_ltype_params | |||
from woodwork.utils import _is_s3, _is_url | |||
|
|||
SCHEMA_VERSION = "11.3.0" | |||
SCHEMA_VERSION = "12.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this schema update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constant is used for some tests to validate that the correct logical types are being inferred based on what we know they should be able to infer when performing deserialization.
For example, test_deserialize_validation_control
here is validating that this URL location in S3 has a csv that can be read in, as well as typing information about the logical types in the csv. I uploaded this to the S3 bucket as the latest update of the typing information (which contains the PhoneNumber logical type).
# Current inference function does not match lack of area code | ||
invalid_row = pd.Series( | ||
{17: "252 9384", 18: "+1 194 129 1991", 19: "+001 236 248 8482"}, | ||
name="phone_number", | ||
).astype(dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we support phone numbers with country codes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this supports country codes but the only one we currently support is +1 which is USA/Canada (and several Caribbean nations)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason that 18 is on there is because 18 has an invalid area code, USA area codes have to begin with a number between 2 and 9 inclusive.
Regarding 19, now that I think about it +001 is a valid country code for the US if someone is calling from another nation, so I think I'll update the regex to reflect this, thanks for pointing it out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay, that makes sense. Thanks for explaining!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Closes #1146
After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of
docs/source/release_notes.rst
to include this pull request.