Added inference function for Phone Number logical type #1357

Merged
merged 14 commits into from
Mar 29, 2022

Conversation

ParthivNaresh
Collaborator

Closes #1146


After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of docs/source/release_notes.rst to include this pull request.

@CLAassistant

CLAassistant commented Mar 24, 2022

CLA assistant check
All committers have signed the CLA.

@ParthivNaresh ParthivNaresh self-assigned this Mar 25, 2022
@codecov

codecov bot commented Mar 28, 2022

Codecov Report

Merging #1357 (379a01b) into main (ac2af2f) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #1357   +/-   ##
=======================================
  Coverage   99.95%   99.95%           
=======================================
  Files          94       94           
  Lines        9680     9711   +31     
=======================================
+ Hits         9676     9707   +31     
  Misses          4        4           
Impacted Files                                        Coverage Δ
woodwork/config.py                                    100.00% <ø> (ø)
woodwork/tests/accessor/test_serialization.py         100.00% <ø> (ø)
woodwork/type_sys/type_system.py                      100.00% <ø> (ø)
woodwork/logical_types.py                             100.00% <100.00%> (ø)
woodwork/serializers/serializer_base.py               100.00% <100.00%> (ø)
woodwork/tests/accessor/test_table_accessor.py        100.00% <100.00%> (ø)
woodwork/tests/conftest.py                            100.00% <100.00%> (ø)
woodwork/tests/logical_types/test_logical_types.py    100.00% <100.00%> (ø)
woodwork/tests/schema/test_table_schema.py            100.00% <100.00%> (ø)
woodwork/type_sys/inference_functions.py              100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@ParthivNaresh ParthivNaresh marked this pull request as ready for review March 29, 2022 11:27
Contributor

@bchen1116 bchen1116 left a comment

Great work with this! I like the diversity of phone numbers that you are able to capture with this regex, and the tests look great to me! Left a few nits, but nothing blocking.

woodwork/logical_types.py (outdated)
@@ -10,7 +10,7 @@
from woodwork.type_sys.utils import _get_ltype_class, _get_specified_ltype_params
from woodwork.utils import _is_s3, _is_url

-SCHEMA_VERSION = "11.3.0"
+SCHEMA_VERSION = "12.0.0"
Contributor

What is this schema update?

Collaborator Author

Some tests use this constant to validate that, during deserialization, the inferred logical types match the ones we expect.

For example, test_deserialize_validation_control validates that a URL location in S3 holds a CSV that can be read in, along with typing information about the logical types in that CSV. I uploaded a new copy to the S3 bucket as the latest update of the typing information (which now contains the PhoneNumber logical type).
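
To make the role of a version constant like this more concrete, here is a minimal sketch of a version comparison at deserialization time. Everything in it is an assumption for illustration: `parse_version` and `is_compatible` are hypothetical helpers, and the rule used (same major version, saved version not newer than the current one) is not taken from this PR or from Woodwork's code.

```python
from typing import Tuple

# Value taken from the diff above; everything else here is hypothetical.
SCHEMA_VERSION = "12.0.0"


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.patch' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(saved_version: str, current_version: str = SCHEMA_VERSION) -> bool:
    """Assumed rule: majors must match and the saved schema must not be newer."""
    saved = parse_version(saved_version)
    current = parse_version(current_version)
    return saved[0] == current[0] and saved <= current


if __name__ == "__main__":
    for candidate in ("11.3.0", "12.0.0", "12.1.0"):
        print(candidate, is_compatible(candidate))
```

Under this assumed rule, typing information saved with schema 11.3.0 would be flagged once the constant moves to 12.0.0, which is one reason a freshly serialized file would need to be uploaded alongside a major bump.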

Comment on lines 349 to 353
# Current inference function does not match lack of area code
invalid_row = pd.Series(
    {17: "252 9384", 18: "+1 194 129 1991", 19: "+001 236 248 8482"},
    name="phone_number",
).astype(dtype)
Contributor

Do we support phone numbers with country codes?

Collaborator Author

Yes, this supports country codes, but the only one we currently support is +1, which covers the USA and Canada (and several Caribbean nations).

Collaborator Author

@ParthivNaresh ParthivNaresh Mar 29, 2022

Entry 18 is in there because it has an invalid area code: USA area codes have to begin with a digit between 2 and 9 inclusive.

Regarding 19, now that I think about it, +001 is a valid country code for the US if someone is calling from another nation, so I think I'll update the regex to reflect this. Thanks for pointing it out!
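
The rules discussed in this thread can be sketched with a small standalone pattern. This is a sketch under stated assumptions, not the regex merged in this PR: `PHONE_RE` and `looks_like_phone_number` are hypothetical names, and the accepted separators (space, dot, hyphen) and optional parentheses around the area code are my own choices. It encodes three points from the comments above: an area code is required, the area code must start with a digit from 2 to 9, and the country code may be written as +1 or +001.

```python
import re

# Hypothetical pattern for illustration only; not Woodwork's actual regex.
PHONE_RE = re.compile(
    r"""
    \s*
    (?:\+?(?:00)?1[\s.-]?)?    # optional country code: 1, +1, 001 or +001
    \(?[2-9][0-9]{2}\)?        # required area code, first digit 2-9
    [\s.-]?                    # optional separator
    [0-9]{3}                   # exchange
    [\s.-]?                    # optional separator
    [0-9]{4}                   # line number
    \s*
    """,
    re.VERBOSE,
)


def looks_like_phone_number(value: str) -> bool:
    """Return True when the whole string matches the sketch pattern."""
    return PHONE_RE.fullmatch(value) is not None


if __name__ == "__main__":
    samples = ["252 9384", "+1 194 129 1991", "+001 236 248 8482", "(212) 555-0123"]
    for sample in samples:
        print(repr(sample), looks_like_phone_number(sample))
```

With this pattern, entries 17 and 18 from the test above are rejected (no area code, and an area code starting with 1), while entry 19 is accepted once +001 is allowed as a prefix.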

Contributor

Ah okay, that makes sense. Thanks for explaining!

@jeff-hernandez jeff-hernandez self-requested a review March 29, 2022 21:17
Contributor

@jeff-hernandez jeff-hernandez left a comment

LGTM

@ParthivNaresh ParthivNaresh merged commit 94fa8cc into main Mar 29, 2022
@ParthivNaresh ParthivNaresh mentioned this pull request Apr 22, 2022
Development

Successfully merging this pull request may close these issues.

Add PhoneNumber inference function