Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow TitleCase and camelCase #1854

Merged
merged 18 commits into from
Jan 28, 2022
Merged

Allow TitleCase and camelCase #1854

merged 18 commits into from
Jan 28, 2022

Conversation

tuethan1999
Copy link
Contributor

@tuethan1999 tuethan1999 commented Jan 20, 2022

Pull Request Description

Fixes #1761
implementation that allowed all variations of case and underscore: #1841

Feature Request

Provide convenience to a user by allowing them to pass a primitive name in either snake_case, TitleCase or camelCase

Implementation

Reduce TitleCase and camelCase into snake_case before matching them internally. This means we can't accept primitive names like "cOuNt" because it will be converted to "c_ou_nt". I think this is the desired behavior though

Assumptions:

  • primitive name is camel case

@tuethan1999 tuethan1999 changed the title support camel and titlecase only support camel snake and titlecase only Jan 20, 2022
@tuethan1999 tuethan1999 changed the title support camel snake and titlecase only Allow TitleCase and camelCase Jan 20, 2022
@codecov
Copy link

codecov bot commented Jan 21, 2022

Codecov Report

Merging #1854 (07d9102) into main (ec8a72b) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1854      +/-   ##
==========================================
+ Coverage   98.82%   98.83%   +0.01%     
==========================================
  Files         147      147              
  Lines       16290    16291       +1     
==========================================
+ Hits        16099    16102       +3     
+ Misses        191      189       -2     
Impacted Files Coverage Δ
featuretools/synthesis/deep_feature_synthesis.py 98.62% <100.00%> (+0.38%) ⬆️
featuretools/synthesis/utils.py 100.00% <100.00%> (ø)
...ols/tests/synthesis/test_deep_feature_synthesis.py 99.32% <100.00%> (-0.01%) ⬇️
featuretools/tests/synthesis/test_dfs_method.py 99.54% <100.00%> (+<0.01%) ⬆️
featuretools/tests/utils_tests/test_gen_utils.py 100.00% <100.00%> (ø)
featuretools/utils/gen_utils.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec8a72b...07d9102. Read the comment docs.

@gsheni gsheni requested a review from a team January 21, 2022 16:59
Comment on lines 44 to 46
def camel_and_title_to_snake(name):
name = re.sub('(.)([A-Z][a-z]+)', r'\1_\2', name)
return re.sub('([a-z0-9])([A-Z])', r'\1_\2', name).lower()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect "Top3Words" to be converted to "top_3_words" but currently this will return "top3_words"

@rwedge
Copy link
Contributor

rwedge commented Jan 21, 2022

A test just for the camel_and_title_to_snake method with a few inputs could be good

@gsheni gsheni assigned dvreed77 and unassigned tuethan1999 Jan 24, 2022
@dvreed77 dvreed77 requested a review from rwedge January 26, 2022 15:00
@tamargrey
Copy link
Contributor

Is there somewhere in the documentation that we can put the new constraints on what primitive strings will be recognized? I guess we never specifically mentioned that they had to me the same as primitve.name in the past, but now that there's more than one valid option, it might be nice to have the information written down somewhere.

@tamargrey
Copy link
Contributor

@dvreed77 Not sure if we need to handle this case, but wanted to note a potentially problematic string for camel_and_title_to_snake: "NaN". Something like ft.utils.gen_utils.camel_and_title_to_snake('hasNaNs') results in has_na_ns.

I think we can avoid dealing with this specific example if we just use the word "Null" an time we might use "nan" in a primitive name, but it's something to keep in mind bc we can't stop users from trying to pass in that string

@@ -68,3 +69,16 @@ def test_list_semantic_tags():
ft_semantic_tags = ft.list_semantic_tags()
ww_semantic_tags = list_semantic_tags()
assert ft_semantic_tags.equals(ww_semantic_tags)


def test_camel_and_title_to_snake():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we include a snake case test case to confirm snake case passed in is unaltered?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

rwedge
rwedge previously approved these changes Jan 28, 2022
Copy link
Contributor

@rwedge rwedge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dvreed77 dvreed77 merged commit a2afe58 into main Jan 28, 2022
@dvreed77 dvreed77 deleted the allow_title_and_camel_case branch January 28, 2022 16:34
@dvreed77 dvreed77 mentioned this pull request Feb 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow users to pass camelCase to DFS for primitives
4 participants