Automatically generate name for controllable primitives #481

Merged
merged 32 commits into Featuretools:master from jeff-hernandez:generate_controllable_name on Apr 9, 2019

4 participants
@jeff-hernandez
Collaborator

commented Apr 5, 2019

No description provided.

CharlesBradshaw and others added some commits Feb 13, 2019

@codecov


commented Apr 5, 2019

Codecov Report

Merging #481 into master will increase coverage by 0.02%.
The diff coverage is 98.57%.


@@            Coverage Diff             @@
##           master     #481      +/-   ##
==========================================
+ Coverage   96.23%   96.25%   +0.02%     
==========================================
  Files         104      104              
  Lines        8658     8714      +56     
==========================================
+ Hits         8332     8388      +56     
  Misses        326      326
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...etools/primitives/base/transform_primitive_base.py | 100% <100%> (ø) | ⬆️ |
| ...ools/primitives/base/aggregation_primitive_base.py | 100% <100%> (ø) | ⬆️ |
| ...ools/primitives/standard/aggregation_primitives.py | 94.69% <100%> (+0.02%) | ⬆️ |
| featuretools/primitives/base/primitive_base.py | 100% <100%> (ø) | ⬆️ |
| ...tools/tests/primitive_tests/test_primitive_base.py | 100% <100%> (ø) | ⬆️ |
| featuretools/primitives/base/utils.py | 86.66% <85.71%> (-1.57%) | ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 23dc0b1...215cb75.

@kmax12 kmax12 referenced this pull request Apr 5, 2019

Closed

Generate controllable name #432

@@ -13,13 +13,16 @@ class AggregationPrimitive(PrimitiveBase):
     stack_on_self = True  # whether or not it can be in input_types of self
     allow_where = True  # whether DFS can apply where clause to this primitive

-    def generate_name(self, base_feature_names, child_entity_id,
-                      parent_entity_id, where_str, use_prev_str):
+    def generate_name(self, base_feature_names, child_entity_id, parent_entity_id, where_str, use_prev_str):

@kmax12

kmax12 Apr 5, 2019

Member

let's not reformat the function definition

@jeff-hernandez

jeff-hernandez Apr 8, 2019

Author Collaborator

Change applied in this commit.

@@ -51,3 +54,28 @@ def get_function(self):

    def get_filepath(self, filename):
        return os.path.join(config.get("primitive_data_folder"), filename)

    def get_args_string(self):
        if not isinstance(self.__init__, types.MethodType):  # __init__ must be defined

@kmax12

kmax12 Apr 5, 2019

Member

how is it possible for __init__ not to be defined?

@jeff-hernandez

jeff-hernandez Apr 5, 2019

Author Collaborator

In this example, the __init__ is not defined. I included a test case.

    class Primitive(PrimitiveBase):
        pass

@kmax12

kmax12 Apr 5, 2019

Member

So, I tested in Python 3 and it worked without this, but I just tested in Python 2 and it doesn't, so I see what you're saying.

what if we just added this to PrimitiveBase?

    def __init__(self):
        pass

I think that might be a cleaner solution
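For readers following the thread, here is a minimal sketch of the behavior being discussed, run under Python 3. The class names are hypothetical and are not the featuretools classes. It only illustrates that an `__init__` inherited from `object` does not pass the `types.MethodType` check, while an explicitly defined (even empty) `__init__` on a base class does.

```python
import types


class WithoutInit(object):
    """Hypothetical base class that does not define __init__."""


class WithInit(object):
    """Hypothetical base class with an empty __init__, as proposed above."""

    def __init__(self):
        pass


class ChildOfWithout(WithoutInit):
    pass


class ChildOfWith(WithInit):
    pass


# __init__ comes from object here, so it is a slot wrapper, not a bound method.
inherited_from_object = isinstance(ChildOfWithout().__init__, types.MethodType)

# __init__ is a plain Python function on the base class, so it binds as a method.
inherited_from_base = isinstance(ChildOfWith().__init__, types.MethodType)
```

With an empty `__init__` on the base class, every subclass gets a real bound method, so a guard like the `isinstance` check may no longer be needed.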

@jeff-hernandez

jeff-hernandez Apr 5, 2019

Author Collaborator

Yes, I think that should work.

@jeff-hernandez

jeff-hernandez Apr 8, 2019

Author Collaborator

Change applied in this commit.

@jeff-hernandez

jeff-hernandez Apr 8, 2019

Author Collaborator

Should the test case test_args_string_undefined be deleted?

if not isinstance(self.__init__, types.MethodType):  # __init__ must be defined
    return ''

v2 = version_info.major == 2

@kmax12

kmax12 Apr 5, 2019

Member

let's try to use the featuretools.primitives.base.utils.inspect_function_args method

@jeff-hernandez

jeff-hernandez Apr 5, 2019

Author Collaborator

Instead of two functions, could we add an optional argument on inspect_function_args to restrict keywords like time?

@jeff-hernandez

jeff-hernandez Apr 5, 2019

Author Collaborator

Never mind, it might be better to split into two functions.

@kmax12

kmax12 Apr 5, 2019

Member

let's try it with two functions and see how it looks

@jeff-hernandez

jeff-hernandez Apr 8, 2019

Author Collaborator

Change applied in this commit.

Resolved conversation: featuretools/primitives/standard/aggregation_primitives.py
Resolved conversation: featuretools/tests/primitive_tests/test_primitive_base.py

@jeff-hernandez jeff-hernandez requested a review from kmax12 Apr 8, 2019

@jeff-hernandez

Collaborator Author

commented Apr 8, 2019

Primitives created with make_agg_primitive and make_trans_primitive will return an empty args_string because their arguments are passed into __init__ as **kwargs and are not assigned as attributes on the class.
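A rough sketch of why a `**kwargs`-only `__init__` would yield an empty args string, assuming the name is built from the named parameters of `__init__`. `KwargsPrimitive` and `named_arguments` are hypothetical names for illustration and are not part of featuretools.

```python
import inspect


class KwargsPrimitive(object):
    """Hypothetical primitive whose __init__ only accepts **kwargs."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


def named_arguments(instance):
    """Return the names of the standard (non-variadic) __init__ arguments."""
    parameters = inspect.signature(instance.__init__).parameters.values()
    return [p.name for p in parameters if p.kind == p.POSITIONAL_OR_KEYWORD]


# The only parameter is **kwargs, so there is nothing to put in the name.
names = named_arguments(KwargsPrimitive(n=5))
```

Because `n` never appears as a named parameter, signature inspection alone cannot recover it.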

@jeff-hernandez

Collaborator Author

commented Apr 8, 2019

If a primitive argument is stored as a pandas Series, the generated name won't be very readable. Is there a preferred way to handle these types of arguments?

primitive = Primitive(argument=pd.Series(range(4)))
# PRIMITIVE(argument='0    0\n1    1\n2    2\n3    3\ndtype: int64', ...)
@kmax12
Member

left a comment

Just a few more comments. Looking good overall

default = is_same_type and arg.default == getattr(self, arg.name)
return is_positional_or_keyword and not default

string = OrderedDict()

@kmax12

kmax12 Apr 8, 2019

Member

does this need to be an OrderedDict? why not just make it a list

@jeff-hernandez

jeff-hernandez Apr 8, 2019

Author Collaborator

No, it doesn't need to be. I will make it a list.

string[arg.name] = str(getattr(self, arg.name))
if len(string) == 0:
    return ''
string = ', ' + ', '.join(map('='.join, string.items()))

@kmax12

kmax12 Apr 8, 2019

Member

can we break this into two lines to make it easier to read

@jeff-hernandez

jeff-hernandez Apr 8, 2019

Author Collaborator

Change applied in this commit.
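As a reference for the two suggestions above (a list instead of an `OrderedDict`, and the join split over two lines), here is a minimal sketch in Python 3. It is not the merged implementation. `SketchPrimitive` is a hypothetical class, and using `inspect.signature` directly is an assumption, since the PR routes this through a helper in `featuretools.primitives.base.utils`.

```python
import inspect


class SketchPrimitive(object):
    """Hypothetical stand-in for a primitive; not the featuretools class."""

    def __init__(self, n=3, gap=1, *args, **kwargs):
        self.n = n
        self.gap = gap

    def get_args_string(self):
        parts = []  # a plain list keeps the signature order
        signature = inspect.signature(self.__init__)
        for arg in signature.parameters.values():
            # skip if not a standard argument (e.g. excluding *args and **kwargs)
            if arg.kind != arg.POSITIONAL_OR_KEYWORD:
                continue
            value = getattr(self, arg.name)
            # skip arguments that still hold their default value
            is_same_type = isinstance(value, type(arg.default))
            if is_same_type and arg.default == value:
                continue
            parts.append('{}={}'.format(arg.name, value))
        if len(parts) == 0:
            return ''
        joined = ', '.join(parts)
        return ', ' + joined
```

For example, `SketchPrimitive(n=5).get_args_string()` gives `', n=5'`, and an instance built with only default values gives an empty string.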

Resolved conversation: featuretools/primitives/base/primitive_base.py
@kmax12

Member

commented Apr 8, 2019

If a primitive argument is stored as a pandas Series, the generated name won't be very readable. Is there a preferred way to handle these types of arguments?

primitive = Primitive(argument=pd.Series(range(4)))
# PRIMITIVE(argument='0    0\n1    1\n2    2\n3    3\ndtype: int64', ...)

for now, let's not handle the series case specifically. whatever it does by default is fine

jeff-hernandez added some commits Apr 8, 2019

@jeff-hernandez jeff-hernandez requested a review from kmax12 Apr 8, 2019

@kmax12

kmax12 approved these changes Apr 9, 2019

Member

left a comment

Only one comment about a comment. Otherwise, ready to merge.

error = '"{}" must be attribute of {}'
assert hasattr(self, arg.name), error.format(arg.name, self.__class__.__name__)

# skip if *args or **kwargs

@kmax12

kmax12 Apr 9, 2019

Member

should this be "skip if not"?

@jeff-hernandez

jeff-hernandez Apr 9, 2019

Author Collaborator

The *args and **kwargs would not be considered POSITIONAL_OR_KEYWORD. This comment could work as well.

# skip if not a standard argument (e.g. excluding *args and **kwargs)

@kmax12

kmax12 Apr 9, 2019

Member

looks good
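To make the point about parameter kinds concrete, a small Python 3 sketch using the standard `inspect` module. The function `example` and its parameter names are made up for illustration.

```python
import inspect


def example(a, b=2, *args, **kwargs):
    """Hypothetical function with one parameter of each common kind."""


parameters = inspect.signature(example).parameters

# Map each parameter name to the name of its kind.
kinds = {name: p.kind.name for name, p in parameters.items()}

# Keep only the standard arguments, which drops *args and **kwargs.
standard = [
    name for name, p in parameters.items()
    if p.kind == inspect.Parameter.POSITIONAL_OR_KEYWORD
]
```

Here `a` and `b` are `POSITIONAL_OR_KEYWORD`, while `args` is `VAR_POSITIONAL` and `kwargs` is `VAR_KEYWORD`, so a filter on `POSITIONAL_OR_KEYWORD` excludes both variadic parameters.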

Resolved conversation: featuretools/primitives/base/primitive_base.py

jeff-hernandez and others added some commits Apr 9, 2019

Merge branch 'generate_controllable_name' of github.com:jeff-hernandez/featuretools into generate_controllable_name

@kmax12 kmax12 changed the title Generate controllable name Automatically generate name for controllable primitives Apr 9, 2019

@kmax12 kmax12 merged commit 40298ad into Featuretools:master Apr 9, 2019

4 checks passed

codecov/patch 98.57% of diff hit (target 96.23%)
codecov/project 96.25% (+0.02%) compared to 23dc0b1
license/cla Contributor License Agreement is signed.
test_all_python_versions Workflow: test_all_python_versions

@rwedge rwedge referenced this pull request Apr 24, 2019

Merged

v0.7.1 #507
