Skip to content

Conversation

@thehomebrewnerd
Copy link
Contributor

Restructure primitives directory to use individual primitives files

Closes #2179

@thehomebrewnerd thehomebrewnerd marked this pull request as draft October 21, 2022 15:32
@codecov
Copy link

codecov bot commented Oct 21, 2022

Codecov Report

Merging #2331 (edb22fd) into main (a781c6f) will increase coverage by 0.01%.
The diff coverage is 99.95%.

@@            Coverage Diff             @@
##             main    #2331      +/-   ##
==========================================
+ Coverage   99.39%   99.41%   +0.01%     
==========================================
  Files         179      310     +131     
  Lines       19166    19798     +632     
==========================================
+ Hits        19051    19683     +632     
  Misses        115      115              
Impacted Files Coverage Δ
...s/standard/transform/natural_language/constants.py 100.00% <ø> (ø)
...tandard/transform/natural_language/count_string.py 100.00% <ø> (ø)
...sform/natural_language/mean_characters_per_word.py 100.00% <ø> (ø)
...d/transform/natural_language/median_word_length.py 100.00% <ø> (ø)
...ransform/natural_language/num_unique_separators.py 100.00% <ø> (ø)
...ansform/natural_language/number_of_unique_words.py 100.00% <ø> (ø)
...form/natural_language/number_of_words_in_quotes.py 100.00% <ø> (ø)
...rd/transform/natural_language/total_word_length.py 100.00% <ø> (ø)
...primitives/standard/transform/time_series/utils.py 100.00% <ø> (ø)
...utational_backend/test_calculate_feature_matrix.py 100.00% <ø> (ø)
... and 163 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Copy link
Contributor

@sbadithe sbadithe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that we could run into issues with file handling and/or imports if the filename shares its name with a Python built-in? i.e. max.py and max.

@thehomebrewnerd
Copy link
Contributor Author

Is it possible that we could run into issues with file handling and/or imports if the filename shares its name with a Python built-in? i.e. max.py and max.

That was a problem with and and or, and that's why I appended _primitive to the filenames, but I don't think there is an issue with max...if there is, I didn't run into it.

from featuretools.primitives.standard.aggregation_primitives.time_since_last import (
TimeSinceLast,
)
from featuretools.primitives.standard.aggregation_primitives.trend import Trend
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These can all be relative imports

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a pre-commit hook that converts them all to absolute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shucks, no worries.

@@ -0,0 +1,57 @@
# flake8: noqa
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this line can be deleted in every init.py file. I added an ignore to these files in .flake8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't delete it in the files that use the from ... import * approach or we get this linting error:

featuretools/primitives/standard/__init__.py:4:1: F403 'from featuretools.primitives.standard.transform_primitives import *' used; unable to detect undefined names

I was able to remove it from a couple other places though.

@sbadithe
Copy link
Contributor

Is it possible that we could run into issues with file handling and/or imports if the filename shares its name with a Python built-in? i.e. max.py and max.

That was a problem with and and or, and that's why I appended _primitive to the filenames, but I don't think there is an issue with max...if there is, I didn't run into it.

I think it might be good to treat all, any, max, min, and sum the same way to prevent any potential naming issues in the future.

@ozzieD
Copy link
Contributor

ozzieD commented Oct 25, 2022

Since we have better namespacing we can rename rolling_primitives_utils.py to utils.py

@thehomebrewnerd
Copy link
Contributor Author

Since we have better namespacing we can rename rolling_primitives_utils.py to utils.py

Done!

@thehomebrewnerd
Copy link
Contributor Author

I think it might be good to treat all, any, max, min, and sum the same way to prevent any potential naming issues in the future.

@sbadithe I renamed all the associated files to include the _primitive suffix.

@rwedge
Copy link
Contributor

rwedge commented Oct 26, 2022

Thoughts on shortening the primitive submodules a bit?

Going from

featuretools.primitives.standard.transform_primitives.binary_transform_primitives

to

featuretools.primitives.standard.transform.binary

Not as descriptive but transform and primitive are still in the submodule path

@thehomebrewnerd
Copy link
Contributor Author

Thoughts on shortening the primitive submodules a bit?

Going from

featuretools.primitives.standard.transform_primitives.binary_transform_primitives

to

featuretools.primitives.standard.transform.binary

Not as descriptive but transform and primitive are still in the submodule path

@rwedge Not opposed to it. Would help make some of the imports more manageable and not split over so many lines I think.

@@ -0,0 +1,29 @@
from woodwork.column_schema import ColumnSchema
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that num_characters and num_words are moved here

@ozzieD
Copy link
Contributor

ozzieD commented Oct 26, 2022

I'm wondering if we can put all the stray transform primitives in their own folders as well. I'm okay with this as is. But maybe we could but the email and url ones in thier respective folders. Maybe the Cosine, Tangent, Percentile etc.. in one called math.

@thehomebrewnerd thehomebrewnerd merged commit 882b504 into main Oct 26, 2022
@thehomebrewnerd thehomebrewnerd deleted the issue-2179-reorg-primitives branch October 26, 2022 19:56
@gsheni gsheni mentioned this pull request Oct 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Restructure primitives folder and various util functions

5 participants