Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor LatLong and Datetime Primitives into Separate Files #1861

Merged
merged 30 commits into from
Jan 28, 2022

Conversation

jacobboney
Copy link
Contributor

@jacobboney jacobboney commented Jan 24, 2022

Pull Request Description

Changes:
I decided to split all classes containing Lat/Long functions into their own file as well as classes containing date/time into their own file. In each file I also organized classes in alphabetical order. I don't believe there are any conflicts with the new files as I was able to run the tests.

Comments:
Whenever someone is able to review my changes I would also appreciate some input/advice on the testing. I am running them as described on Ubuntu. They run to the end but I do have some failed tests, not sure if this is due to my changes or if it is just part of the process.

As an aside I apologize for all of the unnecessary commits. I'm still getting the hang of it and understand now I may have gone overboard. Also, I accidentally deleted my original branch which is why I am submitting a second pull request.

@dvreed77 dvreed77 changed the title Issue#1855 split primitive Refactor LatLong and Datetime Primitives into Separate Files Jan 24, 2022
@dvreed77
Copy link
Contributor

@jacobboney Thank you for the submission. It looks like many tests are breaking, have you run make test locally?

@thehomebrewnerd
Copy link
Contributor

@jacobboney Thank you for the submission. It looks like many tests are breaking, have you run make test locally?

@jacobboney @dvreed77 There is an issue with the latest version of pandas. We are going to need to restrict our max pandas version before most of these tests will pass. I'm working on a PR for that now. Once my PR gets merged, @jacobboney will need to update his fork with the latest changes from main. Until then, I expect most of these tests will continue to fail our CI here.

Comment on lines 5 to 6
from .date_time_transform_primitive import *
from .lat_long_transform_primitive import *
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name this file different?

Suggested change
from .date_time_transform_primitive import *
from .lat_long_transform_primitive import *
from .datetime_transform_primitives import *
from .latlong_transform_primitives import *

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, I'll implement those name changes today.

@jacobboney
Copy link
Contributor Author

@dvreed77 @thehomebrewnerd
I am a little new to testing. Make test runs on my machine to completion. Is there a log file or something that I'm able to check to see what is actually failing? As of right now I only have the results displayed on the terminal. It passes the vast majority of the tests and only fails a few locally. I have a screenshot of this, but I don't have access to it right now.

@gsheni
Copy link
Contributor

gsheni commented Jan 24, 2022

@jacobboney You will need to update your branch to get the latest from main. In addition, you will need to update the release notes. Here is an example of that:

@jacobboney
Copy link
Contributor Author

The two tests that fail are:
test_deserialize_features_s3
test_checks_primitives_correct_type

I looked at each test and am unsure what may be causing the failure.

@thehomebrewnerd
Copy link
Contributor

The two tests that fail are: test_deserialize_features_s3 test_checks_primitives_correct_type

I looked at each test and am unsure what may be causing the failure.

For the test test_checks_primitives_correct_type, I think that test is failing because the error message that is hard-coded into the test no longer matches the error message that is generated, because the location of the primitive has changed. For that test, I think you just need to update the transform_primitive part of the error message to match the new location: datetime_transform_primitives.

The test_deserialize_features_s3 test is a bit trickier. This test is attempting to recreate features from a JSON file stored on S3. However, because the module names have changed in your code, the module name stored in the file no longer contains the Haversine primitive, so the deserialization fails. To fix this we are going to need to change our serialization schema version as well as upload a new JSON file to S3 to use for this test. Let's tackle this one last, after all the other updates have been made and all open PR comments have been addressed.

@jacobboney
Copy link
Contributor Author

Sounds good, I'll look into the issue with first test then. Thank you for your input.

Copy link
Contributor

@thehomebrewnerd thehomebrewnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than one minor comment regarding release notes wording, this looks fine to me. Let me know when you are ready to address the remaining failing serialization test, and I can help with getting that resolved.

@@ -12,6 +12,7 @@ Future Release
* Add DateToHoliday Transform Primitive (:pr:`1848`)
* Add DistanceToHoliday Transform Primitive (:pr:`1853`)
* Temporarily restrict pandas and koalas max versions (:pr:`1863`)
* Split date, time, latitude and longitude primitives into separate files (:pr:`1861`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very minor nitpick, but I'd probably say Split Datetime and LatLong primitives into separate files here to make it a little more clear we are talking about primitives that specifically operate on Datetime or LatLong column inputs. Not a big deal though.

@jacobboney
Copy link
Contributor Author

Will do. I won't be able to get to those initial fixes until later in the evening. If that ends up being to late I will be available all day tomorrow to work on the final fix.

@jacobboney
Copy link
Contributor Author

The test_checks_primitives_correct_type issue seems to be resolved. I ran the tests again and there was no failure.

@thehomebrewnerd
Copy link
Contributor

The test_checks_primitives_correct_type issue seems to be resolved. I ran the tests again and there was no failure.

@jacobboney Thanks for the update. Looks like you will need to resolve the merge conflicts in the release notes file before the full CI tests will run again.

@jacobboney
Copy link
Contributor Author

Alright, after some Git confusion on my part I think I've resolved the merge conflicts.

@dvreed77
Copy link
Contributor

@jacobboney. I uploaded a new test JSON file to S3 to allow your PR to pass. Can you make the following changes in your code:

File: featuretools/feature_base/features_serializer.py
Line #: 10
Change: SCHEMA_VERSION = "8.0.0"

File: featuretools/tests/primitive_tests/test_feature_serialization.py
Line #: 46
Change: TEST_FILE = "test_feature_serialization_feature_schema_{}_entityset_schema_{}_2022_01_27.json".format(SCHEMA_VERSION, ENTITYSET_SCHEMA_VERSION)

@jacobboney
Copy link
Contributor Author

@dvreed77 Great, thank you. I'll make those changes this evening.

@jacobboney
Copy link
Contributor Author

jacobboney commented Jan 28, 2022

I was able to make all of the changes, however, after running the tests, the features serializer test still failed. I found that the schema version also had to be updated in test_features_serializer.

Unfortunately, there still seems to be an issue with koalas/pandas. The failures come from:
-featuretools/tests/primitive_tests/test_transform_features.py
-featuretools/tests/synthesis/test_koalas_dfs.py

All failures seem to be assertion errors.

@codecov
Copy link

codecov bot commented Jan 28, 2022

Codecov Report

Merging #1861 (32c84c7) into main (ff08a6a) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1861   +/-   ##
=======================================
  Coverage   98.82%   98.82%           
=======================================
  Files         145      147    +2     
  Lines       16277    16290   +13     
=======================================
+ Hits        16086    16099   +13     
  Misses        191      191           
Impacted Files Coverage Δ
...retools/primitives/standard/transform_primitive.py 100.00% <ø> (ø)
...ols/tests/synthesis/test_deep_feature_synthesis.py 99.33% <ø> (ø)
featuretools/feature_base/features_serializer.py 100.00% <100.00%> (ø)
featuretools/primitives/standard/api.py 100.00% <100.00%> (ø)
...imitives/standard/datetime_transform_primitives.py 100.00% <100.00%> (ø)
...rimitives/standard/latlong_transform_primitives.py 100.00% <100.00%> (ø)
...s/primitive_tests/test_datetoholiday_primitives.py 100.00% <100.00%> (ø)
...imitive_tests/test_distancetoholiday_primitives.py 100.00% <100.00%> (ø)
...ests/primitive_tests/test_feature_serialization.py 99.35% <100.00%> (ø)
.../tests/primitive_tests/test_features_serializer.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ff08a6a...32c84c7. Read the comment docs.

@jacobboney
Copy link
Contributor Author

Well, after seeing that all tests passed I guess that means there must be some version issue on my side. Glad to see there are no more errors though.

@thehomebrewnerd
Copy link
Contributor

thehomebrewnerd commented Jan 28, 2022

I was able to make all of the changes, however, after running the tests, the features serializer test still failed. I found that the schema version also had to be updated in test_features_serializer.

Unfortunately, there still seems to be an issue with koalas/pandas. The failures come from: -featuretools/tests/primitive_tests/test_transform_features.py -featuretools/tests/synthesis/test_koalas_dfs.py

All failures seem to be assertion errors.

@jacobboney Not exactly sure what might be happening, but all of our CI tests pass with the latest code you pushed, so I think we are good now, unless the final reviews highlight anything else that needs to be updated.

@jacobboney
Copy link
Contributor Author

@thehomebrewnerd Not a problem at all. I was just looking for a issue in the code that wasn't there. It must be something on my end that I'll have to figure out. I'll just stay tuned then for any possible final changes.

@dvreed77 @thehomebrewnerd @gsheni Thank you all very much for your guidance through this process. I greatly appreciate it.

@dvreed77 dvreed77 merged commit ec8a72b into alteryx:main Jan 28, 2022
@dvreed77
Copy link
Contributor

dvreed77 commented Jan 28, 2022

Congrats @jacobboney! Thanks for contribution!

@jacobboney
Copy link
Contributor Author

Great! Thank you!

@dvreed77 dvreed77 mentioned this pull request Feb 11, 2022
@dvreed77
Copy link
Contributor

dvreed77 commented Mar 7, 2022

fixes #1855

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Split Transform Primitives out into separate files
4 participants