Refactor and update test fixtures #1382
|
@rwedge Although the Woodwork integration work prompted the creation of this PR, nothing in this breaks the current implementation, and a couple of the fixture parameterization updates could be beneficial for the current code base. I am requesting a merge into |
| else: | ||
| error = 'must be one of the following formats: {}' | ||
| raise ValueError(error.format(', '.join(FORMATS))) | ||
| dtypes = description['loading_info']['properties']['dtypes'] |
Changing the way the Dask entityset was created was causing a Dask error in test_to_csv. Dask was inferring the log entity zipcode column as int64, but it should be an object column based on the data it contains. The changes here were needed to tell Dask what the column types should be when reading the CSV, so it doesn't try to infer them. This is probably an overall safer way to read the file anyway since we know from serialization what the column dtypes should be.
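The inference problem described above can be reproduced in a small sketch. This one uses pandas rather than Dask to stay self-contained (`dask.dataframe.read_csv` accepts the same `dtype` keyword), and the CSV contents and column names are made up for illustration, not taken from the test data:

```python
import io

import pandas as pd

# Hypothetical data: zipcodes look numeric, but leading zeros matter.
csv_text = "id,zipcode\n0,02116\n1,60091\n"

# Without dtypes, the reader infers an integer column and drops the leading zero.
inferred = pd.read_csv(io.StringIO(csv_text))

# Passing the dtypes recorded at serialization time skips inference entirely.
dtypes = {"id": "int64", "zipcode": "object"}
explicit = pd.read_csv(io.StringIO(csv_text), dtype=dtypes)

print(inferred["zipcode"].tolist())
print(explicit["zipcode"].tolist())
```

With explicit dtypes, the zipcode values stay as strings and keep their leading zeros.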
I'm not exactly sure why changing the entityset construction highlighted this, but it did...
Also, I had to change how the dtypes variable is created above: one of the invalid-input tests has no properties entry in loading_info, so the old lookup raised a KeyError.
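One way to avoid that KeyError is a lookup that tolerates missing keys. This is only a sketch of the idea, not necessarily the change made in this PR; the helper name `read_dtypes` is hypothetical:

```python
def read_dtypes(description):
    """Return the serialized dtypes, or None if the description lacks them."""
    # Hypothetical helper: chained .get() calls avoid a KeyError when
    # 'properties' (or 'loading_info') is absent from the description.
    loading_info = description.get("loading_info", {})
    return loading_info.get("properties", {}).get("dtypes")


# An invalid-input description with no 'properties' entry no longer raises.
print(read_dtypes({"loading_info": {"type": "unknown"}}))
```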
Codecov Report
@@ Coverage Diff @@
## main #1382 +/- ##
=======================================
Coverage 98.58% 98.58%
=======================================
Files 135 135
Lines 14546 14578 +32
=======================================
+ Hits 14340 14372 +32
Misses 206 206
|
| # if ks and any(isinstance(e.df, ks.DataFrame) for e in simple_es.entities): | ||
| # pytest.xfail("Koalas does not support categorical dtype") |
still needed?
No. Removed.
rwedge
left a comment
Looks good
Refactor and update test fixtures
Closes #1364
This PR refactors test fixtures to avoid direct assignment of Entity.df, which will not be possible when a Woodwork dataframe is used to replace Entity.
During review of the test fixtures, two other fixtures were found not to be using the Dask and/or Koalas entityset versions, so they were also updated to include them.