Add pandas workaround back in #1731
@davesque this can merge into main; the woodwork-integration branch has already been merged.
Codecov Report
@@           Coverage Diff           @@
##             main    #1731   +/-   ##
=======================================
  Coverage   98.69%   98.69%
=======================================
  Files         138      138
  Lines       15361    15368    +7
=======================================
+ Hits        15161    15168    +7
  Misses        200      200
@davesque No issues with the reverted changes, but just a couple comments:
- Have you been able to confirm that reverting these changes fixes the bug you were seeing?
- Can we create an issue to investigate this further and try to add tests to our test suite that will fail if we remove this code with the current pandas version?
Yep, it fixes the issue. Confirmed that before making the PR.
There's a bullet point in the above PR description relating to this. I think the tests should be part of this PR.
Pull Request Description
We realized earlier today that the pandas workarounds removed by #1677 and #1679 are probably still needed, as the bug reported in pandas-dev/pandas#22501 may still exist. More investigation is needed to understand exactly what causes data frames with misaligned categorical indices to be merged incorrectly. For now, we should at least revert the changes made by those two PRs.
Left to do:
- Add tests for `_calculate_agg_features` that catch errors resulting from the categorical merge bug
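
For readers unfamiliar with this kind of workaround, here is a minimal sketch of the general idea: before an index-on-index merge, re-code the right frame's categorical index against the left frame's categories so both sides share one dtype. The helper name `align_categorical_index` and the sample frames are hypothetical and only illustrate the pattern; this is not necessarily the exact code restored by this PR.

```python
import pandas as pd


def align_categorical_index(frame, to_merge):
    """Hypothetical helper: give to_merge the same categorical index dtype as frame."""
    if isinstance(frame.index.dtype, pd.CategoricalDtype):
        to_merge = to_merge.copy()
        # Round-trip through object so the labels are re-coded against
        # the left frame's categories (and their order).
        to_merge.index = to_merge.index.astype(object).astype(frame.index.dtype)
    return to_merge


# Two frames whose categorical indices hold the same labels
# but declare their categories in a different order.
left = pd.DataFrame(
    {"x": [1, 2, 3]},
    index=pd.CategoricalIndex(["a", "b", "c"], categories=["a", "b", "c"]),
)
right = pd.DataFrame(
    {"y": [30, 10, 20]},
    index=pd.CategoricalIndex(["c", "a", "b"], categories=["c", "a", "b"]),
)

right_aligned = align_categorical_index(left, right)
merged = pd.merge(
    left, right_aligned, left_index=True, right_index=True, how="left"
)
print(merged)
```

A regression test along these lines could build frames with deliberately misaligned categories and assert that each label ends up paired with its own row after the merge.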