Fix fragmentation PerformanceWarning in feature_set_calculator.py #2424
else:
    for name, col in new_cols.items():
        col.name = name
        if isinstance(data, dd.DataFrame):
            data = dd.concat([data, col], axis=1)
        else:
            data = ps.concat([data, col], axis=1)
return data
I don't really like this approach for Dask/Spark, but we can't create a new dataframe from the columns here like we can for pandas, so we have to concat the columns one at a time. Maybe there is a better way, but I didn't come up with one off the top of my head.
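For contrast, here is a minimal sketch of the pandas-side idea the comment refers to: build one DataFrame from all the new columns and concatenate it once, rather than adding columns one at a time. The function name `add_columns_at_once` and the sample data are hypothetical and are not taken from this PR; the sketch only assumes that `new_cols` maps column names to pandas Series sharing the index of `data`.

```python
import warnings

import pandas as pd


def add_columns_at_once(data: pd.DataFrame, new_cols: dict) -> pd.DataFrame:
    """Append every column in new_cols to data with a single concat.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    # Build one frame from all new columns, aligned to the existing index.
    new_frame = pd.DataFrame(new_cols, index=data.index)
    # A single concat avoids growing the frame one column at a time.
    return pd.concat([data, new_frame], axis=1)


data = pd.DataFrame({"id": range(5)})
new_cols = {f"feature_{i}": pd.Series(range(5)) * i for i in range(200)}

# Escalate PerformanceWarning to an error so any fragmentation warning would surface.
with warnings.catch_warnings():
    warnings.simplefilter("error", pd.errors.PerformanceWarning)
    result = add_columns_at_once(data, new_cols)
```

Whether an equivalent single-step construction is practical for Dask or Spark frames is exactly the open question raised above, so this sketch covers pandas only.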
Codecov Report
@@           Coverage Diff           @@
##             main    #2424   +/-   ##
=======================================
  Coverage   99.44%   99.44%
=======================================
  Files         340      340
  Lines       20893    20908    +15
=======================================
+ Hits        20778    20793    +15
  Misses        115      115
Co-authored-by: Roy Wedge <roy.wedge@alteryx.com>
Fixes #2405
pandas.util
"datetime"
without units