Version 0.22.0

@HyukjinKwon HyukjinKwon released this 14 Nov 05:37
· 848 commits to master since this release

Enable Arrow 0.15.1+

Apache Arrow 0.15.0 did not work well with PySpark 2.4, so it was disabled in the previous version.
Arrow 0.15.1 and later now work with Koalas (#902).

Expanding and Rolling

We also added the expanding() and rolling() APIs to groupby(), Series and DataFrame (#985, #991, #990, #1015, #996, #1034, #1037), with the following functions:

  • min
  • max
  • sum
  • mean
  • std
  • var
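
To illustrate what these window aggregations compute, here is a minimal plain-Python sketch. It is not Koalas code and says nothing about how Koalas implements them on Spark. It assumes the pandas conventions that Koalas aims to mirror: a fixed-size rolling window yields a missing value until the window is full, while an expanding window starts at the first element and grows by one each step. The helper names `rolling_apply` and `expanding_apply` are hypothetical and exist only for this example.

```python
from statistics import mean


def rolling_apply(values, window, func):
    """Apply func to each trailing window of `window` items.

    Positions where the window is not yet full get None,
    standing in for the missing value pandas would produce.
    """
    result = []
    for i in range(len(values)):
        if i + 1 < window:
            result.append(None)
        else:
            result.append(func(values[i + 1 - window : i + 1]))
    return result


def expanding_apply(values, func):
    """Apply func to every prefix values[:1], values[:2], ..."""
    return [func(values[: i + 1]) for i in range(len(values))]


data = [4, 1, 3, 2]

rolling_sum = rolling_apply(data, 2, sum)    # [None, 5, 4, 5]
rolling_mean = rolling_apply(data, 2, mean)  # [None, 2.5, 2, 2.5]
expanding_min = expanding_apply(data, min)   # [4, 1, 1, 1]
expanding_max = expanding_apply(data, max)   # [4, 4, 4, 4]
```

With Koalas installed, the analogous calls would take the form `kser.rolling(2).sum()` or `kser.expanding().min()` on a Series, assuming the method chaining matches pandas.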

Multi-index columns support

We continue improving multi-index columns support. We made the following APIs support multi-index columns:

Documentation

We added a "Best Practices" section to the documentation (#1041) for Koalas users to read and follow. Please see https://koalas.readthedocs.io/en/latest/user_guide/best_practices.html

Other new features and improvements

We added the following new features:

koalas.DataFrame:

koalas.Series:

koalas.MultiIndex:

Along with the following improvements:

  • Introduce column_scols in InternalFrame as a substitute for data_columns. (#956)
  • Fix different index level assignment when 'compute.ops_on_diff_frames' is enabled (#1045)
  • Fix the DataFrame.melt function and add a doctest case for it (#987)
  • Enable creating Index from list like 'Index([1, 2, 3])' (#986)
  • Fix combine_frames to handle cases where the right-hand side arguments are modified Series (#1020)
  • setup.py should support Python 2 to show a proper error message. (#1027)
  • Remove Series.schema. (#993)