Version 0.17.0

@ueshin ueshin released this 05 Sep 07:19

Options

We started using options to configure Koalas' behavior. The following options are now available:

  • display.max_rows (#714, #742)
  • compute.max_rows (#721, #736)
  • compute.shortcut_limit (#717)
  • compute.ops_on_diff_frames (#725)
  • compute.default_index_type (#723)
  • plotting.max_rows (#728)
  • plotting.sample_ratio (#737)

The full list of options and their descriptions is also available in the User Guide of our project docs.
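To illustrate how dotted option keys with defaults and validation can work, here is a minimal, self-contained sketch in plain Python. It is not Koalas' implementation: the `Option` class, the `get_option` / `set_option` / `reset_option` helpers, and the default values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Option:
    """Hypothetical option definition: a key, a default, and two checks."""

    key: str
    default: Any
    types: Tuple[type, ...]
    check: Callable[[Any], bool]
    error: str

    def validate(self, value: Any) -> None:
        # Type check first, then the value check.
        if not isinstance(value, self.types):
            raise TypeError(f"{self.key}: unexpected type {type(value).__name__}")
        if not self.check(value):
            raise ValueError(f"{self.key}: {self.error}")


# Illustrative registry. The defaults are placeholders, not Koalas' values.
_REGISTRY: Dict[str, Option] = {
    opt.key: opt
    for opt in [
        Option("display.max_rows", 100, (int,),
               lambda v: v >= 0, "must be non-negative"),
        Option("plotting.sample_ratio", 0.5, (float,),
               lambda v: 0.0 <= v <= 1.0, "must be in [0, 1]"),
    ]
}
_VALUES: Dict[str, Any] = {}  # only options that were explicitly set


def get_option(key: str) -> Any:
    # Unknown keys raise KeyError; unset keys fall back to the default.
    return _VALUES.get(key, _REGISTRY[key].default)


def set_option(key: str, value: Any) -> None:
    _REGISTRY[key].validate(value)
    _VALUES[key] = value


def reset_option(key: str) -> None:
    _REGISTRY[key]  # raises KeyError for unknown keys
    _VALUES.pop(key, None)
```

Keeping explicitly set values separate from the registry means that resetting an option is just a matter of forgetting the override.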

Plots

We continued adding plot APIs:

For Series:

  • plot.area() (#704)

For DataFrame:

Multi-index columns support

We also continued improving multi-index column support. The following APIs now support multi-index columns:

  • koalas.concat() (#680)
  • koalas.get_dummies() (#695)
  • DataFrame.pivot_table() (#635)

Other new features and improvements

We added the following new features:

koalas:

  • read_sql_table() (#741)
  • read_sql_query() (#741)
  • read_sql() (#741)

koalas.DataFrame:

We also made the following improvements:

  • GroupBy.apply should return Koalas DataFrame instead of pandas DataFrame (#731)
  • Fix rpow and rfloordiv to use proper operators in Series (#735)
  • Fix rpow and rfloordiv to use proper operators in DataFrame (#740)
  • Add schema inference support at DataFrame.transform (#732)
  • Add Option class to support type check and value check in options (#739)
  • Add missing tests (#687, #692, #694, #709, #711, #730, #729, #733, #734)
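For context on the rpow and rfloordiv fixes: in Python, reflected operators receive their operands in swapped positions, so it is easy to apply the wrong operation or the wrong operand order. The sketch below uses a hypothetical `Boxed` wrapper (a stand-in for a column, not a Koalas class) to show the expected semantics. It does not reproduce the actual changes in #735 or #740.

```python
class Boxed:
    """Hypothetical wrapper around a number, standing in for a column."""

    def __init__(self, value):
        self.value = value

    def __pow__(self, other):
        return Boxed(self.value ** other)

    def __rpow__(self, other):
        # Called for `other ** self`: the wrapped value is the exponent.
        return Boxed(other ** self.value)

    def __floordiv__(self, other):
        return Boxed(self.value // other)

    def __rfloordiv__(self, other):
        # Called for `other // self`: the wrapped value is the divisor.
        return Boxed(other // self.value)


x = Boxed(3)
print((x ** 2).value)  # 9
print((2 ** x).value)  # 8
print((x // 2).value)  # 1
print((7 // x).value)  # 2
```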

Backward compatibility

  • We renamed two of the default index types, one-by-one and distributed-one-by-one, to sequence and distributed-sequence, respectively. (#679)
  • We moved the configuration for enabling operations on different DataFrames from an environment variable to the compute.ops_on_diff_frames option. (#725)
  • We moved the configuration for the default index from an environment variable to the compute.default_index_type option. (#723)
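Code or configuration that still uses the old index names needs to be updated. As a minimal sketch, a small helper could translate them. `migrate_index_type` is a hypothetical name and is not part of Koalas.

```python
# Old name -> new name, as listed in the rename above (#679).
_RENAMED_INDEX_TYPES = {
    "one-by-one": "sequence",
    "distributed-one-by-one": "distributed-sequence",
}


def migrate_index_type(name: str) -> str:
    """Return the new name for a renamed index type; pass others through."""
    return _RENAMED_INDEX_TYPES.get(name, name)


print(migrate_index_type("one-by-one"))              # sequence
print(migrate_index_type("distributed-one-by-one"))  # distributed-sequence
```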