Hivemall v0.4.2-rc.2

@myui released this Jun 28, 2016 · 5 commits to master since this release

This is the second release candidate for Hivemall v0.4.2.

CAUTION: Please use add_bias instead of addBias for adding bias clauses from this release. If you want to use addBias, please issue the deprecated DDLs manually.

We DO NOT recommend using this version in production yet.


Changes since v0.4.2-rc.1 are summarized as follows:

  • Bug Fixes

  • Major Changes

    • Fixed feature_hashing UDF behavior for a bias feature '0' [a8fd640]
    • Cleaned up deprecated DDLs including addBias [2264ae6]
    • Changed (Fixed) the behavior of binarize_label [ab5c19e]
    • Reverted to support a function alias addBias for Treasure Data env [0e99357]
  • Minor Changes

    • Updated feature OI validation scheme to use HiveUtils [612a5df]
    • Made the feature UDF a GenericUDF and made it handle null weights [77922dc] #306 #269
    • Fixed tree_predict UDF of RandomForest to accept any Integer types [c7c5213]
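
To illustrate what the add_bias rename in the caution above refers to, here is a minimal Python sketch of appending a bias term to a sparse feature vector. It is an illustration of the idea, not Hivemall's implementation: the "index:value" string format and the bias value 1.0 are assumptions, and only the reserved bias feature '0' comes from the notes above.

```python
from typing import List

BIAS_FEATURE = "0"  # bias feature index mentioned in the notes above
BIAS_VALUE = 1.0    # assumed constant input for the bias term


def add_bias(features: List[str]) -> List[str]:
    """Return a copy of `features` with a bias feature appended.

    Features are assumed to be "index:value" strings (hypothetical format).
    """
    return list(features) + [f"{BIAS_FEATURE}:{BIAS_VALUE}"]


if __name__ == "__main__":
    print(add_bias(["1:0.5", "7:2.0"]))
```

Because the bias occupies index '0' in this sketch, any hashing or indexing of ordinary features has to avoid mapping onto that index, which is the kind of collision the feature_hashing fix in this release is concerned with.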


Hivemall v0.4.2-rc.1

@myui released this Jun 7, 2016 · 24 commits to master since this release

This is the first release candidate for Hivemall v0.4.2. We skipped v0.4.1 because this release contains a very large number of changes.

We would like to thank all contributors of this release.

Caution: The FFM and BPRMF implementations are still experimental and subject to change (and are therefore not documented). We DO NOT recommend using this version in production yet.


Changes since v0.4.1-alpha.6 are summarized as follows:

  • Major Enhancement

    • Initial support for Hivemall on Spark #274
    • Implemented Field-aware Factorization Machines #284 #286 #288 #290 #292 #293 #295 #297
    • Implemented BPR-MF (Matrix Factorization for Implicit Feedback) #278 [d427bd8]
      • Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, Lars Schmidt-Thieme. "BPR: Bayesian Personalized Ranking from Implicit Feedback", Proc. UAI, 2009.
  • Minor Enhancement

  • Major Change

    • Changed the default learning rate 'eta0' of Factorization Machines [24fc383]
    • Changed the default max_init_value option value of FM/FFM [208ed62]
    • Changed the default V initialization scheme for FM/FFM classification from random to Gaussian [8ec5199]
  • Minor Change

    • Renamed concat_array UDF to array_concat [f9561d6]
    • Fixed each_top_k to accept non-constant k for the first argument [2f18011]
    • Fixed to_order_map to support reverseOrder option [ac005fd]
    • Fixed extract_weight UDF to support FFM format [f954bad]
  • Bug Fixes

    • Fixed a bug in model_id generation scheme of RandomForest #299
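
For readers unfamiliar with BPR-MF, the following Python sketch shows one stochastic gradient step of the BPR criterion from the Rendle et al. paper cited above, applied to a matrix factorization model. It is a simplified illustration, not the code in this release: the function names, the learning rate, and the regularization value are all assumptions.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def bpr_step(p_u: List[float], q_i: List[float], q_j: List[float],
             lr: float = 0.05, reg: float = 0.01) -> float:
    """One in-place SGD step for a (user, preferred item i, other item j) triple.

    Maximizes ln sigmoid(x_uij) with L2 regularization, where
    x_uij = p_u . q_i - p_u . q_j. Returns x_uij computed before the update.
    """
    x_uij = dot(p_u, q_i) - dot(p_u, q_j)
    g = 1.0 - sigmoid(x_uij)  # derivative of ln sigmoid(x) at x_uij
    for f in range(len(p_u)):
        pu, qi, qj = p_u[f], q_i[f], q_j[f]  # use pre-update values
        p_u[f] += lr * (g * (qi - qj) - reg * pu)
        q_i[f] += lr * (g * pu - reg * qi)
        q_j[f] += lr * (-g * pu - reg * qj)
    return x_uij
```

Repeating the step on the same triple pushes the score of the preferred item above that of the other item, which is the pairwise ranking objective BPR optimizes for implicit feedback.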


Hivemall v0.4.1-alpha.6 (Alpha release)

@myui released this Mar 9, 2016 · 266 commits to master since this release

This is the 6th alpha release for v0.4.1. We DO NOT recommend using this version in production yet.


Changes since v0.4.1-alpha.5 are summarized as follows:

  • Minor Enhancement

    • Implemented r2 UDAF #268
  • Minor Change

    • Refactored DecisionTree and RegressionTree #270
  • Bug Fixes

    • Fixed a corner case bug in rf_ensemble [64f64b0]
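
The r2 UDAF listed above takes its name from the coefficient of determination. As a reference for what that metric computes, here is a small Python sketch of the standard definition, R² = 1 - SS_res / SS_tot. The function name and the error handling are assumptions for illustration and say nothing about how the UDAF itself is implemented.

```python
from typing import Sequence


def r2_score(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    if ss_tot == 0.0:
        # All actual values are identical, so the ratio is undefined.
        raise ValueError("R^2 is undefined when actual values are constant")
    return 1.0 - ss_res / ss_tot
```

A perfect prediction yields 1.0, and always predicting the mean of the actual values yields 0.0.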


Hivemall v0.4.1-alpha.5 (Alpha release)

@myui released this Mar 7, 2016

This is the fifth alpha release for v0.4.1. We DO NOT recommend using this version in production yet.
Note: v0.4.1-alpha.4 has been skipped.


Changes since v0.4.1-alpha.3 are summarized as follows:

  • Minor Enhancement

    • Implemented binarize_label UDTF #266
  • Major Change

    • Reduced memory usage of DecisionTree and RegressionTree #259 #264 #270
  • Bug Fixes

    • Fixed long-running UDTFs to report progress so that tasks do not hit mapreduce.task.timeout [6eacd65]


Hivemall v0.4.1-alpha.3 (Alpha release)

@myui released this Jan 8, 2016 · 304 commits to master since this release

This is the third alpha release for v0.4.1. We DO NOT recommend using this version in production yet.

The major enhancement in this release is support for mini batch gradient descent.
Find the usage on this page.

Note that the usage/behavior of RandomForest has been changed in this release.
You can find changes in this page.


Changes since v0.4.1-alpha.2 are summarized as follows:

  • Major Enhancement

    • Supported mini batch gradient descent for logistic regression #252
  • Minor Enhancement

    • Supported min_samples_leaf option in RandomForest #253
    • Added inflate/deflate UDF [4f2747c]
    • Added base91/unbase91 UDFs [8c70d8d]
    • Added scripts to start/stop MIX servers #241
    • Supported classification_error for the SplitRule of RandomForest classification [19ae202]
  • Major Change

    • Supported various model types for the output prediction model of RandomForest. As a result, the behavior of RandomForest has changed. #249
      • Refined tree_predict UDF [57bae25]
      • Changed the default output model type [10f0ea2]
  • Minor Changes

    • Fixed the default number of threads used by Smile in non-TD environments [5c8bea3]
    • Changed/enlarged buffer size for iterative training of Factorization Machine [dcf0dbc]
    • Changed the formula for calculating the number of randomly selected features [13c3917]
  • Bug Fixes

    • Fixed mf_predict logic for unknown examples [a43f793]
    • Fixed a bug that maxDepth is not set in DecisionTree [df21aba]
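
As background for the mini batch gradient descent support highlighted in this release, the following Python sketch trains a logistic regression model by averaging gradients over small batches before each weight update. It is a generic textbook version written for illustration: the function name, the dense feature layout, and the hyperparameters (batch_size, eta, epochs) are assumptions and do not reflect Hivemall's options or implementation.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(w, x))


def train_logreg_minibatch(rows: List[List[float]], labels: List[int],
                           batch_size: int = 2, eta: float = 0.1,
                           epochs: int = 50) -> List[float]:
    """Fit weights with mini-batch gradient descent on the log loss."""
    n_features = len(rows[0])
    w = [0.0] * n_features
    for _ in range(epochs):
        for start in range(0, len(rows), batch_size):
            batch = range(start, min(start + batch_size, len(rows)))
            grad = [0.0] * n_features
            # Accumulate the gradient over the whole batch first.
            for k in batch:
                err = sigmoid(dot(w, rows[k])) - labels[k]
                for j, x in enumerate(rows[k]):
                    grad[j] += err * x
            # Apply a single update using the averaged batch gradient.
            for j in range(n_features):
                w[j] -= eta * grad[j] / len(batch)
    return w
```

Setting batch_size to 1 reduces this to plain stochastic gradient descent, while setting it to the dataset size gives full-batch gradient descent; mini batches sit between the two and smooth out the noise of per-example updates.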


Hivemall v0.4.1-alpha.2 (Alpha release)

@myui released this Nov 17, 2015 · 386 commits to master since this release

This is the 2nd alpha release for v0.4.1. We DO NOT recommend using this version in production yet.

Changes since v0.4.1-alpha.1 are summarized as follows:

  • Bug Fixes
    • Fix bugs in MixServer PartialResult#diffClock and add tests [0c7672d]
    • Changed the implementation of fm_predict to GenericUDAF and fixed a bug [2906b38]
    • Applied a workaround for KryoException/java.util.ConcurrentModificationException in tokenize_ja [06b3762]


Hivemall v0.4.1-alpha.1 (Alpha release)

@myui released this Nov 12, 2015 · 402 commits to master since this release

This is an alpha release for v0.4.1. We DO NOT recommend using this version in production yet.

For the usage of tokenize_ja, please refer to this wiki page.

Changes since v0.4.0-2 are summarized as follows:

  • Major Enhancement

    • Supported Japanese tokenizer UDF tokenize_ja [#227]
  • Major Changes

    • Separated maven module into core and mixserv [#225]
    • Changed the default max_depth of gradient tree boosting classifier [c4ade19]
  • Bug Fixes

    • Fixed fm_predict not to use Custom class for the result of terminatePartial [#230]


Hivemall v0.4.0-2 (maintenance release)

@myui released this Nov 4, 2015 · 436 commits to master since this release

This is a maintenance release of Hivemall v0.4.0. The following bug fixes have been applied.

Changes since v0.4.0-1 are summarized as follows:

  • Minor Changes
    • Changed behaviors of categorical_features|indexed_features|quantitative_features|vectorize_features for empty string [24e77f4]
    • Changed behaviors of convert_label [9f611e6]


Hivemall v0.4.0-1 (maintenance release)

@myui released this Oct 30, 2015 · 441 commits to master since this release

This is a maintenance release of Hivemall v0.4.0. The following bug fixes have been applied.

Changes since v0.4.0 are summarized as follows:

  • Minor Changes

    • Applied a fix to mf_predict for Treasure Data [c53de81]
  • Bugfix

    • Fixed a corner case bug in fm_predict [c53de81]


The release version of Hivemall v0.4.0

@myui released this Oct 28, 2015 · 445 commits to master since this release

This is the stable release of Hivemall v0.4.0.

This version is a major step forward and includes many changes. Major enhancements in this release include support for Factorization Machine (usage 1) and RandomForest (usage 1, 2).

Last but not least, I would like to thank everyone who contributed to this release.

Changes since v0.3.2-3 are summarized as follows:

  • Major Enhancement

    • Introduced RandomForest classifier/regressor using Smile #219
    • Introduced Factorization Machine classifier/regressor #207
  • Minor Enhancement

  • Major Changes

    • Changed behavior of categorical_features UDF to always make categorical features [d6f84f2]
    • Changed behavior of vectorize_features to parse numbers as double instead of float [982e079]
    • Fixed to include Netty jars to hivemall-fat.jar [6685d75]
  • Minor Changes

    • Added "-help" option to UDTFs that shows the usage of the function [9603460]
    • Added -workers option to MixServer #212
    • Added dependency to log4j to hivemall-fat.jar [4b98bae]
    • Removed ant build file. Use Maven instead. [568ef5d]
  • Bugfix

    • Fixed a bug in tf UDAF [224057b]
    • Fixed a bug in diffclock computation logic in MixServer #220
    • Fixed a bug for sigmoid(null). Treasure Data PLT-4718. [ffe213a]
    • Fixed a bug in normalization functions where the result became NaN when dividing by zero [624e375]
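
The last two fixes above concern edge cases: a zero denominator during normalization and a null input to sigmoid. The Python sketch below shows one way such guards can look for min-max rescaling and a null-tolerant sigmoid. The function names, the 0.5 fallback for a zero range, and returning None for a null input are all assumptions chosen for illustration; the notes do not say what the fixed functions return. Note also that 0.0 / 0.0 yields NaN in Java floating point, whereas Python raises ZeroDivisionError, so the guard matters in both languages.

```python
import math
from typing import Optional


def rescale(value: float, min_value: float, max_value: float) -> float:
    """Min-max normalize `value` into [0, 1], guarding a zero-width range."""
    span = max_value - min_value
    if span == 0.0:
        # Assumed fallback: every value is identical, so return the midpoint
        # instead of dividing by zero.
        return 0.5
    return (value - min_value) / span


def sigmoid(x: Optional[float]) -> Optional[float]:
    """Logistic function that passes a missing (None) input through."""
    if x is None:
        return None  # assumed null-in, null-out behavior
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)
```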
