The release version of Hivemall v0.3.1

@myui myui released this · 8 commits to master since this release

This is the stable release version of Hivemall v0.3.1. We have changed the license of Hivemall from LGPL v2 to Apache License v2 from this release.

From this release, stable releases of Hivemall are also published on Maven Central:

<dependency>
    <groupId>io.github.myui</groupId>
    <artifactId>hivemall</artifactId>
    <version>0.3.1</version>
</dependency>

Changes since v0.3 are summarized as follows:

  • Minor Enhancement

    • Added add_feature_index() UDF [bad68ec]
    • Added hivemall_version() UDF [ae493a2]
-- assign indices to dense features
select add_feature_index(array(3,4.0,5)) from dual;
> ["1:3.0","2:4.0","3:5.0"]
  • Major Changes

    • Changed the license from LGPL v2 to Apache License v2 [cc8be7e]
  • Minor Changes

    • Modified extract_weight to accept a categorical feature representation [f5a3c28]
    • Modified to accept various types in addition to INT for label (classification) [c352670]
    • Modified to accept both float/double in label/target (regression) [35bedf1]
    • Feature parsing scheme has been refactored to be more efficient [e37c47f]
    • Changed the implementation of array_avg from ArrayAvgUDAF to ArrayAvgGenericUDAF [3a21853]
-- extract_weight() in Hivemall v0.3.1 accepts categorical variables (e.g., "weight") in addition to quantitative variables (e.g., "weight:55.0")
select extract_weight("weight"), extract_weight("weight:55.0") from dual;
> 1.0 | 55.0
select 
   -- logress(addBias(features), CAST(label as FLOAT)) as (feature, weight) -- Hivemall v0.3 (need to cast labels)
   logress(addBias(features), label) as (feature, weight) -- Hivemall v0.3.1 or later (no need to cast labels)
  • Bugfixes
    • Fixed a bug in jaccard() that assumed dividing integer expressions yields a float result [4115913]
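
The add_feature_index() and extract_weight() examples above can be mirrored outside Hive to make the string format explicit. The following is a minimal Python sketch of the behavior shown in those examples, not Hivemall's implementation; the Python function names simply reuse the UDF names, and treating a feature without ":" as weight 1.0 follows the sample output above.

```python
def add_feature_index(values):
    """Turn a dense vector into 1-based "index:value" strings."""
    return [f"{i}:{float(v)}" for i, v in enumerate(values, start=1)]


def extract_weight(feature):
    """Return the weight of a feature string; a bare name counts as 1.0."""
    _name, sep, weight = feature.rpartition(":")
    return float(weight) if sep else 1.0


print(add_feature_index([3, 4.0, 5]))  # ['1:3.0', '2:4.0', '3:5.0']
print(extract_weight("weight"), extract_weight("weight:55.0"))  # 1.0 55.0
```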

The release version of Hivemall v0.3.0

@myui myui released this · 72 commits to master since this release

This is the first stable release version of Hivemall v0.3.0.

A major enhancement within this release is support for matrix factorization.
Note that Hivemall v0.3 or later supports Hive v0.11 or later.

Changes since v0.3_beta3 are summarized as follows:

  • Major Enhancement

  • Minor Enhancement

    • Added f1score() UDAF that represents f-measure [6dedbbf]
    • Added kld() UDF [41ceeb0]
    • Added FixedEtaEstimator [5dfc7ef]
    • Supported mae/mse/rmse UDAFs [8ff0cdf]
    • Added rand_gid/rand_gid2 macro [caf5266]
    • Supported a new function, train_mf_adagrad [fb4e8f2]
  • Major Changes

    • Refactored parameter averaging schemes [7ff8985][71566fd]
    • Fixed the averaging scheme of covariance [ee43d2c]
  • Bugfix

    • Fixed rand_amplify() to shuffle in sweepAll() where buffer is not filled [5b798f1]
    • Fixed DoubleWritable import in kld() [2c0e7d5]
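
For readers unfamiliar with the evaluation metrics added in this release, the sketch below spells out the textbook definitions of MAE, MSE, RMSE, and the F1 score in plain Python. It is an illustration only: the argument order and the binary-label convention (1 = positive) are assumptions, and the input format of Hivemall's own UDAFs is not described in these notes.

```python
import math


def mae(predicted, actual):
    """Mean absolute error."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def mse(predicted, actual):
    """Mean squared error."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error."""
    return math.sqrt(mse(predicted, actual))


def f1score(actual, predicted):
    """Harmonic mean of precision and recall over binary labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a != 1 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p != 1)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


print(mae([1, 2, 3], [1, 2, 5]), rmse([1, 2, 3], [1, 2, 5]))
print(f1score([1, 1, 0, 0], [1, 0, 1, 0]))
```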

Beta version #3 of v0.3

@myui myui released this · 161 commits to master since this release

A major change in this maintenance release is the support for TF-IDF computation.
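
As a reminder of what a TF-IDF computation involves, here is a small Python sketch. It assumes one common weighting, term frequency times (log10(N / df) + 1); the release notes do not state which variant Hivemall uses, and the function name and input layout are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps a document id to its token list; returns id -> {term: score}."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    scores = {}
    for doc_id, tokens in docs.items():
        counts = Counter(tokens)
        scores[doc_id] = {
            term: (count / len(tokens)) * (math.log10(n_docs / df[term]) + 1.0)
            for term, count in counts.items()
        }
    return scores


example = {"d1": ["hive", "hive", "ml"], "d2": ["hive", "sql"]}
print(tf_idf(example))
```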

Beta version #2 of v0.3

@myui myui released this · 230 commits to master since this release

A major change in this maintenance release is the support for AdaGrad/AdaGradRDA/AdaDelta.

For the usage, see the following examples:

  • Enhancement

    • Supported AdaGrad/AdaDelta
    • Supported AdaGradRDA classification
  • Major changes

    • logress_iter() is no longer supported [ad10c33]
  • Minor changes

    • Stopped using String#split because of its poor performance [9e0a95c]
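
The idea behind AdaGrad can be sketched in a few lines of Python: each feature keeps a running sum of squared gradients, and its step size shrinks as that sum grows. This is the generic textbook update, not Hivemall's code; the names adagrad_step, eta, and eps, and their default values, are assumptions made for illustration.

```python
import math


def adagrad_step(weights, grad_sq_sums, gradient, eta=1.0, eps=1.0):
    """Apply one AdaGrad update in place for a sparse gradient {feature: g}."""
    for feature, g in gradient.items():
        # Accumulate the squared gradient seen so far for this feature.
        grad_sq_sums[feature] = grad_sq_sums.get(feature, 0.0) + g * g
        # Per-feature step size decays with the accumulated squared gradient.
        step = eta / math.sqrt(eps + grad_sq_sums[feature])
        weights[feature] = weights.get(feature, 0.0) - step * g
    return weights


weights, sums = {}, {}
adagrad_step(weights, sums, {"a": 1.0})
adagrad_step(weights, sums, {"a": 1.0})
print(weights, sums)
```

Features that receive frequent or large gradients end up with smaller steps, which is why the method suits sparse feature vectors.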

The initial beta release of v0.3 (beta1)

@myui myui released this · 270 commits to master since this release

We are pleased to announce the initial beta release (beta1) of Hivemall v0.3.

The major enhancement in this release is the support for model mixing.
See this page to learn how to use the new feature.

You can find a brief explanation of the internal design of MIX protocol in this slide.

Note that this is a pre-release intended for alpha users. Any feedback is welcome!

The stable release of Hivemall v0.2

@myui myui released this

This is the stable release of Hivemall v0.2, recommended for production use.

From this release, Hive v0.11 or later is required.


Changes since v0.2-alpha4 are small and are as follows:

  • Minor changes

    • Changed the package structure w.r.t. hivemall.io.* [95f4e0b]
  • Bugfix

    • Fixed a bug in the "disable_halffloat" option [1c975e2]
    • Fixed maven dependencies [a8d09ec]

Alpha version #4 of v0.2

@myui myui released this · 371 commits to master since this release

The major change in this release is support for a dense model through the "-densemodel" option.
When the feature dimension is large (greater than 2^24), SpaceEfficientDenseModel is used.

Note that this release contains breaking changes: the "-fh" and "-b" options have been removed.
Implicit feature hashing and the bias clause through training options are no longer supported.
Use explicit mhash() or add_bias() instead.


  • Major Changes

    • Added support for a dense model and removed the bias option [84e1e0b]
    • Added SpaceEfficientDenseModel [a9076b5]
  • Minor Changes

    • Changed the default value of confidence parameter (-phi) [98077b4]
    • The default bias feature dimension is changed to "0" [dc46f73]
    • Fixed mhash() and sha() to return a hashed value starting from 1 [0b0ca68]
    • Added function aliases [3c7c868]
    • Removed output_touched option [bf99be3]
    • Added support for using a fixed seed in rand_amplify() [731275c]
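
To illustrate what explicit feature hashing plus an explicit bias term looks like, the Python sketch below hashes feature names into indices starting from 1 and appends a bias feature at index 0, in line with the changes listed above. It is only an analogy: CRC32 stands in for whatever hash mhash() actually uses, the 2^24 hash space is an assumed default, and hash_feature and prepare_features are hypothetical names.

```python
import zlib

NUM_FEATURES = 2 ** 24  # assumed size of the hash space


def hash_feature(name, num_features=NUM_FEATURES):
    """Map a feature name to an index in [1, num_features]."""
    return 1 + zlib.crc32(name.encode("utf-8")) % num_features


def prepare_features(features, bias_index=0):
    """Hash "name:value" features explicitly and append a bias term."""
    hashed = []
    for feature in features:
        name, _sep, value = feature.rpartition(":")
        hashed.append(f"{hash_feature(name)}:{value}")
    # Hashed indices start at 1, so index 0 never collides with the bias.
    hashed.append(f"{bias_index}:1.0")
    return hashed


print(prepare_features(["height:1.5", "weight:55.0"]))
```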

Alpha version #3 of v0.2

@myui myui released this · 421 commits to master since this release

A major change in this maintenance release is the addition of argmin_kld().
See the usage in this example.

  • Enhancement

    • Added extract_weight(string featureVectors)::weights UDF [88af7b4]
    • Added "-outputs_touched" option for model loading [4cd59f2]
    • Added a function, lr_datagen(), used to generate regression datasets [273c486, cdbebbd]
    • Added generate_series() UDTF [5b21017]
    • Added normalize() UDF [707a16f]
  • Major Changes

    • Added argmin_kld(mean, covar) to replace voted_avg(mean) [0731913]
  • Minor Changes

    • Modified to reuse a decompressor [5c9c2ff]
    • Added a signature addBias(array features, int bias) [2fae08c]
    • Reduced object allocations in rand_amplify() [753e123]
  • Bugfix

    • Fixed a bug in distcache_get() [946fe10]
    • Fixed a bug in logress_iter() [667c3fd]
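
One plausible reading of argmin_kld(mean, covar) is a precision-weighted average: per-model weights with low covariance (high confidence) count for more than uncertain ones when models are merged. The release notes do not spell out the formula, so the Python sketch below is an assumption about the technique rather than a description of Hivemall's implementation.

```python
def argmin_kld(means, covars):
    """Merge per-model (mean, covariance) estimates of one feature weight."""
    sum_inv_covar = sum(1.0 / c for c in covars)
    sum_mean_div_covar = sum(m / c for m, c in zip(means, covars))
    return sum_mean_div_covar / sum_inv_covar


# With equal covariances this reduces to a plain average.
print(argmin_kld([1.0, 3.0], [1.0, 1.0]))  # 2.0
# A larger covariance on the second model pulls the result toward the first.
print(argmin_kld([1.0, 3.0], [1.0, 3.0]))
```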

Alpha version #2 of v0.2

@myui myui released this · 485 commits to master since this release

Important: Hive v0.11 or later is required (or recommended) from this release.

A major enhancement for this pre-release is the support for iterative training using distributed cache.
Find examples in our wiki pages [1] [2].

Also, the return value of cw/arow/scw has changed to include covariance as well.
See examples (e.g., this one) for details.


  • Enhancement

  • Major Changes

    • Modified to return covariance in cw/arow/scw [e7ae2f1]
    • Added aliases for UDFs [8ee0586]
    • Changed hive version dependency to 0.11 or later [a856753, 96d7593]
  • Minor Changes

    • Fixed UDFs to return Hive values instead of Java primitive values [71a3415]
    • Fixed '==' to 'equals' for HivemallConstants [c9c0291, 8d34ce5]
    • Fixed mhash UDF for array inputs [fa8efcb]
  • Bugfix

    • Fixed a bug in multiclass PA [46da974]

Alpha version #1 of v0.2

@myui myui released this · 557 commits to master since this release

  • Enhancement
    • kNN search using Minhash
    • kNN search using b-Bit Minhash
    • kNN search using cosine similarity
    • Introduced a recommendation scheme using Minhash
    • Modified to use OpenHashTable instead of HashMap for reducing memory consumption [9698972]
  • Bugfix
    • Fixed a bug in rand_amplify() [0962c07]
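
The Minhash-based kNN search listed above rests on a simple idea: two sets that share many items tend to produce the same minimum value under a random hash function, so the fraction of matching minima estimates their Jaccard similarity. The Python sketch below shows that idea with simple linear hash functions over CRC32 values; the function names, constants, and hash family are illustrative assumptions and are unrelated to Hivemall's actual UDFs.

```python
import zlib

PRIME = 2_147_483_647  # Mersenne prime used as the hash modulus


def minhash_signature(items, num_hashes=64):
    """Return one minimum hash value per hash function (items must be non-empty)."""
    base = [zlib.crc32(item.encode("utf-8")) for item in items]
    signature = []
    for k in range(1, num_hashes + 1):
        a, b = 2 * k + 1, 7919 * k  # parameters of the k-th linear hash
        signature.append(min((a * h + b) % PRIME for h in base))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where the two signatures agree."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


a = minhash_signature(["apple", "banana", "cherry"])
b = minhash_signature(["banana", "cherry", "durian"])
print(estimate_jaccard(a, b))
```

Items whose signatures agree in many positions are candidate neighbors, which avoids comparing every pair of full sets.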
