daijyc edited this page Aug 14, 2015 · 8 revisions

This wiki page explains how to use Hivemall with Pig. The feature requires Pig 0.15 or later.

Dataset generation

  1. classification/logistic regression

Binary Classification

a9a binary classification

  1. Logistic Regression

news20 binary classification

  1. Perceptron, Passive Aggressive
  2. CW, AROW, SCW
  3. AdaGradRDA, AdaGrad, AdaDelta

KDD2010a/b binary classification

  1. PA/CW/AROW/SCW
  2. AROW

Webspam binary classification

  1. PA1, AROW, SCW

Multiclass Classification

news20 multiclass classification

  1. PA
  2. CW, AROW, SCW
  3. Ensemble learning
  4. one vs the rest classifier

Regression

E2006 tfidf regression

  1. Passive Aggressive, AROW

KDDCup 2012 track 2 CTR prediction

  1. Logistic Regression, Passive Aggressive

Recommendation

News20 multiclass related article recommendation

  1. LSH/Minhash

MovieLens movie recommendation

  1. Matrix Factorization
  2. 10-fold Cross Validation (Matrix Factorization)

Nearest Neighbor

News20 multiclass similar article search

  1. LSH/Minhash and Brute-Force Search
  2. kNN search using b-Bits Minhash