daijyc edited this page Feb 24, 2015 · 81 revisions

Welcome to the Hivemall wiki!

General tips

Troubleshooting

Advanced topics

Particularly Important Tips

Feature Engineering

Evaluation

  1. Statistical evaluation of a prediction model

Dataset generation

  1. classification/logistic regression

Binary Classification

a9a binary classification

  1. Logistic Regression
  2. Iterative training using distributed cache

news20 binary classification

  1. Perceptron, Passive Aggressive
  2. CW, AROW, SCW
  3. AdaGradRDA, AdaGrad, AdaDelta

KDD2010a/b binary classification

  1. PA/CW/AROW/SCW
  1. AROW

Webspam binary classification

  1. PA1, AROW, SCW

Multiclass Classification

news20 multiclass classification

  1. PA
  2. CW, AROW, SCW
  3. Ensemble learning
  4. One-vs-the-rest classifier

Regression

E2006 tfidf regression

  1. Passive Aggressive, AROW

KDDCup 2012 track 2 CTR prediction

  1. Logistic Regression, Passive Aggressive
  2. Logistic Regression with Amplifier
  3. AdaGrad, AdaDelta

Recommendation

News20 multiclass related article recommendation

  1. LSH/Minhash

MovieLens movie recommendation

  1. Matrix Factorization
  2. 10-fold Cross Validation (Matrix Factorization)

Nearest Neighbor

News20 multiclass similar article search

  1. LSH/Minhash and Brute-Force Search
  2. kNN search using b-Bits Minhash