Experiments testing transductive SVMs for my blog posts



The goal of this project is to design and run data science experiments that test various transductive and semi-supervised learning algorithms.

The TSVM theory is described on my blog: http://charlesmartin14.wordpress.com/2014/07/06/machine-learning-with-missing-labels-transductive-svms/

The first objective is to test svmlin (http://vikas.sindhwani.org/svmlin.html) against liblinear (http://www.csie.ntu.edu.tw/~cjlin/liblinear/), using the binary datasets provided for libsvm (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html), to determine how to tune svmlin well and what kinds of data sets it performs well on.
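One way to set up such a comparison is to take a fully labeled libsvm-format file and hide most of the labels before handing it to svmlin. The sketch below assumes svmlin's input convention as I understand it from its documentation (a feature file without labels, plus a separate label file with one entry per line, where 0 marks an unlabeled example). The function names, the sample rows, and the labeled fraction are illustrative and not taken from this repository.

```python
import random


def split_libsvm_line(line):
    """Split one libsvm-format line into (label, feature_string)."""
    label, _, features = line.strip().partition(" ")
    return int(float(label)), features


def hide_labels(lines, labeled_fraction=0.1, seed=0):
    """Return (feature_rows, labels) with most labels replaced by 0.

    Assumption: 0 marks an unlabeled example, as in svmlin label files.
    """
    rng = random.Random(seed)
    rows, labels = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        label, features = split_libsvm_line(line)
        rows.append(features)
        # Keep the true label with probability labeled_fraction, else hide it.
        labels.append(label if rng.random() < labeled_fraction else 0)
    return rows, labels


# Made-up rows in libsvm format: "<label> <index>:<value> ..."
sample = [
    "-1 3:1 11:1 14:1",
    "+1 5:1 7:1 14:1",
    "-1 2:1 6:1 18:1",
]
rows, labels = hide_labels(sample, labeled_fraction=0.5)
```

Writing `rows` and `labels` to two files, one entry per line, would give the pair of inputs for a transductive run, while the untouched original file can still be passed to liblinear as the supervised baseline.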

Later, we would like to look at semi-supervised learning algorithms such as


and the Python scikit-learn label propagation algorithm.
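For orientation, here is a minimal sketch of scikit-learn's label propagation on toy data. It assumes scikit-learn and NumPy are installed. The two point clusters and the `gamma` value are made up for illustration. In scikit-learn, unlabeled examples are marked with -1 (not 0).

```python
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Two well-separated toy clusters with one labeled point each.
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],   # cluster near the origin
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],   # cluster near (5, 5)
])
y = np.array([0, -1, -1, 1, -1, -1])      # -1 means "unlabeled"

model = LabelPropagation(kernel="rbf", gamma=1.0)
model.fit(X, y)

# transduction_ holds the label inferred for every training point.
inferred = model.transduction_
print(inferred)
```

The `transduction_` attribute is the transductive result: labels for the same points that were passed to `fit`, which is the quantity to compare against a TSVM on the same split.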


For newbies: if you don't know anything about machine learning, you should first learn how to run liblinear on the libsvm data sets.

We need someone to create a liblinear tutorial. For now, you can see http://jamescpoole.com/2012/10/30/libsvm-tutorial-part-1-overview/

libsvm is almost identical to liblinear in its command-line usage and data format, so that tutorial carries over.
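Until a proper tutorial exists, the basic liblinear workflow can be sketched as two commands: `train` builds a model from the training file and `predict` scores the test file. The sketch below only assembles those command lines and runs them if the binaries are actually present. The install directory and the output file names (`.model`, `.predictions`) are assumptions for illustration.

```python
import os
import shutil
import subprocess


def liblinear_commands(liblinear_dir, dataset):
    """Build the train and predict command lines for one data set.

    Assumes the training file is named <dataset> and the test file
    <dataset>.t, as on the libsvm data set page.
    """
    model_file = dataset + ".model"
    output_file = dataset + ".predictions"
    train = [os.path.join(liblinear_dir, "train"), dataset, model_file]
    predict = [os.path.join(liblinear_dir, "predict"),
               dataset + ".t", model_file, output_file]
    return train, predict


def run_if_available(command):
    """Run the command only if its executable exists; return True if it ran."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


# Hypothetical install location; adjust to wherever liblinear was built.
train_cmd, predict_cmd = liblinear_commands("/opt/liblinear-1.94", "a1a")
print(" ".join(train_cmd))
print(" ".join(predict_cmd))
```

With libsvm the same pattern applies, except that the executables are named `svm-train` and `svm-predict`.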

To get started:

  0. requirements: ruby 2.x and GNU parallel. Ruby can be installed using rvm; GNU parallel should be in the path

  1. download and install liblinear and svmlin

  2. download the a1a training and test data sets

  3. edit svmlin.rb and set the variables

SVMLIN_DIR = "~/packages/svmlin-v1.0"

LIBLINEAR_DIR = "~/packages/liblinear-1.94"

  4. run svmlin.rb a1a

  5. repeat for the a2a, a3a, ... data sets and the w2a, w3a, ... data sets
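The repeat step can be scripted. The sketch below generates data set names for a series and the matching command lines. It assumes the script is invoked as `ruby svmlin.rb <name>`, and the upper bound of each series is a parameter you set to however many data sets you actually downloaded. Both helper names are hypothetical.

```python
def dataset_names(prefix, first, last):
    """Names such as a1a, a2a, ... for a series prefix and index range."""
    return [f"{prefix}{i}a" for i in range(first, last + 1)]


def svmlin_rb_commands(names):
    """One command line per data set (assumed invocation: ruby svmlin.rb NAME)."""
    return [["ruby", "svmlin.rb", name] for name in names]


# Illustrative ranges only; extend `last` to match the files you have.
names = dataset_names("a", 1, 3) + dataset_names("w", 2, 3)
for command in svmlin_rb_commands(names):
    print(" ".join(command))
```

Because each printed line is a complete command, the output could also be piped to GNU parallel to run several data sets at once.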