
Implementation of Barnes-Hut t-SNE, a fast algorithm that embeds high-dimensional data in low dimensions. Binaries available for some distributions.


sg-s/bhtsne

 
 


Barnes-Hut t-SNE

This software package contains a Barnes-Hut implementation of the t-SNE algorithm. The implementation is described in this paper.

Installation

Assuming that you are only using the MATLAB wrapper that comes with this package, install it using my package manager:

% copy and paste this into your MATLAB prompt
urlwrite('http://srinivas.gs/install.m','install.m'); 
install sg-s/bhtsne
install sg-s/srinivas.gs_mtools

Then, you can either compile from source, or use one of the binaries linked here.

Compiling binaries from source

On Linux or OS X, compile the source using the following command:

g++ sptree.cpp tsne.cpp -o bh_tsne -O2

The executable will be called bh_tsne.

On Windows using Visual C++, do the following in your command line:

  • Find the vcvars batch file in your Visual C++ installation directory. It may be named vcvars64.bat or something similar. For example:
  // Visual Studio 12
  "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\amd64\vcvars64.bat"

  // Visual Studio 2013 Express:
  C:\VisualStudioExp2013\VC\bin\x86_amd64\vcvarsx86_amd64.bat
  • From cmd.exe, go to the directory containing that .bat file and run it.

  • Go to the bhtsne directory and run:

  nmake -f Makefile.win all

The executable will be called windows\bh_tsne.exe.

Usage

The code comes with wrappers for MATLAB and Python. These wrappers write your data to a file called data.dat, run the bh_tsne binary, and read the result file result.dat that the binary produces. There are also external wrappers available for Torch, R, and Julia. Writing your own wrapper should be straightforward; please refer to one of the existing wrappers for the format of the data and result files.
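
The write/run/read cycle described above can be sketched in Python as follows. This is only an illustration, not the wrapper shipped with this package: the function names are hypothetical, and the binary layout of data.dat and result.dat used here (little-endian, a small header followed by row-major doubles) is an assumption. Check one of the existing wrappers for the actual format before relying on it.

```python
import os
import struct
import subprocess


def write_data_file(path, rows, theta=0.5, perplexity=50.0, no_dims=2):
    """Write `rows` (a list of equal-length lists of floats) to `path`.

    ASSUMED layout (verify against an existing wrapper):
    n (int32), d (int32), theta (float64), perplexity (float64),
    no_dims (int32), then n*d float64 values in row-major order.
    """
    n, d = len(rows), len(rows[0])
    with open(path, "wb") as f:
        f.write(struct.pack("<iiddi", n, d, theta, perplexity, no_dims))
        for row in rows:
            f.write(struct.pack("<%dd" % d, *row))


def run_bh_tsne(binary, workdir):
    """Run the compiled binary inside `workdir`, where data.dat was written."""
    subprocess.run([os.path.abspath(binary)], cwd=workdir, check=True)


def read_result_file(path):
    """Read the embedding from `path`.

    ASSUMED layout: n (int32), d (int32), then n*d float64 values.
    Anything stored after the embedding is ignored.
    """
    with open(path, "rb") as f:
        n, d = struct.unpack("<ii", f.read(8))
        flat = struct.unpack("<%dd" % (n * d), f.read(8 * n * d))
    return [list(flat[i * d:(i + 1) * d]) for i in range(n)]
```

A caller would write data.dat into a working directory with write_data_file, call run_bh_tsne('./bh_tsne', workdir), and then pass the resulting result.dat to read_result_file.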

Demonstration of usage in MATLAB:

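% Download the MNIST training data and load it into the workspace
filename = websave('mnist_train.mat', 'https://github.com/awni/cs224n-pa4/blob/master/Simple_tSNE/mnist_train.mat?raw=true');
load(filename);
% numDims: dimensionality of the embedding; pcaDims: dimensions kept by the initial PCA step
numDims = 2; pcaDims = 50; perplexity = 50; theta = .5; alg = 'svd';
% Compute the embedding with the bh_tsne binary
map = fast_tsne(digits', numDims, pcaDims, perplexity, theta, alg);
% Plot the 2-D embedding, grouped by digit label
gscatter(map(:,1), map(:,2), labels');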
