Neural-net-based utility to build low-dimensional codes and/or sparse codes

codemaker

Python utilities based on theano, pynnet and scikits.learn for learning vector encoders that map vector data to either:

  • dense codes in a low-dimensional space that try to preserve the local structure of the data, useful for semantic mapping and visualization
  • sparse codes in a medium- to high-dimensional space, useful for semantic indexing, denoising and compression

Project status

This is experimental code. Nothing is expected to work as advertised yet :)

Implemented:

  • deterministic (optimal) sparse encoding using an existing dictionary and Least Angle Regression (see codemaker.sparse)
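
As a rough illustration of the idea (not of the codemaker.sparse API itself, which is not reproduced here), the sketch below encodes a few vectors against a fixed dictionary with Least Angle Regression. It assumes a recent scikit-learn and uses its sparse_encode function; the random dictionary, the sizes and the sparsity level of 3 are made up for the example.

```python
import numpy as np
from sklearn.decomposition import sparse_encode

rng = np.random.RandomState(0)

# Hypothetical dictionary: 50 unit-norm atoms living in a 20-dimensional space.
n_atoms, n_features = 50, 20
dictionary = rng.randn(n_atoms, n_features)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)

# Toy data: each sample is a combination of 3 randomly chosen atoms.
true_codes = np.zeros((5, n_atoms))
for row in true_codes:
    active = rng.choice(n_atoms, size=3, replace=False)
    row[active] = rng.randn(3)
data = np.dot(true_codes, dictionary)

# Encode with Least Angle Regression, keeping at most 3 active atoms per sample.
codes = sparse_encode(data, dictionary, algorithm="lars", n_nonzero_coefs=3)

print(codes.shape)                        # one code row per sample
print(np.count_nonzero(codes, axis=1))    # active atoms per sample
```

Each row of codes holds the coefficients of one sample over the dictionary atoms. The selected atoms need not match the ones used to generate the data, since LARS picks them greedily by correlation.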

Work in progress:

  • stochastic neighbor embedding in a low-dimensional space using autoencoders

Planned:

  • stochastic dictionary learning and approximate sparse coding using sparsity-inducing autoencoders (see Ranzato 2007)

Licensing

MIT: http://www.opensource.org/licenses/mit-license.php

Hacking

Download the source distributions of the aforementioned dependencies, untar them in the parent folder of codemaker, build scikits.learn in local mode with python setup.py build_ext -i, and set up the dev environment with:

$ . ./activate.sh

You should now be able to fire up your favorite Python shell and import the codemaker package:

>>> import codemaker
>>> help(codemaker)

Run the tests with the nosetests command.

Examples

Sample usage can be found in the examples folder. Lower-level usage patterns can also be found in the tests folder.

[Figure: plot of the projection and manifold extraction of the swissroll dataset, showing the results of the swissroll example]

Failed attempt at using the codemaker embedding utility to extract a 2D manifold from a toy dataset.