README

Some Octave scripts (GNU Octave is a largely Matlab-compatible numerical language).

Our data is from the subset of archive.org-to-dbpedia associations described at http://danbri.org/words/2011/02/01/658

Usage:

 * Run Octave (OSX: open /Applications/Octave.app)
 * cd /Users/danbri/working/Flatland (or wherever)
 * dosvd [return] at Octave prompt



Basic data (num rows): in data/ ...
	data/programmes.txt  156
	data/concepts.txt 4760
	data/data.txt 156


... numeric matrix derived from movies on archive.org, plus flattened RDF properties.
The programmes.txt and concepts.txt files give labels for rows and columns respectively.

First row of data.txt is numeric IDs (todo: skip these)
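
A minimal sketch of skipping the ID row mentioned above, using a tiny made-up matrix in place of data.txt. The values and the variable names raw, ids and M are illustrative assumptions, not the contents of the real file:

```octave
% Hypothetical stand-in for data.txt: the first row holds numeric IDs.
raw = [101 102 103;
         1   0   1;
         0   1   1];

ids = raw(1, :);      % keep the ID row separately
M   = raw(2:end, :);  % the numeric matrix without the ID row

% With a real file, dlmread(file, sep, 1, 0) starts reading at the second
% row; the right separator for data.txt is not stated in this README.
```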

cluster_demo.m dosvd.m 

* labels.m - generated by labels.rb from data/programmes.txt

myKmeans.m - one of the several k-means implementations for Octave, found at 
	http://www.christianherta.de/kmeans.html
	see also http://openclassroom.stanford.edu/MainFolder/courses/MachineLearning/exercises/ex9materials/kmeans.m
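
For orientation, k-means itself is short enough to sketch. The following is a minimal version of Lloyd's algorithm written for this README, not the code in myKmeans.m; the function name kmeans_sketch, its arguments and the four sample points are all hypothetical:

```octave
1;  % not a function file: lets the function below be defined in a script

function [idx, C] = kmeans_sketch(X, C, iters)
  % X: n x d data, C: k x d starting centroids, iters: number of passes
  n = size(X, 1);
  k = size(C, 1);
  for it = 1:iters
    % squared Euclidean distance from every point to every centroid (n x k)
    D = zeros(n, k);
    for j = 1:k
      diffs = X - repmat(C(j, :), n, 1);
      D(:, j) = sum(diffs .^ 2, 2);
    end
    [dmin, idx] = min(D, [], 2);   % assign each point to its nearest centroid
    for j = 1:k
      if any(idx == j)             % leave an empty cluster's centroid as it is
        C(j, :) = mean(X(idx == j, :), 1);
      end
    end
  end
end

X = [0 0; 0 1; 10 10; 10 11];      % two obvious groups of two points
[idx, C] = kmeans_sketch(X, [0 0; 10 10], 5);
```

With these sample points idx comes out as [1; 1; 2; 2] and C as [0 0.5; 10 10.5]. A fixed number of passes is used instead of a convergence test to keep the sketch short.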


pdist.m - useful for computing distances (This is GPL, so we should either use it and GPL everything, or tidy it away -- todo)
redo.m - hack to re-run the non-slow bit of dosvd.m

misc documentation: prog_test.png - a failed visualisation attempt (something is wrong with it); output from dosvd.m
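
On the pdist.m item above: pairwise Euclidean distances can also be computed directly from matrix products. This is only a sketch of that idea on three made-up points, not a replacement for everything pdist.m does; the names X, sq, D2 and D are assumptions:

```octave
X = [0 0; 3 4; 6 8];          % hypothetical data: one point per row
n = size(X, 1);

sq = sum(X .^ 2, 2);          % squared norm of each row (n x 1)
% |a - b|^2 = |a|^2 + |b|^2 - 2*a.b, for all pairs of rows at once
D2 = sq * ones(1, n) + ones(n, 1) * sq' - 2 * (X * X');
D  = sqrt(max(D2, 0));        % clamp tiny negatives caused by rounding
```

Here D is the full symmetric n x n distance matrix, with zeros on the diagonal.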

See also (Mahout etc.):

 * https://cwiki.apache.org/MAHOUT/k-means-clustering.html (Mahout k-means, to scale this up once we get ideas straight)
 * https://cwiki.apache.org/MAHOUT/dimensional-reduction.html https://cwiki.apache.org/MAHOUT/svd-singular-value-decomposition.html
 * http://lucene.grantingersoll.com/2010/02/16/trijug-intro-to-mahout-slides-and-demo-examples/
 * http://danbri.org/words/2011/06/19/711 k-means notes for octave


NOTES

octave-3.2.3:4> size(M)
ans =

    156   4761

octave-3.2.3:5> size(U)
ans =

   156   156

octave-3.2.3:6> size(S)
ans =

    156   4761

octave-3.2.3:7> size(V)
ans =

   4761   4761
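
The sizes above are what a full SVD gives for an m x n matrix: U is m x m, S is m x n and V is n x n. A small sketch with a random 4 x 7 stand-in for M (the small size, k = 2 and the names Mk and coords are assumptions for illustration; this is not taken from dosvd.m):

```octave
M = rand(4, 7);               % hypothetical stand-in for the 156 x 4761 matrix
[U, S, V] = svd(M);           % full SVD: U is 4x4, S is 4x7, V is 7x7

err = norm(M - U * S * V');   % reconstruction error, close to zero

k = 2;                        % number of singular values to keep
Mk = U(:, 1:k) * S(1:k, 1:k) * V(:, 1:k)';   % rank-k approximation of M
coords = U(:, 1:k) * S(1:k, 1:k);            % each row of M as a k-dim point
```

The rows of coords are the kind of low-dimensional points that could then be passed to a clustering step such as k-means.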



BUGS:

16:45 danbri: hmm is your one missing the row for prog=10?
16:45 danbri: i'm one film short
16:46 libby: yeah appears to be
16:46 danbri: try  egrep '^10' data.txt  

