This repository has been archived by the owner. It is now read-only.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Type||Name||Latest commit message||Commit time|
|Failed to load latest commit information.|
follower cluster using infochimps wordbag api can i cluster the people i follow on twitter? get friends and stick that into ./get_words_for_friends.rb which injects into mongo $ ./get_words_for_friends.rb mat_kelcey INFOCHIMPS_APIKEY for my 142 friends there is mostly no overlap with terms of the 10,000 pairings there are only 3100 cases where there is at least one item in common the max overlap is 15 items, median 0, mean 0.4 :( running tokens through a snowball stemmer (http://github.com/aurelian/ruby-stemmer) improves a fraction... max of 17 with mean of 0.6 so without even worrying about a similiarity metric what about just seeing who has any overlapping tokens? run over and cluster ppl with $ ./cluster.rb | sort -nr > similiarities.tsv $ cat similiarities.tsv | head -n 35 | ./dotify.rb | circo -Tpng > top35.png --- irbness irb$ require 'rubygems';require 'mongo';mongo=Mongo::Connection.new('localhost',12300);users=mongo.db('wordbag')['users']