commit 1a476c505933b53728387d17c63a251dd42393c4
tree 1b669b2a3fd9da059096cb9cd784ef86bdc59496
parent 519026a29582a1c8f2154b685b57c32d7c86cc56
| name | age | message |
|---|---|---|
| README | Fri Oct 09 12:13:52 -0700 2009 | |
| autosvm.py | Fri Oct 09 09:05:25 -0700 2009 | |
| classify.py | Fri Oct 09 11:51:32 -0700 2009 | |
| conf.py | Fri Oct 09 09:05:25 -0700 2009 | |
| crawl_delicious.py | Fri Oct 09 11:32:25 -0700 2009 | |
| featurize.py | Fri Oct 09 09:05:25 -0700 2009 | |
| gen_training_test_set.py | Fri Oct 09 09:05:25 -0700 2009 | |
| libsvm-2.89.tar.gz | Wed Oct 07 20:00:46 -0700 2009 | |
| tags.txt | Wed Oct 07 20:00:46 -0700 2009 | |
| test_data.txt | Fri Oct 09 11:32:25 -0700 2009 | |
| training_data.txt | Fri Oct 09 11:32:25 -0700 2009 | |
| vector_data.cpickle | Fri Oct 09 11:32:25 -0700 2009 | |
README
@author: Vik Singh (viksi@yahoo-inc.com)

A simple BOSS example for Yahoo! Hack Day NYC
A machine-learned tagger trained on BOSS web / delicious data

Read this if you want to learn more, and especially check out the caveats
section if you're planning to use this code for more practical purposes:
http://zooie.wordpress.com/2009/10/09/build-an-automatic-tagger-in-200-lines-with-boss/

# Install libsvm
tar -xzvf libsvm-2.89.tar.gz
cd libsvm-2.89
make
cd ..

# Optional: Crawl fresh delicious data via BOSS (a previous crawl is already included)
python crawl_delicious.py

# Generate a binary training set via two tags (pick from tags.txt)
python gen_training_test_set.py microsoft google

# Learn from the resulting training_data.txt and predict on test_data.txt
python autosvm.py training_data.txt test_data.txt

# Prints out the accuracy of the learner and saves model + prediction files in a timestamped folder
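
To make the "generate a binary training set" step more concrete, here is a minimal sketch of how documents for two tags could be written in libsvm's sparse text format (`<label> <index>:<value> ...`, with ascending 1-based indices). It is not the code in gen_training_test_set.py or featurize.py: the function names, the whitespace tokenizer, the raw-count features, and the toy data are all assumptions made for illustration, and it is written for Python 3.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a 1-based feature index (hypothetical helper)."""
    vocab = {}
    for text in documents:
        for token in text.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def to_libsvm_line(label, text, vocab):
    """Format one example as '<label> <index>:<count> ...' with ascending indices."""
    counts = Counter(vocab[t] for t in text.lower().split() if t in vocab)
    features = " ".join(f"{idx}:{cnt}" for idx, cnt in sorted(counts.items()))
    return f"{label:+d} {features}".rstrip()


# Toy data (assumed): one short document per tag.
examples = [
    ("microsoft", "windows office windows"),
    ("google", "search ads search"),
]
# Binary labels: the first tag is the positive class, the second the negative.
labels = {"microsoft": 1, "google": -1}

vocab = build_vocabulary(text for _, text in examples)
lines = [to_libsvm_line(labels[tag], text, vocab) for tag, text in examples]

for line in lines:
    print(line)
```

Each printed line is one training example. A real featurizer would likely weight or normalize the counts rather than use them raw.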







