code for the ml class
Python
README.md
arts
classify.py
delicious_import.py
distance_demo.py
links.csv
nytimes_pull.py
rec.py
sports
stopwords.txt
tag_clustering.py

Machine Learning

class taught by Hilary Mason

Install

  • JSONView Chrome extension
  • Python (2.5, 2.6 or 2.7). Note from scipy.org: the NumPy installer should be used with the Python from http://python.org, not with Apple's bundled Python. The two are incompatible: for one, the python.org version is 32-bit while the Apple version is 64-bit. Apple is also way behind on security updates, so python.org is normally the way to go.
  • NLTK
  • NumPy
  • pycluster
  • hcluster

Verify your install (type the commands that follow '$' at the command prompt; the rest is sample output)

$ python --version
Python 2.6.1

Any 2.5, 2.6 or 2.7 release is fine. 3.0 is not going to work.

$ python
Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> 

If you don't get an error, it means it worked. Type exit() to leave the interactive python console.
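The same check can be run for every library in the install list at once. This is a minimal sketch, not part of the class code: the helper name `check_modules` is made up for illustration, and the import names (in particular `Pycluster` with a capital P) are assumptions about how each package exposes itself.

```python
def check_modules(names):
    """Try to import each module name; return {name: True/False}."""
    status = {}
    for name in names:
        try:
            __import__(name)
            status[name] = True
        except Exception:
            # Any failure while importing counts as "not usable".
            status[name] = False
    return status


if __name__ == "__main__":
    # Assumed import names for the packages listed under Install.
    wanted = ["nltk", "numpy", "Pycluster", "hcluster"]
    for name, ok in sorted(check_modules(wanted).items()):
        print("%-10s %s" % (name, "ok" if ok else "MISSING"))
```

Save it under any file name and run it with the same `python` you verified above; a MISSING line means that import failed.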

References

Classifying Web Documents

Register for an API key at http://developer.nytimes.com/apps/register and select "Article Search API"

Article Search API

example command-line: curl "http://api.nytimes.com/svc/search/v1/article?query=jazz&api-key="
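The same request URL can be assembled in Python. This is only a sketch of the URL construction, assuming the v1 endpoint from the curl example; the function name `build_search_url` and the `YOUR_API_KEY` placeholder are hypothetical, and it is not necessarily how nytimes_pull.py builds its requests.

```python
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3

# Endpoint taken from the curl example above.
BASE_URL = "http://api.nytimes.com/svc/search/v1/article"


def build_search_url(query, api_key):
    """Return the Article Search URL for a query string and an API key."""
    # A list of pairs keeps the parameters in the same order as the curl example.
    params = [("query", query), ("api-key", api_key)]
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # YOUR_API_KEY is a placeholder; substitute the key you registered.
    print(build_search_url("jazz", "YOUR_API_KEY"))
```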

python nytimes_pull.py

This creates two files, "arts" and "sports".

Naive Bayes Classifier
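For orientation, here is a minimal sketch of the idea behind a naive Bayes text classifier: score each label by its prior plus the log-probability of each word under that label, with add-one smoothing. It is not the code in classify.py; the function names and the tiny in-memory training set are invented for illustration, and the real "arts" and "sports" files are not read here.

```python
from __future__ import division
import math


def train(labeled_docs):
    """labeled_docs: list of (label, list_of_words). Returns a model tuple."""
    doc_counts = {}    # label -> number of documents
    word_counts = {}   # label -> {word -> count}
    vocab = set()
    for label, words in labeled_docs:
        doc_counts[label] = doc_counts.get(label, 0) + 1
        counts = word_counts.setdefault(label, {})
        for w in words:
            counts[w] = counts.get(w, 0) + 1
            vocab.add(w)
    return doc_counts, word_counts, vocab


def classify(model, words):
    """Return the label with the highest log-probability for the words."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, float("-inf")
    for label in sorted(doc_counts):
        # Log prior: share of training documents carrying this label.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            if w not in vocab:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label].get(w, 0)
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data standing in for the two article files.
training = [
    ("arts", ["jazz", "concert", "museum"]),
    ("arts", ["painting", "museum", "gallery"]),
    ("sports", ["game", "score", "team"]),
    ("sports", ["team", "season", "coach"]),
]
model = train(training)
```

With this toy model, `classify(model, ["museum", "jazz"])` picks "arts" and `classify(model, ["team", "coach"])` picks "sports".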

python