Machine Learning

class taught by Hilary Mason

Install

  • JSONView Chrome extension
  • Python (2.5, 2.6 or 2.7). Note from scipy.org: the NumPy installer should be used with the Python from http://python.org, not with Apple's Python. The two are incompatible: for one, the python.org version is 32-bit while the Apple version is 64-bit. Apple is also far behind on security updates, so python.org is normally the way to go.
  • NLTK
  • NumPy
  • pycluster
  • hcluster

Verify your install (type the commands that follow '$' at the command prompt; the rest is sample output):

$ python --version
Python 2.6.1

Any version of 2.5, 2.6 or 2.7 is fine. Python 3.0 will not work.

$ python
Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49) 
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> 

If you don't get an error, the install worked. Type exit() to leave the interactive Python console.
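To check the other packages from the install list in one go, a small helper along these lines may be handy. It is only a sketch: check_modules is a hypothetical name, and the import names "Pycluster" and "hcluster" are assumptions about how those two packages are imported.

```python
# Import names are assumptions; adjust them if a package installs
# under a different module name on your machine.
MODULES = ["nltk", "numpy", "Pycluster", "hcluster"]


def check_modules(names):
    """Return a dict mapping each module name to True if it imports."""
    results = {}
    for name in names:
        try:
            __import__(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results

# Example use at the interactive prompt:
#   >>> check_modules(MODULES)
```

Any name that maps to False still needs to be installed (or is imported under a different name than assumed here).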

References

Classifying Web Documents

Register for an API key at http://developer.nytimes.com/apps/register and select "Article Search API".

Article Search API

Example command line (append your API key after api-key=): curl "http://api.nytimes.com/svc/search/v1/article?query=jazz&api-key="
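The same request URL can be assembled in Python, which avoids quoting mistakes when the query contains spaces. This is a minimal sketch, not the code in nytimes_pull.py: build_search_url is a hypothetical helper, and "YOUR_KEY" is a placeholder for the key you registered.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, the version used in class

# Endpoint taken from the curl example.
BASE_URL = "http://api.nytimes.com/svc/search/v1/article"


def build_search_url(query, api_key):
    """Return the Article Search URL for a query string and an API key."""
    # A list of pairs keeps the parameters in a fixed order.
    params = [("query", query), ("api-key", api_key)]
    return BASE_URL + "?" + urlencode(params)

# Example use:
#   >>> build_search_url("jazz", "YOUR_KEY")
```

The resulting string can be passed to curl, or pasted into Chrome to view the JSON response with JSONView.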

python nytimes_pull.py

This creates two files, "arts" and "sports".

Naive Bayes Classifier

python
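As an illustration of the technique only, here is a minimal word-count naive Bayes classifier with add-one smoothing in plain Python. It is a sketch under stated assumptions, not the class's script: the function names train and classify, and the toy word lists, are made up for this example; only the labels "arts" and "sports" come from the files described above.

```python
import math
from collections import defaultdict


def train(labeled_docs):
    """Count labels and words from a list of (words, label) pairs."""
    label_counts = defaultdict(int)
    word_counts = defaultdict(lambda: defaultdict(int))
    vocab = set()
    for words, label in labeled_docs:
        label_counts[label] += 1
        for w in words:
            word_counts[label][w] += 1
            vocab.add(w)
    return label_counts, word_counts, vocab


def classify(model, words):
    """Return the label with the highest log-probability for the words."""
    label_counts, word_counts, vocab = model
    total_docs = float(sum(label_counts.values()))
    best_label, best_score = None, None
    for label in label_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            if w not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            count = word_counts[label].get(w, 0)
            score += math.log((count + 1.0) / (total_words + len(vocab)))
        if best_score is None or score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for this sketch.
docs = [
    (["jazz", "concert", "music"], "arts"),
    (["painting", "gallery", "music"], "arts"),
    (["game", "score", "team"], "sports"),
    (["team", "coach", "season"], "sports"),
]
model = train(docs)
```

With real data, each document would be the list of words from one article, labeled by the file it came from. NLTK also ships its own naive Bayes classifier, which may be the more convenient choice once the idea is clear.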
