
Added and edited all script headings. Chapters 6, 7, 8, 10, and 12 still require descriptions.
1 parent d70ae86 commit 6e4cba4f6defce74752f9f5ff4c448242836fde1 @drewconway drewconway committed Feb 10, 2012
@@ -1,11 +1,11 @@
# File-Name: package_installer.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
# Purpose: Install all of the packages needed for the Machine Learning for Hackers case studies
# Data Used: n/a
# Packages Used: n/a
-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
# All images and materials produced by this code are licensed under the Creative Commons
@@ -1,7 +1,6 @@
-# File-Name: ml_basics.R
-# Date: 2011-11-01
-# Author: Drew Conway
-# Email: drew.conway@nyu.edu
+# File-Name: ufo_sightings.R
+# Date: 2012-02-10
+# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
# Purpose: Code for Chapter 1. In this case we will review some of the basic
# R functions and coding paradigms we will use throughout this book.
# This includes loading, viewing, and cleaning raw data; as well as
@@ -11,7 +10,7 @@
# Data Used: http://www.infochimps.com/datasets/60000-documented-ufo-sightings-with-text-descriptions-and-metada
# Packages Used: ggplot2
-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
# All images and materials produced by this code are licensed under the Creative Commons
@@ -1,14 +1,14 @@
# File-Name: email_classify.R
-# Date: 2011-11-01
-# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
+# Date: 2012-02-10
+# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
# Purpose: Code for Chapter 3. In this case we introduce the notion of binary classification.
# In machine learning this is a method for determining what of two categories a
# given observation belongs to. To show this, we will create a simple naive Bayes
# classifier for SPAM email detection, and visualize the results.
# Data Used: Email messages contained in data/ directory, source: http://spamassassin.apache.org/publiccorpus/
# Packages Used: tm, ggplot2
-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
# All images and materials produced by this code are licensed under the Creative Commons
@@ -1,5 +1,5 @@
# File-Name: priority_inbox.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
# Purpose: Code for Chapter 4. In this case study we will attempt to write a "priority
# inbox" algorithm for ranking email by some measures of importance. We will
@@ -9,7 +9,7 @@
# source: http://spamassassin.apache.org/publiccorpus/
# Packages Used: tm, ggplot2
-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
# All images and materials produced by this code are licensed under the Creative Commons
@@ -1,3 +1,22 @@
+# File-Name: chapter05.R
+# Date: 2012-02-10
+# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
+# Purpose:
+# Data Used: data/longevity.csv
+# Packages Used: ggplot2
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
library('ggplot2')
# First snippet
@@ -1,3 +1,22 @@
+# File-Name: chapter06.R
+# Date: 2012-02-10
+# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
+# Purpose:
+# Data Used: data/oreilly.csv
+# Packages Used: ggplot2, glmnet, tm, boot
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
library('ggplot2')
# First snippet
@@ -1,3 +1,22 @@
+# File-Name: chapter07.R
+# Date: 2012-02-10
+# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
+# Purpose:
+# Data Used: data/01_heights_weights_genders.csv, data/lexical_database.Rdata
+# Packages Used: n/a
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
# First code snippet
height.to.weight <- function(height, a, b)
{
@@ -1,3 +1,22 @@
+# File-Name: chapter08.R
+# Date: 2012-02-10
+# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
+# Purpose:
+# Data Used: data/DJI.csv, data/stock_prices.csv
+# Packages Used: ggplot2, lubridate, reshape
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
library('ggplot2')
# First code snippet
@@ -1,5 +1,5 @@
# File-Name: senate_mds.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
# Purpose: Code for Chapter 9. In this case study we introduce multidimensional scaling (MDS),
# a technique for visually displaying the similarity of observations in
@@ -9,7 +9,7 @@
# Data Used: *.dta files in code/data/, source: http://www.voteview.com/dwnl.htm
# Packages Used: foreign, ggplot2
-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
# All images and materials produced by this code are licensed under the Creative Commons
@@ -1,3 +1,23 @@
+# File-Name: chapter10.R
+# Date: 2012-02-10
+# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
+# Purpose:
+# Data Used: data/example_data.csv, data/installations.csv
+# Packages Used: class, reshape
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
+
# First code snippet
df <- read.csv('data/example_data.csv')
@@ -1,5 +1,5 @@
# File-Name: google_sg.R
-# Date: 2012-01-19
+# Date: 2012-02-10
# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
# Purpose: File 1 for code from Chapter 11. This file contains a set of functions for building
# igraph network object from the Twitter social graphs. As the initial set of code
@@ -9,7 +9,7 @@
# Data Used: Accessed via the Google SocialGraph API, source: http://code.google.com/apis/socialgraph/
# Packages Used: igraph, RCurl, RJSONIO
-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
# All images and materials produced by this code are licensed under the Creative Commons
@@ -1,5 +1,5 @@
# File-Name: twitter_net.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
# Purpose: File 2 for code in Chapter 11. In this short file we write code for generating the
# ego-network for a given Twitter user. Once the network object has been built we
@@ -9,7 +9,7 @@
# Data Used: n/a
# Packages Used: igraph, see 01_google_sg.R
-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
# All images and materials produced by this code are licensed under the Creative Commons
@@ -1,5 +1,5 @@
# File-Name: twitter_rec.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
# Purpose: File 3 for code in Chapter 11. In the final piece of this case study we design a
# simple social graph recommendation system based on Twitter data. Using the
@@ -10,7 +10,7 @@
# Data Used: data/*.graphml
# Packages Used: igraph
-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
# All images and materials produced by this code are licensed under the Creative Commons
@@ -1,3 +1,23 @@
+# File-Name: chapter12.R
+# Date: 2012-02-10
+# Author: Drew Conway (drew.conway@nyu.edu) and John Myles White (jmw@johnmyleswhite.com)
+# Purpose:
+# Data Used: data/df.csv, dtm.RData
+# Packages Used: ggplot2, glmnet, tm, boot
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
+
library('ggplot2')
# First code snippet
