
Neural Network with R
===

# Outline


*   **Install and load R packages**

*   **`nnet` package**

*   **`neuralnet` package**

# Install and load R packages

Installation may take a while.

R packages to be installed:

In [0]:
install.packages("reshape")
install.packages("faraway")
install.packages("nnet")
install.packages("neuralnet")

Load R packages and scripts:

In [0]:
library(reshape)
library(nnet)
library(faraway)
library(neuralnet)
library(datasets)
download.file("https://gist.githubusercontent.com/fawda123/7471137/raw/466c1474d0a505ff044412703516c34f1a4684a5/nnet_plot_update.r","nnet_plot_update.r")
source("nnet_plot_update.r")

# 1. `nnet` package

## 1.1 Ozone data
* We apply neural networks to the `ozone` data, which was analyzed earlier, using the `nnet` package of Venables and Ripley (2002).
* The `ozone` data is included in the `faraway` package; it has 330 observations on the following 10 variables.
> * **O3** Ozone conc., ppm, at Sandbug AFB.
> * **vh** a numeric vector
> * **wind** wind speed
> * **humidity** a numeric vector
> * **temp** temperature
> * **ibh** inversion base height
> * **dpg** Daggett pressure gradient
> * **ibt** a numeric vector
> * **vis** visibility
> * **doy** day of the year

In [0]:
attach(ozone)
summary(ozone)

### 1.1.1 A first network
* We start with just **three** variables for simplicity and fit a feed-forward neural network with **one** hidden layer containing **two** units and a linear output.
> * Why linear output? This is a regression problem. 
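
To see why a linear output suits regression, it can help to compare the two output activations directly. The sketch below is illustrative only: `sigmoid` and `identity_out` are hypothetical helper names, not part of `nnet`. A logistic output can only produce values strictly between 0 and 1, whereas a linear output can reach any real value, such as an ozone concentration.

```r
# Hypothetical helpers for illustration; not part of the nnet package.
sigmoid <- function(z) 1 / (1 + exp(-z))   # logistic output unit
identity_out <- function(z) z              # linear output unit (linout = TRUE)

z <- seq(-10, 10, by = 5)                  # example pre-activation values
logistic_range <- range(sigmoid(z))        # stays strictly inside (0, 1)
linear_range <- range(identity_out(z))     # spans the full range of z

logistic_range
linear_range
```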

The output of the `nnet()` call (stored in `nnmd1`) reports the number of weights and the initial and final residual sum of squares (RSS, also known as the sum of squared errors of prediction, SSE):

In [5]:
nnmd1 <- nnet(O3 ~ temp + ibh + ibt, ozone, size = 2, linout = TRUE)


# weights:  11
initial  value 65581.373332 
iter  10 value 21101.930641
final  value 21099.396948 
converged


* For comparison, we calculate the total sum of squares (TSS) of the response:

In [6]:
sum((O3-mean(O3))^2)
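
As a point of reference, the total sum of squares equals the RSS of a model that predicts the mean for every observation. The sketch below checks this on the built-in `airquality` data, used here only as a stand-in for `ozone` (an assumption, so that the snippet runs without extra packages).

```r
# Stand-in data: built-in airquality, with missing Ozone values dropped.
y <- airquality$Ozone[!is.na(airquality$Ozone)]

# Total sum of squares around the mean
tss <- sum((y - mean(y))^2)

# RSS of an intercept-only linear model, which always predicts mean(y)
rss_null <- deviance(lm(y ~ 1))

all.equal(tss, rss_null)
```

Comparing a fitted model's RSS against this baseline shows how much of the variation the model explains.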


## Quiz
* Why is the total number of weights 11?
* Is this neural network model good or not? Why?

### 1.1.2 Scaling variables
* The problem comes from the initial selection of the weights: it is hard to choose good starting weights when the variables have very different scales. The solution is to rescale the data to mean zero and unit variance.
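
The rescaling described here can also be written out by hand. This sketch uses the built-in `mtcars` data as a stand-in (an assumption, chosen so the snippet needs no extra packages) and confirms that `scale()` matches subtracting each column's mean and dividing by its standard deviation.

```r
# Stand-in data with columns on very different scales
x <- mtcars[, c("mpg", "disp", "hp")]

# Manual standardization: (value - column mean) / column sd
manual <- sapply(x, function(col) (col - mean(col)) / sd(col))

# scale() does the same centering and scaling in one call
auto <- scale(x)

max(abs(manual - auto))     # zero up to floating-point error
round(colMeans(auto), 10)   # column means are zero
apply(auto, 2, sd)          # column standard deviations are one
```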

Check the standard deviations before and after scaling, then refit from 100 random starts and keep the best model:

In [0]:
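# Standard deviations before scaling
apply(ozone, 2, sd)
# Center each column to mean zero and scale it to unit variance
ozone2 <- scale(ozone)
apply(ozone2, 2, sd)
# A single fit on the scaled data
nnmd1 <- nnet(O3 ~ temp + ibh + ibt, ozone2, size = 2, linout = TRUE)
# Refit from 100 random starting points and keep the fit with the lowest RSS
bestrss <- 10000
for (i in 1:100) {
  set.seed(i)
  nnmd1 <- nnet(O3 ~ temp + ibh + ibt, ozone2, size = 2, linout = TRUE, trace = FALSE)
  if (nnmd1$value < bestrss) {
    bestnn <- nnmd1
    bestrss <- nnmd1$value
  }
}
bestnn$value
summary(bestnn)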
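
The restart loop is repeated several times in this notebook, so it could be wrapped in a small function. The following is only a sketch: `best_of_starts` is a hypothetical helper, not part of `nnet`, and it is demonstrated on the built-in `airquality` data as a stand-in for `ozone2` (an assumption, so the snippet is self-contained).

```r
library(nnet)

# Hypothetical helper: refit from several seeds and keep the lowest-value fit.
# Extra arguments such as size or decay are passed on to nnet().
best_of_starts <- function(formula, data, n_starts = 20, ...) {
  best <- NULL
  for (i in seq_len(n_starts)) {
    set.seed(i)
    fit <- nnet(formula, data = data, linout = TRUE, trace = FALSE, ...)
    if (is.null(best) || fit$value < best$value) best <- fit
  }
  best
}

# Stand-in data: complete rows of airquality, standardized
aq <- as.data.frame(scale(na.omit(airquality[, c("Ozone", "Temp", "Wind")])))

fit <- best_of_starts(Ozone ~ Temp + Wind, aq, n_starts = 20, size = 2)
fit$value
```

Starting from `best <- NULL` avoids having to guess an initial threshold such as `10000`.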

In [0]:
plot.nnet(bestnn)

In [0]:
# Same search with two hidden units, now with weight decay (decay = 0.001)
bestrss <- 10000
for (i in 1:100) {
  set.seed(i)
  nnmd1 <- nnet(O3 ~ temp + ibh + ibt, ozone2, size = 2, linout = TRUE, decay = 0.001, trace = FALSE)
  if (nnmd1$value < bestrss) {
    bestnn <- nnmd1
    bestrss <- nnmd1$value
  }
}
bestnn$value
summary(bestnn)
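
With `decay > 0`, the reported `value` is no longer the plain RSS: the `nnet` documentation describes it as the fitting criterion plus the weight decay term. A minimal sketch of that relationship follows, assuming the penalty is `decay` times the sum of squared weights and using the built-in `airquality` data as a stand-in for `ozone2` (both are assumptions of this example).

```r
library(nnet)

# Stand-in data: complete rows of airquality, standardized
aq <- as.data.frame(scale(na.omit(airquality[, c("Ozone", "Temp", "Wind")])))
decay <- 0.001

set.seed(1)
fit <- nnet(Ozone ~ Temp + Wind, data = aq, size = 2,
            linout = TRUE, decay = decay, trace = FALSE)

rss <- sum(residuals(fit)^2)        # plain residual sum of squares
penalty <- decay * sum(fit$wts^2)   # weight decay term

c(value = fit$value, rss = rss, penalty = penalty, rss_plus_penalty = rss + penalty)
```

When comparing fits with and without decay, it is therefore safer to compare `rss` than `value`.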

In [0]:
# Four hidden units, without decay
bestrss <- 10000
for (i in 1:100) {
  set.seed(i)
  nnmd1 <- nnet(O3 ~ temp + ibh + ibt, ozone2, size = 4, linout = TRUE, trace = FALSE)
  if (nnmd1$value < bestrss) {
    bestnn <- nnmd1
    bestrss <- nnmd1$value
  }
}
bestnn$value
summary(bestnn)
# Four hidden units, with decay
bestrss <- 10000
for (i in 1:100) {
  set.seed(i)
  nnmd1 <- nnet(O3 ~ temp + ibh + ibt, ozone2, size = 4, linout = TRUE, decay = 0.001, trace = FALSE)
  if (nnmd1$value < bestrss) {
    bestnn <- nnmd1
    bestrss <- nnmd1$value
  }
}
bestnn$value
summary(bestnn)

In [0]:
install.packages("caret")


In [0]:
library(caret)


In [0]:
my.grid <- expand.grid(.decay = c(0.0001, 0.001,0.01), .size = c(1, 2, 3,4))
nn.model <- train(O3~ .,ozone2,method="nnet",tuneGrid = my.grid,trace=F)
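
The tuning grid passed to `train()` is an ordinary data frame with one row per candidate model. This short sketch (base R only) shows its shape: three `decay` values crossed with four `size` values give twelve combinations to evaluate.

```r
my.grid <- expand.grid(.decay = c(0.0001, 0.001, 0.01), .size = c(1, 2, 3, 4))

dim(my.grid)       # 12 rows (3 decay values x 4 sizes), 2 columns
head(my.grid, 4)   # the first argument, .decay, varies fastest
```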

In [0]:
summary(nn.model)

a 9-4-1 network with 45 weights
options were - decay=0.01
 b->h1 i1->h1 i2->h1 i3->h1 i4->h1 i5->h1 i6->h1 i7->h1 i8->h1 i9->h1 
 -1.39  -4.30  -2.46   4.25  -2.10   0.31  -0.45  -3.47   1.30   3.38 
 b->h2 i1->h2 i2->h2 i3->h2 i4->h2 i5->h2 i6->h2 i7->h2 i8->h2 i9->h2 
  0.91   0.24  -2.47  -1.01   0.03  -0.86   1.98   3.01  -2.71   3.08 
 b->h3 i1->h3 i2->h3 i3->h3 i4->h3 i5->h3 i6->h3 i7->h3 i8->h3 i9->h3 
  0.80   2.43  -0.56  -2.72  -0.79   1.18   2.14  -3.21  -1.23   0.59 
 b->h4 i1->h4 i2->h4 i3->h4 i4->h4 i5->h4 i6->h4 i7->h4 i8->h4 i9->h4 
 -2.86  -2.45   1.30  -3.74   0.23  -2.75  -4.22   0.38   1.22   3.07 
 b->o h1->o h2->o h3->o h4->o 
-1.42 -7.07  7.55 -7.12 -6.93 
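
The weight count in this summary can be checked by hand: each hidden unit has one weight per input plus a bias, and the output unit has one weight per hidden unit plus a bias. A small sketch, where `n_weights` is a hypothetical helper (not part of `nnet`) that assumes a single hidden layer and no skip-layer connections:

```r
# Hypothetical helper: number of weights in an input-hidden-output network
n_weights <- function(n_input, n_hidden, n_output = 1) {
  (n_input + 1) * n_hidden + (n_hidden + 1) * n_output
}

n_weights(9, 4)   # 9-4-1 network: (9 + 1) * 4 + (4 + 1) * 1 = 45
```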

# 2. `neuralnet` package

In [0]:
str(infert)
table(infert$case)
set.seed(1)
indx <- sample(1:248,size=248,replace=F)
dat1 <- infert[indx[1:200],] #train set
dat2 <- infert[indx[201:248],] #test set


'data.frame':	248 obs. of  8 variables:
 $ education     : Factor w/ 3 levels "0-5yrs","6-11yrs",..: 1 1 1 1 2 2 2 2 2 2 ...
 $ age           : num  26 42 39 34 35 36 23 32 21 28 ...
 $ parity        : num  6 1 6 4 3 4 1 2 1 2 ...
 $ induced       : num  1 1 2 2 1 2 0 0 0 0 ...
 $ case          : num  1 1 1 1 1 1 1 1 1 1 ...
 $ spontaneous   : num  2 0 0 0 1 1 0 0 1 0 ...
 $ stratum       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ pooled.stratum: num  3 1 4 2 32 36 6 22 5 19 ...



  0   1 
165  83 

In [0]:
set.seed(2)
nn <- neuralnet(case~age+parity+induced+spontaneous,data=dat1,hidden=4,err.fct="ce",linear.output=FALSE)
nn
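
The call above sets `err.fct="ce"`, the cross-entropy error, which suits a binary response such as `case`. As a rough illustration, the sketch below computes binary cross-entropy by hand; `cross_entropy` is a hypothetical helper, and the probabilities are made up for the example rather than taken from the fitted network.

```r
# Hypothetical helper: binary cross-entropy for outcomes y and probabilities p
cross_entropy <- function(y, p) {
  -sum(y * log(p) + (1 - y) * log(1 - p))
}

y <- c(1, 0, 1, 0)                    # made-up binary outcomes
confident <- c(0.9, 0.1, 0.8, 0.2)    # predictions close to the truth
uninformed <- rep(0.5, 4)             # predictions that carry no information

cross_entropy(y, confident)
cross_entropy(y, uninformed)          # equals 4 * log(2)
```

Lower values indicate predicted probabilities that agree better with the observed outcomes.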

In [0]:
names(nn)
out <- cbind(nn$covariate,nn$net.result[[1]])
dimnames(out) <- list(NULL,c("age","parity","induced","spontaneous","nn-output"))
head(out)
plot(nn)
