# Artificial Neural Networks

A neural network is typically characterized by three features: its activation function, its network topology, and its training algorithm.

### Activation Function:

Common choices include the unit step, sigmoid, linear, saturated linear, hyperbolic tangent, and Gaussian functions.
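As a rough illustration, the activation functions listed above could be written in base R as follows. The function names are hypothetical, and the threshold at zero, the clipping range of -1 to 1, and the unit-width Gaussian are assumed conventions; other definitions are also in use.

```r
# Illustrative definitions only; names and conventions are assumptions.
unit_step        <- function(x) ifelse(x >= 0, 1, 0)   # fires at or above zero
sigmoid          <- function(x) 1 / (1 + exp(-x))      # smooth, output in (0, 1)
linear           <- function(x) x                      # identity
saturated_linear <- function(x) pmin(pmax(x, -1), 1)   # linear, clipped to [-1, 1]
hyperbolic_tan   <- function(x) tanh(x)                # smooth, output in (-1, 1)
gaussian         <- function(x) exp(-x^2 / 2)          # bell shape, peak at zero

x <- c(-2, 0, 2)
sigmoid(x)
saturated_linear(x)
```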

### Network Topology:

The number of layers, whether information can travel backward, and the number of nodes within each layer of the network

#### MLP (Multilayer Perceptron)

An MLP is a multilayer feedforward network. The **backpropagation algorithm** used to train MLPs is common in data mining.
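To make "feedforward" concrete, here is a minimal sketch of one forward pass through a network with a single hidden layer, in base R. The function `forward_pass` and all weights are made up for illustration; they are not trained values, and this is not how any particular package implements the computation.

```r
# Minimal forward pass: input -> hidden layer (sigmoid) -> linear output.
# All weights and biases below are arbitrary illustrative numbers.
sigmoid <- function(x) 1 / (1 + exp(-x))

forward_pass <- function(input, w_hidden, b_hidden, w_output, b_output) {
  # Hidden layer: weighted sum of the inputs plus bias, then the activation
  hidden <- sigmoid(w_hidden %*% input + b_hidden)
  # Output layer: linear combination of hidden activations (numeric prediction)
  as.numeric(w_output %*% hidden + b_output)
}

input    <- c(0.5, 0.2)                                           # two input features
w_hidden <- matrix(c(0.1, -0.3, 0.4, 0.2, 0.0, -0.1), nrow = 3)   # 3 hidden nodes x 2 inputs
b_hidden <- c(0, 0.1, -0.1)
w_output <- matrix(c(0.3, -0.2, 0.5), nrow = 1)                   # 1 output node x 3 hidden nodes
b_output <- 0.05

forward_pass(input, w_hidden, b_hidden, w_output, b_output)
```

Information flows in one direction only, from input to output. Backpropagation would then adjust the weights based on the prediction error, which this sketch does not attempt.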

**Strengths**

**1)** Can be adapted to classification or numeric prediction problems 

**2)** Among the most accurate modeling approaches 

**3)** Makes few assumptions about the data's underlying relationships 

**Weaknesses**

**1)** Has a reputation for being computationally intensive and slow to train, particularly if the network topology is complex 

**2)** Easy to overfit or underfit training data 

**3)** Results in a complex black box model that is difficult if not impossible to interpret


## Modeling the Strength of Concrete

### Step One: Collect Data

In [1]:
concrete <- read.csv("../csv/concrete.csv")

### Step Two: Explore and Prepare the Data

In [2]:
str(concrete)

'data.frame':	1030 obs. of  9 variables:
 $ cement      : num  540 540 332 332 199 ...
 $ slag        : num  0 0 142 142 132 ...
 $ ash         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ water       : num  162 162 228 228 192 228 228 228 228 228 ...
 $ superplastic: num  2.5 2.5 0 0 0 0 0 0 0 0 ...
 $ coarseagg   : num  1040 1055 932 932 978 ...
 $ fineagg     : num  676 676 594 594 826 ...
 $ age         : int  28 28 270 365 360 90 365 28 28 28 ...
 $ strength    : num  80 61.9 40.3 41 44.3 ...


Neural networks work best when the input data are scaled to a narrow range around zero, and here we see values ranging anywhere from zero up to over a thousand. Typically, the solution to this problem is to rescale the data with a normalizing or standardization function. If the data follow a bell-shaped curve (a normal distribution as described in Chapter 2, Managing and Understanding Data), then it may make sense to use standardization via R's built-in scale() function. On the other hand, if the data follow a uniform distribution or are severely non-normal, then normalization to a 0-1 range may be more appropriate. In this case, we'll use the latter. 
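The difference between the two rescaling approaches can be seen on a small made-up vector. This is only a sketch: the values are invented and are not taken from the concrete data.

```r
# Toy vector with one very large value (made up for illustration)
x <- c(0, 10, 20, 50, 1000)

# Standardization: subtract the mean, divide by the standard deviation
x_standardized <- as.numeric(scale(x))

# Min-max normalization: rescale to the 0-1 range
x_minmax <- (x - min(x)) / (max(x) - min(x))

range(x_standardized)  # not bounded to a fixed interval
range(x_minmax)        # always 0 to 1
```

Standardized values have mean zero and unit standard deviation but no fixed bounds, whereas min-max values always fall between 0 and 1.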

In [2]:
normalize <- function(x) {
    return((x - min(x)) / (max(x) - min(x)))
} 

In [3]:
concrete_norm <- as.data.frame(lapply(concrete, normalize))

In [5]:
summary(concrete_norm)

     cement            slag              ash             water       
 Min.   :0.0000   Min.   :0.00000   Min.   :0.0000   Min.   :0.0000  
 1st Qu.:0.2063   1st Qu.:0.00000   1st Qu.:0.0000   1st Qu.:0.3442  
 Median :0.3902   Median :0.06121   Median :0.0000   Median :0.5048  
 Mean   :0.4091   Mean   :0.20561   Mean   :0.2708   Mean   :0.4774  
 3rd Qu.:0.5662   3rd Qu.:0.39775   3rd Qu.:0.5912   3rd Qu.:0.5607  
 Max.   :1.0000   Max.   :1.00000   Max.   :1.0000   Max.   :1.0000  
  superplastic      coarseagg         fineagg            age         
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000   Min.   :0.00000  
 1st Qu.:0.0000   1st Qu.:0.3808   1st Qu.:0.3436   1st Qu.:0.01648  
 Median :0.1988   Median :0.4855   Median :0.4654   Median :0.07418  
 Mean   :0.1927   Mean   :0.4998   Mean   :0.4505   Mean   :0.12270  
 3rd Qu.:0.3168   3rd Qu.:0.6640   3rd Qu.:0.5770   3rd Qu.:0.15110  
 Max.   :1.0000   Max.   :1.0000   Max.   :1.0000   Max.   :1.00000  
    strength     
 M

In [4]:
concrete_train <- concrete_norm[1:773, ]
concrete_test <- concrete_norm[774:1030, ]
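Taking the first 773 rows (about 75 percent) for training is only a fair split if the rows are already in random order. If that is not known to hold, a random split may be safer. The sketch below shows one way to do this on a made-up data frame; the names `toy`, `toy_train`, and `toy_test` and the seed are hypothetical.

```r
# Toy data frame standing in for a normalized data set (values are random)
set.seed(123)  # arbitrary seed so the shuffle is repeatable
toy <- data.frame(x = runif(20), strength = runif(20))

# Draw 75 percent of the row indices at random for training
n         <- nrow(toy)
train_idx <- sample(n, size = floor(0.75 * n))

toy_train <- toy[train_idx, ]
toy_test  <- toy[-train_idx, ]   # the remaining 25 percent

nrow(toy_train)
nrow(toy_test)
```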

### Step Three: Training the Model on Data

In [7]:
install.packages("neuralnet", repos = "https://cran.r-project.org")



In [5]:
library(neuralnet)

package 'neuralnet' was built under R version 3.2.5
Loading required package: grid
Loading required package: MASS


In [8]:
concrete_model <- neuralnet(strength ~ cement + slag + ash +
                            water + superplastic + coarseagg +
                            fineagg + age, data = concrete_train)

### Step Four: Evaluating Model Performance

In [9]:
 model_results <- compute(concrete_model, concrete_test[1:8])

In [10]:
predicted_strength <- model_results$net.result 

In [11]:
cor(predicted_strength, concrete_test$strength)

0.7211270762
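Correlation measures how closely predictions track the true values, but not how far off they are in the original units. Because the model was trained on normalized data, its predictions are on the 0-1 scale. One possible way to report an error in original units is to reverse the normalization and compute the mean absolute error. The helpers `unnormalize` and `mae` below are hypothetical, and the numbers are made up.

```r
# Reverse min-max normalization, given the original (unscaled) vector
unnormalize <- function(x_norm, original) {
  x_norm * (max(original) - min(original)) + min(original)
}

# Mean absolute error between actual and predicted values
mae <- function(actual, predicted) mean(abs(actual - predicted))

# Made-up numbers for illustration only
original_strength <- c(10, 30, 50, 80)
predicted_norm    <- c(0.1, 0.3, 0.5, 0.9)

predicted_original <- unnormalize(predicted_norm, original_strength)
mae(original_strength, predicted_original)
```

With the objects from this notebook, the analogous call would presumably be `unnormalize(predicted_strength, concrete$strength)`, compared against the matching rows of `concrete$strength`.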


### Step Five: Improving Model Performance

In [6]:
concrete_model2 <- neuralnet(strength ~ cement + slag + ash +
                             water + superplastic + coarseagg +
                             fineagg + age, data = concrete_train,
                             hidden = 5)

In [9]:
model_results2 <- compute(concrete_model2, concrete_test[1:8])

In [12]:
predicted_strength2 <- model_results2$net.result 

In [13]:
cor(predicted_strength2, concrete_test$strength)

0.6985793824

In this run, the correlation for the network with five hidden nodes (about 0.70) is slightly lower than for the single-node model (about 0.72). Because neuralnet starts from random initial weights, results vary from run to run.
