### 10.1 Iris Flowers Classification Dataset
This dataset is well studied and is a good problem for practicing on neural networks because all four input variables are numeric and share the same unit, centimeters.

### 10.2 Import Classes and Functions

In [1]:
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils import np_utils
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline

Using TensorFlow backend.


### 10.3 Initialize Random Number Generator
Next, we initialize the random number generator to a constant value. This is important so that the results we achieve with this model can be reproduced.

In [2]:
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
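As a quick illustration of why the seed matters, the minimal sketch below reseeds NumPy with the same value and draws the same numbers twice. It is not part of the original notebook, it covers only NumPy's global generator, and the names `first_draw` and `second_draw` are illustrative.

```python
import numpy

seed = 7

# Seed NumPy's global generator and draw three numbers.
numpy.random.seed(seed)
first_draw = numpy.random.rand(3)

# Reseeding with the same value restarts the same sequence.
numpy.random.seed(seed)
second_draw = numpy.random.rand(3)

# Both draws hold identical values because the seed was identical.
print(numpy.array_equal(first_draw, second_draw))
```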

### 10.4 Load The Dataset

In [3]:
# load dataset
dataframe = pandas.read_csv("iris.csv", header=None)
dataset = dataframe.values
print(dataframe.head())

              0            1             2            3        4
0  sepal_length  sepal_width  petal_length  petal_width  species
1           5.1          3.5           1.4          0.2   setosa
2           4.9            3           1.4          0.2   setosa
3           4.7          3.2           1.3          0.2   setosa
4           4.6          3.1           1.5          0.2   setosa


In [4]:
# row 0 holds the column names (the file was read with header=None), so skip it
X = dataset[1:,0:4].astype(float)
Y = dataset[1:,4]
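The slicing above can be tried on a small stand-in for `dataframe.values`. The sketch below is an assumption-laden miniature: it copies only the header and the first two data rows from the printed output and stores them as strings, which is how they appear when the header row is read as data.

```python
import numpy

# Hypothetical miniature of dataframe.values: the header row is read as data,
# so every cell is a Python object (here, a string).
dataset = numpy.array([
    ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"],
    ["5.1", "3.5", "1.4", "0.2", "setosa"],
    ["4.9", "3", "1.4", "0.2", "setosa"],
], dtype=object)

X = dataset[1:, 0:4].astype(float)  # skip row 0, keep the 4 measurement columns
Y = dataset[1:, 4]                  # skip row 0, keep the species column

print(X.shape, Y.tolist())
```

Converting with `astype(float)` only works once the header row has been sliced off, because a column name such as `sepal_length` cannot be parsed as a number.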

### 10.5 Encode The Output Variable
The output variable contains three different string values.

When modeling multiclass classification problems with neural networks, it is good practice to reshape the output attribute from a vector of class values into a matrix with one boolean column per class, indicating whether or not a given instance belongs to that class. This is called one hot encoding, or creating dummy variables from a categorical variable.

In [5]:
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)
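To make the two encoding steps concrete, here is a NumPy-only sketch of the same idea on four illustrative labels. It is not how `LabelEncoder` or `to_categorical` are implemented; the `labels` array and the use of `numpy.unique` and `numpy.eye` are assumptions chosen for illustration.

```python
import numpy

# Illustrative labels; the real Y holds one species name per flower.
labels = numpy.array(["setosa", "versicolor", "virginica", "setosa"])

# Step 1: map each distinct string to an integer code (classes are sorted).
classes, codes = numpy.unique(labels, return_inverse=True)

# Step 2: for each sample, pick the row of the identity matrix at its code.
one_hot = numpy.eye(len(classes))[codes]

print(classes)
print(codes)
print(one_hot)
```

Each row of `one_hot` contains a single 1 in the column of its class, so the three string values become three output columns.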

### 10.6 Define The Neural Network Model

Keras provides a KerasClassifier class that can be used as an Estimator in scikit-learn.
The network topology of this simple neural network with one hidden layer can be summarized as: 4 inputs -> [4 hidden nodes] -> 3 outputs

Note that we use a sigmoid activation function in the output layer, so each of the 3 output values lies between 0 and 1.

In [6]:
# define baseline model
def baseline_model():
    # create model: 4 inputs -> 4 hidden nodes (relu) -> 3 outputs (sigmoid)
    model = Sequential()
    model.add(Dense(4, input_dim=4, kernel_initializer='normal', activation='relu'))
    model.add(Dense(3, kernel_initializer='normal', activation='sigmoid'))
    # compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
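The 4 -> 4 -> 3 topology can be traced by hand with a small NumPy forward pass. This is a rough sketch rather than what Keras does internally: the weights are random placeholders, the biases start at zero, and the names `W1`, `b1`, `W2`, `b2` and `forward` are hypothetical.

```python
import numpy

rng = numpy.random.RandomState(7)

# Placeholder weights and zero biases, for illustration only.
W1, b1 = rng.normal(scale=0.05, size=(4, 4)), numpy.zeros(4)  # 4 inputs -> 4 hidden
W2, b2 = rng.normal(scale=0.05, size=(4, 3)), numpy.zeros(3)  # 4 hidden -> 3 outputs

def forward(x):
    """Apply a relu hidden layer followed by a sigmoid output layer."""
    hidden = numpy.maximum(0.0, x @ W1 + b1)
    return 1.0 / (1.0 + numpy.exp(-(hidden @ W2 + b2)))

sample = numpy.array([[5.1, 3.5, 1.4, 0.2]])  # first flower in the dataset
output = forward(sample)

# Trainable parameters: (4*4 + 4) in the hidden layer plus (4*3 + 3) in the output layer.
n_params = W1.size + b1.size + W2.size + b2.size

print(output.shape, n_params)
```

The output has one value per class, each between 0 and 1, and the whole network has only 35 trainable parameters.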

In [7]:
estimator = KerasClassifier(build_fn=baseline_model, epochs=200, batch_size=5, verbose=0)

### 10.7 Evaluate The Model with k-Fold Cross Validation
The scikit-learn library has excellent capability to evaluate models using a suite of techniques. The gold standard for evaluating machine learning models is k-fold cross validation. 

In [8]:
# KFold from sklearn.model_selection takes n_splits; the data is supplied later by cross_val_score
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
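To show what a shuffled 10-fold split does with 150 samples, the sketch below builds the folds by hand with NumPy. It is an illustrative assumption, not scikit-learn's implementation, and it will not reproduce the exact folds that `KFold` generates; the helper name `kfold_indices` is hypothetical.

```python
import numpy

def kfold_indices(n_samples, n_folds, seed):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    rng = numpy.random.RandomState(seed)
    shuffled = rng.permutation(n_samples)
    folds = numpy.array_split(shuffled, n_folds)
    for i in range(n_folds):
        # Fold i is held out for testing; the other folds form the training set.
        test_idx = folds[i]
        train_idx = numpy.concatenate(folds[:i] + folds[i + 1:])
        yield train_idx, test_idx

splits = list(kfold_indices(n_samples=150, n_folds=10, seed=7))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Every sample is used for testing exactly once, and each model is trained on the remaining 135 samples and scored on the 15 held out.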

Now we can evaluate our model (estimator) on our dataset (X and dummy_y) using a 10-fold cross validation procedure (kfold).

In [9]:
results = cross_val_score(estimator, X, dummy_y, cv=kfold)
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))



Accuracy: 32.67% (13.15%)
