<font size=4 color='blue'>

# <center> Class 7, July 7, 2021 </center>

<font size=4 color='blue'>

# <center> Topic that the Machine will learn: Progression of diabetes </center>

<font size=4 color='blue'>
    
## Information about the topic

<font size=4>

Evolution of diabetes after one year.
    
In the present work, we characterize diabetes with the following ten features: age, sex, body mass index, mean blood pressure, and six measurements of blood serum (S1, S2, S3, S4, S5, S6).

<font size=4 color='blue'>
    
## Quantification of this information

<font size=4>

Information is available on 442 patients (samples). The response of interest, Y, is a quantitative measure of disease progression one year after the start of the study. Y values vary between 25 and 346.

Information source: [diabetes data (samples)](https://www4.stat.ncsu.edu/~boos/var.select/diabetes.html)    

Original paper: [Least-Angle-Regression_2004](./Literatura/Least-Angle-Regression_2004.pdf)

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import time

In [None]:
# Samples are available in the file diabetes.csv

df = pd.read_csv('diabetes.csv', sep ='\t')

In [None]:
# Showing the first 5 samples (features and target Y)

df.head()

<font size=4>
Abbreviations have the following meanings:
    
    AGE = Age
    SEX = Sex
    BMI = Body Mass Index (BMI)
     BP = Mean Arterial Pressure (MAP)
     S1 = Total Cholesterol (TC)
     S2 = Low Density lipoproteins (LDL)
     S3 = High Density lipoproteins (HDL)
     S4 = Total Cholesterol / HDL ratio (TCH)
     S5 = Possibly log of serum triglycerides level (LTG)
     S6 = Glucose (GLU)
     Y = Quantitative Measure of Diabetes Mellitus Disease Progression (QMDMDP) one year after the baseline.

In [None]:
# The describe() method generates a table with statistical information 
# for each of the features and the target.

df.describe()

## Histograms are created for each of the features that characterize patients with diabetes:

In [None]:
plt.figure(figsize=(20,8)) 

ax1 = plt.subplot(2,4,1)
ax2 = plt.subplot(2,4,2)
ax3 = plt.subplot(2,4,3)
ax4 = plt.subplot(2,4,4)

ax1.hist(df.AGE, bins=30, color='green',edgecolor='purple', alpha=0.5)
ax1.set_xlabel('Age (years)', size=15)
ax1.set_ylabel('Frequency', size=15)

ax2.hist(df.SEX, bins=30, color='orange',edgecolor='purple', alpha=0.5)
ax2.set_xlabel('Sex', size=15)

ax3.hist(df.BMI, bins=30, color='red',edgecolor='purple', alpha=0.5)
ax3.set_xlabel('Body_mass_index', size=15)

ax4.hist(df.BP, bins=30, color='blue',edgecolor='purple', alpha=0.5)
ax4.set_xlabel('Mean_Arterial_Pressure', size=15);

In [None]:
plt.figure(figsize=(20,8)) 

ax1 = plt.subplot(2,4,1)
ax2 = plt.subplot(2,4,2)
ax3 = plt.subplot(2,4,3)
ax4 = plt.subplot(2,4,4)

ax1.hist(df.S1, bins=30, color='green',edgecolor='purple', alpha=0.5)
ax1.set_xlabel('Total Cholesterol', size=15)
ax1.set_ylabel('Frequency', size=15)

ax2.hist(df.S2, bins=30, color='orange',edgecolor='purple', alpha=0.5)
ax2.set_xlabel('Low Density lipoproteins', size=15)

ax3.hist(df.S3, bins=30, color='red',edgecolor='purple', alpha=0.5)
ax3.set_xlabel('High Density lipoproteins', size=15)

ax4.hist(df.S4, bins=30, color='blue',edgecolor='purple', alpha=0.5)
ax4.set_xlabel('Total Cholesterol / HDL', size=15);

In [None]:
plt.figure(figsize=(15,8)) 

ax1 = plt.subplot(2,3,1)
ax2 = plt.subplot(2,3,2)
ax3 = plt.subplot(2,3,3)

ax1.hist(df.S5, bins=30, color='green',edgecolor='purple', alpha=0.5)
ax1.set_xlabel('Serum Triglycerides, log (LTG)', size=15)
ax1.set_ylabel('Frequency', size=15)

ax2.hist(df.S6, bins=30, color='orange',edgecolor='purple', alpha=0.5)
ax2.set_xlabel('Glucose', size=15)

ax3.hist(df.Y, bins=30, color='purple',edgecolor='black', alpha=0.5)
ax3.set_xlabel('Y(Diabetes Mellitus Disease Progression)', size=15)

<font size=4>

To remove any possible correlation between the samples (the rows of the DataFrame), they are randomly reordered.

In [None]:
np.random.seed(1)

df = df.sample(frac=1)

<font size=4>
    
The original samples are divided into 2 sets: 90% for learning and 10% for making inferences (predictions) after learning.

In [None]:

test_ratio = 0.1

# Number of samples kept for learning (the remaining ones form the test set)
n_learn = int((1.0 - test_ratio) * len(df))

df_learn = df.iloc[:n_learn, :]
df_test  = df.iloc[n_learn:, :]

In [None]:
print(df_learn.shape)
print(df_test.shape)
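
The shuffle-then-split step can be illustrated on a toy array. This is a minimal NumPy sketch rather than the notebook's pandas code: the helper name `shuffle_and_split`, its `seed` argument, and the toy data are assumptions made only for illustration.

```python
import numpy as np

def shuffle_and_split(samples, test_ratio=0.1, seed=1):
    """Shuffle the rows, then keep the first (1 - test_ratio) share for learning.

    Hypothetical helper for illustration; `samples` is a 2-D array with one
    row per patient.
    """
    rng = np.random.default_rng(seed)
    shuffled = samples[rng.permutation(len(samples))]
    n_learn = int((1.0 - test_ratio) * len(shuffled))
    return shuffled[:n_learn], shuffled[n_learn:]

# Toy stand-in with the same number of rows and columns as the diabetes table
toy = np.arange(442 * 11, dtype=float).reshape(442, 11)
learn, test = shuffle_and_split(toy)
print(learn.shape, test.shape)
```

Every row ends up in exactly one of the two sets, so the test samples are never seen during learning.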

<font size=4>

The variables span very different orders of magnitude, so they are brought to a common scale before learning.
Both the features (X) and the target (Y) of the learning samples are standardized column by column:
    
$$x_{i,norm} = \dfrac{x_{i}-\mu_x}{\sigma_x}$$
    
$$y_{i,norm} = \dfrac{y_{i}-\mu_y}{\sigma_y}$$

where $\mu$ and $\sigma$ are the mean and the standard deviation of the corresponding column.

In [None]:
mu = df_learn.mean()
sigma = df_learn.std()
df_learn_norm = (df_learn - mu)/ sigma
df_learn_norm.head()

<font size=5 color='blue'> 
Important note: The normalization of the test samples is carried out with the values of $\mu$ and $\sigma$ obtained from the samples used for learning.

In [None]:
df_test_norm = (df_test - mu) / sigma
df_test_norm.head()
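
The point of the note above can be sketched with plain NumPy arrays. The synthetic data and the three-column layout are assumptions made for illustration; `ddof=1` is used because that is the pandas default for `std()`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for the learning and test tables (assumption: 3 columns)
learn = rng.normal(loc=[50.0, 26.0, 150.0], scale=[13.0, 4.0, 77.0], size=(397, 3))
test = rng.normal(loc=[50.0, 26.0, 150.0], scale=[13.0, 4.0, 77.0], size=(45, 3))

# Statistics come from the learning samples only
mu = learn.mean(axis=0)
sigma = learn.std(axis=0, ddof=1)   # ddof=1 matches the pandas default for std()

learn_norm = (learn - mu) / sigma
test_norm = (test - mu) / sigma     # same mu and sigma, no re-estimation

print(learn_norm.mean(axis=0), learn_norm.std(axis=0, ddof=1))
print(test_norm.mean(axis=0), test_norm.std(axis=0, ddof=1))
```

The learning columns come out with mean 0 and standard deviation 1 by construction. The test columns are only approximately standardized, which is expected: no information from the test samples enters the statistics.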

<font size=4>
    
Histograms of the variables to be used in the learning:

In [None]:
plt.figure(figsize=(20,8)) 

ax1 = plt.subplot(2,4,1)
ax2 = plt.subplot(2,4,2)
ax3 = plt.subplot(2,4,3)
ax4 = plt.subplot(2,4,4)

ax1.hist(df_learn_norm.AGE, bins=30, color='green',edgecolor='purple', alpha=0.5)
ax1.set_xlabel('x1(Age)', size=15)
ax1.set_ylabel('Frequency', size=15)

ax2.hist(df_learn_norm.SEX, bins=30, color='orange',edgecolor='purple', alpha=0.5)
ax2.set_xlabel('x2(Sex)', size=15)

ax3.hist(df_learn_norm.BMI, bins=30, color='red',edgecolor='purple', alpha=0.5)
ax3.set_xlabel('x3(Body_mass_index)', size=15)

ax4.hist(df_learn_norm.BP, bins=30, color='blue',edgecolor='purple', alpha=0.5)
ax4.set_xlabel('x4(Mean_Arterial_Pressure)', size=15);

In [None]:
plt.figure(figsize=(20,8)) 

ax1 = plt.subplot(2,4,1)
ax2 = plt.subplot(2,4,2)
ax3 = plt.subplot(2,4,3)
ax4 = plt.subplot(2,4,4)

ax1.hist(df_learn_norm.S1, bins=30, color='green',edgecolor='purple', alpha=0.5)
ax1.set_xlabel('x5(Total Cholesterol)', size=15)
ax1.set_ylabel('Frequency', size=15)

ax2.hist(df_learn_norm.S2, bins=30, color='orange',edgecolor='purple', alpha=0.5)
ax2.set_xlabel('x6(Low Density lipoproteins)', size=15)

ax3.hist(df_learn_norm.S3, bins=30, color='red',edgecolor='purple', alpha=0.5)
ax3.set_xlabel('x7(High Density lipoproteins)', size=15)

ax4.hist(df_learn_norm.S4, bins=30, color='blue',edgecolor='purple', alpha=0.5)
ax4.set_xlabel('x8(Total Cholesterol / HDL)', size=15);

In [None]:
plt.figure(figsize=(20,8)) 

ax1 = plt.subplot(2,3,1)
ax2 = plt.subplot(2,3,2)
ax3 = plt.subplot(2,3,3)

ax1.hist(df_learn_norm.S5, bins=30, color='green',edgecolor='purple', alpha=0.5)
ax1.set_xlabel('x9(Serum Triglycerides, log)', size=15)
ax1.set_ylabel('Frequency', size=15)

ax2.hist(df_learn_norm.S6, bins=30, color='orange',edgecolor='purple', alpha=0.5)
ax2.set_xlabel('x10(Glucose)', size=15)

ax3.hist(df_learn_norm.Y, bins=30, color='purple',edgecolor='black', alpha=0.5)
ax3.set_xlabel('Y(Diabetes Mellitus Disease Progression)', size=15)


<font size=4>
X and Y values are extracted from the columns of the DataFrame.

In [None]:
learn_x = df_learn_norm.values[:,:-1]
learn_y = df_learn_norm.values[:,-1:]

In [None]:
test_x = df_test_norm.values[:,:-1]
test_y = df_test_norm.values[:,-1:]

In [None]:
print(learn_x.shape)
print(learn_y.shape)
print(test_x.shape)
print(test_y.shape)

<font size=5 color='blue'>

# <center> Modeling different Learning Machines </center>




<font size=4 color='blue'>

# <center> Implemented using the Keras framework as frontend </center>


<font size=4 color='mediumvioletred'>
   
[Keras](https://keras.io/)

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Flatten
from tensorflow.keras.layers import Activation
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.models import Model
from tensorflow.keras.utils import plot_model
from tensorflow.keras import initializers
from tensorflow.keras import optimizers

np.random.seed(1)

In [None]:
keras.__version__

<font size=4 color='black'>

The features that determine the phenomenon are described by the vector  $X = (x_1, x_2, x_3, \ldots, x_k, \ldots, x_K)$
    
The model assumes that the output Y varies linearly with each feature:
    $$ F(X) = \sum_{k=1}^K w_k x_k + b$$
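
As a quick numerical illustration of the formula, here is a minimal NumPy sketch. The function name `linear_model` and all the numbers are made up for the example.

```python
import numpy as np

def linear_model(X, w, b):
    """F(X) = sum_k w_k * x_k + b, evaluated for every row of X."""
    return X @ w + b

# Two samples with K = 3 features each (made-up numbers)
X = np.array([[1.0, 2.0, 3.0],
              [0.0, -1.0, 0.5]])
w = np.array([0.5, -1.0, 2.0])
b = 0.25

print(linear_model(X, w, b))   # -> [4.75 2.25]
```

Learning consists of finding the weights $w_k$ and the bias $b$ that make $F(X)$ as close as possible to the observed Y.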

<font size=5 color='blue'>
    
## <center> The first Learning Machine</center> 

<font size=5 color='blue'>

Model of the Machine: the output $Y$ depends linearly on each of the features.

In [None]:
import networkx as nx

class Network(object):
    
    def  __init__ (self,sizes):
        self.num_layers = len(sizes)
        print("It has", self.num_layers, "layers,")
        self.sizes = sizes
        print("with the following number of nodes per layer",self.sizes)
        self.biases = [np.random.randn(y, 1) for y in sizes[1:]]
        self.weights = [np.random.randn(y, x)
                        for x, y in zip(sizes[:-1], sizes[1:])]
        
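    def feedforward(self, x_of_sample):
        """Return the output of the network F(x_of_sample) """        
        for b, w in zip(self.biases, self.weights):
            # Logistic sigmoid activation, written inline because no
            # sigmoid() helper is defined in this notebook
            x_of_sample = 1.0 / (1.0 + np.exp(-(np.dot(w, x_of_sample) + b)))
        return x_of_sample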
    
    def graph(self,sizes):
        a=[]
        ps={}
        Q = nx.Graph()
        for i in range(len(sizes)):
            Qi=nx.Graph()    
            n=sizes[i]
            nodos=np.arange(n)
            Qi.add_nodes_from(nodos)
            l_i=Qi.nodes
            Q = nx.union(Q, Qi, rename = (None, 'Q%i-'%i))
            if len(l_i)==1:
                ps['Q%i-0'%i]=[i/(len(sizes)), 1/2]
            else:
                for j in range(len(l_i)+1):
                    ps['Q%i-%i'%(i,j)]=[i/(len(sizes)),(1/(len(l_i)*len(l_i)))+(j/(len(l_i)))]
            a.insert(i,Qi)
        for i in range(len(a)-1):
            for j in range(len(a[i])):
                for k in range(len(a[i+1])):
                    Q.add_edge('Q%i-%i' %(i,j),'Q%i-%i' %(i+1,k))            
        nx.draw(Q, pos = ps)
                

In [None]:
n_x = learn_x.shape[1] 
n_y = learn_y.shape[1]
    
layers = [n_x, n_y]
net = Network(layers)
net.graph(layers)

<font size=5 color='blue'>
    
Definition of the architecture. 
    
It includes the initialization of weights and biases, as well as the activation functions.

In [None]:
np.random.seed(1)

input_nodes = n_x     # The input layer has n_x nodes
output_nodes = n_y    # The output layer has n_y nodes

model = Sequential()

# The first layer must be told the size of its input (input_dim), which is
# the number of nodes in the input layer of the network.

model.add(Dense(output_nodes,  kernel_initializer='uniform', bias_initializer='zeros', \
                input_dim=input_nodes, activation='linear'))


<font size=5 color='blue'>
Architecture Summary and Chart

In [None]:
plot_model(model, to_file='model.png', show_shapes=True, rankdir='TB', 
      expand_nested=True, show_layer_names=True, dpi=96)

In [None]:
model.summary()
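
The number of trainable parameters listed in the summary can be checked by hand. The helper below is a hypothetical name introduced for this sketch; it only encodes the rule that a dense layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights plus `n_out` biases.

```python
def count_dense_params(sizes):
    """Trainable parameters of a fully connected network with the given layer sizes.

    Each pair of consecutive layers (n_in, n_out) contributes
    n_in * n_out weights plus n_out biases.
    """
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

print(count_dense_params([10, 1]))         # -> 11 (10 weights + 1 bias)
print(count_dense_params([10, 2, 1]))      # -> 25
print(count_dense_params([10, 5, 4, 1]))   # -> 84
```

The first call corresponds to a model with ten features and one output, like the one defined above.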

<font size=5  color='blue'>
    
Compiling the model. Includes the optimizer definition

In [None]:
# We define the optimizer and its hyperparameters: learning rate,
# momentum, and nesterov (whether to apply Nesterov momentum)

sgd = optimizers.SGD(learning_rate=0.01, momentum=0.0, nesterov=False)

model.compile(loss='mean_squared_error', optimizer=sgd)

<font size=5 color='blue'>
    
The Machine is learning

In [None]:
# 10 % of the learning samples will be used to validate the learning
validation_ratio = 0.1
epochs = 600

history = model.fit(learn_x, learn_y, epochs=epochs, validation_split = validation_ratio, verbose=1)

# the "history" object contains the information generated during the learning
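
To make explicit what happens during learning for a linear model with a mean-squared-error cost and plain SGD, here is a minimal NumPy sketch on synthetic data. It is not the Keras implementation. The helper names, the batch size of 32, the small uniform initial weights, and the choice of holding out the last 10% of the samples for validation are assumptions of this sketch. The costs are also computed at the end of each epoch, which may differ from the running averages a framework reports.

```python
import numpy as np

def mse(X, y, w, b):
    """Mean squared error of the linear model X @ w + b."""
    return float(np.mean((X @ w + b - y) ** 2))

def sgd_linear_fit(X, y, lr=0.01, epochs=200, batch_size=32,
                   validation_split=0.1, seed=1):
    """Fit y ~ X @ w + b by mini-batch gradient descent on the MSE."""
    rng = np.random.default_rng(seed)
    n_val = int(len(X) * validation_split)
    X_tr, y_tr = X[:len(X) - n_val], y[:len(X) - n_val]
    X_val, y_val = X[len(X) - n_val:], y[len(X) - n_val:]

    w = rng.uniform(-0.05, 0.05, size=(X.shape[1], 1))   # small uniform weights
    b = np.zeros((1,))                                   # zero bias
    history = {"loss": [], "val_loss": []}

    for _ in range(epochs):
        order = rng.permutation(len(X_tr))               # reshuffle every epoch
        for start in range(0, len(order), batch_size):
            idx = order[start:start + batch_size]
            err = X_tr[idx] @ w + b - y_tr[idx]          # shape (batch, 1)
            grad_w = 2.0 * X_tr[idx].T @ err / len(idx)  # d(MSE)/dw
            grad_b = 2.0 * err.mean(axis=0)              # d(MSE)/db
            w -= lr * grad_w
            b -= lr * grad_b
        history["loss"].append(mse(X_tr, y_tr, w, b))
        history["val_loss"].append(mse(X_val, y_val, w, b))
    return w, b, history

# Synthetic data: 397 samples, 10 standardized features, noisy linear target
rng = np.random.default_rng(0)
X = rng.normal(size=(397, 10))
true_w = rng.normal(size=(10, 1))
y = X @ true_w + 0.1 * rng.normal(size=(397, 1))

w, b, history = sgd_linear_fit(X, y)
print(history["loss"][0], history["loss"][-1])
```

The `history` dictionary plays the same role as `history.history` in the notebook: one training cost and one validation cost per epoch.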

<font size=5 color='blue'>

Plots of the cost function versus epoch    

In [None]:
plt.figure(figsize=(10, 7))

plt.plot(history.history['loss'], color='red')
plt.plot(history.history['val_loss'], color='green')
plt.title('Cost function', size=16)
plt.ylabel('Cost', size=16)
plt.xlabel('Epoch', size=16)
plt.legend(['cost_train', 'cost_validation'], loc='upper right', prop={'size': 16})
plt.show()


<font size=5 color='blue'>

Underfitting

<font size = 5 color = 'blue'>
    
Evaluation of the Smart Machine. 
    
This is done using the test samples.

In [None]:
# evaluate() returns the value of the cost function on the test samples
test_loss = model.evaluate(x=test_x, y=test_y)

print("Loss = " + str(test_loss))
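
Because the target was standardized, the reported cost is expressed in units of the normalized Y. A hedged sketch of how such a value could be converted back to the original scale follows. The helper name and all the numbers are made up, and `sigma_y` stands for the standard deviation of Y in the learning samples.

```python
import numpy as np

def rmse_in_original_units(mse_norm, sigma_y):
    """Convert an MSE measured on standardized targets to an RMSE in the units of Y."""
    return float(np.sqrt(mse_norm) * sigma_y)

# Made-up values: check the identity against a direct computation
mu_y, sigma_y = 150.0, 77.0
y_true = np.array([100.0, 200.0, 310.0, 60.0])
y_pred = np.array([120.0, 180.0, 250.0, 90.0])

mse_norm = np.mean(((y_pred - mu_y) / sigma_y - (y_true - mu_y) / sigma_y) ** 2)
print(rmse_in_original_units(mse_norm, sigma_y))
print(np.sqrt(np.mean((y_pred - y_true) ** 2)))   # same quantity, computed directly
```

The mean $\mu_y$ cancels in the difference, so only $\sigma_y$ is needed for the conversion.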

<font size=5 color='blue'>
    
## <center> A new Learning Machine</center> 

<font size=5 color='blue'>

Model of the Machine: The output $Y$ does not depend linearly on the features. 
This is modeled with a sigmoid-type function, for example a hyperbolic tangent.

<font size=4 color='black'>

The features that determine the phenomenon are described by the vector  $X = (x_1, x_2, x_3, \ldots, x_k, \ldots, x_K)$
    
The model assumes that an intermediate quantity $z$ varies linearly with each feature, and that the output is the hyperbolic tangent of $z$:
    $$ z = \sum_{k=1}^K w_k x_k + b$$
    $$ F(X) = \tanh(z)= \frac{{e}^{2z} - 1}{{e}^{2z} + 1}$$

In [None]:
def tanh(z):
    return (np.exp(2*z)- 1)/(np.exp(2*z)+1)

# The following arrays are generated for plotting the scaled hyperbolic tangent
x1 = np.arange(-2, 2.0, 0.1)
y1 = 1.7159*tanh(2/3*x1)

# Identity line, plotted for comparison
y2 = x1
# The function and the identity line are plotted
plt.figure(figsize=(13,8))

plt.rc('xtick', labelsize=16)
plt.rc('ytick', labelsize=16)
plt.rc('legend', fontsize=16)
plt.ylabel('Y', fontsize=16)
plt.xlabel('Z', fontsize=16)
plt.grid(True)
plt.title('Sigmoid-type = 1.7159*tanh(2/3*x)', size=20)

#Plotting function
plt.plot(x1, y1, color='green', lw=4)
plt.plot(x1, y2, color='red', lw=3)

plt.show()
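
The explicit exponential formula can be checked against NumPy's built-in `np.tanh`. This is a small sketch; the name `tanh_from_formula` is introduced only for the comparison.

```python
import numpy as np

def tanh_from_formula(z):
    """tanh(z) = (e^{2z} - 1) / (e^{2z} + 1), written out explicitly."""
    return (np.exp(2 * z) - 1) / (np.exp(2 * z) + 1)

z = np.linspace(-5.0, 5.0, 101)
print(np.max(np.abs(tanh_from_formula(z) - np.tanh(z))))   # rounding-level difference

# For very large z the exponential overflows in the explicit formula,
# while np.tanh simply saturates
print(np.tanh(1000.0))   # -> 1.0
```

For standardized inputs and small weights the two versions agree, but the built-in function is the safer choice when $z$ can become large.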

In [None]:
n_x = learn_x.shape[1] 
n_y = learn_y.shape[1]
    
layers = [n_x, n_y]
net = Network(layers)
net.graph(layers)

<font size=5 color='blue'>
    
Model architecture 


In [None]:
np.random.seed(1)

input_nodes = n_x     # The input layer has n_x nodes
output_nodes = n_y    # The output layer has n_y nodes

model = Sequential()

model.add(Dense(output_nodes,  kernel_initializer='uniform', bias_initializer='zeros', \
                input_dim=input_nodes, activation='tanh'))


<font size=5 color='blue'>
Architecture Summary and Chart

In [None]:
plot_model(model, to_file='model.png', show_shapes=True, rankdir='TB', 
      expand_nested=True, show_layer_names=True, dpi=96)

In [None]:
model.summary()

<font size=5  color='blue'>
    
Compiling the model. Includes the optimizer definition

In [None]:
sgd = optimizers.SGD(learning_rate=0.01)

model.compile(loss='mean_squared_error', optimizer=sgd)


<font size=5 color='blue'>
    
The Machine is learning

In [None]:
# 10 % of the learning data will be used to validate the training
validation_fraction = 0.1
epochs = 600

history = model.fit(learn_x, learn_y, epochs=epochs, validation_split = validation_fraction, verbose=2)

# the "history" object contains information generated during learning

<font size=5 color='blue'>

Plots of cost function versus epoch    

In [None]:
plt.figure(figsize=(10, 7))

plt.plot(history.history['loss'], color='red')
plt.plot(history.history['val_loss'], color='green')
plt.title('Cost function', size=16)
plt.ylabel('Cost', size=16)
plt.xlabel('Epoch', size=16)
plt.legend(['cost_train', 'cost_validation'],loc='upper right', prop={'size': 16})
plt.show()


<font size=5 color='blue'>

Underfitting

<font size=5 color='blue'>
    
## <center> Learning Machines constructed with Artificial Neural Networks (ANN)</center> 

<font size=5 color='black'>
<center> SEM images of a neuron and a network of neurons. Neuron model and mathematical model of a neuron </center>    


<table>
  <tr>
    <td>Neuron</td>
     <td>Network of neurons</td>
      <td>Neuron model</td>
      <td>Mathematical model of a neuron</td>
         
  </tr>
  <tr>
    <td><img src="neuron_SEM.jpg" width=290 height=480></td>
    <td><img src="human-neuron.png" width=270 height=480></td>
    <td><img src="Neuron_labelled.png" width=200 height=380></td>
    <td><img src="neuron-mat-model.png" width=370 height=380></td>
  </tr>
 </table>

<font size=5 color='black'>
<center> Approximation by Superpositions of a Sigmoidal Function </center>    

<font size=4 color='black'>
    
[Reference](./Literatura/Approx-superpositions-sigmoids_1989.pdf)
$$ $$    
$\bf Abstract$: In this paper we demonstrate that finite linear combinations of compositions
of a fixed, univariate function and a set of affine functionals can uniformly
approximate any continuous function of n real variables with support in the unit
hypercube; only mild conditions are imposed on the univariate function. Our
results settle an open question about representability in the class of single hidden
layer neural networks. In particular, we show that arbitrary decision regions can
be arbitrarily well approximated by continuous feedforward neural networks with
only a single internal, hidden layer and any continuous sigmoidal nonlinearity. The
paper discusses approximation properties of other possible types of nonlinearities
that might be implemented by artificial neural networks.
    


<font size=5 color='black'>
<center> Approximation Capabilities of Multilayer Feedforward Networks </center>    

<font size=4 color='black'>
$\bf Abstract$: We show that standard multilayer feedforward networks with as few as a single hidden layer and
arbitrary bounded and nonconstant activation function are universal approximators with respect to $L^p(\mu)$
performance criteria, for arbitrary finite input environment measures $\mu$, provided only that sufficiently many hidden
units are available. If the activation function is continuous, bounded and nonconstant, then continuous mappings
can be learned uniformly over compact input sets. We also give very general conditions ensuring that networks
with sufficiently smooth activation functions are capable of arbitrarily accurate approximation to a function and
its derivatives.
    
[Reference](./Literatura/FF-NN-universal-Approximator_1991.pdf)
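
The idea of approximating a function by a superposition of sigmoidal units can be illustrated numerically. The sketch below is only an illustration, not the construction used in either paper. It assumes a target $\sin(x)$ on $[-\pi, \pi]$, hidden units $\tanh(w x + b)$ with random, fixed $w$ and $b$, and output weights obtained by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
target = np.sin(x).ravel()

# Hidden units tanh(w * x + b) with random, fixed w and b (an assumption made
# to keep the sketch short; only the output weights are fitted)
n_hidden = 50
w = rng.normal(scale=2.0, size=(1, n_hidden))
b = rng.uniform(-np.pi, np.pi, size=n_hidden)
hidden = np.tanh(x @ w + b)                        # shape (200, n_hidden)

def rms_error(n_units):
    """Least-squares fit of the target with the first n_units hidden units plus a constant."""
    features = np.hstack([hidden[:, :n_units], np.ones((len(x), 1))])
    coeffs, *_ = np.linalg.lstsq(features, target, rcond=None)
    return float(np.sqrt(np.mean((features @ coeffs - target) ** 2)))

for n_units in (2, 10, 50):
    print(n_units, rms_error(n_units))
```

The error does not increase as units are added, because each larger set of units contains the smaller one. This is consistent with the statement that enough hidden units allow an arbitrarily good approximation.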

<font size=5 color='blue'>
    
## <center>  A new Learning Machine</center> 

<font size=5 color='blue'>
    
Model of the Machine: Fully connected feed-forward network (FF) with one hidden layer of two neurons. The activation function of the output neuron is linear.


In [None]:
n_x = learn_x.shape[1] 
n_h = 2
n_y = learn_y.shape[1]
    
layers = [n_x, n_h, n_y]
net = Network(layers)
net.graph(layers)

<font size=5 color='blue'>
    
Model architecture 


In [None]:
np.random.seed(1)

input_nodes = n_x     # The input layer has n_x nodes
hlayer1_nodes = n_h   # The first hidden layer has n_h nodes
output_nodes = n_y    # The output layer has n_y nodes

model = Sequential()

model.add(Dense(hlayer1_nodes,  kernel_initializer='uniform', bias_initializer='zeros', \
                input_dim=input_nodes, activation='tanh'))

model.add(Dense(output_nodes, kernel_initializer='uniform', bias_initializer='zeros', activation='linear'))
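
What this architecture computes for a batch of samples can be written in a few lines of NumPy. This is a sketch under stated assumptions: the function name `forward`, the weight range of ±0.05, and the random inputs are chosen for illustration and are not read from the Keras model.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """One hidden layer with tanh activation followed by a linear output layer."""
    hidden = np.tanh(x @ W1 + b1)     # shape (n_samples, n_h)
    return hidden @ W2 + b2           # shape (n_samples, n_y)

n_x, n_h, n_y = 10, 2, 1
rng = np.random.default_rng(1)

# Small uniform weights and zero biases (assumed range, for illustration only)
W1 = rng.uniform(-0.05, 0.05, size=(n_x, n_h))
b1 = np.zeros(n_h)
W2 = rng.uniform(-0.05, 0.05, size=(n_h, n_y))
b2 = np.zeros(n_y)

x = rng.normal(size=(4, n_x))         # four made-up standardized samples
print(forward(x, W1, b1, W2, b2).shape)   # -> (4, 1)
```

The linear output layer leaves the prediction unbounded, which suits a standardized target that can take values beyond the range of tanh.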


<font size=5 color='blue'>
Architecture Summary and Chart

In [None]:
plot_model(model, to_file='model.png', show_shapes=True, rankdir='TB', 
      expand_nested=True, show_layer_names=True, dpi=96)

In [None]:
model.summary()

<font size=5  color='blue'>
    
Compiling the model. Includes the optimizer definition

In [None]:
sgd = optimizers.SGD(learning_rate=0.01)

model.compile(loss='mean_squared_error', optimizer=sgd)

<font size=5 color='blue'>
    
The Machine is learning

In [None]:
validation_ratio = 0.1
epochs = 600

history = model.fit(learn_x, learn_y, epochs=epochs, validation_split = validation_ratio, verbose=0)

<font size=5 color='blue'>

Plots of cost function versus epoch    

In [None]:
plt.figure(figsize=(10, 7))

plt.plot(history.history['loss'], color='red')
plt.plot(history.history['val_loss'], color='green')
plt.title('Cost function', size=16)
plt.ylabel('Cost', size=16)
plt.xlabel('Epoch', size=16)
plt.legend(['cost_train', 'cost_validation'],loc='upper right', prop={'size': 16})
plt.show()


<font size=5 color='blue'>

A good model

<font size=5 color='blue'>
    
## <center>  A new Learning Machine</center> 

<font size=5 color='blue'>
    
Model of the Machine: Fully connected feed-forward network (FF) with one hidden layer of two neurons. The activation function of the output neuron is a sigmoid-type function (tanh).


In [None]:
n_x = learn_x.shape[1] 
n_h = 2
n_y = learn_y.shape[1]
    
layers = [n_x, n_h, n_y]
net = Network(layers)
net.graph(layers)

<font size=5 color='blue'>
    
Model architecture 


In [None]:
np.random.seed(1)

input_nodes = n_x     # The input layer has n_x nodes
hlayer1_nodes = n_h   # The first hidden layer has n_h nodes
output_nodes = n_y    # The output layer has n_y nodes

model = Sequential()

model.add(Dense(hlayer1_nodes,  kernel_initializer='uniform', bias_initializer='zeros', \
                input_dim=input_nodes, activation='tanh'))

model.add(Dense(output_nodes, kernel_initializer='uniform', bias_initializer='zeros', activation='tanh'))

<font size=5 color='blue'>
Architecture Summary and Chart

In [None]:
plot_model(model, to_file='model.png', show_shapes=True, rankdir='TB', 
      expand_nested=True, show_layer_names=True, dpi=96)

In [None]:
model.summary()

<font size=5  color='blue'>
    
Compiling the model. Includes the optimizer definition

In [None]:
sgd = optimizers.SGD(learning_rate=0.01)

model.compile(loss='mean_squared_error', optimizer=sgd)

<font size=5 color='blue'>
    
The Machine is Learning

In [None]:
validation_ratio = 0.1
epochs = 600

history = model.fit(learn_x, learn_y, epochs=epochs, validation_split = validation_ratio, verbose=0)

<font size=5 color='blue'>

Plots of cost function versus epoch    

In [None]:
plt.figure(figsize=(10, 7))

plt.plot(history.history['loss'], color='red')
plt.plot(history.history['val_loss'], color='green')
plt.title('Cost function', size=16)
plt.ylabel('Cost', size=16)
plt.xlabel('Epoch', size=16)
plt.legend(['cost_train', 'cost_validation'],loc='upper right', prop={'size': 16})
plt.show()


<font size=5 color='blue'>

Overfitting
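
One hedged way to put numbers on such a diagnosis is to look for the epoch where the validation cost is lowest and for the gap between the two curves at the end. The helper name and the curves below are made up; in the notebook, the lists `history.history['loss']` and `history.history['val_loss']` could be passed instead.

```python
def summarize_curves(loss, val_loss):
    """Return the epoch with the lowest validation cost and the final train/validation gap."""
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    final_gap = val_loss[-1] - loss[-1]
    return best_epoch, final_gap

# Made-up curves: training cost keeps falling, validation cost turns upward
loss = [1.00, 0.70, 0.55, 0.48, 0.44, 0.41, 0.39]
val_loss = [1.05, 0.78, 0.63, 0.60, 0.62, 0.66, 0.71]

best_epoch, final_gap = summarize_curves(loss, val_loss)
print(best_epoch, round(final_gap, 2))   # -> 3 0.32
```

A validation minimum well before the last epoch, together with a growing gap, is the usual signature of overfitting. Two curves that stay high and close together point to underfitting instead.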

<font size=5 color='blue'>
    
## <center> A new Learning Machine</center> 

<font size=5 color='blue'>
    
Model of the Machine: Fully connected feed-forward network (FF) with one hidden layer of three neurons. The activation function of the output neuron is linear.


In [None]:
n_x = learn_x.shape[1] 
n_h = 3
n_y = learn_y.shape[1]
    
layers = [n_x, n_h, n_y]
net = Network(layers)
net.graph(layers)

<font size=5 color='blue'>
    
Model architecture 


In [None]:
np.random.seed(1)

model = Sequential()

input_nodes = n_x     #input layer has n_x nodes
hlayer1_nodes = n_h   #first hidden layer has n_h nodes
output_nodes = n_y    #output layer has n_y nodes

model.add(Dense(hlayer1_nodes,  kernel_initializer='uniform', bias_initializer='zeros', \
                input_dim=input_nodes, activation='tanh'))

model.add(Dense(output_nodes, kernel_initializer='uniform', bias_initializer='zeros', activation='linear'))

<font size=5 color='blue'>
Architecture Summary and Chart

In [None]:
plot_model(model, to_file='model.png', show_shapes=True, rankdir='TB', 
      expand_nested=True, show_layer_names=True, dpi=96)

In [None]:
model.summary()

<font size=5  color='blue'>
    
Compiling the model. Includes the optimizer definition.

In [None]:
sgd = optimizers.SGD(learning_rate=0.01)

model.compile(loss='mean_squared_error', optimizer=sgd)

<font size=5 color='blue'>
    
The Machine is learning

In [None]:
validation_ratio = 0.1
epochs = 600

history = model.fit(learn_x, learn_y, epochs=epochs, validation_split = validation_ratio, verbose=0)

<font size=5 color='blue'>

Plots of cost function versus epoch    

In [None]:
plt.figure(figsize=(10, 7))

plt.plot(history.history['loss'], color='red')
plt.plot(history.history['val_loss'], color='green')
plt.title('Cost function', size=16)
plt.ylabel('Cost', size=16)
plt.xlabel('Epoch', size=16)
plt.legend(['cost_train', 'cost_validation'],loc='upper right', prop={'size': 16})
plt.show()


<font size=5 color='blue'>

Overfitting

<font size=5 color='blue'>
    
## <center> A new Learning Machine</center> 

<font size=5 color='blue'>
    
Model of the Machine: Fully connected feed-forward network (FF) with one hidden layer of four neurons. The activation function of the output neuron is linear.


In [None]:
n_x = learn_x.shape[1] 
n_h = 4
n_y = learn_y.shape[1]
    
layers = [n_x, n_h, n_y]
net = Network(layers)
net.graph(layers)

<font size=5 color='blue'>
    
Model architecture 


In [None]:
np.random.seed(1)

input_nodes = n_x     # The input layer has n_x nodes
hlayer1_nodes = n_h   # The first hidden layer has n_h nodes
output_nodes = n_y    # The output layer has n_y nodes


model = Sequential()

model.add(Dense(hlayer1_nodes,  kernel_initializer='uniform', bias_initializer='zeros', \
                input_dim=input_nodes, activation='tanh'))

model.add(Dense(output_nodes, kernel_initializer='uniform', bias_initializer='zeros', activation='linear'))

<font size=5 color='blue'>
Architecture Summary and Chart

In [None]:
plot_model(model, to_file='model.png', show_shapes=True, rankdir='TB', 
      expand_nested=True, show_layer_names=True, dpi=96)

In [None]:
model.summary()

<font size=5  color='blue'>
    
Compiling the model. Includes the optimizer definition.

In [None]:
sgd = optimizers.SGD(learning_rate=0.01)

model.compile(loss='mean_squared_error', optimizer=sgd)

<font size=5 color='blue'>
    
The Machine is Learning

In [None]:
validation_ratio = 0.1
epochs = 600

history = model.fit(learn_x, learn_y, epochs=epochs, validation_split = validation_ratio,verbose=0)

<font size=5 color='blue'>

Plots of cost function versus epoch    

In [None]:
plt.figure(figsize=(10, 7))

plt.plot(history.history['loss'], color='red')
plt.plot(history.history['val_loss'], color='green')
plt.title('Cost function', size=16)
plt.ylabel('Cost', size=16)
plt.xlabel('Epoch', size=16)
plt.legend(['cost_train', 'cost_validation'],loc='upper right', prop={'size': 16})
plt.show()


<font size=5 color='blue'>

Overfitting   

<font size=5 color='blue'>
    
## <center> A new Learning Machine</center> 

<font size=5 color='blue'>
    
Model of the Machine: Fully connected feed-forward network (FF) with two hidden layers: the first has 5 neurons and the second has 4 neurons. The activation function of the output neuron is linear.


In [None]:
n_x = learn_x.shape[1] 
n_h1 = 5
n_h2  = 4
n_y = learn_y.shape[1]
    
layers = [n_x, n_h1, n_h2, n_y]
net = Network(layers)
net.graph(layers)

<font size=5 color='blue'>
    
Model architecture 


In [None]:
np.random.seed(1)

input_nodes = n_x     # The input layer has n_x nodes
hlayer1_nodes = n_h1  # The first hidden layer has n_h1 nodes
hlayer2_nodes = n_h2  # The second hidden layer has n_h2 nodes
output_nodes = n_y    # The output layer has n_y nodes


model = Sequential()

model.add(Dense(hlayer1_nodes,  kernel_initializer='uniform', bias_initializer='zeros', \
                input_dim=input_nodes, activation='tanh'))

# Only the first layer needs input_dim; later layers infer their input size
model.add(Dense(hlayer2_nodes,  kernel_initializer='uniform', bias_initializer='zeros', \
                activation='tanh'))


model.add(Dense(output_nodes, kernel_initializer='uniform', bias_initializer='zeros', activation='linear'))

<font size=5 color='blue'>
Architecture Summary and Chart

In [None]:
plot_model(model, to_file='model.png', show_shapes=True, rankdir='TB', 
      expand_nested=True, show_layer_names=True, dpi=96)

In [None]:
model.summary()

<font size=5  color='blue'>
    
Compiling the model. Includes the optimizer definition.

In [None]:
sgd = optimizers.SGD(learning_rate=0.01)

model.compile(loss='mean_squared_error', optimizer=sgd)

<font size=5 color='blue'>
    
The Machine is learning

In [None]:
validation_ratio = 0.1
epochs = 600

history = model.fit(learn_x, learn_y, epochs=epochs, validation_split = validation_ratio,verbose=0)

<font size=5 color='blue'>

Plots of cost function versus epoch    

In [None]:
plt.figure(figsize=(10, 7))

plt.plot(history.history['loss'], color='red')
plt.plot(history.history['val_loss'], color='green')
plt.title('Cost function', size=16)
plt.ylabel('Cost', size=16)
plt.xlabel('Epoch', size=16)
plt.legend(['cost_train', 'cost_validation'],loc='upper right', prop={'size': 16})
plt.show()


<font size=5 color='blue'>

Overfitting    