In [1]:
%%html
<style>
h1 {
  border: 1.5px solid #333;
  padding: 8px 12px;
  background-color:#a0cfc0;
  position: static;
}  
h2 {
  padding: 8px 12px;
  background-color:#f0cfc0;
  position: static;
}   
h3 {
  padding: 4px 8px;
  background-color:#f0cfc0;
  position: static;
}   
</style>

In [2]:
import numpy as np
import matplotlib.pyplot as plt

## 1. Learning rule for a single-layer network

Let $W$ be the weight matrix: each row corresponds to one neuron of the layer. Let $b$ be the column matrix of biases: each row corresponds to one neuron of the layer.

Let $x$ be the input column matrix and $e$ the column matrix of errors (target minus output, one row per neuron). The learning rule is then:

$$
W^{\text{new}} = W^{\text{old}} + e \, x^T
$$

$$
b^{\text{new}} = b^{\text{old}} + e
$$
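
As a minimal sketch of one update step, the rule can be written as a small function. The name `perceptron_update` and the sample values are hypothetical, chosen for illustration; the error is taken as target minus output, as in the training loop later in this notebook.

```python
import numpy as np

def perceptron_update(W, b, x, t, a):
    """Apply one update step; x, t and a are column matrices."""
    e = t - a              # error, one row per neuron
    W_new = W + e @ x.T    # (S, 1) @ (1, R) gives an (S, R) correction
    b_new = b + e
    return W_new, b_new

# Hypothetical values: 2 neurons, 2 inputs
W = np.array([[1, 0], [0, 1]])
b = np.array([[1], [1]])
x = np.array([[1], [1]])   # input
t = np.array([[0], [0]])   # target
a = np.array([[1], [1]])   # assumed network output for this input

W_new, b_new = perceptron_update(W, b, x, t, a)
print(W_new)
print(b_new)
```

The outer product `e @ x.T` has the same shape as `W`, which is why the input must be a column matrix and not a flat vector.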


## 2. How to choose an architecture

The problem specifications help define the network as follows:
1. The number of network inputs = the number of problem inputs
2. The number of neurons in the output layer = the number of problem outputs
3. The choice of transfer function for the output layer is partly determined by the specifications of the problem outputs


**Exercise**

A single-layer neural network is to have six inputs and two outputs.
The outputs are to be limited to and continuous over the
range 0 to 1. What can you tell about the network architecture?
Specifically:
* How many neurons are required?
* What are the dimensions of the weight matrix?
* What kind of transfer functions could be used?
* Is a bias required?

**Answer**: 
* Two neurons, one for each output, are required.
* The weight matrix has two rows corresponding to the two neurons and
six columns corresponding to the six inputs. (The product $Wx$ is a two-element
vector.)
* Of the transfer functions we have discussed, the log-sigmoid transfer function
would be most appropriate, since its output is continuous and stays between 0 and 1.
* Not enough information is given to determine if a bias is required.
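
The shapes in this answer can be checked with a short sketch. It assumes a log-sigmoid transfer function (one function whose output is continuous and stays between 0 and 1), random weights, and a zero bias; the name `logsig`, the seed, and all values are illustrative and not part of the exercise.

```python
import numpy as np

def logsig(n):
    """Log-sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-n))

rng = np.random.default_rng(0)     # arbitrary seed, illustration only
W = rng.standard_normal((2, 6))    # 2 neurons (rows) x 6 inputs (columns)
b = np.zeros((2, 1))               # zero bias here; the exercise leaves this open
x = rng.standard_normal((6, 1))    # one input as a column matrix

a = logsig(W @ x + b)              # two outputs, each strictly between 0 and 1
print(W.shape, a.shape)
```

The printed shapes are `(2, 6)` for the weight matrix and `(2, 1)` for the output.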

## Problem

We have a classification problem with four classes of input vectors. The four classes are:
* class 1: $x_1 = (1,1)$ and $x_2 = (1,2)$
* class 2: $x_3 = (2,-1)$ and $x_4 = (2,0)$
* class 3: $x_5 = (-1,2)$ and $x_6 = (-2,1)$
* class 4: $x_7 = (-1,-1)$ and $x_8 = (-2,-2)$




a) Design a neural network to solve this problem.

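Each neuron draws one linear decision boundary and outputs one bit, so 2 neurons can encode 4 classes. We therefore need 2 neurons, and we must check that the 4 classes can be divided into 2 sets of 2 by a straight line in two different ways.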

<img src = "img/NNproblem1.png"> </img>

The answer is yes. 

Next we have to choose the target output expected for each class of input. Let us choose these target values:

* class 1 : $t_1 = t_2 = (0,0)$
* class 2 : $t_3 = t_4 = (0,1)$
* class 3 : $t_5 = t_6 = (1,0)$
* class 4 : $t_7 = t_8 = (1,1)$

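We can then find suitable weights for each neuron graphically, since each weight vector must be orthogonal to its decision boundary and point toward the side where the neuron outputs 1: $w_1 = (-3,-1)$ and $w_2 = (1,-2)$.

Each bias follows from picking a point $p$ on the corresponding boundary line and solving $w_i^T p + b_i = 0$: $b_1 = 1$ and $b_2 = 0$.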
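
The design can be checked numerically. The sketch below assumes a hard-limit transfer function that outputs 1 when the net input is greater than or equal to zero and 0 otherwise (the same convention as the function `H` used in part b). The inputs and targets are stacked as columns, which is a layout chosen here for brevity.

```python
import numpy as np

# Designed parameters from part a): one row of W per neuron
W = np.array([[-3, -1],
              [ 1, -2]])
b = np.array([[1], [0]])

# Inputs x_1 ... x_8 and targets t_1 ... t_8, one column per sample
X = np.array([[1, 1,  2, 2, -1, -2, -1, -2],
              [1, 2, -1, 0,  2,  1, -1, -2]])
T = np.array([[0, 0, 0, 0, 1, 1, 1, 1],
              [0, 0, 1, 1, 0, 0, 1, 1]])

# Hard limit (assumed): 1 where the net input is >= 0, else 0
A = (W @ X + b >= 0).astype(int)
print(np.array_equal(A, T))   # True when every sample is classified correctly
```

With these weights and biases, all eight inputs are mapped to their targets.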

b) Train a perceptron network to solve this problem
using the perceptron learning rule.

_Tip: check the shapes of your arrays before taking a product. Here is an example of a matrix-vector product:_

In [12]:
B = np.array([[1,1],[4,-1]]) # 2-D array (matrix), shape (2, 2)
print(B)

[[ 1  1]
 [ 4 -1]]


In [13]:
x = np.array([1,1])     # 1-D array (vector), shape (2,)
C = np.dot(B,x)         # matrix-vector product: still a 1-D array, shape (2,)
print(C)
print(np.array([C]).T)  # the same values as a column matrix, shape (2, 1)

[2 3]
[[2]
 [3]]


In [14]:
# Training set: the eight input vectors and their targets, two samples per class
inputs  = [[1,1],[1,2],[2,-1],[2,0],[-1,2],[-2,1],[-1,-1],[-2,-2]]
targets = [[0,0],[0,0],[0,1],[0,1],[1,0],[1,0],[1,1],[1,1]]

# Store every sample as a column matrix of shape (2, 1) so that the products
# in the learning rule have consistent shapes
X = [np.array([p]).T for p in inputs]
T = [np.array([t]).T for t in targets]

# Initial weights (identity matrix) and initial biases (ones)
W = np.array([[1,0],[0,1]])
b = np.array([[1,1]]).T

In [15]:
def H(y):
    """Hard-limit transfer function: 0 where y < 0, 1 elsewhere.

    y is a column matrix; the result has the same shape.
    """
    return np.where(y < 0, 0., 1.)

In [16]:
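# Perceptron learning rule: sweep over the training set until one full pass
# produces no classification error
istarget = False
n = 0  # number of passes over the data, including the final error-free one
while not istarget:
    n = n+1
    istarget = True
    for i in range(len(X)):
        a = H(np.dot(W,X[i])+b)     # network output for sample i
        e = T[i]-a                  # error: target minus output
        istarget = istarget and (T[i] == a).all()
        W = W + np.dot(e,X[i].T)    # W_new = W_old + e x^T
        b = b + e                   # b_new = b_old + e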


In [17]:
n

3

In [18]:
W

array([[-2.,  0.],
       [ 0., -2.]])

In [19]:
b

array([[-1.],
       [ 0.]])
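
As a sanity check, the trained parameters printed above can be applied to the whole training set at once. This is a sketch that copies `W` and `b` from the outputs above and assumes the same hard-limit convention as `H` (output 1 when the net input is greater than or equal to zero); the column-stacked arrays are a layout chosen here for brevity.

```python
import numpy as np

# Trained parameters, copied from the cell outputs above
W = np.array([[-2., 0.],
              [0., -2.]])
b = np.array([[-1.],
              [0.]])

# Inputs x_1 ... x_8 and targets t_1 ... t_8, one column per sample
X = np.array([[1, 1,  2, 2, -1, -2, -1, -2],
              [1, 2, -1, 0,  2,  1, -1, -2]])
T = np.array([[0, 0, 0, 0, 1, 1, 1, 1],
              [0, 0, 1, 1, 0, 0, 1, 1]])

N = W @ X + b                 # net inputs, one column per sample
A = np.where(N < 0, 0, 1)     # hard limit
print(np.array_equal(A, T))   # True when every sample is classified correctly
print(N[1, 3])                # net input of neuron 2 for x_4 = (2, 0)
```

All eight samples are classified correctly. The net input of the second neuron for $x_4$ is exactly zero, so that point lies on the learned decision boundary and is classified correctly only because the hard limit returns 1 at zero. The perceptron rule stops at the first error-free pass and does not try to maximize the margin.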