<a href="https://colab.research.google.com/github/shivendr7/ml/blob/main/ExtractingWeightsAndManualCalculation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

By choosing the center (mean) and the standard deviation of the distribution, you control the range in which most of the random numbers will fall.
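As a rough illustration of that control, the sketch below draws normally distributed samples with NumPy and measures how many land within two standard deviations of the center. The names `center`, `std`, and `samples`, the seed, and the sample count are arbitrary choices made for this example.

```python
import numpy as np

# Illustrative values chosen for this sketch.
center = 0.0   # mean of the distribution
std = 0.5      # standard deviation

rng = np.random.default_rng(0)
samples = rng.normal(loc=center, scale=std, size=100_000)

# Fraction of samples within two standard deviations of the center.
within_two_std = np.mean(np.abs(samples - center) <= 2 * std)
print(f"Fraction within 2 std: {within_two_std:.3f}")
```

For a normal distribution roughly 95% of the samples should fall in that interval, so shrinking `std` narrows the range of values you receive.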

The Xavier weight initialization sets all of the weights to normally distributed random numbers. These weights are always centered at 0; however, their standard deviation varies depending on how many connections are present for the current layer of weights. Specifically, Equation 4.2 gives the variance of the weights, from which the standard deviation follows:

$$\mathrm{Var}(W) = \frac{2}{n_{in} + n_{out}}$$

where $n_{in}$ is the number of neurons in layer $i-1$ and $n_{out}$ is the number of neurons in layer $i$.

The above equation shows how to obtain the variance for all of the weights. The square root of the variance is the standard deviation. Most random number generators accept a standard deviation rather than a variance.
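A minimal NumPy sketch of this calculation follows. The helper name `xavier_normal` and its parameters are hypothetical, introduced only for illustration; it is not how Keras implements its initializers internally.

```python
import numpy as np

def xavier_normal(n_in, n_out, rng=None):
    """Sample an (n_in, n_out) weight matrix with variance 2 / (n_in + n_out)."""
    rng = np.random.default_rng() if rng is None else rng
    variance = 2.0 / (n_in + n_out)
    std = np.sqrt(variance)  # generators take a standard deviation, not a variance
    return rng.normal(loc=0.0, scale=std, size=(n_in, n_out))

# Example: a layer with 2 incoming neurons and 2 neurons of its own.
xavier_std = np.sqrt(2.0 / (2 + 2))
W = xavier_normal(2, 2, rng=np.random.default_rng(42))
print("standard deviation:", xavier_std)
print(W)
```

With two neurons on each side the variance is 2/4 = 0.5, so the standard deviation is about 0.707. Larger layers get a smaller standard deviation and therefore smaller initial weights.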

In [5]:
"""
 We will train a simple neural network that learns the XOR function.
 It is not hard to hand-code the neurons to provide an XOR function;
 however, for simplicity, we will allow Keras to train this network for us.
 We train for 10,000 epochs per attempt with the Adam optimizer. This is
 massive overkill, but it gets the result, and our focus here is not on tuning.
 If an attempt fails to learn XOR, the loop below rebuilds the network with
 fresh random weights and tries again. The neural network is small: two
 inputs, two hidden neurons, and a single output.
"""
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import numpy as np

In [13]:
x=np.array([
            [0,0],
            [1,0],
            [0,1],
            [1,1]
])
y=np.array([
            0,
            1,
            1,
            0
])
# Build and train the network; each cycle starts from fresh random weights
# sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)

done=False
cycle=1
np.set_printoptions(suppress=True)  # print tiny values in fixed-point rather than scientific notation

while not done:
  print("Cycle #{}".format(cycle))
  cycle+=1
  model=Sequential()
  model.add(Dense(2, input_dim=2, activation='relu'))
  model.add(Dense(1))
  model.compile(loss='mean_squared_error',optimizer='adam')
  model.fit(x, y, verbose=0, epochs=10000)

  pred=model.predict(x)

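  # Check success: [0,0] and [1,1] must predict near 0; [1,0] and [0,1] near 1.
  # pred has shape (4, 1), so index the single column to compare plain scalars.
  done = pred[0][0]<0.01 and pred[3][0]<0.01 and pred[1][0]>0.9 and pred[2][0]>0.9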
  print(pred)

Cycle #1
[[-0.00000046]
 [ 0.99999934]
 [ 0.9999992 ]
 [-0.00000055]]


In [25]:
# Extract the weights and biases of each layer
for layerNum, layer in enumerate(model.layers):
  weights, biases = layer.get_weights()
  print(weights,'\n',biases)
  # Bias feeding each neuron of the next layer
  for toNeuronNum, bias in enumerate(biases):
    print(f'{layerNum}B -> L{layerNum+1}N{toNeuronNum}:{bias}')
  print()
  # Weight from each source neuron to each neuron of the next layer
  for fromNeuronNum, wgt in enumerate(weights):
    for toNeuronNum, wgt2 in enumerate(wgt):
      print(f'L{layerNum}N{fromNeuronNum} -> L{layerNum+1}N{toNeuronNum} = {wgt2}')
  print()


[[-1.2098621   0.6771087 ]
 [ 0.6269057  -0.67710876]] 
 [-0.00021322  0.00636576]
0B -> L1N0:-0.00021321882377378643
0B -> L1N1:0.0063657574355602264

L0N0 -> L1N0 = -1.2098621129989624
L0N0 -> L1N1 = 0.6771087050437927
L0N1 -> L1N0 = 0.6269056797027588
L0N1 -> L1N1 = -0.6771087646484375

[[1.6106801]
 [1.4768674]] 
 [-0.00940184]
1B -> L2N0:-0.009401840157806873

L1N0 -> L2N0 = 1.610680103302002
L1N1 -> L2N0 = 1.476867437362671



In [28]:
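# Manual forward pass using rounded, hand-picked example weights.
# Note: these values are illustrative and do not match the weights printed
# above; substitute the values from your own training run.
input0 = 0
input1 = 1

# Weighted sum for each hidden neuron: inputs * weights + bias
hidden0Sum = (input0*1.3)+(input1*1.3)+(-1.3)
hidden1Sum = (input0*1.2)+(input1*1.2)+(0)

print(hidden0Sum) # 0
print(hidden1Sum) # 1.2

# ReLU activation on the hidden layer
hidden0 = max(0,hidden0Sum)
hidden1 = max(0,hidden1Sum)

print(hidden0) # 0
print(hidden1) # 1.2

# Weighted sum for the output neuron
outputSum = (hidden0*-1.6)+(hidden1*0.8)+(0)
print(outputSum) # 0.96

# The output layer defined above is linear (no activation), so this ReLU is
# not part of that model; it leaves a positive sum unchanged.
output = max(0,outputSum)

print(output) # 0.96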

0.0
1.2
0
1.2
0.96
0.96
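
The same manual calculation can be repeated with the weights actually extracted from the trained network. The sketch below copies the printed weights and biases from the extraction output into NumPy arrays and runs all four XOR inputs at once. The names `W1`, `b1`, `W2`, `b2`, and `forward` are chosen for this example, and the numbers are specific to that one training run, so a different run will produce different values.

```python
import numpy as np

# Weights and biases copied from the printed extraction output (one training run).
W1 = np.array([[-1.2098621, 0.6771087],
               [0.6269057, -0.67710876]])  # shape (inputs, hidden neurons)
b1 = np.array([-0.00021322, 0.00636576])
W2 = np.array([[1.6106801],
               [1.4768674]])               # shape (hidden neurons, outputs)
b2 = np.array([-0.00940184])

def forward(x):
    """Manual forward pass: ReLU hidden layer followed by a linear output layer."""
    hidden = np.maximum(0.0, x @ W1 + b1)
    return hidden @ W2 + b2

x = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
outputs = forward(x)
print(outputs)
```

The four outputs should land very close to 0, 1, 1, 0, in line with the `model.predict` results shown earlier. Small differences are expected because the printed weights are rounded.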
