
### Network Architecture Overview

- **Input Layer**: 
  - **Number of Neurons**: 3 (since `x_input` has 3 features: `[1.0, 2.0, 0.0]`)
  - Each neuron in the input layer represents one feature of the input data.

- **Hidden Layer**:
  - **Number of Hidden Layers**: 1 (as your architecture currently shows only one layer between the input and output layers).
  - **Number of Neurons**: 1 (since your weight matrix `w1` has a shape of (3, 1), indicating that it connects 3 input features to one neuron in the hidden layer).
  - The activation function used is the **sigmoid** function. This is applied to the output of the hidden layer neuron, introducing non-linearity to the model.

- **Output Layer**:
  - **Number of Neurons**: 1 (your weight matrix `w2` has a shape of (1, 1), indicating the output layer has only one neuron).
  - This neuron produces the network's final output. In the code, a sigmoid is applied here as well, so the prediction always lies between 0 and 1.

### Visual Representation of the Architecture

Here’s a simplified visual representation of the architecture:

```
Input Layer      Hidden Layer (1 neuron)                  Output Layer (1 neuron)

x[0] ---+
x[1] ---+--> z1 = x·w1 + b1 --> a1 = sigmoid(z1) --> z2 = a1·w2 + b2 --> y_pred = sigmoid(z2)
x[2] ---+
```
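
The data flow in the diagram can be traced by hand. The following is a minimal NumPy sketch of a single forward pass, assuming the initial weights and biases from the notebook's initialization cell (`w1`, `b1`, `w2`, `b2`). It uses NumPy rather than TensorFlow only to keep the shapes easy to inspect; it is an illustration, not the notebook's own code.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid, applied element-wise
    return 1.0 / (1.0 + np.exp(-x))

# Values copied from the notebook's initialization cell
x = np.array([[1.0, 2.0, 0.0]])        # shape (1, 3): one sample, three features
w1 = np.array([[0.5], [-0.4], [0.0]])  # shape (3, 1): three inputs -> one hidden neuron
b1 = np.array([0.1])
w2 = np.array([[0.7]])                 # shape (1, 1): one hidden neuron -> one output
b2 = np.array([0.0])

z1 = x @ w1 + b1      # hidden pre-activation: 0.5 - 0.8 + 0.0 + 0.1
a1 = sigmoid(z1)      # hidden activation
z2 = a1 @ w2 + b2     # output pre-activation
y_pred = sigmoid(z2)  # final prediction, between 0 and 1

print("shapes:", z1.shape, a1.shape, y_pred.shape)
print(f"z1 = {z1[0, 0]:.4f}, y_pred = {y_pred[0, 0]:.4f}")
```

Every intermediate value has shape `(1, 1)`, which matches the single hidden neuron and single output neuron described above.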

### Summary of Layers and Neurons

- **Input Layer**: 3 neurons (for 3 input features)
- **Hidden Layer**:
  - 1 hidden layer
  - 1 neuron (sigmoid activation)
- **Output Layer**: 1 neuron (sigmoid activation)

In [2]:
import numpy as np 
import tensorflow as tf
from tensorflow.keras.optimizers import SGD

In [12]:
import tensorflow as tf  

# Input data  
x_input = tf.constant([[1.0, 2.0, 0.0]], dtype=tf.float32)  
y_true = tf.constant([[1.0]], dtype=tf.float32)  

# Initialize the Weights  
w1 = tf.Variable([[0.5], [-0.4], [0.0]], dtype=tf.float32)  # Shape (3, 1) since x has 3 features  
b1 = tf.Variable([0.1], dtype=tf.float32)  

w2 = tf.Variable([[0.7]], dtype=tf.float32)                 # Shape (1, 1) for hidden layer to output  
b2 = tf.Variable([0.0], dtype=tf.float32)  

# Optimizer  
optimizer = tf.keras.optimizers.SGD(learning_rate=0.02)  # Plain SGD, referenced by its full path, so no separate import is needed

In [None]:
def sigmoid(x):
    # Logistic sigmoid; tf.sigmoid(x) computes the same function
    return 1 / (1 + tf.exp(-x))

def mse_loss(y_true, y_pred):
    # Mean Squared Error Loss Function
    return tf.reduce_mean((y_true - y_pred) ** 2)

def train_step():
    # tf.GradientTape records the operations in the forward pass so that
    # tape.gradient can later differentiate the loss with respect to the variables.
    with tf.GradientTape() as tape:
        # Forward pass
        z1 = tf.matmul(x_input, w1) + b1
        a1 = sigmoid(z1)

        z2 = tf.matmul(a1, w2) + b2
        y_pred = sigmoid(z2)

        # Loss calculation
        loss = mse_loss(y_true, y_pred)

    # Backward pass: gradients of the loss with respect to each variable
    grads = tape.gradient(loss, [w1, b1, w2, b2])
    # Let the optimizer apply the gradients to update the weights and biases
    optimizer.apply_gradients(zip(grads, [w1, b1, w2, b2]))

    return loss.numpy(), y_pred.numpy()
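
To make concrete what `tape.gradient` and `optimizer.apply_gradients` do in `train_step`, here is a rough NumPy sketch that derives the same gradients by hand with the chain rule and applies a plain SGD update. The helper `loss_and_grads` is a hypothetical name introduced only for this illustration, and the sketch assumes the initial weights from the initialization cell and the same learning rate of 0.02.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def loss_and_grads(x, y, w1, b1, w2, b2):
    """Return the MSE loss and its gradients for the 3-1-1 sigmoid network."""
    # Forward pass
    a1 = sigmoid(x @ w1 + b1)
    y_pred = sigmoid(a1 @ w2 + b2)
    loss = np.mean((y - y_pred) ** 2)

    # Backward pass (chain rule); the sigmoid derivative is s * (1 - s)
    delta2 = 2.0 * (y_pred - y) / y.size * y_pred * (1.0 - y_pred)  # dL/dz2
    delta1 = (delta2 @ w2.T) * a1 * (1.0 - a1)                      # dL/dz1
    grads = (x.T @ delta1, delta1.sum(axis=0),   # dL/dw1, dL/db1
             a1.T @ delta2, delta2.sum(axis=0))  # dL/dw2, dL/db2
    return loss, grads

x = np.array([[1.0, 2.0, 0.0]])
y = np.array([[1.0]])
params = [np.array([[0.5], [-0.4], [0.0]]), np.array([0.1]),
          np.array([[0.7]]), np.array([0.0])]
lr = 0.02  # same learning rate as the SGD optimizer above

losses = []
for step in range(3):
    loss, grads = loss_and_grads(x, y, *params)
    losses.append(float(loss))
    # Plain SGD update: parameter <- parameter - lr * gradient
    params = [p - lr * g for p, g in zip(params, grads)]
    print(f"Step {step + 1}: loss = {loss:.4f}")
```

Starting from the initial weights, the first loss in this sketch is about 0.1780. The log below starts at 0.1716, which would be consistent with the variables having already been updated by earlier runs of the training cell; that is an inference from the cell execution counters, not something the notebook states.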

In [17]:
for epoch in range(10):
    loss, y_pred = train_step()
    print(f"Epoch {epoch+1:02d}, Loss: {loss:.4f}, Prediction: {y_pred[0][0]:.4f}")

Epoch 01, Loss: 0.1716, Prediction: 0.5857
Epoch 02, Loss: 0.1705, Prediction: 0.5871
Epoch 03, Loss: 0.1694, Prediction: 0.5884
Epoch 04, Loss: 0.1683, Prediction: 0.5898
Epoch 05, Loss: 0.1672, Prediction: 0.5911
Epoch 06, Loss: 0.1661, Prediction: 0.5924
Epoch 07, Loss: 0.1650, Prediction: 0.5938
Epoch 08, Loss: 0.1639, Prediction: 0.5951
Epoch 09, Loss: 0.1629, Prediction: 0.5964
Epoch 10, Loss: 0.1618, Prediction: 0.5978


In [26]:
import numpy as np
import tensorflow as tf

x_input = tf.constant([[1.0]], dtype=tf.float32)
y_true = tf.constant([[0.5]], dtype=tf.float32)

w = tf.Variable([[0.2]],dtype=tf.float32)
b = tf.Variable([0.1], dtype=tf.float32)

optimizer = tf.keras.optimizers.SGD(learning_rate=0.02)

def model(x):
    return tf.matmul(x, w) + b

def train_step():
    with tf.GradientTape() as tape:
        # Forward pass
        y_pred = model(x_input)

        # Mean squared error loss
        loss = tf.reduce_mean((y_true - y_pred) ** 2)

    # Backward pass and SGD update
    grads = tape.gradient(loss, [w, b])
    optimizer.apply_gradients(zip(grads, [w, b]))
    return loss.numpy(), y_pred.numpy()
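
For this single-weight linear model the update can be written out in closed form, which makes it easy to check by hand. The sketch below is a plain-Python illustration, assuming the initial values `w = 0.2`, `b = 0.1` and the learning rate from the cell above; the variable names `grad_w`, `grad_b` and `history` are introduced here for illustration only.

```python
# Same data, initial parameters and learning rate as the TensorFlow cell above
x, y_true = 1.0, 0.5
w, b = 0.2, 0.1
lr = 0.02

history = []
for step in range(3):
    y_pred = w * x + b                  # forward pass
    loss = (y_true - y_pred) ** 2       # squared error for the single sample
    grad_w = 2.0 * (y_pred - y_true) * x  # dL/dw
    grad_b = 2.0 * (y_pred - y_true)      # dL/db
    history.append((loss, y_pred))
    w -= lr * grad_w                    # SGD update
    b -= lr * grad_b
    print(f"Step {step + 1}: loss = {loss:.6f}, prediction = {y_pred:.4f}")
```

With `x = 1.0`, each step multiplies the prediction error by `1 - 2 * lr * (x**2 + 1) = 0.92`, which matches the slow approach to 0.5 in the log below. The loss column there reads 0.0000 because squared errors of that size round to zero at four decimals.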

In [31]:
for epoch in range(10):
    loss, y_pred = train_step()
    print(f"Epoch : {epoch +1}, Loss : {loss:.4f}, Prediction: {y_pred[0][0]:.4f}")

Epoch : 1, Loss : 0.0000, Prediction: 0.4976
Epoch : 2, Loss : 0.0000, Prediction: 0.4978
Epoch : 3, Loss : 0.0000, Prediction: 0.4980
Epoch : 4, Loss : 0.0000, Prediction: 0.4981
Epoch : 5, Loss : 0.0000, Prediction: 0.4983
Epoch : 6, Loss : 0.0000, Prediction: 0.4984
Epoch : 7, Loss : 0.0000, Prediction: 0.4985
Epoch : 8, Loss : 0.0000, Prediction: 0.4987
Epoch : 9, Loss : 0.0000, Prediction: 0.4988
Epoch : 10, Loss : 0.0000, Prediction: 0.4989
