# Tutorial T3: Understanding Tensor Operations & Neural Networks
## Week 3, Day 4 - Deep Neural Network Architectures (Beginner-Friendly Version)

**👋 Welcome!** This tutorial is specifically designed for students from **ECE, Mechanical, and other non-CS backgrounds**.

---

## 📋 Student Information
**Please fill in your details before starting:**

| Field | Details |
|-------|---------|
| **Student Name** | `[ENTER YOUR NAME HERE]` |
| **Registration Number** | `[ENTER YOUR REG NO HERE]` |
| **Branch & Year** | `[e.g., M.Tech ECE/Mechanical - 1st Year]` |
| **Date of Submission** | `[ENTER DATE HERE]` |
| **Lab Session** | `Week 3, Day 4 - Tutorial T3 (Beginner)` |

---

**Duration:** 1 Hour | **Format:** Step-by-Step Guided Tutorial

### 🎯 Learning Objectives (Simplified)
By the end of this tutorial, you will:
1. **Understand** what activation functions do (like switches in circuits)
2. **Visualize** how these functions transform signals
3. **Experience** basic tensor operations (like matrix math you know)
4. **Build** a simple neural network layer step-by-step
5. **Connect** these concepts to your engineering background

### 🌟 Why This Matters for Your Field:
- **ECE Students**: Neural networks process signals just like filters and amplifiers
- **Mechanical Students**: Think of neural networks as control systems with adaptive parameters
- **All Engineers**: These are mathematical tools for pattern recognition and decision making

### 📚 Prerequisites:
- Basic Python (variables, functions, arrays)
- Matrix operations (you learned this in linear algebra)
- Mathematical functions (exponentials, trigonometry)

### 💡 Learning Strategy:
1. **Understand the concept first** - We'll explain WHY before HOW
2. **See it visually** - Lots of plots and diagrams
3. **Try simple examples** - Small, manageable code pieces
4. **Build gradually** - From simple to complex

---

## 🛠️ Setup and Imports

Let's start by importing the tools we need. Don't worry about understanding every import - think of these as loading your toolbox.

In [None]:
# Import our mathematical tools
import numpy as np                # For numerical computations (like MATLAB)
import matplotlib.pyplot as plt   # For plotting (like MATLAB plots)
import tensorflow as tf           # For neural network operations

# Make our plots look nice
plt.style.use('default')
plt.rcParams['figure.figsize'] = (10, 6)

# Set random seeds so we get consistent results
np.random.seed(42)
tf.random.set_seed(42)

print("🔧 Environment Setup Complete!")
print(f"✅ NumPy version: {np.__version__} (like MATLAB for Python)")
print(f"✅ TensorFlow version: {tf.__version__} (for neural networks)")
print("\n🎯 Ready to learn! Let's start with the basics...")

## Part 1: Understanding Activation Functions (20 minutes)

### 🤔 What Are Activation Functions?

Think of activation functions as **smart switches** or **signal processors**:

- **For ECE Students**: Like op-amps, filters, or signal conditioners that transform input signals
- **For Mechanical Students**: Like control valves that regulate flow based on input pressure
- **For Everyone**: Mathematical functions that decide "how much" a neuron should activate

### 🎯 Why Do We Need Them?
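Without activation functions, a neural network would collapse into a single linear equation no matter how many layers it has, because a chain of matrix multiplications is itself just one matrix multiplication (boring!). Activation functions add **non-linearity**, making networks capable of learning complex patterns.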
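
The collapse of stacked linear layers can be checked numerically. The snippet below is a minimal sketch using NumPy only; the names `W1`, `W2`, and `x` and the layer sizes are illustrative choices, not part of the tutorial's exercises.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the check is repeatable
x = rng.normal(size=3)              # one input with 3 features
W1 = rng.normal(size=(3, 4))        # first "layer": 3 -> 4
W2 = rng.normal(size=(4, 2))        # second "layer": 4 -> 2

# Two linear layers applied one after another...
two_layers = (x @ W1) @ W2
# ...give the same result as ONE layer whose weights are W1 @ W2.
one_layer = x @ (W1 @ W2)
print("Linear layers collapse:", np.allclose(two_layers, one_layer))

# Putting ReLU between the layers gives a different function in general:
# the single matrix W1 @ W2 reproduces it whenever no hidden value is negative,
# and in general not otherwise.
hidden = x @ W1
with_relu = np.maximum(0, hidden) @ W2
print("Any hidden value negative:", bool((hidden < 0).any()))
print("ReLU network equals single matrix:", np.allclose(with_relu, one_layer))
```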

Let's explore the most common ones:

### Task 1A: The Sigmoid Function 📈

**Concept**: Sigmoid is like a **soft switch** that gradually turns on/off.

**Mathematical Formula**: σ(x) = 1/(1+e^(-x))

**Real-world Analogy**: 
- **ECE**: Like a soft-limiting amplifier whose output saturates smoothly at both ends
- **Mechanical**: Like a pressure relief valve that gradually opens

**Properties**:
- Input: Any real number (-∞ to +∞)
- Output: Always between 0 and 1 (like a probability)
- Smooth S-shaped curve
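
These properties are easy to verify in a few lines of Python. The sketch below uses only the standard `math` module; `sigmoid_scalar` is a hypothetical helper name used for illustration, separate from the exercise function in the next cell.

```python
import math

def sigmoid_scalar(x):
    """Sigmoid for a single number: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# Midpoint: e^0 = 1, so sigma(0) = 1 / (1 + 1) = 0.5
print(sigmoid_scalar(0.0))

# Output stays strictly between 0 and 1 for these inputs
for x in (-10.0, -1.0, 1.0, 10.0):
    y = sigmoid_scalar(x)
    print(f"sigma({x:6.1f}) = {y:.6f}")

# Symmetry: sigma(-x) = 1 - sigma(x), so the curve is point-symmetric about (0, 0.5)
x = 2.0
print(math.isclose(sigmoid_scalar(-x), 1.0 - sigmoid_scalar(x)))
```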

In [None]:
# Let's implement the sigmoid function step by step

def sigmoid_function(x):
    """
    Sigmoid activation function
    
    Think of this as a smooth switch:
    - Large negative x → output close to 0 (switch OFF)
    - x = 0 → output = 0.5 (halfway)
    - Large positive x → output close to 1 (switch ON)
    """
    # TODO: Implement sigmoid function
    # Formula: 1 / (1 + e^(-x))
    # Hint: Use np.exp() for exponential
    
    # YOUR CODE HERE:
    return 1 / (1 + np.exp(-x))  # <-- Fill this in

# Let's test it with some simple values
test_values = [-5, -2, -1, 0, 1, 2, 5]
print("🧪 Testing Sigmoid Function:")
print("Input  → Output")
print("-" * 15)

for x in test_values:
    result = sigmoid_function(x)
    print(f"{x:3d}    → {result:.4f}")

print("\n💡 Notice how:")
print("   • Negative inputs give outputs below 0.5, approaching 0 as x decreases")
print("   • Zero input gives exactly 0.5")
print("   • Positive inputs give outputs above 0.5, approaching 1 as x increases")

In [None]:
# Let's visualize the sigmoid function

# Create a range of x values
x = np.linspace(-6, 6, 100)  # 100 points from -6 to 6
y = sigmoid_function(x)

# Create the plot
plt.figure(figsize=(12, 5))

# Plot the sigmoid function
plt.subplot(1, 2, 1)
plt.plot(x, y, 'b-', linewidth=3, label='Sigmoid Function')
plt.axhline(y=0.5, color='r', linestyle='--', alpha=0.7, label='y=0.5 (threshold)')
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.xlabel('Input (x)', fontsize=12)
plt.ylabel('Output σ(x)', fontsize=12)
plt.title('Sigmoid Function: Smooth Switch', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)

# Show some key points
key_points_x = [-2, 0, 2]
key_points_y = [sigmoid_function(xi) for xi in key_points_x]
plt.plot(key_points_x, key_points_y, 'ro', markersize=8)

# Add annotations
for xi, yi in zip(key_points_x, key_points_y):
    plt.annotate(f'({xi}, {yi:.2f})', (xi, yi), 
                xytext=(10, 10), textcoords='offset points',
                bbox=dict(boxstyle='round,pad=0.3', facecolor='yellow', alpha=0.7))

# Create an analogy plot for ECE students
plt.subplot(1, 2, 2)
plt.plot(x, y, 'g-', linewidth=3, label='Signal Output')
plt.fill_between(x, 0, y, alpha=0.3, color='green')
plt.xlabel('Input Signal Strength', fontsize=12)
plt.ylabel('Amplifier Output', fontsize=12)
plt.title('ECE Analogy: Soft-Limiting Amplifier', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("📊 Key Insights:")
print("   • The sigmoid 'squashes' any input to between 0 and 1")
print("   • It's smooth and differentiable everywhere (good for learning)")
print("   • Acts like a soft switch - gradual transition, not abrupt")

### 🤔 Concept Check: Sigmoid

Before moving on, let's make sure you understand:

1. **What happens when x = 0?** (Answer: sigmoid(0) = 0.5)
2. **What happens with very large positive x?** (Answer: sigmoid approaches 1)
3. **What happens with very large negative x?** (Answer: sigmoid approaches 0)
4. **Why is this useful in neural networks?** (Answer: It gives a probability-like output)
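
Questions 2 and 3 also have a practical side: for very large negative `x`, the term `e^(-x)` in the formula becomes too large for floating point, and NumPy emits an overflow warning even though the final answer (0) is still correct. One common workaround, shown here as an optional sketch rather than part of the exercise, is to switch to an algebraically equivalent form for negative inputs. `stable_sigmoid` is a hypothetical name.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never computes e^(large positive number).

    For x >= 0 use 1 / (1 + e^(-x)); for x < 0 use the equivalent
    form e^x / (1 + e^x). Both exponents are then <= 0.
    Returns a NumPy array (a scalar input becomes a length-1 array).
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    pos = x >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    exp_x = np.exp(x[~pos])
    out[~pos] = exp_x / (1.0 + exp_x)
    return out

print(stable_sigmoid([-1000.0, -2.0, 0.0, 2.0, 1000.0]))
```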

### Task 1B: The ReLU Function ⚡

**Concept**: ReLU (Rectified Linear Unit) is like a **one-way valve** or **diode**.

**Mathematical Formula**: f(x) = max(0, x)

**Real-world Analogy**:
- **ECE**: Like a perfect diode that blocks negative voltages, passes positive ones
- **Mechanical**: Like a check valve that only allows flow in one direction

**Properties**:
- Input: Any real number
- Output: 0 for negative inputs, x for positive inputs
- Simple and fast to compute
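
One NumPy detail trips up many beginners here: `np.maximum` compares two inputs element by element, while `np.max` reduces an array to its single largest value. The short sketch below (with an illustrative array named `signal`) shows the difference.

```python
import numpy as np

signal = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])

# Element-wise: compare each entry with 0 and keep the larger one (this is ReLU)
rectified = np.maximum(0, signal)
print(rectified)          # negative entries replaced by 0, same length as input

# Reduction: a single number, the largest entry in the array (NOT ReLU)
largest = np.max(signal)
print(largest)
```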

In [None]:
def relu_function(x):
    """
    ReLU (Rectified Linear Unit) activation function
    
    Think of this as a one-way valve:
    - Negative x → output = 0 (valve closed)
    - Positive x → output = x (valve open, signal passes through)
    """
    # TODO: Implement ReLU function
    # Formula: max(0, x)
    # Hint: Use np.maximum(0, x) for element-wise maximum
    
    # YOUR CODE HERE:
    return np.maximum(0, x)  # <-- Fill this in

# Test the ReLU function
test_values = [-5, -2, -1, 0, 1, 2, 5]
print("🧪 Testing ReLU Function:")
print("Input  → Output")
print("-" * 15)

for x in test_values:
    result = relu_function(x)
    print(f"{x:3d}    → {result:.1f}")

print("\n💡 Notice how:")
print("   • Negative inputs become 0 (blocked)")
print("   • Positive inputs pass through unchanged")
print("   • Zero stays zero")

In [None]:
# Visualize ReLU function
x = np.linspace(-5, 5, 100)
y_relu = relu_function(x)
y_sigmoid = sigmoid_function(x)  # For comparison

plt.figure(figsize=(14, 5))

# ReLU function
plt.subplot(1, 3, 1)
plt.plot(x, y_relu, 'r-', linewidth=3, label='ReLU Function')
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)
plt.xlabel('Input (x)', fontsize=12)
plt.ylabel('Output f(x)', fontsize=12)
plt.title('ReLU: One-Way Valve', fontsize=14, fontweight='bold')
plt.grid(True, alpha=0.3)

# Mark the "kink" at x=0
plt.plot(0, 0, 'ro', markersize=10, label='Kink at origin')
plt.annotate('Kink Point\n(not smooth)', (0, 0), 
            xytext=(20, 20), textcoords='offset points',
            bbox=dict(boxstyle='round,pad=0.3', facecolor='yellow', alpha=0.7),
            arrowprops=dict(arrowstyle='->'))
plt.legend()  # Build the legend last so the kink marker's label is included

# ECE Analogy: Diode characteristic
plt.subplot(1, 3, 2)
plt.plot(x, y_relu, 'r-', linewidth=3, label='Current vs Voltage')
plt.fill_between(x[x>=0], 0, y_relu[x>=0], alpha=0.3, color='red')
plt.xlabel('Voltage (V)', fontsize=12)
plt.ylabel('Current (I)', fontsize=12)
plt.title('ECE Analogy: Ideal Diode', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)

# Compare ReLU vs Sigmoid
plt.subplot(1, 3, 3)
plt.plot(x, y_relu, 'r-', linewidth=3, label='ReLU (Hard switch)')
plt.plot(x, y_sigmoid, 'b-', linewidth=3, label='Sigmoid (Soft switch)')
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)
plt.xlabel('Input (x)', fontsize=12)
plt.ylabel('Output', fontsize=12)
plt.title('Comparison: Hard vs Soft Switch', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("📊 Key Insights:")
print("   • ReLU is simple: just cut off negative values")
print("   • It's like a hard switch (abrupt transition at x=0)")
print("   • Very fast to compute (just a comparison and selection)")
print("   • A very common default choice for hidden layers in modern neural networks")

### Task 1C: Understanding Gradients (Derivatives)

**Why Do We Care About Gradients?**

In neural networks, gradients tell us "how to adjust weights to improve performance." Think of it as:
- **ECE**: Like finding the slope of a transfer function to optimize circuit response
- **Mechanical**: Like finding the rate of change in a control system to adjust parameters

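**The Gradient Problem**: During training, gradients are multiplied together layer by layer. If an activation function's gradient is close to zero over much of its input range, that product can become very small (vanishing gradients) and learning stalls; if the factors are large, the product can blow up instead (exploding gradients).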
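
To see why small gradients are a problem, recall the chain rule: the gradient that reaches an early layer is a product with one derivative factor per layer. The sigmoid derivative σ'(x) = σ(x) * (1 - σ(x)) never exceeds 0.25, so the product shrinks quickly with depth. The sketch below first checks the derivative formula with a finite difference, then multiplies the best-case factor across several layers. It deliberately ignores the weight matrices, which also enter the real product, so treat it as an illustration rather than a full backpropagation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# 1) Check the analytic derivative against a central finite difference
x, h = 1.0, 1e-5
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
print(f"analytic = {sigmoid_prime(x):.6f}, numeric = {numeric:.6f}")

# 2) Best case for sigmoid: every layer sits at x = 0, where the slope is 0.25
for depth in (1, 5, 10):
    factor = sigmoid_prime(0.0) ** depth
    print(f"{depth:2d} layers -> gradient factor {factor:.2e}")

# ReLU's slope is exactly 1 for positive inputs, so the same product stays at 1
print(1.0 ** 10)
```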

In [None]:
def sigmoid_gradient(x):
    """
    Derivative of sigmoid function
    Formula: σ'(x) = σ(x) * (1 - σ(x))
    """
    # TODO: Implement sigmoid gradient
    # Hint: First compute sigmoid(x), then use the formula above
    
    # YOUR CODE HERE:
    s = sigmoid_function(x)
    return s * (1 - s)  # <-- Fill this in

def relu_gradient(x):
    """
    Derivative of ReLU function
    Formula: 1 if x > 0, 0 if x <= 0
    (Strictly, the derivative is undefined at x = 0; using 0 there is a common convention.)
    """
    # TODO: Implement ReLU gradient
    # Hint: Use np.where(condition, value_if_true, value_if_false)
    
    # YOUR CODE HERE:
    return np.where(x > 0, 1.0, 0.0)  # <-- Fill this in

# Test gradients
test_values = np.array([-3, -1, 0, 1, 3])

print("🧪 Testing Gradient Functions:")
print("\nSigmoid Gradients:")
print("Input  → Gradient")
print("-" * 17)
for x in test_values:
    grad = sigmoid_gradient(x)
    print(f"{x:3.0f}    → {grad:.4f}")

print("\nReLU Gradients:")
print("Input  → Gradient")
print("-" * 17)
for x in test_values:
    grad = relu_gradient(x)
    print(f"{x:3.0f}    → {grad:.1f}")

print("\n💡 Key Observations:")
print("   • Sigmoid gradient is maximum at x=0 (0.25)")
print("   • Sigmoid gradient approaches 0 for large |x| (vanishing gradient problem!)")
print("   • ReLU gradient is either 0 or 1 (no vanishing gradient for positive values)")

In [None]:
# Visualize gradients to understand the "vanishing gradient" problem
x = np.linspace(-5, 5, 100)
sigmoid_grad = sigmoid_gradient(x)
relu_grad = relu_gradient(x)

plt.figure(figsize=(15, 5))

# Sigmoid and its gradient
plt.subplot(1, 3, 1)
plt.plot(x, sigmoid_function(x), 'b-', linewidth=3, label='Sigmoid Function')
plt.plot(x, sigmoid_grad, 'b--', linewidth=3, label='Sigmoid Gradient')
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)
plt.xlabel('Input (x)', fontsize=12)
plt.ylabel('Output', fontsize=12)
plt.title('Sigmoid: Function vs Gradient', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)

# Highlight the vanishing gradient problem
vanish_x = [-4, 4]
vanish_y = [sigmoid_gradient(xi) for xi in vanish_x]
plt.plot(vanish_x, vanish_y, 'ro', markersize=8)
for xi, yi in zip(vanish_x, vanish_y):
    plt.annotate(f'Gradient ≈ {yi:.3f}\n(Very small!)', (xi, yi), 
                xytext=(10, 20), textcoords='offset points',
                bbox=dict(boxstyle='round,pad=0.3', facecolor='red', alpha=0.7),
                arrowprops=dict(arrowstyle='->'))

# ReLU and its gradient
plt.subplot(1, 3, 2)
plt.plot(x, relu_function(x), 'r-', linewidth=3, label='ReLU Function')
plt.plot(x, relu_grad, 'r--', linewidth=3, label='ReLU Gradient')
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)
plt.xlabel('Input (x)', fontsize=12)
plt.ylabel('Output', fontsize=12)
plt.title('ReLU: Function vs Gradient', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)

# Annotate the constant gradient
plt.annotate('Gradient = 1\n(Constant for x > 0)', (2, 1), 
            xytext=(10, 20), textcoords='offset points',
            bbox=dict(boxstyle='round,pad=0.3', facecolor='green', alpha=0.7),
            arrowprops=dict(arrowstyle='->'))

# Compare gradient magnitudes
plt.subplot(1, 3, 3)
plt.plot(x, sigmoid_grad, 'b-', linewidth=3, label='Sigmoid Gradient')
plt.plot(x, relu_grad, 'r-', linewidth=3, label='ReLU Gradient')
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)
plt.xlabel('Input (x)', fontsize=12)
plt.ylabel('Gradient Magnitude', fontsize=12)
plt.title('Gradient Comparison', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("🔍 Why This Matters for Learning:")
print("   • Large gradients = fast learning")
print("   • Small gradients = slow learning (vanishing gradient problem)")
print("   • Zero gradients = no learning (dead neurons)")
print("   • This is why ReLU is so popular - it doesn't vanish for positive inputs!")

## Part 2: Gentle Introduction to Tensors (15 minutes)

### 🤔 What Are Tensors?

Don't let the fancy name intimidate you! Tensors are just **multi-dimensional arrays** (like matrices, but can have more dimensions).

**You already know these**:
- **Scalar** (0D tensor): Just a single number → `5`
- **Vector** (1D tensor): A list of numbers → `[1, 2, 3]`  
- **Matrix** (2D tensor): A rectangular array → `[[1, 2], [3, 4]]`
- **3D Tensor**: Like a stack of matrices → `[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]`

**Engineering Analogies**:
- **ECE**: Like multi-dimensional signals (time, frequency, spatial)
- **Mechanical**: Like stress tensors or multi-parameter system states
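
A quick way to get comfortable with the extra dimensions is to index into them. The sketch below uses a NumPy array (TensorFlow tensors accept the same bracket indexing for reading values); the variable name `stack` is illustrative.

```python
import numpy as np

stack = np.array([[[1, 2], [3, 4]],
                  [[5, 6], [7, 8]]])

print(stack.ndim)       # number of dimensions (axes)
print(stack.shape)      # size along each axis: (matrices, rows, columns)

# One index per axis: matrix 1, row 0, column 1
print(stack[1, 0, 1])

# Fewer indices select a whole sub-array: the second 2x2 matrix
print(stack[1])
```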

In [None]:
# Let's create different types of tensors
print("🔢 Creating Different Types of Tensors\n")

# Scalar (0D)
scalar = tf.constant(5.0)
print(f"📍 Scalar (0D tensor):")
print(f"   Value: {scalar.numpy()}")
print(f"   Shape: {scalar.shape} (no dimensions)")
print(f"   Think: A single measurement (temperature, voltage, etc.)\n")

# Vector (1D)
vector = tf.constant([1, 2, 3, 4], dtype=tf.float32)
print(f"📏 Vector (1D tensor):")
print(f"   Values: {vector.numpy()}")
print(f"   Shape: {vector.shape} (4 elements in a line)")
print(f"   Think: Time series data, signal samples, coordinates\n")

# Matrix (2D)
matrix = tf.constant([[1, 2, 3], 
                      [4, 5, 6]], dtype=tf.float32)
print(f"🔲 Matrix (2D tensor):")
print(f"   Values:\n{matrix.numpy()}")
print(f"   Shape: {matrix.shape} (2 rows, 3 columns)")
print(f"   Think: Image data, transformation matrix, spreadsheet\n")

# 3D Tensor
tensor_3d = tf.constant([[[1, 2], [3, 4]], 
                         [[5, 6], [7, 8]]], dtype=tf.float32)
print(f"📦 3D Tensor:")
print(f"   Values:\n{tensor_3d.numpy()}")
print(f"   Shape: {tensor_3d.shape} (2 matrices, each 2×2)")
print(f"   Think: Stack of images, batch of data, RGB color channels\n")

print("💡 Key Point: The 'shape' tells you the dimensions")
print("   Shape (4,) = vector with 4 elements")
print("   Shape (2, 3) = matrix with 2 rows, 3 columns")
print("   Shape (2, 2, 2) = 2 matrices, each 2×2")

### Task 2A: Basic Tensor Operations

Let's do some basic operations that you'll recognize from linear algebra:

In [None]:
# Matrix multiplication (you know this from linear algebra!)
print("🔄 Matrix Operations (Just Like Linear Algebra!)\n")

# Create two simple matrices
A = tf.constant([[1, 2], 
                 [3, 4]], dtype=tf.float32)
B = tf.constant([[5, 6], 
                 [7, 8]], dtype=tf.float32)

print("Matrix A:")
print(A.numpy())
print("\nMatrix B:")
print(B.numpy())

# Element-wise multiplication (like MATLAB .*)
element_wise = tf.multiply(A, B)  # or just A * B
print("\n🔸 Element-wise multiplication (A .* B in MATLAB):")
print(element_wise.numpy())
print("   Each element: A[i,j] * B[i,j]")

# Matrix multiplication (like MATLAB *)
matrix_mult = tf.matmul(A, B)  # or A @ B
print("\n🔹 Matrix multiplication (A * B in MATLAB):")
print(matrix_mult.numpy())
print("   Standard linear algebra multiplication")

# Let's verify the matrix multiplication manually for first element
manual_calc = A[0,0]*B[0,0] + A[0,1]*B[1,0]
print(f"\n🧮 Manual check for element [0,0]:")
print(f"   A[0,0]*B[0,0] + A[0,1]*B[1,0] = {A[0,0].numpy()}*{B[0,0].numpy()} + {A[0,1].numpy()}*{B[1,0].numpy()} = {manual_calc.numpy()}")
print(f"   Result from matrix multiplication: {matrix_mult[0,0].numpy()}")
print(f"   ✅ Match: {abs(manual_calc.numpy() - matrix_mult[0,0].numpy()) < 1e-6}")
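
Matrix multiplication only works when the shapes are compatible: an (m × n) matrix can multiply an (n × p) matrix, and the result is (m × p). Element-wise multiplication instead needs matching (or broadcastable) shapes. A small NumPy sketch with illustrative matrices `P` and `Q`:

```python
import numpy as np

P = np.ones((2, 3))     # 2 rows, 3 columns
Q = np.ones((3, 4))     # 3 rows, 4 columns

# Inner dimensions match (3 and 3), so the product is defined: (2x3) @ (3x4) -> (2x4)
print((P @ Q).shape)

# Reversing the order breaks the rule: (3x4) @ (2x3) has inner dimensions 4 and 2
try:
    Q @ P
except ValueError as err:
    print("Shape mismatch:", err)

# Element-wise multiplication of a 2x3 with a 3x4 fails because the shapes do not line up
try:
    P * Q
except ValueError as err:
    print("Cannot broadcast:", err)
```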

In [None]:
# Shape manipulation (reshaping, similar to MATLAB reshape)
# Note: TensorFlow fills elements row by row; MATLAB's reshape fills column by column
print("📐 Shape Manipulation (Similar to MATLAB reshape)\n")

# Start with a simple matrix
original = tf.constant([[1, 2, 3], 
                        [4, 5, 6]])
print(f"Original matrix ({original.shape}):")
print(original.numpy())

# TODO: Reshape to 3×2 (same elements in the same order, only the row breaks move)
# Hint: Use tf.reshape(tensor, [new_rows, new_columns])
reshaped_3x2 = tf.reshape(original, [3, 2])  # <-- Fill this in

print(f"\n📏 Reshaped to 3×2:")
print(reshaped_3x2.numpy())

# TODO: Flatten to 1D vector
# Hint: Use tf.reshape(tensor, [-1]) where -1 means "figure out this dimension"
flattened = tf.reshape(original, [-1])  # <-- Fill this in

print(f"\n📏 Flattened to 1D ({flattened.shape}):")
print(flattened.numpy())

# TODO: Transpose (flip rows and columns)
# Hint: Use tf.transpose(tensor)
transposed = tf.transpose(original)  # <-- Fill this in

print(f"\n📏 Transposed ({transposed.shape}):")
print(transposed.numpy())

print("\n💡 Key Insight: The total number of elements stays the same!")
print(f"   Original: {original.shape} = {original.shape[0] * original.shape[1]} elements")
print(f"   3×2 reshape: {reshaped_3x2.shape} = {reshaped_3x2.shape[0] * reshaped_3x2.shape[1]} elements")
print(f"   Flattened: {flattened.shape} = {flattened.shape[0]} elements")
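
Reshape and transpose can produce the same shape but different contents, which is a common source of bugs. Reshape keeps the elements in their original reading order and only changes where the rows break; transpose swaps rows and columns. The NumPy sketch below mirrors the 2×3 matrix used above (NumPy and TensorFlow both read elements row by row by default, whereas MATLAB's `reshape` reads column by column); the variable names are illustrative.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

reshaped = m.reshape(3, 2)   # same reading order 1..6, new row breaks
transposed = m.T             # rows become columns

print(reshaped)
print(transposed)
print("Same shape:", reshaped.shape == transposed.shape)
print("Same contents:", np.array_equal(reshaped, transposed))
```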

## Part 3: Building a Simple Neural Layer (15 minutes)

### 🧠 What Is a Neural Layer?

A neural layer is like a **transformation box** that:
1. Takes inputs (numbers)
2. Multiplies by weights (learned parameters)
3. Adds biases (shifts)
4. Applies activation function (non-linearity)

**Formula**: output = activation(input × weights + bias)

**Engineering Analogy**:
- **ECE**: Like an op-amp circuit with gain (weights) and offset (bias)
- **Mechanical**: Like a control system with gain and reference point adjustment
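
Here is the formula worked through with small, hand-picked numbers so each step can be checked on paper. The weights and bias below are made-up values for illustration, not the random ones used in the next cell.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])            # 3 inputs
W = np.array([[ 0.5, -1.0],              # shape (3, 2): row i holds the weights
              [ 0.0,  0.5],              # leaving input i, column j feeds output j
              [-0.5,  0.0]])
b = np.array([0.5, 0.1])                 # one bias per output

# Step 1: weighted sums
#   output 0: 1*0.5 + 2*0.0 + 3*(-0.5) = -1.0
#   output 1: 1*(-1.0) + 2*0.5 + 3*0.0 =  0.0
z = x @ W
# Step 2: add the bias -> [-0.5, 0.1]
z = z + b
# Step 3: ReLU clips the negative entry -> [0.0, 0.1]
y = np.maximum(0, z)
print(z, y)

# The same weights handle a whole batch at once: one input per row
batch = np.array([[1.0, 2.0, 3.0],
                  [0.0, 0.0, 0.0]])
batch_out = np.maximum(0, batch @ W + b)
print(batch_out)                         # second row is just ReLU(bias)
```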

In [None]:
# Let's build a simple neural layer step by step
print("🏗️ Building a Simple Neural Layer\n")

# Step 1: Define the layer parameters
input_size = 3    # 3 input neurons
output_size = 2   # 2 output neurons

print(f"📐 Layer Architecture: {input_size} inputs → {output_size} outputs")

# Step 2: Initialize weights and biases
# Weights: random small numbers (we'll learn better initialization later)
weights = np.random.randn(input_size, output_size) * 0.5
bias = np.zeros(output_size)  # Start with zero bias

print(f"\n🎯 Parameters:")
print(f"Weights shape: {weights.shape} (each input connects to each output)")
print(f"Weights:\n{weights}")
print(f"\nBias shape: {bias.shape}")
print(f"Bias: {bias}")

# Step 3: Create a test input
test_input = np.array([1.0, 2.0, 3.0])  # Simple test values
print(f"\n📥 Test Input: {test_input}")

# Step 4: Forward pass computation
print(f"\n🔄 Forward Pass Computation:")

# Linear transformation: input × weights + bias
linear_output = np.dot(test_input, weights) + bias
print(f"1. Linear transformation (input × weights + bias):")
print(f"   {test_input} × weights + {bias} = {linear_output}")

# Apply activation function (ReLU)
final_output = relu_function(linear_output)
print(f"2. Apply ReLU activation:")
print(f"   ReLU({linear_output}) = {final_output}")

print(f"\n✅ Final output: {final_output}")
print(f"   The layer transformed 3 inputs into 2 outputs!")

In [None]:
# Let's create a simple function to do this transformation
def simple_neural_layer(inputs, weights, bias, activation_function):
    """
    A simple neural layer function
    
    Steps:
    1. Linear transformation: inputs × weights + bias
    2. Apply activation function
    
    Args:
        inputs: Input values
        weights: Weight matrix 
        bias: Bias vector
        activation_function: Function to apply (sigmoid, ReLU, etc.)
    """
    # TODO: Implement the layer computation
    # Step 1: Linear transformation
    linear = np.dot(inputs, weights) + bias  # <-- Fill this in
    
    # Step 2: Apply activation
    output = activation_function(linear)  # <-- Fill this in
    
    return output, linear  # Return both for analysis

# Test our layer function
print("🧪 Testing Our Neural Layer Function\n")

# Test with different inputs
test_inputs = [
    [1, 0, 0],    # Only first input active
    [0, 1, 0],    # Only second input active
    [0, 0, 1],    # Only third input active
    [1, 1, 1],    # All inputs active
    [-1, 2, 0.5]  # Mixed positive/negative
]

print("Input      → Linear Output  → ReLU Output")
print("-" * 45)

for inp in test_inputs:
    inp = np.array(inp)
    relu_out, linear_out = simple_neural_layer(inp, weights, bias, relu_function)
    print(f"{str(inp):12} → {linear_out} → {relu_out}")

print("\n💡 Observations:")
print("   • Different inputs produce different outputs (good!)")
print("   • ReLU sets negative values to 0")
print("   • The weights determine how inputs influence outputs")

In [None]:
# Let's visualize how the layer transforms inputs
print("📊 Visualizing Neural Layer Behavior\n")

# Create a range of inputs for the first input dimension (keeping others at 0)
input_range = np.linspace(-3, 3, 50)
outputs_neuron1 = []
outputs_neuron2 = []

for x in input_range:
    test_input = np.array([x, 0, 0])  # Vary first input, keep others at 0
    output, _ = simple_neural_layer(test_input, weights, bias, relu_function)
    outputs_neuron1.append(output[0])
    outputs_neuron2.append(output[1])

# Plot the input-output relationship
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
plt.plot(input_range, outputs_neuron1, 'b-', linewidth=3, label='Output Neuron 1')
plt.plot(input_range, outputs_neuron2, 'r-', linewidth=3, label='Output Neuron 2')
plt.xlabel('First Input Value', fontsize=12)
plt.ylabel('Neuron Output', fontsize=12)
plt.title('Neural Layer Response', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)

# Show the effect of ReLU
plt.subplot(1, 2, 2)
# Compare with and without ReLU
linear_outputs_1 = []
linear_outputs_2 = []
for x in input_range:
    test_input = np.array([x, 0, 0])
    _, linear_out = simple_neural_layer(test_input, weights, bias, lambda z: z)  # Identity function = no activation
    linear_outputs_1.append(linear_out[0])
    linear_outputs_2.append(linear_out[1])

plt.plot(input_range, linear_outputs_1, 'b--', linewidth=2, label='Linear (no ReLU)', alpha=0.7)
plt.plot(input_range, outputs_neuron1, 'b-', linewidth=3, label='With ReLU')
plt.xlabel('First Input Value', fontsize=12)
plt.ylabel('Output Value', fontsize=12)
plt.title('Effect of ReLU Activation', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.axhline(y=0, color='k', linestyle='-', alpha=0.3)
plt.axvline(x=0, color='k', linestyle='-', alpha=0.3)

plt.tight_layout()
plt.show()

print("📈 What We See:")
print("   • Without ReLU: Linear relationship (straight line)")
print("   • With ReLU: Non-linear (bent at zero)")
print("   • Different neurons can have different responses")
print("   • This non-linearity is what makes neural networks powerful!")

## Part 4: Building a Simple Network (10 minutes)

### 🏗️ Connecting Layers

A neural network is just **multiple layers connected together**:
- Output of Layer 1 → Input of Layer 2
- Output of Layer 2 → Input of Layer 3
- And so on...

Think of it like a **signal processing pipeline** or a **multi-stage amplifier**.
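
In code, "output of one layer becomes input of the next" is just a loop. The sketch below is one possible way to write it, assuming each layer is stored as a `(weights, bias)` pair and ReLU is used throughout; `forward`, `layers`, and `sizes` are hypothetical names, and the weights are random because nothing has been trained.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def forward(x, layers):
    """Pass x through each (weights, bias) pair in order, applying ReLU after each."""
    signal = x
    for weights, bias in layers:
        signal = relu(signal @ weights + bias)
    return signal

rng = np.random.default_rng(0)
sizes = [3, 4, 2]                         # 3 inputs -> 4 hidden -> 2 outputs
layers = [(rng.normal(size=(n_in, n_out)) * 0.5, np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

# Adjacent layers must agree: columns of one weight matrix = rows of the next
for (w_a, _), (w_b, _) in zip(layers[:-1], layers[1:]):
    assert w_a.shape[1] == w_b.shape[0]

out = forward(np.array([1.0, 0.5, -0.5]), layers)
print(out.shape, out)
```

Changing `sizes` to, say, `[3, 8, 8, 2]` adds a layer without touching `forward`.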

In [None]:
# Let's build a simple 2-layer network
print("🏗️ Building a Simple 2-Layer Neural Network\n")

# Network architecture: 3 → 4 → 2
# Layer 1: 3 inputs → 4 hidden neurons
# Layer 2: 4 inputs → 2 output neurons

print("📐 Network Architecture: 3 → 4 → 2")
print("   • Input layer: 3 neurons")
print("   • Hidden layer: 4 neurons (with ReLU)")
print("   • Output layer: 2 neurons (with ReLU)")

# Initialize weights and biases for both layers
# Layer 1: 3 → 4
weights1 = np.random.randn(3, 4) * 0.5
bias1 = np.zeros(4)

# Layer 2: 4 → 2  
weights2 = np.random.randn(4, 2) * 0.5
bias2 = np.zeros(2)

print(f"\n🎯 Layer 1 weights shape: {weights1.shape}")
print(f"🎯 Layer 1 bias shape: {bias1.shape}")
print(f"🎯 Layer 2 weights shape: {weights2.shape}")
print(f"🎯 Layer 2 bias shape: {bias2.shape}")

# Test input
network_input = np.array([1.0, 0.5, -0.5])
print(f"\n📥 Network Input: {network_input}")

# Forward pass through the network
print(f"\n🔄 Forward Pass:")

# Layer 1
hidden_output, hidden_linear = simple_neural_layer(network_input, weights1, bias1, relu_function)
print(f"1. After Layer 1 (3→4): {hidden_output}")

# Layer 2
final_output, final_linear = simple_neural_layer(hidden_output, weights2, bias2, relu_function)
print(f"2. After Layer 2 (4→2): {final_output}")

print(f"\n✅ Final Network Output: {final_output}")
print(f"   The network transformed 3 inputs into 2 outputs through 4 hidden neurons!")

In [None]:
# Let's create a function for our complete network
def simple_two_layer_network(inputs):
    """
    A simple 2-layer neural network: 3 → 4 → 2
    """
    # TODO: Implement the 2-layer network
    # Layer 1: inputs → hidden
    hidden, _ = simple_neural_layer(inputs, weights1, bias1, relu_function)
    
    # Layer 2: hidden → output
    output, _ = simple_neural_layer(hidden, weights2, bias2, relu_function)
    
    return output

# Test the network with various inputs
print("🧪 Testing Our 2-Layer Network\n")

test_cases = [
    [1, 0, 0],      # First input only
    [0, 1, 0],      # Second input only
    [0, 0, 1],      # Third input only
    [1, 1, 1],      # All inputs equal
    [2, -1, 0.5],   # Mixed values
    [-1, -1, -1],   # All negative
]

print("Input          → Network Output")
print("-" * 35)

for test_input in test_cases:
    test_input = np.array(test_input)
    output = simple_two_layer_network(test_input)
    print(f"{str(test_input):15} → {output}")

print("\n💡 Key Observations:")
print("   • The network produces different outputs for different inputs")
print("   • Some outputs might be zero due to ReLU activation")
print("   • The network maps 3-element inputs to 2-element outputs (weights are still random, so nothing has been learned yet)")
print("   • This is the foundation of pattern recognition!")

## 🎯 Simple Understanding Check

Let's make sure you understand the key concepts:

In [None]:
# Quick conceptual check
print("🤔 Quick Understanding Check\n")

print("1. What does ReLU do?")
print("   Answer: Sets negative values to 0, keeps positive values unchanged")
print("   Like a one-way valve or diode\n")

print("2. What does Sigmoid do?")
print("   Answer: Squashes any input to between 0 and 1")
print("   Like a soft switch or probability converter\n")

print("3. What's a neural layer?")
print("   Answer: Input × Weights + Bias → Activation Function")
print("   Like an adjustable signal processor\n")

print("4. What's a neural network?")
print("   Answer: Multiple layers connected together")
print("   Like a pipeline of signal processors\n")

print("5. Why do we need activation functions?")
print("   Answer: To add non-linearity so the network can learn complex patterns")
print("   Without them, it would just be linear algebra (boring!)\n")

# Simple practical test
print("✅ Practical Test:")
simple_input = np.array([1, -1, 2])
simple_output = simple_two_layer_network(simple_input)
print(f"   Input {simple_input} → Output {simple_output}")
print(f"   ✅ Your network successfully processed the input!")

## 🎓 Gentle Assessment

Instead of complex unit tests, let's do a friendly understanding check:

In [None]:
def gentle_assessment():
    """
    A friendly assessment focused on understanding rather than implementation
    """
    print("🎓 Gentle Understanding Assessment\n")
    print("Let's check your understanding with simple questions:\n")
    
    score = 0
    total = 5
    
    # Test 1: Sigmoid understanding
    print("1️⃣ Sigmoid Function Test")
    try:
        sig_zero = sigmoid_function(0)
        if abs(sig_zero - 0.5) < 0.01:
            print("   ✅ Correct: sigmoid(0) ≈ 0.5")
            score += 1
        else:
            print(f"   ❌ Expected sigmoid(0) ≈ 0.5, got {sig_zero}")
    except Exception:
        print("   ❌ Sigmoid function not implemented")
    
    # Test 2: ReLU understanding
    print("\n2️⃣ ReLU Function Test")
    try:
        relu_pos = relu_function(2)
        relu_neg = relu_function(-2)
        if relu_pos == 2 and relu_neg == 0:
            print("   ✅ Correct: ReLU blocks negative, passes positive")
            score += 1
        else:
            print(f"   ❌ Expected ReLU(2)=2, ReLU(-2)=0, got {relu_pos}, {relu_neg}")
    except Exception:
        print("   ❌ ReLU function not implemented")
    
    # Test 3: Tensor shape understanding
    print("\n3️⃣ Tensor Shape Test")
    test_matrix = tf.constant([[1, 2, 3], [4, 5, 6]])
    try:
        transposed = tf.transpose(test_matrix)
        if transposed.shape == (3, 2):
            print("   ✅ Correct: Transposing a 2×3 tensor gives shape (3, 2)")
            score += 1
        else:
            print(f"   ❌ Expected shape (3, 2), got {transposed.shape}")
    except Exception:
        print("   ❌ Tensor operations not completed")
    
    # Test 4: Layer concept
    print("\n4️⃣ Neural Network Test")
    try:
        test_input = np.array([1, 0, 0])
        layer_output = simple_two_layer_network(test_input)
        if layer_output is not None and len(layer_output) == 2:
            print("   ✅ Correct: Network produces 2 outputs from 3 inputs")
            score += 1
        else:
            print("   ❌ Network doesn't produce expected output")
    except Exception:
        print("   ❌ Network not implemented")
    
    # Test 5: Conceptual understanding
    print("\n5️⃣ Conceptual Understanding")
    print("   Question: What makes neural networks powerful?")
    print("   Answer: Non-linear activation functions allow learning complex patterns")
    print("   ✅ This is a conceptual understanding - you get this point for participation!")
    score += 1
    
    # Results
    percentage = (score / total) * 100
    print("\n" + "="*50)
    print("📊 ASSESSMENT RESULTS")
    print("="*50)
    print(f"Score: {score}/{total} ({percentage:.0f}%)")
    
    if percentage >= 80:
        grade = "A"
        message = "🎉 Excellent understanding! You're ready for more advanced topics."
    elif percentage >= 60:
        grade = "B"
        message = "👍 Good work! You understand the basics well."
    else:
        grade = "C"
        message = "💪 Keep learning! Review the concepts and try the exercises again."
    
    print(f"Grade: {grade}")
    print(f"{message}")
    
    print("\n🎯 What You've Learned:")
    print("   • How activation functions work (sigmoid, ReLU)")
    print("   • Basic tensor operations (reshape, multiply)")
    print("   • How neural layers process information")
    print("   • How to connect layers into a network")
    print("   • Why non-linearity is important")
    
    return score, total

# Run the gentle assessment
assessment_score, total_possible = gentle_assessment()

## 🌟 Summary & Next Steps

### 🎯 What You Accomplished Today:

✅ **Understood Activation Functions**: You learned how sigmoid and ReLU work like switches and valves

✅ **Mastered Basic Tensors**: You worked with matrices and understood reshaping (just like MATLAB!)

✅ **Built Neural Layers**: You created signal processors that transform inputs to outputs

✅ **Constructed Networks**: You connected layers to create a simple neural network

### 🔗 Connection to Your Field:

**For ECE Students**:
- Neural networks are like adaptive signal processing systems
- Activation functions are like non-linear circuit elements
- Weights are like adjustable gains in amplifiers

**For Mechanical Students**:
- Neural networks are like adaptive control systems
- Activation functions are like control valves with different characteristics
- The network learns optimal control parameters

### 🚀 Next Steps in Your Learning Journey:

1. **Module 2**: Learn how networks adjust their weights (optimization)
2. **Module 3**: Apply neural networks to image processing
3. **Module 4**: Build specialized networks for pattern recognition
4. **Module 5**: Create systems that can detect and classify objects

### 💡 Key Insights to Remember:

- **Neural networks are just mathematical functions** that learn from examples
- **Activation functions add non-linearity**, which is what lets stacked layers model more than a straight-line relationship
- **Layers are building blocks** - combine them to solve complex problems
- **It's all about transforming data** from one representation to another

### 📚 If You Want to Learn More:

- Practice with different activation functions
- Try building networks with more layers
- Experiment with different input patterns
- Think about how this applies to your engineering projects
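
As a starting point for the first suggestion, here are two other widely used activation functions, tanh and leaky ReLU, written in the same NumPy style as the exercises. Neither is covered in this tutorial, and the leak slope `alpha=0.01` is a conventional default chosen here for illustration.

```python
import numpy as np

def tanh_function(x):
    """Like sigmoid, but the output is centered on 0 and ranges from -1 to 1."""
    return np.tanh(x)

def leaky_relu_function(x, alpha=0.01):
    """Like ReLU, but negative inputs are scaled by a small slope instead of zeroed."""
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, 0.0, 2.0])
print(tanh_function(x))
print(leaky_relu_function(x))
```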

---

## 🎊 Congratulations!

You've taken your first steps into the world of neural networks. These concepts might have seemed intimidating at first, but you've learned that they're just mathematical tools - like the equations and systems you already know from your engineering background.

**The key insight**: Neural networks are not magic - they're engineered systems that can be understood, analyzed, and applied to solve real problems in your field!

Keep this curiosity and systematic approach as you continue learning. You're well-prepared for the more advanced topics ahead! 🚀