# Problem Statement:

    Given an initial hidden state and a sequence of inputs over three time steps:
        Initial Hidden State: h0 = 0.1
        Time Step 1: x1 = 0.7
        Time Step 2: x2 = 0.5
        Time Step 3: x3 = 0.2

    The RNN parameters are:
        Input-to-hidden weight: W_xh = 0.9
        Hidden-to-hidden weight: W_hh = 0.5
        Bias: b = 0.3

    Activation function: tanh(x)

# Objective:

    1. Compute the hidden state at each time step manually.

    2. Verify the solution using a Python implementation.

# Manual Solution:

    Activation function: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

    The update rule for hidden states is:
        h_t = tanh(W_xh * x_t + W_hh * h_(t-1) + b)

    Step 1: Compute h1
        h1 = tanh((0.9 * 0.7) + (0.5 * 0.1) + 0.3) = tanh(0.98) ≈ 0.753

    Step 2: Compute h2
        h2 = tanh((0.9 * 0.5) + (0.5 * 0.753) + 0.3) = tanh(1.1265) ≈ 0.810

    Step 3: Compute h3
        h3 = tanh((0.9 * 0.2) + (0.5 * 0.810) + 0.3) = tanh(0.885) ≈ 0.709
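The hand calculation carries a rounded hidden state from one step to the next, so it can help to see each pre-activation next to its result. The following is a minimal sketch of that procedure, not part of the original notebook: the helper name `manual_trace` and the `decimals` parameter are assumptions introduced for illustration, and rounding to three decimals is assumed to mirror the steps above.

```python
import math

# Parameters from the problem statement
W_xh, W_hh, b = 0.9, 0.5, 0.3
inputs = [0.7, 0.5, 0.2]


def manual_trace(inputs, h0, decimals=3):
    """Hypothetical helper: return (pre_activation, rounded_hidden) per step.

    The hidden state is rounded before it is carried forward, which is what
    happens when the recurrence is worked out by hand.
    """
    h = h0
    trace = []
    for x in inputs:
        a = W_xh * x + W_hh * h + b          # value passed to tanh
        h = round(math.tanh(a), decimals)    # rounded, as in a hand calculation
        trace.append((a, h))
    return trace


for t, (a, h) in enumerate(manual_trace(inputs, h0=0.1), start=1):
    print(f"t={t}: pre-activation = {a:.4f}, h_{t} = {h:.3f}")
```

Because each rounded value feeds the next step, small rounding differences can propagate, which is one reason a full-precision check in code is worthwhile.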

# Python Verification:

Import required library

In [1]:
import numpy as np  # For numerical operations

Given parameters

In [2]:
x_values = [0.7, 0.5, 0.2]  # Input sequence
h_prev = 0.1  # Initial hidden state
W_xh = 0.9  # Input-to-hidden weight
W_hh = 0.5  # Hidden-to-hidden weight
b = 0.3  # Bias term

Define the tanh activation function

    The tanh function is defined as:
        tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

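    It squashes its input into the range (-1, 1) and is zero-centered, which keeps hidden states bounded.
    Note that tanh saturates for large |x|, where its gradient approaches zero, so it does not by itself
    prevent vanishing gradients in deep or long-sequence networks.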

In [3]:
def tanh(x):
    return np.tanh(x)
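As a quick sanity check, the exponential definition of tanh given above can be compared with `np.tanh`. This is a sketch for illustration only: the helper name `tanh_from_definition` and the sample points are assumptions, not part of the original notebook.

```python
import numpy as np


def tanh_from_definition(x):
    """Hypothetical helper: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))."""
    x = np.asarray(x, dtype=float)
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))


# Arbitrary sample points, including the first pre-activation (0.98)
samples = np.array([-2.0, -0.5, 0.0, 0.98, 2.0])
print(np.allclose(tanh_from_definition(samples), np.tanh(samples)))
```

The direct formula is fine for moderate inputs like the ones in this problem, but `np.exp` overflows for very large |x|, so `np.tanh` is the safer choice in practice.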

Compute hidden states for the sequence

In [4]:
hidden_states = []
for t, x_t in enumerate(x_values):
    # Compute the new hidden state using the RNN update rule
    # h_t = tanh(W_xh * x_t + W_hh * h_(t-1) + b)
    h_t = tanh((W_xh * x_t) + (W_hh * h_prev) + b)
    hidden_states.append(h_t)  # Store hidden state
    h_prev = h_t  # Update previous hidden state

    # Print the computed hidden state at each time step
    print(f"Time Step {t+1}: Computed hidden state h_{t+1} = {h_t:.6f}")

Time Step 1: Computed hidden state h_1 = 0.753066
Time Step 2: Computed hidden state h_2 = 0.809829
Time Step 3: Computed hidden state h_3 = 0.708873
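One caveat with the loop above: it overwrites `h_prev`, so re-running the cell would continue from h_3 instead of restarting at h0 = 0.1. A minimal sketch that avoids this by wrapping the recurrence in a function is shown below. The name `rnn_forward` and its signature are assumptions for illustration, and the reference values are the ones printed above.

```python
import numpy as np


def rnn_forward(x_values, h0, W_xh, W_hh, b):
    """Hypothetical helper: return the list of hidden states for a scalar RNN.

    The initial state is passed in and never mutated, so repeated calls
    always start from the same h0.
    """
    h = h0
    states = []
    for x_t in x_values:
        h = np.tanh(W_xh * x_t + W_hh * h + b)  # h_t = tanh(W_xh*x_t + W_hh*h_(t-1) + b)
        states.append(float(h))
    return states


states = rnn_forward([0.7, 0.5, 0.2], h0=0.1, W_xh=0.9, W_hh=0.5, b=0.3)

# Values printed by the notebook cell above, to six decimals
printed = [0.753066, 0.809829, 0.708873]
print(np.allclose(states, printed, atol=1e-6))
```

Calling `rnn_forward` twice with the same arguments gives the same result, which makes the check repeatable.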
