
## Theoretical Implementation

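Here is a Python implementation of the log-loss (binary cross-entropy) cost function derived in the transcript. The verification at the end runs it on three labelled examples, once with confident correct predictions and once with confident wrong ones.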
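For reference, the formula that the code comment below states in plain text can be written out as follows. The per-example index $(i)$ is an assumption added here for readability; the source gives only the unindexed form.

```latex
J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right) \right]
```

Here $m$ is the number of training examples, $a^{(i)}$ is the predicted probability for example $i$, and $y^{(i)}$ is its true label (0 or 1). Since each label is 0 or 1, only one of the two terms is nonzero for any given example.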


In [1]:

import numpy as np

def compute_cost_theory(A, Y):
    """
    Computes the cost J using the Log Loss formula.
    
    Arguments:
    A -- The probability vector output from sigmoid (predictions), shape (1, m)
    Y -- The true "label" vector (0 or 1), shape (1, m)
    
    Returns:
    cost -- The scalar cost J
    """
    # Retrieve number of training examples
    m = Y.shape[1]

    # THEORY: J = -1/m * Sum( y*log(a) + (1-y)*log(1-a) )
    # We use np.multiply for element-wise multiplication
    # We use np.sum to aggregate losses across all examples
    logprobs = np.multiply(Y, np.log(A)) + np.multiply((1 - Y), np.log(1 - A))
    cost = (-1 / m) * np.sum(logprobs)
    
    # np.sum with no axis argument already returns a scalar; squeeze is a safeguard
    # that would turn a 1x1 array such as [[17]] into a plain scalar (e.g., 17)
    cost = np.squeeze(cost)
    
    return cost

# --- VERIFICATION ---
if __name__ == "__main__":
    # Simulate 3 examples
    # Y: [Cat, Not-Cat, Cat]
    Y_train = np.array([[1, 0, 1]])
    
    # Case A: Good Predictions (High confidence in correct class)
    A_good = np.array([[0.99, 0.01, 0.99]])
    cost_good = compute_cost_theory(A_good, Y_train)
    print(f"Cost for Good Predictions: {cost_good:.5f} (Should be low)")

    # Case B: Bad Predictions (High confidence in WRONG class)
    # The model thinks the first cat is definitely NOT a cat (0.01)
    A_bad = np.array([[0.01, 0.99, 0.01]])
    cost_bad = compute_cost_theory(A_bad, Y_train)
    print(f"Cost for Bad Predictions:  {cost_bad:.5f} (Should be very high)")

Cost for Good Predictions: 0.01005 (Should be low)
Cost for Bad Predictions:  4.60517 (Should be very high)
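
One limitation of the function above: if any entry of `A` is exactly 0 or 1, `np.log` evaluates `log(0)` and the cost comes out as `inf` or `nan`. A minimal sketch of a guarded variant follows. The function name `compute_cost_clipped` and the default `eps=1e-12` are hypothetical choices, not part of the original, and clipping is one common workaround rather than the method from the transcript.

```python
import numpy as np

def compute_cost_clipped(A, Y, eps=1e-12):
    """
    Log-loss cost with predictions clipped away from 0 and 1.

    A   -- predicted probabilities, shape (1, m)
    Y   -- true labels (0 or 1), shape (1, m)
    eps -- assumed clipping margin (hypothetical default, not from the source)
    """
    m = Y.shape[1]

    # Keep every prediction inside [eps, 1 - eps] so log() never receives 0
    A_safe = np.clip(A, eps, 1 - eps)

    # Same formula as compute_cost_theory, applied to the clipped predictions
    logprobs = Y * np.log(A_safe) + (1 - Y) * np.log(1 - A_safe)
    return float(-np.sum(logprobs) / m)

# Labels as in the verification above; the third prediction is exactly 0
# for a true label of 1, which would make the unclipped cost infinite
Y_train = np.array([[1, 0, 1]])
A_extreme = np.array([[1.0, 0.0, 0.0]])

cost_extreme = compute_cost_clipped(A_extreme, Y_train)
print(f"Clipped cost is finite: {np.isfinite(cost_extreme)}")
```

For predictions that are not at the extremes, such as `A_good` and `A_bad` above, clipping with a margin this small leaves the cost unchanged to the printed precision.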
