
# **Mean Squared Error (MSE)**
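Pros: Simple and widely used for regression problems.
Smooth and differentiable everywhere, so gradient-based training often converges quickly.

Cons: Sensitive to outliers, because each error is squared before averaging.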

Use Cases: Regression tasks.


In [11]:
import tensorflow as tf

y_true = tf.constant([3.0, 4.0, 5.0])
y_pred = tf.constant([2.5, 4.5, 5.5])

# Manual computation: mean of the squared differences
loss = tf.reduce_mean(tf.square(y_true - y_pred))

# Same computation using the built-in Keras loss class
mse_loss = tf.keras.losses.MeanSquaredError()
loss_value = mse_loss(y_true, y_pred)
print(loss_value)

tf.Tensor(0.25, shape=(), dtype=float32)



# **Mean Absolute Error (MAE)**

Pros: Less sensitive to outliers compared to MSE.

Cons: The gradient has constant magnitude and is undefined at zero error, so convergence near the minimum can be less smooth than with MSE.

Use Cases: Regression tasks, especially when outliers are present.
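
The difference in gradient behaviour between MAE and MSE can be made concrete with a small sketch. It uses plain Python rather than TensorFlow so the arithmetic stays visible; the helper names `mse_grad` and `mae_grad` are illustrative and not part of any library, and returning 0 at exactly zero error is a convention assumed here.

```python
def mse_grad(t, p, n):
    """Derivative of the mean squared error with respect to one prediction p."""
    return 2.0 * (p - t) / n

def mae_grad(t, p, n):
    """Subgradient of the mean absolute error with respect to one prediction p.

    The derivative is undefined at p == t; returning 0.0 there is an assumed convention.
    """
    if p == t:
        return 0.0
    return (1.0 if p > t else -1.0) / n

target = 4.0
n = 1  # a single sample keeps the numbers easy to read

# Predictions that move closer and closer to the target
for p in [14.0, 5.0, 4.1, 4.01]:
    print(f"pred={p:<5} mse_grad={mse_grad(target, p, n):.3f} mae_grad={mae_grad(target, p, n):.3f}")
```

The MSE gradient shrinks as the prediction approaches the target, while the MAE gradient keeps the same magnitude no matter how large or small the error is.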



In [12]:
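# Reuses y_true and y_pred from the MSE cell above
# Manual computation: mean of the absolute differences
loss = tf.reduce_mean(tf.abs(y_true - y_pred))

# Same computation using the built-in Keras loss class
mae_loss = tf.keras.losses.MeanAbsoluteError()
loss_value = mae_loss(y_true, y_pred)
print(loss_value)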

tf.Tensor(0.5, shape=(), dtype=float32)
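
To illustrate how differently MSE and MAE react to an outlier, here is a minimal sketch in plain Python. The helper functions `mse` and `mae` and the prediction `15.0` are hypothetical values chosen for illustration; the clean data matches the cells above.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [3.0, 4.0, 5.0]
y_pred_clean = [2.5, 4.5, 5.5]      # every error is 0.5
y_pred_outlier = [2.5, 4.5, 15.0]   # last error is 10.0 (hypothetical outlier)

print("clean   MSE, MAE:", mse(y_true, y_pred_clean), mae(y_true, y_pred_clean))
print("outlier MSE, MAE:", mse(y_true, y_pred_outlier), mae(y_true, y_pred_outlier))
```

With the single outlier, MSE grows from 0.25 to 33.5, while MAE grows from 0.5 to about 3.67. Squaring lets one large error dominate the average.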



# **Binary cross-entropy**

Pros: Directly applicable to binary classification problems.
Penalizes confident but wrong predictions heavily.

Use Cases: Binary classification tasks.
Multi-label classification tasks, where each label is treated as an independent binary problem.



$$
- y \log(p) - (1-y) \log(1-p)
$$


In [13]:
import tensorflow as tf
y_true = tf.constant([0., 1., 1.])
y_pred = tf.constant([0.1, 0.8, 0.3])
loss = tf.keras.losses.binary_crossentropy(y_true, y_pred)
print(loss)

tf.Tensor(0.5108254, shape=(), dtype=float32)
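
The per-sample terms behind that averaged value can be worked out by hand from the formula. This is a sketch in plain Python; the helper `bce_per_sample` is illustrative, and unlike Keras it does not clip probabilities away from 0 and 1, so the last digits may differ slightly from the TensorFlow output.

```python
import math

def bce_per_sample(y, p):
    """Binary cross-entropy for one label y in {0, 1} and predicted probability p."""
    return -y * math.log(p) - (1 - y) * math.log(1 - p)

y_true = [0.0, 1.0, 1.0]
y_pred = [0.1, 0.8, 0.3]

per_sample = [bce_per_sample(y, p) for y, p in zip(y_true, y_pred)]
mean_loss = sum(per_sample) / len(per_sample)

for y, p, loss in zip(y_true, y_pred, per_sample):
    print(f"label={y} prob={p} loss={loss:.4f}")
print(f"mean loss={mean_loss:.4f}")
```

The third sample, where the true label is 1 but the predicted probability is only 0.3, contributes far more to the mean than the two mostly correct predictions.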



# **Categorical Cross-Entropy Loss**

Cons: Requires one-hot encoding of the labels, which can increase memory usage when there are many classes.
Not suitable for multi-label classification, because it assumes exactly one correct class per sample.

Use Cases: Multi-class classification problems.

$$
-\sum_{i} Y_i \log(P_i)
$$

In [14]:
y_true = tf.constant([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]])
y_pred = tf.constant([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.4, 0.4]])
loss = tf.keras.losses.categorical_crossentropy(y_true, y_pred)
print(loss)

tf.Tensor([0.35667497 0.5108256  0.9162907 ], shape=(3,), dtype=float32)



# **Sparse Categorical Cross-Entropy Loss**

Pros: Suitable for multi-class classification problems.
Does not require one-hot encoding of the labels, which is memory efficient.
Functionally equivalent to categorical cross-entropy but with different label representation.

Cons: The label representation might not be as intuitive in some contexts as one-hot encoding.


In [15]:
y_true = tf.constant([0, 1, 2])
y_pred = tf.constant([[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.4, 0.4]])
loss = tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)
print(loss)

tf.Tensor([0.35667497 0.5108256  0.91629076], shape=(3,), dtype=float32)
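
The equivalence between the one-hot and integer-label forms can be checked by hand. The sketch below uses plain Python with illustrative helper names (`categorical_ce`, `sparse_categorical_ce`) and assumes each row of `y_pred` is already a probability distribution, as in the cells above.

```python
import math

y_pred = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.2, 0.4, 0.4]]
one_hot_labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
sparse_labels = [0, 1, 2]

def categorical_ce(y_row, p_row):
    """Cross-entropy with a one-hot label row: -sum_i Y_i * log(P_i)."""
    return -sum(y * math.log(p) for y, p in zip(y_row, p_row))

def sparse_categorical_ce(label, p_row):
    """Cross-entropy with an integer label: pick the true class's probability directly."""
    return -math.log(p_row[label])

cce = [categorical_ce(y, p) for y, p in zip(one_hot_labels, y_pred)]
scce = [sparse_categorical_ce(y, p) for y, p in zip(sparse_labels, y_pred)]

print("one-hot:", [round(v, 4) for v in cce])
print("sparse :", [round(v, 4) for v in scce])
```

Because a one-hot row zeroes out every term except the true class, the sum collapses to the single `-log` term that the sparse version computes directly.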



# **Hinge Loss**

Pros: Useful when you want a maximum-margin decision boundary: predictions that are correct with a margin of at least 1 incur no loss.

Cons: The outputs are raw scores rather than probabilities, so hinge loss is a poor fit for tasks where probability estimates are important.


$$
\text{Hinge Loss} = \max(0,\, 1 - y \cdot f)
$$

In [16]:
import tensorflow as tf

def hinge_loss(y_true, y_pred):
    loss = tf.maximum(0., 1. - y_true * y_pred)
    mean_loss = tf.reduce_mean(loss)
    return mean_loss

# Usage example
y_true = tf.constant([1, -1, 1], dtype=tf.float32)  # Labels are -1 or +1, stored as float32
y_pred = tf.constant([0.8, -0.7, 0.9], dtype=tf.float32)

loss_value = hinge_loss(y_true, y_pred)
print("Hinge Loss:", loss_value.numpy())


Hinge Loss: 0.2
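
To see how the margin shapes the loss, the formula can be evaluated for a few individual cases. This is a sketch in plain Python; the helper `hinge` and the example scores are hypothetical and chosen only to cover each regime.

```python
def hinge(y, f):
    """Hinge loss for one label y in {-1, +1} and one raw score f."""
    return max(0.0, 1.0 - y * f)

cases = [
    (1, 2.0),    # correct side, outside the margin
    (1, 0.8),    # correct side, but inside the margin
    (1, -0.5),   # wrong side of the boundary
    (-1, 3.0),   # confidently wrong
]

for y, f in cases:
    print(f"label={y:+d} score={f:+.1f} loss={hinge(y, f):.2f}")
```

A score of 0.8 for a positive label has the correct sign yet still incurs a loss of about 0.2, because it falls inside the margin. Only scores with `y * f >= 1` reach zero loss, and the loss grows linearly the further a score lies on the wrong side.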
