# RMSprop optimizer

RMSprop (Root Mean Square Propagation) is an optimization algorithm commonly used to train artificial neural networks (ANNs). It is particularly useful when vanilla stochastic gradient descent (SGD) struggles because gradient magnitudes are very small, very large, or differ widely between parameters (symptoms often described as vanishing or exploding gradients), so that a single global learning rate suits only some of the parameters.

## Details of RMSprop Algorithm

RMSprop is an adaptive learning rate optimization algorithm proposed by Geoffrey Hinton in his Coursera course Neural Networks for Machine Learning. The algorithm adjusts the learning rate of each parameter individually during training.

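Summary of how RMSprop works:

1. Track Squared Gradients: RMSprop maintains an exponentially decaying moving average of the squared gradients for each parameter. This is similar to AdaGrad, except that AdaGrad accumulates all past squared gradients, whereas the decaying average lets old gradients fade out so the effective learning rate does not keep shrinking.

2. Scale the Learning Rate: For each parameter, RMSprop divides the learning rate by the square root of that moving average (plus a small constant for numerical stability). Parameters with consistently large gradients get a smaller effective learning rate, and parameters with small gradients get a larger one.

3. Update Parameters: Each parameter is moved against its gradient using its scaled learning rate. This normalizes the step sizes across parameters and limits the effect of very small or very large gradients on the update.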
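The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the Keras implementation: the function name `rmsprop_step`, the toy objective, and the placement of `eps` outside the square root are assumptions made for this example.

```python
import numpy as np

def rmsprop_step(params, grads, avg_sq, lr=0.001, rho=0.9, eps=1e-7):
    """One RMSprop update; returns the new parameters and the new running average."""
    # Step 1: exponentially decaying average of squared gradients, one entry per parameter.
    avg_sq = rho * avg_sq + (1.0 - rho) * grads ** 2
    # Steps 2 and 3: divide the learning rate by the root of that average and update.
    params = params - lr * grads / (np.sqrt(avg_sq) + eps)
    return params, avg_sq

# Toy problem (assumed for illustration): minimize f(w) = sum(w**2), gradient 2 * w.
w = np.array([1.0, -2.0])
avg_sq = np.zeros_like(w)
for _ in range(500):
    w, avg_sq = rmsprop_step(w, 2.0 * w, avg_sq, lr=0.01)

print("parameters after 500 steps:", w)
```

Both parameters end up close to the minimum at zero, and `avg_sq` always has the same shape as `w`, which is the per-parameter state the algorithm has to keep.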


## Pros of RMSprop optimizer

1. Adaptive Learning Rates: RMSprop adapts the learning rate of each parameter individually based on the recent magnitude of its gradients. This often speeds up convergence compared with a single global learning rate, especially in deep neural networks.

2. Stability: Because each step is normalized by the recent gradient magnitude, updates stay in a similar range even when gradients become very small or very large, which helps stabilize training.

3. Simple Implementation: RMSprop is relatively easy to implement and widely used in practice.
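To make the first two points concrete, the sketch below compares the step sizes of plain SGD and RMSprop for two parameters whose gradients differ by four orders of magnitude. The gradient values are made up for illustration and are held constant so the running average can settle.

```python
import numpy as np

grads = np.array([100.0, 0.01])  # hypothetical gradients with very different scales
lr, rho, eps = 0.001, 0.9, 1e-7

# Plain SGD: the step is proportional to the raw gradient.
sgd_step = lr * grads

# RMSprop: let the running average of squared gradients settle, then scale the step.
avg_sq = np.zeros_like(grads)
for _ in range(100):
    avg_sq = rho * avg_sq + (1.0 - rho) * grads ** 2
rms_update = lr * grads / (np.sqrt(avg_sq) + eps)

print("SGD step sizes:    ", sgd_step)
print("RMSprop step sizes:", rms_update)
```

With SGD the two steps differ by the same factor as the gradients, while the RMSprop steps are both close to `lr`, so neither parameter dominates or stalls.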

## Cons of RMSprop optimizer

1. Hyperparameter Sensitivity: RMSprop, like other adaptive methods, has hyperparameters that need to be tuned, such as the learning rate and the decay rate of the moving average (`rho` in Keras). Poorly chosen values can lead to slow or unstable training.

2. Memory Usage: RMSprop stores a moving average of squared gradients for every parameter, so the optimizer state needs an extra array the same size as the model's parameters. This can be noticeable for large models.
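The sensitivity to the learning rate can be seen on a one-dimensional toy problem. The sketch below is an illustration under assumed settings (the helper `final_distance`, the objective `f(w) = w**2`, and the three learning rates are all chosen for this example) and says nothing about good values for a real network.

```python
import numpy as np

def final_distance(lr, rho=0.9, eps=1e-7, steps=500):
    """Run RMSprop on f(w) = w**2 from w = 1.0 and return the final |w|."""
    w, avg_sq = 1.0, 0.0
    for _ in range(steps):
        grad = 2.0 * w
        avg_sq = rho * avg_sq + (1.0 - rho) * grad ** 2
        w -= lr * grad / (np.sqrt(avg_sq) + eps)
    return abs(w)

# Same problem and step budget, three different learning rates.
results = {lr: final_distance(lr) for lr in (1e-4, 1e-2, 1.0)}
for lr, dist in results.items():
    print(f"lr={lr:g}: final |w| = {dist:.4f}")
```

The smallest rate barely moves away from the starting point within the step budget, the middle rate ends near the minimum, and the largest rate takes steps of roughly the size of `lr` and so can overshoot the minimum rather than settle on it.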


## References
- https://keras.io/api/optimizers/rmsprop/

In [None]:
from fashionmnist_model import FMM
import tensorflow as tf

In [None]:
# Load and preprocess the data
X_train, y_train, X_test, y_test = FMM.load_data()

In [None]:
# Reshape the data
X_train, X_test = FMM.reshape_data(X_train, X_test)

In [None]:
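# RMSprop with the Keras defaults (learning_rate=0.001, rho=0.9)
optimizer = tf.keras.optimizers.RMSprop()
model = FMM.create_model()
print(f"Training with {optimizer.__class__.__name__} optimizer...")
# Compile and train the model, passing the test split alongside the training data
history, train_accuracy, val_accuracy = FMM.compile_and_train(
    model, X_train, y_train, X_test, y_test, optimizer
)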

In [None]:
loss, accuracy = FMM.evaluate(model, X_test, y_test)

In [None]:
print(f"Training accuracy : {train_accuracy}")
print(f"Validation accuracy : {val_accuracy}")
print(f"Test loss : {loss}")
print(f"Test accuracy : {accuracy}")

In [None]:
FMM.plot_history(history, optimizer)

The trend shows that the training loss keeps decreasing over time while the validation loss starts to increase. This widening gap suggests that the model is overfitting the training data.