# Early Stopping in Neural Networks

**1. Introduction to Early Stopping**

*   **Purpose:** Early Stopping is a mechanism used to improve neural network training by preventing overfitting and determining the optimal number of training epochs.
*   **Context:** When training a neural network, you must specify the number of epochs – how many times the model iterates over the same data. Deciding the correct number of epochs (e.g., 100 vs. 1000) is crucial.

**2. The Problem: Overfitting**

*   **Definition:** Overfitting occurs when a model fits the training data too closely, for example because it is trained for too many epochs. It then performs exceptionally well on the training data but poorly on new, unseen data.
*   **Identification:** This can be observed by monitoring both training loss and validation loss.
    *   Initially, both training and validation loss decrease.
    *   At a certain point, the validation loss (the loss on held-out data the model is not trained on) starts to *increase*, while the training loss may continue to decrease. This divergence signifies overfitting.
    *   The "gap" between training and validation loss widens as overfitting progresses.
*   **Example from Source:** An example demonstrated training a model for 3500 epochs. The validation loss initially reduced but then began to increase around 360-380 epochs, indicating overfitting despite the training loss continuing to decrease. The ideal stopping point would have been around 360-380 epochs.
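
The divergence described above can also be located programmatically. The following is a minimal sketch, assuming the per-epoch losses are available as plain Python lists. The numbers are invented for illustration and are not taken from the 3500-epoch run, and the helper names `best_epoch` and `generalisation_gap` are hypothetical.

```python
# Hypothetical per-epoch losses (illustrative values, not from the source run).
train_loss = [0.70, 0.60, 0.52, 0.45, 0.40, 0.36, 0.33, 0.30]
val_loss = [0.72, 0.63, 0.56, 0.52, 0.51, 0.53, 0.56, 0.60]


def best_epoch(losses):
    """Return the 1-based epoch with the lowest loss."""
    return min(range(len(losses)), key=lambda i: losses[i]) + 1


def generalisation_gap(train, val):
    """Per-epoch difference between validation and training loss."""
    return [round(v - t, 2) for t, v in zip(train, val)]


stop_at = best_epoch(val_loss)
gaps = generalisation_gap(train_loss, val_loss)
print(f"Validation loss is lowest at epoch {stop_at}")
print(f"Gap per epoch: {gaps}")
```

In this toy data the validation loss bottoms out at epoch 5 while the training loss keeps falling, and the gap grows every epoch afterwards, which is the pattern the notes describe.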

**3. What is Early Stopping?**

*   **Mechanism:** Early Stopping is a general training technique, available in Keras as the `EarlyStopping` callback. It detects when further training no longer helps and starts to harm the model (i.e., overfitting or increasing validation loss), and stops training at that point.
*   **How it Works:** It monitors a specified metric (e.g., validation loss) during training and halts the process if the metric no longer improves for a certain number of epochs.
*   **Benefit:** It allows the model to stop training at the point where it achieves the best generalisation performance, without having to manually guess the number of epochs.
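
To make the monitoring logic concrete, here is a minimal sketch in plain Python. It is not the Keras implementation: the function name `train_with_early_stopping` is hypothetical, and a list of pre-recorded validation losses stands in for real training.

```python
def train_with_early_stopping(val_losses, patience):
    """Simulate a training loop over pre-recorded validation losses.

    Returns (stopped_epoch, best_epoch), both 1-based.
    """
    best = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        # A real loop would train for one epoch and evaluate here.
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Hypothetical validation losses: improve until epoch 4, then degrade.
losses = [0.70, 0.62, 0.58, 0.55, 0.56, 0.57, 0.59, 0.62, 0.66]
stopped, best = train_with_early_stopping(losses, patience=3)
print(f"stopped at epoch {stopped}, best epoch was {best}")
```

Note that the loop stops a few epochs after the best one, because it has to wait to see whether the metric recovers.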

**4. Implementing Early Stopping in Keras**

*   **Keras Callbacks:** Early Stopping is implemented using the "callback" feature in Keras. Callbacks are objects whose methods Keras calls at certain stages of the training process (e.g., at the end of each epoch).
*   **Steps:**
    1.  **Define Model and Compile:** As usual, set up your Keras model and compile it (no changes required here for Early Stopping).
    2.  **Create EarlyStopping Object:** Instantiate the `EarlyStopping` class from `tensorflow.keras.callbacks`. Its constructor takes several parameters.
    3.  **Add to Callbacks List:** Store the `EarlyStopping` object in a list, typically named `callbacks`.
    4.  **Pass to `model.fit()`:** Provide this `callbacks` list to the `callbacks` parameter in the `model.fit()` method.

*   **Demonstration:** With Early Stopping enabled, the model that previously trained for 3500 epochs and overfit stopped automatically after 327 epochs. In the resulting plot, the validation loss was only just beginning to diverge from the training loss, so training ended before significant overfitting occurred.
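
The callback idea itself can be sketched without Keras. The toy code below is an assumption-laden illustration, not the library's code: `ToyModel`, `StopWhenLossRises`, and `fit` are hypothetical names, and the model is passed to the callback explicitly, whereas a Keras callback reaches it through `self.model`.

```python
class ToyModel:
    """Stand-in for a model; it only carries the stop flag."""

    def __init__(self):
        self.stop_training = False


class StopWhenLossRises:
    """Toy callback: requests a stop as soon as val_loss goes up."""

    def __init__(self):
        self.previous = None

    def on_epoch_end(self, epoch, logs, model):
        current = logs["val_loss"]
        if self.previous is not None and current > self.previous:
            model.stop_training = True
        self.previous = current


def fit(model, val_losses, epochs, callbacks):
    """Toy training loop: calls every callback after each epoch."""
    epochs_run = 0
    for epoch in range(epochs):
        logs = {"val_loss": val_losses[epoch]}  # pretend we trained and evaluated
        epochs_run += 1
        for cb in callbacks:
            cb.on_epoch_end(epoch, logs, model)
        if model.stop_training:
            break
    return epochs_run


callbacks = [StopWhenLossRises()]
epochs_run = fit(ToyModel(), [0.9, 0.7, 0.6, 0.65, 0.8], epochs=5, callbacks=callbacks)
print(f"ran {epochs_run} of 5 epochs")
```

The training loop knows nothing about the stopping rule. It only calls the callbacks and checks a flag, which is why stopping behaviour can be swapped in through the `callbacks` list without touching the model definition.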

**5. Key Parameters of the `EarlyStopping` Callback**

These parameters allow for flexibility in the Early Stopping mechanism:

*   **`monitor`**:
    *   **Purpose:** Specifies the quantity to be monitored for improvement.
    *   **Common Use:** Typically set to `'val_loss'` (validation loss) as it directly reflects performance on unseen data. It can also be `'val_accuracy'` (validation accuracy).
    *   **Logic:** If monitoring loss, training stops when loss stops decreasing. If monitoring accuracy, training stops when accuracy stops increasing.
*   **`min_delta`**:
    *   **Purpose:** The minimum change in the monitored quantity to qualify as an "improvement".
    *   **Example:** If `min_delta` is 0.001, a change of 0.0005 does not count as an improvement, so that epoch counts towards `patience` like any other epoch without improvement.
*   **`patience`**:
    *   **Purpose:** The number of epochs with no improvement after which training will be stopped.
    *   **Mechanism:** It acts as a buffer. Even if an epoch shows no improvement, training won't stop immediately. The callback waits for `patience` epochs, and training stops only if no improvement is observed during that period.
    *   **Example:** If `patience=3`, the model will wait for 3 consecutive epochs with no improvement before stopping.
*   **`verbose`**:
    *   **Purpose:** Controls the verbosity of the output messages.
    *   **Values:** `0` means no messages are printed, `1` means messages (like "Early stopping") are printed.
*   **`mode`**:
    *   **Purpose:** Determines the direction of improvement.
    *   **Values:**
        *   `'auto'` (default): Keras infers the direction from the name of the `monitor` quantity (e.g., a minimum for loss, a maximum for accuracy). This is generally recommended.
        *   `'min'`: Training stops when the monitored quantity has stopped *decreasing* (e.g., for loss).
        *   `'max'`: Training stops when the monitored quantity has stopped *increasing* (e.g., for accuracy).
*   **`baseline`**:
    *   **Purpose:** A baseline value for the monitored quantity. If set, training stops when the model fails to improve on this baseline within `patience` epochs.
    *   **Usage:** Requires a strong understanding of your data and a specific target for performance.
*   **`restore_best_weights`**:
    *   **Purpose:** Whether to restore model weights from the epoch with the best value of the monitored quantity.
    *   **Values:**
        *   `True`: The model's weights will revert to those from the epoch where the `monitor` quantity was at its optimal value (e.g., lowest validation loss).
        *   `False` (default): The model will retain the weights from the *last* epoch it trained, which might not be the absolute best.
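
The interaction between `patience`, `min_delta`, and `mode` is easier to see on recorded metric values. The sketch below is a simplified approximation written for these notes, not the Keras source: the function `early_stop_epoch` is hypothetical, `'auto'` mode, `baseline`, and weight restoration are left out, and the metric values are invented.

```python
def early_stop_epoch(values, patience, min_delta=0.0, mode="min"):
    """Return (stopped_epoch, best_epoch), both 1-based, for a recorded metric.

    Simplified illustration: `baseline` and weight restoration are omitted.
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    best = None
    best_epoch = 0
    wait = 0  # epochs since the last qualifying improvement
    for epoch, value in enumerate(values, start=1):
        if best is None:
            improved = True
        elif mode == "min":
            improved = value < best - min_delta
        else:
            improved = value > best + min_delta
        if improved:
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(values), best_epoch


# Hypothetical validation loss that keeps improving, but only by tiny steps.
val_loss = [0.500, 0.400, 0.3996, 0.3993, 0.3992, 0.3991]
print(early_stop_epoch(val_loss, patience=3))                   # runs all 6 epochs
print(early_stop_epoch(val_loss, patience=3, min_delta=0.001))  # tiny steps ignored

# Hypothetical validation accuracy, where larger is better.
val_acc = [0.60, 0.70, 0.75, 0.74, 0.75, 0.73]
print(early_stop_epoch(val_acc, patience=3, mode="max"))
```

With `min_delta=0.001` the small gains after epoch 2 no longer count, so training stops at epoch 5 with epoch 2 as the best. The returned best epoch is the one whose weights `restore_best_weights=True` would bring back.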

**6. Conclusion**

*   Early Stopping is a powerful and practical feature in deep learning.
*   It simplifies the training process by automatically deciding the optimal number of epochs, saving time, and preventing the negative effects of overfitting.
*   Experimenting with its parameters is encouraged to develop an intuition for how it works and to achieve the best results for a specific model and dataset.

---

In [4]:
!pip install tensorflow

In [2]:
import pandas as pd
import numpy as np

In [3]:
df=pd.read_csv("https://raw.githubusercontent.com/pankaj-2708/Machine-Learning/refs/heads/main/Datasets/placement.csv")

In [5]:
df.head()

| Unnamed: 0 | cgpa | placement_exam_marks | placed |
|---|---|---|---|
| 0 | 7.19 | 26.0 | 1 |
| 1 | 7.46 | 38.0 | 1 |
| 2 | 7.54 | 40.0 | 1 |
| 3 | 6.42 | 8.0 | 1 |
| 4 | 7.23 | 17.0 | 0 |


In [34]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.callbacks import EarlyStopping

# Small binary classifier: 2 input features -> 10 -> 6 -> 1 (sigmoid output)
model = Sequential()
model.add(Input(shape=(2,)))  # explicit Input layer instead of input_dim on the first Dense
model.add(Dense(10, activation='sigmoid'))
model.add(Dense(6, activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])


In [35]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

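# Keep only the two feature columns; also drop the CSV's 'Unnamed: 0' index column if it is present
X = df.drop(columns=['placed', 'Unnamed: 0'], errors='ignore')
y = df['placed']

# NOTE: the test split doubles as the validation data passed to model.fit
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
std = StandardScaler()
X_train = std.fit_transform(X_train)  # fit the scaler on the training data only
X_test = std.transform(X_test)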

In [36]:
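callback = EarlyStopping(
    monitor='val_loss',          # watch the validation loss
    patience=10,                 # allow 10 epochs without improvement before stopping
    verbose=1,                   # print a message when training is stopped
    mode='auto',                 # infer min/max from the monitored quantity
    baseline=None,               # no target value to beat
    restore_best_weights=False   # keep the weights from the last epoch
)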

In [37]:
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10000, callbacks=[callback])

Epoch 1/10000
25/25 ━━━━━━━━━━━━━━━━━━━━ 3s 25ms/step - accuracy: 0.4975 - loss: 0.6960 - val_accuracy: 0.5500 - val_loss: 0.6894
Epoch 2/10000
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.5000 - loss: 0.6947 - val_accuracy: 0.5400 - val_loss: 0.6901
Epoch 3/10000
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.4925 - loss: 0.6945 - val_accuracy: 0.5400 - val_loss: 0.6926
Epoch 4/10000
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 12ms/step - accuracy: 0.4512 - loss: 0.6945 - val_accuracy: 0.4700 - val_loss: 0.6936
Epoch 5/10000
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 11ms/step - accuracy: 0.4938 - loss: 0.6941 - val_accuracy: 0.5650 - val_loss: 0.6922
Epoch 6/10000
25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step - accuracy: 0.4588 - loss: 0.6943 - val_accuracy: 0.4650 - val_loss: 0.6940
Epoch 7/10000
...

<keras.src.callbacks.history.History at 0x1e646f07f00>