In [None]:
# Install required packages

!pip install tensorflow==2.18.0
!pip install keras==3.7.0
!pip install torch==2.5.1
!pip install torchvision==0.20.1

!pip install numpy==2.0.2
!pip install scipy==1.14.1
!pip install pandas==2.2.3

!pip install scikit-learn==1.5.2

!pip install matplotlib==3.9.2

!pip install joblib==1.4.2
!pip install python-dateutil==2.9.0.post0

!pip install sympy==1.13.1
!pip install opt-einsum==3.4.0

!pip install tensorboard==2.18.0
!pip install protobuf==5.29.0
!pip install threadpoolctl==3.5.0
!pip install packaging==24.2


#1. Import Necessary Libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from keras.regularizers import l2

* numpy: For numerical computations and dataset creation.
* matplotlib.pyplot: For visualizing loss behavior.
* train_test_split: Splits the dataset into training and testing sets.
* Sequential, Dense: Keras tools to build a shallow neural network model.
* Adam: Optimizer for training the model efficiently.
* l2: Implements weight decay (L2 regularization) to control overfitting.

#2. Generate Synthetic Dataset

In [None]:
np.random.seed(42)
n_samples = 100  # Total number of samples (split into train and test later)
X = np.random.uniform(-1, 1, size=(n_samples, 1))
y = (np.sin(2 * np.pi * X).ravel() + 0.7 * np.random.normal(size=n_samples)) > 0  # Binary classification

* **X**: Input features sampled uniformly from [*-1, 1*].
* **y**: Output labels generated by thresholding a sine function with **Gaussian** noise (*𝜎 = 0.7*), creating a **binary classification** problem.
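To get a feel for how much the noise disturbs the labels, the sketch below repeats the same recipe and compares the noisy labels with the labels a noise-free sine threshold would give. The names `clean_signal`, `y_clean`, `positive_fraction`, and `flipped_fraction` are illustrative and not part of the notebook.

```python
import numpy as np

# Same recipe as the cell above: seed, inputs, then a noisy thresholded sine.
np.random.seed(42)
n_samples = 100
X = np.random.uniform(-1, 1, size=(n_samples, 1))
clean_signal = np.sin(2 * np.pi * X).ravel()
noise = 0.7 * np.random.normal(size=n_samples)
y = (clean_signal + noise) > 0

# Labels the noise-free rule would have assigned (illustrative name).
y_clean = clean_signal > 0

positive_fraction = y.mean()              # share of samples labelled True
flipped_fraction = (y != y_clean).mean()  # share of labels changed by the noise

print(f"positive class fraction: {positive_fraction:.2f}")
print(f"labels flipped by noise: {flipped_fraction:.2f}")
```

A noticeable share of flipped labels means that a model which fits the training set perfectly must also be memorizing noise, which is the setting in which double descent is usually discussed.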

#3. Split Data into Training and Testing Sets

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Splits the dataset into:
* **X_train, y_train**: Training data (*80% of the dataset*).
* **X_test, y_test**: Testing data (*20% of the dataset*).
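Conceptually, a shuffled hold-out split permutes the sample indices and reserves a fixed share for testing. The NumPy sketch below illustrates that idea only. `shuffle_split` is a hypothetical helper, not scikit-learn's implementation, and it will not reproduce the exact rows that `train_test_split` selects.

```python
import numpy as np

def shuffle_split(X, y, test_size=0.2, seed=42):
    """Hypothetical helper: shuffle indices, hold out the first `test_size` share."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    n_test = int(round(test_size * len(X)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy data, only used to show the resulting shapes.
X_demo = np.arange(100, dtype=float).reshape(-1, 1)
y_demo = X_demo.ravel() > 50
X_tr, X_te, y_tr, y_te = shuffle_split(X_demo, y_demo)
print(X_tr.shape, X_te.shape)
```

With 100 samples and `test_size=0.2`, 80 rows go to training and 20 to testing, and no row appears in both sets.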

#4. Define Experiment Configurations

In [None]:
epoch_counts = [250, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500]  # Training durations
hidden_layer_size = 50  # Size of hidden layer (model complexity)
l2_weight = 0.00  # L2 regularization strength (disabled in this experiment)

* **epoch_counts**: Defines a range of training durations to explore epoch-wise double descent behavior.
* **hidden_layer_size**: Sets the number of neurons in the hidden layer, controlling model complexity.
* **l2_weight**: Specifies the strength of L2 regularization (*set to **0.00** to observe baseline behavior*).
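As a rough check on these settings, the sketch below counts the trainable parameters of a network with one input, one hidden layer, and one output, and shows how an L2 kernel penalty of the form `l2_weight * sum(w ** 2)` scales with the strength. The weights in `example_kernel` are made up for illustration, and the non-zero strength `0.01` is only a comparison value, not a setting used in this notebook.

```python
import numpy as np

hidden_layer_size = 50
l2_weight = 0.00

# One input feature -> hidden layer -> one sigmoid output.
hidden_params = 1 * hidden_layer_size + hidden_layer_size  # weights + biases
output_params = hidden_layer_size * 1 + 1                  # weights + bias
total_params = hidden_params + output_params
print("trainable parameters:", total_params)

# L2 penalty on a layer kernel: strength * sum of squared weights.
rng = np.random.default_rng(0)
example_kernel = rng.normal(size=(1, hidden_layer_size))  # made-up weights
for strength in (l2_weight, 0.01):
    penalty = strength * np.sum(example_kernel ** 2)
    print(f"l2_weight={strength}: penalty={penalty:.4f}")
```

The parameter count exceeds the 80 training samples, so the model has enough capacity to memorize the training set, and with `l2_weight = 0.00` nothing in the loss discourages it from doing so.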

#5. Initialize Result Storage

In [None]:
train_losses = []
test_losses = []

* **train_losses**: Stores training losses for each epoch configuration.
* **test_losses**: Stores testing losses for each epoch configuration.

#6. Loop Over Epoch Configurations

In [None]:
for epochs in epoch_counts:
    # Build the model
    model = Sequential([
        Dense(hidden_layer_size, input_dim=1, activation='relu', kernel_regularizer=l2(l2_weight)),
        Dense(1, activation='sigmoid', kernel_regularizer=l2(l2_weight))
    ])

    # Compile the model
    model.compile(optimizer=Adam(learning_rate=0.01), loss='binary_crossentropy', metrics=['accuracy'])

    # Train the model
    model.fit(X_train, y_train, epochs=epochs, verbose=0, batch_size=16)

    # Evaluate the model
    train_loss, _ = model.evaluate(X_train, y_train, verbose=0)
    test_loss, _ = model.evaluate(X_test, y_test, verbose=0)

    # Log the results
    train_losses.append(train_loss)
    test_losses.append(test_loss)

1. Model Creation:

* A shallow network with one hidden layer (*hidden_layer_size neurons*) using **ReLU activation**.
* Output layer uses a **sigmoid activation** for binary classification.
* L2 regularization (*weight decay*) is attached to both layers, but with **l2_weight = 0.00** it adds nothing to the loss in this experiment.
2. Compilation:

* **Optimizer**: Adam with a *learning rate of 0.01*.
* **Loss**: Binary Crossentropy, suited for binary classification tasks.
* **Metrics**: Accuracy to monitor performance.
3. Training:

* The model is trained for the specified number of epochs (*epochs*), with a batch size of 16.
4. Evaluation:

* Computes the training loss and test loss once training has finished for the current epoch configuration; the accuracy returned alongside each loss is discarded.
5. Logging Results:

* Stores *training and testing losses* in **train_losses and test_losses** for later visualization.
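Because the plot tracks loss rather than accuracy, it helps to see how binary cross-entropy treats confidence. The NumPy sketch below is a minimal version of the formula, assuming mean reduction and a small clipping constant `eps` to avoid `log(0)`. The function and the example predictions are illustrative, not taken from the notebook's runs.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; clipping with `eps` avoids log(0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    per_sample = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(per_sample))

labels = np.array([1, 0, 1, 0])
cautious = np.array([0.6, 0.4, 0.6, 0.4])           # all correct, low confidence
overconfident = np.array([0.99, 0.01, 0.01, 0.99])  # two confident mistakes

cautious_loss = binary_crossentropy(labels, cautious)
overconfident_loss = binary_crossentropy(labels, overconfident)
print(f"cautious:      {cautious_loss:.4f}")
print(f"overconfident: {overconfident_loss:.4f}")
```

Confident mistakes are penalized far more heavily than cautious correct answers, which is one reason test loss can climb during long training even when test accuracy changes little.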

#7. Visualize Results

In [None]:
plt.figure(figsize=(10, 6))
plt.plot(epoch_counts, train_losses, label='Train Loss', marker='o')
plt.plot(epoch_counts, test_losses, label='Test Loss', marker='o')
plt.xlabel('Number of Epochs')
plt.ylabel('Loss (Binary Crossentropy)')
plt.title('Epoch-wise Double Descent Behavior')
plt.legend()
plt.grid()
plt.show()

* **X-axis**: Number of training epochs.
* **Y-axis**: Loss (*Binary Crossentropy*).
* **Curves**:
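  * Training loss (*train_losses*): solid line with circular markers, labelled **Train Loss**.
  * Testing loss (*test_losses*): solid line with circular markers, labelled **Test Loss**. The two curves are told apart by color and by the legend.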
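* **Highlights**: *The plot is meant to reveal epoch-wise double descent, where test loss decreases, increases, and decreases again as training progresses*. Each point comes from a separately trained model with its own random initialization, so some run-to-run noise is expected.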
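One way to check a loss curve for the decrease, increase, decrease shape described above is to compress it into a sequence of phases. The sketch below uses two hypothetical helpers, `descent_phases` and `shows_double_descent`, and a made-up list of losses. It is a simple heuristic under those assumptions, not a standard test, and a noisy curve may need a non-zero `tol`.

```python
def descent_phases(losses, tol=0.0):
    """Hypothetical helper: compress a loss sequence into 'down'/'up' phases."""
    phases = []
    for previous, current in zip(losses, losses[1:]):
        change = current - previous
        if abs(change) <= tol:
            continue  # ignore changes no larger than the tolerance
        direction = "down" if change < 0 else "up"
        if not phases or phases[-1] != direction:
            phases.append(direction)
    return phases

def shows_double_descent(losses, tol=0.0):
    """True if the phases contain 'down', 'up', 'down' consecutively."""
    phases = descent_phases(losses, tol)
    return any(phases[i:i + 3] == ["down", "up", "down"] for i in range(len(phases)))

# Made-up numbers for illustration only, not results from this notebook.
example_losses = [0.70, 0.62, 0.66, 0.71, 0.64, 0.60]
print(descent_phases(example_losses))
print(shows_double_descent(example_losses))
```

Applied to `test_losses`, a check like this makes the claim about the curve's shape explicit instead of leaving it to visual inspection.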

#Key Observations
**Double Descent**:

* In a double descent pattern, the **test loss** first falls while the model is still **underfitting**, rises as the model begins **overfitting** the noisy labels, and then falls again or levels off with further training.

**Effect of Training Duration**:

* Short training durations may leave the model **underfit**. When double descent occurs, training well past the point where test loss first starts to rise can bring the test loss back down.