<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/assets/logos/SN_web_lightmode.png" height=300 width=300 />


# Final Project: League of Legends Match Predictor 


### Introduction  

League of Legends, a popular multiplayer online battle arena (MOBA) game, generates extensive data from matches, providing an excellent opportunity to apply machine learning techniques to real-world scenarios. Perform the following steps to build a logistic regression model aimed at predicting the outcomes of League of Legends matches.  

Use the [league_of_legends_data_large.csv](https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/rk7VDaPjMp1h5VXS-cUyMg/league-of-legends-data-large.csv) file to perform the tasks.  

### Step 1: Data Loading and Preprocessing  

#### Task 1: Load the League of Legends dataset and preprocess it for training.  

Loading and preprocessing the dataset involves reading the data, splitting it into training and testing sets, and standardizing the features. You will utilize `pandas` for data manipulation, `train_test_split` from `sklearn` for data splitting, and `StandardScaler` for feature scaling.  

Note: Please ensure all the required libraries are installed and imported.

1. Load the dataset:
   Use `pd.read_csv()` to load the dataset into a pandas DataFrame.
2. Split the data into features and target:
   Separate `win` (the target) from the remaining columns (the features), for example `X = data.drop('win', axis=1)` and `y = data['win']`.
3. Split the data into training and testing sets:
   Use `train_test_split()` from `sklearn.model_selection` to divide the data. Set `test_size=0.2` to allocate 20% for testing and 80% for training, and use `random_state=42` to make the split reproducible.
4. Standardize the features:
   Use `StandardScaler()` from `sklearn.preprocessing` to scale the features. Fit the scaler on the training set only, then apply the same transformation to the test set.
5. Convert to PyTorch tensors:
   Use `torch.tensor()` to convert the data to PyTorch tensors.
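
The steps above can be sketched on a small synthetic table. This is a minimal illustration, not the project solution: the column names `kills` and `gold_earned` and all values are made up, and only the `win` target name comes from the instructions.

```python
import numpy as np
import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the real dataset (made-up columns and values)
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "kills": rng.integers(0, 20, size=100),
    "gold_earned": rng.integers(5000, 20000, size=100),
    "win": rng.integers(0, 2, size=100),
})

X = data.drop("win", axis=1)
y = data["win"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training split only, then reuse its statistics
# for the test split so no test information leaks into preprocessing.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# float32 matches the default dtype of nn.Linear parameters
X_train_t = torch.tensor(X_train_scaled, dtype=torch.float32)
X_test_t = torch.tensor(X_test_scaled, dtype=torch.float32)
y_train_t = torch.tensor(y_train.values, dtype=torch.float32).unsqueeze(1)
y_test_t = torch.tensor(y_test.values, dtype=torch.float32).unsqueeze(1)

print(X_train_t.shape, X_test_t.shape, y_train_t.shape)
```

The targets are reshaped to `(N, 1)` so that they match the shape of the model output that `BCELoss` will later compare them against.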

#### Exercise 1:  

Write code to load the dataset, split it into training and testing sets, standardize the features, and convert the data into PyTorch tensors for use in training a PyTorch model.  


### Setup
Installing required libraries:

The following required libraries are not pre-installed in the Skills Network Labs environment. You will need to run the following cell to install them:


In [1]:
%%time
%pip install pandas scikit-learn matplotlib
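# If pip reports no match for the `+cpu` builds on your platform, drop the `+cpu`
# suffix from each version (for example, `torch==2.8.0`).
%pip install torch==2.8.0+cpu torchvision==0.23.0+cpu torchaudio==2.8.0+cpu \
    --index-url https://download.pytorch.org/whl/cpu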


Note: you may need to restart the kernel to use updated packages.
Looking in indexes: https://download.pytorch.org/whl/cpu
ERROR: Could not find a version that satisfies the requirement torch==2.8.0+cpu (from versions: 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 2.6.0, 2.7.0, 2.7.1, 2.8.0, 2.9.0, 2.9.1)
ERROR: No matching distribution found for torch==2.8.0+cpu
Note: you may need to restart the kernel to use updated packages.
CPU times: user 35.8 ms, sys: 20.9 ms, total: 56.7 ms
Wall time: 3.9 s


In [None]:
## Write your code here

import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np

# Load dataset from URL
url = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/rk7VDaPjMp1h5VXS-cUyMg/league-of-legends-data-large.csv'
lol_data = pd.read_csv(url)

# Separate features from target variable
feature_cols = [col for col in lol_data.columns if col != 'win']
X_features = lol_data[feature_cols]
y_target = lol_data['win']

# Create train/test split (80/20)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_features, y_target, 
    test_size=0.2, 
    random_state=42,
    stratify=y_target
)

# Apply standardization
feature_scaler = StandardScaler()
X_tr_scaled = feature_scaler.fit_transform(X_tr)
X_te_scaled = feature_scaler.transform(X_te)

# Convert to PyTorch tensor format
X_train = torch.from_numpy(X_tr_scaled.astype(np.float32))
X_test = torch.from_numpy(X_te_scaled.astype(np.float32))
y_train = torch.from_numpy(y_tr.values.astype(np.float32)).reshape(-1, 1)
y_test = torch.from_numpy(y_te.values.astype(np.float32)).reshape(-1, 1)

print(f"Training features shape: {X_train.shape}")
print(f"Testing features shape: {X_test.shape}")
print(f"Feature count: {X_train.shape[1]}")

### Step 2: Logistic Regression Model  

#### Task 2: Implement a logistic regression model using PyTorch.  

Defining the logistic regression model involves specifying the input dimensions, the forward pass using the sigmoid activation function, and initializing the model, loss function, and optimizer.  

1. Define the logistic regression model:
   Create a class `LogisticRegressionModel` that inherits from `torch.nn.Module`.
   - In the `__init__()` method, define a linear layer (`nn.Linear`) that maps the input features to a single output.
   - The `forward()` method should apply the sigmoid activation function to the output of the linear layer.

2. Initialize the model, loss function, and optimizer:
   - Set `input_dim`: Use `X_train.shape[1]` to get the number of features from the training data (`X_train`).
   - Initialize the model: Create an instance of the `LogisticRegressionModel` class, passing `input_dim` as a parameter (e.g., `model = LogisticRegressionModel(input_dim)`).
   - Loss function: Use `BCELoss()` from `torch.nn` (binary cross-entropy loss).
   - Optimizer: Initialize the optimizer using `optim.SGD()` with a learning rate of 0.01.
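
To make the forward pass and the loss concrete, the sketch below checks by hand what a linear layer followed by a sigmoid computes, and what `BCELoss` returns for it. The layer size, batch, and targets are arbitrary values chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A single linear layer with 3 inputs and 1 output (sizes chosen for illustration)
linear = nn.Linear(3, 1)
x = torch.randn(4, 3)  # batch of 4 made-up samples

# What the model's forward pass computes
probs = torch.sigmoid(linear(x))

# The same computation written out by hand: sigma(x W^T + b)
z = x @ linear.weight.T + linear.bias
probs_manual = 1.0 / (1.0 + torch.exp(-z))
print(torch.allclose(probs, probs_manual, atol=1e-6))

# BCELoss expects probabilities in [0, 1] and float targets of the same shape
targets = torch.tensor([[1.0], [0.0], [1.0], [0.0]])
loss = nn.BCELoss()(probs, targets)

# Binary cross-entropy written out: -mean(y*log(p) + (1-y)*log(1-p))
manual_loss = -(
    targets * torch.log(probs) + (1 - targets) * torch.log(1 - probs)
).mean()
print(loss.item(), manual_loss.item())
```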

#### Exercise 2:  

Define the logistic regression model using PyTorch, specifying the input dimensions and the forward pass. Initialize the model, loss function, and optimizer.  


In [None]:
## Write your code here

import torch.nn as nn
import torch.optim as optim

# Define custom logistic regression architecture
class LogisticRegressionModel(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.linear = nn.Linear(num_features, 1)
        
    def forward(self, x):
        logits = self.linear(x)
        probabilities = torch.sigmoid(logits)
        return probabilities

# Get number of input features
num_input_features = X_train.shape[1]

# Instantiate model
model = LogisticRegressionModel(num_input_features)

# Define binary cross entropy loss
criterion = nn.BCELoss()

# Initialize SGD optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)

print(f"Model architecture:")
print(f"  Input features: {num_input_features}")
print(f"  Output: 1 (binary classification)")
print(f"\nModel structure:\n{model}")



### Step 3: Model Training  

#### Task 3: Train the logistic regression model on the dataset.  

The training loop will run for a specified number of epochs. In each epoch, the model makes predictions, calculates the loss, performs backpropagation, and updates the model parameters.

1. Set Number of Epochs:  
   - Define the number of epochs for training to 1000.

2. Training Loop:  
   For each epoch:
   - Set the model to training mode using `model.train()`.
   - Zero the gradients using `optimizer.zero_grad()`.
   - Pass the training data (`X_train`) through the model to get the predictions (`outputs`).
   - Calculate the loss using the defined loss function (`criterion`).
   - Perform backpropagation with `loss.backward()`.
   - Update the model's weights using `optimizer.step()`.

3. Print Loss Every 100 Epochs:  
   - After every 100 epochs, print the current epoch number and the loss value.

4. Model Evaluation:  
   - Set the model to evaluation mode using `model.eval()`.
   - Use `torch.no_grad()` to ensure no gradients are calculated during evaluation.
   - Get predictions on both the training set (`X_train`) and the test set (`X_test`).

5. Calculate Accuracy:  
   - For both the training and test datasets, compute the accuracy by comparing the predicted values with the true values (`y_train`, `y_test`).
   - Use a threshold of 0.5 for classification.
   
6. Print Accuracy:  
   - Print the training and test accuracies after the evaluation is complete.
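
The accuracy calculation in steps 4 to 6 can be illustrated on a handful of made-up probabilities. The values below are hypothetical; the point is the thresholding and the shape handling.

```python
import torch

# Made-up predicted probabilities and true labels, shaped (N, 1) like the model output
probs = torch.tensor([[0.90], [0.40], [0.55], [0.20], [0.70]])
labels = torch.tensor([[1.0], [0.0], [0.0], [0.0], [1.0]])

# Threshold at 0.5: probabilities at or above the threshold become class 1
preds = (probs >= 0.5).float()

# Element-wise comparison gives a boolean tensor; its mean is the accuracy
accuracy = (preds == labels).float().mean().item()
print(f"Accuracy: {accuracy:.2f}")

# Pitfall: comparing (N, 1) against (N,) broadcasts to (N, N) and
# silently produces a meaningless accuracy, so keep the shapes identical.
mismatched = preds == labels.squeeze(1)
print(mismatched.shape)
```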

#### Exercise 3:  

Write the code to train the logistic regression model on the dataset. Implement the training loop, making predictions, calculating the loss, performing backpropagation, and updating model parameters. Evaluate the model's accuracy on training and testing sets.  


In [None]:
# Write your code here

# Training configuration
num_epochs = 1000
print_interval = 100

# Training loop
for current_epoch in range(num_epochs):
    # Enable training mode
    model.train()
    
    # Reset gradients
    optimizer.zero_grad()
    
    # Forward propagation
    train_predictions = model(X_train)
    
    # Compute loss
    train_loss = criterion(train_predictions, y_train)
    
    # Backpropagation
    train_loss.backward()
    
    # Parameter update
    optimizer.step()
    
    # Periodic logging
    if (current_epoch + 1) % print_interval == 0:
        print(f'Epoch {current_epoch + 1}/{num_epochs} | Loss: {train_loss.item():.4f}')

# Evaluation phase
model.eval()
with torch.no_grad():
    train_probs = model(X_train)
    test_probs = model(X_test)

# Calculate accuracies using 0.5 threshold
train_predictions_binary = (train_probs >= 0.5).float()
test_predictions_binary = (test_probs >= 0.5).float()

train_acc = (train_predictions_binary == y_train).float().mean()
test_acc = (test_predictions_binary == y_test).float().mean()

print(f'\n{"="*50}')
print(f'Training Set Accuracy: {train_acc.item()*100:.2f}%')
print(f'Test Set Accuracy: {test_acc.item()*100:.2f}%')
print(f'{"="*50}')


### Step 4: Model Optimization and Evaluation  

#### Task 4: Implement optimization techniques and evaluate the model's performance.  

Optimization techniques such as L2 regularization (the penalty used in ridge regression) help prevent overfitting. The model is retrained with this regularization, and its performance is evaluated on both the training and testing sets.

**Weight decay:** In PyTorch optimizers, `weight_decay` is the parameter that applies L2 regularization to the model's parameters. It discourages overfitting by penalizing large weight values, which pushes the model toward simpler solutions. To use L2 regularization, set the `weight_decay` parameter when creating the optimizer; the optimizer then adds the L2 regularization term during training.
For example, when you initialize the optimizer with `optim.SGD(model.parameters(), lr=0.01, weight_decay=0.01)`, the `weight_decay=0.01` term applies L2 regularization with a strength of 0.01.
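
As a rough check of what `weight_decay` does in `optim.SGD`, the sketch below takes one optimizer step in two ways: once with `weight_decay` set, and once with plain SGD plus an explicit penalty of `(wd / 2) * ||w||^2` added to the loss. The weight vector, input, and squared-error loss are made up for the comparison.

```python
import torch

torch.manual_seed(0)
lr, wd = 0.01, 0.01

# Two copies of the same made-up weight vector
w_decay = torch.randn(5, requires_grad=True)
w_manual = w_decay.detach().clone().requires_grad_(True)

x = torch.randn(5)

def data_loss(w):
    # Any differentiable loss works here; a squared error is used for brevity
    return (w @ x - 1.0) ** 2

# Option A: let the optimizer apply weight decay
opt_a = torch.optim.SGD([w_decay], lr=lr, weight_decay=wd)
opt_a.zero_grad()
data_loss(w_decay).backward()
opt_a.step()

# Option B: plain SGD, with the penalty (wd / 2) * ||w||^2 added to the loss
opt_b = torch.optim.SGD([w_manual], lr=lr)
opt_b.zero_grad()
(data_loss(w_manual) + 0.5 * wd * (w_manual ** 2).sum()).backward()
opt_b.step()

# Both updates should land on the same weights
print(torch.allclose(w_decay, w_manual, atol=1e-6))
```

Note that passing `model.parameters()` to the optimizer applies the decay to every parameter it receives, including the bias.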

1. Set Up the Optimizer with L2 Regularization:
   - Modify the optimizer to include `weight_decay` for L2 regularization.
   - Example:
     ```python
     optimizer = optim.SGD(model.parameters(), lr=0.01, weight_decay=0.01)
     ```
2. Train the Model with L2 Regularization:
   - Follow the same steps as before, but use the updated optimizer with regularization during training.
   - Use `epochs=1000`.
   
3. Evaluate the Optimized Model:
   - After training, evaluate the model on both the training and test datasets.
   - Compute the accuracy for both sets by comparing the model's predictions to the true labels (`y_train` and `y_test`).

4. Calculate and Print the Accuracy:
   - Use a threshold of 0.5 to determine whether the model's predictions are class 0 or class 1.
   - Print the training accuracy and test accuracy after evaluation.


#### Exercise 4:  

Implement optimization techniques like L2 regularization and retrain the model. Evaluate the performance of the optimized model on both training and testing sets.  


In [None]:
## Write your code here

# Reinitialize model for regularized training
model_regularized = LogisticRegressionModel(num_input_features)
criterion = nn.BCELoss()

# SGD with L2 penalty (weight_decay)
optimizer_reg = optim.SGD(model_regularized.parameters(), lr=0.01, weight_decay=0.01)

# Training with regularization
num_epochs = 1000
for epoch_idx in range(num_epochs):
    model_regularized.train()
    optimizer_reg.zero_grad()
    
    outputs_reg = model_regularized(X_train)
    loss_reg = criterion(outputs_reg, y_train)
    
    loss_reg.backward()
    optimizer_reg.step()
    
    if (epoch_idx + 1) % 100 == 0:
        print(f'Epoch {epoch_idx + 1}/{num_epochs} | Regularized Loss: {loss_reg.item():.4f}')

# Evaluate regularized model
model_regularized.eval()
with torch.no_grad():
    train_probs_reg = model_regularized(X_train)
    test_probs_reg = model_regularized(X_test)

# Calculate accuracies
train_preds_reg = (train_probs_reg >= 0.5).float()
test_preds_reg = (test_probs_reg >= 0.5).float()

acc_train_reg = (train_preds_reg == y_train).float().mean()
acc_test_reg = (test_preds_reg == y_test).float().mean()

print(f'\n{"="*50}')
print(f'With L2 Regularization:')
print(f'  Training Accuracy: {acc_train_reg.item()*100:.2f}%')
print(f'  Test Accuracy: {acc_test_reg.item()*100:.2f}%')
print(f'{"="*50}')

### Step 5: Visualization and Interpretation  

Visualization tools like confusion matrices and ROC curves provide insights into the model's performance. The confusion matrix helps in understanding the classification accuracy, while the ROC curve illustrates the trade-off between sensitivity and specificity.

**Confusion matrix:** A confusion matrix is a fundamental tool for evaluating a classification model. It tabulates the number of correct and incorrect predictions, broken down by actual and predicted class, where:
- True Positive (TP): Correctly predicted positive class (class 1).
- True Negative (TN): Correctly predicted negative class (class 0).
- False Positive (FP): Incorrectly predicted as positive (class 1), but the actual class is negative (class 0). This is also called a Type I error.
- False Negative (FN): Incorrectly predicted as negative (class 0), but the actual class is positive (class 1). This is also called a Type II error.

**ROC curve (Receiver Operating Characteristic curve):**
The ROC curve is a graphical representation used to evaluate the performance of a binary classification model across all classification thresholds. It plots two metrics against each other:
- True Positive Rate (TPR), also called recall or sensitivity: the proportion of actual positive instances (class 1) that the model correctly classified as positive.
- False Positive Rate (FPR): the proportion of actual negative instances (class 0) that the model incorrectly classified as positive.

**AUC:**
AUC stands for Area Under the Curve and is a performance metric for binary classification models. Specifically, it is the area under the ROC curve, which plots the TPR against the FPR for different threshold values.

**Classification report:**
A classification report summarizes per-class classification metrics (precision, recall, and F1-score), which are useful for evaluating a classifier on a given dataset.
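
The definitions above can be traced on a tiny hand-made example. The six labels and scores below are invented for illustration and are small enough to verify by counting.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Made-up true labels and predicted probabilities for six matches
y_true = np.array([0, 0, 1, 1, 1, 0])
y_score = np.array([0.2, 0.6, 0.7, 0.9, 0.4, 0.1])
y_pred = (y_score >= 0.5).astype(int)

# For binary labels, rows are actual classes and columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # recall / sensitivity
fpr = fp / (fp + tn)  # 1 - specificity
precision = tp / (tp + fp)

# AUC is computed from the scores, not from the thresholded labels
auc_value = roc_auc_score(y_true, y_score)

print(f"TN={tn} FP={fp} FN={fn} TP={tp}")
print(f"TPR={tpr:.3f} FPR={fpr:.3f} precision={precision:.3f} AUC={auc_value:.3f}")
```

Each point on the ROC curve corresponds to one such (FPR, TPR) pair at a particular threshold; the example computes only the pair for a threshold of 0.5.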

#### Exercise 5:  

Write code to visualize the model's performance using confusion matrices and ROC curves. Generate classification reports to evaluate precision, recall, and F1-score. Retrain the model with L2 regularization and evaluate the performance.


In [None]:
## Write your code here

import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, classification_report, roc_curve, auc
import numpy as np

# Generate confusion matrix
test_labels = (test_probs_reg >= 0.5).float().numpy()
conf_matrix = confusion_matrix(y_test.numpy(), test_labels)

# Plot confusion matrix
fig, ax = plt.subplots(figsize=(7, 7))
im = ax.imshow(conf_matrix, interpolation='nearest', cmap='Blues')
ax.set_title('Confusion Matrix - Match Outcome Prediction', fontsize=14, pad=20)
plt.colorbar(im, ax=ax)

class_labels = ['Loss', 'Win']
tick_positions = np.arange(len(class_labels))
ax.set_xticks(tick_positions)
ax.set_yticks(tick_positions)
ax.set_xticklabels(class_labels, rotation=45)
ax.set_yticklabels(class_labels)

# Add text annotations
threshold_val = conf_matrix.max() / 2.0
for i in range(conf_matrix.shape[0]):
    for j in range(conf_matrix.shape[1]):
        text_color = "white" if conf_matrix[i, j] > threshold_val else "black"
        ax.text(j, i, str(conf_matrix[i, j]), 
                ha="center", va="center", color=text_color, fontsize=16)

ax.set_ylabel('Actual Outcome', fontsize=12)
ax.set_xlabel('Predicted Outcome', fontsize=12)
plt.tight_layout()
plt.show()

# Classification metrics
print("\n" + "="*60)
print("CLASSIFICATION METRICS REPORT")
print("="*60)
print(classification_report(y_test.numpy(), test_labels, target_names=class_labels))

# ROC Curve
false_pos_rate, true_pos_rate, _ = roc_curve(y_test.numpy(), test_probs_reg.numpy())
area_under_curve = auc(false_pos_rate, true_pos_rate)

fig, ax = plt.subplots(figsize=(8, 6))
ax.plot(false_pos_rate, true_pos_rate, color='darkorange', lw=2.5, 
        label=f'ROC Curve (AUC = {area_under_curve:.3f})')
ax.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--', label='Random Classifier')
ax.set_xlim([0.0, 1.0])
ax.set_ylim([0.0, 1.05])
ax.set_xlabel('False Positive Rate', fontsize=12)
ax.set_ylabel('True Positive Rate', fontsize=12)
ax.set_title('ROC Curve - Model Performance', fontsize=14)
ax.legend(loc="lower right", fontsize=11)
ax.grid(alpha=0.3)
plt.tight_layout()
plt.show()

print(f"\nArea Under ROC Curve: {area_under_curve:.4f}")



Double-click <b>here</b> for the Hint.
<!-- 

#Change the name of variables as per your code
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, classification_report, roc_curve, auc
import itertools

# Visualize the confusion matrix
#Change the variable names as used in your code
y_pred_test_labels = (y_pred_test > 0.5).float()
cm = confusion_matrix(y_test, y_pred_test_labels)

plt.figure(figsize=(6, 6))
plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)
plt.title('Confusion Matrix')
plt.colorbar()
tick_marks = range(2)
plt.xticks(tick_marks, ['Loss', 'Win'], rotation=45)
plt.yticks(tick_marks, ['Loss', 'Win'])

thresh = cm.max() / 2
for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
    plt.text(j, i, cm[i, j], horizontalalignment="center", color="white" if cm[i, j] > thresh else "black")

plt.tight_layout()
plt.ylabel('True label')
plt.xlabel('Predicted label')
plt.show()

# Print classification report
print("Classification Report:\n", classification_report(y_test, y_pred_test_labels, target_names=['Loss', 'Win']))

# Plot ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_test)
roc_auc = auc(fpr, tpr)

plt.figure()
plt.plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (area = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic (ROC)')
plt.legend(loc="lower right")
plt.show()
-->


### Step 6: Model Saving and Loading  

#### Task 6: Save and load the trained model.  

This task demonstrates the techniques to persist a trained model using `torch.save` and reload it using `torch.load`. Evaluating the loaded model ensures that it retains its performance, making it practical for deployment in real-world applications.  

1. Saving the Model:
   - Save the model's learned weights and biases using `torch.save()` (e.g., `torch.save(model.state_dict(), 'your_model_name.pth')`).
   - Saving only the state dictionary (model parameters) is preferred because it is more flexible and efficient than saving the entire model object.

2. Loading the Model:
   - Create a new model instance with the same input dimension (e.g., `model = LogisticRegressionModel(input_dim)`) and load the saved parameters (e.g., `model.load_state_dict(torch.load('your_model_name.pth'))`).

3. Evaluating the Loaded Model:
   - After loading, set the model to evaluation mode by calling `model.eval()`.
   - Evaluate the loaded model on the test dataset to confirm that it performs the same as when it was first trained.
   - Use `torch.no_grad()` to ensure that no gradients are computed.
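
A minimal round trip, assuming a small stand-in model rather than the project's `LogisticRegressionModel`, might look like the sketch below. The file name `demo_model.pth` and the layer sizes are hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)

# A small stand-in model; the layer sizes are arbitrary
model = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
x = torch.randn(3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_model.pth")  # hypothetical file name

    # Save only the parameters (weights and biases)
    torch.save(model.state_dict(), path)

    # Rebuild the same architecture, then load the parameters into it
    restored = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
    restored.load_state_dict(torch.load(path))

model.eval()
restored.eval()
with torch.no_grad():
    same_outputs = torch.equal(model(x), restored(x))

# The state dict maps parameter names to tensors
print(list(model.state_dict().keys()))
print(same_outputs)
```

Because the state dictionary holds only tensors, the loading code must construct the architecture itself before calling `load_state_dict()`.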

#### Exercise 6:  

Write code to save the trained model and reload it. Ensure the loaded model performs consistently by evaluating it on the test dataset.  


In [None]:

# Save trained model parameters
model_save_path = 'lol_match_predictor.pth'
torch.save(model_regularized.state_dict(), model_save_path)
print(f"Model saved successfully to: {model_save_path}")

# Load model into new instance
loaded_model = LogisticRegressionModel(num_input_features)
loaded_model.load_state_dict(torch.load(model_save_path))
print(f"Model loaded successfully from: {model_save_path}")

# Set to evaluation mode
loaded_model.eval()

# Verify loaded model performance
with torch.no_grad():
    loaded_test_probs = loaded_model(X_test)
    loaded_test_preds = (loaded_test_probs >= 0.5).float()
    loaded_accuracy = (loaded_test_preds == y_test).float().mean()

print(f"\n{'='*50}")
print(f"Loaded Model Verification:")
print(f"  Test Accuracy: {loaded_accuracy.item()*100:.2f}%")
print(f"  Match with original: {torch.allclose(loaded_test_probs, test_probs_reg, atol=1e-6)}")
print(f"{'='*50}")

### Step 7: Hyperparameter Tuning  

#### Task 7: Perform hyperparameter tuning to find the best learning rate.  

By testing different learning rates, you will identify the rate that gives the best test accuracy. The learning rate controls the size of each parameter update, so it strongly affects how far training gets in a fixed number of epochs.

1. Define Learning Rates:
   - Test these learning rates: `[0.01, 0.05, 0.1]`.

2. Reinitialize the Model for Each Learning Rate:
   - For each learning rate, reinitialize the model and optimizer (e.g., `torch.optim.SGD(model.parameters(), lr=lr)`).
   - Starting from a fresh model matters: otherwise each run would continue from the weights learned under the previous learning rate, and the results would not be comparable.

3. Train the Model for Each Learning Rate:
   - Train the model for a fixed number of epochs (e.g., 50 or 100 epochs) for each learning rate, and compute the accuracy on the test set.
   - Track the test accuracy for each learning rate and identify which one yields the best performance.

4. Evaluate and Compare:
   - After training with each learning rate, compare the test accuracy for each configuration.
   - Report the learning rate that gives the highest test accuracy.
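
The sweep can be sketched on synthetic data as follows. This is an illustrative variation rather than the exercise as stated: it selects the learning rate on a separate validation split instead of the test set, which is an assumption made here to keep the test set untouched. The data, the `train_and_score` helper, and the split sizes are all hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data standing in for the real standardized features
X = torch.randn(400, 5)
true_w = torch.tensor([[1.5], [-2.0], [0.5], [0.0], [1.0]])
y = (X @ true_w + 0.3 * torch.randn(400, 1) > 0).float()

# Assumption: a validation split is used for model selection here,
# in place of the test set that the exercise asks for.
X_fit, y_fit = X[:300], y[:300]
X_val, y_val = X[300:], y[300:]

def train_and_score(lr, epochs=100):
    """Train a fresh logistic regression model and return validation accuracy."""
    torch.manual_seed(0)  # same initial weights for every learning rate
    model = nn.Sequential(nn.Linear(5, 1), nn.Sigmoid())
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.BCELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = criterion(model(X_fit), y_fit)
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        preds = (model(X_val) >= 0.5).float()
    return (preds == y_val).float().mean().item()

results = {lr: train_and_score(lr) for lr in [0.01, 0.05, 0.1]}
best_lr = max(results, key=results.get)

for lr, acc in results.items():
    print(f"lr={lr}: validation accuracy {acc:.3f}")
print(f"best lr: {best_lr}")
```

Reseeding inside the helper gives every learning rate the same starting weights, so differences in accuracy come from the learning rate rather than from the random initialization.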

#### Exercise 7:  

Perform hyperparameter tuning to find the best learning rate. Retrain the model for each learning rate and evaluate its performance to identify the optimal rate.  


In [None]:

# Learning rates to test
learning_rates_to_test = [0.01, 0.05, 0.1]
tuning_epochs = 100

results_dict = {}

print("Starting Hyperparameter Tuning...")
print("="*60)

for lr_value in learning_rates_to_test:
    print(f"\nTesting Learning Rate: {lr_value}")
    
    # Initialize fresh model
    tuning_model = LogisticRegressionModel(num_input_features)
    tuning_optimizer = optim.SGD(tuning_model.parameters(), lr=lr_value, weight_decay=0.01)
    tuning_criterion = nn.BCELoss()
    
    # Train model
    for epoch in range(tuning_epochs):
        tuning_model.train()
        tuning_optimizer.zero_grad()
        
        outputs = tuning_model(X_train)
        loss = tuning_criterion(outputs, y_train)
        
        loss.backward()
        tuning_optimizer.step()
    
    # Evaluate on test set
    tuning_model.eval()
    with torch.no_grad():
        test_outputs = tuning_model(X_test)
        test_predictions = (test_outputs >= 0.5).float()
        test_accuracy = (test_predictions == y_test).float().mean().item()
    
    results_dict[lr_value] = test_accuracy
    print(f"  Final Test Accuracy: {test_accuracy*100:.2f}%")

# Find best learning rate
best_lr = max(results_dict, key=results_dict.get)
best_accuracy = results_dict[best_lr]

print(f"\n{'='*60}")
print(f"Hyperparameter Tuning Results:")
for lr, acc in results_dict.items():
    marker = " ← BEST" if lr == best_lr else ""
    print(f"  LR={lr}: {acc*100:.2f}%{marker}")
print(f"\nOptimal Learning Rate: {best_lr}")
print(f"Best Test Accuracy: {best_accuracy*100:.2f}%")
print(f"{'='*60}")

### Step 8: Feature Importance  

#### Task 8: Evaluate feature importance to understand the impact of each feature on the prediction.  

In this task, you will inspect the learned weights to see how strongly each feature influences the prediction.

1. Extracting Model Weights:
   - The weights of the logistic regression model indicate how much each feature contributes to the prediction. Because the features were standardized, the weights are on a comparable scale. They are stored in the model's linear layer (`model.linear.weight`).
   - You can extract the weights using `model.linear.weight.data.numpy()` and flatten the result to get a 1D array of feature importances.

2. Creating a DataFrame:
   - Create a pandas DataFrame with two columns: one for the feature names and the other for their corresponding importance values (i.e., the learned weights).
   - Ensure the weights are aligned with the feature names from your dataset (e.g., `X.columns`, taken from the DataFrame before it was converted to tensors).

3. Sorting and Plotting Feature Importance:
   - Sort the features by the absolute value of their importance (weights) to identify the most impactful features.
   - Use a bar plot (via `matplotlib`) to visualize the sorted feature importances, with the feature names on the y-axis and importance values on the x-axis.

4. Interpreting the Results:
   - Larger absolute weights indicate more influential features. A positive weight means that higher values of the feature push the prediction toward the positive class (win), while a negative weight pushes it toward the negative class (loss).
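
One way to read the weights more concretely is as odds ratios. The sketch below uses hypothetical feature names and weights (not values learned from the dataset) and assumes the features were standardized, as in Step 1.

```python
import numpy as np
import pandas as pd

# Hypothetical feature names and learned weights, for illustration only
features = ["kills", "deaths", "gold_earned", "wards_placed"]
weights = np.array([0.80, -1.20, 0.35, -0.05])

importance = pd.DataFrame({"Feature": features, "Weight": weights})

# With standardized inputs, exp(weight) is the factor by which the odds of
# the positive class change when the feature rises by one standard deviation.
importance["Odds_Ratio"] = np.exp(importance["Weight"])

# Rank by magnitude so strongly negative features are not pushed to the bottom
order = importance["Weight"].abs().sort_values(ascending=False).index
importance = importance.reindex(order).reset_index(drop=True)

print(importance)
```

Sorting by the signed weight instead would place the most negative feature last, even though it may be the most influential one.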

#### Exercise 8:  

Evaluate feature importance by extracting the weights of the linear layer and creating a DataFrame to display the importance of each feature. Visualize the feature importance using a bar plot.  


In [None]:

import pandas as pd
import matplotlib.pyplot as plt

# Extract learned weights
model_weights = model_regularized.linear.weight.data.numpy().flatten()
feature_names = list(X_features.columns)

# Create importance dataframe
importance_df = pd.DataFrame({
    'Feature': feature_names,
    'Weight': model_weights,
    'Absolute_Weight': np.abs(model_weights)
})

# Sort by absolute importance
importance_df = importance_df.sort_values('Absolute_Weight', ascending=False)

print("Feature Importance Analysis")
print("="*60)
print(importance_df.head(15).to_string(index=False))

# Visualize top features
top_n = 20
top_features = importance_df.head(top_n)

fig, ax = plt.subplots(figsize=(10, 8))
colors = ['green' if w > 0 else 'red' for w in top_features['Weight']]
bars = ax.barh(range(len(top_features)), top_features['Weight'], color=colors, alpha=0.7)

ax.set_yticks(range(len(top_features)))
ax.set_yticklabels(top_features['Feature'])
ax.set_xlabel('Feature Weight (Importance)', fontsize=12)
ax.set_title(f'Top {top_n} Most Important Features', fontsize=14, pad=15)
ax.axvline(x=0, color='black', linestyle='-', linewidth=0.8)
ax.grid(axis='x', alpha=0.3)

# Add legend
from matplotlib.patches import Patch
legend_elements = [
    Patch(facecolor='green', alpha=0.7, label='Positive Impact (Win)'),
    Patch(facecolor='red', alpha=0.7, label='Negative Impact (Loss)')
]
ax.legend(handles=legend_elements, loc='lower right')

plt.tight_layout()
plt.show()

print(f"\nInterpretation:")
print(f"- Positive weights (green) increase win probability")
print(f"- Negative weights (red) decrease win probability")
print(f"- Larger absolute values indicate stronger influence")

Double-click <b>here</b> for the Hint
<!-- 
#Use the following code to extract the weight and create dataframe
#Change the name of variables per your code

# Extract the weights of the linear layer:
weights = model.linear.weight.data.numpy().flatten()
features = X.columns
# Create a DataFrame for feature importance:
feature_importance = pd.DataFrame({'Feature': features, 'Importance': weights})
feature_importance = feature_importance.sort_values(by='Importance', ascending=False)
print(feature_importance)
# Plot feature importance
plt.figure(figsize=(10, 6))
plt.bar(feature_importance['Feature'], feature_importance['Importance'])
plt.xlabel('Features')
plt.ylabel('Importance')
plt.title('Feature Importance')
plt.xticks(rotation=45)
plt.show()
-->


#### Conclusion:  

Congratulations on completing the project! In this final project, you built a logistic regression model to predict the outcomes of League of Legends matches based on various in-game statistics. This comprehensive project involved several key steps, including data loading and preprocessing, model implementation, training, optimization, evaluation, visualization, model saving and loading, hyperparameter tuning, and feature importance analysis. This project provided hands-on experience with the complete workflow of developing a machine learning model for binary classification tasks using PyTorch.

© Copyright IBM Corporation. All rights reserved.
