# Loading the Dataset
Here we import the necessary libraries, load the dataset as df, and separate the features ('H', 'B', 'D', 't') as X and the target ('Kd') as y.

In [3]:
import numpy as np
import pandas as pd
from scipy.optimize import least_squares
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Load the dataset from the Excel file
df = pd.read_excel('test.xlsx')
# 'Kd' is the target; 'H', 'B', 'D', and 't' are the input features
X = df[['H', 'B', 'D', 't']].values  # 4 input features
y = df['Kd'].values.reshape(-1, 1)  # Column vector; StandardScaler expects 2D input
print(df.head())

       H     B      D       t        Kd
0  1.625  1.25  0.188  0.0188  4.858459
1  1.625  1.25  0.188  0.0283  3.496159
2  1.625  1.25  0.188  0.0312  3.165983
3  1.625  1.25  0.188  0.0346  2.966884
4  2.500  1.25  0.188  0.0188  4.360603


# Artificial Neural Network
This notebook implements a simple feedforward neural network with one hidden layer, using a log-sigmoid (logistic) activation for the hidden neurons and an identity activation for the output. The model is trained for regression with Levenberg-Marquardt optimization via scipy.optimize.least_squares with method='lm'.
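
In equation form, the network can be restated as follows. Here $x$ is one standardized input row (1 × 4), $\sigma$ is the log-sigmoid, and $W_1$, $b_1$, $W_2$, $b_2$ follow the names used in the code. This is only a restatement of what forward_pass computes, not an additional model.

$$
\hat{y} = \sigma(x W_1 + b_1)\, W_2 + b_2, \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
$$
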
Key steps:

- Standardization of input and output features using StandardScaler for better convergence.
- Manual forward pass implementation with custom activation functions.
- Loss function computes residuals for the optimizer.
- Model structure: 4 input features, 2 hidden neurons, and 1 output neuron.
- Training/test split, weight initialization, and optimization.
- Model evaluation using MSE, MAE, and R² on both training and test sets.
- Prediction comparison in original scale for interpretability.

The final output includes model performance metrics and a table comparing actual vs predicted values with percentage error.
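
As a quick sanity check on the model structure described above, the sketch below counts the trainable parameters of a 4-2-1 network. The helper names count_parameters and enough_residuals are hypothetical and not part of the notebook. The second helper reflects the documented restriction that least_squares with method='lm' does not work when there are fewer residuals than variables.

```python
def count_parameters(features: int, hidden_neuron: int) -> int:
    """Number of trainable parameters in a features -> hidden_neuron -> 1 network."""
    w1 = features * hidden_neuron   # input-to-hidden weights
    b1 = hidden_neuron              # hidden biases
    w2 = hidden_neuron              # hidden-to-output weights
    b2 = 1                          # output bias
    return w1 + b1 + w2 + b2


def enough_residuals(n_samples: int, n_params: int) -> bool:
    """True when the number of residuals is not smaller than the number of parameters."""
    return n_samples >= n_params


n_params = count_parameters(features=4, hidden_neuron=2)
print(n_params)                         # 13
print(enough_residuals(10, n_params))   # False: 10 training rows would be too few
```

The count of 13 matches the length of the optimized weight vector printed by the training cell.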

In [4]:
# Activation Functions
def logsigmoid(x):
    x = np.clip(x, -500, 500)  # Clip input to prevent overflow in exp()
    return 1 / (1 + np.exp(-x)) # Log-sigmoid activation for hidden layer


def identity(x):
    return x  # Identity for output layer

# Forward Pass: compute the network output given input X and weights
def forward_pass(X, weights):
    W1 = weights[:features * hidden_neuron].reshape(features, hidden_neuron)  # (features, hidden_neuron)
    b1 = weights[features * hidden_neuron : features * hidden_neuron + hidden_neuron].reshape(1, hidden_neuron)  # (1, hidden_neuron)
    W2 = weights[features * hidden_neuron + hidden_neuron : features * hidden_neuron + 2 * hidden_neuron].reshape(hidden_neuron, 1)  # (hidden_neuron, 1)
    b2 = weights[-1]  # Single bias for output neuron
    
    hidden_layer = logsigmoid(np.dot(X, W1) + b1)  # Log-Sigmoid Activation
    output_layer = identity(np.dot(hidden_layer, W2) + b2)  # Identity Activation

    return output_layer

# Cost Function (LM requires residuals)
def loss_function(weights, X, y):
    y_pred = forward_pass(X, weights)
    return (y_pred - y).ravel() # Flatten for least_squares


# Standardize Input & Output
scaler_X = StandardScaler()
scaler_y = StandardScaler()

X = scaler_X.fit_transform(X)
y = scaler_y.fit_transform(y)  # Standardize y too!

# Define Model Structure
features = 4 # Number of input features
hidden_neuron = 2  # Number of neurons in the hidden layer
total_weights = (features * hidden_neuron) + hidden_neuron + hidden_neuron + 1  # Total number of trainable parameters

# Split into Training and Testing Sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=42) 

# Initialize Weights Randomly
initial_weights = np.random.randn(total_weights)  # 13 parameters (unseeded, so results vary between runs)

# Apply Levenberg-Marquardt Optimization
result = least_squares(loss_function, initial_weights, method='lm', args=(X_train, y_train))

# Optimized Weights
optimized_weights = result.x
print("Optimized Weights:\n", optimized_weights)

# Evaluate Model on Training and Test Data
def evaluate_regression(X, y, weights, scaler_y):
    y_pred = forward_pass(X, weights)
    
    # Inverse transform y_pred to original scale
    y_pred_original = scaler_y.inverse_transform(y_pred)
    y_original = scaler_y.inverse_transform(y)
    
    

    mse = mean_squared_error(y_original, y_pred_original)
    mae = mean_absolute_error(y_original, y_pred_original)
    r2 = r2_score(y_original, y_pred_original)
    
    return mse, mae, r2, y_original, y_pred_original

# Get Metrics for Training and Testing
train_mse, train_mae, train_r2, y_train_original, y_train_pred_original = evaluate_regression(X_train, y_train, optimized_weights, scaler_y)
test_mse, test_mae, test_r2, y_test_original, y_test_pred_original = evaluate_regression(X_test, y_test, optimized_weights, scaler_y)

# Print Results
print(f"Training MSE: {train_mse:.4f}, MAE: {train_mae:.4f}, R²: {train_r2:.4f}")
print(f"Test MSE: {test_mse:.4f}, MAE: {test_mae:.4f}, R²: {test_r2:.4f}")

# Compare y_test and y_pred Side by Side (Original Scale)
comparison_df = pd.DataFrame({
    'Actual (y_test)': y_test_original.flatten(),
    'Predicted (y_pred)': y_test_pred_original.flatten(),
    'Error (%)': np.abs(((y_test_original.flatten()-y_test_pred_original.flatten())*100/y_test_original.flatten()))
})

# Print First 20 Rows
print(comparison_df.head(20))


Optimized Weights:
 [-0.0437315   0.01131759  0.5960148  -0.64389609  0.58718547 -1.74043342
 -2.65663108  1.40109385 -3.2485941  -0.67903587  4.82714864 -1.46562389
  0.00827682]
Training MSE: 0.1018, MAE: 0.2022, R²: 0.9282
Test MSE: 0.0447, MAE: 0.1818, R²: 0.9702
    Actual (y_test)  Predicted (y_pred)  Error (%)
0          3.437131            3.217400   6.392878
1          2.967681            3.232247   8.914885
2          5.472730            5.111147   6.606998
3          2.538444            2.438436   3.939751
4          3.892027            3.755649   3.504017
5          3.334042            3.599685   7.967589
6          4.511162            4.817465   6.789906
7          4.313159            4.078242   5.446527
8          2.057608            2.152428   4.608259
9          3.798328            3.671162   3.347942
10         2.974447            2.680606   9.878852
11         2.621114            2.685669   2.462867
12         2.729425            2.700868   1.046297
13         4.22335
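
One point worth noting about the training cell: scaler_X and scaler_y are fitted on the full dataset before train_test_split, so the test rows contribute to the means and standard deviations used during training. A common alternative is to fit the scaling on the training rows only. The sketch below illustrates that idea with plain NumPy on made-up data. The helper names fit_scaler and apply_scaler are hypothetical, and the population standard deviation (ddof=0) is used because that is what StandardScaler computes.

```python
import numpy as np

def fit_scaler(a):
    """Return column means and population standard deviations (ddof=0)."""
    return a.mean(axis=0), a.std(axis=0)

def apply_scaler(a, mean, std):
    """Standardize with statistics computed elsewhere (e.g. on the training rows)."""
    return (a - mean) / std

rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(50, 4))   # made-up stand-in for the 4 features

# Hypothetical 60/40 split by shuffled row index
order = rng.permutation(len(X_demo))
train_idx, test_idx = order[:30], order[30:]

mean, std = fit_scaler(X_demo[train_idx])                  # statistics from training rows only
X_train_demo = apply_scaler(X_demo[train_idx], mean, std)
X_test_demo = apply_scaler(X_demo[test_idx], mean, std)    # reuses the training statistics

print(np.allclose(X_train_demo.mean(axis=0), 0.0))   # training columns are centred
print(X_test_demo.mean(axis=0))                      # near zero, but generally not exactly zero
```

With this arrangement the test rows play no part in choosing the scaling, so the test metrics are a cleaner estimate of performance on unseen data.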

# Evaluation on Entire Dataset
This section evaluates the performance of the trained neural network on the entire dataset (training + testing combined) to give a holistic view of its predictive accuracy.

Key Steps:

- Prediction: The model performs a forward pass through the entire input data to generate predicted outputs in standardized form, which are then inverse-transformed to the original scale.
- Comparison DataFrame: A table is created comparing actual vs predicted values along with the percentage error for each data point.

$$
\text{Error (\%)} = \left| \frac{\text{Actual} - \text{Predicted}}{\text{Actual}} \right| \times 100
$$

Statistics Printed:

- Max Error (%): The largest prediction error across the dataset.
- Min Error (%): The smallest error observed.
- Count of Errors > 10%: Number of data points where the prediction error exceeds 10%, helping identify outlier predictions or areas needing improvement.

This analysis shows the spread and distribution of prediction errors across all samples. Because the training rows are included, it describes overall fit; the test-set metrics reported earlier remain the better guide to generalization.
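
The percentage-error formula and the three statistics can be tried on a few numbers first. The sketch below uses illustrative values that are not taken from the dataset, and the helper name percent_error is hypothetical. It assumes no actual value is zero, since the formula divides by the actual value.

```python
import numpy as np

def percent_error(actual, predicted):
    """Absolute percentage error per sample; assumes no actual value is zero."""
    actual = np.asarray(actual, dtype=float).ravel()
    predicted = np.asarray(predicted, dtype=float).ravel()
    return np.abs((actual - predicted) / actual) * 100

# Illustrative numbers, not taken from the dataset
actual = np.array([4.0, 2.0, 5.0])
predicted = np.array([3.0, 2.1, 5.0])
errors = percent_error(actual, predicted)

print(errors)                      # roughly 25, 5 and 0 percent
print(errors.max(), errors.min())  # largest and smallest error
print(int((errors > 10).sum()))    # 1 sample exceeds the 10% threshold
```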

In [5]:
# Evaluate on the Entire Dataset
mse, mae, r2, y_original, y_pred_original = evaluate_regression(X, y, optimized_weights, scaler_y)
# Create DataFrame to Compare
full_comparison_df = pd.DataFrame({
    'Actual (y)': y_original.flatten(),
    'Predicted (y_pred)': y_pred_original.flatten(),
    'Error (%)': np.abs(((y_original.flatten() - y_pred_original.flatten()) * 100 / y_original.flatten()))
})

# Display Results
print(full_comparison_df.head(50))  # Show first 50 rows

# Error Statistics
print(f"Max Error (%): {full_comparison_df['Error (%)'].max():.2f}")
print(f"Min Error (%): {full_comparison_df['Error (%)'].min():.2f}")
error_above_10_percent = (full_comparison_df['Error (%)'] > 10).sum()
print(f"Count of Errors > 10%: {error_above_10_percent}")

    Actual (y)  Predicted (y_pred)  Error (%)
0     4.858459            4.626446   4.775448
1     3.496159            3.503703   0.215768
2     3.165983            3.240981   2.368889
3     2.966884            2.980358   0.454162
4     4.360603            4.611923   5.763407
5     3.132846            3.493517  11.512551
6     2.967681            3.232247   8.914885
7     2.734263            2.973207   8.738871
8     2.321248            2.438856   5.066580
9     2.057608            2.152428   4.608259
10    4.331673            4.595371   6.087675
11    3.190949            3.481946   9.119424
12    3.021513            3.222331   6.646259
13    2.763243            2.965093   7.304837
14    2.268575            2.434854   7.329662
15    2.009174            2.150695   7.043741
16    4.383091            4.593305   4.796028
17    3.124499            3.480504  11.394011
18    3.057160            3.221096   5.362391
19    2.766603            2.964084   7.138013
20    2.294049            2.434357

# Extracting and Inspecting Neural Network Weights and Biases
This section retrieves and displays the learned parameters of the trained neural network after optimization using the Levenberg–Marquardt algorithm.

Step-by-Step Explanation:
1. Mean and Standard Deviation of Features and Target:
These are read from the fitted StandardScaler objects (scaler_X and scaler_y) after the input features and target variable have been standardized.

They are used for inverse transformation and for interpreting results in original units.
 
2. Weight Extraction Function:
The function extract_weights() decomposes the flattened 1D optimized_weights vector into separate matrices and vectors:

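Input-to-Hidden Weights ($W_1$): shape (features, hidden_neuron), here 4 × 2.

Hidden Layer Biases ($b_1$): shape (1, hidden_neuron), here 1 × 2.

Hidden-to-Output Weights ($W_2$): shape (hidden_neuron, 1), here 2 × 1.

Output Layer Bias ($b_2$): a single scalar, stored as the last element of the vector.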
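
To make the layout of the flat vector concrete, the sketch below packs and unpacks a dummy 13-element vector and checks that the round trip is lossless. The helper names pack_weights and unpack_weights are hypothetical. The slicing is meant to mirror extract_weights(), and the order W1, b1, W2, b2 is the one used by forward_pass.

```python
import numpy as np

def pack_weights(W1, b1, W2, b2):
    """Flatten parameters in the order W1, b1, W2, b2 (hypothetical helper)."""
    return np.concatenate([W1.ravel(), b1.ravel(), W2.ravel(), [b2]])

def unpack_weights(flat, features, hidden_neuron):
    """Inverse of pack_weights, using the same slicing as extract_weights()."""
    i = features * hidden_neuron
    W1 = flat[:i].reshape(features, hidden_neuron)
    b1 = flat[i:i + hidden_neuron].reshape(1, hidden_neuron)
    W2 = flat[i + hidden_neuron:i + 2 * hidden_neuron].reshape(hidden_neuron, 1)
    b2 = flat[-1]
    return W1, b1, W2, b2

features, hidden_neuron = 4, 2
flat = np.arange(13, dtype=float)          # dummy values 0..12 so positions are easy to trace
W1, b1, W2, b2 = unpack_weights(flat, features, hidden_neuron)

print(W1.shape, b1.shape, W2.shape)                        # (4, 2) (1, 2) (2, 1)
print(np.array_equal(pack_weights(W1, b1, W2, b2), flat))  # True
```

Because reshape fills row by row, the first two entries of the flat vector form the first row of $W_1$, which is why the printed optimized weights can be read off in pairs.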



In [None]:
# Get mean and standard deviation of input features
X_means = scaler_X.mean_
X_stds = scaler_X.scale_
# Get mean and standard deviation of target variable
y_mean = scaler_y.mean_[0]
y_std = scaler_y.scale_[0]
print("X Means:", X_means)
print("X Standard Deviations:", X_stds)
print("y Mean:", y_mean)
print("y Standard Deviation:", y_std)

def extract_weights(optimized_weights, features, hidden_neuron):
    # Input-to-hidden weights (W1): shape (features, hidden_neuron)
    W1 = optimized_weights[:features * hidden_neuron].reshape(features, hidden_neuron)
    # Hidden layer biases (b1): shape (1, hidden_neuron)
    b1_start = features * hidden_neuron
    b1_end = b1_start + hidden_neuron
    b1 = optimized_weights[b1_start:b1_end].reshape(1, hidden_neuron)
    # Hidden-to-output weights (W2): shape (hidden_neuron, 1)
    W2_start = b1_end
    W2_end = W2_start + hidden_neuron
    W2 = optimized_weights[W2_start:W2_end].reshape(hidden_neuron, 1)
    # Output bias (b2): scalar
    b2 = optimized_weights[-1]
    return W1, b1, W2, b2

# Extract the weight matrices and bias vectors
W1, b1, W2, b2 = extract_weights(optimized_weights, features, hidden_neuron)

# Print in human-readable form
print("Input-to-Hidden Weights (W1):\n", W1)
print("Hidden Biases (b1):\n", b1)
print("Hidden-to-Output Weights (W2):\n", W2)
print("Output Bias (b2):\n", b2)

X Means: [7.01946721 2.06454918 0.53210656 0.06766475]
X Standard Deviations: [3.82262397 0.71274148 0.21078035 0.03034432]
y Mean: 3.6378581098196467
y Standard Deviation: 1.2040775968079265
Input-to-Hidden Weights (W1):
 [[-0.0437315   0.01131759]
 [ 0.5960148  -0.64389609]
 [ 0.58718547 -1.74043342]
 [-2.65663108  1.40109385]]
Hidden Biases (b1):
 [[-3.2485941  -0.67903587]]
Hidden-to-Output Weights (W2):
 [[ 4.82714864]
 [-1.46562389]]
Output Bias (b2):
 0.008276820143825815
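
With these printed values, a prediction can be reproduced outside the notebook. The sketch below copies the scaler statistics and weights from the output above and evaluates the network on the first row of the dataset. The function name predict_kd is hypothetical. Since the printed numbers are rounded, the result should agree closely, though not to every digit, with the 4.626446 reported for row 0 in the full-dataset comparison.

```python
import numpy as np

# Values copied from the printed output above
X_means = np.array([7.01946721, 2.06454918, 0.53210656, 0.06766475])
X_stds = np.array([3.82262397, 0.71274148, 0.21078035, 0.03034432])
y_mean, y_std = 3.6378581098196467, 1.2040775968079265
W1 = np.array([[-0.0437315, 0.01131759],
               [0.5960148, -0.64389609],
               [0.58718547, -1.74043342],
               [-2.65663108, 1.40109385]])
b1 = np.array([[-3.2485941, -0.67903587]])
W2 = np.array([[4.82714864], [-1.46562389]])
b2 = 0.008276820143825815

def predict_kd(H, B, D, t):
    """Predict Kd in original units from raw feature values (hypothetical helper)."""
    x = (np.array([[H, B, D, t]]) - X_means) / X_stds   # standardize the inputs
    hidden = 1.0 / (1.0 + np.exp(-(x @ W1 + b1)))       # log-sigmoid hidden layer
    y_scaled = hidden @ W2 + b2                         # identity output layer
    return float((y_scaled * y_std + y_mean)[0, 0])     # undo the target scaling

kd = predict_kd(1.625, 1.25, 0.188, 0.0188)   # first row of the dataset
print(round(kd, 3))
```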


# Final Results: Combine, Inverse Transform, and Export
This section prepares a complete dataset containing both the original input features and the predicted vs actual target values (in their original scale), useful for final analysis, visualization, or reporting.

Step-by-Step Explanation:
1. Combine Datasets:

The training and testing sets (X_train, X_test) are vertically stacked using np.vstack() to form a unified input dataset X_all.

Similarly, corresponding predicted and actual outputs are combined.

2. Inverse Transformation of Input Features:

The standardized inputs (X_all) are converted back to their original scale using scaler_X.inverse_transform(), giving you interpretable feature values.

3. Construct Final DataFrame:

A pandas DataFrame is created containing:

Original input features (unstandardized),

Actual output values (Actual Kd),

Predicted output values (Predicted Kd).

4. Save to Excel:

The final DataFrame can be exported to an Excel file (e.g., All_data_results2Neuron.xlsx) using to_excel(). This line is commented out but can be activated for saving.

This step is ideal for final reporting, error analysis, or sharing results with others in a format that’s easy to understand and work with (e.g., in Excel or CSV).
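
Because the rows are stacked with the training set first, the combined table follows the shuffled split order rather than the row order of the original file. If the original order matters, one option is to carry the row indices through the split and sort by them afterwards. The sketch below shows the idea with NumPy on made-up data. The permutation is only a stand-in for the shuffling done by train_test_split, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n_rows = 10
values = np.arange(n_rows, dtype=float).reshape(-1, 1) * 10   # made-up column; row i holds 10*i

# Stand-in for the shuffle performed by train_test_split (60/40)
order = rng.permutation(n_rows)
train_idx, test_idx = order[:6], order[6:]

# Stack the way the export cell does: training rows first, then test rows
stacked = np.vstack((values[train_idx], values[test_idx]))
stacked_idx = np.concatenate((train_idx, test_idx))

# Sorting by the carried index restores the original row order
restore = np.argsort(stacked_idx)
restored = stacked[restore]
print(np.array_equal(restored, values))   # True
```

In the notebook itself, the same effect could be had by passing an index array to train_test_split alongside X and y, since it splits any number of arrays consistently.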

In [52]:
# Reverse scale all data
X_all = np.vstack((X_train, X_test))  # Combine training and test data
X_all_original = scaler_X.inverse_transform(X_all)  # Reverse scale input features
y_all_actual = np.vstack((y_train_original, y_test_original))  # Actual values
y_all_predicted = np.vstack((y_train_pred_original, y_test_pred_original))  # Predicted values

# Create DataFrame with all data
final_df = pd.DataFrame(X_all_original, columns=['H', 'B', 'D', 't'])  # Restore original feature names
final_df['Actual Kd'] = y_all_actual.flatten()  # Add actual y values
final_df['Predicted Kd'] = y_all_predicted.flatten()  # Add predicted y values

# Save to Excel
#final_df.to_excel('All_data_results2Neuron.xlsx', index=False)