# Earthquake Magnitude Prediction using Ensemble Learning


* Ensemble learning is a machine learning method that combines the predictions of multiple models (classifiers or regressors) to improve overall predictive performance. 
* The basic idea behind ensemble learning is that by combining the output of several base models, the ensemble model can often achieve better results than any individual model.
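
The effect of averaging can be illustrated with a small simulation. This is a minimal sketch on synthetic numbers, not the earthquake data: it assumes five hypothetical base models whose errors are independent Gaussian noise, which is an idealised setting that real models only approximate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" magnitudes and five base models whose errors are
# independent Gaussian noise (an idealised assumption, for illustration only).
y_true = rng.uniform(5.5, 7.5, size=10_000)
base_preds = y_true + rng.normal(0.0, 0.5, size=(5, y_true.size))

# MSE of each base model versus the MSE of their simple average.
base_mse = ((base_preds - y_true) ** 2).mean(axis=1)
ensemble_mse = ((base_preds.mean(axis=0) - y_true) ** 2).mean()

print("base MSEs:   ", np.round(base_mse, 3))
print("ensemble MSE:", round(float(ensemble_mse), 3))
```

With fully independent errors, the averaged prediction has roughly one fifth of the error variance of a single model. Real base models make correlated errors, so the gain in practice is usually smaller.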


# Import required modules

These imports provide the tools needed to build, train, and evaluate the ensemble model for earthquake magnitude prediction.

In [1]:
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, VotingRegressor, AdaBoostRegressor
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from sklearn.svm import SVR
from xgboost import XGBRegressor
import lightgbm as lgbm
from sklearn.linear_model import Lasso

# Data Loading and Preprocessing

The code loads your earthquake data from a CSV file named 'resultdata.csv' using pd.read_csv.

In [2]:
# Load the earthquake data (replace 'resultdata.csv' with the path to your dataset)
data = pd.read_csv('resultdata.csv')

# Feature Extraction:

* The features 'Latitude', 'Longitude', 'Depth', and 'Timestamp' are extracted from the dataset and stored in the features array.
* The target variable, 'Magnitude', is stored in the labels array.


In [3]:
# Assumes the dataset has columns 'Latitude', 'Longitude', 'Depth', 'Timestamp', and 'Magnitude'
# Adjust this based on your actual data columns
features = data[['Latitude', 'Longitude','Depth','Timestamp']].values
labels = data['Magnitude'].values
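
Because the extraction depends on exact column names and on every column being numeric, a quick check before extracting can catch mismatches early. The sketch below is one possible way to do that; `check_columns` is a hypothetical helper and the toy DataFrame contains made-up values, not rows from 'resultdata.csv'.

```python
import pandas as pd

REQUIRED_COLUMNS = ['Latitude', 'Longitude', 'Depth', 'Timestamp', 'Magnitude']

def check_columns(df, required=REQUIRED_COLUMNS):
    """Return df[required], raising if a column is missing or not numeric.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    missing = [c for c in required if c not in df.columns]
    if missing:
        raise KeyError(f"Missing columns: {missing}")
    non_numeric = [c for c in required
                   if not pd.api.types.is_numeric_dtype(df[c])]
    if non_numeric:
        raise TypeError(f"Non-numeric columns: {non_numeric}")
    return df[required]

# Toy frame with made-up values, standing in for the real dataset.
toy = pd.DataFrame({
    'Latitude': [35.0, -20.0],
    'Longitude': [139.0, -70.0],
    'Depth': [10.0, 150.0],
    'Timestamp': [1.0e9, 1.2e9],
    'Magnitude': [5.8, 6.3],
})
checked = check_columns(toy)
print(checked.dtypes)
```

On the real data the call would be `check_columns(data)`, placed before the features and labels are extracted.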

# Normalization:

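All four features (latitude, longitude, depth, and timestamp) are normalized to a range between 0 and 1 using the MinMaxScaler from scikit-learn. Scaling mainly benefits the scale-sensitive base models used later (SVR and Lasso); the tree-based models are largely unaffected by it.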

In [4]:
# Normalize all features to a range between 0 and 1
scaler = MinMaxScaler()
scaled_features = scaler.fit_transform(features)
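
To make the scaling step concrete, the sketch below applies the min-max formula, (x - min) / (max - min) per column, by hand and compares it with MinMaxScaler. The three rows are made-up values in the order [Latitude, Longitude, Depth, Timestamp] and are assumptions for illustration only.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up rows of [Latitude, Longitude, Depth, Timestamp]; illustrative only.
demo = np.array([
    [35.0, 139.0, 10.0, 1.0e9],
    [-20.0, -70.0, 150.0, 1.2e9],
    [10.0, 95.0, 30.0, 1.5e9],
])

# Manual min-max scaling, column by column.
col_min = demo.min(axis=0)
col_max = demo.max(axis=0)
manual = (demo - col_min) / (col_max - col_min)

# The same transformation via scikit-learn.
scaled = MinMaxScaler().fit_transform(demo)

print(np.allclose(manual, scaled))
print(scaled.min(axis=0), scaled.max(axis=0))
```

Each column is scaled independently, so depth in kilometres and a timestamp in seconds end up on the same 0 to 1 range.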

# Train-Test Split:
The dataset is split into training and testing sets using train_test_split from scikit-learn, with 80% for training and 20% for testing. Setting random_state=42 makes the split reproducible.

In [5]:
# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(scaled_features, labels, test_size=0.2, random_state=42)
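
Fitting the scaler on the full dataset before splitting lets the minimum and maximum of the test rows influence the training features. A common alternative, sketched below on synthetic data, is to split first and fit the scaler on the training portion only. The `demo_*` and `Xd_*` names are hypothetical and exist only for this example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-ins for the feature matrix and magnitude labels.
rng = np.random.default_rng(42)
demo_features = rng.normal(size=(1000, 4))
demo_labels = rng.uniform(5.5, 7.5, size=1000)

# 1) Split the raw (unscaled) data.
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    demo_features, demo_labels, test_size=0.2, random_state=42
)

# 2) Fit the scaler on the training rows only, then apply it to both parts.
demo_scaler = MinMaxScaler()
Xd_train_scaled = demo_scaler.fit_transform(Xd_train)
Xd_test_scaled = demo_scaler.transform(Xd_test)

print(Xd_train_scaled.shape, Xd_test_scaled.shape)
```

With this ordering, scaled test values can fall slightly outside the 0 to 1 range, which is expected: the scaler has only seen the training rows.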

# Base Models:

* RandomForestRegressor: Averages 100 decision trees, each trained on a bootstrap sample of the data. random_state=42 makes results reproducible.
* GradientBoostingRegressor: Builds 100 decision trees sequentially, each one fitted to the errors of the trees before it. random_state=42 for reproducible results.
* Support Vector Regressor (SVR): Predicts magnitudes using support vector regression with an RBF kernel. C=1.0 sets the regularization strength and epsilon=0.2 sets the width of the tube within which errors are not penalized.
* XGBRegressor: Gradient-boosted trees from the XGBoost library, using 100 trees. random_state=42 for reproducible results.
* AdaBoostRegressor: Combines up to 100 weak learners, giving more weight to the training samples that earlier learners predicted poorly. random_state=42 for reproducible results.
* LGBMRegressor: Gradient-boosted trees from the LightGBM library, using 100 trees. random_state=42 for reproducible results.
* Lasso Regression: L1-regularized linear regression for magnitude prediction. alpha=0.1 sets the strength of the L1 penalty, which shrinks coefficients and can set some of them to zero.


In [6]:
# Define the base models
rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)
gb_regressor = GradientBoostingRegressor(n_estimators=100, random_state=42)
svr_regressor = SVR(kernel='rbf', C=1.0, epsilon=0.2)
xgb_regressor = XGBRegressor(n_estimators=100, random_state=42)
ada_regressor = AdaBoostRegressor(n_estimators=100, random_state=42)
lgbm_regressor = lgbm.LGBMRegressor(n_estimators=100, random_state=42)
lasso_regressor = Lasso(alpha=0.1)
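
Before combining the base models, it can help to see how each performs alone. The sketch below scores a few of them with 5-fold cross-validation on synthetic data. To stay dependency-light it uses only the scikit-learn models and 50 trees instead of 100; the synthetic target and the `candidates` dictionary are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

# Synthetic features in [0, 1] and a made-up magnitude-like target.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 1, size=(300, 4))
y_demo = (5.5 + X_demo[:, 2] + 0.5 * np.sin(6 * X_demo[:, 0])
          + rng.normal(0, 0.1, size=300))

candidates = {
    'random_forest': RandomForestRegressor(n_estimators=50, random_state=42),
    'gradient_boosting': GradientBoostingRegressor(n_estimators=50, random_state=42),
    'svr': SVR(kernel='rbf', C=1.0, epsilon=0.2),
    'lasso': Lasso(alpha=0.1),
}

# scikit-learn reports negated MSE so that higher is better; flip the sign.
cv_mse = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X_demo, y_demo, cv=5,
                             scoring='neg_mean_squared_error')
    cv_mse[name] = -scores.mean()
    print(f"{name:>18}: CV MSE = {cv_mse[name]:.4f}")
```

One thing such a comparison may reveal: on features scaled to the 0 to 1 range, alpha=0.1 can be a strong enough penalty to shrink every Lasso coefficient to zero, leaving a constant prediction. Inspecting coef_ after fitting would show whether that happens on the real data.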

# Ensemble Model:

* An ensemble model is created using the VotingRegressor from scikit-learn. This ensemble model combines the predictions of the previously defined individual regressors.
* Each individual regressor is provided a name and is included in the ensemble.
* Because no weights are specified, the VotingRegressor combines the predictions by taking their simple (unweighted) average.

In [7]:
# Define the Ensemble model
ensemble_regressor = VotingRegressor(estimators=[
    ('random_forest', rf_regressor),
    ('gradient_boosting', gb_regressor),
    ('svr', svr_regressor),
    ('xgboost', xgb_regressor),
    ('adaboost', ada_regressor),
    ('lightgbm', lgbm_regressor),
    ('lasso', lasso_regressor)
])
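
The averaging behaviour can be checked directly. The sketch below uses two lightweight stand-in estimators on synthetic data (not the seven models above) and relies on VotingRegressor.transform, which returns each fitted estimator's predictions as one column per estimator. The `weights=[3, 1]` values are arbitrary and only show how a weighted average would be requested.

```python
import numpy as np
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic data standing in for the scaled earthquake features.
rng = np.random.default_rng(1)
X_toy = rng.uniform(0, 1, size=(200, 4))
y_toy = 5.5 + 2.0 * X_toy[:, 2] + rng.normal(0, 0.1, size=200)

def make_estimators():
    """Fresh stand-in base models (hypothetical, for illustration)."""
    return [
        ('linear', LinearRegression()),
        ('tree', DecisionTreeRegressor(max_depth=3, random_state=42)),
    ]

# Default: every estimator gets the same weight.
voter = VotingRegressor(estimators=make_estimators()).fit(X_toy, y_toy)
per_model = voter.transform(X_toy)  # shape (n_samples, n_estimators)
uniform_avg = per_model.mean(axis=1)
print(np.allclose(voter.predict(X_toy), uniform_avg))

# Optional: weights turn the combination into a weighted average.
weighted_voter = VotingRegressor(
    estimators=make_estimators(), weights=[3, 1]
).fit(X_toy, y_toy)
weighted_avg = np.average(weighted_voter.transform(X_toy), axis=1, weights=[3, 1])
print(np.allclose(weighted_voter.predict(X_toy), weighted_avg))
```

Weights are usually chosen from validation performance rather than by hand; with the default of no weights, a weak base model pulls the average just as strongly as a good one.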

# Model Training:

The ensemble model is trained on the training data using the fit method.

In [8]:
ensemble_regressor.fit(X_train, y_train)

[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.001153 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 1020
[LightGBM] [Info] Number of data points in the train set: 17563, number of used features: 4
[LightGBM] [Info] Start training from score 5.869838


# Making Predictions:

* The trained ensemble is used to make predictions on the test data, and the predicted earthquake magnitudes are stored in the y_pred array.

In [9]:
# Make predictions using the ensemble model
y_pred = ensemble_regressor.predict(X_test)

# Model Evaluation:

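* The Mean Squared Error (MSE), the average of the squared differences between predicted and actual magnitudes, is calculated to evaluate the ensemble model on the test set.
* MSE is a common metric for regression models; lower values indicate a better fit, and its square root (RMSE) is expressed in the same units as the magnitude.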


In [10]:
# Calculate Mean Squared Error as a metric
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

Mean Squared Error: 0.1704475585642093
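
A single MSE value is easier to interpret next to other metrics and a trivial baseline. The sketch below computes MSE, RMSE, MAE, and R² for a handful of made-up magnitudes and compares them against a baseline that always predicts the mean. `regression_report` is a hypothetical helper, and the toy arrays are not results from the model above.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

def regression_report(y_true, y_hat):
    """Collect several regression metrics (hypothetical helper)."""
    mse_value = mean_squared_error(y_true, y_hat)
    return {
        'mse': float(mse_value),
        'rmse': float(np.sqrt(mse_value)),  # same units as the magnitude
        'mae': float(mean_absolute_error(y_true, y_hat)),
        'r2': float(r2_score(y_true, y_hat)),
    }

# Made-up true and predicted magnitudes, for illustration only.
y_demo_true = np.array([5.6, 6.1, 5.9, 7.0, 6.4])
y_demo_pred = np.array([5.8, 6.0, 6.0, 6.5, 6.3])

report = regression_report(y_demo_true, y_demo_pred)

# Baseline: always predict the mean magnitude.
baseline_pred = np.full_like(y_demo_true, y_demo_true.mean())
baseline = regression_report(y_demo_true, baseline_pred)

print("model:   ", report)
print("baseline:", baseline)
```

In the notebook itself, the equivalent calls would be `regression_report(y_test, y_pred)` and a baseline built from `y_train.mean()`. If the ensemble's MSE is not clearly below the baseline's, the features are adding little beyond the average magnitude.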
