<a href="https://colab.research.google.com/github/Byzon777/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code/blob/main/Disaster_prediciting_dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Disaster-Ready Community: AI-Powered Disaster Risk Assessment and Prediction

The project's goal is to build a model that predicts the disaster risk index (World Risk Index, WRI) for various countries.

This model is designed to accomplish that by:

*   Utilizing Relevant Features: Features like exposure, vulnerability, susceptibility, and coping/adaptive capacities are well-aligned with understanding disaster risk.
*   Handling Real-World Data: Standardization and batch normalization keep feature scales and layer activations consistent, which makes training less sensitive to the uneven ranges often found in real-world data. Dropout is not used here but could be added as an extension.
*   Evaluating Predictive Accuracy: MAE, MSE, and R2, computed on held-out test data, give complementary views of the model’s prediction error and of how much of the variance in WRI it explains.

This neural network model with its high accuracy metrics can serve as a reliable tool for assessing disaster risk index predictions across different countries. It provides stakeholders and policymakers with valuable insights, helping them prioritize resources, enhance resilience, and make informed decisions to mitigate disaster risks globally.

In [None]:
!pip install --upgrade tensorflow

Collecting tensorflow
  Downloading tensorflow-2.18.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.1 kB)
Collecting tensorboard<2.19,>=2.18 (from tensorflow)
  Downloading tensorboard-2.18.0-py3-none-any.whl.metadata (1.6 kB)
Collecting keras>=3.5.0 (from tensorflow)
  Downloading keras-3.6.0-py3-none-any.whl.metadata (5.8 kB)
Downloading tensorflow-2.18.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (615.3 MB)
Downloading keras-3.6.0-py3-none-any.whl (1.2 MB)
Downloading tensorboard-2.18.0-py3-none-any.whl (5.5 MB)
Installing collected packages: tensorboard, keras, tensorflow
  At

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, BatchNormalization
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import MeanAbsoluteError, MeanSquaredError

from sklearn.metrics import mean_squared_error, r2_score

In [None]:
import pandas as pd
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
from matplotlib import style
from functools import reduce
import os


Importing essential libraries for data handling (Pandas), data splitting and scaling (Scikit-Learn), and neural network construction (TensorFlow/Keras).
These libraries are well-suited for building and evaluating machine learning models, allowing efficient data manipulation and model training.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


Loading the dataset and selecting the numerical columns used to predict the target variable, WRI. Missing values are replaced with 0 so that training does not fail on NaN inputs. Restricting the inputs to these five risk components keeps the model small and focused on features directly related to the index. Keep in mind that filling with 0 treats a missing entry as a genuine zero, so it is worth checking how many values are affected before relying on the predictions.

In [None]:
# Loading the dataset
file_path = '/content/drive/MyDrive/Research CS-AI/HolyokeHack/Data/world_risk_index.csv'  # Adjust path as necessary
world_risk_data = pd.read_csv(file_path)

In [None]:
numerical_data = world_risk_data[
    ['Exposure', 'Vulnerability', 'Susceptibility', 'Lack of Coping Capabilities', ' Lack of Adaptive Capacities']
].fillna(0)  # Fill missing values with 0
target = world_risk_data['WRI'].fillna(0)  # Handle missing target values if any
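Before filling with 0, it can help to see how many values are actually missing and to compare against a gentler strategy. The sketch below is only an illustration on a tiny made-up frame (the numbers are not from the dataset, and only two of the feature columns are reproduced): it counts missing values, drops rows with a missing target, and fills missing features with the column median.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame; the values are illustrative, not taken from the dataset.
df = pd.DataFrame({
    "Exposure": [12.0, np.nan, 20.0, 8.0],
    "Vulnerability": [50.0, 40.0, np.nan, 60.0],
    "WRI": [6.0, 4.0, 9.0, np.nan],
})

# Count missing values per column before deciding how to handle them.
missing_per_column = df.isna().sum()
print(missing_per_column)

# Drop rows whose target is missing: a WRI of 0 would be a made-up label.
labelled = df.dropna(subset=["WRI"])

# Fill missing features with the column median instead of 0.
feature_cols = ["Exposure", "Vulnerability"]
features = labelled[feature_cols].fillna(labelled[feature_cols].median())
print(features)
```

Whether this changes the results depends on how many entries are missing in the real file; if there are none, both approaches are equivalent.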

Standardizing the input features to have a mean of 0 and a standard deviation of 1 using StandardScaler.

Standardization helps the neural network converge faster and perform better by ensuring that all features contribute equally during training. This is particularly important in neural networks, where features with larger ranges could dominate others.

In [None]:
# Data Transformation - Scaling
scaler = StandardScaler()
X = scaler.fit_transform(numerical_data)
y = target.values
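To make the transformation concrete, the following sketch reproduces what StandardScaler computes, z = (x - mean) / std per column, on a small made-up matrix and checks that the manual result matches. The array values are assumptions chosen only to show two features on very different scales.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up feature matrix: 4 samples, 2 features on very different scales.
features = np.array([[1.0, 100.0],
                     [2.0, 300.0],
                     [3.0, 500.0],
                     [4.0, 700.0]])

# z = (x - mean) / std, computed column by column.
# StandardScaler uses the population standard deviation (ddof=0),
# which is also NumPy's default for std().
manual = (features - features.mean(axis=0)) / features.std(axis=0)

scaled = StandardScaler().fit_transform(features)

print(np.allclose(manual, scaled))
print(scaled.mean(axis=0), scaled.std(axis=0))
```

After scaling, both columns have mean 0 and standard deviation 1, so neither dominates purely because of its units.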

Splitting the dataset into training and testing sets, with 80% for training and 20% for testing. This split is crucial for evaluating the model's performance on unseen data, allowing us to assess its generalization ability. Setting random_state=42 makes the split reproducible.

In [None]:
# Splitting data into training and testing subsets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
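The notebook scales the full dataset before splitting, which lets the test rows influence the mean and standard deviation used for scaling. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows that ordering on synthetic stand-in data; the `raw_features` and `raw_target` arrays are assumptions made for illustration, not the notebook's data.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the five feature columns and the target.
rng = np.random.default_rng(42)
raw_features = rng.normal(loc=50.0, scale=10.0, size=(100, 5))
raw_target = raw_features[:, 0] * 0.1 + rng.normal(size=100)

# Split first, so the test rows stay unseen during preprocessing.
X_train_raw, X_test_raw, y_train, y_test = train_test_split(
    raw_features, raw_target, test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train_raw)  # statistics from training rows only
X_test = scaler.transform(X_test_raw)        # reuse the training mean and std

print(X_train.shape, X_test.shape)
```

With a dataset of this kind the difference in metrics is often small, but the ordering keeps the test evaluation strictly out-of-sample.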

Defining a sequential neural network model with several layers:
1. Dense Layers: Fully connected layers with ReLU activation to capture nonlinear relationships.
2. Batch Normalization Layers: Added to stabilize training and improve convergence.
3. Output Layer: A single node without activation to output continuous values for regression.

This model architecture is suitable for regression tasks, as the stacked ReLU layers can capture nonlinear relationships in the data. Batch normalization keeps layer activations in a stable range, which reduces the risk of exploding values and NaN losses during training.

In [None]:
# Model Initialization
model = Sequential([
    Dense(32, activation='relu', input_shape=(X_train.shape[1],)),
    BatchNormalization(),  # Adding batch normalization for stability
    Dense(16, activation='relu'),
    BatchNormalization(),
    Dense(8, activation='relu'),
    Dense(1)  # Single output node for regression
])

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
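As a rough sanity check on the size of this network, the parameter count can be worked out by hand. The sketch below assumes five input features (one per selected column) and the layer widths listed above; if those assumptions hold, the totals should agree with what `model.summary()` reports.

```python
# Layer widths from the model above; 5 inputs assumed from the five feature columns.
n_inputs = 5
dense_units = [32, 16, 8, 1]
batch_norm_after = {32, 16}  # Dense widths that are followed by BatchNormalization

total = 0
non_trainable = 0
fan_in = n_inputs
for units in dense_units:
    total += fan_in * units + units  # Dense layer: weights + biases
    if units in batch_norm_after:
        # BatchNormalization: gamma and beta are trainable,
        # moving mean and moving variance are not.
        total += 4 * units
        non_trainable += 2 * units
    fan_in = units

print("total parameters:", total)
print("trainable parameters:", total - non_trainable)
```

A network of roughly a thousand parameters is small, which is consistent with the fast per-epoch training times.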


Compiling the model: the optimizer is Adam with a learning rate of 0.001. Adam scales each parameter's update using running estimates of the gradient's mean and variance, which usually speeds up convergence. The chosen loss function is Mean Squared Error (MSE), a standard choice for regression. MAE and MSE are tracked as metrics during training and evaluation.

MSE is a common loss function for regression tasks, and Adam is a robust optimizer that adjusts learning rates dynamically. These choices ensure that the model can learn effectively from the dataset.

In [None]:
# Compiling the model with the Adam optimizer
optimizer = Adam(learning_rate=0.001)  # 0.001 is also the Keras default for Adam
model.compile(optimizer=optimizer, loss='mse', metrics=[MeanAbsoluteError(), MeanSquaredError()])
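To illustrate what "adjusting learning rates dynamically" means, here is a minimal NumPy sketch of the textbook Adam update rule applied to a toy one-parameter problem. The function name `adam_step` and the toy objective are made up for this example; this is not the Keras implementation, only the same idea in a few lines.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One textbook Adam update for a parameter array; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for the early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w = np.array([0.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 5001):
    grad = 2 * (w - 3.0)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.01)

print(w)
```

Because the step is divided by the square root of the running squared gradient, each update is roughly the size of the learning rate regardless of how large the raw gradient is.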

Training the model for 200 epochs with a batch size of 16, using the test set as validation data.
The epoch count and batch size balance training time against convergence. Validating on X_test after every epoch shows how the model performs on data it is not trained on, which makes overfitting visible when the validation loss stops improving while the training loss keeps falling.
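One way to act on the validation feedback is early stopping: halt training once the validation loss has not improved for a set number of epochs. Keras provides this through the `EarlyStopping` callback. The plain-Python sketch below only illustrates the underlying rule; the helper name `best_epoch_with_patience` and the loss values are made up for this example.

```python
def best_epoch_with_patience(val_losses, patience=10):
    """Return (stop_epoch, best_epoch), both 1-based, for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop here.
            return epoch, best_epoch
    # Ran through every epoch without triggering the stop rule.
    return len(val_losses), best_epoch

# Made-up validation curve that improves, then drifts upward.
losses = [10.0, 6.0, 4.0, 3.5, 3.6, 3.7, 3.9, 3.8]
print(best_epoch_with_patience(losses, patience=3))
```

Note that if the test set drives decisions such as when to stop, it is no longer a fully independent check; a separate validation split taken from the training data avoids that.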


In [None]:
# Training the model
history = model.fit(X_train, y_train, epochs=200, batch_size=16, validation_data=(X_test, y_test))

# Evaluating the accuracy and effectiveness of the model
test_loss, test_mae, test_mse = model.evaluate(X_test, y_test)
y_pred = model.predict(X_test)  # predictions have shape (n_samples, 1)
r2 = r2_score(y_test, y_pred)
rmse_rnn = np.sqrt(mean_squared_error(y_test, y_pred))  # RMSE = sqrt(MSE)


print("Test Loss (MSE):", test_loss)
print("Mean Absolute Error:", test_mae)
print("Mean Squared Error:", test_mse)
print("R-squared Score:", r2)
print("RMSE:", rmse_rnn)


Epoch 1/200
96/96 ━━━━━━━━━━━━━━━━━━━━ 8s 9ms/step - loss: 78.8238 - mean_absolute_error: 7.4324 - mean_squared_error: 78.8238 - val_loss: 61.7400 - val_mean_absolute_error: 6.4338 - val_mean_squared_error: 61.7400
Epoch 2/200
96/96 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - loss: 32.2002 - mean_absolute_error: 4.7595 - mean_squared_error: 32.2002 - val_loss: 22.0395 - val_mean_absolute_error: 3.9443 - val_mean_squared_error: 22.0395
Epoch 3/200
96/96 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - loss: 15.7004 - mean_absolute_error: 3.2572 - mean_squared_error: 15.7004 - val_loss: 9.1936 - val_mean_absolute_error: 2.1084 - val_mean_squared_error: 9.1936
Epoch 4/200
96/96 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - loss: 10.3087 - mean_absolute_error: 2.6323 - mean_squared_error: 10.3087 - val_loss: 5.7184 - val_mean_absolute_error: 1.7183 - val_mean_squared_error: 5.7184
Epoch 5/200
96/9

# Explanation of the accuracy measures and model performance

1. MSE measures the average squared difference between predicted and actual values, where smaller values indicate better model accuracy. Here, an MSE of 0.3445 is quite low, suggesting that on average, the squared deviations from the actual WRI values are very small. This is a positive indicator, as it suggests the model is making highly accurate predictions.
2. MAE represents the average magnitude of errors in the predictions, measured in the same units as the target variable (WRI). With an MAE of 0.2992, the model’s average prediction error is less than 0.3, which is very low for a target variable around 20–30 in scale. This further supports the model’s high accuracy, indicating that predictions are, on average, only slightly off from actual values.
3. The R2 score measures the proportion of the variance in the dependent variable that is predictable from the independent variables. An R2 of 0.9902 indicates that the model explains 99.02% of the variance in WRI, which is exceptionally high. This suggests the model has captured the relationships between input features and the target variable very well.
4. RMSE is the square root of MSE (√0.3445 ≈ 0.5869), providing an error measure on the same scale as WRI. An RMSE of 0.5869 means the typical prediction error is well under 1 WRI unit. Because RMSE weights large errors more heavily than MAE, the gap between the two (0.5869 versus 0.2992) suggests that a few predictions are off by noticeably more than the average.
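The four measures above can be computed directly from their definitions. The sketch below does so with NumPy on made-up actual and predicted values (not the model's output), to show how MSE, MAE, RMSE, and R2 relate to one another.

```python
import numpy as np

# Made-up actual and predicted values, for illustration only.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_hat = np.array([2.5, 5.0, 8.0, 9.5])

errors = y_true - y_hat
mse = np.mean(errors ** 2)       # average squared error
mae = np.mean(np.abs(errors))    # average absolute error
rmse = np.sqrt(mse)              # back on the scale of the target
# R2 = 1 - (residual sum of squares) / (total sum of squares around the mean)
r2 = 1 - np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)

print(mse, mae, rmse, r2)
```

RMSE is never smaller than MAE on the same predictions, and the two are equal only when every error has the same magnitude.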