BANSAM

Introduction

Overview

This model uses a Deep Neural Network (DNN) to predict flooding in Samarinda City. It takes 9 input features and outputs probabilities across three labels: Safe, Alert, and Danger, helping people in Samarinda City understand the current level of flood risk.

Problem Statement

What are the main weather factors influencing “danger” conditions in Samarinda, such as rainfall, humidity, air pressure, or wind speed?
What type of weather conditions (condition_type) most often occur before, during, and after a flood, and how does the combination of humidity and air pressure affect visibility in each of these conditions?
How can data-driven insights improve community awareness and preparedness for flood risks in Samarinda?

Dataset

Variable

ID	City	Temperature (°C)	Humidity (%)	Pressure (hPa)	Wind Speed (m/s)	Wind Direction (°)	Cloudiness (%)	Visibility (m)	Description	Condition Type	Timestamp
3	Samarinda	24.38	93	1010	1.54	200	72	10000	broken clouds	Clouds	2024-10-25 13:09:08
4	Samarinda	24.41	93	1011	1.32	201	55	10000	broken clouds	Clouds	2024-10-25 14:09:08
5	Samarinda	24.34	92	1010	1.47	207	60	10000	broken clouds	Clouds	2024-10-25 15:09:08

For the full dataset, click here.

Preprocessing Steps

Separating Data
In this step, we separate the rows whose value in the rain column is greater than 0.
Labelling
Next, we label the data based on several considerations.
Oversampling
Finally, because the class distribution is skewed, we apply oversampling using the SMOTE technique.
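
The oversampling step can be illustrated with a minimal sketch of the idea behind SMOTE: new minority-class rows are created by interpolating between an existing row and one of its nearest neighbours in the same class. This is a simplified illustration and not the project's code; in practice a library implementation such as imbalanced-learn's `SMOTE` is the usual choice. The function name `smote_like_oversample`, the value of `k`, and the sample rows are all assumptions made for the example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class rows: (humidity %, pressure hPa).
danger_rows = [[95.0, 1008.0], [97.0, 1007.0], [96.0, 1009.0]]
new_rows = smote_like_oversample(danger_rows, n_new=4)
```

Because every synthetic row lies on a segment between two real rows, the new values stay within the range already observed for the minority class.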

EDA

From the correlation matrix above, several pairs of columns have correlation values close to 1 (strongly related):

Wind Speed & Temperature
Rain & Condition Type
Rain & Description
Cloudiness & Description
Description & Condition Type
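
As a rough illustration of how one entry of such a matrix is obtained, the sketch below computes a Pearson correlation coefficient between two columns. It assumes the matrix uses Pearson correlation and that categorical columns such as Condition Type were converted to integer codes first; neither detail is stated here, and the sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical columns: rain amount and an integer-encoded condition type.
rain = [0.0, 0.0, 1.2, 3.5, 0.4]
condition_code = [0, 0, 1, 1, 1]
r = pearson(rain, condition_code)
```

A value near 1 means the two columns rise together, while a value near 0 means no linear relationship.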

The 'aman' (Safe) category is the most dominant at 87.6%, followed by the 'waspada' (Alert) category at 7.9%, while the 'bahaya' (Danger) category has the smallest share at 4.5%.
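
The class shares can be computed by counting labels, as in the short sketch below. The label list is hypothetical and is sized only so that it mirrors the percentages reported here; the real row counts are not given.

```python
from collections import Counter


def class_percentages(labels):
    """Return each label's share of the dataset as a percentage."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical label column sized to mirror the reported shares.
labels = ["aman"] * 876 + ["waspada"] * 79 + ["bahaya"] * 45
shares = class_percentages(labels)
```

An imbalance of this size is what motivates oversampling the two smaller classes before training.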

Model

Architecture

We use a DNN as our machine learning model, with 9 input features, 1 normalization layer, 4 hidden layers, and a softmax output layer.
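
To make the data flow through this architecture concrete, here is a minimal forward-pass sketch in plain Python. It is not the trained model: the hidden layer widths, the ReLU activation, the random placeholder weights, and the sample feature vector with its normalization statistics are all assumptions chosen for illustration.

```python
import math
import random

LABELS = ["Safe", "Alert", "Danger"]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Convert logits to probabilities that sum to 1 (shifted for stability)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random placeholder weights; a trained model would load learned values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_features=9, hidden=(32, 16, 16, 8), n_classes=3, seed=0):
    # Hidden widths are assumed; only the layer count comes from the text.
    rng = random.Random(seed)
    sizes = [n_features, *hidden, n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict(features, means, stds, layers):
    # Normalization layer: scale each feature to zero mean, unit variance.
    x = [(f - m) / s for f, m, s in zip(features, means, stds)]
    # Four hidden layers with ReLU, then a linear layer feeding softmax.
    for weights, biases in layers[:-1]:
        x = relu(dense(x, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(x, weights, biases))


layers = build_network()
# Hypothetical feature vector and normalization statistics (9 values each).
sample = [24.38, 93, 1010, 1.54, 200, 72, 10000, 0.0, 1.0]
means = [26.0, 85.0, 1010.0, 2.0, 180.0, 60.0, 9000.0, 0.5, 1.0]
stds = [2.0, 8.0, 2.0, 1.0, 90.0, 25.0, 1500.0, 1.0, 1.0]
probs = predict(sample, means, stds, layers)
prediction = dict(zip(LABELS, probs))
```

The softmax layer is what turns the final three scores into probabilities, so the values for Safe, Alert, and Danger always sum to 1.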

Performance Metrics

A significant improvement is observed around epoch 44, with training accuracy reaching 95% and validation accuracy reaching 94%.

Results

After training, we made predictions using the exported model trained on the most recent data. In total, 31 data points were mispredicted.
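
A misprediction count like this one can be obtained by taking the most probable class for each row and comparing it with the true label. The sketch below shows the idea on hypothetical outputs; the class order in `CLASS_NAMES` and all probability values are assumptions, not the model's actual results.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def count_mispredictions(probabilities, true_labels, class_names):
    """Count rows whose most probable class differs from the true label."""
    wrong = 0
    for probs, truth in zip(probabilities, true_labels):
        if class_names[argmax(probs)] != truth:
            wrong += 1
    return wrong


CLASS_NAMES = ["aman", "waspada", "bahaya"]
# Hypothetical model outputs (one probability row per sample) and labels.
probabilities = [
    [0.90, 0.07, 0.03],
    [0.20, 0.65, 0.15],
    [0.55, 0.30, 0.15],
    [0.05, 0.15, 0.80],
]
true_labels = ["aman", "waspada", "waspada", "bahaya"]
errors = count_mispredictions(probabilities, true_labels, CLASS_NAMES)
accuracy = 1 - errors / len(true_labels)
```

In this toy example only the third row is wrong, since its highest probability falls on 'aman' while the true label is 'waspada'.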
