## **Project Title: Flood Risk Prediction in India**

### **Problem Statement:**

Floods in India cause severe damage to life, property, and agriculture every year. Early prediction of flood risk based on rainfall, river discharge, and historical trends is essential for effective disaster preparedness and management.

### **Description:**
The project uses rainfall and river data to predict flood risk across Indian regions with machine learning models. It aims to map high-risk zones, identify seasonal patterns, and support disaster management through data-driven decision-making.
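To make the modelling goal concrete, here is a minimal sketch of what a baseline classifier for this kind of data could look like. It does not use the project's dataset: the rows are synthetic, the labelling rule is hypothetical, and the choice of a random forest is an assumption for illustration only. The column names are borrowed from the dataset loaded later in the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data (NOT the real dataset); only the column names
# follow the notebook's CSV.
rng = np.random.default_rng(42)
n = 500
demo = pd.DataFrame({
    "Rainfall (mm)": rng.uniform(0, 300, n),
    "River Discharge (m³/s)": rng.uniform(0, 5000, n),
    "Water Level (m)": rng.uniform(0, 10, n),
    "Elevation (m)": rng.uniform(0, 3000, n),
    "Land Cover": rng.choice(["Forest", "Agricultural", "Desert", "Water Body"], n),
})

# Hypothetical labelling rule, used only so the demo has a target column.
demo["Flood Occurred"] = (
    (demo["Rainfall (mm)"] > 150) & (demo["River Discharge (m³/s)"] > 2500)
).astype(int)

X = demo.drop(columns="Flood Occurred")
y = demo["Flood Occurred"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# One-hot encode the categorical column; pass numeric columns through unchanged.
preprocess = ColumnTransformer(
    [("land_cover", OneHotEncoder(handle_unknown="ignore"), ["Land Cover"])],
    remainder="passthrough",
)
model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", RandomForestClassifier(n_estimators=100, random_state=42)),
])
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Hold-out accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here says nothing about real flood prediction, because the synthetic labels follow a simple rule by construction. The sketch only illustrates the shape of the workflow: split, preprocess, fit, evaluate.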

In [5]:
# Install the Kaggle client and supporting libraries in case they are missing
!pip install kaggle numpy pandas scikit-learn matplotlib seaborn



In [6]:
# Import the basic libraries needed to load and explore the dataset; more will be added as the analysis grows
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [7]:
# Loading the dataset
df = pd.read_csv("/flood_risk_dataset_india.csv")

# Display the first seven rows
print(df.head(7))

    Latitude  Longitude  Rainfall (mm)  Temperature (°C)  Humidity (%)  \
0  18.861663  78.835584     218.999493         34.144337     43.912963   
1  35.570715  77.654451      55.353599         28.778774     27.585422   
2  29.227824  73.108463     103.991908         43.934956     30.108738   
3  25.361096  85.610733     198.984191         21.569354     34.453690   
4  12.524541  81.822101     144.626803         32.635692     36.292267   
5  12.523841  93.105329     221.571312         36.006300     39.380945   
6   9.684425  68.931178     288.362370         39.766935     40.436802   

   River Discharge (m³/s)  Water Level (m)  Elevation (m)    Land Cover  \
0             4236.182888         7.415552     377.465433    Water Body   
1             2472.585219         8.811019    7330.608875        Forest   
2              977.328053         4.631799    2205.873488  Agricultural   
3             3683.208933         2.891787    2512.277800        Desert   
4             2093.390678       
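The preview above wraps across several blocks because the frame is wider than the default display width. As a small sketch of two ways to get a more readable preview, the example below uses a tiny placeholder frame with made-up values; only the column names come from the dataset.

```python
import pandas as pd

# Tiny placeholder frame (made-up values); column names follow the dataset.
demo = pd.DataFrame({
    "Latitude": [10.0, 20.0],
    "Longitude": [70.0, 80.0],
    "Rainfall (mm)": [100.0, 200.0],
    "Temperature (°C)": [25.0, 30.0],
    "Humidity (%)": [40.0, 50.0],
    "River Discharge (m³/s)": [1000.0, 2000.0],
    "Water Level (m)": [3.0, 6.0],
    "Elevation (m)": [100.0, 500.0],
    "Land Cover": ["Forest", "Desert"],
})

# Option 1: to_string() renders every column on a single line per row,
# so the output is not split into backslash-continued blocks.
unwrapped = demo.to_string()
print(unwrapped)

# Option 2: transpose the preview so each column becomes a row,
# which is often easier to scan when there are many columns.
transposed = demo.T
print(transposed)
```

In the notebook itself, the equivalent calls would be `print(df.head(7).to_string())` or `print(df.head(7).T)`.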

In [8]:
# Explore the basic structure of the flood risk dataset
print("Dataset Info:")
df.info()  # info() prints its report directly and returns None, so it is not wrapped in print()

print("\n\nDataset Summary:")
print(df.describe())

print("\n\nMissing Values:")
print(df.isnull().sum())

Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10000 entries, 0 to 9999
Data columns (total 14 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Latitude                10000 non-null  float64
 1   Longitude               10000 non-null  float64
 2   Rainfall (mm)           10000 non-null  float64
 3   Temperature (°C)        10000 non-null  float64
 4   Humidity (%)            10000 non-null  float64
 5   River Discharge (m³/s)  10000 non-null  float64
 6   Water Level (m)         10000 non-null  float64
 7   Elevation (m)           10000 non-null  float64
 8   Land Cover              10000 non-null  object 
 9   Soil Type               10000 non-null  object 
 10  Population Density      10000 non-null  float64
 11  Infrastructure          10000 non-null  int64  
 12  Historical Floods       10000 non-null  int64  
 13  Flood Occurred          10000 non-null  int64  
dtypes: float64(9), int64(3), 