PROBLEM STATEMENT:

Landslides are a major natural hazard that endangers lives and damages
infrastructure and the environment. Predicting landslide-prone areas from
environmental and geological data can support early warning systems
and disaster management planning.

DESCRIPTION:

This project aims to build a machine learning model (Decision Tree/Random Forest)
that predicts whether a landslide occurs, using features such as slope, precipitation,
elevation, lithology, and NDVI. The target is the binary Landslide column (0 or 1).
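
As a rough illustration of the workflow described above, the sketch below trains both model types and compares their test accuracy. It runs on synthetic stand-in data, not the real landslide dataset: the rule that generates the target, the 80/20 split, and the hyperparameters (max_depth=5, n_estimators=100) are assumptions chosen only to make the example self-contained.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption, NOT the real landslide dataset):
# ordinal features coded 1..5 and a binary target tied to two of them.
rng = np.random.default_rng(0)
feature_names = ["Slope", "Precipitation", "Elevation", "Lithology", "NDVI"]
X = pd.DataFrame(
    rng.integers(1, 6, size=(400, len(feature_names))), columns=feature_names
)
y = ((X["Slope"] + X["Precipitation"]) >= 7).astype(int)

# Hold out 20% of the rows, keeping the class ratio the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Hyperparameters here are illustrative guesses, not tuned values.
models = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

To apply the same steps to the real data, X would hold every column except Landslide and y would be the Landslide column.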

In [10]:

# Data handling
import numpy as np
import pandas as pd

# Plotting
import matplotlib.pyplot as plt
import seaborn as sns

# Model training and evaluation
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report


In [9]:
dataset = pd.read_csv("/content/LandslidePrediction.csv")

In [11]:
dataset

,Landslide,Aspect,Curvature,Earthquake,Elevation,Flow,Lithology,NDVI,NDWI,Plan,Precipitation,Profile,Slope
0,0,3,3,2,2,2,1,4,2,2,3,3,2
1,0,1,5,2,3,1,1,4,2,5,5,2,2
2,0,3,4,3,2,2,4,3,2,4,5,2,2
3,0,1,3,3,3,5,1,2,4,3,5,3,3
4,0,5,4,2,1,4,1,2,4,3,3,1,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...
1207,1,4,2,1,4,2,5,1,5,3,2,4,2
1208,1,4,5,1,5,3,5,1,5,5,2,1,5
1209,1,3,4,1,5,2,5,2,3,3,2,2,5
1210,1,2,2,1,3,1,1,5,1,1,1,3,3


In [12]:
dataset.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1212 entries, 0 to 1211
Data columns (total 13 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Landslide      1212 non-null   int64
 1   Aspect         1212 non-null   int64
 2   Curvature      1212 non-null   int64
 3   Earthquake     1212 non-null   int64
 4   Elevation      1212 non-null   int64
 5   Flow           1212 non-null   int64
 6   Lithology      1212 non-null   int64
 7   NDVI           1212 non-null   int64
 8   NDWI           1212 non-null   int64
 9   Plan           1212 non-null   int64
 10  Precipitation  1212 non-null   int64
 11  Profile        1212 non-null   int64
 12  Slope          1212 non-null   int64
dtypes: int64(13)
memory usage: 123.2 KB
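
The info() output above shows 13 integer columns with no missing values. The same checks can be made programmatically, which is useful if the file changes later. The helper below is a minimal sketch: check_schema is a hypothetical name, and the three-row frame is made-up illustration data, not rows from the real dataset.

```python
import pandas as pd

def check_schema(df: pd.DataFrame, target: str = "Landslide") -> dict:
    """Summarise the checks that `dataset.info()` lets you read off by eye."""
    return {
        "n_rows": len(df),
        "n_features": df.shape[1] - 1,  # every column except the target
        "has_target": target in df.columns,
        "missing_values": int(df.isna().sum().sum()),
        "all_integer": all(pd.api.types.is_integer_dtype(t) for t in df.dtypes),
    }

# Tiny hypothetical frame reusing three of the column names shown above.
sample = pd.DataFrame({"Landslide": [0, 1, 1], "Slope": [2, 5, 3], "NDVI": [4, 1, 2]})
report = check_schema(sample)
print(report)
```

Calling check_schema(dataset) on the loaded frame should, given the output above, report 1212 rows, 12 features, and zero missing values.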


In [13]:
dataset.nunique()

Landslide,2
Aspect,5
Curvature,5
Earthquake,3
Elevation,5
Flow,5
Lithology,6
NDVI,5
NDWI,5
Plan,5


In [14]:
dataset.describe()

,Landslide,Aspect,Curvature,Earthquake,Elevation,Flow,Lithology,NDVI,NDWI,Plan,Precipitation,Profile,Slope
count,1212.0,1212.0,1212.0,1212.0,1212.0,1212.0,1212.0,1212.0,1212.0,1212.0,1212.0,1212.0,1212.0
mean,0.5,2.962046,2.977723,2.10231,2.436469,2.338284,1.948845,3.042904,2.773927,3.059406,3.813531,3.262376,2.811881
std,0.500206,1.147378,1.099658,0.669812,1.242686,1.112686,1.424345,1.239246,1.29983,1.057287,1.347799,1.039502,1.194229
min,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
25%,0.0,2.0,2.0,2.0,1.0,2.0,1.0,2.0,2.0,2.0,3.0,3.0,2.0
50%,0.5,3.0,3.0,2.0,2.0,2.0,1.0,3.0,3.0,3.0,4.0,3.0,3.0
75%,1.0,4.0,4.0,3.0,3.0,3.0,3.0,4.0,4.0,4.0,5.0,4.0,4.0
max,1.0,5.0,5.0,3.0,5.0,5.0,6.0,5.0,5.0,5.0,5.0,5.0,5.0
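
In the summary above, the Landslide column has a mean of 0.5. For a 0/1 target the mean equals the share of positive cases, so the two classes are evenly represented. A minimal sketch of checking this directly follows; class_balance is a hypothetical helper and the labels are made-up illustration values, not taken from the dataset.

```python
import pandas as pd

def class_balance(target: pd.Series) -> pd.Series:
    """Return the share of each class label, sorted by label."""
    return target.value_counts(normalize=True).sort_index()

# Hypothetical labels for illustration only; not rows from the real dataset.
labels = pd.Series([0, 0, 1, 1, 1, 0], name="Landslide")
balance = class_balance(labels)
print(balance)

# For a 0/1 target the mean equals the share of positive cases.
positive_share = labels.mean()
print(positive_share)
```

With balanced classes like these, plain accuracy is a reasonable first metric. On a skewed target, the per-class figures from classification_report would matter more.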


In [15]:
dataset.head()

,Landslide,Aspect,Curvature,Earthquake,Elevation,Flow,Lithology,NDVI,NDWI,Plan,Precipitation,Profile,Slope
0,0,3,3,2,2,2,1,4,2,2,3,3,2
1,0,1,5,2,3,1,1,4,2,5,5,2,2
2,0,3,4,3,2,2,4,3,2,4,5,2,2
3,0,1,3,3,3,5,1,2,4,3,5,3,3
4,0,5,4,2,1,4,1,2,4,3,3,1,4
