# Pakistan Flood 2022 Data Analysis

# Introduction

<p>In the wake of the devastating floods that struck Pakistan in 2022, this project aims to shed light on the unprecedented climate catastrophe, emphasizing the critical need for immediate action. The calamity, exacerbated by heavier monsoon rains and melting glaciers, underscores the profound impact of climate change on vulnerable regions. Through data-driven insights, we strive to generate awareness, prompt urgent measures, and contribute to understanding and mitigating the far-reaching consequences of such natural disasters.</p>

# Problem Statement

<p>The floods of 2022 in Pakistan pose a multifaceted challenge, causing widespread destruction, loss of life, and displacement. The intertwined factors of climate change, inadequate infrastructure, and economic vulnerabilities amplify the severity of the crisis. Addressing urgent issues such as healthcare, food shortages, and displaced populations becomes paramount. This project aims to unravel the complexities of the disaster, seeking solutions to mitigate its immediate and long-term impacts on affected communities.</p>

# Objectives

1. Uncover the intricate patterns and trends within the 2022 flood data.
2. Examine the socio-economic and health crises triggered by the floods.
3. Highlight the immediate and long-term impacts on infrastructure, agriculture, and livelihoods.
4. Propose insights for effective disaster management and climate resilience.
5. Advocate for awareness and urgent action from local and international stakeholders.

# About The Data

<p>The dataset summarizes the aftermath of the 2022 floods in Pakistan at the regional level. It has one row per region (seven in total) and records deaths, injuries, damaged roads, bridges, and houses, livestock losses, and the affected population. Derived from National Disaster Management Authority (NDMA) records and on-the-ground reports, it supports the analysis and storytelling in the sections that follow.</p>

# Importing Necessary Libraries

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import plotly.express as px
from wordcloud import WordCloud, STOPWORDS

# Loading the Data Into a DataFrame

In [2]:
ndmaData = pd.read_csv('Dataset/Pakistan_Flood_2022_Data_By_NDMA.csv')

In [3]:
# pd.read_csv already returns a DataFrame; work on a copy so the raw data stays untouched
df = ndmaData.copy()

In [4]:
df.head()

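| | Region | Total deaths | M_D | F_D | C_D | total_injured | M_I | F_I | C_I | Roads_damaged(km) | Bridges damaged | Total_houses_damaged | livestock_damaged | Affected population |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AJ&K | 48 | 31 | 17 | 0 | 24 | 15 | 9 | 0 | 0 | 0 | 548 | 792 | 53700 |
| 1 | Balochistan | 299 | 136 | 73 | 90 | 181 | 92 | 40 | 49 | 1850 | 22 | 65997 | 500000 | 9182616 |
| 2 | GB | 22 | 5 | 11 | 6 | 6 | 3 | 0 | 3 | 16 | 65 | 1211 | 0 | 51500 |
| 3 | ICT | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | KP | 306 | 149 | 41 | 116 | 369 | 156 | 79 | 134 | 1575 | 107 | 91463 | 21328 | 4350490 |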


# Exploratory Data Analysis

In [5]:
df.shape

(7, 14)

In [6]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7 entries, 0 to 6
Data columns (total 14 columns):
 #   Column                Non-Null Count  Dtype 
---  ------                --------------  ----- 
 0   Region                7 non-null      object
 1   Total deaths          7 non-null      int64 
 2   M_D                   7 non-null      int64 
 3   F_D                   7 non-null      int64 
 4   C_D                   7 non-null      int64 
 5   total_injured         7 non-null      int64 
 6   M_I                   7 non-null      int64 
 7   F_I                   7 non-null      int64 
 8   C_I                   7 non-null      int64 
 9   Roads_damaged(km)     7 non-null      int64 
 10  Bridges damaged       7 non-null      int64 
 11  Total_houses_damaged  7 non-null      int64 
 12  livestock_damaged     7 non-null      int64 
 13  Affected population   7 non-null      int64 
dtypes: int64(13), object(1)
memory usage: 916.0+ bytes


In [7]:
df.describe()

| | Total deaths | M_D | F_D | C_D | total_injured | M_I | F_I | C_I | Roads_damaged(km) | Bridges damaged | Total_houses_damaged | livestock_damaged | Affected population |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 | 7.0 |
| mean | 220.714286 | 96.857143 | 45.0 | 78.857143 | 1837.142857 | 770.428571 | 350.285714 | 572.142857 | 1819.285714 | 53.571429 | 277711.1 | 134844.142857 | 4720904.0 |
| std | 238.393033 | 94.423564 | 43.454958 | 103.813019 | 3223.82084 | 1246.739596 | 821.615424 | 1197.158222 | 3001.77718 | 62.64944 | 636152.4 | 188058.016107 | 5519879.0 |
| min | 1.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 35.0 | 18.0 | 14.0 | 3.0 | 15.0 | 9.0 | 4.5 | 1.5 | 8.0 | 8.0 | 879.5 | 396.0 | 52600.0 |
| 50% | 191.0 | 94.0 | 41.0 | 50.0 | 181.0 | 92.0 | 40.0 | 49.0 | 896.0 | 22.0 | 65997.0 | 21328.0 | 4350490.0 |
| 75% | 302.5 | 142.5 | 60.0 | 103.0 | 2113.5 | 1164.5 | 96.0 | 353.0 | 1712.5 | 86.0 | 79217.0 | 210894.5 | 7013434.0 |
| max | 678.0 | 262.0 | 126.0 | 290.0 | 8422.0 | 2954.0 | 2211.0 | 3247.0 | 8398.0 | 165.0 | 1717788.0 | 500000.0 | 14563770.0 |


In [8]:
df.columns

Index(['Region', 'Total deaths', 'M_D', 'F_D', 'C_D', 'total_injured', 'M_I',
       'F_I', 'C_I', 'Roads_damaged(km)', 'Bridges damaged',
       'Total_houses_damaged', 'livestock_damaged', 'Affected population'],
      dtype='object')
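
The column labels mix spaces, underscores, capital letters, and a unit in parentheses, which makes them awkward to type and easy to misspell. One possible way to normalize them is sketched below; the helper name `to_snake_case` and the resulting labels are illustrative assumptions, not part of the original notebook.

```python
import re

# Column labels exactly as reported by df.columns.
columns = [
    "Region", "Total deaths", "M_D", "F_D", "C_D", "total_injured", "M_I",
    "F_I", "C_I", "Roads_damaged(km)", "Bridges damaged",
    "Total_houses_damaged", "livestock_damaged", "Affected population",
]


def to_snake_case(label: str) -> str:
    """Lowercase a label and collapse non-alphanumeric runs into underscores."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", label.strip())
    return cleaned.strip("_").lower()


# Hypothetical mapping from the original labels to normalized ones.
rename_map = {old: to_snake_case(old) for old in columns}

for old, new in rename_map.items():
    print(f"{old!r:>24} -> {new}")
```

The mapping could then be passed to `df.rename(columns=rename_map)`. The rest of this notebook keeps the original labels.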

In [9]:
df.isnull().sum()

Region                  0
Total deaths            0
M_D                     0
F_D                     0
C_D                     0
total_injured           0
M_I                     0
F_I                     0
C_I                     0
Roads_damaged(km)       0
Bridges damaged         0
Total_houses_damaged    0
livestock_damaged       0
Affected population     0
dtype: int64
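
A null count of zero does not guarantee that the figures are internally consistent. As a hedged sketch, the check below rebuilds the five rows shown by `df.head()` and verifies that the three breakdown columns add up to each total. It assumes that `M_D`, `F_D`, `C_D` (and `M_I`, `F_I`, `C_I`) are the male, female, and child components of the totals, which the column names suggest but the dataset does not document here.

```python
import pandas as pd

# The five rows displayed by df.head(); the remaining two regions are not shown.
sample = pd.DataFrame(
    {
        "Region": ["AJ&K", "Balochistan", "GB", "ICT", "KP"],
        "Total deaths": [48, 299, 22, 1, 306],
        "M_D": [31, 136, 5, 1, 149],
        "F_D": [17, 73, 11, 0, 41],
        "C_D": [0, 90, 6, 0, 116],
        "total_injured": [24, 181, 6, 0, 369],
        "M_I": [15, 92, 3, 0, 156],
        "F_I": [9, 40, 0, 0, 79],
        "C_I": [0, 49, 3, 0, 134],
    }
)

# Assumption: each total is the sum of its three breakdown columns.
deaths_ok = sample[["M_D", "F_D", "C_D"]].sum(axis=1) == sample["Total deaths"]
injured_ok = sample[["M_I", "F_I", "C_I"]].sum(axis=1) == sample["total_injured"]

# Regions where either breakdown disagrees with its total.
mismatches = sample.loc[~(deaths_ok & injured_ok), "Region"].tolist()
print("Regions with inconsistent breakdowns:", mismatches)
```

For these five rows the list comes back empty. Running the same comparison on the full `df` would cover the two regions not shown in the preview.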

# Story Crafting and Trend Visualizations

## 1. Total Deaths Across Different Regions.
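
A sorted bar chart is one straightforward way to compare death tolls between regions. The sketch below is an illustration, not the notebook's original figure: it uses only the five regions visible in the `df.head()` preview, and the color and figure size are arbitrary choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Total deaths for the five regions shown by df.head() earlier.
deaths = pd.Series(
    [48, 299, 22, 1, 306],
    index=["AJ&K", "Balochistan", "GB", "ICT", "KP"],
    name="Total deaths",
).sort_values(ascending=False)

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.bar(deaths.index, deaths.values, color="firebrick")
ax.bar_label(bars)  # write each count above its bar
ax.set_xlabel("Region")
ax.set_ylabel("Total deaths")
ax.set_title("Total deaths by region (first five rows only)")
fig.tight_layout()
```

With the full DataFrame, `df.set_index("Region")["Total deaths"]` could replace the hand-built Series.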

## 2. Deaths Grouped By Gender.
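
Grouped charts are easier to build when the breakdown columns are reshaped from wide to long form. The sketch below does this with `DataFrame.melt` on the five rows from the `df.head()` preview. The readable labels for `M_D`, `F_D`, and `C_D` (male, female, children) are an assumption based on the column names.

```python
import pandas as pd

# Death breakdown for the five regions shown by df.head().
sample = pd.DataFrame(
    {
        "Region": ["AJ&K", "Balochistan", "GB", "ICT", "KP"],
        "M_D": [31, 136, 5, 1, 149],
        "F_D": [17, 73, 11, 0, 41],
        "C_D": [0, 90, 6, 0, 116],
    }
)

# Assumed meaning of the abbreviated column names.
labels = {"M_D": "Male", "F_D": "Female", "C_D": "Children"}

# Wide -> long: one row per (region, group) pair.
long_deaths = sample.melt(
    id_vars="Region",
    value_vars=list(labels),
    var_name="Group",
    value_name="Deaths",
)
long_deaths["Group"] = long_deaths["Group"].map(labels)

# Totals per group across the five regions.
group_totals = long_deaths.groupby("Group")["Deaths"].sum()
print(group_totals)
```

The long table can be handed to a grouped bar plot, for example `sns.barplot(data=long_deaths, x="Region", y="Deaths", hue="Group")`.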

## 3. Total Injuries Across Different Regions.

## 4. Injuries Grouped by Gender.

## 5. Impacts of Flood: House Damage Across Different Regions.

## 6. Impacts of Flood: Affected Population Across Different Regions.

## 7. Impacts of Flood: Livestock Damage Across Different Regions.

## 8. Impacts of Flood: Road Damage Across Different Regions.

## 9. Impacts of Flood: Bridge Damage Across Different Regions.

## 10. Scatter Plot: Houses Damaged vs. Affected Population.
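
A scatter plot of this pair is easier to interpret alongside a numeric summary. As a rough sketch, the snippet below computes the Pearson correlation between the two columns for the five regions shown in the `df.head()` preview. With so few points the coefficient is indicative at best.

```python
import pandas as pd

# Values for the five regions shown by df.head().
sample = pd.DataFrame(
    {
        "Region": ["AJ&K", "Balochistan", "GB", "ICT", "KP"],
        "Total_houses_damaged": [548, 65997, 1211, 0, 91463],
        "Affected population": [53700, 9182616, 51500, 0, 4350490],
    }
)

# Pearson correlation between houses damaged and affected population (n = 5).
r = sample["Total_houses_damaged"].corr(sample["Affected population"])
print(f"Pearson r = {r:.2f}")
```

The same call on the full `df` would include the two regions missing from the preview, which may shift the coefficient noticeably.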

## 11. Violin Plot: Injuries Distribution Across Different Regions.

## 12. Area Chart: Total Impacts Over All Regions.