<h1>Analyst: Sophia Joyce Lozada</h1>

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import f_oneway
import warnings
warnings.filterwarnings('ignore')

In [2]:
dengue = pd.read_csv('dengue.csv')

In [3]:
dengue

,Month,Year,Region,Dengue_Cases,Dengue_Deaths
0,January,2016,Region I,705,1
1,February,2016,Region I,374,0
2,March,2016,Region I,276,0
3,April,2016,Region I,240,2
4,May,2016,Region I,243,1
...,...,...,...,...,...
1015,August,2020,BARMM,91,0
1016,September,2020,BARMM,16,8
1017,October,2020,BARMM,13,9
1018,November,2020,BARMM,15,1


In [4]:
dengue.isnull().sum()

Month            0
Year             0
Region           0
Dengue_Cases     0
Dengue_Deaths    0
dtype: int64

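In [9]:
# Treat Region as a categorical variable and Year as a label rather than a number.
dengue['Region'] = dengue['Region'].astype('category')
dengue['Year'] = dengue['Year'].astype('object')

In [10]:
# Confirm the conversions took effect.
dengue.dtypes

Month              object
Year               object
Region           category
Dengue_Cases        int64
Dengue_Deaths       int64
dtype: object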



In [11]:
dengue['Month'].value_counts()

Month
January      85
February     85
March        85
April        85
May          85
June         85
July         85
August       85
September    85
October      85
November     85
December     85
Name: count, dtype: int64

In [12]:
dengue['Year'].value_counts()

Year
2016    204
2017    204
2018    204
2019    204
2020    204
Name: count, dtype: int64

In [13]:
dengue['Region'].value_counts()

Region
BARMM          60
CAR            60
NCR            60
Region I       60
Region II      60
Region III     60
Region IV-A    60
Region IV-B    60
Region IX      60
Region V       60
Region VI      60
Region VII     60
Region VIII    60
Region X       60
Region XI      60
Region XII     60
Region XIII    60
Name: count, dtype: int64

<h1>1. Regional Distribution of Dengue Deaths</h1>
<h3>NCR has the highest number of dengue-related deaths (4,008), followed by Region XII (2,796) and Region VI (1,825). The lowest death counts are in Region IV-B (130), Region I (157), and Region II (193). These figures are raw counts, not rates: NCR's total likely reflects its dense population and high transmission, while the low totals elsewhere may reflect lower infection rates or a better healthcare response.</h3>

In [64]:
deaths_by_region = dengue.groupby('Region')['Dengue_Deaths'].sum()

In [16]:
deaths_by_region 

Region
BARMM           332
CAR            1008
NCR            4008
Region I        157
Region II       193
Region III      482
Region IV-A     652
Region IV-B     130
Region IX       532
Region V        185
Region VI      1825
Region VII     1760
Region VIII     585
Region X        848
Region XI       385
Region XII     2796
Region XIII     966
Name: Dengue_Deaths, dtype: int64

<h1>2. Case Fatality Rate by Region</h1>
<h3>The case fatality rate (CFR) is total deaths divided by total cases. Region XII has the highest CFR at 4.68%, indicating a severe impact of dengue, possibly due to healthcare limitations or delayed interventions. NCR (3.46%) and CAR (3.30%) also have high fatality rates despite having more developed healthcare infrastructure, suggesting high transmission and severe cases. In contrast, Region I (0.27%) and Region III (0.37%) have the lowest CFRs, which may reflect better disease management, early diagnosis, or stronger healthcare systems.</h3>

In [17]:

fatality_rate_region = (dengue.groupby('Region')[['Dengue_Cases', 'Dengue_Deaths']].sum())
fatality_rate_region['CFR'] = fatality_rate_region['Dengue_Deaths'] / fatality_rate_region['Dengue_Cases']



In [18]:
fatality_rate_region

Region,Dengue_Cases,Dengue_Deaths,CFR
BARMM,11537,332,0.028777
CAR,30582,1008,0.032961
NCR,115966,4008,0.034562
Region I,59066,157,0.002658
Region II,45141,193,0.004275
Region III,131064,482,0.003678
Region IV-A,163029,652,0.003999
Region IV-B,30849,130,0.004214
Region IX,47781,532,0.011134
Region V,22167,185,0.008346
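
The CFR column above is a proportion, while the commentary quotes percentages. The following is a minimal sketch of that conversion, using four regional totals copied by hand from the table above. The `subset` frame and the `CFR_pct` column are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Totals copied by hand from the table above (four regions only).
subset = pd.DataFrame(
    {
        "Dengue_Cases": [115966, 30582, 59066, 131064],
        "Dengue_Deaths": [4008, 1008, 157, 482],
    },
    index=pd.Index(["NCR", "CAR", "Region I", "Region III"], name="Region"),
)

# CFR as a percentage: deaths divided by cases, times 100.
subset["CFR_pct"] = (subset["Dengue_Deaths"] / subset["Dengue_Cases"] * 100).round(2)

# Sort so the highest fatality rates come first.
ranked = subset.sort_values("CFR_pct", ascending=False)
print(ranked)
```

Applied to the full `fatality_rate_region` frame, the same two steps would give a ranked table that can be read directly against the percentages in the commentary.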


<h1>3. Regional Distribution of Dengue Cases</h1>
<h3>Region IV-A (163,029) and Region III (131,064) have the highest dengue cases, likely due to high population density and mosquito-breeding conditions. In contrast, BARMM (11,537) and Region V (22,167) report the lowest cases, possibly due to lower population density or effective vector control measures.</h3>

In [30]:
cases_by_region = dengue.groupby('Region')['Dengue_Cases'].sum()

cases_by_region 

<h1>4. Top 5 Regions with Highest Cases</h1>
<h3>Region IV-A (163,029) has the highest dengue cases, followed by Region III (131,064) and Region VI (117,523), indicating that densely populated and urbanized areas are most affected.</h3>

In [33]:
top_5_cases = cases_by_region.nlargest(5)

In [34]:
top_5_cases

Region
Region IV-A    163029
Region III     131064
Region VI      117523
NCR            115966
Region VII     110683
Name: Dengue_Cases, dtype: int64

<h1>5. Top 5 Regions with Highest Deaths</h1>
<h3>NCR (4,008) has the highest number of dengue-related deaths, likely due to its large population and high transmission rates, while Region XII (2,796) follows with a concerningly high fatality rate. Regions VI (1,825), VII (1,760), and CAR (1,008) also report significant deaths, suggesting possible healthcare challenges or late detection in these areas.</h3>

In [35]:
top_5_deaths = deaths_by_region.nlargest(5)

In [36]:
top_5_deaths

Region
NCR           4008
Region XII    2796
Region VI     1825
Region VII    1760
CAR           1008
Name: Dengue_Deaths, dtype: int64

<h1>6. Yearly Case Fatality Rate</h1>
<h3>The dengue fatality rate significantly declined from 3.88% in 2016 to 0.39% in 2019, despite a sharp increase in cases, indicating improved disease management and healthcare response. However, in 2020, the fatality rate rose to 1.31%, possibly due to healthcare system challenges during the COVID-19 pandemic, which may have affected dengue treatment and hospital capacities.</h3>

In [65]:
yearly_fatality_rate = (dengue.groupby('Year')[['Dengue_Cases', 'Dengue_Deaths']].sum())
yearly_fatality_rate['CFR'] = yearly_fatality_rate['Dengue_Deaths'] / yearly_fatality_rate['Dengue_Cases']

In [21]:
yearly_fatality_rate

Year,Dengue_Cases,Dengue_Deaths,CFR
2016,209544,8127,0.038784
2017,154155,4563,0.0296
2018,250783,1226,0.004889
2019,441902,1733,0.003922
2020,91041,1195,0.013126
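
Each yearly CFR above is pooled within its year (all deaths over all cases). A single figure for the whole period can be formed in two ways that do not agree, and the sketch below shows both using the yearly totals copied from the table above. The variable names `pooled_cfr` and `mean_of_yearly_cfr` are illustrative and do not appear in the original notebook.

```python
import pandas as pd

# Yearly totals copied from the table above.
yearly = pd.DataFrame(
    {
        "Dengue_Cases": [209544, 154155, 250783, 441902, 91041],
        "Dengue_Deaths": [8127, 4563, 1226, 1733, 1195],
    },
    index=pd.Index([2016, 2017, 2018, 2019, 2020], name="Year"),
)

yearly["CFR"] = yearly["Dengue_Deaths"] / yearly["Dengue_Cases"]

# Pooled CFR: all deaths over all cases, so high-volume years weigh more.
pooled_cfr = yearly["Dengue_Deaths"].sum() / yearly["Dengue_Cases"].sum()

# Unweighted mean of the yearly CFRs: every year counts equally.
mean_of_yearly_cfr = yearly["CFR"].mean()

print(f"pooled: {pooled_cfr:.4%}, mean of yearly: {mean_of_yearly_cfr:.4%}")
```

The pooled figure is pulled down by 2019, which has by far the most cases and the lowest CFR, so it is worth stating which of the two is meant when quoting an overall rate.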


<h1>7. Yearly Average Cases and Deaths</h1>
<h3>Average dengue cases peaked in 2019 at 2,166 per record (each record is one region in one month), while average deaths stayed low at 8.50 per record, suggesting improved treatment and surveillance. In 2020, the average dropped sharply to 446 cases per record.</h3>

In [26]:
yearly_avg = dengue.groupby('Year')[['Dengue_Cases', 'Dengue_Deaths']].mean()

In [27]:
yearly_avg

Year,Dengue_Cases,Dengue_Deaths
2016,1027.176471,39.838235
2017,755.661765,22.367647
2018,1229.328431,6.009804
2019,2166.186275,8.495098
2020,446.279412,5.857843


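<h1>8. Correlation Between Cases and Deaths</h1>
<h3>The Pearson correlation between dengue cases and deaths across individual region-month records is very low (r = 0.038), so records with more cases do not tend to have proportionally more deaths. This is consistent with deaths being concentrated in a small number of records rather than tracking case volume.</h3>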

In [24]:
case_deaths = dengue[['Dengue_Cases', 'Dengue_Deaths']].corr()

In [25]:
case_deaths

,Dengue_Cases,Dengue_Deaths
Dengue_Cases,1.0,0.038322
Dengue_Deaths,0.038322,1.0
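
`DataFrame.corr()` defaults to Pearson's r, which is sensitive to a few extreme rows. The sketch below illustrates that sensitivity on a small, entirely hypothetical frame (`toy` is not the dengue data), comparing Pearson's r with a rank-based Spearman coefficient computed as the Pearson correlation of the ranks.

```python
import pandas as pd

# Hypothetical data: five rows where deaths rise with cases, plus one
# low-case row with an extreme death count.
toy = pd.DataFrame(
    {
        "Dengue_Cases": [100, 200, 300, 400, 500, 50],
        "Dengue_Deaths": [1, 2, 3, 4, 5, 300],
    }
)

# Pearson's r on the raw values (the default for Series.corr).
pearson = toy["Dengue_Cases"].corr(toy["Dengue_Deaths"])

# Spearman's rho: Pearson's r applied to the ranks of each column.
spearman = toy["Dengue_Cases"].rank().corr(toy["Dengue_Deaths"].rank())

print(f"Pearson: {pearson:.3f}, Spearman: {spearman:.3f}")
```

In this toy case a single outlying row is enough to push Pearson's r below zero while the rank-based coefficient stays positive. Whether the same effect is at work in the dengue records would need to be checked on the real data.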


<h1>9. & 10. Seasonal Patterns in Dengue Cases and Deaths</h1>
<h3>Dengue cases peak in August (187,554) and September (177,943), aligning with the rainy season, which creates ideal mosquito breeding conditions. Similarly, deaths are highest in October (6,670) and September (6,148), suggesting that severe cases may take time to progress to fatalities. The lowest cases occur in April (32,508) and May (32,387), indicating reduced transmission during drier months. This pattern highlights the need for intensified mosquito control and public health interventions before and during the peak rainy months.</h3>

In [38]:
seasonal_cases = dengue.groupby('Month')['Dengue_Cases'].sum()
seasonal_deaths = dengue.groupby('Month')['Dengue_Deaths'].sum()

In [39]:
seasonal_cases

Month
April         32508
August       187554
December      88431
February      77801
January       84328
July         138242
June          58110
March         57576
May           32387
November      94900
October      117645
September    177943
Name: Dengue_Cases, dtype: int64

In [40]:
seasonal_deaths

Month
April         200
August        714
December      404
February      315
January       394
July          611
June          322
March         291
May           162
November      613
October      6670
September    6148
Name: Dengue_Deaths, dtype: int64
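
Because `Month` is stored as plain text, `groupby` returns the months in alphabetical order, which makes the seasonal pattern harder to read. The sketch below shows one way to put the totals back into calendar order, using the monthly case totals copied from the output above. The names `month_order` and `seasonal_cases_ordered` are illustrative.

```python
import pandas as pd

# Monthly case totals copied from the output above (alphabetical order).
seasonal_cases = pd.Series(
    {
        "April": 32508, "August": 187554, "December": 88431,
        "February": 77801, "January": 84328, "July": 138242,
        "June": 58110, "March": 57576, "May": 32387,
        "November": 94900, "October": 117645, "September": 177943,
    },
    name="Dengue_Cases",
)

# Calendar order, written out explicitly to avoid locale-dependent names.
month_order = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Reindex into calendar order; the values themselves are unchanged.
seasonal_cases_ordered = seasonal_cases.reindex(month_order)
print(seasonal_cases_ordered)
```

An alternative is to convert `dengue['Month']` to an ordered `pd.Categorical` with `categories=month_order` before grouping, so that every later `groupby('Month')` comes out in calendar order.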

<h1>11. Longitudinal Analysis by Region</h1>
<h3>Dengue cases rose sharply in 2019 in most regions, with the largest totals in Region IV-A (76,195), Region VI (60,357), and Region III (37,158), indicating a major outbreak. Region II was an exception among the regions shown, dipping slightly from 17,678 in 2018 to 16,634 in 2019. In 2020, cases dropped significantly across all regions, with NCR (7,183), Region VI (4,131), and Region III (15,991) experiencing major declines, likely due to COVID-19 restrictions limiting human movement and mosquito exposure.</h3>

In [41]:
longitudinal_cases = dengue.groupby(['Region', 'Year'])['Dengue_Cases'].sum().unstack()

In [42]:
longitudinal_cases

Region,2016,2017,2018,2019,2020
BARMM,2191,485,2460,5393,1008
CAR,9164,4045,7584,8808,981
NCR,13002,26103,29200,40478,7183
Region I,8281,8236,15511,19867,7171
Region II,3891,5310,17678,16634,1628
Region III,20989,25200,31726,37158,15991
Region IV-A,24282,22421,30410,76195,9721
Region IV-B,3999,2770,8019,10984,5077
Region IX,7215,4274,6161,27447,2684
Region V,2532,3225,3548,11141,1721


<h1>12. Analysis of Yearly Variability</h1>
<h3>The highest variability in dengue cases occurred in 2019 (STD = 2880.38), indicating that the outbreak was unevenly spread across regions and months, possibly due to environmental and epidemiological factors. In contrast, 2020 had the lowest case variability (STD = 627.29); because the standard deviation scales with the size of the counts, this partly reflects the much lower case numbers that year, likely influenced by pandemic-related restrictions. Dengue deaths varied most in 2016 (STD = 192.47), possibly due to differences in healthcare response across regions, and least in 2018 (STD = 6.82), suggesting more consistent death counts that year.</h3>

In [43]:
yearly_std = dengue.groupby('Year')[['Dengue_Cases', 'Dengue_Deaths']].std()

In [44]:
yearly_std

Year,Dengue_Cases,Dengue_Deaths
2016,1087.944973,192.469275
2017,894.176991,89.713046
2018,1221.698444,6.821431
2019,2880.384865,12.32623
2020,627.289105,21.104814
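
A raw standard deviation is hard to compare between years with very different case levels. One common adjustment, sketched here as an assumption rather than as part of the original analysis, is the coefficient of variation (standard deviation divided by mean). The values are copied from the yearly mean table in section 7 and the standard deviation table above, and `cv_cases` is an illustrative name.

```python
import pandas as pd

# Yearly mean cases per record, copied from section 7.
yearly_mean_cases = pd.Series(
    {2016: 1027.176471, 2017: 755.661765, 2018: 1229.328431,
     2019: 2166.186275, 2020: 446.279412}
)

# Yearly standard deviation of cases, copied from the table above.
yearly_std_cases = pd.Series(
    {2016: 1087.944973, 2017: 894.176991, 2018: 1221.698444,
     2019: 2880.384865, 2020: 627.289105}
)

# Coefficient of variation: spread relative to the typical level.
cv_cases = yearly_std_cases / yearly_mean_cases
print(cv_cases.round(3))
```

On these figures the ratio is largest for 2020 and smallest for 2018, so relative to its much lower case level, 2020 was not more uniform than the other years.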


<h1>13. Percentage Contribution by Region</h1>
<h3>Region IV-A (14.21%) and Region III (11.42%) contributed the most to dengue cases, likely due to their large populations and favorable conditions for mosquito breeding. Meanwhile, BARMM (1.01%) and Region V (1.93%) had the lowest contributions, possibly due to lower population density or effective vector control measures.</h3>

In [45]:
total_cases = dengue['Dengue_Cases'].sum()
region_contribution = (cases_by_region / total_cases) * 100

In [46]:
region_contribution

Region
BARMM           1.005469
CAR             2.665272
NCR            10.106630
Region I        5.147700
Region II       3.934113
Region III     11.422446
Region IV-A    14.208249
Region IV-B     2.688542
Region IX       4.164194
Region V        1.931891
Region VI      10.242325
Region VII      9.646208
Region VIII     4.530405
Region X        7.184522
Region XI       2.834346
Region XII      5.211844
Region XIII     3.075844
Name: Dengue_Cases, dtype: float64
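
To gauge how concentrated the caseload is, the shares above can be ranked and accumulated. This is a minimal sketch using the percentages copied from the output above; `ranked`, `cumulative_share`, and `top5_share` are illustrative names.

```python
import pandas as pd

# Percentage shares copied from the output above.
region_contribution = pd.Series(
    {
        "BARMM": 1.005469, "CAR": 2.665272, "NCR": 10.106630,
        "Region I": 5.147700, "Region II": 3.934113, "Region III": 11.422446,
        "Region IV-A": 14.208249, "Region IV-B": 2.688542, "Region IX": 4.164194,
        "Region V": 1.931891, "Region VI": 10.242325, "Region VII": 9.646208,
        "Region VIII": 4.530405, "Region X": 7.184522, "Region XI": 2.834346,
        "Region XII": 5.211844, "Region XIII": 3.075844,
    },
    name="Dengue_Cases",
)

# Rank regions from largest to smallest share, then accumulate.
ranked = region_contribution.sort_values(ascending=False)
cumulative_share = ranked.cumsum()

# Combined share of the five largest contributors.
top5_share = cumulative_share.iloc[4]
print(f"Top 5 regions: {top5_share:.2f}% of all cases")
```

The five largest regions together account for a little over half of all reported cases, which matches the top 5 list in section 4.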

<h1>14. Comparison of Cases and Deaths per Region</h1>
<h3>Region IV-A (163,029 cases) and Region III (131,064 cases) had the highest dengue cases, yet their fatality counts (652 and 482, respectively) were relatively low, suggesting better healthcare management or lower disease severity. In contrast, Region XII had significantly fewer cases (59,802) but a very high death count (2,796), indicating a much higher fatality rate, possibly due to healthcare limitations or late treatment. NCR (115,966 cases, 4,008 deaths) had both high cases and deaths, reflecting its dense population and increased transmission risk. Meanwhile, Region I (59,066 cases, 157 deaths) and Region II (45,141 cases, 193 deaths) had lower fatality counts relative to their cases, suggesting effective disease control or better access to treatment.</h3>

In [47]:
comparison_bar = cases_by_region.to_frame().merge(deaths_by_region.to_frame(), left_index=True, right_index=True)


In [48]:
comparison_bar 

Region,Dengue_Cases,Dengue_Deaths
BARMM,11537,332
CAR,30582,1008
NCR,115966,4008
Region I,59066,157
Region II,45141,193
Region III,131064,482
Region IV-A,163029,652
Region IV-B,30849,130
Region IX,47781,532
Region V,22167,185


<h1>15. Monthly Comparison of Deaths Across Years</h1>
<h3>Dengue deaths peaked in October and September of 2016 (3,954 and 3,418 deaths), marking the deadliest outbreak period; the same two months dominated 2017 (2,098 and 1,826), while later years saw far lower fatalities. In 2019, deaths were more evenly distributed, with notable increases in July (232), August (280), and September (377), suggesting a prolonged transmission season. In 2020, deaths fell sharply from April to August (for example, 35 in August against 280 in 2019) but rose again from September to November (342, 335, and 212), exceeding the 2019 figures for October and November. Overall, September and October recorded the most deaths in 2016, 2017, and 2020 and remained among the higher months in 2018 and 2019, reinforcing the need for intensified dengue prevention efforts before these peak months.</h3>

In [59]:
monthly_deaths_by_year = dengue.groupby(['Year', 'Month'])['Dengue_Deaths'].sum().unstack()

In [60]:
monthly_deaths_by_year 

Year,April,August,December,February,January,July,June,March,May,November,October,September
2016,40,132,83,43,73,123,69,47,41,104,3954,3418
2017,43,111,73,64,95,74,42,36,23,78,2098,1826
2018,41,156,135,43,65,153,78,58,31,131,150,185
2019,59,280,91,101,94,232,119,103,56,88,133,377
2020,17,35,22,64,67,29,14,47,11,212,335,342


<h1>16. Monthly Comparison of Cases Across Years</h1>
<h3>Dengue cases peaked in August and September of 2019 (85,038 and 89,642 cases), marking the most severe outbreak year. In 2020, January and February (18,324 and 19,583 cases) were still close to 2019 levels, but cases fell drastically from April onward, likely due to COVID-19 restrictions limiting mosquito exposure. From 2016 to 2019, August and September were consistently among the highest months, aligning with the rainy season, which promotes mosquito breeding, while April and May were consistently the lowest, indicating reduced transmission during drier months. The sharp decline in 2020, especially in the usual peak months, highlights the impact of external factors like movement restrictions on dengue transmission.</h3>

In [61]:
monthly_cases_by_year = dengue.groupby(['Year', 'Month'])['Dengue_Cases'].sum().unstack()

In [62]:
monthly_cases_by_year

Year,April,August,December,February,January,July,June,March,May,November,October,September
2016,7269,36195,13490,12386,17052,29744,10831,9300,8092,16252,21943,26990
2017,6343,25039,13235,9872,15623,18340,7589,7696,4853,12553,15259,17753
2018,6860,34210,31353,10466,12657,30363,11502,7944,6594,30191,30026,38617
2019,9252,85038,24397,25494,20672,55220,25523,19798,10387,30097,46382,89642
2020,2784,7072,5956,19583,18324,4575,2665,12838,2461,5807,4035,4941
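
The size of the 2020 decline differs a lot by month. The sketch below computes the percentage change from 2019 to 2020 for each month, using the two rows copied from the table above and rearranged into calendar order. The series names and `pct_change_2020` are illustrative.

```python
import pandas as pd

months = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# 2019 and 2020 rows from the table above, reordered by calendar month.
cases_2019 = pd.Series(
    [20672, 25494, 19798, 9252, 10387, 25523,
     55220, 85038, 89642, 46382, 30097, 24397],
    index=months,
)
cases_2020 = pd.Series(
    [18324, 19583, 12838, 2784, 2461, 2665,
     4575, 7072, 4941, 4035, 5807, 5956],
    index=months,
)

# Percentage change per month; negative values mean fewer cases in 2020.
pct_change_2020 = ((cases_2020 - cases_2019) / cases_2019 * 100).round(1)
print(pct_change_2020)
```

Every month is lower in 2020, but the smallest decline is in January and the largest in September, which is the pattern expected if the drop began after the first quarter.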


<h1>17. Total Dengue Cases by Year</h1>
<h3>Dengue cases peaked in 2019 with 441,902 cases, marking the most severe outbreak, likely due to increased mosquito activity and favorable environmental conditions. In 2020, cases dropped sharply to 91,041, possibly due to COVID-19 restrictions reducing human movement and mosquito exposure.</h3>

In [51]:
cases_by_year = dengue.groupby('Year')['Dengue_Cases'].sum()

In [52]:
cases_by_year

Year
2016    209544
2017    154155
2018    250783
2019    441902
2020     91041
Name: Dengue_Cases, dtype: int64

<h1>18. Total Dengue Deaths by Year</h1>
<h3>Dengue deaths declined significantly from 8,127 in 2016 to 1,195 in 2020, although the decline was not steady: deaths rose from 1,226 in 2018 to 1,733 in 2019, the outbreak year. The sharp drop from 2018 onward suggests better early detection and treatment, and the 2020 figure may also reflect external factors like COVID-19 restrictions reducing infections.</h3>

In [53]:
deaths_by_year = dengue.groupby('Year')['Dengue_Deaths'].sum()

In [54]:
deaths_by_year 

Year
2016    8127
2017    4563
2018    1226
2019    1733
2020    1195
Name: Dengue_Deaths, dtype: int64
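
The year-over-year movement is easier to judge as a percentage change. This is a minimal sketch using the totals copied from the output above; `yoy_pct` is an illustrative name.

```python
import pandas as pd

# Yearly death totals copied from the output above.
deaths_by_year = pd.Series(
    {2016: 8127, 2017: 4563, 2018: 1226, 2019: 1733, 2020: 1195},
    name="Dengue_Deaths",
)

# Change from the previous year, as a percentage of the previous year.
# The first year has no predecessor, so its value is NaN.
yoy_pct = deaths_by_year.diff() / deaths_by_year.shift() * 100
print(yoy_pct.round(1))
```

The only positive value is for 2019, when deaths rose from the 2018 level; every other year shows a decline from the year before.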

<h1>19. Monthly Trends in Dengue Cases</h1>
<h3>Average dengue cases per record (one region in one month) peak in August (2,206.52) and September (2,093.45), aligning with the rainy season when mosquito breeding is at its highest. In contrast, April (382.45) and May (381.02) report the lowest averages, likely due to drier conditions reducing mosquito activity.</h3>

In [55]:
monthly_cases = dengue.groupby('Month')['Dengue_Cases'].mean()

In [56]:
monthly_cases

Month
April         382.447059
August       2206.517647
December     1040.364706
February      915.305882
January       992.094118
July         1626.376471
June          683.647059
March         677.364706
May           381.023529
November     1116.470588
October      1384.058824
September    2093.447059
Name: Dengue_Cases, dtype: float64

<h1>20. Monthly Trends in Dengue Deaths</h1>
<h3>Average dengue deaths per record peak in October (78.47) and September (72.33), indicating a lag between peak cases in August and September and severe outcomes. In contrast, May (1.91) and April (2.35) report the lowest average deaths, aligning with the lowest case numbers and reduced transmission during drier months.</h3>

In [57]:
monthly_deaths = dengue.groupby('Month')['Dengue_Deaths'].mean()

In [58]:
monthly_deaths

Month
April         2.352941
August        8.400000
December      4.752941
February      3.705882
January       4.635294
July          7.188235
June          3.788235
March         3.423529
May           1.905882
November      7.211765
October      78.470588
September    72.329412
Name: Dengue_Deaths, dtype: float64
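
A mean can be dominated by a few extreme years. The sketch below compares the mean and median of the October death totals per year, copied from the table in section 15. These are yearly totals, not the per-record averages shown above, and the variable names are illustrative.

```python
import pandas as pd

# October death totals per year, copied from the table in section 15.
october_deaths = pd.Series(
    {2016: 3954, 2017: 2098, 2018: 150, 2019: 133, 2020: 335},
    name="October",
)

mean_val = october_deaths.mean()
median_val = october_deaths.median()

# Fraction of all October deaths that occurred in 2016 and 2017.
share_2016_2017 = october_deaths.loc[[2016, 2017]].sum() / october_deaths.sum()

print(f"mean: {mean_val:.0f}, median: {median_val:.0f}, "
      f"2016-2017 share: {share_2016_2017:.1%}")
```

The mean is about four times the median here, and 2016 and 2017 account for roughly nine tenths of all October deaths, so the October peak in the averages mostly reflects those two years.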